The CATARACTS challenge paper has been accepted for publication in Medical Image Analysis. The dataset has been published on IEEE DataPort.

Video Collection

The dataset consists of 50 videos of cataract surgeries performed at Brest University Hospital between January 22, 2015 and September 10, 2015. Reasons for surgery included age-related cataract, traumatic cataract and refractive errors. Patients were 61 years old on average (minimum: 23, maximum: 83, standard deviation: 10). There were 38 females and 12 males. Informed consent was obtained from all patients.

Surgeries were performed by three surgeons: a renowned expert (48 surgeries), a surgeon with one year of experience (1 surgery) and an intern (1 surgery). Surgeries were performed under an OPMI Lumera T microscope (Carl Zeiss Meditec, Jena, Germany). Videos were recorded with a 180I camera (Toshiba, Tokyo, Japan) and a MediCap USB200 recorder (MediCapture, Plymouth Meeting, USA).

The frame resolution was 1920x1080 pixels and the frame rate was approximately 30 frames per second. Videos lasted 10 minutes 56 s on average (minimum: 6 minutes 23 s, maximum: 40 minutes 34 s, standard deviation: 6 minutes 5 s). In total, more than nine hours of surgery were recorded.
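As a quick sanity check on these figures, the total recording time can be estimated from the number of videos and the mean duration. The sketch below assumes the reported mean (10 minutes 56 s) is exact, although it is rounded to the second, and that the frame rate is exactly 30 frames per second, although it is only approximate. The helper names are illustrative.

```python
NUM_VIDEOS = 50
FRAME_RATE = 30  # frames per second; the source says "approximately 30"


def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes/seconds pair to a number of seconds."""
    return minutes * 60 + seconds


def format_hms(total_seconds: int) -> str:
    """Format a number of seconds as H:MM:SS."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


mean_duration_s = to_seconds(10, 56)            # mean video duration from the text
total_duration_s = NUM_VIDEOS * mean_duration_s  # estimated total recording time
approx_frames = total_duration_s * FRAME_RATE    # rough frame count, not an exact figure

print(format_hms(total_duration_s))  # 9:06:40
print(approx_frames)                 # 984000
```

The estimate of roughly 9 hours 6 minutes agrees with the statement that more than nine hours were recorded, and with the 4 hours 42 minutes plus 4 hours 24 minutes reported for the training and test sets.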

Reference Standard

Tool Usage Annotation

All surgical tools visible in the microscope videos were first listed and labeled by the surgeons (see Fig. 1). Then, the usage of each tool in the videos was annotated independently by two non-M.D. experts. A tool was considered to be in use whenever it was in contact with the eyeball. Each expert therefore recorded a timestamp whenever a tool came into contact with the eyeball, and another when it stopped touching the eyeball. Up to three tools may be used simultaneously: two by the surgeon (one per hand) and sometimes one by an assistant. Annotations were performed at the frame level, using a web interface connected to an SQL database.
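To make the annotation scheme concrete, the sketch below turns contact intervals into frame-level binary labels and counts how many tools are in use at once. The interval representation (half-open frame ranges), the function names and the toy values are assumptions for illustration; the actual storage format of the annotations is not described here.

```python
from typing import Dict, List, Tuple

# Assumed representation: (first frame in contact, first frame after contact ends).
Interval = Tuple[int, int]


def intervals_to_frame_labels(intervals: List[Interval], num_frames: int) -> List[int]:
    """Return one 0/1 label per frame: 1 while the tool touches the eyeball."""
    labels = [0] * num_frames
    for start, end in intervals:
        # Clip to the video length so malformed intervals cannot index out of range.
        for frame in range(max(start, 0), min(end, num_frames)):
            labels[frame] = 1
    return labels


def max_simultaneous_tools(labels_per_tool: Dict[str, List[int]]) -> int:
    """Largest number of tools in use in any single frame."""
    per_frame_counts = (sum(frame) for frame in zip(*labels_per_tool.values()))
    return max(per_frame_counts, default=0)


# Toy 8-frame video with two overlapping tools (values are made up).
labels = {
    "phacoemulsifier handpiece": intervals_to_frame_labels([(2, 6)], 8),
    "micromanipulator": intervals_to_frame_labels([(4, 7)], 8),
}
print(labels["phacoemulsifier handpiece"])  # [0, 0, 1, 1, 1, 1, 0, 0]
print(max_simultaneous_tools(labels))       # 2
```

In the real data, the value returned by `max_simultaneous_tools` would be expected to stay at or below three, given that at most two tools are held by the surgeon and one by an assistant.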



Fig. 1. Surgical tools listed by the surgeons:

1. biomarker
2. Charleux cannula
3. hydrodissection cannula
4. Rycroft cannula
5. viscoelastic cannula
6. cotton
7. capsulorhexis cystotome
8. Bonn forceps
9. capsulorhexis forceps
10. Troutman forceps
11. needle holder
12. irrigation/aspiration handpiece
13. phacoemulsifier handpiece
14. vitrectomy handpiece
15. implant injector
16. primary incision knife
17. secondary incision knife
18. micromanipulator
19. suture needle
20. Mendez ring
21. Vannas scissors

Adjudication

Finally, annotations from both experts were adjudicated: whenever expert 1 annotated that tool A was being used while expert 2 annotated that tool B was being used instead of A, the two experts watched the video together and jointly determined the actual tool usage. However, the precise timing of tool/eyeball contacts was not adjudicated, so the experts may still disagree on individual frames. Therefore, a probabilistic reference standard was obtained for each tool in each frame:

0: both experts agree that the tool is not being used,
1: both experts agree that the tool is being used,
0.5: experts disagree.
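The three values listed above are simply the mean of the two experts' binary labels. A minimal sketch, assuming each expert's annotation for one tool is available as a list of 0/1 values with one entry per frame (the function name is illustrative):

```python
from typing import List


def reference_standard(expert1: List[int], expert2: List[int]) -> List[float]:
    """Combine two frame-level binary annotations into 0, 0.5 or 1 per frame."""
    if len(expert1) != len(expert2):
        raise ValueError("both annotations must cover the same number of frames")
    # Averaging two 0/1 labels gives 0.0 (both say unused), 1.0 (both say used)
    # or 0.5 (the experts disagree).
    return [(a + b) / 2 for a, b in zip(expert1, expert2)]


print(reference_standard([0, 1, 1, 0], [0, 1, 0, 1]))  # [0.0, 1.0, 0.5, 0.5]
```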

Inter-rater agreement, before and after adjudication, is reported in Table 1.


Tool	Agreement before adjudication	Agreement after adjudication
biomarker	0.835	0.835
Charleux cannula	0.949	0.963
hydrodissection cannula	0.868	0.982
Rycroft cannula	0.882	0.919
viscoelastic cannula	0.860	0.975
cotton	0.947	0.947
capsulorhexis cystotome	0.994	0.995
Bonn forceps	0.793	0.798
capsulorhexis forceps	0.836	0.849
Troutman forceps	0.764	0.764
needle holder	0.630	0.630
irrigation/aspiration handpiece	0.995	0.995
phacoemulsifier handpiece	0.996	0.997
vitrectomy handpiece	0.998	0.998
implant injector	0.980	0.980
primary incision knife	0.959	0.961
secondary incision knife	0.846	0.852
micromanipulator	0.990	0.995
suture needle	0.893	0.893
Mendez ring	0.941	0.953
Vannas scissors	0.823	0.823
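The text does not say which agreement statistic Table 1 reports. Purely as an illustration of how a per-tool agreement score between two annotators can be computed from frame-level labels, the sketch below uses Cohen's kappa, a common choice for two raters. Using kappa here is an assumption, not a statement about how the table was produced, and the function name and toy labels are illustrative.

```python
from typing import List


def cohen_kappa(rater_a: List[int], rater_b: List[int]) -> float:
    """Cohen's kappa for two binary (0/1) label sequences of equal length."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("label sequences must be non-empty and of equal length")
    # Observed agreement: fraction of frames where both raters give the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal rate of "tool in use".
    rate_a = sum(rater_a) / n
    rate_b = sum(rater_b) / n
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if expected == 1.0:
        # Both raters gave the same constant label; treated as perfect agreement here.
        return 1.0
    return (observed - expected) / (1 - expected)


print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.5
```

Whatever statistic is used, it would be computed once on the independent annotations and once on the adjudicated ones to obtain the two columns.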

Example of Result

Fig. 2 illustrates tool usage during a typical surgery without any complications.

Training and Test Sets

The dataset was divided into a training set (25 videos) and a test set (25 videos). The split was constrained so that 1) each tool appears in the same number of videos in both subsets (plus or minus one) and 2) the test set only contains videos of surgeries performed by the renowned expert. Apart from that, the split was made at random. In total, the training set contains 4 hours and 42 minutes of video and the test set contains 4 hours and 24 minutes of video.
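One simple way to obtain a split satisfying both constraints is rejection sampling: draw a random test set from the expert's videos and keep it only if every tool is balanced to within one video. The sketch below illustrates that idea and is not necessarily the procedure the organizers used. The data structures, function names and toy data are hypothetical.

```python
import random
from typing import Dict, List, Optional, Set, Tuple


def split_is_balanced(train: List[str], test: List[str],
                      tools_per_video: Dict[str, Set[str]]) -> bool:
    """True if each tool appears in the same number of videos in both subsets (+/- 1)."""
    all_tools = set().union(*tools_per_video.values())
    for tool in all_tools:
        n_train = sum(tool in tools_per_video[v] for v in train)
        n_test = sum(tool in tools_per_video[v] for v in test)
        if abs(n_train - n_test) > 1:
            return False
    return True


def random_constrained_split(tools_per_video: Dict[str, Set[str]],
                             surgeon_of_video: Dict[str, str],
                             expert: str, test_size: int, seed: int = 0,
                             max_tries: int = 100_000
                             ) -> Optional[Tuple[List[str], List[str]]]:
    """Randomly search for a (train, test) split meeting both constraints."""
    rng = random.Random(seed)
    videos = sorted(tools_per_video)
    # Constraint 2: test videos may only come from the expert surgeon.
    eligible = [v for v in videos if surgeon_of_video[v] == expert]
    if len(eligible) < test_size:
        raise ValueError("not enough expert videos for the requested test size")
    for _ in range(max_tries):
        test = sorted(rng.sample(eligible, test_size))
        train = [v for v in videos if v not in test]
        # Constraint 1: per-tool video counts must match to within one.
        if split_is_balanced(train, test, tools_per_video):
            return train, test
    return None  # no valid split found within max_tries


# Toy example with six videos and two tools (made-up data).
toy_tools = {"v1": {"A"}, "v2": {"A"}, "v3": {"A", "B"},
             "v4": {"B"}, "v5": {"A"}, "v6": {"A", "B"}}
toy_surgeons = {"v1": "intern", "v2": "expert", "v3": "expert",
                "v4": "expert", "v5": "expert", "v6": "expert"}
split = random_constrained_split(toy_tools, toy_surgeons, "expert", test_size=3)
print(split)
```

With 50 videos and 21 tools, rejection sampling may need many draws, and a rare tool that appears only in non-expert videos could make the constraints unsatisfiable. The function returns `None` in that case rather than looping forever.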