Particle Selection Bakeoff

To make the discussion of automatic particle selection at the Workshop on Automatic Particle Selection more productive, we propose that participants test their methods on a common dataset before the workshop. During the workshop, we will compare results in a bakeoff. As a starting point, we have set up a web page to distribute an annotated image dataset of around 1000 keyhole limpet hemocyanin (KLH) particles. The web site can be found at http://ami.scripps.edu/prtl_data. We strongly encourage all participants to present results at the workshop using either MANUALLY or AUTOMATICALLY picked KLH particles from the KLH dataset. If you already have programs or algorithms for particle selection, we especially encourage you to run them on this dataset. In addition, we welcome any manually selected results from this dataset to help evaluate automatically selected particles.

Participating in the bakeoff will help us better understand the strengths and weaknesses of various methods, as well as the urgent need for benchmark particle datasets for automatic particle selection and the problems involved in establishing them. Although selecting KLH particles is a relatively "easy" problem, as the particles are large, symmetric and readily visible, we hope that it will serve as a common basis for gaining some insight into the performance of various approaches to automatic particle selection.

A draft specification for the bakeoff follows, covering how to prepare your results, the deadline for sending them in, and how different picks will be assessed. Please send any opinions, suggestions, and feedback regarding the specification, especially on how to assess picks, to the workshop contact at zhu4@scripps.edu.

Hand In Your Results

The bakeoff will be limited to picking side-view (rectangular) KLH particles in far-from-focus images with names like 01nov26b.???.???.???.??2.mrc. Use the name xxx.002.txt for the generated file that contains the coordinates of side-view KLH particles picked in the image named xxx.002.mrc, with the origin of the coordinate system at the bottom-left corner of the image. Each row of file xxx.002.txt should record the coordinates of exactly one particle. If no particle is picked from an image file, no coordinates file should be generated. For example, for image file 01nov26b.012.007.001.002.mrc, your program might generate a file called 01nov26b.012.007.001.002.txt with the following contents:

1   x1   y1
2   x2   y2
3   x3   y3
...

       Note: the first column is the particle number; the second and third columns are, respectively, the horizontal and vertical coordinates of the picked particle.
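The file layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names write_picks and coords_filename are hypothetical, columns are assumed to be separated by a single space (the specification does not name a separator), and the coordinates passed in are assumed to be already expressed relative to the bottom-left corner of the image.

```python
from pathlib import Path


def coords_filename(image_name):
    """Map an image name such as 'xxx.002.mrc' to 'xxx.002.txt'."""
    if not image_name.endswith(".mrc"):
        raise ValueError(f"expected an .mrc image name, got {image_name!r}")
    return image_name[: -len(".mrc")] + ".txt"


def write_picks(image_name, picks, out_dir="."):
    """Write one 'number x y' row per picked particle.

    picks is a list of (x, y) pairs with the origin at the bottom-left
    corner of the image. Returns the path written, or None when there are
    no picks, since no coordinates file should be generated in that case.
    The single-space separator is an assumption, not part of the spec.
    """
    if not picks:
        return None
    out_path = Path(out_dir) / coords_filename(image_name)
    with open(out_path, "w") as fh:
        # Particle numbers start at 1, matching the example rows.
        for number, (x, y) in enumerate(picks, start=1):
            fh.write(f"{number} {x} {y}\n")
    return out_path
```

For instance, write_picks("01nov26b.012.007.001.002.mrc", [(10, 20), (30, 40)]) would create 01nov26b.012.007.001.002.txt with two rows. If your picker reports coordinates from the top-left corner, flip the vertical coordinate before writing.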

Also create a file named README containing any information that is important or helpful for other people to understand your results. For instance, you may briefly describe your algorithm and the design decisions you made in implementing it.

When the coordinates files and README file are ready and in good working order, copy them into an empty directory and make sure that the directory does not contain any additional files. Then use the Unix tar command or WinZip to pack your files into one archive file named yyy.tar or yyy.zip, where yyy stands for your last name. Finally, send in the archive file as an e-mail attachment before the deadline of March 31, 2002.
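If you prefer to script the packaging step, the following is a minimal sketch using Python's standard tarfile module instead of the tar command. The function name make_archive and its parameters are hypothetical, and the check that the directory holds only README and .txt files is one reading of "no additional files", not an official rule.

```python
import tarfile
from pathlib import Path


def make_archive(results_dir, last_name, out_dir="."):
    """Pack README and the coordinates files into <last_name>.tar."""
    results_dir = Path(results_dir)
    files = sorted(p for p in results_dir.iterdir() if p.is_file())
    names = [p.name for p in files]

    if "README" not in names:
        raise FileNotFoundError("README is missing from the results directory")

    # Assumption: anything other than README and *.txt counts as an extra file.
    extras = [n for n in names if n != "README" and not n.endswith(".txt")]
    if extras:
        raise ValueError(f"unexpected files in results directory: {extras}")

    archive_path = Path(out_dir) / f"{last_name}.tar"
    with tarfile.open(archive_path, "w") as tar:
        for p in files:
            # Store files at the top level of the archive, without parent dirs.
            tar.add(p, arcname=p.name)
    return archive_path
```

Calling make_archive("results", "smith") would produce smith.tar in the current directory.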

Assess Your Picks

As we know, even among experts, the final set of particles selected from the same set of images may vary from person to person. Even for the same expert, the criteria for deciding whether to pick a particle may change over time (that is, from image to image) during a single session. This raises the question: how do we build a truth dataset of single particles to evaluate machine algorithms? We do not have an immediate answer to this question, but it should be a discussion topic during the workshop. For this reason, we currently propose to assess the results generated by different approaches, including the manually picked results we already have, by comparing one result against another's. By doing this, we will be able to generate a confusion matrix like the one shown in Table 3 on our particle picking web page (http://ami.scripps.edu/prtl_data/klh/index.htm). This will be discussed at the workshop.
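To make the idea of comparing one result against another concrete, here is a rough sketch of how two pick lists for the same image might be tallied. It is not the workshop's assessment method: the greedy nearest-neighbour matching, the tolerance parameter (in pixels), and the function name compare_picks are all assumptions chosen for illustration, and the actual criterion is left open for discussion.

```python
import math


def compare_picks(picks_a, picks_b, tolerance):
    """Count agreements and disagreements between two pick lists.

    picks_a and picks_b are lists of (x, y) pairs for the same image.
    A pick in A is matched to the nearest still-unmatched pick in B if
    that pick lies within `tolerance` pixels. Both the greedy matching
    and the tolerance are assumptions, not part of the specification.
    """
    unmatched_b = list(picks_b)
    both = 0
    for ax, ay in picks_a:
        best_index = None
        best_dist = tolerance
        for i, (bx, by) in enumerate(unmatched_b):
            dist = math.hypot(ax - bx, ay - by)
            if dist <= best_dist:
                best_index, best_dist = i, dist
        if best_index is not None:
            # Each pick in B may be matched at most once.
            unmatched_b.pop(best_index)
            both += 1
    return {
        "both": both,                      # picked in A and in B
        "only_a": len(picks_a) - both,     # picked in A, missed in B
        "only_b": len(unmatched_b),        # picked in B, missed in A
    }
```

Summing these counts over all images for each pair of results gives the kind of pairwise tally from which a confusion matrix can be built. A sensible tolerance would likely be some fraction of the particle size.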
