Monday, September 7, 2009

Activity 14 - Pattern Recognition

Classifying objects accurately from visual information is one of the most basic yet remarkable demonstrations of how the human brain and senses work together. Characterization of objects usually starts with identifying the kinds or classes of objects. This can be done by first defining the basic features of the objects in a class, usually shape, size, or color, that can be used to easily discriminate between objects of different classes. The human brain stores this knowledge of the features of objects in a class and also determines the main differences in those features among classes. Using this information, a new object can be classified by comparing its features with those defined for each class and finding the class that its set of features most closely resembles.
This is the model used by computer vision for automatically classifying objects based on visual information. The process revolves around pattern recognition: a set of basic features is created for an object to serve as its pattern. Image processing techniques can be applied to automate the gathering of these feature sets. A class is defined by the patterns of the objects belonging to it. New objects are then assigned to a class based on their features.
Minimum distance classification is one way of quantifying the resemblance of the pattern of an unclassified object to the patterns of different classes of objects. The principle is that an object is assigned to the class for which the distance between its set of features and the mean of the features of the objects in that class is smallest. With the set of features contained in a feature vector (each element represents a feature and each vector represents an object), the distance can be calculated using the Euclidean distance formula.
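A minimal sketch of minimum distance classification, written in Python rather than the Scilab used for this activity. The function names, the class labels "A" and "B", and the training vectors are all made up for illustration; they are not data from this activity.

```python
import math

def class_mean(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(features, class_means):
    """Return the label of the class whose mean vector is nearest."""
    return min(class_means,
               key=lambda label: euclidean(features, class_means[label]))

# Hypothetical two-feature training vectors for two made-up classes.
training = {
    "A": [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]],
    "B": [[4.0, 4.0], [4.2, 3.8], [3.8, 4.2]],
}
means = {label: class_mean(vectors) for label, vectors in training.items()}

print(classify([1.5, 1.5], means))
print(classify([3.0, 3.5], means))
```

Because only the class means are stored, classifying a new object costs one distance computation per class, regardless of the size of the training set.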
In this activity, red blood cells (RBCs) are classified as either normal (Erythrocyte) or crenated (Echinocyte). Crenation happens when a cell is exposed to a hypertonic solution, causing it to lose water by osmosis and shrink, producing an abnormally shaped cell (http://en.wikipedia.org/wiki/Crenation). Images of a normal and a crenated RBC are shown below (indicated by the arrow).

Normal


Crenated
(image taken from http://www.healthsystem.virginia.edu/internet/hematology/hessidb/alphabeticalglossary.cfm)

It is important to study the effects of the environment on a cell or an individual, and whether it can survive in that environment. Analyzing RBCs in solutions of different concentrations is one way of doing this. Classifying normal and crenated RBCs and providing statistics for the classified objects is therefore critical. However, a sufficient number of samples may be required to draw conclusions about the cell-environment interaction. Classifying a large number of RBCs to reach a sufficient sample size is a tedious and time-consuming task, which is why automation by computer vision has been a rapidly developing technology. Automatic recognition of crenated and normal RBCs is demonstrated here.
Based on the change in shape from a normal to a crenated RBC, a set of basic features can be created. Thresholding (im2bw()) was applied first to separate the cells in the image from the background. Labeling (bwlabel()) was then applied to clean the image of small fragments and incomplete cells (at the edges) and also to remove overlapping cells. Morphological operations should not be applied to remove these, since they may alter the shape of the cells, which is critical in defining the features. Two trials were done using images with both normal and crenated RBCs, as shown below.
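The thresholding and labeling steps can be sketched in plain Python as follows. This is not the Scilab code used in the activity (which relied on im2bw() and bwlabel()); it only illustrates the idea. Several things here are assumptions: cells are taken to be darker than the background, blobs are labeled with 4-connectivity, and overlapping cells are rejected with an upper area limit, since the post does not say how they were identified. All function names, the tiny image, and the area limits are hypothetical.

```python
from collections import deque

def threshold(gray, level):
    """Binarize: 1 where the pixel is darker than `level`, else 0."""
    return [[1 if p < level else 0 for p in row] for row in gray]

def label(binary):
    """Label 4-connected foreground blobs 1, 2, ...; return (labels, count)."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # flood fill from the seed pixel
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

def keep_whole_cells(labels, count, min_area, max_area):
    """Labels of blobs that avoid the image border and whose pixel area
    lies in [min_area, max_area] (assumed proxy for single, whole cells)."""
    h, w = len(labels), len(labels[0])
    area = [0] * (count + 1)
    on_edge = [False] * (count + 1)
    for r in range(h):
        for c in range(w):
            k = labels[r][c]
            if k:
                area[k] += 1
                if r in (0, h - 1) or c in (0, w - 1):
                    on_edge[k] = True
    return [k for k in range(1, count + 1)
            if not on_edge[k] and min_area <= area[k] <= max_area]

# Tiny made-up grayscale image: one whole blob, one blob touching the
# right edge, and one single-pixel fragment.
gray = [
    [9, 9, 9, 9, 9, 9, 9],
    [9, 1, 1, 9, 9, 9, 1],
    [9, 1, 1, 9, 2, 9, 1],
    [9, 9, 9, 9, 9, 9, 9],
]
binary = threshold(gray, 5)
labels, count = label(binary)
kept = keep_whole_cells(labels, count, min_area=2, max_area=10)
print(count, kept)
```

Filtering by label leaves the pixels of the surviving blobs untouched, which is the reason for preferring it over morphological operations when the shape itself is the feature.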

Trial 1

Original

Thresholding


Labeling

Trial 2

Original
(image taken from http://www.isrvma.org/article/63_1_1.htm)


Thresholding


Labeling

From the results of image processing, five normal and five crenated RBCs were visually classified and taken to serve as the training set and to define the set of basic features for the corresponding class. The basic features chosen were (i) the ratio of the square of the perimeter to the area, and (ii) the ratio of the standard deviation of the radius of the cell to its mean. These features were chosen to highlight the main difference in shape between normal and crenated RBCs. Also, dimensionless quantities should be used so that the method is invariant to size.
The features were easily obtained by using the command follow() in Scilab to determine the coordinates of the contour of the cells. The perimeter is just the number of coordinates in the contour, while the area is obtained using Green's theorem (Activity 2). The standard deviation of the radius is obtained from the coordinates of the contour by first subtracting the mean (x, y) from the (x, y) values of the contour. The set of radii was then obtained with the Pythagorean formula from the resulting (x, y), and its mean and standard deviation were calculated. The mean of the feature sets of the five training objects defines the features of the corresponding class.
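The two features can be sketched in Python as below, assuming the contour is available as an ordered list of (x, y) points such as follow() returns. This is an illustration, not the code used in the activity. The function names are hypothetical, the standard deviation is taken in its population form (dividing by n) as an assumption, and the two test contours are synthetic shapes rather than real cells.

```python
import math

def shape_features(contour):
    """Return (perimeter**2 / area, std(radius) / mean(radius)) for a
    closed contour given as an ordered list of (x, y) points."""
    n = len(contour)
    perimeter = n  # as in the text: the number of contour coordinates

    # Area from Green's theorem in its discrete (shoelace) form.
    twice_area = 0.0
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    area = abs(twice_area) / 2.0

    # Radii measured from the mean (x, y) of the contour points.
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    radii = [math.hypot(x - cx, y - cy) for x, y in contour]
    mean_r = sum(radii) / n
    # Population standard deviation (assumption; divides by n).
    std_r = math.sqrt(sum((r - mean_r) ** 2 for r in radii) / n)

    return perimeter ** 2 / area, std_r / mean_r

def sample_contour(radius_at, n=360):
    """Hypothetical helper: sample a closed contour r = radius_at(angle)."""
    points = []
    for i in range(n):
        t = 2 * math.pi * i / n
        r = radius_at(t)
        points.append((r * math.cos(t), r * math.sin(t)))
    return points

# Synthetic stand-ins: a circle and an outline with eight 'spikes'.
round_cell = sample_contour(lambda t: 50.0)
spiky_cell = sample_contour(lambda t: 40.0 + 10.0 * math.cos(8 * t))

round_features = shape_features(round_cell)
spiky_features = shape_features(spiky_cell)
print(round_features)
print(spiky_features)
```

For the circle the radius is constant, so the second feature is essentially zero, while the spiky outline gives a clearly larger value. This is the kind of contrast the features are meant to capture between normal and crenated cells.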
Now, for each cell, the set of features was extracted as described previously to serve as the test set, and this was compared to the mean for the normal and the mean for the crenated RBCs. By minimum distance classification, if the set of features for a cell is closer to the mean for the normal RBCs, then the cell is classified as normal. Otherwise, it is classified as crenated. The resulting classification was verified by comparison with visual classification.
Analysis was also done by looking at the scatter plots of the object features, shown below. These show whether the features of one class are well separated from those of the other. The training set features are indeed isolated from each other. However, the test set still has some cells 'creeping' into the region of the other class. There is a large spread in the features of the crenated RBCs, as seen in the plots, more obviously in trial 2. Still, there is a definite region into which most of the objects of a class fall.

Trial 1

Trial 2
The results of the minimum distance classification are summarized in the table below. Based on the scatter plots, the results seem logical: in trial 1, the normal and crenated features are more clearly separated than in trial 2. Moreover, the features of the crenated RBCs deviate more from their mean than those of the normal RBCs, and many crenated RBCs lie very close to the normal RBC features. This produced a very low percentage of correctly classified crenated RBCs. Since the normal RBCs lie very close to their mean, they have a very high percentage of correct classification. One major reason for the better results in trial 1 is that its image has a higher resolution than the one used for trial 2. The cells occupy more pixels, which accentuates the difference in shape between normal and crenated RBCs. Low-resolution images are difficult to handle in the classification process: the 'spikes' at the edge of a crenated RBC are no longer evident when the cell is represented by fewer pixels.



From the results obtained, this technique is a feasible method for classifying RBCs by computer vision. After initializing the process by taking known crenated and normal RBCs to compute the mean sets of features, automatic classification can be applied to RBCs, for example, in a whole slide or across different slides, using the same setup (magnification, camera saturation, etc.). Some of the limitations are (i) the mean set of features can only be used in the same setup, (ii) overlapping cells are not classified, and (iii) high-resolution setups are needed for accurate cell classification.
For this activity, I would like to give myself a grade of 10 for doing a very good job. I used a very interesting sample, and I think my classification is very satisfactory for this kind of sample. The discussion I provided is also fairly extensive.
I would like to thank our professor, Dr. Gay Jane Perez, for the guidance in doing this activity, and Ms. Jica Monsanto and Mr. Jay Samuel Combinido for their help in the process of determining features for classification.

Reference
M. Soriano, Applied Physics 186 Activity 14 - Pattern Recognition Manual, 2008.
