3.3 Algorithm Results
3.3.1 Magnetic Gating Report
3.3.2 Probability Bin Clustering Analysis
3.3.3 WEKA ANN Report
3.3.4 Support Vector Machine (SVM) Report
3.3.5 Automated Classifier Report
Magnetic Gates: Preliminary results showed some clear limitations of this supervised classifier, especially the tendency to favor large clusters over small ones, thus choosing statistical significance over biological significance. This is being resolved with referential gating, which can set the position of the gate based on statistical information derived from other gates (control populations).
Probability Bin Clustering Analysis (PBCA): Preliminary results confirmed clear limitations of both of these unsupervised classifiers, especially their tendency to favor large clusters over small ones. We cannot constrain the algorithms to specific sub-population calculations in any automated way. We evaluated a subset of the SIV Use Case Data with FlowJo's probability clustering tool to find CD4+ and CD8+ T-cells with induced cytokine responses. It was impossible to find the precise subsets, even with extensive manual guidance. Cytokine clusters are outliers, whereas FlowJo's binning tool defines clusters based on event density. In samples where cytokine clusters are absent (negative controls), PBCA does not recognize the context of the tube, so it cannot base its analysis on the control and instead treats each tube individually. Using Fluorescence Minus One (FMO) controls in the analysis would provide better modeling here. PBCA also had difficulty analyzing the data in a single pass because of the many parameters in this use case. Many biologically irrelevant clusters were formed, and time was required to work out which clusters needed to be merged. This process (and its errors) varied among tubes, so there is no straightforward way to automate it.
Artificial Neural Networks (ANNs) are traditional pattern recognition frameworks that use weighted directed graphs to map input patterns to an output classification [21]. We have tested a set of ANNs that represent the major categories using two platforms for implementation, WEKA and MATLAB.
Waikato Environment for Knowledge Analysis (WEKA) is an open-source suite of machine learning tools that was used initially to test both the platform and the success of ANNs on flow data. In this evaluation, a set of ANNs classified the SIV data with a success rate greater than 85%, using ten-fold cross-validation as the evaluation metric. The WEKA environment was found to be limiting and was not used further, but the success rate achieved despite those limitations encouraged further study of ANNs using MATLAB. The same MATLAB procedures were used for both ANNs and SVMs, and the outcomes are discussed together.
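As a rough illustration of the ten-fold cross-validation used as the evaluation metric above, the following sketch splits events into ten disjoint folds and averages the held-out accuracy. It is not the WEKA procedure itself: the function names, the toy one-channel threshold classifier, and the synthetic data are all assumptions made for the example.

```python
import random

def k_fold_indices(n_events, k=10, seed=0):
    """Split event indices 0..n_events-1 into k disjoint test folds."""
    indices = list(range(n_events))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]

def cross_validate(events, labels, train_fn, predict_fn, k=10, seed=0):
    """Return the mean fraction of correctly classified held-out events.

    Assumes len(events) >= k so that no fold is empty.
    """
    accuracies = []
    for test_idx in k_fold_indices(len(events), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(events)) if i not in held_out]
        model = train_fn([events[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, events[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)

# Hypothetical stand-in classifier: a threshold halfway between the class means.
def train_threshold(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict_threshold(threshold, x):
    return 1 if x > threshold else 0

# Synthetic, well-separated one-channel data (not the SIV data).
events = [0.1 * i for i in range(50)] + [10 + 0.1 * i for i in range(50)]
labels = [0] * 50 + [1] * 50
accuracy = cross_validate(events, labels, train_threshold, predict_threshold)
print(f"mean held-out accuracy: {accuracy:.2f}")
```

Any classifier can be substituted by passing different `train_fn` and `predict_fn` callables; every event is held out exactly once, so the reported figure never includes an event the model was trained on.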
Support Vector Machines (SVMs) are a class of regularized multivariate classification models that are widely used for predictive modeling of multidimensional data. SVMs address non-linear boundary problems by including a basis function that maps the input data into a transformed space in which a linear discriminant can separate the classes. Using MATLAB, several experiments were performed for calibration and proof of concept. In this implementation we considered five ANN variants: a traditional feed forward network, a feed forward radial basis function network, a competitive network (LVQ), a probabilistic network, and a cascade forward network. We used two SVMs, one with a polynomial basis function and the other with a radial basis function. Complete descriptions of all classifiers are available in the SVM report.
Initially, all seven classifiers were used to process the synthetic data, both to demonstrate that classifying distinct populations is a trivial task for the classifiers and to associate error rates with the expected difficulty of classification. Calibration experiments were then performed using the SIV data, with lymphocyte identification as the metric. These experiments determined appropriate sizes for the training vectors, the design of the training data, and the degree of the polynomial basis function. Training vectors of one thousand events were selected for stability of result. Training data were drawn from multiple samples designated as controls, because this gave better results than using a single training sample. All data are available in the classifier report.
Positive training events were determined using the average score among experts as a metric. Events with a score of 1.000 were considered positive by all experts and were used as exemplars. Events with an average score of less than 0.5 were considered negative.
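The labeling rule described above (exemplar positives at an average expert score of 1.000, negatives below 0.5) can be sketched as follows. The data layout, the function name, and the decision to leave events scoring between 0.5 and 1.0 out of the training set are assumptions made for illustration; the text does not say how those intermediate events were handled.

```python
def select_training_events(expert_calls, positive_score=1.0, negative_cutoff=0.5):
    """Assign training labels from per-event expert calls.

    expert_calls maps a (hypothetical) event id to a list of 0/1 calls,
    one per expert. Returns three lists of event ids.
    """
    positives, negatives, unused = [], [], []
    for event_id, calls in expert_calls.items():
        score = sum(calls) / len(calls)  # average score among experts
        if score >= positive_score:
            positives.append(event_id)   # every expert called it positive
        elif score < negative_cutoff:
            negatives.append(event_id)   # fewer than half called it positive
        else:
            unused.append(event_id)      # assumption: not used for training
    return positives, negatives, unused

# Made-up calls from four experts on five events.
calls = {
    "e1": [1, 1, 1, 1],  # score 1.00
    "e2": [1, 1, 0, 1],  # score 0.75
    "e3": [0, 1, 0, 0],  # score 0.25
    "e4": [0, 0, 0, 0],  # score 0.00
    "e5": [1, 0, 1, 0],  # score 0.50
}
positives, negatives, unused = select_training_events(calls)
print(positives, negatives, unused)
```

Note that the negative set includes events on which the experts partly disagreed (such as `e3`), not only events that every expert rejected.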
Restricting the negative training events to those universally classified as negative was rejected, because experiments showed that the training data set became over-simplified and the resulting decision boundaries were unintuitive. The SVM and ANN classifiers were then used to identify the six populations at the bottom of the gating hierarchy: the populations expressing either CD4 or CD8 along with one of three cytokines. Match ratio was used to evaluate the results. Most classifiers achieved an average match ratio exceeding 0.9.
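For readers unfamiliar with the two SVM basis functions used in these experiments, the sketch below writes them in their common textbook kernel form, which computes the inner product in the transformed space without constructing that space explicitly. The parameter names (`degree`, `coef0`, `gamma`) and their default values are assumptions; the MATLAB implementation used here may parameterize them differently.

```python
import math

def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)

# Two made-up two-parameter events.
u, v = [1.0, 2.0], [3.0, 4.0]
print(polynomial_kernel(u, v, degree=2))  # (1*3 + 2*4 + 1)^2 = 144.0
print(rbf_kernel(u, u))                   # identical events give 1.0
```

The degree of the polynomial kernel is the quantity tuned in the calibration experiments; the radial kernel instead depends on the distance between events, so its value falls toward zero as two events move apart.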

Figure 1: Variance among multiple classification algorithms applied to SIV data.
We also examined alternative methods of evaluating the quality of classification results, as well as the match ratio metric itself. These metrics are especially useful for identifying which events are most difficult to classify. We used the match ratio to identify those events. Then, for each user and each automated classifier, we identified the events classified as positive and negative within the difficult-to-classify subset and built an MFI-based profile of each class. We also built a profile of the universally agreed-upon positive events, so that we could check whether, for example, the positive events that a user or automated classifier selected from the difficult-to-classify subset matched that profile. This analysis showed that the automated classifiers match the patterns most closely. Figure 2 displays the improvement of the classifiers over the experts in minimizing the centroid distance between the universally classified events and the disputed events, averaged over several classifiers and twelve samples.
| % Improvement of classifiers compared to expert | |
|---|---|
| Robust Mean MFI | 38 ± 11 % |
| Robust Mean Stdev | 9 ± 8 % |
Figure 2: Improvement of pattern recognition tools over experts in MFI, by both median and standard deviation, averaged over three pattern recognition tools and 12 data files.
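The centroid comparison summarized in Figure 2 could be computed along the following lines. This is a minimal sketch under stated assumptions: each event is a vector of per-channel intensities, a profile is reduced to its centroid, distance is Euclidean, and percent improvement is the relative reduction in distance. The report does not give the exact formula, and the numbers below are invented, not the study's data.

```python
import math

def centroid(events):
    """Per-channel mean of a list of equal-length event vectors."""
    n = len(events)
    return [sum(event[j] for event in events) / n for j in range(len(events[0]))]

def centroid_distance(events_a, events_b):
    """Euclidean distance between the centroids of two event sets."""
    return math.dist(centroid(events_a), centroid(events_b))

def percent_improvement(d_expert, d_classifier):
    """Assumed definition: relative reduction in centroid distance, in percent."""
    return 100.0 * (d_expert - d_classifier) / d_expert

# Invented two-channel intensities for illustration only.
agreed_positive = [[10.0, 20.0], [12.0, 22.0]]   # universally agreed positives
expert_disputed = [[5.0, 13.0], [5.0, 13.0]]     # disputed events an expert called positive
classifier_disputed = [[8.0, 17.0], [8.0, 17.0]] # disputed events a classifier called positive

d_expert = centroid_distance(agreed_positive, expert_disputed)
d_classifier = centroid_distance(agreed_positive, classifier_disputed)
improvement = percent_improvement(d_expert, d_classifier)
print(d_expert, d_classifier, improvement)
```

A positive value means the classifier's disputed positives sit closer to the agreed-upon positive profile than the expert's do; a negative value would mean the opposite.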
Both ANN and SVM can be evaluated through MATLAB or R scripts, whereas PBCA and magnetic gates will be calculated by FlowJo.