3.2 Use Cases Overview

Tree Star analyzed data sets from three use cases: synthetic data, graft-versus-host disease (GvHD), and simian immunodeficiency virus (SIV).

Synthetic data lets us exercise our analysis workflow against known results. These data sets were developed to evaluate classification algorithms. Synthetic data has two key advantages over real data when characterizing algorithms: 1) the correct classification of each event is known, and 2) the data can model specific confounding factors, which particular algorithms should handle with varying degrees of success. Matching each data set to algorithms that are well suited to its characteristics should improve the quality of the resulting classifications.
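As a minimal sketch of why known labels matter, the Python example below generates synthetic events whose true class is known and scores a classifier against them. Everything in it is an assumption made for illustration and is not taken from the Tree Star data sets: the two one-dimensional Gaussian populations, the function names, the threshold classifier, and the use of population overlap (the `separation` parameter) as the confounding factor.

```python
import random


def make_synthetic_events(n_per_class, separation, seed=0):
    """Draw 1-D events from two Gaussian populations with known labels.

    `separation` is the distance between the population means; a small
    value makes the populations overlap (the hypothetical confounder).
    """
    rng = random.Random(seed)
    values, labels = [], []
    for label, mean in ((0, 0.0), (1, separation)):
        for _ in range(n_per_class):
            values.append(rng.gauss(mean, 1.0))
            labels.append(label)
    return values, labels


def threshold_classifier(values, cutoff):
    """Assign class 1 to every event above the cutoff, class 0 otherwise."""
    return [1 if v > cutoff else 0 for v in values]


def accuracy(predicted, truth):
    """Fraction of events whose predicted class matches the known class."""
    matches = sum(p == t for p, t in zip(predicted, truth))
    return matches / len(truth)


# Well-separated populations: the simple threshold should do well.
easy_values, easy_labels = make_synthetic_events(500, separation=6.0)
easy_accuracy = accuracy(threshold_classifier(easy_values, 3.0), easy_labels)

# Overlapping populations: the same classifier is expected to do worse.
hard_values, hard_labels = make_synthetic_events(500, separation=1.0)
hard_accuracy = accuracy(threshold_classifier(hard_values, 0.5), hard_labels)

print(f"well separated: {easy_accuracy:.3f}, overlapping: {hard_accuracy:.3f}")
```

Because the true class of every event is known, the accuracy can be computed directly, and varying one confounding factor at a time shows how a given algorithm degrades.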

For the GvHD and SIV use cases, we found more variability among manual gaters than we had expected. As anticipated, variance between human classifiers decreased with training. We need open communication with the contributing collaborators about their gating decision logic, both to refine the standard operating procedures for the data analysis and to decrease the variance between manual gaters. We used the MiFlowCyt standard to describe the use cases.
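One simple way to quantify variability between manual gaters is the coefficient of variation of the population frequency that each gater reports for the same sample. The Python sketch below is an illustration only: the choice of metric, the function name, and the frequency values are all hypothetical and are not results from the GvHD or SIV use cases.

```python
from statistics import mean, stdev


def coefficient_of_variation(frequencies):
    """Sample standard deviation divided by the mean.

    `frequencies` holds one value per gater: the percentage of events
    that the gater assigned to a given population in the same sample.
    """
    return stdev(frequencies) / mean(frequencies)


# Hypothetical percentages from four gaters on one sample (invented values).
before_training = [10.0, 14.0, 18.0, 12.0]
after_training = [12.0, 13.0, 14.0, 13.0]

cv_before = coefficient_of_variation(before_training)
cv_after = coefficient_of_variation(after_training)

print(f"CV before training: {cv_before:.3f}")
print(f"CV after training:  {cv_after:.3f}")
```

A lower coefficient of variation means the gaters agree more closely, so tracking it across iterations of the standard operating procedures would give one measure of whether the refinements are working.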

Tree Star is working with Ryan Brinkman's and Joern Schmitz's labs to clarify and update the standard procedures for the next iteration of this project. We are building the infrastructure to automate the generation and processing of workspaces for human classification, which will enable larger-scale experiments on the nuances of gating.

Please see these reports for more information:

3.2.1 Synthetic Data
3.2.2 Graft vs. Host Disease (GvHD)
3.2.3 Simian Immunodeficiency Virus (SIV)