2.1.6.1.1 Software Requirements and Specifications

Draft Version

Infrastructure: Database
The pipeline that processes files from their original data through analysis and meta-analysis is simple. However, the project was initially expected to use simple file-system-based storage, and it has since grown in scope to need a database at the center of its architecture. Supporting external classifiers and our workflow requires three different intermediate states as classifications are processed. Comparing classifiers introduces a combinatorial explosion, and comparing metrics adds another dimension to the scoring matrix. An automated repository is therefore essential for running large-scale experiments.
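To make the growth of the scoring matrix concrete, the following minimal Python sketch counts comparisons. It assumes that every unordered pair of classifiers is scored on every file under every metric; the function name count_comparisons and all names and counts in the example are hypothetical and are not taken from the project.

```python
from itertools import combinations


def count_comparisons(classifiers, metrics, n_files):
    """Count pairwise classifier comparisons across all metrics and files.

    Assumption (not from the project): every unordered pair of
    classifiers is scored on every file under every metric.
    """
    n_pairs = sum(1 for _ in combinations(classifiers, 2))
    return n_pairs * len(metrics) * n_files


if __name__ == "__main__":
    # Illustrative names only.
    classifiers = ["expert_a", "expert_b", "intern_a", "algorithm_a"]
    metrics = ["metric_1", "metric_2"]
    print(count_comparisons(classifiers, metrics, n_files=10))
```

With four classifiers there are six pairs, so two metrics over ten files already give 120 scores. Adding one more classifier adds four more pairs, which is why the counts grow faster than the number of classifiers.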

Requirements
A: Storage of raw data files, workspaces, reports and artifacts related to analysis.
B: Database associations of files to specific experiments, experts, and time series.
C: Ability to generate reports and statistics on experiments, by logical query.
D: Administrative interface for managing experiments and data files.
E: Client neutrality. Usable from a web applet, a plugin in FlowJo, or from R.
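Requirements B and C can be illustrated with a small sketch. The schema below is hypothetical: its table and column names are illustrative and are not taken from flowdx.sql, and SQLite (from the Python standard library) stands in for MySQL so that the example is self-contained. The helper names workspaces_for and build_demo_db are likewise assumptions.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not taken from flowdx.sql.
# SQLite stands in for MySQL here.
SCHEMA = """
CREATE TABLE experiment (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE agent (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    role TEXT NOT NULL                -- expert, intern, or algorithm
);
CREATE TABLE data_file (
    id INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    kind TEXT NOT NULL,               -- e.g. fcs, wsp, report
    experiment_id INTEGER NOT NULL REFERENCES experiment(id),
    agent_id INTEGER REFERENCES agent(id),
    time_point TEXT                   -- ISO date of the sample, if any
);
"""


def workspaces_for(conn, experiment_name):
    """Return (path, agent name, role) for every wsp file of an experiment."""
    rows = conn.execute(
        """
        SELECT f.path, a.name, a.role
        FROM data_file AS f
        JOIN experiment AS e ON e.id = f.experiment_id
        JOIN agent AS a ON a.id = f.agent_id
        WHERE e.name = ? AND f.kind = 'wsp'
        ORDER BY f.path
        """,
        (experiment_name,),
    )
    return rows.fetchall()


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO experiment VALUES (1, 'SIV')")
    conn.execute("INSERT INTO agent VALUES (1, 'Agent1', 'expert')")
    conn.execute(
        "INSERT INTO data_file VALUES (1, 'siv/a.wsp', 'wsp', 1, 1, '2007-03-29')"
    )
    conn.execute(
        "INSERT INTO data_file VALUES (2, 'siv/a.fcs', 'fcs', 1, NULL, '2007-03-29')"
    )
    return conn
```

Because the associations live in tables rather than in directory names, a report such as "all workspaces for one experiment" becomes a single parameterized query, and any client that can reach the database can issue it.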

Design
The design pattern used in our repository is based on the common LAMP solution stack, using Linux, Apache, MySQL, and PHP in a standard combination. We add Tomcat as a wrapper around FlowJo, our existing cytometry analysis software. This allows us to run analysis on a server in a scripted environment.
A: The MySQL relational database management system (RDBMS) [66] was selected for the secure storage of project data. It is a well-supported, industry-standard database solution.
B: The Apache Web server [67], in concert with the Tomcat Application Server [68], was selected for application hosting. Apache and Tomcat are common technologies that are well suited to the needs of reliable, high-performance, data-centric applications.
C: The FlowJo Engine has been implemented as a TCP server application [69]. Multiple engine instances run on numerous servers, providing strong scalability and reliability.
D: Open scripting languages, such as Ruby, PHP, or Perl, are used for managing access to the engine, analysis results, data collation, etc. The Java language [70] and the Eclipse IDE [76] are used in tool creation, and tools are wrapped in Tomcat.
E: The system is designed around Linux [71] servers, but all of these tools also support most Unix flavors (Solaris, BSD, OS X) as well as Microsoft platforms.

The first implementation is found at: http://flowdx.com/flowdx.sql

How does FlowJo communicate with Tomcat?

Waiting to get a server FlowJo engine running on Linux.

Can I ask for a certain file on that server?

Can I delete a file? It should fail.

Can I get a file list?

Have there been updates to the db?

Are the workspaces in the db?

  • Schema: see flowdx.sql and get the file table image.
    A new schema image is needed.
  • Location: on which server? Right now it is on Hood.
  • Contents and status
  • URL of the user interface to view the match ratio results

Sample queries:

  1. For SIV, display a list of the available .wsps from the db and the target pops and allow me to select which to use for the consensus and which .wsp to compare against the consensus.
  2. Build a consensus from the expert gaters for all fcs files for all target populations for SIV. Let me look at these files and see how many events have a probability of inclusion of 0.5 (i.e., ambiguous inclusion in the gates). Does one fcs file stand out as being difficult to gate?
  3. Compare all CD4 IL-2 populations (including those from algorithms) against the consensus of interns.
  4. Compare all analyses of GvHD against Agent1's gating (build the tally with just Agent1's popmask).
  5. Show the consensus files (= tally file) for all manual gaters for SIV for all target populations and all fcs files.
  6. Create a .wsp file with the CD4 IL2 populations (exported .fcs files) for SIV monkey 2 for time-point March 29, 2007 and group by agent role (expert, intern, or algorithm). This will enable an overlay plot.
  7. Show the match ratio results of query 6 using the experts as the consensus.
    Calculate match ratio of GvHD CD4, CD8b population using experts as the consensus for patient1.
  8. Calculate the match ratio of ANNs on synthetic data using John, Aaron, and Maciej as the consensus.
  9. Show all .wsp and populations from the Intern gating.
  10. Show all the populations from all workspaces for fcs fileID.
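
The consensus step in query 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: each gater's result for one population in one fcs file is assumed to be available as a popmask, represented here as a list of booleans with one entry per event. That representation and the function names are hypothetical. The tally counts how many gaters included each event, and the probability of inclusion is the tally divided by the number of gaters.

```python
from fractions import Fraction


def build_tally(popmasks):
    """Count, per event, how many gaters included it in the population."""
    if not popmasks:
        raise ValueError("at least one popmask is required")
    n_events = len(popmasks[0])
    if any(len(mask) != n_events for mask in popmasks):
        raise ValueError("all popmasks must cover the same events")
    return [sum(1 for mask in popmasks if mask[i]) for i in range(n_events)]


def inclusion_probabilities(popmasks):
    """Fraction of gaters that included each event (exact arithmetic)."""
    n_gaters = len(popmasks)
    return [Fraction(count, n_gaters) for count in build_tally(popmasks)]


def count_ambiguous(popmasks):
    """Number of events whose probability of inclusion is exactly 0.5."""
    half = Fraction(1, 2)
    return sum(1 for p in inclusion_probabilities(popmasks) if p == half)


if __name__ == "__main__":
    # Two made-up gaters, four events.
    masks = [
        [True, True, False, False],
        [True, False, False, True],
    ]
    print(build_tally(masks))
    print(count_ambiguous(masks))
```

Exact fractions are used so that the comparison with 0.5 does not depend on floating-point rounding. Running count_ambiguous per fcs file and comparing the counts is one way to look for a file that is difficult to gate.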