Director’s Challenge IT Overview

Presentation Transcript


  1. Director’s Challenge IT Overview NCICB NCICB-SAIC

  2. Agenda • Goal • Build a Microarray Data and Analysis Portal • Development Overview • Object and Data Models • Software Development Process • Application Architecture • Currently Developed and Deployed Functionality • Future Enhancements (Use Cases) • Data Analysis • Biological Analysis (Integration with caBIO) • LIMS • EVS

  3. Overall Goal of Microarray Data Portal Transform Numerical Data into Biological Data • Provide Convenient Means of Submitting Experiments • Variety of Methods to Query the Database • Integrate and Develop Cluster and Pattern Analysis Tools • Integrate Ontology and Annotation Tools • Develop Architecture to Facilitate Items 1-4

  4. Reporting a Microarray Experiment • Experimental Data • Image Files • Data Files • Experimental Description • Purpose of Study • Experimental Details • Sample information • Clinical Data
     Standard Needed to Describe Microarray Experiment

  5. Microarray Standards • MIAME • Minimum Information About a Microarray Experiment • Experimental Design, Array Design, Hybridization, Samples, Measurements and Normalization Controls • MAML • Microarray Markup Language • XML Implementation of the MIAME Standard • Industry moving towards MAGEML • MAGEML • XML Implementation of the MIAME Standard • Formed Via Merge of MAML and GEML Standards • De Facto Widespread Industry Support

  6. Director’s Challenge Data Model • Based Upon MAGEML Object Model • Facilitates Data Exchange and Standard Upload • Support for Annotation, Ontology, and Analysis Tools • Tables Required for Integration with caBIO • Additional tables to hold clinical data in upcoming months • One of the First Public MAGEML Databases • Instantiated and Populated • Schema Available for Download • ERwin Diagram Available for Download

  7. Director’s Challenge Artifacts • dc.nci.nih.gov/informatics • Object Model • Use Cases • Sequence Diagrams • Data Models • SQL script • ERwin diagram • Java API • Links to Industry Standards • Evolution of Existing Microarray Tools

  8. Director’s Challenge Object Model • Based Upon the MAGEML Standard • Classes Model MAGEML Elements and Relationships • Objects Encapsulate Data and Methods to Access Data • Java Applications can Easily Exchange Objects • Objects Written to MAGEML (XML) for Non-Java Applications • Integration with NCICB caBIO Objects

  9. NCICB Development Standards • Java Programming Language • Objects Encapsulate Data and Methods to Manage Data • Many APIs to Facilitate Rapid Application Development • Open Standards • Java Community Process • Industry Standards (e.g., MAGEML) • Open Source Architecture • Web and Application Servers (Apache, Tomcat, JBoss) • No Database-Specific Code (Triggers, Stored Procedures) • Open Access • XML, HTTP, SOAP, RDF – Variety of Languages

  10. Software Development Process • Rational Unified Process (RUP) • Use Cases to Capture Business Requirements • Sequence Diagrams Mapping Process Across Application Layers • Class Diagrams to Map Business Concepts to Class Objects • eXtreme Programming • Assignments Partitioned Between Small Teams • Application Segmented into Smaller Deliverables • Tailored Specifically for Dynamic Requirements Environment
      NCICB-SAIC Approach Combines Both, Resulting in an Iterative, Flexible, and Highly Responsive Software Development Process

  11. Director’s Challenge API • Design Patterns – Judicious Use of GoF and J2EE Patterns • DynamicJavaBean – Versatile Implementation of JavaBean • Object Factories – Control of Object Instantiation • UserProfileBean – Customizes User Experience • Metadata-Driven Configuration – Ease of Development
      Result is an Extensible and Configurable API

  12. Available Configuration Parameters • Database or DTD Metadata • Error Codes (Type, Message) • Dependent Fields • TextParseBean Field – Element/Column • Field Name – Field Title • Form Name – Elements/Tables • Placeholder Name – Column/Element Name • Required Fields • Retrieved Element Name – Id • Non-Persisted Elements • Auto Assigned Elements – Form • Query Statement Metadata
      Configuration Parameters Loaded into Memory on Startup

  13. Metadata-Driven Configuration – Mapping Pkg • Metadata XML File Generated by DatabaseMetadataUtil Class • Encapsulates Referential Constraints for XML or Database • Element – Data Type Mapping (for Conversion or Type Check) • Element – Primary Key or Id Mapping • Exported Keys Map – Associative Table or IDREFS Constraints
      Metadata Parameters Generated and Loaded on Startup

  14. Benefits of Dynamic Configuration • Changes to Database – Auto Update of O/R Mapping Layer • Specify Application Behavior via XML Configuration Files • Facilitates DynamicJavaBean Implementation • Object Reuse via Object Factories • Redeployment or Reconfiguration via XML Files – No Recompile
      Configuration Parameters Loaded into Memory on Startup
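
As a rough illustration of configuration that is read from XML once at startup and then served from memory, here is a minimal Java sketch. It is not the Director's Challenge code: the class `StartupConfig`, the key prefixes `title.` and `required.`, and the sample document are all hypothetical, and the standard `java.util.Properties` XML format stands in for whatever file layout the project actually used.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Hypothetical in-memory configuration cache, filled once at startup. */
final class StartupConfig {
    private final Map<String, String> values;

    private StartupConfig(Map<String, String> values) {
        this.values = Collections.unmodifiableMap(values);
    }

    /** Parses an XML properties document and copies every entry into memory. */
    static StartupConfig fromXml(String xml) {
        Properties props = new Properties();
        try {
            props.loadFromXML(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException("Could not read configuration", e);
        }
        Map<String, String> copy = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            copy.put(name, props.getProperty(name));
        }
        return new StartupConfig(copy);
    }

    String get(String key, String fallback) {
        return values.getOrDefault(key, fallback);
    }

    /** Assumed convention: "required.<field>" = true marks a mandatory form field. */
    boolean isRequired(String field) {
        return Boolean.parseBoolean(get("required." + field, "false"));
    }
}

final class StartupConfigDemo {
    // Sample document in the java.util.Properties XML format (illustrative keys only).
    static final String SAMPLE_XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
      + "<!DOCTYPE properties SYSTEM \"http://java.sun.com/dtd/properties.dtd\">\n"
      + "<properties>\n"
      + "  <entry key=\"title.sampleName\">Sample Name</entry>\n"
      + "  <entry key=\"required.sampleName\">true</entry>\n"
      + "</properties>\n";

    public static void main(String[] args) {
        StartupConfig config = StartupConfig.fromXml(SAMPLE_XML);
        System.out.println(config.get("title.sampleName", "?"));
        System.out.println(config.isRequired("sampleName"));
    }
}
```

Changing a field title or a required flag in this arrangement means editing the XML and restarting, with no recompile, which is the benefit the slide describes.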

  15. Dynamic Java Beans • Properties Are Not Hardcoded into DynamicJavaBean • Implementing Classes Extend or Are Composed of a Hashtable • Facilitates Object Reuse via Factory Design Pattern • Metadata-Driven, Dynamic Object Definition • Changes to Class Definition – XML File Update
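
The idea of a bean whose properties are defined by metadata rather than by fields can be sketched as follows. This is an assumption-laden illustration, not the project's `DynamicJavaBean`: the names `DynamicBean` and `DynamicBeanFactory` are hypothetical, a `LinkedHashMap` is used in place of the `Hashtable` the slide mentions, and property lists passed to `define` stand in for definitions that would come from an XML file.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative bean composed of a map: property names are data, not fields. */
final class DynamicBean {
    private final String type;
    private final Set<String> allowed;
    private final Map<String, Object> values = new LinkedHashMap<>();

    DynamicBean(String type, Set<String> allowed) {
        this.type = type;
        this.allowed = allowed;
    }

    String getType() {
        return type;
    }

    Object get(String property) {
        check(property);
        return values.get(property);
    }

    void set(String property, Object value) {
        check(property);
        values.put(property, value);
    }

    // Rejects names that the bean definition does not declare.
    private void check(String property) {
        if (!allowed.contains(property)) {
            throw new IllegalArgumentException(type + " has no property '" + property + "'");
        }
    }
}

/** Hypothetical factory: one class serves every bean type that is defined. */
final class DynamicBeanFactory {
    private final Map<String, Set<String>> definitions = new LinkedHashMap<>();

    /** Registers a bean type; in a real system this would be read from metadata. */
    void define(String type, String... properties) {
        definitions.put(type, new LinkedHashSet<>(Arrays.asList(properties)));
    }

    DynamicBean create(String type) {
        Set<String> properties = definitions.get(type);
        if (properties == null) {
            throw new IllegalArgumentException("Unknown bean type: " + type);
        }
        return new DynamicBean(type, properties);
    }
}

final class DynamicBeanDemo {
    public static void main(String[] args) {
        DynamicBeanFactory factory = new DynamicBeanFactory();
        factory.define("Experiment", "name", "purpose");
        DynamicBean experiment = factory.create("Experiment");
        experiment.set("name", "demo run");
        System.out.println(experiment.getType() + ": " + experiment.get("name"));
    }
}
```

Adding a property here is a change to the definition passed to `define`, not to any class, which mirrors the slide's point that class definition changes become an XML file update.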

  16. Director’s Challenge Architecture ManagerServlet RequestHandler Input Persistence RDBMS
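
The slide lists the layers only by name, so the following Java sketch is one plausible reading of that flow, not the actual design. A front controller hands a request to a handler, the handler checks the input, and a persistence interface hides the database. Every name here (`Manager`, `SubmitExperimentHandler`, `InMemoryPersistence`, the `experimentName` parameter) is hypothetical, the servlet API is left out to keep the example self-contained, and an in-memory list stands in for the RDBMS.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Persistence layer boundary; the RDBMS would sit behind this interface. */
interface Persistence {
    int save(Map<String, String> record);
}

/** Stand-in for the database so the sketch runs without JDBC. */
final class InMemoryPersistence implements Persistence {
    final List<Map<String, String>> rows = new ArrayList<>();

    @Override
    public int save(Map<String, String> record) {
        rows.add(new HashMap<>(record));
        return rows.size(); // row count doubles as a simple generated id
    }
}

/** One handler per kind of request. */
interface RequestHandler {
    String handle(Map<String, String> params);
}

/** Input step plus persistence call for a hypothetical "submit" action. */
final class SubmitExperimentHandler implements RequestHandler {
    private final Persistence persistence;

    SubmitExperimentHandler(Persistence persistence) {
        this.persistence = persistence;
    }

    @Override
    public String handle(Map<String, String> params) {
        String name = params.get("experimentName");
        if (name == null || name.trim().isEmpty()) {
            return "error: experimentName is required";
        }
        return "saved experiment " + persistence.save(params);
    }
}

/** Plays the role of the front controller: routes an action to its handler. */
final class Manager {
    private final Map<String, RequestHandler> handlers = new HashMap<>();

    void register(String action, RequestHandler handler) {
        handlers.put(action, handler);
    }

    String dispatch(String action, Map<String, String> params) {
        RequestHandler handler = handlers.get(action);
        return handler == null ? "error: unknown action " + action : handler.handle(params);
    }
}

final class ArchitectureDemo {
    public static void main(String[] args) {
        Manager manager = new Manager();
        manager.register("submit", new SubmitExperimentHandler(new InMemoryPersistence()));
        Map<String, String> params = new HashMap<>();
        params.put("experimentName", "demo");
        System.out.println(manager.dispatch("submit", params));
    }
}
```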

  17. Challenge of Experiment Submission: Capturing Rich Set of MIAME Information vs. Ease of Use • Prepopulate Fields with UserProfile and FormInputBean Data • Dynamically Tailor Form Fields Based Upon Previous Entries • Personalize Drop Down Lists via UserProfile Preferences • Capture Common Field Data and Autogenerate Missing Items
      Transform Numerical Data into Biological Data

  18. Standard “Catch All” Form

  19. Customizing Page View

  20. Targeted Submission Functionality • XML or Form-Based Submission of MIAME-MAGE Information • Upload of Data Text Files • Upload or Manual Submission of Image Files • Leverage Architecture Design to Facilitate Ease of Use

  21. Prepopulating a Page

  22. Queries • Currently Implemented • Basic search/detail • Hardware Search • Software Search • Coming Soon • Advanced Search/detail • Protocol Search • Chip Search

  23. Future Directions • Implement Domain Object Model • Fully Implement All Search Use Cases • Develop Annotation and Ontology Tools (Integrate with caBIO) • Integrate xClust Cluster Analysis Tool • Data Retrieval and Processing to Support Analysis Tools • Develop Pattern Analysis Tools • Batch Upload/Download • Generate MAGEML XML file upon experiment submission

  24. Integration with NCICB caBIO and CGAP

  25. Integration with NCICB caBIO • Value-added Functionalities • Java API for Annotations and Ontologies • Easy Retrieval of Information in the Form of Objects

  26. Annotation Using caBIO to Access Gene Information • Reporter on Chip: • IMAGE clone • Affy probe set • Genbank ID • UniGene ID • Gene Info: • Annotation • Ontology • caBIO • Sequence • Gene AnnotationBean

  27. Gene Ontology: Gene Expression by Functional Aspects • OntologyBean • (GoOntology) • getGenes() • getAllGenes() • (ontology/children) • Genes • Sequences • Categorize genes of interest • Explore data by gene categories • Reporters on Chip
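
The slide pairs `getGenes()` and `getAllGenes()` with "ontology/children". One reading, offered here only as an assumption, is that the first returns genes annotated directly to a term and the second also collects genes from its child terms. The sketch below illustrates that reading with a hypothetical `GoTermNode` class and placeholder term and gene names; it is not the caBIO `OntologyBean` API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical ontology node: a term, its directly annotated genes, and child terms. */
final class GoTermNode {
    private final String name;
    private final Set<String> genes = new LinkedHashSet<>();
    private final List<GoTermNode> children = new ArrayList<>();

    GoTermNode(String name, String... directGenes) {
        this.name = name;
        genes.addAll(Arrays.asList(directGenes));
    }

    /** Adds a more specific term below this one; returns this node for chaining. */
    GoTermNode addChild(GoTermNode child) {
        children.add(child);
        return this;
    }

    String getName() {
        return name;
    }

    /** Genes annotated directly to this term only. */
    Set<String> getGenes() {
        return new LinkedHashSet<>(genes);
    }

    /** Genes of this term plus, recursively, all descendant terms (duplicates removed). */
    Set<String> getAllGenes() {
        Set<String> all = new LinkedHashSet<>(genes);
        for (GoTermNode child : children) {
            all.addAll(child.getAllGenes());
        }
        return all;
    }
}

final class GoTermNodeDemo {
    public static void main(String[] args) {
        // Placeholder names; no real GO terms or gene symbols are implied.
        GoTermNode parent = new GoTermNode("parent-term", "GENE_A")
            .addChild(new GoTermNode("child-term", "GENE_A", "GENE_B"));
        System.out.println(parent.getGenes());
        System.out.println(parent.getAllGenes());
    }
}
```

Using a set means a gene that appears under several child terms is counted once, which matters because a term in the Gene Ontology can have more than one parent.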

  28. Gene Ontology Implementation. Goal: Enable the user to obtain microarray data for a list of genes based on a gene ontology term. Steps: • G1. Get a GO term by browsing the GO Browser or by searching CGAP’s GO database • G2. Get a gene list based on the user-specified GO term • G3. Get expression data for the gene list by searching the microarray database
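
Steps G2 and G3 amount to two lookups chained together: term to gene list, then gene list to expression values. A minimal Java sketch of that chain follows, with in-memory maps standing in for the CGAP GO database and the microarray database. The class `GoExpressionQuery`, its method names, and the sample data in the demo are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical query helper chaining GO term -> genes -> expression values. */
final class GoExpressionQuery {
    private final Map<String, List<String>> genesByTerm;   // stand-in for the GO database
    private final Map<String, double[]> expressionByGene;  // stand-in for the microarray database

    GoExpressionQuery(Map<String, List<String>> genesByTerm,
                      Map<String, double[]> expressionByGene) {
        this.genesByTerm = genesByTerm;
        this.expressionByGene = expressionByGene;
    }

    /** Step G2: gene list for a user-specified GO term (empty if the term is unknown). */
    List<String> genesFor(String goTerm) {
        return genesByTerm.getOrDefault(goTerm, Collections.<String>emptyList());
    }

    /** Step G3: expression data for each gene in the list that has measurements. */
    Map<String, double[]> expressionFor(String goTerm) {
        Map<String, double[]> result = new LinkedHashMap<>();
        for (String gene : genesFor(goTerm)) {
            double[] values = expressionByGene.get(gene);
            if (values != null) {
                result.put(gene, values);
            }
        }
        return result;
    }
}

final class GoExpressionQueryDemo {
    public static void main(String[] args) {
        // Made-up placeholder data for illustration only.
        Map<String, List<String>> genesByTerm = new HashMap<>();
        genesByTerm.put("example-term", Arrays.asList("GENE_A", "GENE_B"));
        Map<String, double[]> expressionByGene = new HashMap<>();
        expressionByGene.put("GENE_A", new double[] {1.5, 2.0});

        GoExpressionQuery query = new GoExpressionQuery(genesByTerm, expressionByGene);
        for (Map.Entry<String, double[]> e : query.expressionFor("example-term").entrySet()) {
            System.out.println(e.getKey() + " -> " + Arrays.toString(e.getValue()));
        }
    }
}
```

Genes on the list that have no reporter on the chip are skipped rather than reported as errors; whether the real portal behaves that way is not stated in the slides.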

  29. Gene Ontology Term. Goal: Enable the user to obtain an accurate GO term. Approaches: • G1. Get GO term by browsing GO Browser • G2. Get GO term by searching CGAP’s GO database • Vocabulary Control • Help users determine a GO term for their biological question

  30. Summary • Capture MIAME data in a MAGEML-compliant database • Data Portal – value-added functionality • Bioinformatics Integration • Analytic tools

  31. Acknowledgements • Development Team • John Yost • Jennifer Long • Cheng-Cheng Huang • Nick Xiao • Johnita Beasley • Additional Thanks • caBIO • CGAP • madB
