Architecting the Virtual Organisation

Rob Gill, Biology Domain, MDR-IT. March 2007.


Presentation Transcript


  1. Architecting the Virtual Organisation Rob Gill Biology Domain MDR-IT March 2007

  2. Drug Discovery Process. GSK is a research based pharmaceutical company – new products from R&D pipeline. [Diagram labels: Cost; Gene ID; Disease Association; Target Selection; Screening; Hit to Lead; Preclinical Candidate; PoC, Clinical; File; Launch; FTIM; Candidate Selection; Gene-to-Target; Target to Lead.] Identify gene or target protein important for disease mechanism. Identify and Optimise Inhibitors / modulators • Evaluate • Safety • Dose • Toxicity • Test Hypothesis in Man • Efficacy

  3. ? Informatics links the information used to find and validate new targets TARGETS SCREENS CANDIDATES DRUGS BUT….. All these tools are built and integrated within GSK

  4. R&D Strategy: Virtualisation • Total investment in drug research: • GSK R&D: ~15,000 Scientists, >$4B R&D spend • PhRMA member companies¥ (2005): >$51B R&D spend • Biotechs (2006): >$25B R&D spend • R&D strategy is to tap into external knowledge and expertise through a network of external alliances sharing the risk, reward and control • Centre of Excellence for External Drug Discovery (CEEDD) launched in 2005 ¥Includes GSK

  5. R&D Strategy: Globalisation • The rate of growth of scientific and technical graduates in Asia is outpacing the US and Europe • China is one of the fastest growing countries: • well over a million scientists and engineers are graduating each year (National Bureau of Statistics, 2005) • by 2010 it is predicted that China will surpass the United States in the number of science and engineering PhDs conferred (National Bureau of Economic Research) • R&D strategy is to be able to tap into this wealth of knowledge along with other developments in the global market.

  6. GSK IT Definition of Virtualisation “Virtualisation will allow GSK to operate seamlessly across organisational, process and technology boundaries enabling GSK to: exploit process expertise, partner effectively and flexibly outsource” Here’s why…. • Development of Alliances • Access to Service Providers (CROs) • Integrated Outsource Partners • Better access for key opinion leaders • Acquisition • Globalised distributed workforce

  7. Challenge for IT in GSK • Business trend is away from the “fortress” towards an “ecosystem” • However the GSK IT environment was not designed for this • Significant overhead required to be integrated into R&D/GSK processes • Most solutions are “bolted on” to our current infrastructure • Virtualisation is one of the key streams of GSK strategy

  8. Technical Architectures Speed Vs Flexibility Need to simplify our environment Necessity to support remote sites Reduce costs Development Support Information Architecture (IA) Capture processes across Biology Publish and maintain models Use to direct the generation of data services Creation/Use of Ontologies Identify “integration points” Design out overlaps Drive GSK Integration IT Techniques needed to support this transition

  9. Business Process Modelling & Model Driven Design • If we are to fully embrace the Virtualisation challenge we need to look again at how we capture and implement business processes. • Process Definition and Information Standards • Architects and business analysts, in conjunction with business process owners, define the process model and the information passing between process tasks using a business process modeling tool • Implementation • Software developers implement services and define business rules • Process Optimization and Monitoring • Business process owners can then monitor metrics to analyze and optimize the process • Better modeled processes and SOA approaches then open up GSK systems to a number of delivery approaches • Enterprise Service Bus (ESB) • Workflow Tools (InforSense, Taverna …)

  10. Business Process Management Architecture. Before: [Diagram: Inventory System, QC System, Request System and Delivery System linked by email and phone.] Bespoke point-to-point interfaces between systems and ad hoc communication do not allow the business process to be easily reconfigured, outsourced, or tracked. Workflow is either hard-coded into the application or handled by ad hoc communication. After: [Diagram: Process Server with the tasks Request Product, Build Product, QC Product, Deliver Product; a Task queue; a Process Monitor (captures metrics); connected to Product system, QC System, Delivery System and Request System.] Workflow is externalized and visible and point-to-point interfaces are eliminated. This allows the business process to be easily reconfigured and outsourced, and metrics to be tracked.
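
The externalised workflow on slide 10 can be illustrated with a minimal Python sketch. It assumes in-process handler functions stand in for the Request, Product, QC and Delivery systems. The names `ProcessServer`, `ProcessMonitor`, `run_demo` and the order IDs are hypothetical; the slide does not name an implementation.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Tuple

# Step names taken from the slide; everything else is illustrative.
STEPS = ["Request Product", "Build Product", "QC Product", "Deliver Product"]


@dataclass
class ProcessMonitor:
    """Stands in for the 'Process Monitor (captures metrics)' box."""
    events: List[Tuple[str, str]] = field(default_factory=list)

    def record(self, order_id: str, step: str) -> None:
        self.events.append((order_id, step))

    def count(self, step: str) -> int:
        return sum(1 for _, s in self.events if s == step)


class ProcessServer:
    """Owns the step order and a task queue; systems never call each other."""

    def __init__(self, handlers: Dict[str, Callable[[str], None]],
                 monitor: ProcessMonitor, steps: List[str] = STEPS) -> None:
        self.handlers = handlers
        self.monitor = monitor
        self.steps = list(steps)  # reconfigurable without touching handlers
        self.queue: Deque[Tuple[str, int]] = deque()

    def submit(self, order_id: str) -> None:
        self.queue.append((order_id, 0))

    def run(self) -> None:
        while self.queue:
            order_id, index = self.queue.popleft()
            step = self.steps[index]
            self.handlers[step](order_id)        # delegate to the owning system
            self.monitor.record(order_id, step)  # workflow is visible here
            if index + 1 < len(self.steps):
                self.queue.append((order_id, index + 1))


def run_demo() -> Tuple[List[str], ProcessMonitor]:
    log: List[str] = []
    # Hypothetical handlers: each just logs that its system did the work.
    handlers = {
        step: (lambda order_id, s=step: log.append(f"{s}:{order_id}"))
        for step in STEPS
    }
    monitor = ProcessMonitor()
    server = ProcessServer(handlers, monitor)
    server.submit("order-1")
    server.submit("order-2")
    server.run()
    return log, monitor
```

Because the step order lives only in the process server, swapping a handler for an external provider or reordering steps does not require changes to the other systems, which is the property the slide attributes to this architecture.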

  11. Use of Information Architecture • Business Process Management takes us closer to the Virtualisation goals • However building this on top of an unmanaged data architecture still leaves us unable to fully benefit from the distributed approach • To properly Virtualise we need to both understand business process and have a managed data architecture.

  12. Connection Spaghetti Historical Ad-Hoc Integration (As Required) Networks View Reagents View Gene View Platform View

  13. Hidden Spaghetti Move to “services” (But no IA) Networks View Reagents View Gene View Platform View Services Services Services

  14. Networks Data Protein Data Gene Data Omics Results Modelled data and business process BRAD Perspective Genetics Perspective Genomics Perspective Business Discovery Perspective Services Services Services Data Gene Platform Network

  15. Networks Data Protein Data Gene Data Omics Results Workflow Analysis Visualise Get Gene Sequence Get Protein Sequence Get Gene Annotation Get platform results Get all results Filter By Technology Get closest neighbours Filter by Species Get All Pathways Gene Platform Network

  16. Why Workflow ? • Drives modularisation and aligns with model driven approach • Good for prototyping and process capture • Very fast response time for development • Captures scientific knowledge as part of the workflow design process • Built in flexibility for a rapidly changing environment • Supports Integrity and retention initiatives • Lowers the barrier of software development

  17. Target Application Architecture Visualization & Analysis Tools Workflow Analytics Data Integration Request Management Sequencing LIMS Study Design & Sample ID Generation Data Marts microArray LIMS Experimental & Analysis Results Bio Catalog Proteomics LIMS Sample Management Cellular LIMS Biological Inventory Metabolomics LIMS Reference Biology (Sequences, Genes, Proteins, Biological Networks)

  18. Biology Target Architecture Goals • All Biology applications made accessible to vendors and partners by hosting in an open environment. • Remove all major business workflow from individual applications and implement workflow control through business process management with well-defined Service Data Objects. • Look to standardize data formats and participate in semantic GRID computing

  19. Demonstrations of Workflow and Grid usage at GSK: POCs in the Biology Domain

  20. Demonstrators in the portfolio space • Workflow tools have been heavily used across both the Biology and Chemistry space • Some moving to production status • Still some Challenges to overcome (discussed later) • Primary use in process management and reporting • Portfolio space has some novel and potentially valuable opportunities. • Using Workflow from the “Top down” has the opportunity to cause integration across different scientific disciplines • Portfolio view will need to span the whole GSK space as we move to the virtualised environment

  21. Portfolio support using Workflow and Semantic Grid • Target is an overloaded term. • Confusion over meaning and identification • Controlled Vocabulary, Standards and inappropriate identification have led to a complex mix of data types which are difficult to traverse. • A GSK “target” is actually a scientifically generated concept linking a disease to a set of entities and relationships needing to be tested within models.

  22. Concept Capture “Gene” Data “Disease” Data “Compound” Data Concept Concept ID Sequence db Reagent db Assay db Assay Results Reagent ID Sequence ID Assay ID
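
As a rough illustration of the concept capture on slide 22, the Python sketch below stores a concept as a set of typed links to sequence, reagent and assay identifiers. The class `ConceptNetwork`, its methods, and every identifier in `build_demo` are hypothetical and are not taken from any GSK system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ConceptNetwork:
    """Maps a Concept ID to the (kind, entity ID) pairs it links to."""
    edges: Dict[str, Set[Tuple[str, str]]] = field(default_factory=dict)

    def link(self, concept_id: str, kind: str, entity_id: str) -> None:
        self.edges.setdefault(concept_id, set()).add((kind, entity_id))

    def entities(self, concept_id: str, kind: str) -> List[str]:
        """All entity IDs of one kind attached to a concept."""
        links = self.edges.get(concept_id, set())
        return sorted(e for k, e in links if k == kind)

    def concepts_using(self, kind: str, entity_id: str) -> List[str]:
        """Reverse lookup: which concepts reference this entity?"""
        return sorted(c for c, links in self.edges.items()
                      if (kind, entity_id) in links)


def build_demo() -> ConceptNetwork:
    net = ConceptNetwork()
    # Made-up IDs, for illustration only.
    net.link("concept-1", "sequence", "seq-A")
    net.link("concept-1", "reagent", "reagent-7")
    net.link("concept-1", "assay", "assay-3")
    net.link("concept-2", "sequence", "seq-A")
    net.link("concept-2", "assay", "assay-9")
    return net
```

With this shape, a question such as "which concepts share a given sequence" becomes a reverse lookup, for example `build_demo().concepts_using("sequence", "seq-A")`, rather than a join across separately keyed databases.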

  23. Target Management • Once captured as a network it is possible to query and search across “Target space” in a far more flexible and intuitive way. • Standards can be enforced and aligned with GSK requirements. • Automated analysis can be used to answer numerous questions about the inferences and results generated within a portfolio program. • Using services generated in the Biology and Chemistry space it is possible to annotate the portfolio in new and novel ways. And answer fundamental questions around portfolio progression. • This process uses simple independent services on top of available GSK data and is reusable

  24. Inventory System QC System Request System Delivery System email Gene ID phone phone phone Disease Association X Target Selection Preclinical Candidate PoC, Clinical X Drug Discovery Process (again) GSK is a research based pharmaceutical company – new “Concepts” for R&D pipeline File Launch Screening Hit to Lead

  25. 1. Workflow in reagent tracking POC • Using Workflow to manage reagent processing • Numerous questions can be asked: • Where is / how many Target(s) ? • Programs using a molecular target • Availability of reagents for a program • Lead series generated from a program • Mapping status of bio-reagents to Molecular Target • Which Bio-reagents used within a specific screen • Very process driven approach focussed on GSK data.

  26. ProtA ProtB Gene Info Query Assay Info Query Chemical Query Bioreagent Info Query Target Identification SoC Assay Development Compound 123 Multimeric protein

  27. Workflow challenges for Production Status • Service availability • Using services in this manner requires a high availability of services • Service Orchestration • Need to ensure that once started a workflow runs to completion • Service support • Can this be managed internally and externally
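
One way to approach the "runs to completion" requirement on slide 27 is to retry unavailable services and record which steps have finished, so a restarted workflow can skip them. The Python sketch below shows that idea only. The function `run_workflow`, the `_completed` key, the use of `ConnectionError` to signal an unavailable service, and the demo services are all assumptions, not a description of the tools used at GSK.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Service = Callable[[State], None]


def run_workflow(steps: List[Tuple[str, Service]], state: State,
                 max_attempts: int = 3) -> State:
    """Run steps in order, retrying a step when its service is unavailable.

    Finished step names are kept in state["_completed"], so calling this
    again with the same state resumes after the last completed step.
    """
    completed = state.setdefault("_completed", [])
    for name, service in steps:
        if name in completed:
            continue  # already done in an earlier run
        for attempt in range(1, max_attempts + 1):
            try:
                service(state)
            except ConnectionError:
                if attempt == max_attempts:
                    raise  # give up; state still records progress so far
            else:
                completed.append(name)
                break
    return state


def run_demo() -> Tuple[State, int]:
    calls = {"assay": 0}

    def gene_info(state: State) -> None:
        state["gene"] = "GeneA"

    def assay_info(state: State) -> None:
        # Simulated flaky service: fails twice, then succeeds.
        calls["assay"] += 1
        if calls["assay"] < 3:
            raise ConnectionError("assay service unavailable")
        state["assay"] = "assay for " + str(state["gene"])

    steps = [("gene_info", gene_info), ("assay_info", assay_info)]
    state = run_workflow(steps, {})
    return state, calls["assay"]
```

Retries only mask short outages; the slide's point about high service availability still applies, since a service that stays down exhausts the attempts and the workflow stops with its progress recorded.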

  28. 2. Semantic Grid for Portfolio POC • Create an environment to query “concept” data across GSK and externally • Access novel algorithms / analysis for use on concept Networks to expand knowledge and identify project “risks”. • Data expansion (e.g. platform analysis) • Used in Validation / Reagent generation • Network expansion (looking for connections) • Used by Discovery groups / Informatics

  29. GeneA TransA_2 TransA_3 TransA_1 GRID GSK00000 Transcript Analysis “Concept Novelty” Bioreagent Analysis Assay may fail to identify this variant. Assay Validation Analysis Target Identification Assay Development X

  30. Business Value ?? • Methodology opens up Target space for broader analysis • Supports MDR requirements for target tracking and analysis • Linkage of Biology and Chemistry domains for better tracking • Drives standards and vocabulary down to the business • Removes need to redundantly identify “Molecular Target” throughout process. • Methods can be used to spot issues with programs and allow “kill early” decisions • Transcript analysis and assays • Bio-reagent availability (inc. AB’s / Crystallography etc.) • Methods and services can be reused across organisation outside of portfolio area

  31. GRID Challenges in Industry • Service Productionisation • Availability, Security, Granularity • Relationship management • Define the negotiation paradigm • Licensed service model • Implement local scoped catalogue • Service Orchestration • How to manage skews in Ontologies & technology • Vendors • Academia • Knowledge Engineers ?? • GSK is looking at developing standards

  32. One such group is SIMDAT • Objectives: • Develop federated versions of problem solving environments • Support of distributed product and process development • Test and enhance grid technology for access to distributed databases • Tools for semantic transformation between these databases • Grid support for knowledge discovery • Promote de facto standards • Raise awareness in important industrial sectors

  33. What is SIMDAT ?

  34. SIMDAT - Overview Seven Grid-technology development areas: Grid infrastructure Distributed Data Access VO Administration Workflows Ontologies Analysis Services Knowledge Services Four sectors of international economic importance: Automotive Pharmaceutical Aerospace Meteorology The solution of industrially relevant complex problems using data-centric Grid technology.

  35. Drivers for joining SIMDAT • SIMDAT provides a platform to look at all the following requirements and develop novel strategies to deliver them: • IT/IX tasked with delivering cutting edge analysis and data management systems • Need to develop an environment to drive integration and introduce flexibility. • Buy not build when technology or tools are available and appropriate • Lower “Activation Energy” around external collaborations • Look to Academia for “cutting edge” research • Look to Vendors for solid support / continuity & Value • Ensure Pharma discovery process continues as a production system with inbuilt flexibility • Look for opportunities to Virtualise GSK infrastructure • Support the needs for Offshoring and Outsourcing

  36. SIMDAT Solution Vendors Security Payment Relationships Gene Sample ?? ?? Microarray ?? Continuity Value Other Pharma / Biotech GSK architectural Direction Applications Service Enactment PSTUD. B* GATE GSK

  37. GSK GnG database Integrate and Output Inpharmatica via GRIA First B2B Scenario

  38. SIMDAT gives us • Framework for Service Consumption • Specialised Software Granularity • Data and Algorithm Services • Accounting and Consumption Model • Stratified Information • Open Service Market • Choreography to Orchestration • Relationship Management • Specialised Services • Focused Licensing • Zero Setup • Negotiated Catalogue • Dynamic Fee Dimensioning

  39. GRIA wrapper BIOCLIP Service GRIA wrapper Chemistry Services Implemented Components of the B2B Scenario Licensed Services Job execution Annotate project portal Semantic Broker InforSense KDE User Knowledge Portal Workflow Updating SB E2E Job execution Job execution Inpharmatica Service Updater GRIA E2E Job execution

  40. Final Scenario; B2A

  41. In Conclusion • Virtualisation & Globalisation are key objectives for GSK • IT must move to support these initiatives • Workflow and semantic directions can support this goal but MUST move into production • Consolidation of the various efforts in Grid and Ontology would make this move simpler.

  42. Acknowledgements MDR-IT • Richard Ashe Biology technical architect • Simon Dear Biology information architect • Mike Moore Biology POC • John Armstrong Biology Business Analysis • Bart Ailey SIMDAT Contractor • Thank you !
