
Identity management – life sciences perspective

Ugis Sarkans, European Bioinformatics Institute


Presentation Transcript


  1. Identity management – life sciences perspective. Ugis Sarkans, European Bioinformatics Institute

  2. European Bioinformatics Institute • Outstation of the European Molecular Biology Laboratory (EMBL) • International organisation created by treaty (cf. CERN, ESA) • EMBL-EBI has 400 staff, a €30 million budget and several million users • 15-year history of service provision and scientific excellence • Sited at the Wellcome Trust Genome Campus, Hinxton, Cambridge, UK, after a European competition • [Chart: 2008 funding sources]

  3. EMBL-EBI Mission • To provide freely available data and bioinformatics services to all facets of the scientific community in ways that promote scientific progress • To contribute to the advancement of biology through basic investigator-driven research in bioinformatics • To provide advanced bioinformatics training to scientists at all levels, from PhD students to independent investigators • To help disseminate cutting-edge technologies to industry

  4. Life sciences: Medicine, Agriculture, Pharmaceuticals, Biotechnology, Environment, Bio-fuels, Cosmeceuticals, Nutraceuticals, Consumer products, Personal genomes, etc. • Literature and ontologies: CiteXplore, GO • Genomes: Ensembl, Ensembl Genomes, EGA • Nucleotide sequence: EMBL-Bank • Proteomes: UniProt, PRIDE • Gene expression: ArrayExpress • Protein structure: PDBe • Protein families, motifs and domains: InterPro • Chemical entities: ChEBI, ChEMBL • Protein interactions: IntAct • Pathways: Reactome • Systems: BioModels • Comprehensive, universal, integrated…

  5. Challenges facing information infrastructure for life sciences • Biomedical data is growing faster than Moore's law • Data is generated in a geographically distributed manner but needs to be tightly integrated for interpretation • Data analysis algorithms need to be applied to combined datasets at the raw data level • Human research subject data (clinical data) needs to be integrated with bio-molecular data, raising privacy issues and the need for highly controlled access • Data analysis algorithms are becoming more compute-intensive, hence the need for parallelisation

  6. Dynamic growth response [plot: available disk space; axes: log(data volume) against time]

  7. Dynamic growth response [plot: data to be stored and available disk space; axes: log(data volume) against time]
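The relationship between the two curves on slides 6 and 7 can be illustrated with a small calculation. This is only a sketch: the function name, the starting volumes and the doubling times are hypothetical placeholders, not figures from the presentation. It assumes both quantities grow exponentially, so each is a straight line on a log(data volume) axis.

```python
import math

def crossover_years(data0, cap0, data_doubling, cap_doubling):
    """Years until exponentially growing data exceeds exponentially growing capacity.

    All inputs are hypothetical: volumes share one unit, doubling times are in years.
    Returns None if the data never overtakes the capacity.
    """
    if data0 >= cap0:
        return 0.0
    # On a log2 scale each curve is a line whose slope is "doublings per year".
    data_rate = 1.0 / data_doubling
    cap_rate = 1.0 / cap_doubling
    if data_rate <= cap_rate:
        return None
    # Solve log2(data0) + data_rate * t == log2(cap0) + cap_rate * t for t.
    return math.log2(cap0 / data0) / (data_rate - cap_rate)

# Assumed example: capacity starts 4x larger but doubles every 2 years,
# while the data doubles every year.
print(crossover_years(data0=1.0, cap0=4.0, data_doubling=1.0, cap_doubling=2.0))  # 4.0
```

With these assumed numbers the data overtakes the available capacity after four years; if the data doubles no faster than the capacity, the function returns None.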

  8. What is ELIXIR? • An EU Framework 7 Preparatory Phase project • Coordinated by Prof. Janet Thornton, Director of EMBL-EBI • To construct a plan for the operation of a sustainable infrastructure for biological information in Europe • €4.5 million grant awarded May 2007, three-year term • 32-member consortium engaging many of Europe’s main bioinformatics funding agencies and research institutes • Deliverables are memoranda of understanding to fund the implementation phase, which could cost €500 million • Interested parties should register as stakeholders via the ELIXIR website: www.elixir-europe.org

  9. ESFRI: the European Strategy Forum on Research Infrastructures • Created by the Commission in February 2002 • Adopted by the Competitiveness Council in April 2002 • Representatives of EU Member States, Associated States, and one representative of the European Commission • Chairman: Prof. Carlo Rizzuto (Sincrotrone Trieste S.c.p.A.-ELETTRA, IT) • To support a coherent approach to policy-making on research infrastructures in Europe • To act as an incubator for international negotiations about concrete initiatives

  10. European Roadmap for Research Infrastructures • 35 ‘mature’ projects for new large scale Research Infrastructures • Based on an international peer review process • Covers all scientific areas, regardless of possible location • Likely to be realized in the next 10 to 20 years • Supported by a relevant European partnership or intergovernmental research organisation. • Impact on science and technology development at international level • Support new ways of doing science in Europe • Contribute to the enhancement of the European Research Area

  11. Roadmap projects summary. • 6 Social Science & Humanities • 8 Environmental Sciences • 3 Energy • 6 Biomedical and Life Sciences • 7 Material Sciences • 5 Astronomy, Astro-, Nuclear and Particle Physics • 1 Computer and Data Treatment (transverse) http://cordis.europa.eu/esfri/

  12. Cost of 35 mature ESFRI RI projects • Computing: £300M • Social Science • Environment: £1,300 • Physics: £3,600 • Biomedical: £1,600 • Energy: £2,200 • Materials: £4,500 • Total capital cost = €13,696 million

  13. The ten ESFRI BMS RI

  14. ELIXIR Scientific & Technical Structure

  15. BMS Support of the European Grand Challenges • ELIXIR will provide infrastructure for the other ESFRI BMS RI

  16. BioMedBridges • Call 8 (Research) Topic 2.3.2 “Clustering the ESFRI BMS” • Coordinated by Janet Thornton • To create the links between the ESFRI BMS RI • €10.6M over 4 years, 21 participating organisations, 12 work packages (WP) • To “build bridges” between the infrastructures • Deliverables are infrastructure components that will link data from the different domains of the ESFRI BMS RI to ELIXIR Core Datasets • It is anticipated that these components will be incorporated into the ELIXIR Construction Phase • ESFRI BMS RIs will be doing the work • e-Infrastructure Advisory Panel: GÉANT, DANTE, EGI.eu, PRACE

  17. BioMedBridges: structure of proposal • WP1 Management • WP2 Outreach and inreach • WP3 ESFRI BMS Standards Description and Harmonization • WP4 Technical integration • WP5 Secure access • Five use cases (WP6 – WP10) • WP6 Interoperability of large scale image data sets from different biological scales • WP7 PhenoBridge - crossing the species bridge between mouse and human • WP8 Personalized Medicine - integrating complex data sets to understand disease pathogenesis and improve biomarker and treatment selection • WP9 From cells to molecules - integrating structural data • WP10 Integrating disease related data and terminology from samples of different types • WP11 Technology Watch • WP12 Training

  18. EMBL-EBI: Most important data collections • Genomes & genes: Ensembl (joint project with the Sanger Institute; high-quality annotation of vertebrate genomes); Ensembl Genomes (environment for genome data from other taxa); 1000 Genomes (catalogue of human variation from major world populations); EGA* (European Genome-phenome Archive; genotype, phenotype and sequences from individual subjects and controls); ENA (European Nucleotide Archive; all DNA & RNA, next-generation reads and traces) • Transcription: ArrayExpress (archive of transcriptomics and other functional genomics data); Expression Atlas (differentially expressed genes in tissues, cells, disease states & treatments) • Protein: UniProt (archive of protein sequences and functional annotation); InterPro (integrated resource for protein families, motifs and domains); PRIDE (public data repository for proteomics data); PDBe (protein and other macromolecular structure and function) • Small molecules: ChEBI (chemical entities of biological interest); ChEMBL (bioactive compounds, drugs and drug-like molecules, properties and activities) • Processes: IntAct (public repository for molecular interaction data); Reactome (biochemical pathways and reactions in human biology); BioModels (mathematical models of cellular processes) • Ontologies: GO (Gene Ontology, consistent descriptions of gene products) • Scientific literature: CiteXplore (bibliographic query system) • * Requires authentication

  19. Data supporting publication – typical lifecycle [diagram labels: author, reviewer; submitted manuscript, published manuscript; restricted data, public data]

  20. European Genome-phenome Archive (EGA) • Primary archive for any data consented for research but not for fully public distribution • All data must be de-identified and in accordance with the informed consent • Controlled access to the data • Distributed access policy: Data Access Committee (DAC); data release policy (data access application and data access agreement) • EGA supports only data access decisions that are based on the original informed consent • Authorized users have personal accounts in our system • Access to the data requires the account password • Data decryption requires a separate key that must be requested and is sent offline • HSF - 20.1.2011
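The access rules on slide 20 can be sketched as a small data model. This is a minimal illustration under stated assumptions, not EGA's implementation: the class names, the identifiers "DATASET-1" and "DAC-A", and the boolean `password_ok` (standing in for real password verification) are all hypothetical. The separate, offline decryption key is deliberately not modelled here.

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    dataset_id: str
    dac: str  # the Data Access Committee responsible for this dataset (hypothetical id)

@dataclass
class Account:
    username: str
    approved: set = field(default_factory=set)  # ids of datasets a DAC has approved

def grant(account, dataset, deciding_dac):
    """Record an approval; in this sketch only the dataset's own DAC may decide."""
    if deciding_dac != dataset.dac:
        raise PermissionError("only the dataset's own DAC can grant access")
    account.approved.add(dataset.dataset_id)

def may_download(account, dataset, password_ok):
    """Access needs both a valid account password and a DAC approval."""
    return password_ok and dataset.dataset_id in account.approved

# Hypothetical usage
ds = Dataset("DATASET-1", dac="DAC-A")
user = Account("alice")
print(may_download(user, ds, password_ok=True))   # False: no DAC approval yet
grant(user, ds, deciding_dac="DAC-A")
print(may_download(user, ds, password_ok=True))   # True
```

Keeping the approval on the account and the deciding committee on the dataset mirrors the distributed policy on the slide: the archive enforces decisions, while each DAC makes them.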

  21. EGA works with Data Access Committees (DAC)

  22. Mechanics of secure data access [diagram components: FTP client, secure server, EGA secure layer] • (1) The FTP client requests a whole file for download (with username/password) • (2) EGA verifies the user and provides the list of authorized files • (3) EGA provides the archival encryption key and the file path in the archive; this requires a secure API to facilitate access into the EGA master database • (4) The requested BAM data is decrypted and re-encrypted using the client key • (5) The secure server responds to FTP requests directly; the FTP client downloads the custom-encrypted file • Authentication of FTP clients is inherently insecure; we may have to require FTPS-compliant clients (RFC 4217)
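Steps (2) to (5) on slide 22 can be sketched as follows. This is an illustration under loud assumptions: `keystream_xor` is a toy stand-in cipher built from SHA-256 and is NOT secure, the slide does not say which cipher EGA uses, and the function names, keys, user name and file name are hypothetical.

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (NOT secure): XOR data with a SHA-256 counter keystream.

    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for start in range(0, len(data), 32):
        counter = (start // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[start:start + 32], pad))
    return bytes(out)

def serve_file(archive, authorized, user, path, archive_key, client_key):
    """Check authorization, decrypt with the archive key, re-encrypt for the client."""
    # Step (2): the user may only fetch files on their authorized list.
    if path not in authorized.get(user, set()):
        raise PermissionError(f"{user} is not authorized for {path}")
    # Steps (3)-(4): decrypt the archived copy, then re-encrypt with the client key.
    plaintext = keystream_xor(archive_key, archive[path])
    return keystream_xor(client_key, plaintext)

# Hypothetical data: one archived file, one authorized user.
archive_key, client_key = b"archive-key", b"client-key"
archive = {"sample.bam": keystream_xor(archive_key, b"BAM payload")}
authorized = {"alice": {"sample.bam"}}

# Step (5): the client downloads the custom-encrypted file and decrypts it locally.
download = serve_file(archive, authorized, "alice", "sample.bam", archive_key, client_key)
print(keystream_xor(client_key, download))  # b'BAM payload'
```

The point of the design is that the archival key never leaves the server side: the client only ever sees data encrypted under its own key. On the transport side, Python's standard `ftplib.FTP_TLS` class adds TLS support to FTP as described in RFC 4217, which is the kind of FTPS-compliant client the slide has in mind.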

  23. Acknowledgements • Andrew Lyall, ELIXIR project manager • Paul Flicek, Ilkka Lappalainen, EGA • Alvis Brazma, Functional Genomics, BioMedBridges security
