International Biobank and Cohort Studies: Developing a Harmonious Approach

Standards: The P3G knowledge database



International Biobank and Cohort Studies: Developing a Harmonious Approach, February 7-8, 2005, Atlanta, GA. Jan-Eric Litton, Karolinska Institutet, Stockholm, Sweden. Topics: standards; the P3G knowledge database; sharing data.


International Biobank and Cohort Studies: Developing a Harmonious Approach

February 7-8, 2005, Atlanta, GA

Jan-Eric Litton

Karolinska Institutet, Stockholm, Sweden

  • Standards

  • The P3G knowledge database



Sharing data

ID   MURA_BACSU     STANDARD;      PRT;   429 AA.
DE   PROBABLE UDP-N-ACETYLGLUCOSAMINE 1-CARBOXYVINYLTRANSFERASE
DE   (EC 2.5.1.7) (ENOYLPYRUVATE TRANSFERASE) (UDP-N-ACETYLGLUCOSAMINE
DE   ENOLPYRUVYL TRANSFERASE) (EPT).
GN   MURA OR MURZ.
OS   BACILLUS SUBTILIS.
OC   BACTERIA; FIRMICUTES; BACILLUS/CLOSTRIDIUM GROUP; BACILLACEAE;
OC   BACILLUS.
KW   PEPTIDOGLYCAN SYNTHESIS; CELL WALL; TRANSFERASE.
FT   ACT_SITE    116    116       BINDS PEP (BY SIMILARITY).
FT   CONFLICT    374    374       S -> A (IN REF. 3).
SQ   SEQUENCE   429 AA;  46016 MW;  02018C5C CRC32;
     MEKLNIAGGD SLNGTVHISG AKNSAVALIP ATILANSEVT IEGLPEISDI ETLRDLLKEI
     GGNVHFENGE MVVDPTSMIS MPLPNGKVKK LRASYYLMGA MLGRFKQAVI GLPGGCHLGP
     RPIDQHIKGF EALGAEVTNE QGAIYLRAER LRGARIYLDV VSVGATINIM LAAVLAEGKT
     IIENAAKEPE IIDVATLLTS MGAKIKGAGT NVIRIDGVKE LHGCKHTIIP DRIEAGTFMI
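A record like the one above is shareable because every line starts with a fixed two-letter code. The following is a minimal sketch, not part of the original slides, of how such a flat-file record could be read in Python. The function name `parse_flat_record` and the shortened example record are assumptions made for illustration, and the sketch handles only the simple line layout shown here.

```python
from collections import defaultdict


def parse_flat_record(text):
    """Group the lines of one flat-file record by their two-letter line code."""
    fields = defaultdict(list)
    sequence = []
    in_sequence = False
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("//"):
            # A line of two slashes conventionally terminates a record.
            break
        if in_sequence:
            # Sequence lines carry no line code; drop the spacing between blocks.
            sequence.append(line.replace(" ", ""))
            continue
        code, _, value = line.partition(" ")
        fields[code].append(value.strip())
        if code == "SQ":
            in_sequence = True
    # Continuation lines with the same code (for example OC) are joined.
    record = {code: " ".join(values) for code, values in fields.items()}
    record["sequence"] = "".join(sequence)
    return record


# Shortened, hypothetical excerpt of the record shown on the slide.
example = """\
ID   MURA_BACSU     STANDARD;      PRT;   429 AA.
GN   MURA OR MURZ.
OS   BACILLUS SUBTILIS.
OC   BACTERIA; FIRMICUTES; BACILLUS/CLOSTRIDIUM GROUP; BACILLACEAE;
OC   BACILLUS.
SQ   SEQUENCE   429 AA;  46016 MW;  02018C5C CRC32;
     MEKLNIAGGD SLNGTVHISG
"""

record = parse_flat_record(example)
print(record["OS"])
print(record["sequence"])
```

Because the line codes are agreed in advance, a reader like this needs no knowledge of the database that produced the record.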



A historical essay:

The Machine Screw

  • Principle discovered around 400 BC

  • Limited use until machine tools made mass production possible (18th cent.)

  • Every machine shop and foundry made unique sizes and thread dimensions

  • 1841: Joseph Whitworth presented “The Uniform System of Screw-Threads” to Britain’s Institution of Civil Engineers

  • 1864: William Sellers proposed “On a Uniform System of Screw Threads” to the Franklin Institute, Philadelphia

  • Enabled interchangeable parts and tooling for mechanization and mass production

  • 1945: British and American standards merged


Point-to-point integration of data

Merge results

  • The application includes a subprogram for each different data source

  • Operations on data must be processed by the application

    • Lots of coding effort

    • Fully dependent on the data resources


Data Warehouse

  • Data are loaded into the database

  • Data need filtering, cleaning, and transformation

  • Data must be refreshed

    • Scripts must be written

    • Refreshing the data is time-consuming

    • Up-to-date data cannot be guaranteed

ODBC - JDBC
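The load-and-refresh cycle of the data warehouse approach can be sketched as below. This is an illustration under stated assumptions rather than anything shown in the talk: the `samples` table, the `clean` and `refresh` functions, and the example rows are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3


def clean(row):
    """Return a normalised (sample_id, material) pair, or None to filter the row out."""
    sample_id = row.get("id", "").strip()
    material = row.get("material", "").strip().lower()
    if not sample_id or not material:
        return None
    return sample_id, material


def refresh(warehouse, source_rows):
    """Replace the warehouse copy with a freshly cleaned copy of the source rows."""
    rows = [cleaned for cleaned in map(clean, source_rows) if cleaned is not None]
    with warehouse:  # one transaction: the old copy is replaced atomically
        warehouse.execute("DELETE FROM samples")
        warehouse.executemany("INSERT INTO samples VALUES (?, ?)", rows)
    return len(rows)


warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE samples (sample_id TEXT PRIMARY KEY, material TEXT)")

# Hypothetical source extract with one incomplete row and inconsistent spelling.
source = [
    {"id": " S1 ", "material": "DNA"},
    {"id": "", "material": "serum"},
    {"id": "S2", "material": " Serum "},
]

loaded = refresh(warehouse, source)
stored = warehouse.execute(
    "SELECT sample_id, material FROM samples ORDER BY sample_id"
).fetchall()
print(loaded, stored)
```

The warehouse only reflects the source as of the last call to `refresh`, which is the staleness problem the slide points out.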


ODBC - JDBC and more

Federated data

  • Data stay untouched

    • Integrates heterogeneous local or remote data sources through wrappers

  • Just need to know what data should be available to whom and how to access them

  • It makes all data look like one virtual database, hiding the data-layer complexity
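The wrapper idea can be sketched in a few lines of Python. This is a hedged illustration, not the architecture used by any particular biobank: the class names `SqlWrapper`, `CsvWrapper`, and `FederatedView`, the column names, and the two example sources are assumptions. Each wrapper translates its own source into one common row shape, and the federated view queries the sources in place instead of copying them.

```python
import csv
import io
import sqlite3


class SqlWrapper:
    """Wraps a relational source; the rows stay in the source database."""

    def __init__(self, connection, biobank):
        self.connection = connection
        self.biobank = biobank

    def samples(self):
        cursor = self.connection.execute(
            "SELECT sample_id, material FROM samples ORDER BY sample_id"
        )
        for sample_id, material in cursor:
            yield {"biobank": self.biobank, "sample_id": sample_id, "material": material}


class CsvWrapper:
    """Wraps a CSV export that uses different column names."""

    def __init__(self, text, biobank):
        self.text = text
        self.biobank = biobank

    def samples(self):
        for row in csv.DictReader(io.StringIO(self.text)):
            yield {"biobank": self.biobank, "sample_id": row["id"], "material": row["type"]}


class FederatedView:
    """Presents several wrapped sources as if they were one virtual table."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def samples(self, material=None):
        for wrapper in self.wrappers:
            for row in wrapper.samples():
                if material is None or row["material"] == material:
                    yield row


# Two hypothetical sources with different storage and different schemas.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE samples (sample_id TEXT, material TEXT)")
connection.executemany(
    "INSERT INTO samples VALUES (?, ?)", [("A-1", "DNA"), ("A-2", "serum")]
)
csv_text = "id,type\nB-7,DNA\nB-8,plasma\n"

view = FederatedView([SqlWrapper(connection, "biobank_a"), CsvWrapper(csv_text, "biobank_b")])
dna = list(view.samples(material="DNA"))
print(dna)
```

Adding a new source means writing one new wrapper; the code that queries `FederatedView` does not change.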


Ontologies

  • Controlled vocabulary: only one controlled term is used for a given concept

  • Data model: the data structuring mechanism in which an ontology is expressed
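A controlled vocabulary can be illustrated with a small lookup that maps every accepted synonym to exactly one controlled term. The terms, synonyms, and the `normalize` function below are hypothetical examples chosen for illustration, not an actual P3G vocabulary.

```python
# Hypothetical vocabulary: each controlled term lists the labels it replaces.
CONTROLLED_TERMS = {
    "DNA": {"dna", "deoxyribonucleic acid", "genomic dna"},
    "serum": {"serum", "blood serum"},
}

# Invert the table once so that each synonym points at its single controlled term.
SYNONYM_TO_TERM = {
    synonym: term
    for term, synonyms in CONTROLLED_TERMS.items()
    for synonym in synonyms
}


def normalize(label):
    """Map a free-text label to its controlled term, or fail loudly."""
    key = " ".join(label.lower().split())  # ignore case and repeated spaces
    try:
        return SYNONYM_TO_TERM[key]
    except KeyError:
        raise ValueError(f"no controlled term for {label!r}") from None


print(normalize("Genomic  DNA"))
print(normalize("Blood serum"))
```

Rejecting unknown labels, rather than passing them through, is a design choice in this sketch: it keeps uncontrolled terms from entering shared data unnoticed.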



Data model


World Wide Biobanking

[Figure: biobanks labelled by country domain and ISO 3166 numeric country code: .ca = 124, .us = 840, .se = 752. Example entry: Sweden = 752, The National Board of Health and Welfare, id = 1]



World Wide Biobanking

  • Communication with other biobanks

  • XML
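As a rough illustration of XML as the exchange format, the sketch below builds and reads a small message with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`biobankMessage`, `biobank`, `sample`, `country`, `id`) are invented for this example; the slides do not define a message schema.

```python
import xml.etree.ElementTree as ET


def build_message(country_code, biobank_id, sample_ids):
    """Serialise a list of sample identifiers as a small XML message."""
    root = ET.Element("biobankMessage")
    sender = ET.SubElement(root, "biobank", {"country": country_code, "id": biobank_id})
    for sample_id in sample_ids:
        ET.SubElement(sender, "sample", {"id": sample_id})
    return ET.tostring(root, encoding="unicode")


def read_sample_ids(xml_text):
    """Extract the sample identifiers from a message built by build_message."""
    root = ET.fromstring(xml_text)
    return [sample.get("id") for sample in root.iter("sample")]


# 752 is the ISO 3166 numeric code for Sweden; the other values are made up.
message = build_message("752", "1", ["752-08-123456789-4"])
print(message)
print(read_sample_ids(message))
```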



Sample identification

752-08-123456789-4

2D Matrix code for DNA storage at normalized concentration

SE KI Biobank # Sample ID
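A structured identifier like this can be split and checked mechanically. The sketch below assumes that the four labels on the slide (SE, KI, Biobank #, Sample ID) correspond, in order, to the four hyphen-separated fields; the slide does not state this mapping explicitly, so the field names `country`, `organisation`, `biobank`, and `sample` are a guess. The lookup table holds only the three ISO 3166 numeric codes mentioned in the talk.

```python
from typing import NamedTuple

# ISO 3166 numeric codes mentioned in the talk: Canada, Sweden, United States.
COUNTRY_BY_NUMERIC_CODE = {"124": "CA", "752": "SE", "840": "US"}


class SampleId(NamedTuple):
    country: str       # two-letter code looked up from the numeric code
    organisation: str  # assumed to identify the institution (KI on the slide)
    biobank: str       # assumed to be the biobank number
    sample: str        # assumed to be the sample identifier


def parse_sample_id(text):
    """Split an identifier of the form NNN-NN-NNNNNNNNN-N into its four fields."""
    parts = text.split("-")
    if len(parts) != 4 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected four numeric fields separated by hyphens: {text!r}")
    country_code, organisation, biobank, sample = parts
    if country_code not in COUNTRY_BY_NUMERIC_CODE:
        raise ValueError(f"unknown country code: {country_code!r}")
    return SampleId(COUNTRY_BY_NUMERIC_CODE[country_code], organisation, biobank, sample)


parsed = parse_sample_id("752-08-123456789-4")
print(parsed)
```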


P3G Knowledge Database

Knowledge Curation and Information Technology

International Working Group on Knowledge Curation and Information Technology

P3Gdb: Knowledgebase on Phenotypes, Genetic Analysis Methods, and Policies related to Biobanks and Population Genetics Research

Data Entry core

IT core


P3G Knowledge Database

Knowledge Curation and Information Technology

  • There are advantages to integrating databases on different aspects of biobanks as public resources.

  • First, to enable communication between biobanks, each biobank needs a unique identity.

  • Second, a common nomenclature is needed in order to communicate between biobanks.


P3G Knowledge Database

The potential impact of integration will be to:

  • Promote communication within and between major biobanking initiatives, thereby helping to overcome the existing fragmentation of population genomic research.

  • Enhance the effective sharing and synthesis of information, thereby addressing the need for very large sample sizes and helping to promote collaborative international genetic epidemiological and clinical research.

  • Avoid the expensive mistakes and inefficiencies that can arise when individual initiatives repeatedly “re-invent the wheel”, thereby saving funders and researchers a lot of time and money.


P3G Knowledge Database

Knowledge Curation and Information Technology

The Road Map:

  • WG 1: Nomenclature

  • WG 2: Sample handling

  • WG 3: Biobank information

  • WG 4: Phenotype data

  • WG 5: Genotype data

  • WG 6: Data modeling

  • WG 7: Database integration

  • WG 8: Security

  • WG 9: Output and analysis

  • WG 10: Documentation



P3G Knowledge Database

The road map: Phenotype

  • Describe data format naming conventions

  • P3G data format standard (Start with GenomEUtwin documents)

  • Describe relations between the entities

  • Describe entities and their attributes

  • Sync genotype data

  • Questionnaires (validation)

  • Clinical measures

  • Laboratory phenotypes


P3G Knowledge Database

The road map: Data modeling

  • Conceptual data modeling using UML (Unified Modeling Language)

  • Build conceptual harmonized data model for genotype and phenotype data

  • Sequence variation standardization

  • Provide standardized data transfer format

  • Tracking of samples

  • XML and OWL for future use


P3G Knowledge Database

The road map: Sample handling

  • Sample collection

  • Sample identification

  • Data collection

  • Structure and standardization of data

  • Quality control procedures

  • Ethical and legal aspects


P3G Knowledge Database

The road map:

[Figures not preserved in this transcript. Slide titles: Physical entities (five slides), Donor entities, Sampling entities]

P3G Knowledge Database

The road map:

  • Using models which remain stable as the technological landscape changes around them - Model Driven Architecture


P3G Knowledge Database

Knowledge Curation and Information Technology

The Road Map: Starting point

  • 1: Nomenclature

  • 2: Sample handling, Biobank information

  • 3: Phenotype data, Genotype data, Data modeling

  • 4: Database integration, Security

  • 5: Ethics, governance, policy, socio-demographic



P3G Knowledge Database

Knowledge Curation and Information Technology

The Road Map: Starting point

  • Name IWG-leaders

  • Name Cores

  • Now, open a KDB members area under www.p3gconsortium.org, to start the knowledge database

  • IWG-KDB meeting late spring 2005

  • Coordinate with other activities

