The European Radiobiological Archives (ERA)

Paul Schofield (Cambridge U) & Jonathan Bard (Edinburgh U)

Supported by European Commission contracts:

FIR1-CT-2000-20097 & FI6R – SSA -2006 - 028275


Background

  • In the 40s, 50s & 60s, there was a great deal of research to understand the scientific basis and effects of irradiation

  • This work was done across the world on a wide range of animals

  • The closed set of USA, European and Japanese data was archived during the ‘80s in an old and (apparently) poor version of ACCESS

  • The archiving has been done using disease, pathological and anatomical terminology that is inconsistent across labs and animals

  • The EU have funded a small group led by Paul Schofield (Cambridge) to make the database useful


The International Radiobiology Archives

  • A set of HTML files on a website describing all individual studies, covering > 300 000 individuals. Links allow searches for specific studies by radiation type or other exposure, or by strain/species;

  • A database in ACCESS of all studies, plus data from ~200K individuals on survival, pathology, clinical findings etc. (~350 MB);

  • A description of the ACCESS DB listing all tables and their fields, as well as the forms and their use, including the underlying computer code;

  • Database of selected references in ENDNOTE.


ERA ACCESS Database

  • Relational database with a hierarchical structure;

  • Forms provided with Visual Basic code allow a user to

    • Browse through data,

    • Search for groups with specific characteristics,

    • Select groups for further study,

    • Perform preliminary Statistics and/or Export selected data for other statistics;

  • Re- or meta-analysis of different studies would be facilitated by

    • Combining (control) groups

    • Lumping diseases for analysis into

      • Families (e.g. all malignant liver tumors),

      • Classes (e.g. all systemic tumors).

Lumping diseases is crucial for comparing studies from different laboratories that used different pathological criteria.
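The lumping idea can be sketched as a two-level look-up: diagnosis to family, then family to class. This is a minimal Python sketch, not the ERA coding itself. Only "malignant liver tumors" and "systemic tumors" come from the slide; every diagnosis label, the other family and class names, and all assignments are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical diagnosis -> family table (labels are illustrative only).
FAMILY_OF = {
    "hepatocellular carcinoma": "malignant liver tumors",
    "cholangiocarcinoma": "malignant liver tumors",
    "lymphoma": "lymphoid tumors",
    "myeloid leukaemia": "myeloid tumors",
}

# Hypothetical family -> class table.
CLASS_OF = {
    "malignant liver tumors": "solid tumors",
    "lymphoid tumors": "systemic tumors",
    "myeloid tumors": "systemic tumors",
}


def lump(diagnoses, level="family"):
    """Count diagnoses after lumping them to the requested level."""
    counts = Counter()
    for diagnosis in diagnoses:
        family = FAMILY_OF.get(diagnosis, "unclassified")
        if level == "family":
            counts[family] += 1
        else:
            counts[CLASS_OF.get(family, "unclassified")] += 1
    return counts


# Two labs that recorded findings at different levels of detail.
lab_a = ["hepatocellular carcinoma", "lymphoma"]
lab_b = ["cholangiocarcinoma", "myeloid leukaemia", "lymphoma"]
print(lump(lab_a + lab_b, level="family"))
print(lump(lab_a + lab_b, level="class"))
```

Once both labs are lumped to the same level, their counts can be compared even though the recorded diagnoses differ.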


Hierarchical Structure of the ACCESS Database

[Diagram: tables of the ACCESS database. Legend: common tables; dogs only; man only; rodents only.]

Tables shown: Lab, Person, Pers_Status, Status, Ref_Pers, Referenc, RefKey, Keyword, KeywordCat, Study, Study_Pers, Study_Ref, Strain, Gender, AgeCat, Group, Group_Con, Group_Treat, Treat_Type, Treat_Class, Applicat, Unit, PathDone, DeathMode, Meas_Type, Ind_Main, Ind_Treat, Ind_Measur, Ind_Pedigree, Ind_PathMan, Ind_PathRod, Ind_PathDog, ICD7, ICD8, ICD9, KDS


Number of Individuals in the database

  • Some individual group data were unobtainable from several Japanese institutes and one US institute (~60K individuals).

  • The US archives include ~40K beagles; a beagle study runs for about 20 years, compared with about 5 years for a rodent study.


Species and Individuals in Different Archives


A typical (opaque) webpage


Problems

  • Current classifications are a set of simple controlled vocabularies

  • Granularity is highly variable

  • Cannot separate anatomy & disease for some codes e.g. ICD9 has terms like pulmonary cancer

  • Cross species computing is very complex

  • No interoperability with other resources

  • Uses high-level Controlled Vocabularies for Disease and Family

    These are reasonable but unwieldy

  • Nothing is standardised across the database!!

  • It is not easy to use


Need for keeping legacy Radiobiology Archives

  • Most scientists involved in such studies are retired or dead

  • Few institutions (perhaps 4) now do this work.

  • Substantial risks exist that irreplaceable data will be discarded

  • Budgetary, legal and political limitations and popular opposition to animal studies make new animal studies unlikely.

  • Old animal studies represent a substantial investment

    • a study on 50-100 dogs or 5000 mice costs about €10M

    • the studies in the IRA database cost > €2 × 10^9

  • Several studies are not yet fully evaluated

  • Old data can, with modern statistical methods, yield information on

    • dose, dose-rate and radionuclide effects

    • ageing, cancer & diseases.


ERA-PRO funds the current work

  • Migration of data to Oracle database

  • Checking of data with original input sources - validation/curation

  • Improving access and functionality

  • An easy, intuitive search facility for the user

  • Interoperability with other databases

    • Compliance with current ontology and other data standards

    • Coding of pathology diagnoses


Migration of data to Oracle database

  • This is being done by a database group in Germany

    • Professional advice is to produce a DBMS-free version (in XML?)

    • Load this version into Oracle
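The "DBMS-free version" step can be illustrated with a round trip through XML. This is a minimal Python sketch under stated assumptions: the element names (`table`, `row`, `field`) and the example columns are invented here, and the slide itself only suggests XML as a possibility.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(table_name, rows):
    """Serialise rows (a list of dicts) into a DBMS-independent XML string."""
    table = ET.Element("table", name=table_name)
    for row in rows:
        record = ET.SubElement(table, "row")
        for column, value in row.items():
            field = ET.SubElement(record, "field", name=column)
            field.text = "" if value is None else str(value)
    return ET.tostring(table, encoding="unicode")


def xml_to_rows(xml_text):
    """Read the XML back into a list of dicts (all values come back as strings)."""
    table = ET.fromstring(xml_text)
    return [
        {field.get("name"): field.text or "" for field in record.findall("field")}
        for record in table.findall("row")
    ]


# Hypothetical rows; the column names are assumptions, not the ERA schema.
study_rows = [
    {"study_id": 1, "species": "mouse"},
    {"study_id": 2, "species": "beagle"},
]
xml_text = rows_to_xml("Study", study_rows)
print(xml_text)
print(xml_to_rows(xml_text))
```

Because the intermediate file carries no database-specific types, the loader on the target side has to restore numeric and date types explicitly.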

Validation of data

  • Hand checking of randomly selected datasets with original submission

  • The few errors are mainly systematic phase shifts or keystroke errors
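Choosing datasets for hand checking can be made reproducible with a seeded random sample. This is a minimal Python sketch: the 5% fraction, the seed and the field names are assumptions for illustration, not figures from the project.

```python
import random


def pick_for_hand_check(dataset_ids, fraction=0.05, seed=0):
    """Return a reproducible random sample of dataset IDs to check by hand."""
    ids = list(dataset_ids)
    if not ids:
        return []
    sample_size = min(len(ids), max(1, round(len(ids) * fraction)))
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    return sorted(rng.sample(ids, sample_size))


def mismatches(archived, original):
    """List the fields whose archived value differs from the original submission."""
    return [key for key in original if archived.get(key) != original[key]]


to_check = pick_for_hand_check(range(200))
print(to_check)

# Hypothetical record pair showing a keystroke error (transposed digits).
archived = {"age_days": 412, "dose_gy": 2.0}
original = {"age_days": 421, "dose_gy": 2.0}
print(mismatches(archived, original))
```

A systematic shift would show up as the same field mismatching across many sampled records, whereas a keystroke error is isolated.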

Better access, functionality & interoperability

  • This involves using modern coding to access the data

  • Our policy is not to touch the data itself but to use look-up tables

  • This means

    • Starting with standard ontologies for pathology and anatomy

    • Making links that allow them to handle all relevant species
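The look-up table policy can be sketched with a small in-memory database: the archived rows are only read, and a separate table carries the mapping to modern terms. This is a minimal Python sketch using SQLite as a stand-in for Oracle. The table names, column names, legacy codes and term labels are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Legacy records, kept exactly as archived.
    CREATE TABLE legacy_pathology (animal_id INTEGER, legacy_code TEXT);
    -- Look-up table: legacy code -> modern ontology term.
    CREATE TABLE code_lookup (legacy_code TEXT PRIMARY KEY, ontology_term TEXT);
""")
conn.executemany(
    "INSERT INTO legacy_pathology VALUES (?, ?)",
    [(1, "OLD-017"), (2, "OLD-017"), (3, "OLD-204")],
)
conn.executemany(
    "INSERT INTO code_lookup VALUES (?, ?)",
    [("OLD-017", "hepatocellular carcinoma"), ("OLD-204", "lymphoma")],
)


def animals_with(term):
    """Find animals by modern term; the legacy rows are read, never edited."""
    rows = conn.execute(
        """SELECT p.animal_id
           FROM legacy_pathology AS p
           JOIN code_lookup AS l ON l.legacy_code = p.legacy_code
           WHERE l.ontology_term = ?
           ORDER BY p.animal_id""",
        (term,),
    )
    return [animal_id for (animal_id,) in rows]


print(animals_with("hepatocellular carcinoma"))
```

Correcting or refining a mapping then means editing one row of the look-up table, with no change to the archived data.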


Bio-Ontologies

  • There are now various codings that can be used to annotate biological terms

  • Some are formal ontologies (sets of structured knowledge) - others are less well organised

  • Anatomy:

    • FMA (human), MA (mouse, embryo and adult)

    • Others for C. elegans, zebrafish and Drosophila

  • Pathology

    • ICD9 (human): every known disease (10^5 terms?)

    • SNOVET (animals): every animal disease

    • MPATH (mouse): mouse pathologies (600 terms)

  • There are others such as EULEP codes that are inaccessible

    • Some have mixed anatomy and pathology (pulmonary tumour)

The ERA only uses old ones!


Current disease coding in ERA

[Figure: current disease coding in ERA, with the labels "?" and "Suggested approach"]


Pathology terminology

  • Could handle some pathology by implementing

    • ICD9 ontology for human and adding anatomy codes?

    • MPATH for rodents with high level anatomical codes?

  • This excludes dogs and SNOVET is not an ontology

  • Unlikely to find a solution with a common detailed pathology ontology for all three species.

  • Many of the problems of unifying Pathology/disease terms depend on the disaggregation of Pathology and Anatomy and there is no common anatomy with adequate spatial detail


Anatomy ontology resources

Many resources exist, with differing formats, philosophies and purposes, and variable content.


Handling anatomy

Options

  • Attempt full anatomy and pathology mappings for each species - NO

  • Use dynamic mapping facilities of cross-species anatomies - NO

  • Modify/generalise an existing model anatomy - possible

Current approach

  • Use adult mouse ontology (cut down to the necessary level of detail)

  • Abstract it to be species-independent

  • Map onto this the ERA anatomy terms (TOPO) and disease anatomy terms (DIS-ROD)

  • Use a second code to define species

  • Link the dual ontology to the data via a look-up table


The draft anatomy look-up table

TOPO: term from the Topology list

DIS-ROD: term from the Disease-Rodent list

NEW: term invented to improve organisation

Adult tissues (SYN: whole body) (TOPO)

Body cavities (NEW)

Pleural cavity (DIS-ROD)

Pleural mesothelium (DIS-ROD)

Peritoneal cavity (DIS-ROD)

Peritoneal mesothelium (DIS-ROD)

Mesothelium (DIS-ROD) (DAG: dual parentage)

Peritoneal mesothelium (DIS-ROD)

Pleural mesothelium (NEW)

Cardiovascular system (TOPO)

Blood vessels (TOPO) (DIS-ROD)

Heart (TOPO) (DIS-ROD)

Myocardium (DIS-ROD)

Pericardium (DIS-ROD)

Lymphatic vessels (TOPO) SYN: system (DIS-ROD)
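The "dual parentage" note means the table is a directed acyclic graph (DAG) rather than a tree: a term such as pleural mesothelium can sit under both its cavity and the general mesothelium term. This is a minimal Python sketch of that structure. The term names come from the table, but the parent links are inferred from the listing order and are an assumption, since the slide's indentation did not survive.

```python
# Child -> list of parents. More than one parent makes this a DAG, not a tree.
# The links below are inferred for illustration, not taken from the ERA table.
PARENTS = {
    "Body cavities": ["Adult tissues"],
    "Pleural cavity": ["Body cavities"],
    "Peritoneal cavity": ["Body cavities"],
    "Mesothelium": ["Body cavities"],
    "Pleural mesothelium": ["Pleural cavity", "Mesothelium"],
    "Peritoneal mesothelium": ["Peritoneal cavity", "Mesothelium"],
}


def ancestors(term):
    """Return every term reachable by following parent links upwards."""
    found = set()
    stack = list(PARENTS.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(PARENTS.get(parent, []))
    return found


def descendants(term):
    """Return every term that has `term` among its ancestors."""
    return {child for child in PARENTS if term in ancestors(child)}


print(sorted(ancestors("Pleural mesothelium")))
print(sorted(descendants("Mesothelium")))
```

With this shape, a search for "Mesothelium" picks up both pleural and peritoneal findings, and a search for "Pleural cavity" still picks up the pleural ones.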


Handling pathology

  • The mix of SNOMED, SNOVET, ICD9 etc is just too cumbersome

    • The added problem is the mix of pathology and anatomy

  • The 600 MPATH terms cover everything

Current approach

  • Use MPATH for the search page

  • Where there is added anatomy, use an additional MA code

  • Link MPATH to data via look-up tables
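Separating a mixed legacy term into a pathology part and an optional anatomy part can be sketched as follows. This is a minimal Python sketch: "pulmonary tumour" is the example from the slides, while the `Annotation` class, the look-up entries and the plain-text labels (used here in place of real MPATH and MA identifiers) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    pathology: str           # pathology term (MPATH in the approach above)
    anatomy: Optional[str]   # anatomy term (MA), only when the legacy term implies one


# Hypothetical look-up table; term labels stand in for real ontology IDs.
LEGACY_TO_ANNOTATION = {
    "pulmonary tumour": Annotation(pathology="neoplasm", anatomy="lung"),
    "lymphoma": Annotation(pathology="lymphoma", anatomy=None),
}


def annotate(legacy_term):
    """Split a legacy term into separate pathology and anatomy parts."""
    return LEGACY_TO_ANNOTATION.get(legacy_term.strip().lower())


print(annotate("Pulmonary tumour"))
print(annotate("lymphoma"))
```

Terms missing from the table return `None`, which gives a curator a simple list of legacy terms still to be mapped.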


In two years' time ...

  • The user will have a search page based around ontologies and controlled vocabularies for

    • Species

    • General anatomy terms (based around the mouse)

    • Pathology terms (based around the mouse)

  • We will provide an underlying Oracle DB and series of look-up tables and links that will allow a user to

    • Identify the experiments that include data on his search terms

    • Extract individual animal data that meet the search criteria

At least, that is the plan!
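The two planned search steps (find the experiments, then extract the individual animals) can be sketched against a toy schema. This is a minimal Python sketch using SQLite in place of Oracle. The tables, columns and rows are invented for illustration and are not the ERA schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE study   (study_id INTEGER PRIMARY KEY, species TEXT);
    CREATE TABLE animal  (animal_id INTEGER PRIMARY KEY, study_id INTEGER);
    CREATE TABLE finding (animal_id INTEGER, anatomy TEXT, pathology TEXT);
""")
db.executemany("INSERT INTO study VALUES (?, ?)", [(1, "mouse"), (2, "dog"), (3, "mouse")])
db.executemany("INSERT INTO animal VALUES (?, ?)", [(10, 1), (11, 1), (20, 2), (30, 3)])
db.executemany(
    "INSERT INTO finding VALUES (?, ?, ?)",
    [(10, "lung", "neoplasm"), (11, "liver", "neoplasm"),
     (20, "lung", "neoplasm"), (30, "lung", "fibrosis")],
)

# Shared part of both queries: join findings to animals and studies,
# then filter on the three search terms.
MATCH = """
    FROM finding AS f
    JOIN animal AS a ON a.animal_id = f.animal_id
    JOIN study  AS s ON s.study_id = a.study_id
    WHERE s.species = ? AND f.anatomy = ? AND f.pathology = ?
"""


def matching_studies(species, anatomy, pathology):
    """Step 1: which studies contain any data on the search terms?"""
    sql = "SELECT DISTINCT s.study_id" + MATCH + "ORDER BY s.study_id"
    return [row[0] for row in db.execute(sql, (species, anatomy, pathology))]


def matching_animals(species, anatomy, pathology):
    """Step 2: which individual animals meet the search criteria?"""
    sql = "SELECT DISTINCT a.animal_id" + MATCH + "ORDER BY a.animal_id"
    return [row[0] for row in db.execute(sql, (species, anatomy, pathology))]


print(matching_studies("mouse", "lung", "neoplasm"))
print(matching_animals("mouse", "lung", "neoplasm"))
```

In a fuller version the anatomy and pathology filters would go through the look-up tables and the term hierarchy rather than matching exact strings.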

