The European Radiobiological Archives (ERA)

Paul Schofield (Cambridge U) & Jonathan Bard (Edinburgh U)

Supported by European Commission contracts: FIR1-CT-2000-20097 & FI6R – SSA -2006 - 028275



Background

  • In the 1940s, 1950s and 1960s, there was a great deal of research to understand the scientific basis and effects of irradiation

  • This work was done across the world on a wide range of animals

  • The closed set of US, European and Japanese data was archived during the 1980s in an old and (apparently) poor version of ACCESS

  • The archiving has been done using disease, pathological and anatomical terminology that is inconsistent across labs and animals

  • The EU have funded a small group led by Paul Schofield (Cambridge) to make the database useful

The International Radiobiology Archives

  • A set of HTML files on a website describing all individual studies, covering > 300 000 individuals. Links allow searching for specific studies by radiation or other exposure, or by strain/species;

  • A database in ACCESS of all studies, plus data from ~200K individuals on survival, pathology, clinical findings etc. (~350 MB);

  • A description of the ACCESS DB listing all tables and their fields, as well as the forms and their use, including the underlying computer code;

  • Database of selected references in ENDNOTE.

ERA ACCESS database

  • Relational database with a hierarchical structure;

  • Forms provided with Visual Basic code allow a user to

    • Browse through data,

    • Search for groups with specific characteristics,

    • Select groups for further study,

    • Perform preliminary statistics and/or export selected data for further statistical analysis;

  • Re- or meta-analysis of different studies would be facilitated by

    • Combining (control) groups

    • Lumping diseases for analysis into

      • Families (e.g. all malignant liver tumors),

      • Classes (e.g. all systemic tumors).

Lumping diseases is crucial for comparing studies from different laboratories that use different pathological criteria.
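
A minimal sketch of how such lumping could work, in Python. The diagnosis names, family names and class names below are hypothetical placeholders rather than the ERA vocabularies; the only point is that each recorded diagnosis is rolled up to a family, and then to a class, through look-up tables before counting, so that groups from different laboratories can be combined.

```python
from collections import Counter

# Hypothetical look-up tables: diagnosis -> family, family -> class.
DISEASE_TO_FAMILY = {
    "hepatocellular carcinoma": "malignant liver tumors",
    "cholangiocarcinoma": "malignant liver tumors",
    "myeloid leukaemia": "leukaemias",
}
FAMILY_TO_CLASS = {
    "malignant liver tumors": "solid tumors",
    "leukaemias": "systemic tumors",
}

def lump(diagnoses, level="family"):
    """Count diagnoses after rolling them up to 'family' or 'class' level."""
    counts = Counter()
    for diagnosis in diagnoses:
        family = DISEASE_TO_FAMILY.get(diagnosis, "unmapped")
        if level == "family":
            counts[family] += 1
        else:
            counts[FAMILY_TO_CLASS.get(family, "unmapped")] += 1
    return counts

# Two laboratories that recorded liver tumours at different levels of detail.
lab_a = ["hepatocellular carcinoma", "myeloid leukaemia"]
lab_b = ["cholangiocarcinoma", "hepatocellular carcinoma"]
combined = lump(lab_a + lab_b, level="family")
```

At family level the three liver diagnoses from the two laboratories fall into a single group, even though the laboratories used different specific terms.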

Hierarchical Structure of the ACCESS Database

[Figure: hierarchical structure of the ACCESS database, showing common tables and species-specific tables (dogs only, man only, rodents only)]


Number of individuals in the database

  • Individual data for some groups are unobtainable from several Japanese institutes and one US institute (~60K individuals).

  • The US archives include ~40K beagles; a beagle study runs 20 years, compared with 5 years for a rodent study.


  • Current classifications are a set of simple controlled vocabularies

  • Granularity is highly variable

  • Anatomy & disease cannot be separated for some codes, e.g. ICD9 has terms like "pulmonary cancer"

  • Cross species computing is very complex

  • No interoperability with other resources

  • Uses high-level Controlled Vocabularies for Disease and Family

    These are reasonable but unwieldy

  • Nothing is standardised across the database!

  • It is not easy to use

Need for keeping legacy Radiobiology Archives

  • Most scientists involved in such studies are retired or dead

  • Few institutions (perhaps 4) now do this work.

  • Substantial risks exist that irreplaceable data will be discarded

  • Budgetary, legal and political limitations and popular opposition to animal studies make new animal studies unlikely.

  • Old animal studies represent a substantial investment

    • a study on 50-100 dogs or 5000 mice costs about €10M

    • the studies in the IRA database cost > 2 × 10^9 €

  • Several studies are not yet fully evaluated

  • Old data can, with modern statistical methods, yield information on

    • dose, dose-rate and radionuclide effects

    • ageing, cancer & diseases.

ERA-PRO funds the current work

  • Migration of data to Oracle database

  • Checking of data with original input sources - validation/curation

  • Improving access and functionality

  • An easy, intuitive search facility for the user

  • Interoperability with other databases

    • Compliance with current ontology and other data standards

    • Coding of pathology diagnoses

Migration of data to Oracle database

  • This is being done by a database group in Germany

    • Professional advice is to produce a DBMS-free version (in XML?)

    • Load this version into Oracle
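
One way to picture a DBMS-free intermediate is a plain XML dump of each table. The sketch below uses Python's standard `xml.etree.ElementTree`; the table name, column names and rows are invented for illustration, and the export format actually chosen by the project is not described in these slides.

```python
import xml.etree.ElementTree as ET

def table_to_xml(table_name, rows):
    """Serialise a list of row dicts as a DBMS-independent XML string."""
    root = ET.Element("table", name=table_name)
    for row in rows:
        record = ET.SubElement(root, "row")
        for column, value in row.items():
            field = ET.SubElement(record, "field", name=column)
            # Missing values are written as empty fields.
            field.text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")

# Hypothetical rows; the real ERA tables and fields are not shown here.
rows = [
    {"animal_id": 1, "species": "mouse", "age_at_death_days": 812},
    {"animal_id": 2, "species": "beagle", "age_at_death_days": None},
]
xml_text = table_to_xml("individuals", rows)

# Reading the dump back needs only an XML parser, not the original DBMS.
parsed = ET.fromstring(xml_text)
```

A loader for the target database would then walk `parsed` row by row; that step depends on the target schema and is left out here.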

      Validation of data

  • Hand checking of randomly selected datasets with original submission

  • The few errors are mainly systematic phase shifts or a keystroke error
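
Random selection for hand checking can be made reproducible by fixing a seed, so that the same datasets can be re-checked later. A small sketch using Python's `random` module; the dataset identifiers, sample size and seed are assumptions for illustration only.

```python
import random

def pick_for_checking(dataset_ids, k, seed=0):
    """Return k dataset ids chosen at random, reproducibly for a given seed."""
    rng = random.Random(seed)  # private generator, leaves global random state alone
    return sorted(rng.sample(list(dataset_ids), k))

# Hypothetical dataset identifiers.
all_datasets = [f"study-{n:03d}" for n in range(1, 101)]
to_check = pick_for_checking(all_datasets, k=5, seed=2008)
```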

    Better access, functionality & interoperability

  • This involves using modern coding to access the data

  • Our policy is not to touch the data itself but to use look-up tables

  • This means

    • Starting with standard ontologies for pathology and anatomy

    • Making links that allow them to handle all relevant species

Bio-ontologies

  • There are now various codings that can be used to annotate biological terms

  • Some are formal ontologies (sets of structured knowledge) - others are less well organised

  • Anatomy:

    • FMA (human), MA (mouse embryo and adult)

    • Others for C. elegans, zebrafish and Drosophila

  • Pathology

    • ICD9 (human) every known disease (10^5 terms?)

    • SNOVET (animals) every animal disease

    • MPATH (mouse) mouse pathologies (600 terms)

  • There are others such as EULEP codes that are inaccessible

    • Some have mixed anatomy and pathology (pulmonary tumour)

      The ERA only uses old ones!

Current disease coding in ERA


Suggested approach

Pathology terminology

  • Could handle some pathology by implementing

    • ICD9 ontology for human and adding anatomy codes?

    • MPATH for rodents with high level anatomical codes?

  • This excludes dogs, and SNOVET is not an ontology

  • Unlikely to find a solution with a common detailed pathology ontology for all three species.

  • Many of the problems of unifying pathology/disease terms come down to separating pathology from anatomy, and there is no common anatomy with adequate spatial detail

Anatomy ontology resources

Many resources, formats, philosophies and purposes, with variable content

Handling anatomy


  • Attempt full anatomy and pathology mappings for each species - NO

  • Use dynamic mapping facilities of cross-species anatomies - NO

  • Modify/generalise an existing model anatomy - possible

    Current approach

  • Use adult mouse ontology (cut down to the necessary level of detail)

  • Abstract it to be species-independent

  • Map onto this the ERA anatomy terms (TOPO) and disease anatomy terms (DIS-ROD)

  • Use a second code to define species

  • Link the dual ontology to the data via a look-up table
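
The approach above can be sketched as a pair of codes attached to each record through a look-up table, leaving the stored record itself untouched. Everything named below (the `Annotation` type, the table contents, the `annotate` function) is hypothetical and only illustrates the idea; it is not the project's schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    anatomy: str   # species-independent anatomy term
    species: str   # second code identifying the species

# Hypothetical look-up table: (source list, legacy term) -> anatomy term.
ANATOMY_LOOKUP = {
    ("TOPO", "Heart"): "heart",
    ("DIS-ROD", "Myocardium"): "myocardium",
    ("DIS-ROD", "Pleural cavity"): "pleural cavity",
}

def annotate(source_list, legacy_term, species):
    """Translate a legacy term via the look-up table without altering the record."""
    anatomy = ANATOMY_LOOKUP.get((source_list, legacy_term))
    if anatomy is None:
        return None  # unmapped terms are reported, not guessed
    return Annotation(anatomy=anatomy, species=species)

# A legacy row stays exactly as it was archived.
record = {"list": "TOPO", "term": "Heart", "species": "dog"}
result = annotate(record["list"], record["term"], record["species"])
```

Keeping the mapping outside the data means a wrong or outdated mapping can be corrected in the look-up table alone.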

The draft anatomy look-up table

TOPO: term from the Topology list

DIS-ROD: term from the Disease-Rodent list

NEW: term invented to improve organisation

Adult tissues (SYN: whole body) (TOPO)

Body cavities (NEW)

Pleural cavity (DIS-ROD)

Pleural mesothelium (DIS-ROD)

Peritoneal cavity (DIS-ROD)

Peritoneal mesothelium (DIS-ROD)

Mesothelium (DIS-ROD) DAG – dual parentage

Peritoneal mesothelium (DIS-ROD)

Pleural mesothelium (NEW)

Cardiovascular system (TOPO)

Blood vessels (TOPO) (DIS-ROD)

Heart (TOPO) (DIS-ROD)

Myocardium (DIS-ROD)

Pericardium (DIS-ROD)

Lymphatic vessels (TOPO) SYN: system (DIS-ROD)
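
Dual parentage means the look-up table is a directed acyclic graph (DAG) rather than a tree. The sketch below stores child-to-parent links for a few of the terms listed above and walks them upward. The exact parent links are an assumption read off the flattened list, not a confirmed part of the draft table.

```python
# Child -> parents. A term may have more than one parent, so this is a DAG, not a tree.
# The links are an assumed reading of the draft table.
PARENTS = {
    "Body cavities": ["Adult tissues"],
    "Mesothelium": ["Adult tissues"],
    "Pleural cavity": ["Body cavities"],
    "Peritoneal cavity": ["Body cavities"],
    "Pleural mesothelium": ["Pleural cavity", "Mesothelium"],
    "Peritoneal mesothelium": ["Peritoneal cavity", "Mesothelium"],
}

def ancestors(term):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = list(PARENTS.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen

def descendants(term):
    """Return every term that has `term` among its ancestors."""
    return {t for t in PARENTS if term in ancestors(t)}
```

With this structure, a search on "Mesothelium" and a search on "Body cavities" would both reach records coded as pleural mesothelium, which is the practical benefit of allowing two parents.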

Handling pathology

  • The mix of SNOMED, SNOVET, ICD9 etc is just too cumbersome

    • The added problem is the mix of pathology and anatomy

  • The 600 MPATH terms cover everything

    Current approach

  • Use MPATH for the search page

  • Where there is added anatomy, use an additional MA code

  • Link MPATH to data via look-up tables
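
A rough sketch of how a search might use such a look-up table: each legacy diagnosis maps to a pathology term and, where the legacy term also implied a site, an anatomy term. The labels below are plain-text stand-ins, not real MPATH or MA identifiers, and the `Coding` type and `search` function are hypothetical.

```python
from typing import NamedTuple, Optional

class Coding(NamedTuple):
    pathology: str           # MPATH-style pathology term
    anatomy: Optional[str]   # MA-style anatomy term, only where the legacy term implied one

# Hypothetical look-up table; labels stand in for real MPATH/MA identifiers.
PATHOLOGY_LOOKUP = {
    "pulmonary tumour": Coding(pathology="tumour", anatomy="lung"),
    "necrosis": Coding(pathology="necrosis", anatomy=None),
}

def search(records, pathology, anatomy=None):
    """Return records whose legacy diagnosis maps to the requested terms."""
    hits = []
    for rec in records:
        coding = PATHOLOGY_LOOKUP.get(rec["diagnosis"])
        if coding is None or coding.pathology != pathology:
            continue
        if anatomy is not None and coding.anatomy != anatomy:
            continue
        hits.append(rec)
    return hits

# Legacy records keep their original diagnosis strings.
records = [
    {"animal_id": 1, "diagnosis": "pulmonary tumour"},
    {"animal_id": 2, "diagnosis": "necrosis"},
]
lung_tumours = search(records, pathology="tumour", anatomy="lung")
```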

In two years' time …

  • The user will have a search page based around ontologies and controlled vocabularies for

    • Species

    • General anatomy terms (based around the mouse)

    • Pathology terms (based around the mouse)

  • We will provide an underlying Oracle DB and series of look-up tables and links that will allow a user to

    • Identify the experiments that include data on their search terms

    • Extract individual animal data that meet the search criteria

      At least, that is the plan!