Presentation Transcript
slide1

[Figure: the labels Genes, Diseases, Physiology and Anatomy, repeated and interleaved across the slide]

Novel relationships & Deeper insights

Medical Informatics

Bioinformatics

slide2

& YOU

Mining Bio-Medical Mountains

How Computer Science can help Biomedical Research and Health Sciences

Anil Jegga

Division of Biomedical Informatics,

Cincinnati Children’s Hospital Medical Center (CCHMC)

Department of Pediatrics, University of Cincinnati

http://anil.cchmc.org

Anil.Jegga@cchmc.org

slide3

Acknowledgement

Biomedical Engineering/Bioinformatics

  • Jing Chen
  • Sivakumar Gowrisankar
  • Vivek Kaimal

Computer Science

  • Amit Sinha
  • Mrunal Deshmukh
  • Divya Sardana

Electrical Engineering

  • Nishanth Vepachedu
slide4

Two Separate Worlds…..

Disease World

Genome

Variome

Transcriptome

Regulome

miRNAome

  • Name
  • Synonyms
  • Related/Similar Diseases
  • Subtypes
  • Etiology
  • Predisposing Causes
  • Pathogenesis
  • Molecular Basis
  • Population Genetics
  • Clinical findings
  • System(s) involved
  • Lesions
  • Diagnosis
  • Prognosis
  • Treatment
  • Clinical Trials……

Interactome

Pharmacogenome

Metabolome

Physiome

Pathome

Medical Informatics

Bioinformatics & the “omes”

PubMed

Proteome

Disease Database

Patient Records

OMIM

Clinical Synopsis

Clinical Trials

382 “omes” so far………

and there is “UNKNOME” too - genes with no function known

http://omics.org/index.php/Alphabetically_ordered_list_of_omics

With Some Data Exchange…

slide5

now…. The number 1 FAQ

How much biology should I know??

No simple or straightforward answer… unfortunately!

But the mantra is:

Interact routinely with biologists

OR

Work with the biologists or the biological data

slide6

But I want to learn some basics…

  • http://www.ncbi.nlm.nih.gov/Education
  • http://www.ebi.ac.uk/2can/
  • http://www.genome.gov/Education/
  • http://genomics.energy.gov/
  • Books
  • Introduction to Bioinformatics by Teresa Attwood, David Parry-Smith
  • A Primer of Genome Science by Gibson G and Muse SV
  • Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Second Edition by Andreas D. Baxevanis, B. F. Francis Ouellette
  • Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology by Dan Gusfield
  • Bioinformatics: Sequence and Genome Analysis by David W. Mount
  • Discovering Genomics, Proteomics, and Bioinformatics by A. Malcolm Campbell and Laurie J. Heyer
slide7

And the other FAQs….

  • What bioinformatics topics are closest to computer science?
  • Should computer science departments involve themselves in preparing their graduates for careers in bioinformatics?
  • And if so, what topics should they cover?
  • And how much biology should they be taught?
  • Lastly, how much effort should be expended in re-directing computer scientists to do work in bioinformatics?

Cohen, 2005; Communications of the ACM

slide8

Issues to be considered……..

  • Computer science vs. molecular biology – Subject & Scientists - Cultural differences
  • Current goals of molecular biology, genomics (or biomedical research in a broader sense)
  • Data types used in bioinformatics or genomics
  • Areas within computer science of interest to biologists
  • Bioinformatics research - Employment opportunities
slide9

Biological Challenges - Computer Engineers
  • Post-genomic Era and the goal of bio-medicine
    • to develop a quantitative understanding of how living things are built from the genome that encodes them.
  • Deciphering the genome code
    • Identifying unknown genes and assigning function by computational analysis of genomic sequence
    • Identifying the regulatory mechanisms
    • Identifying their role in normal development/states vs disease states
slide10

Biological Challenges - Computer Engineers

  • Data Deluge: exponential growth of data silos and different data types
    • Human-computer interaction specialists need to work closely with academic and clinical biomedical researchers to not only manage the data deluge but to convert information into knowledge.
  • Biological data is very complex and interlinked!
    • Creating information systems that allow biologists to seamlessly follow these links without getting lost in a sea of information - a huge opportunity for computer scientists.
slide11

Biological Challenges - Computer Engineers

A major goal in molecular biology is Functional Genomics – Study of the relationships among genes in DNA & their function – in normal and disease states

  • Networks, networks, and networks!
    • Each gene in the genome is not an independent entity. Multiple genes interact to perform a specific function.
    • Environmental influences – Genotype-environment interaction
    • Integrating genomic and biochemical data together into quantitative and predictive models of biochemistry and physiology
    • Computer scientists, mathematicians, and statisticians will ALL be an integral and critical part of this effort.
slide12

Informatics – Biologists’ Expectations
  • Representation, Organization, Manipulation, Distribution, Maintenance, and Use of information, particularly in digital form.
  • Functional aspect of bioinformatics: Representation, Storage, and Distribution of data.
    • Intelligent design of data formats and databases
    • Creation of tools to query those databases
    • Development of user interfaces or visualizations that bring together different tools to allow the user to ask complex questions or put forth testable hypotheses.
slide13

Informatics – Biologists’ Expectations

  • Developing analytical tools to discover knowledge in data
    • Levels at which biological information is used:
      • comparing sequences – predict function of a newly discovered gene
      • breaking down known 3D protein structures into bits to find patterns that can help predict how the protein folds
      • modeling how proteins and metabolites in a cell work together to make the cell function…….
slide14

Finally….What does informatics mean to biologists?

The ultimate goal of analytical bioinformaticians is to develop predictive methods that allow biomedical researchers and scientists to model the function and phenotype of an organism based only on its genomic sequence. This is a grand goal, and one that will be approached only in small steps, by many scientists from different but allied disciplines working cohesively.

slide15

Biology – Data Structures

Four broad categories:

Strings: To represent DNA, RNA, amino acid sequences of proteins

Trees: To represent the evolution of various organisms (Taxonomy) or structured knowledge (Ontologies)

Sets of 3D points and their linkages: To represent protein structures

Graphs: To represent metabolic, regulatory, and signaling networks or pathways
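
The four categories map directly onto ordinary data types. The following is a minimal Python sketch using only built-in types; the sequence, coordinates and network edges are toy values invented for illustration, and the helper names `leaves` and `bond_length` are hypothetical.

```python
# Toy instances of the four data-structure categories (sequence, coordinates
# and network are invented for illustration).

# 1. String: a DNA sequence.
dna = "ATGGCCATTGTAATGGGCCGC"

# 2. Tree: a tiny taxonomy as nested dicts (parent -> children).
taxonomy = {"Mammalia": {"Primates": {"Homo sapiens": {}}, "Rodentia": {"Mus musculus": {}}}}

# 3. 3D points plus linkages: atoms with coordinates and bonds between atom indices.
atoms = [("N", (0.0, 0.0, 0.0)), ("CA", (1.5, 0.0, 0.0)), ("C", (2.0, 1.4, 0.0))]
bonds = [(0, 1), (1, 2)]

# 4. Graph: a directed regulatory network as an adjacency list.
network = {"geneA": ["geneB", "geneC"], "geneB": ["geneC"], "geneC": []}


def leaves(tree):
    """Return the leaf names of a nested-dict tree, left to right."""
    found = []
    for name, children in tree.items():
        found.extend(leaves(children) if children else [name])
    return found


def bond_length(i, j):
    """Euclidean distance between atoms i and j."""
    (_, a), (_, b) = atoms[i], atoms[j]
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
```

Real projects usually reach for dedicated libraries for each category, but the underlying shapes stay the same.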

slide16

Biology – Data Structures

  • Biologists are also interested in
    • Substrings
    • Subtrees
    • Subsets of points and linkages, and
    • Subgraphs.

Beware: Biological data is often characterized by huge size, the presence of laboratory errors (noise), duplication, and sometimes unreliability.
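
Two of these sub-structure questions can be sketched in a few lines of Python. The function names `find_all` and `induced_subgraph` are hypothetical, and the sketch does exact matching only, so it does not address the noise problem mentioned above.

```python
def find_all(sequence, motif):
    """Return every start index of motif in sequence, including overlapping hits."""
    hits, start = [], sequence.find(motif)
    while start != -1:
        hits.append(start)
        start = sequence.find(motif, start + 1)
    return hits


def induced_subgraph(graph, nodes):
    """Keep only the given nodes and the edges that run between them."""
    keep = set(nodes)
    return {n: [m for m in graph[n] if m in keep] for n in graph if n in keep}
```

For example, `find_all("ATATAT", "ATA")` reports both overlapping occurrences, which a plain `str.count` would miss.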

slide17

Support Complex Queries – A typical demand

  • Get me all genes involved in or associated with brain development that are differentially expressed in the Central Nervous System.
  • Get me all genes involved in brain development in human and mouse that also show iron ion binding activity.
  • For this set of genes, what aspects of function and/or cellular localization do they share?
  • For this set of genes, what mutations are reported to cause pathological conditions?
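
The second query above can be expressed as set intersections over an annotation table. This is a minimal sketch under stated assumptions: the records in `ANNOTATIONS` use placeholder gene names and are invented, and `genes_with` and `brain_iron_genes` are hypothetical helpers, not part of any real database API.

```python
# Hypothetical annotation records: (gene, species, annotation term).
ANNOTATIONS = [
    ("GENE1", "human", "brain development"),
    ("GENE1", "mouse", "brain development"),
    ("GENE1", "human", "iron ion binding"),
    ("GENE2", "human", "brain development"),
    ("GENE2", "human", "iron ion binding"),
    ("GENE3", "human", "brain development"),
    ("GENE3", "mouse", "brain development"),
]


def genes_with(term, species=None):
    """Genes annotated with term, optionally restricted to one species."""
    return {g for g, s, t in ANNOTATIONS if t == term and species in (None, s)}


def brain_iron_genes():
    """Brain-development genes in human AND mouse that also bind iron ions."""
    in_both = genes_with("brain development", "human") & genes_with("brain development", "mouse")
    return sorted(in_both & genes_with("iron ion binding"))
```

The hard part in practice is not the intersection but getting heterogeneous sources to agree on gene identifiers and annotation terms in the first place.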
slide18

Model Organism Databases: Common Issues

  • Heterogeneous Data Sets - Data Integration
    • From Genotype to Phenotype
    • Experimental and Consensus Views
  • Incorporation of Large Datasets
    • Whole genome annotation pipelines
    • Large scale mutagenesis/variation projects (dbSNP)
  • Computational vs. Literature-based Data Collection and Evaluation (MedLine)
  • Data Mining
    • extraction of new knowledge
    • testable hypotheses (Hypothesis Generation)
slide19

Bioinformatic Data-1978 to present

  • DNA sequence
  • Gene expression
  • Protein expression
  • Protein Structure
  • Genome mapping
  • SNPs & Mutations
  • Metabolic networks
  • Regulatory networks
  • Trait mapping
  • Gene function analysis
  • Scientific literature
  • and others………..
slide20

Human Genome Project – Data Deluge

No. of Human Gene Records currently in NCBI: 29413 (excluding pseudogenes, mitochondrial genes and obsolete records).

Includes ~460 microRNAs

NCBI Human Genome Statistics – as of February 12, 2008

slide21

The Gene Expression Data Deluge

Until 2000: 413 papers on microarrays!

Problems Deluge!

Allison DB, Cui X, Page GP, Sabripour M. 2006. Microarray data analysis: from disarray to consolidation and consensus. Nat Rev Genet. 7(1): 55-65.

slide22

Information Deluge…..

A researcher would have to scan 130 different journals and read 27 papers per day to follow a single disease, such as breast cancer (Baasiri et al., 1999 Oncogene 18: 7958-7965).

  • 3 scientific journals in 1750
  • Now - >120,000 scientific journals!
  • >500,000 medical articles/year
  • >4,000,000 scientific articles/year
  • >16 million abstracts in PubMed derived from >32,500 journals
slide23

Data-driven Problems…..

  • How to name or describe proteins, genes, drugs, diseases and conditions consistently and coherently?
  • How to ascribe and name a function, process or location consistently?
  • How to describe interactions, partners, reactions and complexes?

Some Solutions

  • Develop/Use controlled or restricted vocabularies (IUPAC-like naming conventions, HGNC, MGI, UMLS, etc.)
  • Create/Use thesauruses, central repositories or synonym lists (MeSH, UMLS, etc.)
  • Work towards synoptic reporting and structured abstracting

What’s in a name!

Rose is a rose is a rose is a rose!

Gene Nomenclature

  • Generally, the names refer to some feature of the mutant phenotype
  • Dickie’s small eye (Theiler et al., 1978, Anat Embryol (Berl), 155: 81-86) is now Pax6
  • Gleeful: "This gene encodes a C2H2 zinc finger transcription factor with high sequence similarity to vertebrate Gli proteins, so we have named the gene gleeful (Gfl)." (Furlong et al., 2001, Science 293: 1632)
  • Gene and protein names
    • Accelerin
    • Antiquitin
    • Bang Senseless
    • Bride of Sevenless
    • Christmas Factor
    • Cockeye
    • Crack
    • Draculin
    • Dickie’s small eye
    • Fidgetin
    • Gleeful
    • Knobhead
    • Lunatic Fringe
    • Mortalin
    • Orphanin
    • Profilactin
    • Sonic Hedgehog
  • Disease names
    • Mobius Syndrome with Poland’s Anomaly
    • Werner’s syndrome
    • Down’s syndrome
    • Angelman’s syndrome
    • Creutzfeldt-Jakob disease
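
A synonym list is the simplest of the solutions named on this slide. Below is a minimal Python sketch of a lookup that maps free-text gene names to a preferred symbol. The only mapping it contains is the one stated on the slide (Dickie’s small eye is now Pax6); the `SYNONYMS` table and the `normalize` function are hypothetical, and a real list would be loaded from a curated resource.

```python
# Synonym list: the first entry comes from the slide; a real list would be
# loaded from a curated resource rather than typed in by hand.
SYNONYMS = {
    "dickie's small eye": "Pax6",
    "pax6": "Pax6",
}


def normalize(name):
    """Map a free-text gene name to its preferred symbol, or None if unknown."""
    # Lower-case, trim, and unify curly and straight apostrophes before lookup.
    key = name.strip().lower().replace("’", "'")
    return SYNONYMS.get(key)
```

Returning `None` for unknown names, rather than guessing, keeps unmapped terms visible so a curator can add them.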
slide24

Rose is a rose is a rose is a rose….. Not Really!

What is a cell?

  • any small compartment
  • (biology) the basic structural and functional unit of all organisms; they may exist as independent units of life (as in monads) or may form colonies or tissues as in higher plants and animals
  • a device that delivers an electric current as a result of chemical reaction
  • a small unit serving as part of or as the nucleus of a larger political movement
  • cellular telephone: a hand-held mobile radiotelephone for use in an area divided into small sections, each with its own short-range transmitter/receiver
  • small room in which a monk or nun lives
  • a room where a prisoner is kept

Image Sources: Somewhere from the internet…

slide25

Semantic Groups, Types and Concepts:

  • Semantic Group Biology – Semantic Type Cell
  • Semantic Groups Objects OR Devices – Semantic Types Manufactured Device or Electrical Device or Communication Device
  • Semantic Group Organization – Semantic Type Political Group

Foundation Model Explorer

slide26

HEPATOCELLULAR CARCINOMA SOMATIC [ARG249SER]

CTNNB1

TP53*

MET

Hepatocellular Carcinoma

TP53

Aflatoxin B1, a mycotoxin, induces a very specific G-to-T mutation at codon 249 of the tumor suppressor gene p53.
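
The effect of such a substitution can be checked against the standard genetic code. The sketch below assumes the wild-type codon 249 reads AGG (arginine) and that the G-to-T change hits its third base; both are background knowledge, not stated on the slide. Under that assumption the mutant codon is AGT (serine), which matches the ARG249SER label above.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order, one letter per amino acid.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def substitute(codon, position, new_base):
    """Return the codon with the base at a 0-based position replaced."""
    return codon[:position] + new_base + codon[position + 1:]


wild_type = "AGG"  # assumed codon 249 sequence (background knowledge, not on the slide)
mutant = substitute(wild_type, 2, "T")  # G-to-T at the third base
```

Looking up `CODON_TABLE[wild_type]` and `CODON_TABLE[mutant]` gives the one-letter codes for arginine and serine respectively.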

Environmental Effects

  • COLORECTAL CANCER [3-BP DEL, SER45DEL]
  • COLORECTAL CANCER [SER33TYR]
  • PILOMATRICOMA, SOMATIC [SER33TYR]
  • HEPATOBLASTOMA, SOMATIC [THR41ALA]
  • DESMOID TUMOR, SOMATIC [THR41ALA]
  • PILOMATRICOMA, SOMATIC [ASP32GLY]
  • OVARIAN CARCINOMA, ENDOMETRIOID TYPE, SOMATIC [SER37CYS]
  • HEPATOCELLULAR CARCINOMA SOMATIC [SER45PHE]
  • HEPATOCELLULAR CARCINOMA SOMATIC [SER45PRO]
  • MEDULLOBLASTOMA, SOMATIC [SER33PHE]

The REAL Problems

Many disease states are complex because they involve many genes (alleles & ethnicity, gene families, etc.), environmental effects (lifestyle, exposure, etc.) and the interactions among them.

slide27

ALK in cardiac myocytes

  • Cell to Cell Adhesion Signaling
  • Inactivation of Gsk3 by AKT causes accumulation of b-catenin in Alveolar Macrophages
  • Multi-step Regulation of Transcription by Pitx2
  • Presenilin action in Notch and Wnt signaling
  • Trefoil Factors Initiate Mucosal Healing
  • WNT Signaling Pathway
  • HEPATOCELLULAR CARCINOMA
  • LIVER:
    • Hepatocellular carcinoma;
    • Micronodular cirrhosis;
    • Subacute progressive viral hepatitis
  • NEOPLASIA:
    • Primary liver cancer
  • CBL mediated ligand-induced downregulation of EGF receptors
  • Signaling of Hepatocyte Growth Factor Receptor

CTNNB1

MET

  • Estrogen-responsive protein Efp controls cell cycle and breast tumors growth
  • ATM Signaling Pathway
  • BTG family proteins and cell cycle regulation
  • Cell Cycle
  • RB Tumor Suppressor/Checkpoint Signaling in response to DNA damage
  • Regulation of transcriptional activity by PML
  • Regulation of cell cycle progression by Plk3
  • Hypoxia and p53 in the Cardiovascular system
  • p53 Signaling Pathway
  • Apoptotic Signaling in Response to DNA Damage
  • Role of BRCA1, BRCA2 and ATR in Cancer Susceptibility….Many More…..

TP53

The REAL Problems

slide28

Methods for Integration

  • Link driven federations
    • Explicit links between databanks.
  • Warehousing
    • Data is downloaded, filtered, integrated and stored in a warehouse. Answers to queries are taken from the warehouse.
  • Others….. Semantic Web, etc………
slide29

Link-driven Federations

  • Creates explicit links between databanks
  • Query: retrieve the results of interest, then follow web links to reach related data in other databanks
  • Examples: NCBI-Entrez, SRS
slide35

Link-driven Federations

  • Advantages
    • complex queries
    • Fast
  • Disadvantages
    • require good knowledge
    • syntax based
    • terminology problem not solved
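
A link-driven federation can be sketched as records that carry explicit identifiers pointing into other databanks. This is a toy illustration, not how Entrez or SRS are implemented: both databanks, every record and the helper names `fetch` and `follow_links` are hypothetical.

```python
# Two hypothetical databanks; each record carries explicit links into the other.
GENE_BANK = {
    "gene:1": {"symbol": "GENE1", "links": ["lit:10", "lit:11"]},
    "gene:2": {"symbol": "GENE2", "links": ["lit:11"]},
}
LITERATURE_BANK = {
    "lit:10": {"title": "Paper A", "links": ["gene:1"]},
    "lit:11": {"title": "Paper B", "links": ["gene:1", "gene:2"]},
}
BANKS = {"gene": GENE_BANK, "lit": LITERATURE_BANK}


def fetch(record_id):
    """Resolve an identifier such as 'lit:10' in the databank named by its prefix."""
    return BANKS[record_id.split(":")[0]][record_id]


def follow_links(record_id):
    """Return the records directly linked from one record."""
    return [fetch(target) for target in fetch(record_id)["links"]]
```

The sketch also shows the listed disadvantages: the user must know the identifier syntax, and nothing here reconciles two databanks that name the same entity differently.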
slide36

Data Warehousing

Data is downloaded, filtered, integrated and stored in a warehouse. Answers to queries are taken from the warehouse.

  • Advantages
    • Good for very specific, task-based queries and studies.
    • Since it is custom-built and usually expert-curated, relatively less error-prone.
  • Disadvantages
    • Can become quickly outdated – needs constant updates.
    • Limited functionality – e.g., built around a single disease or a single system.
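
The warehousing steps (download, filter, integrate, store, query) can be sketched with Python's built-in `sqlite3` module. The two source extracts, the table layout and the function names `build_warehouse` and `genes_for` are assumptions made for illustration; the "download" is simulated by in-memory lists.

```python
import sqlite3

# Hypothetical extracts from two source databanks (all values are made up).
GENE_SOURCE = [("GENE1", "chr1"), ("GENE2", "chr2"), ("GENE3", None)]
DISEASE_SOURCE = [("GENE1", "disease X"), ("GENE2", "disease X"), ("GENE2", "disease Y")]


def build_warehouse():
    """Download (here: read), filter, integrate and store the data locally."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE gene (symbol TEXT PRIMARY KEY, chromosome TEXT)")
    db.execute("CREATE TABLE gene_disease (symbol TEXT, disease TEXT)")
    # Filter step: drop records with no chromosome assignment.
    db.executemany("INSERT INTO gene VALUES (?, ?)", [g for g in GENE_SOURCE if g[1] is not None])
    db.executemany("INSERT INTO gene_disease VALUES (?, ?)", DISEASE_SOURCE)
    db.commit()
    return db


def genes_for(db, disease):
    """Answer a query from the warehouse alone, without contacting the sources."""
    rows = db.execute(
        "SELECT g.symbol, g.chromosome FROM gene g "
        "JOIN gene_disease d ON d.symbol = g.symbol "
        "WHERE d.disease = ? ORDER BY g.symbol",
        (disease,),
    )
    return rows.fetchall()
```

Because the warehouse is a snapshot, any change in the sources is invisible until `build_warehouse` is run again, which is the "quickly outdated" disadvantage in concrete form.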
slide37

Algorithms in Bioinformatics

  • Finding similarities among strings
  • Detecting certain patterns within strings
  • Finding similarities among parts of spatial structures (e.g. motifs)
  • Constructing trees
    • Phylogenetic or taxonomic trees: evolution of an organism
    • Ontologies – structured/hierarchical representation of knowledge
  • Classifying new data according to previously clustered sets of annotated data
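
As one concrete instance of "finding similarities among strings", here is a minimal Python sketch of the Needleman-Wunsch global alignment score. The scoring values (match +1, mismatch -1, gap -1) are arbitrary assumptions chosen for illustration; real tools use substitution matrices and more elaborate gap penalties.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score of the best global alignment of a and b."""
    # Row 0 of the dynamic-programming table: b[:j] aligned against nothing.
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0 of row i: a[:i] aligned against nothing.
        current = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            current.append(max(diagonal, previous[j] + gap, current[j - 1] + gap))
        previous = current
    return previous[-1]
```

Only two rows of the table are kept, so memory grows with the length of `b` rather than with the product of both lengths; recovering the alignment itself would require keeping the full table.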
slide38

Algorithms in Bioinformatics

Reasoning about microarray data and the corresponding behavior of pathways

Predictions of deleterious effects of changes in DNA sequences

Computational linguistics: NLP/Text-mining. Published literature or patient records

Graph Theory – Gene regulatory networks, functional networks, etc.

Visualization and GUIs (networks, application front ends, etc.)

slide39

Disease Gene Identification and Prioritization

Hypothesis: The majority of genes that impact or cause disease share membership in one or more of several functional relationships; in other words, functionally similar or related genes cause similar phenotypes.

  • Functional Similarity – Common/shared
    • Gene Ontology term
    • Pathway
    • Phenotype
    • Chromosomal location
    • Expression
    • Cis regulatory elements (Transcription factor binding sites)
    • miRNA regulators
    • Interactions
    • Other features…..
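
The hypothesis suggests ranking candidate genes by how much of their annotation they share with known disease genes. The sketch below uses mean Jaccard overlap as a deliberately simple stand-in; it is not the method used by the authors, and the feature labels, gene names and the functions `jaccard` and `prioritize` are all hypothetical. It assumes the training set is not empty.

```python
def jaccard(a, b):
    """Size of the intersection over size of the union (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def prioritize(training, candidates):
    """Rank candidates by mean Jaccard similarity to the training genes' features."""
    scores = {
        gene: sum(jaccard(features, t) for t in training.values()) / len(training)
        for gene, features in candidates.items()
    }
    # Highest score first; ties broken alphabetically for a stable ranking.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical feature sets (GO terms, pathways, phenotypes flattened into labels).
TRAINING = {"KNOWN1": {"go:A", "pathway:P", "pheno:X"}, "KNOWN2": {"go:A", "pathway:P"}}
CANDIDATES = {"CAND1": {"go:A", "pathway:P"}, "CAND2": {"go:B"}, "CAND3": {"go:A", "pheno:Z"}}
RANKED = prioritize(TRAINING, CANDIDATES)
```

Treating every feature type equally is the crudest possible choice; weighting features by type or by how rare they are would be a natural refinement.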
slide40

Background, Problems & Issues

  • Most of the common diseases are multi-factorial and modified by genetically and mechanistically complex polygenic interactions and environmental factors.
  • High-throughput genome-wide studies, such as linkage analysis and gene expression profiling, tend to be most useful for classification and characterization but do not provide sufficient information to identify or prioritize specific disease-causing genes.
slide41

Background, Problems & Issues

Since multiple genes are associated with the same or similar disease phenotypes, it is reasonable to expect the underlying genes to be functionally related.

Such functional relatedness (common pathway, interaction, biological process, etc.) can be exploited to aid in finding novel disease genes. For example, genetically heterogeneous hereditary diseases such as Hermansky-Pudlak syndrome and Fanconi anaemia have been shown to be caused by mutations in different interacting proteins.

slide42

PPI - Predicting Disease Genes

Direct protein–protein interactions (PPI) are one of the strongest manifestations of a functional relation between genes.

Hypothesis: Interacting proteins lead to the same or similar disease phenotypes when mutated.

Several genetically heterogeneous hereditary diseases, for example Hermansky-Pudlak syndrome and Fanconi anaemia, have been shown to be caused by mutations in different interacting proteins. Hence, protein–protein interactions might in principle be used to identify potentially interesting disease gene candidates.

slide43

Which of these interactants are potential new candidates?

7

Known Disease Genes

66

HPRD

BioGrid

Mining human interactome

778

Direct Interactants of Disease Genes

Indirect Interactants of Disease Genes

  • Prioritize candidate genes in the interacting partners of the disease-related genes
  • Training sets: disease related genes
  • Test sets: interacting partners of the training genes
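
Collecting the direct and indirect interactants of known disease genes is a short graph traversal. The Python sketch below assumes undirected interactions supplied as pairs; the edge list is invented, and the function name `interactants` is hypothetical. A real run would load the pairs from a PPI resource such as those named on the slide.

```python
def interactants(edges, disease_genes):
    """Split the network into direct and indirect (two-step) partners of disease genes."""
    neighbours = {}
    for a, b in edges:  # interactions are undirected
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    known = set(disease_genes)
    # Direct partners: one step from any known disease gene.
    direct = set().union(*(neighbours.get(g, set()) for g in known)) - known
    # Indirect partners: one step further out, excluding genes already counted.
    indirect = set().union(*(neighbours.get(g, set()) for g in direct)) - known - direct
    return direct, indirect


# Hypothetical interaction pairs; a real run would load them from a PPI resource.
EDGES = [("D1", "P1"), ("D1", "P2"), ("D2", "P2"), ("P1", "P3"), ("P3", "P4"), ("D1", "D2")]
DIRECT, INDIRECT = interactants(EDGES, ["D1", "D2"])
```

In the terms used above, the known disease genes form the training set and the returned interactants form the test set to be prioritized.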
slide48

the Ultimate Goal…….

Disease World

Medical Informatics

Bioinformatics

Genome

Variome

Transcriptome

Regulome

Disease Database

  • Personalized Medicine
  • Decision Support System
  • Outcome Predictor
  • Course Predictor
  • Diagnostic Test Selector
  • Clinical Trials Design
  • Hypothesis Generator…..

Proteome

  • Name
  • Synonyms
  • Related/Similar Diseases
  • Subtypes
  • Etiology
  • Predisposing Causes
  • Pathogenesis
  • Molecular Basis
  • Population Genetics
  • Clinical findings
  • System(s) involved
  • Lesions
  • Diagnosis
  • Prognosis
  • Treatment
  • Clinical Trials……

Interactome

Patient Records

Pharmacogenome

Metabolome

Physiome

Pathome

Clinical Trials

Computer Engineers

PubMed

OMIM

slide49

“To him who devotes his life to science, nothing can give more happiness than increasing the number of discoveries, but his cup of joy is full when the results of his studies immediately find practical applications”

— Louis Pasteur

Thank You!

& YOU

http://sbw.kgi.edu/