Applying Semantic Technologies to the Glycoproteomics Domain

W. S. York, May 15, 2006

slide2

Some Goals of Glycoproteomics

  • How do changes in the expression levels of specific genes alter the expression of specific glycans on the cell surface?
  • Are changes in the expression of specific glycans at the cell surface related to cell function, cell development, and disease?
  • What are the mechanisms by which specific glycans at the cell surface affect cell function, cell development, and the progression of disease?
slide3

Challenges of Glycoproteomics

  • Vast amounts of data collected by high-throughput experiments: better methods for data archival, retrieval, and analysis are needed
  • Complex structures of glycans and glycoproteins: better methods for representing branched structures and finding structural and functional homologies are needed
  • Complex biology and biochemistry: better methods to find relationships between the glycoproteome and biological processes are needed
slide4

Glycoproteomics Solutions

  • Brute-force analysis of flat data files
    • Too much data
    • Data is heterogeneous
    • What does the data represent?
  • Relational databases
    • Data is well organized
    • Data organization is relatively rigid
    • What does the data represent?
  • Semantic Technologies
    • Data is well organized
    • Data organization is flexible
    • Concepts represented by data are accessible
    • Relationships between concepts are accessible
slide5

What is Semantic Technology?

Semantics:

  1. (Linguistics) The study or science of meaning in language.
  2. (Linguistics) The study of relationships between signs and symbols and what they represent.

The American Heritage® Dictionary of the English Language, Fourth Edition

Semantic Technology:

The use of formal representations of concepts and their relationships to enable efficient, intelligent software.

Ontology (Computer Science):

A model that represents a domain and is used to reason about the objects in that domain and the relations between them.

http://en.wikipedia.org/wiki/Ontology_(computer_science)

The implication is that enabling computers to “understand” the meanings of and relationships between concepts will allow them to reason and communicate in a way that is analogous to the way humans do.

slide6

A Simple Ontology

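[Diagram: a class hierarchy with instances, connected by is_a edges]

  • Organism
    • Animal
      • Lion (instances: Elsa, Simba)
      • Cow (instance: Elsie)
      • Deer (instance: Bambi)
    • Plant
      • Hosta (instance: My Hosta)
      • Alfalfa (instance: Peter’s Alfalfa)
  • Four ate relationships connect individual instances.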

slide7

A Simple Ontology

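[Diagram: the same class hierarchy with the intermediate classes Carnivore and Herbivore added]

  • Organism
    • Animal
      • Carnivore
        • Lion (instances: Elsa, Simba)
      • Herbivore
        • Cow (instance: Elsie)
        • Deer (instance: Bambi)
    • Plant
      • Hosta (instance: My Hosta)
      • Alfalfa (instance: Peter’s Alfalfa)
  • Two eats relationships connect classes; four ate relationships connect individual instances.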
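The simple ontology can be sketched in a few lines of Python to show what "reasoning" over is_a and eats means in practice. This is an illustration only, not how an ontology language stores its data. The class and instance names come from the slide, while the dictionary layout, the function names, the placement of Lion, Cow, and Deer under Carnivore and Herbivore, and the endpoints of the two eats relationships (Carnivore eats Animal, Herbivore eats Plant) are assumptions.

```python
# Direct is_a edges between classes (child -> parents).
# Placing Lion under Carnivore and Cow/Deer under Herbivore is an assumed reading.
SUBCLASS_OF = {
    "Animal": ["Organism"],
    "Plant": ["Organism"],
    "Carnivore": ["Animal"],
    "Herbivore": ["Animal"],
    "Lion": ["Carnivore"],
    "Cow": ["Herbivore"],
    "Deer": ["Herbivore"],
    "Hosta": ["Plant"],
    "Alfalfa": ["Plant"],
}

# is_a edges from individuals to their class.
INSTANCE_OF = {
    "Elsa": "Lion",
    "Simba": "Lion",
    "Elsie": "Cow",
    "Bambi": "Deer",
    "My Hosta": "Hosta",
    "Peter's Alfalfa": "Alfalfa",
}

# Class-level eats relationships (assumed endpoints).
EATS = {"Carnivore": "Animal", "Herbivore": "Plant"}


def ancestors(cls):
    """Return every class reachable from cls by following is_a edges."""
    found = set()
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), []):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def classes_of(individual):
    """All classes an individual belongs to, directly or by inheritance."""
    direct = INSTANCE_OF[individual]
    return {direct} | ancestors(direct)


def may_eat(eater, food):
    """True if some class of eater has an eats edge to some class of food."""
    food_classes = classes_of(food)
    return any(EATS.get(cls) in food_classes for cls in classes_of(eater))


print(sorted(classes_of("Elsa")))
print(may_eat("Elsa", "Bambi"), may_eat("Bambi", "Elsa"))
```

Nothing in the data says that Elsa is an Animal or that Elsa may eat Bambi. Both follow from chaining is_a edges and the class-level eats edges, which is the kind of implicit knowledge the relationships make accessible.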

slide8

The Structure of GlycO – Concept Taxonomy

[Diagram: an is_a taxonomy containing the concepts chemical entity, molecule, molecular fragment, carbohydrate moiety, monoglycosyl moiety, glycan moiety, N-glycan, O-glycan, residue, amino acid residue, and carbohydrate residue]

slide9

The Structure of GlycO – Concept Taxonomy

[Diagram: detail of the is_a taxonomy showing glycan moiety, N-glycan, O-glycan, residue, amino acid residue, and carbohydrate residue]

slide10

The Structure of GlycO – Concept Taxonomy – Instances and Properties

[Diagram: the classes glycan moiety, N-glycan, O-glycan, residue, amino acid residue, and carbohydrate residue, linked by is_a. Instances labeled N-glycan_00020, N-glycan a-D-Manp 4, and N-glycan core b-D-Manp are attached to classes by is_instance_of and related to one another by the properties has_residue and is_linked_to.]

slide11

The GlycO Ontology in Protégé

3 Top-Level Classes are Defined in GlycO

slide12

The GlycO Ontology in Protégé

Semantics Include Chemical Context

This Class Inherits from 2 Parents

slide13

The GlycO Ontology in Protégé

The α-D-Manp residues in N-glycans are found in 8 different chemical environments

slide14

GlycoTree – A Canonical Representation of N-Glycans

[Diagram: a branched N-glycan tree built from these residue and linkage labels]

  • b-D-GlcpNAc-(1-6)+
  • b-D-GlcpNAc-(1-2)-
  • b-D-GlcpNAc-(1-2)+
  • b-D-GlcpNAc-(1-4)-
  • a-D-Manp-(1-6)+
  • b-D-Manp-(1-4)-
  • b-D-GlcpNAc-(1-4)-
  • b-D-GlcpNAc
  • a-D-Manp-(1-3)+

We give a residue in this position the same name, regardless of the specific structure it resides in

Semantics!

N. Takahashi and K. Kato, Trends in Glycosciences and Glycotechnology, 15: 235-251
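One way to picture position-based naming is to derive a residue's name from the path of linkage positions between it and the root of the tree. The sketch below does that for two small structures. The naming scheme (`root`, `4-4-3`, and so on), the tuple layout, and the function name are hypothetical and are not the labels used by GlycoTree or GlycO.

```python
def position_names(node, path=()):
    """Map a position-based name to the residue found at that position.

    node is (residue, {linkage_position: child_node}); a name is the chain of
    linkage positions from the root, so it does not depend on what else is
    attached elsewhere in the structure.
    """
    residue, children = node
    name = "root" if not path else "-".join(path)
    names = {name: residue}
    for linkage in sorted(children):
        names.update(position_names(children[linkage], path + (linkage,)))
    return names


# A trimannosyl core and a larger structure that extends both arms.
core = ("b-D-GlcpNAc", {"4": ("b-D-GlcpNAc", {"4": ("b-D-Manp", {
    "3": ("a-D-Manp", {}),
    "6": ("a-D-Manp", {}),
})})})

extended = ("b-D-GlcpNAc", {"4": ("b-D-GlcpNAc", {"4": ("b-D-Manp", {
    "3": ("a-D-Manp", {"2": ("b-D-GlcpNAc", {})}),
    "6": ("a-D-Manp", {"2": ("b-D-GlcpNAc", {})}),
})})})

core_names = position_names(core)
extended_names = position_names(extended)
print(sorted(set(core_names) & set(extended_names)))
```

Every position in the smaller structure reappears under the same name in the larger one, so data recorded against a name can be compared across glycans.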

slide15

The GlycO Ontology in Protégé

Bisecting β-D-GlcpNAc

slide17

The GlycO Ontology in Protégé

1,3-linked α-L-Fucp

slide20

Ontology Population Workflow

[][Asn]{[(4+1)][b-D-GlcpNAc]{[(4+1)][b-D-GlcpNAc]{[(4+1)][b-D-Manp]{[(3+1)][a-D-Manp]{[(2+1)][b-D-GlcpNAc]{}[(4+1)][b-D-GlcpNAc]{}}[(6+1)][a-D-Manp]{[(2+1)][b-D-GlcpNAc]{}}}}}}
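The first step of such a workflow is to turn the linear string into a tree. The parser below is a minimal sketch that assumes every node has the form `[link][name]{children}`. That grammar is inferred from the single example on this slide, and the function names and dictionary layout are illustrative.

```python
def parse_linear(text):
    """Parse '[link][name]{children}' notation into nested dictionaries."""
    text = "".join(text.split())  # drop whitespace, including line breaks
    node, pos = _parse_node(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected trailing text at offset {pos}")
    return node


def _read_bracket(text, pos):
    """Read one '[...]' group starting at pos; return (content, next_pos)."""
    if pos >= len(text) or text[pos] != "[":
        raise ValueError(f"expected '[' at offset {pos}")
    end = text.index("]", pos)
    return text[pos + 1:end], end + 1


def _parse_node(text, pos):
    """Parse one node and, recursively, all nodes inside its braces."""
    link, pos = _read_bracket(text, pos)
    name, pos = _read_bracket(text, pos)
    if pos >= len(text) or text[pos] != "{":
        raise ValueError(f"expected '{{' at offset {pos}")
    pos += 1
    children = []
    while pos < len(text) and text[pos] != "}":
        child, pos = _parse_node(text, pos)
        children.append(child)
    if pos >= len(text):
        raise ValueError("unbalanced braces")
    return {"link": link, "name": name, "children": children}, pos + 1


def count_residues(node):
    """Count all nodes below this one (the aglycon itself is not counted)."""
    return sum(1 + count_residues(child) for child in node["children"])


LINEAR = (
    "[][Asn]{[(4+1)][b-D-GlcpNAc]"
    "{[(4+1)][b-D-GlcpNAc]"
    "{[(4+1)][b-D-Manp]"
    "{[(3+1)][a-D-Manp]"
    "{[(2+1)][b-D-GlcpNAc]"
    "{}[(4+1)][b-D-GlcpNAc] {}}[(6+1)][a-D-Manp]"
    "{[(2+1)][b-D-GlcpNAc]{}}}}}}"
)

tree = parse_linear(LINEAR)
print(tree["name"], count_residues(tree))
```

A recursive-descent parser fits here because the notation is itself recursive: each pair of braces holds zero or more complete nodes.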

slide21

Ontology Population Workflow

<Glycan>
  <aglycon name="Asn"/>
  <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
    <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
      <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="Man">
        <residue link="3" anomeric_carbon="1" anomer="a" chirality="D" monosaccharide="Man">
          <residue link="2" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
          </residue>
          <residue link="4" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
          </residue>
        </residue>
        <residue link="6" anomeric_carbon="1" anomer="a" chirality="D" monosaccharide="Man">
          <residue link="2" anomeric_carbon="1" anomer="b" chirality="D" monosaccharide="GlcNAc">
          </residue>
        </residue>
      </residue>
    </residue>
  </residue>
</Glycan>
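The mapping from a residue descriptor such as `(4+1)` and `b-D-GlcpNAc` to the attributes in this XML can be sketched with the standard library. The element and attribute names are taken from the slide. The regular expressions, the tuple layout, and the rule that the ring-form letter `p` is dropped (`GlcpNAc` becomes `GlcNAc`) are assumptions inferred from this one example.

```python
import re
import xml.etree.ElementTree as ET

LINK = re.compile(r"\((\d+)\+(\d+)\)")
RESIDUE = re.compile(r"([ab])-([DL])-([A-Z][a-z]{2})p?(.*)")


def residue_attributes(link, name):
    """Split e.g. '(4+1)' and 'b-D-GlcpNAc' into XML attribute values."""
    link_match = LINK.fullmatch(link)
    name_match = RESIDUE.fullmatch(name)
    if not link_match or not name_match:
        raise ValueError(f"cannot interpret {link!r} {name!r}")
    anomer, chirality, stem, suffix = name_match.groups()
    return {
        "link": link_match.group(1),
        "anomeric_carbon": link_match.group(2),
        "anomer": anomer,
        "chirality": chirality,
        "monosaccharide": stem + suffix,  # assumed: ring-form 'p' is dropped
    }


def to_xml(glycan):
    """glycan = (aglycon_name, [nodes]); node = (link, name, [child nodes])."""
    aglycon, nodes = glycan
    root = ET.Element("Glycan")
    ET.SubElement(root, "aglycon", name=aglycon)

    def add(parent, node):
        link, name, children = node
        element = ET.SubElement(parent, "residue", residue_attributes(link, name))
        for child in children:
            add(element, child)

    for node in nodes:
        add(root, node)
    return root


glycan = ("Asn", [
    ("(4+1)", "b-D-GlcpNAc", [
        ("(4+1)", "b-D-GlcpNAc", [
            ("(4+1)", "b-D-Manp", [
                ("(3+1)", "a-D-Manp", [
                    ("(2+1)", "b-D-GlcpNAc", []),
                    ("(4+1)", "b-D-GlcpNAc", []),
                ]),
                ("(6+1)", "a-D-Manp", [
                    ("(2+1)", "b-D-GlcpNAc", []),
                ]),
            ]),
        ]),
    ]),
])

root = to_xml(glycan)
print(ET.tostring(root, encoding="unicode"))
```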

slide22

The ProPreO Ontology in Protégé

3 Top-Level Classes are Defined in ProPreO

slide23

The ProPreO Ontology in Protégé

This Class Inherits from 2 Parents

slide24

The ProPreO Ontology in Protégé

This Class Inherits from 2 Parents

slide25

Semantic Annotation of MS Data

ms/ms peaklist data

830.9570 194.9604 2     (parent ion m/z, parent ion abundance, parent ion charge)
580.2985 0.3592         (fragment ion m/z, fragment ion abundance)
688.3214 0.2526
779.4759 38.4939
784.3607 21.7736
1543.7476 1.3822
1544.7595 2.9977
1562.8113 37.4790
1660.7776 476.5043
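In the raw peak list, the meaning of each number is carried only by its position. A reader has to encode that convention by hand, as in the sketch below. It assumes the first row holds the parent ion's m/z, abundance, and charge and every later row holds a fragment ion's m/z and abundance; the function name and the dictionary keys are illustrative.

```python
def parse_peak_list(text):
    """Parse a whitespace-separated MS/MS peak list.

    Assumed layout: the first non-empty line is 'm/z abundance charge' for
    the parent ion; each following line is 'm/z abundance' for a fragment ion.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows or len(rows[0]) != 3:
        raise ValueError("first line must be: m/z abundance charge")
    mz, abundance, charge = rows[0]
    parent = {"mz": float(mz), "abundance": float(abundance), "charge": int(charge)}
    fragments = []
    for fields in rows[1:]:
        if len(fields) != 2:
            raise ValueError(f"expected 'm/z abundance', got {fields!r}")
        fragments.append({"mz": float(fields[0]), "abundance": float(fields[1])})
    return parent, fragments


PEAK_LIST = """
830.9570 194.9604 2
580.2985 0.3592
688.3214 0.2526
779.4759 38.4939
784.3607 21.7736
1543.7476 1.3822
1544.7595 2.9977
1562.8113 37.4790
1660.7776 476.5043
"""

parent, fragments = parse_peak_list(PEAK_LIST)
strongest = max(fragments, key=lambda ion: ion["abundance"])
print(parent["charge"], len(fragments), strongest["mz"])
```

Any program that reads this file needs the same out-of-band knowledge of the column order, which is the gap that semantic annotation is meant to close.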

slide26

Semantically Annotated MS Data

<ms_ms_peak_list>
  <parameter instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer" mode="ms/ms"/>
  <parent_ion mz="830.9570" abundance="194.9604" z="2"/>
  <fragment_ion mz="580.2985" abundance="0.3592"/>
  <fragment_ion mz="688.3214" abundance="0.2526"/>
  <fragment_ion mz="779.4759" abundance="38.4939"/>
  <fragment_ion mz="784.3607" abundance="21.7736"/>
  <fragment_ion mz="1543.7476" abundance="1.3822"/>
  <fragment_ion mz="1544.7595" abundance="2.9977"/>
  <fragment_ion mz="1562.8113" abundance="37.4790"/>
  <fragment_ion mz="1660.7776" abundance="476.5043"/>
</ms_ms_peak_list>

Ontological Concepts
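Producing and querying annotated data of this kind can be sketched with the standard library. The element and attribute names follow the slide, written so that they are legal XML names (`ms_ms_peak_list`, `mz`). The function names and the threshold query are illustrative assumptions, not part of the workflow described here.

```python
import xml.etree.ElementTree as ET


def annotate_peak_list(parent, fragments, instrument, mode="ms/ms"):
    """Wrap raw peak values in named elements.

    parent    -- (m/z, abundance, charge) of the parent ion
    fragments -- iterable of (m/z, abundance) pairs
    """
    root = ET.Element("ms_ms_peak_list")
    ET.SubElement(root, "parameter", instrument=instrument, mode=mode)
    mz, abundance, charge = parent
    ET.SubElement(root, "parent_ion",
                  mz=f"{mz:.4f}", abundance=f"{abundance:.4f}", z=str(charge))
    for mz, abundance in fragments:
        ET.SubElement(root, "fragment_ion",
                      mz=f"{mz:.4f}", abundance=f"{abundance:.4f}")
    return root


def abundant_fragments(root, threshold):
    """Return m/z values of fragment ions whose abundance exceeds threshold."""
    return [float(ion.get("mz"))
            for ion in root.iter("fragment_ion")
            if float(ion.get("abundance")) > threshold]


doc = annotate_peak_list(
    parent=(830.9570, 194.9604, 2),
    fragments=[(580.2985, 0.3592), (779.4759, 38.4939), (1660.7776, 476.5043)],
    instrument="micromass_QTOF_2_quadropole_time_of_flight_mass_spectrometer",
)
print(ET.tostring(doc, encoding="unicode"))
print(abundant_fragments(doc, threshold=30.0))
```

Once the values are named, the query refers to `fragment_ion` and `abundance` directly instead of relying on column positions.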

slide27

Web Services Based Workflow for Proteomics [1]

[Diagram: a pipeline from biological sample to biological information. Each processing step is wrapped by an agent with an input (I) and an output (O), and the data products are kept in storage.]

  • Biological Sample
  • Analysis by MS/MS, producing Raw Data
  • Raw Data to Standard Format, producing Standard Format Data
  • Data Pre-process [2], producing Filtered Data
  • DB Search (Mascot/Sequest), producing Search Results
  • Results Post-process (ProValt [3]), producing Final Output
  • Biological Information

[1] Design and Implementation of Web Services based Workflow for proteomics. Journal of Proteome Research. Submitted.

[2] Computational tools for increasing confidence in protein identifications. Association of Biomolecular Resource Facilities Annual Meeting, Portland, OR, 2004.

[3] A Heuristic method for assigning a false-discovery rate for protein identifications from Mascot database search results. Mol. Cell. Proteomics. 4(6), 762-772.

slide28

An Integrated Semantic Information System

  • Formalized domain knowledge is in ontologies
    • The schema defines the concepts
    • Instances represent individual objects
    • Relationships provide expressiveness
  • Data is annotated using concepts from the ontologies
  • The semantic annotations facilitate the identification and extraction of relevant information
  • The semantic relationships allow knowledge that is implicit in the data to be discovered
slide29

Satya Sahoo

Christopher Thomas

Cory Henson

Ravi Pavagada

Amit Sheth

Krzysztof Kochut

John Miller

James Atwood

Lin Lin

Alison Nairn

Gerardo Alvarez-Manilla

Saeed Roushanzamir

Michael Pierce

Ron Orlando

Kelley Moremen

Parastoo Azadi

Alfred Merrill