GUS Overview

June 18, 2002



GUS-3.0

Genomics Unified Schema

  • Supports application and data integration

  • Uses an extensible architecture

  • Is object-oriented even though it uses an underlying relational database management system (Oracle)

  • Uses a warehouse rather than a federation, to keep a stable local copy

  • Uses standards for bulk data exchange (e.g., MAGE)



GUS Usage

  • Annotation

    • of genomes - gene models, sequence features

    • of genes - gene function, gene expression, gene regulation

  • Data mining

    • Develop algorithms and a queryable resource

  • Publish

    • Map identifiers with other resources/databases

    • URL for entry retrieval and ad hoc queries in the web interface



GUS-3.0 Name Spaces

GUS has five name spaces, each compartmentalizing a different type of information.



Application Integration: PlasmoDB

Diagram labels:

  • Public Databases; TIGR, Sanger, Stanford; Plasmodium Investigators

  • Legend: Existing implementation; Future implementation

  • QTL, POP, SNP, Clinical; GenBank, InterPro, GO, etc.; Genomic Sequence; microArray & SAGE Experiments; GSSs & ESTs; Mapping Data; Annotation

  • Object Layer; Oracle/SQL

  • DoTS; TESS; RAD; Core; SRes

  • Automated Analysis & Integration; Annotator’s Interface

  • Java Servlets & Perl CGI

  • GenePlot CD; WWW queries, browsing, & download; GenePlot Software


GUS Supports Multiple Projects

Diagram labels:

  • DoTS; RAD; TESS; SRes; Core

  • AllGenes; PlasmoDB; EPConDB

  • Java Servlets

  • Oracle RDBMS

  • Other sites, other projects

  • Object Layer for Data Loading



Main Aspects of GUS Development

  • Choice of development tools

    • Schema:

      • CREATE TABLE statements

      • Documentation plug-in: input is tab-delimited text

      • UML - Rational Rose, PowerDesigner

    • Code: CVS

  • Areas to emphasize

    • Plug-ins

    • Work flow

    • TESS

    • Proteomics

    • Images

  • Preferred type of user interface

    • JSP

    • PHP



Data Integration

  • DoTS

    • Genomic Sequence: genes, gene models; STSs, repeats, etc.; cross-species analysis

    • Transcribed Sequence: characterize transcripts; RH mapping; library analysis; cross-species analysis; DoTS

    • Protein Sequence: domains; function; structure; cross-species analysis

  • SRes

    • Ontologies: GO; species; tissue; dev. stage

  • RAD

    • Transcript Expression: arrays; SAGE; conditions

  • TESS

    • Gene Regulation: binding sites; patterns; grammars

  • Core

    • Data Provenance: ownership; protection; algorithms; similarity; versioning; workflow

Example query: transcription factors up-regulated in acute myeloid leukemia with sequence similarity to c-fos and common promoter motifs
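
The example query on this slide combines three kinds of evidence (expression, sequence similarity, promoter motifs) that a warehouse keeps side by side. A minimal sketch of that idea, assuming each criterion has already been reduced to a set of gene identifiers; the identifiers below are hypothetical placeholders, not real GUS rows or results:

```python
# Hypothetical placeholder identifiers, for illustration only.
upregulated_in_aml = {"tf_a", "tf_b", "tf_c"}      # from expression data (e.g. RAD)
similar_to_cfos = {"tf_b", "tf_c", "tf_d"}         # from sequence similarity results
shared_promoter_motifs = {"tf_b", "tf_c", "tf_e"}  # from binding-site data (e.g. TESS)

# A candidate must satisfy all three criteria, so intersect the sets.
candidates = sorted(upregulated_in_aml & similar_to_cfos & shared_promoter_motifs)
print(candidates)
```

In a real warehouse the same intersection would more likely be expressed as one SQL query joining tables across name spaces; the sets here only stand in for those result sets.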


Diagram labels:

  • Identify shared TF binding sites

  • Genomic alignment and comparative sequence analysis

  • TESS; RAD; GUS

  • EST clustering and assembly



GUS Approach to Schema

  • Think objects

    • Parents and children

    • Subclassing with views

  • Views

    • Start with generic Imp table (e.g., NAFeatureImp) that contains base attributes plus generic attributes of various datatypes

    • Superclass view (e.g., NAFeature) just has base attributes

    • Subclass views (e.g., RNAFeature) have additional attributes using generic attributes

  • Strongly-typed

    • Tend to avoid “name-value” pairs
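
The view-based subclassing described above can be sketched as follows. This is an illustration under stated assumptions, not the actual GUS DDL: it uses SQLite (via Python's `sqlite3`) as a stand-in for Oracle, and every column name (`subclass_view`, `string1`, `number1`, `rna_type`, `number_of_exons`) is hypothetical. Only the table and view names NAFeatureImp, NAFeature, and RNAFeature come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Generic Imp table: base attributes plus generic attribute slots.
CREATE TABLE NAFeatureImp (
    na_feature_id   INTEGER PRIMARY KEY,
    na_sequence_id  INTEGER NOT NULL,
    subclass_view   TEXT    NOT NULL,  -- hypothetical discriminator column
    name            TEXT,
    string1         TEXT,              -- generic slot (hypothetical name)
    number1         INTEGER            -- generic slot (hypothetical name)
);

-- Superclass view: base attributes only.
CREATE VIEW NAFeature AS
SELECT na_feature_id, na_sequence_id, subclass_view, name
FROM NAFeatureImp;

-- Subclass view: base attributes plus generic slots under typed names.
CREATE VIEW RNAFeature AS
SELECT na_feature_id, na_sequence_id, name,
       string1 AS rna_type,
       number1 AS number_of_exons
FROM NAFeatureImp
WHERE subclass_view = 'RNAFeature';
""")

# Two illustrative rows: one RNA feature and one feature of another subclass.
conn.execute("INSERT INTO NAFeatureImp VALUES (1, 100, 'RNAFeature', 'feat-a', 'mRNA', 4)")
conn.execute("INSERT INTO NAFeatureImp VALUES (2, 100, 'GeneFeature', 'feat-b', NULL, NULL)")

# The superclass view sees every row; the subclass view sees only its own.
all_features = conn.execute(
    "SELECT name FROM NAFeature ORDER BY na_feature_id").fetchall()
rna_features = conn.execute(
    "SELECT name, rna_type, number_of_exons FROM RNAFeature").fetchall()
```

Because each subclass view exposes its attributes as named columns, queries stay strongly typed at the view level even though the underlying storage is generic, which is one way to avoid name-value pairs.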



DoTS Central Dogma

Diagram labels:

  • Gene; Genomic Sequence; Gene Instance; Gene Feature; NA Feature; NA Sequence

  • RNA; RNA Sequence; RNA Instance; RNA Feature

  • Protein; Protein Sequence; Protein Instance; Protein Feature; AA Sequence; AA Feature



DoTS Schema Has Been Driven By Building Gene Indices

Diagram labels:

  • Genomic Sequence; mRNA/EST Sequence

  • Clustering and Assembly

  • Gene predictions; GenScan/HMMer, PHAT; SIM4 or BLAT

  • Predicted Genes; DoTS consensus Sequences

  • Merge Genes; Gene/RNA cluster assignment

  • Annotate DoTS; Manual Annotation Tasks

  • Gene Index

  • framefinder; RNAs; Proteins; translation

  • BLASTX; PFAM, Smart, ProDom; BLASTP

  • Other computed annotation (EPCR, AssemblyAnatomyPercent, Index Key Words, SNP analysis)

  • BLAST Similarities; Functional predictions; Protein Motifs; GO Functions



DoTS Gene Indices Are Based on Clustering and Assembling ESTs



RAD 3.0 Schema Incorporates MAGE and Experience With Microarrays

LIMS for Data Analysis. Also holds SAGE.



Status of GUS Namespaces

  • Core

    • Tables exist, Workflow documented

  • SRes

    • Tables exist

  • DoTS

    • Tables exist, some documentation

  • RAD

    • Version 3.0 to include MAGE and experience with microarrays

      • Pretty much complete

    • Tables exist, mostly documented

  • TESS

    • Tables ready but not created



Schema Development

  • Releases on Sourceforge:

    • CREATE TABLE statements

    • Table dumps from Core::TableInfo, Core::DatabaseDocumentation

    • GIFs of ER diagrams

  • Adding tables between releases

    • In CVS tree?

    • Use message forum for discussion



Documentation

  • Schema Browser looks at TableInfo

  • Plug-in

    • Populates DatabaseDocumentation

    • Input:

      Table\t\tDescription of table

      Table\tAttribute\tDescription of attribute
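
A minimal sketch of reading the tab-delimited input format shown above, assuming three fields per line with an empty middle field for table-level descriptions. The function name `parse_documentation` and the sample descriptions are hypothetical; the actual plug-in is not shown in these slides and may differ.

```python
from typing import Dict, Optional, Tuple

# Key is (table, attribute); attribute is None for a table-level description.
DocKey = Tuple[str, Optional[str]]


def parse_documentation(text: str) -> Dict[DocKey, str]:
    """Parse 'Table<TAB>Attribute<TAB>Description' lines into a dictionary."""
    docs: Dict[DocKey, str] = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 tab-separated fields, got {len(fields)}")
        table, attribute, description = (field.strip() for field in fields)
        docs[(table, attribute or None)] = description
    return docs


# Hypothetical sample input; the descriptions are made up for illustration.
sample = (
    "NAFeature\t\tBase view over NAFeatureImp\n"
    "NAFeature\tname\tDisplay name of the feature\n"
)
docs = parse_documentation(sample)
```

Keying on `(table, attribute)` keeps table-level and attribute-level descriptions in one structure, which matches the two line shapes in the input format.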



GUS Schema Browser

  • http://www.cbil.upenn.edu/cgi-bin/GUS30/schemaBrowser.pl?db=GUS30

  • Points at GUS30 on CBIL development database server (erebus).

    • Need to move? Maintain release view?

  • DoTS Tables:

    • Central dogma

    • Evidence/Similarity

    • ProjectLink

    • SequenceGroupImp/SequenceGroupExperimentImp

    • Plasmomap?

  • Other tables of interest?

