Exploring Williams-Beuren Syndrome using
This presentation is the property of its rightful owner.
Sponsored Links
1 / 19

Exploring Williams-Beuren Syndrome using my Grid PowerPoint PPT Presentation


  • 148 Views
  • Uploaded on
  • Presentation posted in: General

Exploring Williams-Beuren Syndrome using my Grid. R.D. Stevens, a H.J. Tipney, b C.J. Wroe, a T.M. Oinn, c M. Senger, c P.W. Lord, a C.A. Goble, a A. Brass, a M. Tassabehji b. b University of Manchester, Academic Unit of Medical Genetics St Mary’s Hospital.

Download Presentation

Exploring Williams-Beuren Syndrome using my Grid

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Exploring williams beuren syndrome using my grid

Exploring Williams-Beuren Syndrome using myGrid

R.D. Stevens,a H.J. Tipney,b C.J. Wroe,a T.M. Oinn,c

M. Senger,c P.W. Lord,a C.A. Goble,a A. Brass,a

M. Tassabehji b

bUniversity of Manchester,

Academic Unit of Medical Genetics

St Mary’s Hospital

cEuropean Bioinformatics Institute

Wellcome Trust Genome Campus

Hinxton

a Department of Computer Science

University of Manchester


Williams beuren syndrome wbs

Williams-Beuren Syndrome (WBS)

  • Congenital disorder caused by sporadic gene deletion

  • 1/20,000 live births

  • Effects multiple systems – muscular, nervous, circulatory

  • Characteristic facial features

  • Unique cognitive profile

  • Mental retardation (IQ 40-100, mean~60, ‘normal’ mean ~ 100 )

  • Outgoing personality, friendly nature, ‘charming’

  • Haploinsuffieciency of the region results in the phenotype


Williams beuren syndrome microdeletion

~1.5 Mb

7q11.23

Patient deletions

*

*

WBS

SVAS

GTF2IRD2P

Physical Map

CTA-315H11

‘Gap’

CTB-51J22

FKBP6T

POM121

GTF2IP

NOLR1

NCF1P

PMS2L

STAG3

Chr 7 ~155 Mb

Block B

Block A

Block C

Williams-Beuren Syndrome Microdeletion

Eicher E, Clark R & She, X An Assessment of the Sequence Gaps: Unfinished Business in a Finished Human Genome. Nature Genetics Reviews (2004) 5:345-354

Hillier L et al. The DNA Sequence of Human Chromosome 7. Nature (2003) 424:157-164

A-cen

B-cen

C-cen

C-mid

B-mid

A-mid

B-tel

A-tel

C-tel

WBSCR1/E1f4H

WBSCR5/LAB

GTF2IRD1

WBSCR21

WBSCR22

WBSCR18

WBSCR14

GTF2IRD2

POM121

NOLR1

BAZ1B

BCL7B

FKBP6

GTF2I

CLDN3

CLDN4

CYLN2

STX1A

LIMK1

NCF1

TBL2

RFC2

FZD9

ELN


Exploring williams beuren syndrome using my grid

12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt 12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt 12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct 12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt 12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt 12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt 12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg 12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga 12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc 12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa 12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa

Filling a genomic gap in Silico

  • Identify new, overlapping sequence of interest

  • Characterise the new sequence at nucleotide and amino acid level

Cutting and pasting between numerous web-based services i.e. BLAST, InterProScan etc


Exploring williams beuren syndrome using my grid

Filling a genomic gap in silico

  • Frequently repeated – info rapidly added to public databases

  • Time consuming and mundane

  • Don’t always get results

  • Huge amount of interrelated data is produced – handled in notebooks and files saved to local hard drive

  • Much knowledge remains undocumented:

    Bioinformatician does the analysis

    Advantages:

    Specialist human intervention at every step, quick and easy access

    to distributed services

    Disadvantages:

    Labour intensive, time consuming, highly repetitive and error prone

    process, tacit procedure so difficult to share both protocol and

    results


Why workflows and services

Why Workflows and Services?

Workflow = general technique for describing and enacting a process

Workflow = describes what you want to do, not how you want to do it

Web Service = how you want to do it

Web Service = automated programmatic internet access to applications

  • Automation

    • Capturing processes in an explicit manner

    • Tedium! Computers don’t get bored/distracted/hungry/impatient!

    • Saves repeated time and effort

  • Modification, maintenance, substitution and personalisation

  • Easy to share, explain, relocate, reuse and build

  • Available to wider audience: don’t need to be a coder, just need to know how to do Bioinformatics

  • Releases Scientists/Bioinformaticians to do other work

  • Record

    • Provenance: what the data is like, where it came from, its quality

    • Management of data (LSID - Life Science Identifiers)


My grid

myGrid

  • E-Science pilot research project funded by EPSRC www.mygrig.org.uk

  • Manchester, Newcastle, Sheffield, Southampton, Nottingham, EBI and RFCGR, also industrial partners.

  • ‘targeted to develop open source software to support personalised in silico experiments in biology on a grid.’

www.mygrid.org.uk

Which means….

Distributed computing – machines, tools, databanks, people

Personalisation

Provenance and Data management

Enactment and notification

A virtual lab ‘workbench’, a toolkit which serves life science communities.


Exploring williams beuren syndrome using my grid

SOAPLAB

Web Service

Any Application

Web Service

e.g. DDBJ BLAST

Workflow Components

Freefluo

Freefluo Workflow engine to run workflows

Scufl Simple Conceptual Unified Flow Language

Taverna Writing, running workflows & examining results

SOAPLAB Makes applications available


Exploring williams beuren syndrome using my grid

Query nucleotide sequence

Williams Workflow Plan

RepeatMasker

BLASTwrapper

Pink: Outputs/inputs of a service

Purple: Tailor-made services

Green: Emboss soaplab services

Yellow: Manchester soaplab services

GenBank Accession No

Promotor Prediction

URL inc GB identifier

TF binding Prediction

Translation/sequence file. Good for records and publications

prettyseq

Regulation Element Prediction

GenBank Entry

Amino Acid translation

Sort for appropriate Sequences only

Identifies PEST seq

epestfind

Identify regulatory elements in genomic sequence

Seqret

Identifies FingerPRINTS

pscan

MW, length, charge, pI, etc

Nucleotide seq (Fasta)

pepstats

6 ORFs

Predicts Coiled-coil regions

RepeatMasker

pepcoil

tblastn Vs nr, est, est_mouse, est_human databases.

Blastp Vs nr

Coding sequence

GenScan

BlastWrapper

Restriction enzyme map

restrict

SignalP

TargetP

PSORTII

sixpack

Predicts cellular location

transeq

CpG Island locations and %

cpgreport

Identifies functional and structural domains/motifs

InterPro

RepeatMasker

Repetitive elements

ORFs

Hydrophobic regions

Pepwindow?

Octanol?

Blastn Vs nr, est databases.

ncbiBlastWrapper


Exploring williams beuren syndrome using my grid

The Williams

Workflows

A

B

C

A: Identification of overlapping sequence

B: Characterisation of nucleotide sequence

C: Characterisation of protein sequence


The workflow experience

The Workflow Experience

Have workflows delivered on their promise?

YES!

  • Correct and Biologically meaningful results

  • Automation

    • Saved time, increased productivity

    • Process split into three, you still require humans!

  • Sharing

    • Other people have used and want to develop the workflows

  • Change of work practises

    • Post hoc analysis. Don’t analyse data piece by piece receive all data all at once

    • Data stored and collected in a more standardised manner

    • Results amplification

    • Results management and visualisation


Exploring williams beuren syndrome using my grid

The Workflow Experience

  • Activation energy versus Reusability trade-off

    • Lack of ‘available’ services, levels of redundancy can be limited

    • But once available can be reused for the greater good of the community

  • Licensing of Bioinformatics Applications

    • Means can’t be used outside of licensing body

    • No license = access third-party websites

  • Instability of external services

    • Research level

    • Reliant on other peoples servers

    • Taverna can retry or substitute before graceful failure

  • Shims


Shims

Shims

shim

(sh m) n. A thin, often tapered piece of material used to fill gaps, make something level, or adjust something to fit properly.

shimmed,shim·ming,shims

To fill in, level, or adjust by using shims or a shim.

  • Explicitly capturing the process

  • Unrecorded ‘steps’ which aren’t realised until attempting to build something

  • Enable services to fit together


Shims1

‘I want to identify new sequences which overlap with my query sequence and determine if they are useful’

Sequence database entry

Fasta format sequence

Genbank format sequence

Sequence

i.e. last

known 3000bp

Identify new sequences and determine their degree of identity

Mask

BLAST

Simplify and Compare

Retrieve

Lister

Old BLAST result

Alignment of full query sequence V full ‘new’ sequence

BLAST2

Shims


The biological results

WBSCR21

WBSCR27

WBSCR24

WBSCR18

WBSCR22

WBSCR28

STX1A

CLDN3

CLDN4

RP11-148M21

RP11-731K22

RP11-622P13

314,004bp extension

All nine known genes identified

The Biological Results

Four workflow cycles totalling ~ 10 hours

The gap was correctly closed and all known features identified

WBSCR14

ELN

CTA-315H11

CTB-51J22


Conclusions

Conclusions

  • It works – a new tool has been developed which is being utilised by biologists

  • More regularly undertaken, less mundane, less error prone

  • Once notification is installed won’t even need to initiate it

  • More systematic collection and analysis of results

  • Increased productivity

  • Services: only as good as the individual services, lots of them, we don’t own them, many are unique and at a single site, research level software, reliant on other peoples services, licenses

  • Activation energy


Future directions

Future Directions

  • Scheduling and Notification

  • Portals

  • Results visualisation

  • Re-use: other genomic disorders, Graves Disease


Acknowledgments

www.mygrid.org.uk

Acknowledgments

  • Dr May Tassabehji

  • Prof Andy Brass

  • Medical Genetics team at St Marys Hospital, Manchester

  • Wellcome Trust


Exploring williams beuren syndrome using my grid

myGrid People

www.mygrid.org.uk

Core

Matthew Addis, Nedim Alpdemir, Tim Carver, Rich Cawley, Neil Davis, Alvaro Fernandes, Justin Ferris, Robert Gaizaukaus, Kevin Glover, Carole Goble, Chris Greenhalgh, Mark Greenwood, Yikun Guo, Ananth Krishna, Peter Li, Phillip Lord, Darren Marvin, Simon Miles, Luc Moreau, Arijit Mukherjee, Tom Oinn, Juri Papay, Savas Parastatidis, Norman Paton, Terry Payne, Matthew Pockock Milena Radenkovic, Stefan Rennick-Egglestone, Peter Rice, Martin Senger, Nick Sharman, Robert Stevens, Victor Tan, Anil Wipat, Paul Watson and Chris Wroe.

Users

Simon Pearce and Claire Jennings, Institute of Human Genetics School of Clinical Medical Sciences, University of Newcastle, UK

Hannah Tipney, May Tassabehji, Andy Brass, St Mary’s Hospital, Manchester, UK

Postgraduates

Martin Szomszor, Duncan Hull, Jun Zhao, Pinar Alper, John Dickman, Keith Flanagan, Antoon Goderis, Tracy Craddock, Alastair Hampshire

Industrial

Dennis Quan, Sean Martin, Michael Niemi, Syd Chapman (IBM)

Robin McEntire (GSK)

Collaborators

Keith Decker


  • Login