Architectural Constraints on Current Bioinformatics Integration Systems

Norman Paton

Department of Computer Science

University of Manchester

Manchester, UK

<norm>@cs.man.ac.uk

Structure of Presentation
  • Current integration proposals.
    • What they support.
    • What they don’t support, and why.
  • Requirements for integration.
    • What could be useful, and why.
  • Grid opportunities.
    • Relevant Grid technologies.
    • Absent Grid technologies.
SRS

Sequence Retrieval System

http://srs.ebi.ac.uk/

SRS In Use

  • List of Databases
  • Search Interfaces
  • Selected Databases

SRS Results

Links to Result Records

BioNavigator
  • BioNavigator combines data sources and the tools that act over them.
  • As tools act on specific kinds of data, the interface makes available only tools that are applicable to the data in hand.

Online trial from:

https://www.bionavigator.com/
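
The idea of offering only the tools that apply to the data in hand can be sketched as a simple filter. This is an illustrative sketch, not BioNavigator's implementation: the `Tool` class, the tool names and the data kinds are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    accepts: frozenset  # kinds of data the tool can act on


# Hypothetical tool catalogue; names and data kinds are illustrative only.
TOOLS = [
    Tool("translate", frozenset({"dna"})),
    Tool("reverse_complement", frozenset({"dna"})),
    Tool("hydrophobicity_plot", frozenset({"protein"})),
    Tool("similarity_search", frozenset({"dna", "protein"})),
]


def applicable_tools(data_kind, tools=TOOLS):
    """Return the names of the tools that accept the given kind of data."""
    return [tool.name for tool in tools if data_kind in tool.accepts]


print(applicable_tools("protein"))
# ['hydrophobicity_plot', 'similarity_search']
```

An interface built this way never shows a DNA-only tool while a protein record is selected, which is the restriction the slide describes.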

Initiating Navigation

  • Select database
  • Enter accession number

Viewing Selected Data

  • Navigate to related programs
  • Relevant display options

Chaining Analyses in Macros

Chained collections of navigations can be saved as macros and restored for later use.
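
A macro of this kind can be thought of as a saved list of step names that is replayed later. The sketch below assumes a small table of hypothetical string-processing steps (`STEPS`) and uses JSON as the saved form; none of these names come from BioNavigator.

```python
import json

# Hypothetical analysis steps; each takes a sequence string and returns a new one.
STEPS = {
    "upper": str.upper,
    "reverse": lambda s: s[::-1],
    "complement": lambda s: s.translate(str.maketrans("ACGT", "TGCA")),
}


def run_macro(step_names, value):
    """Apply the named steps in order, feeding each result to the next step."""
    for name in step_names:
        value = STEPS[name](value)
    return value


def save_macro(step_names):
    """Serialise a chain of step names so it can be stored."""
    return json.dumps(step_names)


def load_macro(text):
    """Restore a chain of step names from its stored form."""
    return json.loads(text)


saved = save_macro(["upper", "reverse", "complement"])
print(run_macro(load_macro(saved), "atgc"))
# GCAT
```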

Current Public Integration Systems
  • Location: data is replicated – under control.
  • Integration model: often minimal.
  • Architecture: often two-tier.
  • Analysis support: Query and analysis access is carefully contained.

Only very careful instantiation of the classification yields sufficiently predictable performance.

Example Analysis
  • Data:
    • Yeast genome sequence.
    • Protein-protein interaction data.
    • 350 transcriptome experiments.
    • Overall database ~350 MB.
  • Analysis:
    • Correlate transcription of interacting proteins.
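
One way to read "correlate transcription of interacting proteins" is to compute, for every interacting pair, the Pearson correlation between the two genes' expression profiles across the experiments. The sketch below makes that assumption; the gene names and expression values are invented for illustration and are not taken from the yeast data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length profiles, or None if one is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Made-up expression levels, one value per transcriptome experiment.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

# Made-up protein-protein interactions between the gene products.
interactions = [("geneA", "geneB"), ("geneA", "geneC")]


def correlate_interactions(expression, interactions):
    """Map each interacting pair to the correlation of its expression profiles."""
    return {
        (a, b): pearson(expression[a], expression[b])
        for a, b in interactions
    }


for pair, r in correlate_interactions(expression, interactions).items():
    print(pair, r)
```

With 350 experiments and a genome's worth of interactions the same loop touches far more data, which is where the need to break runs into pieces arises.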
Features of Experience
  • Challenging to conduct single runs of analyses – must break into bits.
  • These are modest data sets compared with what is coming.
  • Environment has been designed with analysis in mind.
  • These analyses will never make it into the public release!
Requirements for Integration
  • Location: replication is transparent.
  • Integration model: standards.
  • Architecture: Flexible, multiple tier.
  • Analysis support: Arbitrary analyses over diverse data sets.

True integration in bioinformatics should not just be data oriented, but involve integration of analyses.

Three Tier Architecture
  • Clients handle user interaction and presentation.
  • Application servers perform computation and analysis.
  • Data servers manage and query databases.

Diagram: Client, Application Server and Data Server tiers.
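
The division of labour between the three tiers can be sketched in a few lines. This is a toy, in-process sketch under stated assumptions: the class names, the `gc_content` analysis and the accession `X1` are hypothetical, and a real deployment would place each tier on a separate machine.

```python
class DataServer:
    """Data tier: manages and queries the data (a dict stands in for a database)."""

    def __init__(self, sequences):
        self._sequences = dict(sequences)

    def fetch(self, accession):
        return self._sequences[accession]


class ApplicationServer:
    """Application tier: performs computation over data it asks the data tier for."""

    def __init__(self, data_server):
        self._data = data_server

    def gc_content(self, accession):
        sequence = self._data.fetch(accession)
        return sum(base in "GC" for base in sequence) / len(sequence)


class Client:
    """Client tier: handles presentation only and never touches the data tier."""

    def __init__(self, app_server):
        self._app = app_server

    def show_gc(self, accession):
        return f"{accession}: GC content {self._app.gc_content(accession):.0%}"


client = Client(ApplicationServer(DataServer({"X1": "ATGCGC"})))
print(client.show_gc("X1"))
# X1: GC content 67%
```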
Three Tier Architecture
  • Scalability:
    • Replace/Upgrade components as needed.
    • Replace/Upgrade layers independently.
  • Flexibility:
    • Application server layer protects clients from changes in database layer.

Classical three tier architectures are configured statically, and are adapted slowly as needs evolve.
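
The flexibility claim, that the application server layer shields clients from changes in the database layer, can be illustrated by swapping the storage behind an unchanged service. The two store classes and `LengthService` below are hypothetical names invented for this sketch.

```python
class ListBackedStore:
    """Older data layer: rows kept as (accession, sequence) tuples."""

    def __init__(self, rows):
        self._rows = list(rows)

    def fetch(self, accession):
        for stored_accession, sequence in self._rows:
            if stored_accession == accession:
                return sequence
        raise KeyError(accession)


class DictBackedStore:
    """Replacement data layer: same fetch() contract, different storage."""

    def __init__(self, table):
        self._table = dict(table)

    def fetch(self, accession):
        return self._table[accession]


class LengthService:
    """Application layer: the only thing the client talks to."""

    def __init__(self, store):
        self._store = store

    def sequence_length(self, accession):
        return len(self._store.fetch(accession))


def client_report(service, accession):
    """Client code: unchanged whichever store sits behind the service."""
    return f"{accession} has {service.sequence_length(accession)} bases"


old = LengthService(ListBackedStore([("X1", "ATGC")]))
new = LengthService(DictBackedStore({"X1": "ATGC"}))
print(client_report(old, "X1") == client_report(new, "X1"))
# True
```

Note that the choice of store is still fixed when the service is constructed, which mirrors the static configuration the slide points out.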

Necessary and Missing
  • Necessary:
    • Directory services.
    • Discovery services.
    • Co-allocation.
    • Data replication.
    • Workload management.
    • Accounting and payment.
  • Missing:
    • Databases.
    • Data models.
    • Heterogeneity resolution.
    • Personalisation.
    • Web services.
    • Standards.
Dynamic Multi-Tier

Resources need to be identified, selected and scheduled dynamically.

Diagram: one Client, three Application Servers, two Data Servers.
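
Identifying, selecting and scheduling resources at run time can be sketched with a small directory and a least-loaded selection rule. Everything here is an assumption for illustration: the `Directory` and `Resource` classes, the service names, and the choice of "shortest queue" as the selection policy. It is not a description of any Grid middleware.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    services: set
    queued: list = field(default_factory=list)


class Directory:
    """Directory service: resources register here and are discovered by service name."""

    def __init__(self):
        self._resources = []

    def register(self, resource):
        self._resources.append(resource)

    def discover(self, service):
        return [r for r in self._resources if service in r.services]


def schedule(directory, service, job):
    """Identify candidates, select the least-loaded one, and queue the job on it."""
    candidates = directory.discover(service)
    if not candidates:
        raise LookupError(f"no resource offers {service!r}")
    chosen = min(candidates, key=lambda r: len(r.queued))
    chosen.queued.append(job)
    return chosen.name


directory = Directory()
directory.register(Resource("app1", {"align"}))
directory.register(Resource("app2", {"align", "cluster"}))
directory.register(Resource("data1", {"query"}))

print([schedule(directory, "align", f"job{i}") for i in range(3)])
# ['app1', 'app2', 'app1']
```

Because the client names a service rather than a server, servers can be added or removed from the directory without changing client code.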

Grid Classification

The current Grid is not the answer, but the answer subsumes the current facilities of the Grid.

Summary
  • Current integration facilities in biology:
    • Are cunningly restrictive.
    • Make the most of limited distributed computational architectures.
  • The Grid is bringing to the table:
    • Resource description facilities.
    • Resource scheduling and workflow management facilities.
  • The Grid does not directly address current needs in biology, but its descendants may.