Cluster Computer For Bioinformatics Applications

Nile University, Bioinformatics Group

2008

Done by: Hisham Adel Hassan

Supervised by: Dr. Mohamed Aboualhouda

Points

• Introduction.
• Cluster and Supercomputers.
• Cluster Types and Advantages.
• Our Cluster.
• Cluster Performance.
• Cluster Computer for Basic Problems.
• General Idea about Sequence Alignment.
• BLAST and Parallel BLAST Algorithm.
• Sequence Alignment and Parallel Sequence Alignment.
• Learned Skills.

Cluster Definition

• A group of computers and servers, connected together, that acts like a single system.
• Each system is called a node.
• Each node contains one or more processors, RAM, a hard disk, and a LAN card.
• Nodes work in parallel.
• Performance can be increased by adding more nodes.

Cluster types

• Load Balancing Cluster (Parallel BLAST).
• Computing Cluster (Parallel sequence alignment).
• High-availability (HA) clusters.

Cluster Advantages

• Performance.
• Scalability.
• Maintenance.
• Cost.

Our Cluster

[Figure: four nodes (Node 1 to Node 4), each with an "Internet" label, connected through a central switch.]

Our Cluster specification

• Communication: 5-port 10/100 Mbps switch.
• Master node: dual-core processor, 1.86 GHz; 1 GB RAM.
• Node 1: Pentium 4; 1 GB RAM.
• Node 2: Pentium 4; 1 GB RAM.
• Node 3: Pentium 4; 512 MB RAM.

Our Cluster specification (cont'd)

• Operating system: openSUSE 10.3 (http://software.opensuse.org/)
• Message passing: MPICH2 (http://www.mcs.anl.gov/research/projects/mpich2/)

Performance of the cluster is affected by:

1- Node speed.

2- The running program.

Running Program (Parallel)

1- Data sent: the master node sends data to each of the other nodes.

2- Working: all nodes work on their share in parallel.

3- Finished: each node returns its results, and the master node gets the results.
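Before any data is sent, the master has to decide which part of the input each node handles. The following C sketch shows one common way to split a batch of items (for example, queries) into near-equal contiguous blocks. The helper name `block_range` and the block layout are illustrative assumptions, not taken from the original program; in an MPICH2 program the `rank` and `size` arguments would typically come from `MPI_Comm_rank` and `MPI_Comm_size`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original slides).
 * Computes the half-open range [*begin, *end) of items that node `rank`
 * (0-based) handles when `total` items are split across `size` nodes.
 * The first (total % size) nodes each receive one extra item. */
void block_range(size_t total, int rank, int size, size_t *begin, size_t *end) {
    assert(size > 0 && rank >= 0 && rank < size);
    size_t base = total / (size_t)size;   /* items every node gets */
    size_t extra = total % (size_t)size;  /* leftover items to spread */
    size_t r = (size_t)rank;
    *begin = r * base + (r < extra ? r : extra);
    *end = *begin + base + (r < extra ? 1 : 0);
}
```

With 1000 items and 4 nodes, each node receives a block of 250 items. An even split like this ignores differences in node speed, so on a mixed cluster the slowest node would likely set the total running time.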

Sequence Alignment

Used to:

1- Compare sequences.

2- Search databases.

How to Align Two Sequences

Suppose we have two sequences, A A A C G A and A A T G A.

Let match = 1, gap = -1, mismatch = 0.

They can be aligned as:

1- A A A C G A

   | | | | | |     Score = 3

   A A T _ G A

2- A A A C _ G A

   | | | | | | |   Score = 1

   A A _ _ T G A
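The scoring rule above (match = 1, gap = -1, mismatch = 0) can be checked with a few lines of C. This is a minimal sketch: the function name `alignment_score` is an assumption for illustration, and it expects two rows that are already aligned to the same length, with `_` marking a gap.

```c
#include <assert.h>
#include <string.h>

enum { MATCH = 1, MISMATCH = 0, GAP = -1 };

/* Scores two already-aligned rows of equal length; '_' marks a gap. */
int alignment_score(const char *top, const char *bottom) {
    size_t n = strlen(top);
    assert(n == strlen(bottom));  /* rows must have the same number of columns */
    int score = 0;
    for (size_t i = 0; i < n; i++) {
        if (top[i] == '_' || bottom[i] == '_')
            score += GAP;         /* a gap in either row */
        else if (top[i] == bottom[i])
            score += MATCH;       /* identical letters */
        else
            score += MISMATCH;    /* different letters */
    }
    return score;
}
```

Called as `alignment_score("AAACGA", "AAT_GA")` and `alignment_score("AAAC_GA", "AA__TGA")`, it returns 3 and 1, the same scores as the two alignments shown.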

BLAST (Basic Local Alignment Search Tool)

Searching Databases

BLAST Algorithm (High-Scoring Pairs)

BLAST search types

• BLASTN: compares a nucleotide query sequence against a nucleotide sequence database.
• BLASTP: compares an amino acid query sequence against a protein sequence database.
• TBLASTN: compares a protein query sequence against a nucleotide sequence database (translated in all six reading frames).
• BLASTX: compares a nucleotide query sequence (translated in all six reading frames) against a protein sequence database.

Parallel BLAST (cont'd)

Formatdb.c

Nucleotide sequence database: "formatdb -i DATABASE -p F".

Protein sequence database: "formatdb -i DATABASE -p T".

Parallel BLAST (cont'd)

Linux_Cluster_BLASTALL.c

"blastall -p SEARCH_TYPE -d DATABASE -i QUERY_FILE -o out.txt"
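As a rough illustration of how a C program on each node might assemble this command, the sketch below builds the `blastall` command line for one node. It is not the contents of Linux_Cluster_BLASTALL.c: the helper name `build_blast_cmd` and the per-node file names (`query_<rank>.fa`, `out_<rank>.txt`) are assumptions. Only the `-p`, `-d`, `-i`, and `-o` options come from the slide.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original program).
 * Writes the blastall command for node `rank` into buf.
 * The per-node file names are assumed for illustration.
 * Returns 0 on success, -1 on empty arguments or if buf is too small. */
int build_blast_cmd(char *buf, size_t len, const char *program,
                    const char *database, int rank) {
    assert(buf != NULL && program != NULL && database != NULL);
    if (strlen(program) == 0 || strlen(database) == 0)
        return -1;  /* reject empty arguments */
    int n = snprintf(buf, len,
                     "blastall -p %s -d %s -i query_%d.fa -o out_%d.txt",
                     program, database, rank, rank);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

The resulting string could then be passed to `system()` on each node. Giving every node its own output file avoids nodes overwriting each other's results.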

Results

Average of running 1000 queries, 1000 times.

• Performance: better when using the cluster.
• Scalability: running time decreases as more nodes are added.

Sequence Alignment

Comparing Sequences

Sequence Alignment

• Introduction.
• Sequence Alignment Benefits.
• Sequence Alignment Types.

Our Sequence Alignment Program

• Pairwise Alignment.
• Built using the Needleman-Wunsch algorithm.
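For reference, the core of Needleman-Wunsch is a dynamic-programming table in which each cell holds the best score for aligning two prefixes. The sketch below is a minimal serial version that returns only the optimal global score. It is not the group's program: the function name `nw_score` is an assumption, and the scoring values (match = 1, mismatch = 0, gap = -1) are borrowed from the earlier alignment example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { MATCH = 1, MISMATCH = 0, GAP = -1 };

static int max3(int a, int b, int c) {
    int m = a > b ? a : b;
    return m > c ? m : c;
}

/* Returns the optimal global alignment score of s and t.
 * Keeps only two rows of the table, so memory is O(strlen(t)). */
int nw_score(const char *s, const char *t) {
    assert(s != NULL && t != NULL);
    size_t n = strlen(s), m = strlen(t);
    int *prev = malloc((m + 1) * sizeof *prev);
    int *curr = malloc((m + 1) * sizeof *curr);
    if (prev == NULL || curr == NULL)
        abort();

    /* Row 0: aligning the empty prefix of s against j letters of t. */
    for (size_t j = 0; j <= m; j++)
        prev[j] = (int)j * GAP;

    for (size_t i = 1; i <= n; i++) {
        curr[0] = (int)i * GAP;  /* i letters of s against nothing */
        for (size_t j = 1; j <= m; j++) {
            int sub = (s[i - 1] == t[j - 1]) ? MATCH : MISMATCH;
            curr[j] = max3(prev[j - 1] + sub,  /* pair s[i-1] with t[j-1] */
                           prev[j] + GAP,      /* gap in t */
                           curr[j - 1] + GAP); /* gap in s */
        }
        int *tmp = prev;  /* the row just filled becomes the previous row */
        prev = curr;
        curr = tmp;
    }

    int result = prev[m];
    free(prev);
    free(curr);
    return result;
}
```

For the sequences AAACGA and AATGA this returns 3, the score of the first alignment shown earlier. Recovering the alignment itself would require keeping the full table and tracing back from the last cell.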

Learned Skills.

• Using the Linux (openSUSE 10.3) operating system.
• Programming in the C language.
• Cluster computers and how to build one.
• MPICH2 for message passing between nodes.
• LaTeX.
• Teamwork and helping each other.
• Presentation skills.