
Faster finds from Gallo to Google

Presented to the Niagara University Bioinformatics Seminar

Dr. Laurence Boxer

Department of Computer and Information Sciences

Applications to string search problems from:

L. Boxer and R. Miller, Coarse Grained Gather and Scatter Operations with Applications, Journal of Parallel and Distributed Computing, 64 (2004), 1297-1320

The Problem:

Examples using case-insensitive exact matches

Given two character strings, a “pattern” and a “text” (with the text typically much larger than the pattern), find all matching copies of the pattern in the text.

P: agtacagtac

T: actaactagtacagtacagtacaactgtccatccg

Output: matches beginning at (1-based) positions 8 and 13 of T, i.e., two overlapping copies of agtacagtac.

P: Gallo

T: If Professor Gallo serves many gallons of home-brewed wine to students who do dastardly deeds in the hallowed DePaul hallways, how many will go to the gallows? Better they should have a singalong…. He used a lame pickup line: “Is this little gal lonely?”

Output: three exact matches, in “Gallo”, “gallons”, and “gallows”.
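The slides give no code; as a minimal Python sketch (not part of the original talk), case-insensitive exact matching can be done naively by sliding the pattern along the text:

```python
def find_exact(pattern, text):
    """Return the 0-based start indices of all (possibly overlapping)
    case-insensitive exact matches of pattern in text."""
    p, t = pattern.lower(), text.lower()
    m = len(p)
    return [i for i in range(len(t) - m + 1) if t[i:i + m] == p]

# The DNA example above: two overlapping matches, at 0-based 7 and 12.
print(find_exact("agtacagtac", "actaactagtacagtacagtacaactgtccatccg"))
```

This naive scan costs Θ(mn) in the worst case; the Θ(n)-time sequential algorithms mentioned later (e.g., Knuth-Morris-Pratt) avoid rescanning, but the naive version suffices to illustrate the problem.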

Additional “finds” when a small number of errors (mismatch, insert, delete) are permitted

P: Gallo

T: If Professor Gallo serves many gallons of home-brewed wine to students who do dastardly deeds in the hallowed DePaul hallways, how many will go to the gallows? Better they should have a singalong…. He used a lame pickup line: “Is this little gal lonely?”

Output: the exact matches above, plus approximate matches in “hallowed”, “singalong”, and “gal lonely”.

1 character mismatch: “h” for “g”

Must delete one space for perfect match

Must insert one “l” for perfect match
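The mismatch/insert/delete model above is edit distance. A standard dynamic-programming sketch for this (Sellers' variant of the edit-distance recurrence; the algorithm name and code are my addition, not from the slides):

```python
def approx_find(pattern, text, k):
    """Return 0-based end indices j such that some substring of text ending
    at j matches pattern with at most k errors (mismatch, insert, delete).
    Case-insensitive, like the slide's examples."""
    p, t = pattern.lower(), text.lower()
    m = len(p)
    prev = list(range(m + 1))          # D[i][0] = i: i pattern chars unmatched
    hits = []
    for j, c in enumerate(t):
        cur = [0]                      # D[0][j] = 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if p[i - 1] == c else 1
            cur.append(min(prev[i] + 1,          # skip this text character
                           cur[i - 1] + 1,       # skip a pattern character
                           prev[i - 1] + cost))  # match or mismatch
        if cur[m] <= k:
            hits.append(j)
        prev = cur
    return hits

# "hallo" in "hallowed" matches "gallo" with one mismatch; it ends at index 4.
print(approx_find("gallo", "hallowed", 1))
```

With k = 1 this finds the three additional “finds” of the slide: the h-for-g mismatch in “hallowed”, the missing “l” in “singalong”, and the extra space in “gal lonely”.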

Analysis of algorithms
  • Seek to estimate the proportional running time T(n) of an algorithm when applied to a data set of size n.
  • T(n) = Θ(f(n)) if, for large n, T(n) is approximately proportional to f(n).
  • T(n) = O(f(n)) if, for large n, T(n) is at most proportional to f(n), i.e., bounded above by something that's Θ(f(n)).
  • Emphasis on large n; for small n, even an inefficient algorithm may finish in acceptable time.
Previous State of Knowledge for exact string matching (algorithms for sequential computers)
  • Using |·| to denote the number of characters in a string, suppose
  • |T| = n, |P| = m,
  • where 1 < m < n (usually, m << n).
  • In the worst case, all the input must be considered (otherwise, we may miss a match). There exist Θ(n)-time solutions for sequential computers, which, therefore, are optimal in the worst case.
  • However, n may be so large that Θ(n) time may be unacceptable.
  • Speedup may come by using sequential algorithms highly probable to run faster than worst-case time (topic of another talk).
  • We may use parallel computers to get faster results (topic of today’s talk).
  • Input size is Θ(m+n); since n < m+n < n+n = 2n, the input size is Θ(n).
Parallel vs. sequential computers
  • Ideally, a parallel computer with q processors should solve a problem in 1/q-th of the time that a sequential computer requires.
  • Thus, if T(n) is the time for a sequential computer to solve a given problem, then we want the parallel computer to use Θ(T(n)/q) time.
  • But achieving this level of speedup may be difficult or impossible, because time is required to exchange data among processors.
  • The time required for standard data exchange operations depends on the configuration of processors.
Examples of parallel architectures with times to broadcast a unit of data

Linear array. q - 1 = Θ(q) steps to send a unit of data from the leftmost to the rightmost processor.

Mesh (√q × √q grid). Θ(√q) steps:

1. Source row (a linear array) broadcasts across the row.

2. In parallel, each column linear array broadcasts across its column.

Example - tree

In the 1st step, the root broadcasts to each of its “children”; in subsequent steps, in parallel, the nodes at a given level that have just received the datum broadcast to their children. Thus, time is proportional to the number of levels, Θ(log q).
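The two broadcast costs above can be checked with a small simulation (my sketch; it assumes the idealized step rule described above, in which every informed processor forwards to all uninformed neighbors in one step, so the round count is just the BFS depth from the source):

```python
from collections import deque

def broadcast_rounds(adj, root):
    """Parallel steps to broadcast one datum from root, assuming every
    informed processor forwards to all uninformed neighbors each step;
    this equals the BFS depth (eccentricity) of the root."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

# Linear array of q = 8 processors: q - 1 = 7 steps from one end.
line = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
# Complete binary tree with 7 nodes (children of i are 2i+1, 2i+2): 2 levels.
tree = {i: [j for j in (2 * i + 1, 2 * i + 2) if j < 7]
            + ([(i - 1) // 2] if i > 0 else []) for i in range(7)}
print(broadcast_rounds(line, 0), broadcast_rounds(tree, 0))  # 7 2
```

The linear array's rounds grow as Θ(q), the balanced tree's as Θ(log q), matching the two architectures just discussed.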

Communications problems for string matching problems
  • Data is distributed (in segments of consecutive characters) among processors:
  • Occurrences of matches may be broken among processors. Hence we want to share copies of the 1st m-1 characters of T in a processor with the processor containing the previous segment of T.
  • Would be useful to have copy of P in each processor.

For the exact matching problem …

[Figure: T split into segments, one per processor, with a copy of P: Gallo aligned under each segment; the fragment “lows” at the start of one segment shows how a match (“gallows”) can be broken across a processor boundary.]

  • Suppose we take the following steps:
  • Each processor gets a copy of all of P.
  • Each processor gets the 1st m-1 characters of T initially stored in the processor holding the next segment of T.

Then, in parallel, each processor can run an optimal sequential algorithm on its n/q + m - 1 characters of the data in Θ(n/q + m) time.

So, how do we perform these data movements efficiently?
  • Keys: efficient gather and scatter operations
  • Gather: given a unit of data in each processor, get a copy of each of these values into one processor.
Scatter: return gathered items to their original processors (typically after modification by a sequential algorithm)
How to gather/scatter efficiently (q = # of processors)
  • If not already known, identify a minimal spanning tree (MST) rooted at the processor to which data is to be gathered. This is done as follows:
  • Root sends message to each neighbor.
  • Each non-root processor waits for a message. First message to arrive identifies processor’s parent. Upon receipt, send message to each neighbor identifying sender’s parent.
  • Each processor receives messages described above. If A receives a message from B identifying A as parent of B, A knows B is A’s child.
  • Advanced techniques show this takes O(q) time.
  • Performing the gather: In parallel, each processor sends data to its parent processor in the MST until each value reaches the root processor. This takes Θ(q) time.
  • Thus, a gather operation takes Θ(q) time.

To scatter efficiently: reverse the direction of data flow for a gather operation: Θ(q) time.
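The parent-discovery and gather steps can be sketched as follows (my simulation; it idealizes communication in that a processor may receive from several children in one step, so exact constants differ from a real network):

```python
from collections import deque

def spanning_tree_parents(adj, root):
    """BFS from the root: a processor's parent is the neighbor whose
    message reaches it first, as in the construction described above."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def gather_steps(parent, root):
    """Simulate the gather: in each step, every non-root processor still
    holding data forwards one value to its parent; count steps until the
    root holds all q values."""
    held = {u: deque([u]) for u in parent}     # one datum per processor
    steps = 0
    while len(held[root]) < len(parent):
        senders = [(u, p) for u, p in parent.items()
                   if p is not None and held[u]]
        for u, p in senders:
            held[p].append(held[u].popleft())
        steps += 1
    return steps

# Worst case for the claimed Θ(q) bound: a linear array of 8 processors
# rooted at one end; the pipelined gather finishes in q - 1 = 7 steps.
line = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
parent = spanning_tree_parents(line, 0)
print(gather_steps(parent, 0))  # 7 steps = q - 1
```

Running the scatter is the same data flow reversed, so it takes the same number of steps, consistent with the Θ(q) bound above.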

Getting a complete copy of P to each processor, assuming m < n/q (P small enough to fit one processor)
  • Gather a dummy record from each processor to one processor – Θ(q) time.
  • Gather P to this processor, pipelining the data flow if more than one character of P is stored in any processor. Time is Θ(m+q) = Θ(max{m,q}) .
  • For each character of P, tag each dummy record with the character and scatter, pipelining. Pipelining reduces the time from the Θ(mq) one might expect (m separate scatters of Θ(q) time apiece) to Θ(md+q) = Θ(max{md,q}) (m scatters that overlap in time), where d, the degree bound, is the maximum number of neighboring processors of any given processor (1 ≤ d ≤ q - 1).
  • Total time: Θ(md+q) = Θ(max{md,q}). If both md < n/q and q < n/q, the total time is O(n/q).
Getting each processor the m-1 characters of T that follow the processor's last character of T (case 1):

Suppose processors holding consecutive segments of T are adjacent (this is possible for linear arrays, meshes using snake-like order for processors, hypercubes; not for trees, etc). Then:

  • In parallel, each odd-numbered processor gets the 1st m-1 characters of T that are stored in the next (even-numbered) processor. This takes Θ(m) time via direct communication (since these processors are adjacent).
  • Similarly, in parallel, each even-numbered processor gets the 1st m-1 characters of T that are stored in the next (odd-numbered) processor. This takes Θ(m) time via direct communication.
  • Thus, total time for this process is Θ(m).
Getting each processor the m-1 characters of T that follow the processor's last character of T (case 2):

Suppose processors holding consecutive segments of T are not adjacent. Then:

  • In parallel, each processor copies its 1st m-1 characters of T, tagged with the index of the processor holding the previous segment. This takes Θ(m) time.
  • Sort these (m-1)q = Θ(mq) data values by their processor-index tags, so that each ends up in the processor holding the previous segment. This takes the time of a parallel sort of Θ(mq) records on q processors.

  • Thus, total time for this task is Θ(m) plus the time of the parallel sort.
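The tag-and-sort delivery can be sketched as follows (my illustration; the "parallel" sort is simulated by Python's sequential sort, and records are simple (destination, characters) tuples):

```python
def exchange_overlaps_by_sorting(segments, m):
    """Case 2 sketch: each processor tags its first m-1 characters of T
    with the index of the processor holding the previous segment; a global
    sort by tag then delivers every record to its destination, where it is
    appended to that processor's segment."""
    records = [(i - 1, seg[:m - 1])            # destination = previous processor
               for i, seg in enumerate(segments) if i > 0]
    records.sort(key=lambda r: r[0])           # the Θ(mq)-record parallel sort
    overlap = dict(records)                    # destination -> m-1 characters
    return [seg + overlap.get(i, "") for i, seg in enumerate(segments)]

# Each segment (except the last) gains the next segment's first m-1 = 2 chars.
print(exchange_overlaps_by_sorting(["abcde", "fghij", "klmno"], 3))
```

After this exchange, every processor holds its own segment plus the overlap it needs, so no match straddling a boundary can be missed.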

Thus, we have the following algorithm for the exact string pattern matching problem on a coarse grained parallel computer with q processors:

0) T is distributed among processors in segments of n/q characters apiece.

  • Distribute to each processor a copy of all of P as described above, in Θ(md+q) = Θ(max{md,q}) time. If both q < n/q (coarse grained parallel computer) and md < n/q, the total time is O(n/q).
  • Distribute to each processor a copy of the 1st m-1 characters of the next segment of T. This takes Θ(m) time if processors with consecutive segments are adjacent; otherwise, it takes the time of a parallel sort of Θ(mq) records (case 2 above).
  • Each processor runs an optimal sequential algorithm on its n/q + m - 1 characters of T in Θ(n/q + m) time; this reduces to Θ(n/q), since m = O(n/q).
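The whole algorithm can be simulated sequentially (my sketch, assuming ceil(n/q)-character segments and a naive stand-in for the optimal per-processor algorithm):

```python
def find_exact(pattern, text):
    """Case-insensitive exact matching: the per-processor sequential
    algorithm (a naive stand-in for an optimal one here)."""
    p, t = pattern.lower(), text.lower()
    m = len(p)
    return [i for i in range(len(t) - m + 1) if t[i:i + m] == p]

def parallel_find(pattern, text, q):
    """Simulate the algorithm above: split text into q segments of about
    n/q characters, extend each segment by the first m-1 characters of
    the next one (step 2), and search each extended segment independently
    (step 3)."""
    m, n = len(pattern), len(text)
    seg = -(-n // q)                             # ceil(n/q) characters apiece
    hits = []
    for pid in range(q):
        start = pid * seg
        chunk = text[start:start + seg + m - 1]  # segment + overlap
        hits.extend(start + i for i in find_exact(pattern, chunk))
    return sorted(hits)

T = "actaactagtacagtacagtacaactgtccatccg"
print(parallel_find("agtacagtac", T, 4))  # matches at 7 and 12, as sequentially
```

Because each extended segment has length n/q + m - 1 but only allows match starts inside its own segment, every match is reported exactly once, and the per-processor work is the Θ(n/q + m) = Θ(n/q) of the slide.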

Thus, we get optimal worst-case running time Θ(n/q) under the following conditions:
  • If processors with consecutive segments of T are adjacent, when

q < n/q (equivalently, q² < n) and md < n/q; i.e., if

max{md, q} < n/q.

  • If processors with consecutive segments of T are not adjacent, we need the stronger restriction that the parallel sort of Θ(mq) records also finish in O(n/q) time,

which is true, for example, when the sort runs in Θ(mq) time and mq < n/q

(equivalently, when mq² < n).