
More Specialized Data Structures


lark

Presentation Transcript


  1. More Specialized Data Structures • String data structures • Spatial data structures

  2. String Data Structures

  3. String Operations • String indexing • Pattern matching: find pattern P in text T; find common substrings among a set of strings • Application domains: bioinformatics, Google search!

  4. A simplified hash table for strings • Step 0: build a lookup table of size |Σ|^w for all w-length words in D • Σ = {A, C, G, T}, w = 2, so the lookup table has 4^2 (= 16) entries
     Position: 1 2 3 4 5 6 7
     S1:       C A G T C C T
     S2:       C G T T C G C
     Lookup table (word: string,position of each occurrence; AA, AC, AT, GA, GG, TA and TG have no entries):
     AG: S1,2
     CA: S1,1
     CC: S1,5
     CG: S2,1 S2,5
     CT: S1,6
     GC: S2,6
     GT: S1,3 S2,2
     TC: S1,4 S2,4
     TT: S2,3
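The lookup table on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not the presenter's implementation: the function name `build_lookup_table` is hypothetical, and a dictionary stands in for the fixed array of |Σ|^w slots, so words that never occur simply have no key.

```python
from collections import defaultdict

def build_lookup_table(strings, w):
    """Map each w-length word to the (string name, 1-based position) pairs where it starts."""
    table = defaultdict(list)
    for name, s in strings.items():
        # Slide a window of width w over the string.
        for i in range(len(s) - w + 1):
            table[s[i:i + w]].append((name, i + 1))
    return dict(table)

# The two example strings from the slide.
table = build_lookup_table({"S1": "CAGTCCT", "S2": "CGTTCGC"}, w=2)
print(table["GT"])  # [('S1', 3), ('S2', 2)]
```

Looking up a w-length word is then a single dictionary access, which is the point of the structure: every occurrence of a short word is found without scanning the strings again.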

  5. PATRICIA trees • “Practical Algorithm to Retrieve Information Coded in Alphanumeric” • Compacted trie of a set of strings • Dictionary searches made easy
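A compacted trie can be sketched as follows. This is an illustrative simplification, assuming explicit edge labels stored as strings; a classic PATRICIA tree stores skip counts or bit indices instead. The names `RadixNode`, `insert` and `contains` are hypothetical.

```python
class RadixNode:
    def __init__(self):
        self.children = {}    # first character of edge label -> (label, child node)
        self.is_word = False  # True if a stored string ends at this node

def insert(root, word):
    node = root
    while word:
        entry = node.children.get(word[0])
        if entry is None:
            # No edge starts with this character: hang the rest of the word on one edge.
            leaf = RadixNode()
            leaf.is_word = True
            node.children[word[0]] = (word, leaf)
            return
        label, child = entry
        # Length of the common prefix of the edge label and the remaining word.
        k = 0
        while k < len(label) and k < len(word) and label[k] == word[k]:
            k += 1
        if k < len(label):
            # The word diverges inside the edge: split the edge at position k.
            mid = RadixNode()
            mid.children[label[k]] = (label[k:], child)
            node.children[word[0]] = (label[:k], mid)
            child = mid
        node, word = child, word[k:]
    node.is_word = True

def contains(root, word):
    node = root
    while word:
        entry = node.children.get(word[0])
        if entry is None or not word.startswith(entry[0]):
            return False
        label, node = entry
        word = word[len(label):]
    return node.is_word

root = RadixNode()
for w in ["romane", "romanus", "roman", "rom"]:
    insert(root, w)
```

A dictionary search follows one edge per branching point rather than one node per character, which is what makes lookups on a compacted trie cheap for long shared prefixes.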

  6. Suffix Tree • Compacted trie of all suffixes of a string • Example string (positions 1 to 6): B A N A N A • Find pattern: “ANAN” • Think: how would you implement Google Search?
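The pattern search on this slide can be tried out with an uncompacted suffix trie. This is a sketch under a stated simplification: the trie below takes O(n^2) space and is not the compacted suffix tree the slide describes, but the search walks the same path from the root. The names `TrieNode`, `build_suffix_trie` and `find` are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # character -> TrieNode
        self.starts = []    # 1-based start positions of the suffixes passing through this node

def build_suffix_trie(text):
    root = TrieNode()
    for start in range(len(text)):
        node = root
        # Insert the suffix text[start:] one character at a time.
        for ch in text[start:]:
            node = node.children.setdefault(ch, TrieNode())
            node.starts.append(start + 1)
    return root

def find(root, pattern):
    """Return the 1-based positions where pattern occurs, or [] if it does not occur."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []
    return node.starts

trie = build_suffix_trie("BANANA")
print(find(trie, "ANAN"))  # [2]
print(find(trie, "ANA"))   # [2, 4]
```

The search time depends only on the pattern length, not on the text length, because every substring of the text is a prefix of some suffix.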

  7. Generalized Suffix Tree (GST) • Strings: WINDOW$ and INDIGO$ (positions 1 to 7 in each) [Figure: generalized suffix tree of the two strings; each leaf is labeled (string number, start position of the suffix), covering (1, 1) to (1, 7) and (2, 1) to (2, 7)]
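One use of a generalized suffix structure is finding substrings common to several strings. The sketch below assumes an uncompacted trie over the suffixes of all strings (quadratic space, unlike a real GST) and omits the "$" terminators, which the uncompacted form does not need. The function name `longest_common_substring` is hypothetical.

```python
def longest_common_substring(strings):
    """Return a longest substring that occurs in every input string."""
    root = {"children": {}, "owners": set()}
    for idx, s in enumerate(strings):
        for start in range(len(s)):
            node = root
            for ch in s[start:]:
                node = node["children"].setdefault(ch, {"children": {}, "owners": set()})
                node["owners"].add(idx)  # string idx has a suffix passing through here
    everyone = set(range(len(strings)))
    best = ""
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        for ch, child in node["children"].items():
            # Only descend into nodes shared by all strings.
            if child["owners"] == everyone:
                candidate = prefix + ch
                if len(candidate) > len(best):
                    best = candidate
                stack.append((child, candidate))
    return best

print(longest_common_substring(["WINDOW", "INDIGO"]))  # IND
```

The deepest node reached by suffixes of every string spells out the answer, which is the same idea a compacted GST uses with far less memory.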

  8. Spatial Data Structures

  9. Spatial Data Structures [Figure: points in 2-D enclosed by a bounding rectangle]

  10. Quad trees (4-way trees) • Technique for spatial domain decomposition • Recursive bisection [Figure: recursive bisection of a 2-D region and the corresponding tree, drawn from its root] Source: Handbook of Data Structures & Applications, Chapman & Hall/CRC Press, 2005

  11. Compacted Quad-trees (for 2D data) • Each node has exactly 4 children (for 4 quadrants) • Compact each path into a single edge • For 3D data, the corresponding tree is called an oct-tree [Figure: 2D space with data, and its quad-tree decomposition] Source: Handbook of Data Structures & Applications, Chapman & Hall/CRC Press, 2005

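  12. Range Queries on Quad-trees [Figure: a range query rectangle with corners (a1, b1) and (a2, b2) over the decomposed space, origin at (0, 0), with the range query result marked]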
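A point quad-tree with a rectangular range query might look like the sketch below. It rests on several assumptions not stated on the slides: cells are half-open, each leaf holds at most `capacity` points, all inserted points are distinct, and the query corners satisfy a1 <= a2 and b1 <= b2. The class name `QuadTree` and its methods are hypothetical.

```python
class QuadTree:
    def __init__(self, x0, y0, x1, y1, capacity=1):
        self.bounds = (x0, y0, x1, y1)  # half-open cell [x0, x1) x [y0, y1)
        self.capacity = capacity
        self.points = []
        self.children = None            # None for a leaf, else a list of 4 sub-cells

    def insert(self, p):
        """Insert point p; return False if p lies outside this cell. Assumes distinct points."""
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= p[0] < x1 and y0 <= p[1] < y1):
            return False
        if self.children is None:
            if len(self.points) < self.capacity:
                self.points.append(p)
                return True
            self._split()
        return any(c.insert(p) for c in self.children)

    def _split(self):
        # Bisect the cell in both dimensions and push the stored points down.
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.children = [QuadTree(x0, y0, mx, my, self.capacity),
                         QuadTree(mx, y0, x1, my, self.capacity),
                         QuadTree(x0, my, mx, y1, self.capacity),
                         QuadTree(mx, my, x1, y1, self.capacity)]
        for q in self.points:
            any(c.insert(q) for c in self.children)
        self.points = []

    def query(self, a1, b1, a2, b2):
        """Return all points with a1 <= x <= a2 and b1 <= y <= b2."""
        x0, y0, x1, y1 = self.bounds
        if a2 < x0 or a1 >= x1 or b2 < y0 or b1 >= y1:
            return []  # the query rectangle misses this cell entirely
        found = [p for p in self.points if a1 <= p[0] <= a2 and b1 <= p[1] <= b2]
        for c in self.children or []:
            found.extend(c.query(a1, b1, a2, b2))
        return found

tree = QuadTree(0, 0, 8, 8)
for p in [(1, 1), (2, 6), (5, 5), (7, 2), (6, 6)]:
    tree.insert(p)
print(sorted(tree.query(4, 4, 7, 7)))  # [(5, 5), (6, 6)]
```

The query prunes every quadrant that does not overlap the rectangle, so only the cells near the query region are visited.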

  13. Oct-Trees (for 3D data) • Issue: what happens if the data is unevenly (i.e., non-uniformly) distributed? • Most of the levels in the tree will be empty • Solution: “Compacted Oct-trees”
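The compaction step can be illustrated on a generic tree in which empty children have already been dropped. This is a sketch on a hypothetical representation (each node is a dict with a `cell` label and a list of non-empty `children`), not the layout used in the cited handbook; the same function applies to quad-trees and oct-trees alike.

```python
def compact(node):
    """Return a copy of the tree in which every chain of single-child nodes is one edge."""
    # Walk down while the current node has exactly one non-empty child.
    while len(node["children"]) == 1:
        node = node["children"][0]
    return {"cell": node["cell"], "children": [compact(c) for c in node["children"]]}

def leaf(name):
    return {"cell": name, "children": []}

# root has two non-empty children; the path A -> B is a chain that gets compacted.
sparse = {"cell": "root", "children": [
    {"cell": "A", "children": [
        {"cell": "B", "children": [leaf("p1"), leaf("p2")]}]},
    leaf("p3")]}

compacted = compact(sparse)
print([c["cell"] for c in compacted["children"]])  # ['B', 'p3']
```

After compaction every internal node has at least two non-empty children, so the tree size is proportional to the number of data points no matter how unevenly they are distributed.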

  14. k-d trees (for k dimensions) • Maintain a combined binary search tree for all dimensions • Recursively bisect each dimension, alternating dimensions at each level of the tree
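The alternating bisection can be sketched as below. The choice of the median point as the splitting element and the dict-based node layout are assumptions made for illustration; the names `build_kdtree` and `range_search` are hypothetical.

```python
def build_kdtree(points, depth=0):
    """Build a k-d tree by splitting on the median, cycling through the dimensions."""
    if not points:
        return None
    k = len(points[0])
    axis = depth % k  # alternate the splitting dimension at each level
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),       # coordinates <= split value
        "right": build_kdtree(points[mid + 1:], depth + 1),  # coordinates >= split value
    }

def range_search(node, lo, hi):
    """Return all points p with lo[i] <= p[i] <= hi[i] in every dimension i."""
    if node is None:
        return []
    p, axis = node["point"], node["axis"]
    found = [p] if all(l <= c <= h for l, c, h in zip(lo, p, hi)) else []
    if lo[axis] <= p[axis]:   # the range may extend into the left subtree
        found += range_search(node["left"], lo, hi)
    if p[axis] <= hi[axis]:   # the range may extend into the right subtree
        found += range_search(node["right"], lo, hi)
    return found

kd = build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(kd["point"])                                 # (7, 2)
print(sorted(range_search(kd, (3, 1), (8, 5))))    # [(5, 4), (7, 2), (8, 1)]
```

Each level behaves like a binary search tree on one coordinate, so a single tree serves queries over all k dimensions.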
