
PhyQL: A Phylogenetic Visual Query Engine

PhyQL: A Phylogenetic Visual Query Engine. Shahriyar Hossain, Munirul Islam, Jesmin, Hasan M Jamil. Integration Informatics Laboratory, Computer Science, Wayne State University; Department of Genetic Engineering and Biotechnology, University of Dhaka, Bangladesh. BIBM 2008.


Presentation Transcript


  1. PhyQL: A Phylogenetic Visual Query Engine. Shahriyar Hossain, Munirul Islam, Jesmin, Hasan M Jamil. Integration Informatics Laboratory, Computer Science, Wayne State University; Department of Genetic Engineering and Biotechnology, University of Dhaka, Bangladesh. BIBM 2008. Integration Informatics Research Group

  2. What is a Phylogenetic Tree?


  4. Queries:
     • Least Common Ancestor
     Tree document (tree.xml):
       <root>
         <node>rayfinned fish</node>
         <inode>
           <node>lungfish</node>
           <inode>
             <inode>
               <node>salamanders</node>
               <node>frogs</node>
             </inode>
             . . .
           </inode>
         </inode>
       </root>
     XQuery over the tree:
       for $root in doc("tree.xml")//root
       return <span> <h1> { $root/node/text() } </h1> </span>
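The least-common-ancestor query on this slide can be illustrated with a small Python sketch. It is not the PhyQL implementation: it only assumes the same root/inode/node layout, and the leaf taxon_x is a made-up placeholder for the part the slide elides. The helper names (path_to_leaf, least_common_ancestor, leaf_names) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Same <root>/<inode>/<node> layout as the slide. "taxon_x" is a made-up leaf
# standing in for the part the slide elides with ". . .".
TREE_XML = """
<root>
  <node>rayfinned fish</node>
  <inode>
    <node>lungfish</node>
    <inode>
      <inode>
        <node>salamanders</node>
        <node>frogs</node>
      </inode>
      <node>taxon_x</node>
    </inode>
  </inode>
</root>
"""


def path_to_leaf(elem, name, path=()):
    """Return the tuple of elements from `elem` down to the leaf called `name`."""
    path = path + (elem,)
    if elem.tag == "node" and (elem.text or "").strip() == name:
        return path
    for child in elem:
        found = path_to_leaf(child, name, path)
        if found is not None:
            return found
    return None


def least_common_ancestor(root, first, second):
    """Deepest element whose subtree contains both named leaves, or None."""
    path_a = path_to_leaf(root, first)
    path_b = path_to_leaf(root, second)
    if path_a is None or path_b is None:
        return None
    ancestor = None
    # Walk both root-to-leaf paths in parallel until they diverge.
    for a, b in zip(path_a, path_b):
        if a is not b:
            break
        ancestor = a
    return ancestor


def leaf_names(elem):
    """Names of all leaves below `elem`, in document order."""
    return [n.text.strip() for n in elem.iter("node")]


if __name__ == "__main__":
    root = ET.fromstring(TREE_XML)
    lca = least_common_ancestor(root, "salamanders", "frogs")
    print(lca.tag, leaf_names(lca))
```

Comparing root-to-leaf paths keeps the sketch short; a database-backed engine would more likely answer the same question from stored edge or path relations.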

  5. Phylogenetic Query Language:
     • Select: selects the subset of trees that match given criteria
     • Join: joins two trees on a pair of nodes
     • Subset: retrieves part of a given tree
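A rough Python sketch of the three operators may help fix the idea. Everything here is an assumption for illustration: the Tree class, the function names, and in particular the join semantics (grafting the chosen subtree of the second tree under the chosen node of the first, with node names assumed disjoint). The slides do not specify how PhyQL defines these operators in detail.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    """Hypothetical tree record: an author plus a parent -> children mapping."""
    author: str
    root: str
    children: dict = field(default_factory=dict)


def select(trees, predicate):
    """Select: keep the trees that satisfy the given criterion."""
    return [t for t in trees if predicate(t)]


def subtree(tree, node):
    """Subset: the part of `tree` rooted at `node`."""
    kept = {}
    stack = [node]
    while stack:
        current = stack.pop()
        kids = tree.children.get(current, [])
        if kids:
            kept[current] = list(kids)
            stack.extend(kids)
    return Tree(tree.author, node, kept)


def join(left, left_node, right, right_node):
    """Join (assumed semantics): graft right's subtree at right_node under left_node.

    Assumes the two trees use disjoint node names.
    """
    grafted = subtree(right, right_node)
    merged = {parent: list(kids) for parent, kids in left.children.items()}
    merged.update(grafted.children)
    merged.setdefault(left_node, []).append(right_node)
    return Tree(left.author, left.root, merged)


if __name__ == "__main__":
    t1 = Tree("stern", "r1", {"r1": ["a", "b"]})
    t2 = Tree("other", "r2", {"r2": ["c", "d"], "d": ["e", "f"]})
    print(select([t1, t2], lambda t: t.author == "stern"))
    print(subtree(t2, "d"))
    print(join(t1, "b", t2, "d"))
```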

  6. Tree Join Using Path Operators; SubTree Projection

  7. PhyQL: Visual Query Interface (architecture diagram; components shown: User; SELECT, JOIN, and SUBTREE operators; Translator; XSB; DB; Wrappers; XML/NEXUS input from the user or interoperable databases)

  8. Why XSB?
     • Eliminates the left-recursion problem, as in:
         path(X,Z) :- path(X,Y), edge(Y,Z).
     • Stores intermediate results (tabling)
     • Model-based: the order in which rules are written does not matter
         path(X,Y) :- edge(X,Y).
         path(X,Z) :- path(X,Y), edge(Y,Z).
     • Its in-memory database queries are an order of magnitude faster than those of systems such as tuProlog
     Importing the database tables over ODBC:
         :- odbc_import(conn, 'tbl_treeinfo'('rootId', 'author'), tree).
         :- odbc_import(conn, 'tbl_nodeinfo'('nodeId', 'nodename'), node).
         :- odbc_import(conn, 'tbl_edge'('parentId', 'childId'), edge).
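To see why remembering intermediate results makes the left-recursive path rule safe, the following Python sketch computes the same set of path facts with a bottom-up, semi-naive fixpoint. This is a different evaluation strategy from XSB's tabled resolution and is offered only as an illustration; the function name transitive_closure and the sample edges are assumptions.

```python
def transitive_closure(edges):
    """Least fixpoint of:
         path(X,Y) :- edge(X,Y).
         path(X,Z) :- path(X,Y), edge(Y,Z).
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)

    path = set(edges)   # facts derived so far (the "table")
    delta = set(edges)  # facts that were new in the previous round
    while delta:
        new = set()
        for x, y in delta:
            for z in children.get(y, ()):
                if (x, z) not in path:
                    new.add((x, z))
        path |= new
        delta = new     # only new facts are extended next round
    return path


if __name__ == "__main__":
    # Hypothetical edge(parentId, childId) facts for a five-node tree.
    tree_edges = {(0, 1), (0, 4), (4, 2), (4, 3)}
    print(sorted(transitive_closure(tree_edges)))
    # A cycle is handled too: no fact is ever derived twice, so the loop ends.
    print(sorted(transitive_closure({(1, 2), (2, 1)})))
```

A plain depth-first Prolog engine would call path(X,Y) again before making any progress on the second rule and recurse without end; recording already-derived facts is what breaks that loop.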

  9. Query pattern:
       <tree author="stern">
         <node type="*">
           <node type="?">
             <node> Stanhopea_gibbosa </node>
             <node> Stanhopea_vasquezii </node>
           </node>
           <node> Stanhopea_shuttleworthii </node>
         </node>
       </tree>
     Corresponding logic query:
       node(Y1, 'Stanhopea_shuttleworthii'), node(Y2, 'Stanhopea_gibbosa'), node(Y3, 'Stanhopea_vasquezii'),
       edge(Y4,Y2), edge(Y4,Y3), lca(Y0,Y4,Y1), edge(Y0,Y1)
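As a rough illustration of what evaluating this conjunction of goals involves, the Python sketch below checks the same conditions against a tiny set of node and edge facts. The node IDs, the sample tree, the function name match_pattern, and the reading of lca(Y0,Y4,Y1) as "Y0 is the least common ancestor of Y4 and Y1" are all assumptions made for the example.

```python
def match_pattern(node, edge):
    """Hand-translated check of the slide's goal list.

    node: set of (nodeId, nodename) facts; edge: set of (parentId, childId) facts.
    Returns a binding for Y0..Y4, or None if the pattern does not match.
    """
    parent = {child: par for par, child in edge}
    ids = {name: node_id for node_id, name in node}

    def ancestors(n):
        """n itself followed by its ancestors up to the root."""
        chain = [n]
        while chain[-1] in parent:
            chain.append(parent[chain[-1]])
        return chain

    def lca(a, b):
        seen = set(ancestors(b))
        return next((x for x in ancestors(a) if x in seen), None)

    y1 = ids.get("Stanhopea_shuttleworthii")
    y2 = ids.get("Stanhopea_gibbosa")
    y3 = ids.get("Stanhopea_vasquezii")
    if None in (y1, y2, y3):
        return None
    y4 = parent.get(y2)                        # edge(Y4,Y2)
    if y4 is None or parent.get(y3) != y4:     # edge(Y4,Y3)
        return None
    y0 = lca(y4, y1)                           # lca(Y0,Y4,Y1), assumed argument order
    if y0 is None or parent.get(y1) != y0:     # edge(Y0,Y1)
        return None
    return {"Y0": y0, "Y1": y1, "Y2": y2, "Y3": y3, "Y4": y4}


if __name__ == "__main__":
    # Hypothetical facts: node 1 is the root, node 2 an unnamed internal node.
    node_facts = {(10, "Stanhopea_shuttleworthii"),
                  (11, "Stanhopea_gibbosa"),
                  (12, "Stanhopea_vasquezii")}
    edge_facts = {(1, 10), (1, 2), (2, 11), (2, 12)}
    print(match_pattern(node_facts, edge_facts))
```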


  14. Summary
      • PhyQL offers a simple web-based visual query interface
      • Logic-based tree query operations
      • Modifying the query tools only requires changing the logic rules
      • The proposed architecture can also be applied to protein-protein interaction networks, metabolic pathways, etc.
      Future Work:
      • Database interoperability: allow retrieving and integrating phylogenetic data during query submission
      • ReQuery: query on the result set
      • Tree similarity estimation

  15. Thank You! me: http://homopan.wayne.edu/PhD Students/Munirul Islam/index.htm

  16. Uses of Phylogenetic Trees:
      • Dating the divergence events of species
      • Identifying the most recent common ancestor of all living species
      • Identifying the geographic origins of new disease outbreaks

  17. Crimson
      • Uses nested subtrees to avoid long strings
      • Zheng, Y., S. Fisher, S. Cohen, S. Guo, J. Kim, and S. B. Davidson. 2006. Crimson: A Data Management System to Support Evaluating Phylogenetic Tree Reconstruction Algorithms. 32nd International Conference on Very Large Data Bases, ACM, pp. 1231-1234.

  18. Dewey system: root 0; internal nodes 0.1, 0.2, 0.2.1; leaves A = 0.1.1, B = 0.1.2, C = 0.2.1.1, D = 0.2.1.2, E = 0.2.2

  19. Find clade for: Z = (C + D)
      Find the common pattern starting from the left:
        SELECT * FROM nodes WHERE (path LIKE '0.2.1%');
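The Dewey-label lookup can be tried end to end with Python's built-in sqlite3 module. This is a minimal sketch, not the system shown in the slides: the two-column nodes(path, name) schema and the helper names build_db, common_prefix, and clade are assumptions; only the table name, the path column, and the labels come from the slides.

```python
import sqlite3

# Dewey labels from the slide; internal nodes carry no taxon name.
NODES = [
    ("0", None), ("0.1", None), ("0.1.1", "A"), ("0.1.2", "B"),
    ("0.2", None), ("0.2.1", None), ("0.2.1.1", "C"), ("0.2.1.2", "D"),
    ("0.2.2", "E"),
]


def build_db():
    """In-memory database with an assumed nodes(path, name) schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (path TEXT PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO nodes VALUES (?, ?)", NODES)
    return conn


def common_prefix(paths):
    """Longest shared label prefix, compared component by component."""
    parts = [p.split(".") for p in paths]
    prefix = []
    for column in zip(*parts):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return ".".join(prefix)


def clade(conn, leaf_names):
    """All nodes in the smallest clade containing the named leaves."""
    marks = ",".join("?" * len(leaf_names))
    rows = conn.execute(
        f"SELECT path FROM nodes WHERE name IN ({marks})", list(leaf_names)
    ).fetchall()
    prefix = common_prefix([row[0] for row in rows])
    # Match the clade root itself, or any label that continues with ".".
    return conn.execute(
        "SELECT path, name FROM nodes WHERE path = ? OR path LIKE ? ORDER BY path",
        (prefix, prefix + ".%"),
    ).fetchall()


if __name__ == "__main__":
    print(clade(build_db(), ["C", "D"]))
```

One design note: the sketch matches the prefix itself or the prefix followed by a dot, rather than a bare LIKE '0.2.1%', because the bare pattern would also match a sibling labeled 0.2.10 in a wider tree.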

  20. Depth-first traversal labeling each node with a left and right ID (figure: the tree with leaves A to E, IDs numbered 1 to 18)

  21. Minimum Spanning Clade of Node 5
        SELECT *
        FROM nodes
        INNER JOIN nodes AS include
          ON (nodes.left_id BETWEEN include.left_id AND include.right_id)
        WHERE include.node_id = 5;
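The left/right numbering and the clade query can be reproduced with a short Python sketch using sqlite3. It assumes the tree shape given by the Dewey labels on the earlier slide and, since the slides do not say how node_id is assigned, numbers nodes in visiting order; that numbering, the name column, and the helper names are all assumptions.

```python
import sqlite3

# Tree shape taken from the Dewey labels; internal nodes are named by label.
CHILDREN = {
    "0": ["0.1", "0.2"],
    "0.1": ["A", "B"],
    "0.2": ["0.2.1", "E"],
    "0.2.1": ["C", "D"],
}


def number_nodes(children, root):
    """Depth-first traversal giving rows of (node_id, name, left_id, right_id)."""
    rows = []
    counter = 0

    def visit(name):
        nonlocal counter
        counter += 1
        left = counter
        slot = len(rows)          # reserve a slot so node_id follows visit order
        rows.append(None)
        for child in children.get(name, []):
            visit(child)
        counter += 1
        rows[slot] = (slot + 1, name, left, counter)

    visit(root)
    return rows


def build_db(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (node_id INTEGER PRIMARY KEY, name TEXT, "
        "left_id INTEGER, right_id INTEGER)"
    )
    conn.executemany("INSERT INTO nodes VALUES (?, ?, ?, ?)", rows)
    return conn


def clade(conn, node_id):
    """Names of all nodes whose left_id falls inside the given node's interval."""
    return [row[0] for row in conn.execute(
        "SELECT nodes.name FROM nodes INNER JOIN nodes AS include "
        "ON (nodes.left_id BETWEEN include.left_id AND include.right_id) "
        "WHERE include.node_id = ? ORDER BY nodes.left_id",
        (node_id,),
    )]


if __name__ == "__main__":
    rows = number_nodes(CHILDREN, "0")
    for row in rows:
        print(row)
    print(clade(build_db(rows), 5))
```

With this numbering a whole clade is one range scan on left_id, so no recursion is needed at query time; the cost is that inserting a node means renumbering part of the tree.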
