DBease: Making Databases User-Friendly and Easily Accessible

Guoliang Li, Ju Fan, Hao Wu, Jiannan Wang, Jianhua Feng. Database Group, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China.


Presentation Transcript


  1. DBease: Making Databases User-Friendly and Easily Accessible Guoliang Li, Ju Fan, Hao Wu, Jiannan Wang, Jianhua Feng Database Group, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

  2. How to Access Databases? • Traditional database-access methods: • SQL: SELECT title, author, booktitle, year FROM dblp WHERE title CONTAINS “search” AND booktitle CONTAINS “cidr” • Query-by-example (Form) • Keyword Search: “search cidr” CIDR'11 - DBease

  3. Comparison of Different Methods [Figure: the access methods compared; axis label: Usability]

  4. Keyword Search • Is traditional keyword search good enough? • Too many results! • No result!

  5. Form-based Search • Form-based search has the same problem: it is complicated and may still return no result!

  6. Our Solution • Type-Ahead Search • Type-Ahead Search in Forms • SQL Suggestion [Figure axis label: Usability]

  7. What is Type-Ahead Search?

  8. Type-Ahead Search • Advantages • Giving users instant feedback on the fly • Helping users navigate the underlying data • Tolerating inconsistencies between query and data • Supporting synonyms • Supporting XML data • Supporting multiple tables

  9. Problem Formulation • Data: A set of records • Query • Q = {p1, p2, …, pl}: a set of prefixes • δ: Edit-distance threshold • Result • A set of records containing all query prefixes or their similar forms (conjunctive) • Edit Distance: • The minimum number of edit operations (insertion, deletion, substitution) needed to transform one string into another • ed(string, stang) = 2
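
The edit-distance definition above can be made concrete with a short sketch. This is the textbook dynamic-programming formulation, not necessarily how DBease computes it; the helper `prefix_edit_distance` is a hypothetical name introduced here to illustrate what "similar form of a prefix" could mean.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to every prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or match for free
        prev = curr
    return prev[-1]


def prefix_edit_distance(query: str, word: str) -> int:
    """Hypothetical helper: smallest edit distance from query to any prefix of word."""
    return min(edit_distance(query, word[:k]) for k in range(len(word) + 1))
```

With δ = 1, a keystroke sequence such as "sigmd" would still match the word "sigmod", because some prefix of the word lies within one edit of the query.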

  10. Indexing • Trie Index • Words: root-to-leaf paths • Inverted lists on leaves
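
A minimal sketch of such an index follows, assuming records are plain strings tokenized on whitespace. The class and function names are illustrative and do not come from the DBease code. One detail the slide glosses over: when one word is a prefix of another, its inverted list sits on an internal node rather than a true leaf.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.record_ids = []  # inverted list; non-empty only where a word ends


def build_trie(records):
    """records: dict mapping record id -> text. Returns the trie root."""
    root = TrieNode()
    for rid, text in records.items():
        for word in set(text.lower().split()):
            node = root
            for ch in word:
                node = node.children.setdefault(ch, TrieNode())
            node.record_ids.append(rid)
    return root


def records_with_prefix(root, prefix):
    """Exact prefix lookup: walk down the prefix, then union all lists below it."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return set()
    result, stack = set(), [node]
    while stack:
        current = stack.pop()
        result.update(current.record_ids)
        stack.extend(current.children.values())
    return result
```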

  11. Algorithm • Step 1: Find similar prefixes incrementally • Step 2: Retrieve the leaf nodes of similar prefixes • Step 3: Compute union lists of inverted lists of leaf nodes • Step 4: Intersect the union lists of query keywords
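
The four steps can be sketched end to end as below. This is a simplified illustration under stated assumptions: the trie is a nested dict with a hypothetical `LEAF` sentinel key holding each inverted list, and Step 1 recomputes the similar prefixes from scratch for every query instead of maintaining them incrementally per keystroke as the slide describes.

```python
LEAF = None  # sentinel key (an assumption of this sketch) for a word's inverted list


def build_trie(records):
    root = {}
    for rid, text in records.items():
        for word in set(text.lower().split()):
            node = root
            for ch in word:
                node = node.setdefault(ch, {})
            node.setdefault(LEAF, []).append(rid)
    return root


def similar_prefix_nodes(root, prefix, delta):
    """Step 1: trie nodes whose string is within edit distance delta of prefix."""
    found = []
    # row[j] = edit distance between the node's string and prefix[:j]
    stack = [(root, list(range(len(prefix) + 1)))]
    while stack:
        node, row = stack.pop()
        if row[-1] <= delta:
            found.append(node)  # its whole subtree matches; no need to go deeper
            continue
        if min(row) > delta:
            continue            # no extension of this node can get back under delta
        for ch, child in node.items():
            if ch is LEAF:
                continue
            new_row = [row[0] + 1]
            for j, pc in enumerate(prefix, start=1):
                new_row.append(min(new_row[j - 1] + 1, row[j] + 1,
                                   row[j - 1] + (pc != ch)))
            stack.append((child, new_row))
    return found


def union_list(nodes):
    """Steps 2 and 3: union the inverted lists of every leaf under the nodes."""
    ids, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        for key, value in node.items():
            if key is LEAF:
                ids.update(value)
            else:
                stack.append(value)
    return ids


def type_ahead_search(root, prefixes, delta):
    """Step 4: intersect the union lists of all query keywords."""
    lists = [union_list(similar_prefix_nodes(root, p.lower(), delta))
             for p in prefixes]
    return set.intersection(*lists) if lists else set()
```

For example, over three toy records the misspelled query "serch cid" with δ = 1 would return only the record that contains both "search" and "cidr".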

  12. Type-Ahead Search in Forms [Roadmap figure: Type-Ahead Search, Type-Ahead Search in Forms; axis label: Usability]

  13. What is Type-Ahead Search in Forms?

  14. Type-Ahead Search in Forms • Problem Formulation • Data: A relation with multiple attributes • Query: A set of prefixes on attributes in a form interface • Answers: • Local results of the focused attribute • Global results of the relation • Advantages • On-the-fly faceted search • Supporting aggregation

  15. Data Partition • Global Table → Local Tables

  16. Indexing • Each attribute • Trie • Mapping Tables • Local → Global • Global → Local

  17. Our Solution [Figure: the user types "xml" in the Title field of a form with Title and Author fields. Records: xml database (albert), xml database (bob), xml search (albert), xml security (alice). Local results for Title: xml database, xml search, xml security]

  18. Our Solution [Figure: form with Title: xml and Author: al. Trie over Author values with leaf labels 4: albert and 5: alice. L-G Mapping Table (local → global): T1 → 1, 2; T2 → 3; T3 → 4; T4 → 5. G-L Mapping Table (global → local): 1 → T1; 2 → T1; 3 → T2; 4 → T3; 5 → T4. Local results for Author: albert, alice. Global results: (xml database, albert), (xml search, albert), (xml security, alice)]
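
The partition into local tables and the two mapping tables can be sketched as follows, using the four records from slide 17. The global ids 1 to 4, the function names, and the plain `startswith` scan (standing in for the per-attribute trie) are assumptions of this sketch, not details taken from DBease.

```python
def partition(global_table, attribute):
    """Build one local table (distinct values) plus L-G and G-L mapping tables."""
    local_values = []     # local id -> distinct attribute value
    local_to_global = []  # L-G: local id -> list of global record ids
    global_to_local = {}  # G-L: global record id -> local id
    seen = {}
    for gid, row in global_table.items():
        value = row[attribute]
        if value not in seen:
            seen[value] = len(local_values)
            local_values.append(value)
            local_to_global.append([])
        lid = seen[value]
        local_to_global[lid].append(gid)
        global_to_local[gid] = lid
    return local_values, local_to_global, global_to_local


def form_search(global_table, prefixes, focused):
    """prefixes: attribute -> typed prefix; focused must be one of its keys.
    Returns (global results, local results of the focused attribute)."""
    parts = {attr: partition(global_table, attr) for attr in prefixes}
    candidates = set(global_table)
    for attr, prefix in prefixes.items():
        values, l2g, _ = parts[attr]
        matching = set()
        for lid, value in enumerate(values):
            if value.startswith(prefix):   # a trie lookup in a real index
                matching.update(l2g[lid])  # local -> global
        candidates &= matching
    values, _, g2l = parts[focused]
    local = sorted({values[g2l[gid]] for gid in candidates})  # global -> local
    return sorted(candidates), local


table = {
    1: {"title": "xml database", "author": "albert"},
    2: {"title": "xml database", "author": "bob"},
    3: {"title": "xml search", "author": "albert"},
    4: {"title": "xml security", "author": "alice"},
}
```

Calling `form_search(table, {"title": "xml", "author": "al"}, focused="author")` yields the three matching records as global results and the distinct authors albert and alice as local results, which mirrors the form shown on the slide.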

  19. SQL Suggestion [Roadmap figure: Type-Ahead Search, Type-Ahead Search in Forms, SQL Suggestion; axis label: Usability]

  20. What is SQL Suggestion?

  21. SQL Suggestion • Problem Formulation • Data: A database with multiple tables • Query: A set of keywords • Answers: Relevant SQL queries • Advantages • Suggests SQL queries based on keywords • Helps users formulate SQL queries to find accurate results • Designed for both SQL programmers and Internet users • Groups answers based on SQL structures • Supports aggregation • Supports range queries

  22. Our Solution • Suggest Templates from Keywords • A template is a structure in the databases • Modeled as a graph • Nodes: entities (table names or attribute names) • Edges: foreign keys or membership • Suggest SQL queries from Templates • Mapping between keywords and templates [Figure panels: (a) Query, (b) Template, (c) SQL; residual figure text: "keyword paper ir"]

  23. Template Suggestion • Template Generation • Extension from basic entities (tables) • Template Ranking • Template weight • PageRank • Relevancy between a keyword and an entity • TF*IDF • Algorithms • Fagin's algorithms • Threshold-based pruning techniques
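
The slide does not say how the ranking signals are combined or which of Fagin's algorithms is used. As an illustration only, the sketch below applies the classic Threshold Algorithm to find the top-k items when the total score is the sum of several non-negative per-list scores; the template names and scores in the usage note are made up.

```python
import heapq


def threshold_top_k(score_lists, k):
    """score_lists: list of dicts mapping item -> non-negative score.
    A missing score counts as 0. Returns (top-k (item, total) pairs, depth read)."""
    sorted_lists = [sorted(s.items(), key=lambda kv: -kv[1]) for s in score_lists]
    totals = {}  # item -> aggregated score, filled in via random access
    depth = 0
    max_depth = max((len(lst) for lst in sorted_lists), default=0)
    while depth < max_depth:
        threshold = 0.0  # best total any still-unseen item could reach
        for lst in sorted_lists:
            if depth < len(lst):
                item, score = lst[depth]  # sorted access
                threshold += score
                if item not in totals:
                    totals[item] = sum(s.get(item, 0.0) for s in score_lists)
        depth += 1
        top = heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top, depth  # threshold-based pruning: stop reading early
    return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1]), depth
```

With a hypothetical weight list `{"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.05}` and relevancy list `{"A": 0.7, "B": 0.9, "C": 0.2, "D": 0.1}`, the top-1 template is found after reading only two of the four entries in each list.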

  24. SQL Suggestion • SQL suggestion model • Mapping from keywords to templates • A matching is a set of mappings that covers all keywords • Weighted set-covering problem (NP-hard) • SQL ranking • Relevancy between keywords and attributes • Attribute weight • Algorithms • Greedy algorithms
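
Because weighted set cover is NP-hard, a greedy heuristic is the usual choice. The sketch below is the standard cost-per-newly-covered-keyword greedy rule, not necessarily the exact algorithm in DBease; expressing each candidate mapping as a (covered keywords, cost) pair, with lower cost meaning more relevant, is an assumption made here.

```python
def greedy_set_cover(keywords, candidates):
    """candidates: name -> (set of covered keywords, positive cost).
    Returns the chosen names in pick order, or None if no full cover exists."""
    uncovered = set(keywords)
    chosen = []
    while uncovered:
        best_name, best_ratio = None, float("inf")
        for name, (covered, cost) in candidates.items():
            gain = len(uncovered & covered)  # keywords this mapping newly covers
            if gain and cost / gain < best_ratio:
                best_name, best_ratio = name, cost / gain
        if best_name is None:
            return None  # some keyword cannot be mapped at all
        chosen.append(best_name)
        uncovered -= candidates[best_name][0]
    return chosen
```

The greedy answer is an approximation. For hypothetical candidates `m1` covering {paper, ir} at cost 1.0, `m2` covering {paper} at 0.4, `m3` covering {ir} at 0.9, and `m4` covering {year} at 0.2, it picks m4, m2, m3 (total cost 1.5) although m1 plus m4 (total cost 1.2) would be cheaper.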

  25. Search: dbease • http://dbease.cs.tsinghua.edu.cn • Keyword Search: http://dbease.cs.tsinghua.edu.cn/ipubmed/ and http://dbease.cs.tsinghua.edu.cn/dblpsearch/ • Form-based Search: http://dbease.cs.tsinghua.edu.cn/seaform/ • SQL: http://dbease.cs.tsinghua.edu.cn/sqlsugg/

  26. Thanks, questions?

  27. Differences from Google Instant Search • Fuzzy prefix matching • Google first predicts queries and then uses the top predicted queries to search the documents, so it may miss relevant answers (false negatives), whereas we find the accurate top-k answers.

  28. Differences from Complete Search • Fuzzy prefix matching • Different index structures • More efficient

  29. Differences from Keyword Search • Effectiveness • SQL Suggestion supports range queries and aggregation functions. • SQL Suggestion can group answers. • SQL Suggestion can help users express their query intent more accurately. • Efficiency • Faster
