Interactive sql query suggestion
Sponsored Links
This presentation is the property of its rightful owner.
1 / 35

Interactive SQL Query Suggestion PowerPoint PPT Presentation


  • 73 Views
  • Uploaded on
  • Presentation posted in: General

Database Research Group. Interactive SQL Query Suggestion. Making Databases User-Friendly. Ju Fan , Guoliang Li, and Lizhu Zhou Database Research Group, Tsinghua University ICDE 2011 – Apr. 13, Hanover. Outline. Motivation Overview of SQL Query Suggestion Template Suggestion

Download Presentation

Interactive SQL Query Suggestion

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Database

Research

Group

Interactive SQL Query Suggestion

Making Databases User-Friendly

Ju Fan, Guoliang Li, and Lizhu Zhou

Database Research Group, Tsinghua University

ICDE 2011 – Apr. 13, Hanover


Outline

  • Motivation

  • Overview of SQL Query Suggestion

  • Template Suggestion

  • SQL Generation

  • Experiments

  • Conclusion

SQL Suggestion, ICDE 2011


Outline

  • Motivation

  • Overview of SQL Query Suggestion

  • Template Suggestion

  • SQL Generation

  • Experiments

  • Conclusion

SQL Suggestion, ICDE 2011


SQL: Powerful Yet Difficult

  • SQL is powerful but difficult for inexperienced users to pose queries

    • Be skillful in SQL syntax to express query intent

    • Have a thorough understanding of the schema

SQL Suggestion, ICDE 2011


SQL Assistant Tools

  • Target Users

    • The novice users who struggle with the basic SQL syntax or the structure of the schema.

  • Limitations

    • Only support metadata and SQL syntax

    • Require users to manually join multi-tables

SQL Suggestion, ICDE 2011


Keyword Search over RDB

  • Keyword Search over Relational DB

    • Data: A database with multiple tables

    • Query: Keywords

    • Answer: Joined tuples containing the keywords

  • Limitations

    • Cannot precisely capture users’ query intent

    • May involve irrelevant results

    • Cannot support aggregate functions, range queries, etc.

SQL Suggestion, ICDE 2011


SQL Suggestion from Keywords

SQL Suggestion, ICDE 2011


Features of SQL Suggestion

  • Objective: Assist users to formulate SQL queries using keywords

  • Main Features

    • Query intent prediction

    • Answer grouping

    • Aggregation queries

    • Range queries

SQL Suggestion, ICDE 2011


Comparison of Query Paradigms

Easier

Keyword Search

SQL Suggestion

Usability

SQL

Expressiveness

More Powerful

SQL Suggestion, ICDE 2011


Outline

  • Motivation

  • Overview of SQL Query Suggestion

  • Template Suggestion

  • SQL Generation

  • Experiments

  • Conclusion

SQL Suggestion, ICDE 2011


Problem Definition

Query: Keywords

User

Data: A Database

with Multiple Tables

Answer: SQL Queries

SQL Suggestion, ICDE 2011


A Two-Step Framework

One of Relevant Templates

Step 1

Template Suggestion

“count paper ir”

User

One of Generated SQL Queries

Step 2

SQL Generation

SELECTCOUNT (P.id)

FROM Paper P, Author A, Write W

WHERE A.nameCONTAINS “ir”

AND A.id = W.aidAND P.id = W.pid

SQL Suggestion, ICDE 2011


Outline

  • Motivation

  • Overview of SQL Query Suggestion

  • Template Suggestion

  • SQL Generation

  • Experiments

  • Conclusion

SQL Suggestion, ICDE 2011


Template Suggestion

One of Relevant Templates

Step 1

Template Suggestion

“count paper ir”

User

One of Generated SQL Queries

b

Step 2

SQL Generation

SELECTCOUNT (P.id)

FROM Paper P, Author A, Write W

WHERE A.nameCONTAINS “ir”

AND A.id = W.aidAND P.id = W.pid

SQL Suggestion, ICDE 2011


Queryable Template

  • The skeleton of SQL queries that models the joined entities and their attributes.

  • A template is an undirected graph

SQL Suggestion, ICDE 2011


Template Generation

  • Atom Entities

    • E.g., Paper

  • Expansion Rules

    • E.g., P – W  P – W – A

  • Combinatory Explosion

    • A ranking model for avoiding exploring all templates

SQL Suggestion, ICDE 2011


Template Ranking Model

=

∑k∈Q

P(Q,T)

P(T)

∑R∈T

P(k|R)

P(R|T)

Query

KeywordsQ

Keyword2 … Keywordn

Keyword1

P(k|R):Relevance of

Rtok(TF-IDF)

Entities in

template T

Paper

Write

Author

P(T)

Query Ability of T

P(R|T):Importance ofRtoT:(PageRank)

SQL Suggestion, ICDE 2011


Top-k Suggestion Algorithm

wR

=

∑k∈Q

P(Q,T)

P(T)

∑R∈T

P(k|R)

P(R|T)

  • Fagin Algorithm [5]

    • Lists of templates for all entities ordered by P(R|T)

  • Indexing

    • Inverted Index:

      • Entity-to-Template

    • Forward Index:

      • Template-to-Entity

wp*

wA*

wW*

P(P|T)

P(A|T)

P(W|T)

SQL Suggestion, ICDE 2011


Outline

  • Motivation

  • Overview of SQL Query Suggestion

  • Template Suggestion

  • SQL Generation

  • Experiments

  • Conclusion

SQL Suggestion, ICDE 2011


SQL Generation

One of Relevant Templates

Step 1

Template Suggestion

“count paper ir”

User

One of Generated SQL Queries

Step 2

SQL Generation

SELECTCOUNT (P.id)

FROM Paper P, Author A, Write W

WHERE A.nameCONTAINS “ir”

AND A.id = W.aidAND P.id = W.pid

SQL Suggestion, ICDE 2011


Match Keywords to Attributes

database count author

A Matching

Keyword-to-Attribute

Mapping

σ

SELECTION

Projection

Φ

Φ ,σ

Aggregation

Φ

π

Φ

Φ

Φ

σ

id

id

name

title

booktitle

year

Paper

Author

SQL Suggestion, ICDE 2011


SQL Generation Model

∑m∈M

ρ(k,A)

I(A)

S(M)=

database count author

A Matching

σ

π

ρ(k,A)

the degree of a mapping

Φ

I(A):

the importance of mapped attributes

(Entropy)

id

id

name

title

booktitle

year

Paper

Author

SQL Suggestion, ICDE 2011


Best SQL Query Generation

∑m∈M

ρ(k,A)

I(A)

S(M)=

  • Optimization Problem

  • Weighted Set Covering Problem (NP-hard)

    • A greedy approximation algorithm

  • Extensions

    • Find Top-k matchings

MAX.

SQL Suggestion, ICDE 2011


Outline

  • Motivation

  • Overview of SQL Query Suggestion

  • Queryable Template Suggestion

  • SQL Generation from Templates

  • Experiments

  • Conclusion

SQL Suggestion, ICDE 2011


Experiment Setup

  • Data sets

    • DBLP: More than one million publication records

    • DBLIFE: Activity information of people in DB comm.

  • Query sets, E.g.,

    • count author mining (DBLP)

    • database jim gray (DBLIFE)

  • Baseline method: DISCOVER-II

  • User-Study for effectiveness evaluation

SQL Suggestion, ICDE 2011


Template Suggestion

Precision-Recall Curves on the DBLife data set

SQL Suggestion, ICDE 2011


SQL Generation

Query: database Jim Gray

Precisions on the DBLife data set

SQL Suggestion, ICDE 2011


Record Retrieval

Query: count author mining

  • Advantages of SQL Suggestion

  • Support aggregation functions

  • Support meta-data matching

Precisions on the DBLife data set

SQL Suggestion, ICDE 2011


Efficiency

Efficiency Comparison

(DBLife)

SQL Suggestion, ICDE 2011


Scalability

Scalability (DBLP)

SQL Suggestion, ICDE 2011


Outline

  • Motivation

  • Overview of SQL Query Suggestion

  • Queryable Template Suggestion

  • SQL Generation from Templates

  • Experiments

  • Conclusion

SQL Suggestion, ICDE 2011


Conclusion

  • An effective and user-friendly keyword-based method

  • Assist users to formulate SQL queries

  • Suggest templates relevant to keyword queries

  • Generate SQL queries from templates

  • Extensive experiments

SQL Suggestion, ICDE 2011


Future Work

  • This study opens many new interesting and challenging problems

    • Cardinality estimation of suggested SQL queries

    • Personalized SQL suggestion

SQL Suggestion, ICDE 2011


Thanks

Demo: http://tastier.cs.tsinghua.edu.cn/sqlsugg

My Homepage: http://dbgroup.cs.tsinghua.edu/fanju

SQL Suggestion, ICDE 2011


Comparison with Existing Work

  • CN-Based Methods

    • Better template ranking

    • SQL Generation

    • Aggregation functions, range queries, etc.

SQL Suggestion, ICDE 2011


  • Login