Ranking-based Processing of SQL Queries. Date: 2012/1/16. Source: Hany Azzam (CIKM’11). Speaker: Er-gang Liu. Advisor: Dr. Jia-ling Koh. Outline: Introduction, The Core Retrieval Models, TF-IDF, LM Model, Tuple Retrieval, Algorithm SQL-to-PSQL, Basic Views


Ranking-based Processing of SQL Queries

Date: 2012/1/16

Source: Hany Azzam (CIKM’11)

Speaker: Er-gang Liu

Advisor: Dr. Jia-ling Koh

Outline
  • Introduction
  • The Core Retrieval Models
    • TF-IDF
    • LM Model
  • Tuple Retrieval
  • Algorithm SQL-to-PSQL
    • Basic Views
    • TF-IDF-based Processing of SQL Queries
    • LM-based Processing of SQL Queries
  • Experiment
  • Conclusion
Introduction
  • Motivation:
    • Support document/context and tuple retrieval
    • “Seamlessly” integrated IR+DB technology
  • Goal:
    • Use IR models to process SQL queries, and develop the application of PSQL to tuple retrieval.
Introduction
  • A typical SQL query is decomposed into two parts:
    • Index part
    • Retrieval part

TF-IDF RSV

Example collection c: d1 = (t1, t1, t2), d2 = (t1, t2), d3 = (t1, t3), d4 = (t2)

  • TF(t1,d1) = …
  • IDF(t1,c) = -log2( nD(t1,c) / ND(c) )
  • ND(c): number of documents in collection “c”
  • nD(t,c): number of documents with term “t” in collection “c”
  • df_t := nD(t,c) is the document frequency
  • NL(c): number of locations in collection “c”
  • nL(t,c): number of locations with term “t” in collection “c”
  • NL(d) and nL(t,d): location-based counts for document “d”
  • tf_d := nL(t,d)
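
The counts defined above can be sketched in Python for the example collection. This is a minimal illustration under stated assumptions: the assignment of terms to d1..d4 is read off the slide figure, the function names are hypothetical, and the logarithm is taken to base 2. TF is left out because its definition is not preserved in the transcript.

```python
import math

# Example collection as read off the slide figure (assumed layout).
collection = {
    "d1": ["t1", "t1", "t2"],
    "d2": ["t1", "t2"],
    "d3": ["t1", "t3"],
    "d4": ["t2"],
}

def n_docs(coll):
    """ND(c): number of documents in the collection."""
    return len(coll)

def doc_freq(term, coll):
    """nD(t,c): number of documents that contain the term."""
    return sum(1 for terms in coll.values() if term in terms)

def n_locs(coll):
    """NL(c): number of locations (term occurrences) in the collection."""
    return sum(len(terms) for terms in coll.values())

def loc_freq(term, coll):
    """nL(t,c): number of locations at which the term occurs."""
    return sum(terms.count(term) for terms in coll.values())

def idf(term, coll):
    """IDF(t,c) = -log2(nD(t,c) / ND(c)); base 2 is an assumption."""
    return -math.log2(doc_freq(term, coll) / n_docs(coll))

if __name__ == "__main__":
    for term in ("t1", "t2", "t3"):
        print(term, doc_freq(term, collection), loc_freq(term, collection),
              idf(term, collection))
```

Keeping document-based counts (ND, nD) separate from location-based counts (NL, nL) mirrors the notation on the slide: the former feed IDF, the latter feed term frequencies.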
TF-IDF RSV

Example collection c: d1 = (t1, t1, t2), d2 = (t1, t2), d3 = (t1, t3), d4 = (t2)

Q = t1, t2

  • TF-IDF term weight
    • The weight is defined as follows: …
  • WTF-IDF(t1,d1,t1,c) = …
  • WTF-IDF(t2,d1,t2,c) = …
LM RSV

Example collection c: d1 = (t1, t1, t2), d2 = (t1, t2), d3 = (t1, t3), d4 = (t2)

  • Language modelling (LM)
    • Within-document term probability (foreground model): P(t1|d1) = nL(t1,d1) / NL(d1)
    • Collection-wide term probability (background model): P(t1|c) = nL(t1,c) / NL(c)
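
The foreground and background models can be sketched as follows. This is an illustration only: it assumes both probabilities are location-based maximum-likelihood estimates, that the example collection is laid out as read off the slide figure, and the function names are hypothetical.

```python
from fractions import Fraction

# Example collection as read off the slide figure (assumed layout).
collection = {
    "d1": ["t1", "t1", "t2"],
    "d2": ["t1", "t2"],
    "d3": ["t1", "t3"],
    "d4": ["t2"],
}

def p_term_given_doc(term, doc_id, coll):
    """Foreground model: nL(t,d) / NL(d)."""
    terms = coll[doc_id]
    return Fraction(terms.count(term), len(terms))

def p_term_given_collection(term, coll):
    """Background model: nL(t,c) / NL(c) (location-based, an assumption)."""
    total_locations = sum(len(terms) for terms in coll.values())
    term_locations = sum(terms.count(term) for terms in coll.values())
    return Fraction(term_locations, total_locations)

if __name__ == "__main__":
    print(p_term_given_doc("t1", "d1", collection))
    print(p_term_given_collection("t1", collection))
```

Exact fractions are used instead of floats so the estimates can be compared directly against hand-computed counts.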
LM RSV

Example collection c: d1 = (t1, t1, t2), d2 = (t1, t2), d3 = (t1, t3), d4 = (t2)

Q = t1, t2

  • Language modelling (LM)
    • The LM term weight is defined as follows: …
  • WLM(t1,d1,c) = log(1 + …) = 0.611
  • WLM(t2,d1,c) = log(1 + …)
  • RSVLM(t1,d1,c) = 0.611 + …

SQL2PSQL ALGORITHM Basic Views
  • Tuple-based (Location-based) Probabilities, P_Z(X)
  • Conditional Probabilities, P_Z(X|Y)
  • Value-based (Document-based) Probabilities, P_Z[x](X|Y)
  • Information-based Probabilities, P_Z(X infors)
TF-IDF-based Processing of SQL Queries
  • value1 = sailing, value2 = east
LM-based Processing of SQL Queries
  • value1 = sailing, value2 = east
Experiment

The aim is to investigate the implementation of the retrieval models by examining how much quality could be achieved and at what cost.

Experiment - Evaluation
  • MAP (Mean Average Precision)

Topic 1: there are 4 relevant pages, at ranks 1, 2, 4, 7.

Topic 2: there are 5 relevant pages, at ranks 1, 3, 5, 7, 10.

Topic 1 Average Precision: (1/1 + 2/2 + 3/4 + 4/7) / 4 = 0.83.

Topic 2 Average Precision: (1/1 + 2/3 + 3/5 + 4/7 + 5/10) / 5 = 0.67.

MAP = (0.83 + 0.67) / 2 = 0.75.
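
The average-precision arithmetic for the two topics can be checked with a short Python sketch. The function names are illustrative and not taken from the source, and the sketch assumes every relevant page of a topic appears in the ranking.

```python
def average_precision(relevant_ranks):
    """Average precision for one topic.

    relevant_ranks: 1-based ranks at which the relevant pages were retrieved.
    The precision at the i-th relevant page (i = 1, 2, ...) is i / rank.
    """
    ranks = sorted(relevant_ranks)
    if not ranks:
        return 0.0
    return sum((i + 1) / rank for i, rank in enumerate(ranks)) / len(ranks)

def mean_average_precision(topics):
    """Mean of the per-topic average precisions."""
    return sum(average_precision(ranks) for ranks in topics) / len(topics)

if __name__ == "__main__":
    topic1 = [1, 2, 4, 7]
    topic2 = [1, 3, 5, 7, 10]
    print(round(average_precision(topic1), 2))
    print(round(average_precision(topic2), 2))
    print(round(mean_average_precision([topic1, topic2]), 2))
```

If some relevant pages are never retrieved, the divisor should be the total number of relevant pages rather than the number of retrieved ones; that variant is left out here.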

  • Reciprocal Rank

Topic 1 Reciprocal Rank: (1 + 1/2 + 1/4 + 1/7) / 4 = 0.47.

Topic 2 Reciprocal Rank: (1 + 1/3 + 1/5 + 1/7 + 1/10) / 5 = 0.36.
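
The slide averages the reciprocal ranks of all relevant pages of a topic. The more common definition of reciprocal rank uses only the first relevant page. A minimal sketch of both follows, with hypothetical function names.

```python
def mean_of_reciprocal_ranks(relevant_ranks):
    """Average of 1/rank over all relevant pages, as computed on the slide."""
    return sum(1.0 / rank for rank in relevant_ranks) / len(relevant_ranks)

def reciprocal_rank(relevant_ranks):
    """Conventional reciprocal rank: 1 / rank of the first relevant page."""
    return 1.0 / min(relevant_ranks)

def mean_reciprocal_rank(topics):
    """Conventional MRR: mean of the per-topic reciprocal ranks."""
    return sum(reciprocal_rank(ranks) for ranks in topics) / len(topics)

if __name__ == "__main__":
    topic1 = [1, 2, 4, 7]
    topic2 = [1, 3, 5, 7, 10]
    print(round(mean_of_reciprocal_ranks(topic1), 2))
    print(round(mean_of_reciprocal_ranks(topic2), 2))
    print(mean_reciprocal_rank([topic1, topic2]))
```

Under the conventional definition both example topics score the same, because each has its first relevant page at rank 1. The slide's variant also reflects where the remaining relevant pages fall.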

Conclusion

Support the high-level (abstract) modelling of general and specific retrieval tasks (ad-hoc retrieval, classification, summarisation, structured document retrieval, hypertext retrieval, multimedia retrieval, ...)