Efficient Interactive Fuzzy Keyword Search

Shengyue Ji¹, Guoliang Li², Chen Li¹, Jianhua Feng²

¹ University of California, Irvine

² Tsinghua University

Traditional Keyword Search

Too many results!

No result!

Complicated and still no result!

Interactive Fuzzy Keyword Search

Features:

  • Interactive: data exploration
  • Fuzzy: error tolerant
  • Multiple keywords: search on-the-fly
Fundamentals
  • Data
    • R: a set of records
    • W: a set of distinct words
  • Query
    • Q = {p1, p2, …, pl}: a set of prefixes
    • δ: edit-distance threshold
  • Query result
    • R_Q: the set of records in which every query prefix, or a similar form of it, appears as a prefix of some word (conjunctive)
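
The query semantics above can be sketched as a brute-force check, with no index involved. This is only an illustration of the definition, not the method presented in the slides. The function names, the whitespace tokenization of records, and the sample records are assumptions made for this example.

```python
def prefix_edit_distance(p: str, w: str) -> int:
    """Smallest edit distance between p and any prefix of w."""
    # row[j] holds the edit distance between the part of p read so far and w[:j].
    row = list(range(len(w) + 1))
    for i, pc in enumerate(p, start=1):
        prev_diag, row[0] = row[0], i
        for j, wc in enumerate(w, start=1):
            cur = min(row[j] + 1,              # delete pc from the query
                      row[j - 1] + 1,          # insert wc into the query
                      prev_diag + (pc != wc))  # match or substitute
            prev_diag, row[j] = row[j], cur
    return min(row)  # best prefix of w


def query_result(records, prefixes, delta):
    """Records in which every query prefix is similar to a prefix of some word."""
    return [
        r for r in records
        if all(any(prefix_edit_distance(p, w) <= delta for w in r.split())
               for p in prefixes)
    ]


# Hypothetical records, invented for this sketch.
records = ["vldb journal li", "vlbd lin", "sigmod li"]
matches = query_result(records, ["vldb", "li"], delta=1)
```

With these records and δ = 1, the second record still qualifies although "vldb" is misspelled in it, while the third fails because it has no word similar to "vldb".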
Contributions / Outline
  • Step 1
    • Incremental fuzzy prefix matching
  • Step 2
    • Multi-prefix intersection methods
    • Cache-based prefix intersection
Observation
  • W = {exam, example, exemplar, exempt, sample}
  • δ = 2

  • Q’ = exampl, Q = example

[Figure: when Q’ is extended to Q by typing e, each word in W handles the new letter with one edit operation: match e, delete e, or substitute e with a.]

Trie Indexing

Computing the set of active nodes Φ_Q

  • Initialization
  • Incremental step

[Figure: trie over the words exam, example, exemplar, exempt, and sample; the active nodes for Q = example are labeled with their edit distances (0, 1, 2, 2, 2, 2).]

Initialization
  • Q = ε
  • Initialize Φ_ε with all trie nodes within depth δ; the edit distance of each such node is its depth

[Figure: the trie with the active nodes for Q = ε labeled 0 (root), 1 (e, s), and 2 (ex, sa).]
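
A minimal sketch of this initialization, assuming for brevity that each trie node is identified by the prefix string it spells (the slides use an explicit trie with `$` terminal nodes, which is omitted here). The function names are hypothetical.

```python
def trie_nodes(words):
    """Every trie node, identified by the prefix string it spells."""
    return {w[:i] for w in words for i in range(len(w) + 1)}


def initial_active_nodes(nodes, delta):
    """Active nodes for the empty query: node -> edit distance (its depth)."""
    return {n: len(n) for n in nodes if len(n) <= delta}


W = ["exam", "example", "exemplar", "exempt", "sample"]
phi_empty = initial_active_nodes(trie_nodes(W), delta=2)
```

For the example words and δ = 2 this yields the root, the nodes e and s, and the nodes ex and sa.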

Incremental Computation: Algorithm
  • Incremental computation from Φ_Q’ to Φ_Q
  • add(Φ_Q, <n, d>) has an effect only if Φ_Q contains no active node with the same n and a smaller d

Algorithm Details

Incremental Computation: Example
  • Q = e

[Figure: the trie with the active nodes for Q = ε (distances 0, 1, 1, 2, 2) and the active nodes for Q = e (distances 1, 0, 1, 1, 2, 2, 2).]

Incremental Computation: Discussion
  • Insertions
    • Needed after matches
    • Not needed after deletions and substitutions
      • deletions and insertions do not co-occur in adjacent positions
      • adjacent substitutions and insertions are interchangeable
  • Correctness and Completeness
    • Can be proved by reducing from/to edit-distance computation
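
The incremental step and the insertion rule discussed above can be sketched as follows. This is an illustrative reading of the slides, not the authors' code. As a simplifying assumption, trie nodes are represented by the prefix strings they spell, and the names `trie_children`, `add`, `step`, and `active_nodes` are hypothetical.

```python
def trie_children(words):
    """Map each inner trie node (a prefix string) to its child nodes."""
    children = {}
    for w in words:
        for i in range(len(w)):
            children.setdefault(w[:i], set()).add(w[:i + 1])
    return children


def add(phi, node, d, delta):
    """Store <node, d> only if d <= delta and no smaller distance is stored."""
    if d <= delta and d < phi.get(node, delta + 1):
        phi[node] = d


def step(children, phi_prev, c, delta):
    """Compute the active nodes of Q = Q' + c from the active nodes of Q'."""
    phi = {}
    for n, d in phi_prev.items():
        add(phi, n, d + 1, delta)                  # delete c
        for nc in children.get(n, ()):
            if nc[-1] != c:
                add(phi, nc, d + 1, delta)         # substitute
                continue
            add(phi, nc, d, delta)                 # match
            # Insertions are considered only after a match: descendants of nc
            # that are at most delta - d letters below it.
            frontier, extra = [nc], 0
            while frontier and d + extra < delta:
                extra += 1
                frontier = [g for f in frontier for g in children.get(f, ())]
                for g in frontier:
                    add(phi, g, d + extra, delta)
    return phi


def active_nodes(words, query, delta):
    """Type the query one letter at a time, starting from the empty query."""
    children = trie_children(words)
    nodes = set(children) | {k for kids in children.values() for k in kids}
    phi = {n: len(n) for n in nodes if len(n) <= delta}  # initialization
    for c in query:
        phi = step(children, phi, c, delta)
    return phi
```

For the example words with δ = 2, `active_nodes(W, "example", 2)` contains six nodes (example, exampl, examp, exempl, exempla, sample), which agrees with the six distance labels in the trie-indexing figure.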
Outline
  • Step 1
    • Incremental fuzzy prefix matching
  • Step 2
    • Multi-prefix intersection methods
    • Cache-based prefix intersection
Multi-Prefix Intersection
  • Q = vldb li
  • Multi-prefix intersection
    • To return records such that each record has all query keywords as prefixes (or their similar forms)
Multi-Prefix Intersection: Method 1

  • Q = vldb li
  • Records for li: 1 3 4 5 6 8
  • Records for vldb: 6 7 8
  • Intersection: 6 8

[Figure: trie whose leaf nodes carry inverted lists of record IDs.]
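
One way to read Method 1 is: build one list of record IDs per query prefix by merging the inverted lists of all matching words, then intersect those lists. The sketch below follows that reading. The per-word inverted lists and the function names are hypothetical; the lists were chosen only so that the merged lists equal the ones on the slide (1 3 4 5 6 8 for li, 6 7 8 for vldb).

```python
def union_lists(inverted, words):
    """Merge the inverted lists of all words that match one query prefix."""
    ids = set()
    for w in words:
        ids.update(inverted[w])
    return ids


def intersect_unions(inverted, words_per_prefix):
    """Intersect the merged lists, starting from the shortest one.

    Assumes at least one query prefix.
    """
    unions = sorted((union_lists(inverted, ws) for ws in words_per_prefix), key=len)
    result = unions[0]
    for u in unions[1:]:
        result = result & u
    return sorted(result)


# Hypothetical inverted lists (word -> record IDs).
inverted = {
    "li": [1, 3, 4],
    "lin": [5, 6],
    "liu": [6, 8],
    "vldb": [6, 7, 8],
}
# Words assumed to have been found for each prefix in Step 1.
answer = intersect_unions(inverted, [["li", "lin", "liu"], ["vldb"]])
```

The cost of this approach is dominated by building the merged lists, which can be long for short prefixes such as li.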

Multi-Prefix Intersection: Method 2

  • Q = vldb li
  • Read each record on the list for vldb: 6 7 8
  • Verify/Probe: check whether the record contains a word whose ID lies in [2, 4], the range of the trie node li

[Figure: trie whose nodes carry word-ID ranges ([1, 7] at the root; [1, 1], [2, 6], [7, 7] for d, l, v; [2, 4] for li) and whose leaf nodes carry inverted lists of record IDs.]
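
A hedged sketch of the read-and-probe idea: take the candidate records from one prefix, then verify each against the word-ID range of the other prefix by binary search. The forward lists (record ID to sorted word IDs) are hypothetical, as are the function names. Only the candidate list 6 7 8 and the range [2, 4] come from the slide.

```python
from bisect import bisect_left


def has_word_in_range(forward_list, lo, hi):
    """True if the sorted word-ID list contains an ID in [lo, hi]."""
    i = bisect_left(forward_list, lo)
    return i < len(forward_list) and forward_list[i] <= hi


def probe_intersection(candidates, forward, ranges_per_other_prefix):
    """Keep candidates that contain, for every other prefix, a word in one of its ranges."""
    return [
        r for r in candidates
        if all(any(has_word_in_range(forward[r], lo, hi) for lo, hi in ranges)
               for ranges in ranges_per_other_prefix)
    ]


# Hypothetical forward lists; word ID 7 is assumed to stand for vldb.
forward = {6: [2, 7], 7: [1, 7], 8: [4, 7]}
answer = probe_intersection([6, 7, 8], forward, [[(2, 4)]])
```

Each prefix may contribute several ranges, one per active node, which is why the sketch takes a list of ranges per prefix. Compared with merging long lists, this reads only the short candidate list and performs one binary search per candidate and range.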
Experimental Results
  • Computing similar prefixes
Experimental Results
  • Multi-prefix intersection
Experimental Results
  • Overall scalability

TASTIER: Efficient Auto-Completion, Type-Ahead Search

http://tastier.ics.uci.edu/

Thank You!

Questions?


Efficient Interactive Fuzzy Keyword Search

Shengyue Ji, Guoliang Li, Chen Li, Jianhua Feng

UC Irvine & Tsinghua