



eXtract: A Snippet Generation System for XML Search

Yu Huang, Ziyang Liu, Yi Chen

Arizona State University

http://eXtract.asu.edu

Motivation:

Good snippets help users to easily judge the relevance and find desired results.

Problem: How to generate good snippets for XML search?

No prior work addresses snippet generation for XML search.

Contributions: eXtract, the first system for snippet generation in XML search [Huang et al., SIGMOD ’08]

Challenge: What are good snippets?

Solution: Identified desirable properties

  • Self-contained

  • Distinguishable

  • Representative

  • Small

Challenge: What information in result is significant to achieve the properties?

Solution: Designed an algorithm to generate IList

  • The entities involved in the query result

  • Keys of the query result

  • Dominant features

Challenge: How to select instances in the result when generating a snippet to maximally cover IList within a size bound?

Solution: Designed an efficient and effective algorithm that generates good snippets from IList

  • Defined the Instance Selection Problem: how to select node instances in a query result so that the snippet covers as many IList items as possible, in ranked order, within a size bound?

  • Theorem: The Instance Selection Problem is NP-hard.

  • Designed a greedy algorithm that generates good snippets efficiently.

Sample Query: retailer apparel Texas

Find the apparel retailers in Texas.

[Figure: A Query Result (an XML tree rooted at retailer) and a Sample Snippet (of size 11). Node labels: retailer, name, product, store, state, city, merchandises, clothes, fitting, situation, category. Values shown: Brook Brothers, apparel, Texas, Houston, Galleria, men, casual, formal, outwear, suit. The figure also carries the labels "Bad" and "Good".]

Features and their occurrences

  entity    attribute    value: occurrences
  store     city         Houston: 2, Dallas: 1
  clothes   fitting      men: 146, women: 101, children: 53
  clothes   situation    casual: 223, formal: 77
  clothes   category     outwear: 116, suit: 92, pants: 43, shirts: 39, shorts: 10

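Dominance score (DS): the number of occurrences of a value divided by the average number of occurrences per distinct value of the same attribute. DS(Houston) = 2/(3/2) = 1.33, DS(children) = 53/(300/3) = 0.53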
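The two worked values above can be reproduced with a short sketch. It assumes the score is a value's occurrence count divided by the average count over the distinct values of the same (entity, attribute) pair, which is the reading consistent with both examples. The occurrence counts are those listed under "Features and their occurrences"; the function and variable names are hypothetical.

```python
# Occurrence counts per (entity, attribute), taken from the features list.
counts = {
    ("store", "city"): {"Houston": 2, "Dallas": 1},
    ("clothes", "fitting"): {"men": 146, "women": 101, "children": 53},
    ("clothes", "situation"): {"casual": 223, "formal": 77},
    ("clothes", "category"): {"outwear": 116, "suit": 92, "pants": 43,
                              "shirts": 39, "shorts": 10},
}


def dominance_scores(counts):
    """Map each (entity, attribute, value) feature to its dominance score."""
    scores = {}
    for (entity, attribute), values in counts.items():
        # Average occurrences per distinct value of this attribute.
        average = sum(values.values()) / len(values)
        for value, occurrences in values.items():
            scores[(entity, attribute, value)] = occurrences / average
    return scores


scores = dominance_scores(counts)
ranked = sorted(scores, key=scores.get, reverse=True)
for feature in ranked:
    print(feature, round(scores[feature], 2))
```

Under this reading, the four highest-scoring features are outwear, suit, casual, and men, in that order.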

IList: Texas, apparel, retailer, store, Brook Brothers, outwear, suit, casual, men

  • Keywords: Texas, apparel, retailer

  • Related entities: store

  • Key: Brook Brothers

  • Dominant features: outwear, suit, casual, men

Experiments:

  • Compared Google Desktop, Greedy (eXtract), and an Optimal algorithm for instance selection.

  • User study scores were 2.3, 3.9, and 4.2 out of 5, respectively.

[Charts: Quality (Precision, Recall) and Speed (Time in seconds).]

34th International Conference on Very Large Data Bases, August 23rd-28th, 2008, Auckland, New Zealand

