CAREER: Towards Unifying Database Systems and Information Retrieval Systems
Presentation Transcript

    Slide 1:CAREER: Towards Unifying Database Systems and Information Retrieval Systems. NSF IDM Workshop, 10 Oct 2004. Jayavel Shanmugasundaram, Cornell University.

    Slide 2:10000 foot view of Data Management

    Slide 3:10000 foot view of Data Management

    Slide 4:Internet Archive Database

    Slide 5:Internet Archive Database

    Slide 6:Structured Value Ranking. Use structured data values associated with text columns to score results. Main technical challenge: need to produce top-k results efficiently. Ordering inverted lists by score would allow this, but scores change frequently [Aizen et al., 2004], e.g., flash crowds on the Internet, recent award announcements. How can we process top-k results efficiently while allowing frequent score updates?

    Slide 7:Solution Overview. Order inverted lists by score: queries efficient, score updates slow. Order inverted lists by document id: queries slow, score updates efficient. Hybrid solution: order inverted lists by chunk; order chunks by score, and order documents within a chunk by id. Guo et al. [ICDE 2005].
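
The hybrid ordering on this slide can be illustrated with a small Python sketch. This is a simplified illustration under stated assumptions, not the algorithm from Guo et al.: the class name `ChunkedPostingList`, the fixed score boundaries, and the single-term posting list are all hypothetical. The idea shown is that chunks are kept in descending score order, document ids are sorted within each chunk, and a score update only touches the list when the document crosses a chunk boundary.

```python
import heapq
from bisect import insort


class ChunkedPostingList:
    """Hypothetical posting list for one term, split into score-range chunks."""

    def __init__(self, boundaries):
        # Assumption: boundaries are chunk lower bounds in descending order,
        # e.g. [100, 10, 0]. Chunk i holds documents whose score is at least
        # boundaries[i] and below boundaries[i - 1].
        self.boundaries = boundaries
        self.chunks = [[] for _ in boundaries]  # each chunk: doc ids, sorted
        self.scores = {}                        # doc id -> current score

    def _chunk_of(self, score):
        for i, low in enumerate(self.boundaries):
            if score >= low:
                return i
        raise ValueError("score below lowest chunk boundary")

    def add(self, doc_id, score):
        self.scores[doc_id] = score
        insort(self.chunks[self._chunk_of(score)], doc_id)

    def update_score(self, doc_id, new_score):
        """Update a score; return True only if the document changed chunk."""
        old = self._chunk_of(self.scores[doc_id])
        new = self._chunk_of(new_score)
        self.scores[doc_id] = new_score
        if old != new:
            self.chunks[old].remove(doc_id)
            insort(self.chunks[new], doc_id)
        return old != new

    def top_k(self, k):
        # Scan chunks from highest score range down. Once the scanned chunks
        # hold at least k documents, no later chunk can contain a better one.
        candidates = []
        for chunk in self.chunks:
            candidates.extend(chunk)
            if len(candidates) >= k:
                break
        return heapq.nlargest(k, candidates, key=self.scores.__getitem__)


plist = ChunkedPostingList([100, 10, 0])
for doc_id, score in [(1, 5), (2, 150), (3, 50), (4, 12)]:
    plist.add(doc_id, score)

first = plist.top_k(2)                 # highest-scoring two documents
moved = plist.update_score(4, 60)      # stays inside the [10, 100) chunk
second = plist.top_k(2)
```

In this sketch the update of document 4 leaves the chunk lists untouched because its new score stays in the same range, which is the cheap-update case the hybrid layout is meant to make common; a query still reads only the leading chunks.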

    Slide 8:10000 foot view of Data Management

    Slide 9:Applications. Content management: a mix of structured and unstructured data, e.g., a database with the date and time of an accident (structured data) and the accident description (unstructured data). Semi-structured data: scientific documents, Shakespeare's plays. Goal: support a flexible keyword search interface over a mix of structured and unstructured data. XRANK [Guo et al., SIGMOD 2003].

    Slide 10:XML Keyword Search

    Slide 11:10000 foot view of Data Management

    Slide 12:Applications. The Internet is enabling end users to directly ask queries and explore results. E.g., a used car marketplace: find all bright red Ford Mustangs that cost less than 20% of the average price of cars in their class. Characteristics of queries: keyword search (for ease of use) and complex query operations (information synthesis). Users want to see ranked results!
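
The used car query combines three ingredients: keyword matching, a structured aggregate (average price per class), and ranking. A minimal Python sketch of that combination follows. Everything in it is an assumption made for illustration: the function `search_cars`, the record fields (`id`, `description`, `car_class`, `price`), the sample data, and the scoring rule (fraction of query keywords found, with ties broken by lower price). It does no stemming, so the keyword is given as "mustang" rather than "mustangs".

```python
from collections import defaultdict
from statistics import mean


def search_cars(cars, keywords, price_fraction):
    """Rank cars matching the keywords whose price is below
    price_fraction times the average price of their class."""
    # Structured part: average price per car class.
    prices_by_class = defaultdict(list)
    for car in cars:
        prices_by_class[car["car_class"]].append(car["price"])
    class_avg = {c: mean(p) for c, p in prices_by_class.items()}

    terms = [k.lower() for k in keywords]
    results = []
    for car in cars:
        # Unstructured part: keyword match against the free-text description.
        words = set(car["description"].lower().split())
        hits = sum(1 for t in terms if t in words)
        if hits == 0:
            continue
        if car["price"] >= price_fraction * class_avg[car["car_class"]]:
            continue
        results.append((hits / len(terms), car))

    # Ranking: best keyword coverage first, cheaper cars first on ties.
    results.sort(key=lambda r: (-r[0], r[1]["price"]))
    return [(round(score, 2), car["id"]) for score, car in results]


# Hypothetical sample data.
cars = [
    {"id": 1, "description": "bright red ford mustang", "car_class": "sports", "price": 2000},
    {"id": 2, "description": "red ford mustang convertible", "car_class": "sports", "price": 3000},
    {"id": 3, "description": "blue chevrolet camaro", "car_class": "sports", "price": 40000},
    {"id": 4, "description": "silver porsche 911", "car_class": "sports", "price": 55000},
    {"id": 5, "description": "bright red ford mustang", "car_class": "sports", "price": 50000},
]

ranked = search_cars(cars, ["bright", "red", "ford", "mustang"], 0.2)
```

With this data the class average is 30000, so only cars priced under 6000 qualify; car 5 matches every keyword but is filtered out by the price condition.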

    Slide 13:Towards Unifying DB and IR. No standard query language for both DB and IR: SQL and XQuery are mostly database query languages. Have developed TeXQuery, a full-text search extension to XQuery (Amer-Yahia et al., WWW 2004), with full composability of database and IR primitives, and ranking. Adopted as the precursor to the XQuery full-text extensions currently being developed by the W3C. Come see the demo tomorrow.

    Slide 14:Related Work. Integrating DB and IR systems: for the most part, these treat the individual systems as black boxes; our goal is to unify DB and IR systems. Search over semi-structured data: specialized techniques for searching semi-structured data; our goal is to generalize DB and IR techniques. Keyword search and ranking in databases.

    Slide 15:Summary. Many emerging applications require a unification of DB and IR techniques: e-commerce applications, semi-structured documents, content management. This argues for a new generation of systems and techniques that seamlessly provide this capability (SVR, XRANK, TeXQuery). Educational benefit: present a unified view of data management, currently at the graduate level, eventually introducing concepts at the undergraduate level.