
A Course on Probabilistic Databases

A Course on Probabilistic Databases. Dan Suciu, University of Washington. Outline. Part 1. Motivating Applications; The Probabilistic Data Model (Chapter 2); Extensional Query Plans (Chapter 4.2); The Complexity of Query Evaluation (Chapter 3); Extensional Evaluation (Chapter 4.1)


Presentation Transcript


  1. Probabilistic Databases - Dan Suciu
     A Course on Probabilistic Databases
     Dan Suciu, University of Washington

  2. Outline
     Part 1
     • Motivating Applications
     • The Probabilistic Data Model (Chapter 2)
     • Extensional Query Plans (Chapter 4.2)
     • The Complexity of Query Evaluation (Chapter 3)
     • Extensional Evaluation (Chapter 4.1)
     • Intensional Evaluation (Chapter 5)
     • Conclusions
     Part 2, Part 3, Part 4

  3. Summary (1/2)
     • There are many applications that require storage/management of uncertain data
       • Retain data that is not absolutely certain
       • Retain more than one alternative way to clean
     • Probabilities are application specific
       • All we care about is that “bigger is better”
     • Queries have precise semantics
       • Important for query optimization
       • “Bigger probability” means “more certain answer”
     • “Stop worrying about probabilities and start asking queries”
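
The "precise semantics" bullet can be made concrete with a small sketch of possible-worlds semantics on a tuple-independent database: the probability of a Boolean query is the total probability of the worlds in which it is true. The tables `R` and `S`, their probabilities, and the query are hypothetical examples chosen for illustration; they do not come from the slides.

```python
from itertools import product

# Hypothetical tuple-independent tables: each tuple maps to its marginal probability.
R = {("a",): 0.5, ("b",): 0.8}
S = {("a", 1): 0.4, ("a", 2): 0.5, ("b", 1): 0.25}


def worlds(table):
    """Yield (set_of_present_tuples, probability) for every possible world of one table."""
    tuples = list(table)
    for keep in product([False, True], repeat=len(tuples)):
        prob = 1.0
        present = set()
        for t, is_present in zip(tuples, keep):
            # Tuples are independent, so world probability is a plain product.
            prob *= table[t] if is_present else 1.0 - table[t]
            if is_present:
                present.add(t)
        yield present, prob


def query(r, s):
    """Boolean query: does some x satisfy R(x) and S(x, y) for some y?"""
    return any((x,) in r for (x, _y) in s)


def query_probability():
    """Sum the probabilities of all worlds in which the query is true."""
    total = 0.0
    for r, p_r in worlds(R):
        for s, p_s in worlds(S):
            if query(r, s):
                total += p_r * p_s
    return total


if __name__ == "__main__":
    print(query_probability())
```

Enumerating worlds is exponential in the number of tuples, so this only serves as a definition to check other evaluation methods against.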

  4. Summary (2/2)
     • Extensional query evaluation:
       • Advantage: can use an out-of-the-box DBMS
       • Disadvantage: can’t handle unsafe queries (but can still give upper/lower bounds on probabilities)
       • “You don’t need a probabilistic database management system to manage probabilistic data”
     • Intensional query evaluation:
       • Advantage: can use an out-of-the-box model counting system
       • Disadvantage: requires an expensive “lineage computation” step; none of the model counting approaches seems complete for SPJU queries
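
A minimal sketch of the two approaches on one safe query may help. It assumes a hypothetical tuple-independent database (`R`, `S`) and the Boolean query "some x has R(x) and S(x, y)". The extensional side uses only independent project and independent join, which are the kind of operations a stock DBMS can run as ordinary aggregates. The intensional side first builds the lineage as a DNF formula and then counts models by brute force, which stands in here for a real model counting system.

```python
from itertools import product

# Hypothetical tuple-independent tables (not from the slides).
R = {"a": 0.5, "b": 0.8}
S = {("a", 1): 0.4, ("a", 2): 0.5, ("b", 1): 0.25}


def independent_or(probs):
    """Probability that at least one of several independent events holds."""
    none = 1.0
    for p in probs:
        none *= 1.0 - p
    return 1.0 - none


def extensional_probability():
    """Safe plan: project y out of S, join with R on x, then project x out."""
    per_x = {}
    for (x, _y), p in S.items():
        per_x.setdefault(x, []).append(p)
    # Independent join multiplies probabilities of the joined tuples.
    joined = [R[x] * independent_or(ps) for x, ps in per_x.items() if x in R]
    return independent_or(joined)


def lineage():
    """DNF lineage: one clause per joining pair; variables identify tuples."""
    return [(("R", x), ("S", x, y)) for (x, y) in S if x in R]


def intensional_probability():
    """Brute-force weighted model counting over the lineage variables."""
    weights = {("R", x): p for x, p in R.items()}
    weights.update({("S", x, y): p for (x, y), p in S.items()})
    clauses = lineage()
    variables = sorted({v for clause in clauses for v in clause}, key=str)
    total = 0.0
    for bits in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, bits))
        if any(all(assignment[v] for v in clause) for clause in clauses):
            weight = 1.0
            for v in variables:
                weight *= weights[v] if assignment[v] else 1.0 - weights[v]
            total += weight
    return total


if __name__ == "__main__":
    print(extensional_probability(), intensional_probability())
```

Both functions agree on this query because it is safe. For an unsafe query the extensional plan would no longer return the exact probability, while the lineage-based route stays correct but the brute-force counter becomes exponential in the number of variables.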

  5. Open Problems
     • How do we compute the “hard” queries?
     • Extensions to:
       • BID tables
       • Full FO: ¬ ∀
       • Independent AND disjoint tuples
       • Uniform probabilistic structures (MLNs)
     • Which queries admit efficient FBDDs? Which queries admit efficient d-DNNFs?
