“Artificial Intelligence” in Database Querying

Dept. of CSE, Seung-won Hwang. Why do you need to ace this class? “Producing machines to automate tasks requiring intelligent behavior” (Wikipedia). AI techniques are highly relevant to many research fields, including databases.


Presentation Transcript


  1. “Artificial Intelligence” in Database Querying • Dept. of CSE • Seung-won Hwang

  2. Why do you need to ace this class? • “producing machines to automate tasks requiring intelligent behavior” (Wikipedia) • AI techniques are highly relevant to many research fields, including databases

  3. More obvious applications

  4. But…

  5. Crash course on DB • SQL queries: select * from cars where color='red' and type='convertible' and brand='hyundai'
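To make the query on this slide concrete, here is a minimal sketch that runs it with Python's built-in `sqlite3` module. Only the table name `cars` and its three columns come from the slide; the three-row inventory and the helper name `find_cars` are made up for illustration.

```python
import sqlite3

# Hypothetical toy inventory (not from the slides): (color, type, brand).
ROWS = [
    ("red", "convertible", "hyundai"),
    ("red", "sedan", "hyundai"),
    ("black", "convertible", "bmw"),
]


def find_cars(rows, color, car_type, brand):
    """Run the slide's three-condition query on an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cars (color TEXT, type TEXT, brand TEXT)")
    conn.executemany("INSERT INTO cars VALUES (?, ?, ?)", rows)
    # Every condition must hold: SQL returns exact matches only.
    result = conn.execute(
        "SELECT * FROM cars WHERE color = ? AND type = ? AND brand = ?",
        (color, car_type, brand),
    ).fetchall()
    conn.close()
    return result


matches = find_cars(ROWS, "red", "convertible", "hyundai")
no_matches = find_cars(ROWS, "red", "convertible", "bmw")
```

With this toy data, the second call comes back empty even though a red Hyundai convertible and a black BMW convertible are both in stock, because a conjunctive query has no notion of a near match.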

  6. Crash course on DB • Deciding the most efficient execution plan among: • hyundai->red->convertible? • red->convertible->hyundai? • convertible->hyundai->red? • … • Depends on data structures (B+-tree), data distributions, … • However, all this effort is wasted if no object qualifies
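The plan choice above can be illustrated with a deliberately simplified sketch: apply the condition that matches the fewest rows first, so later filters see less data. Real optimizers use cost models, indexes, and stored statistics rather than counting rows on the fly; the data and the names `order_predicates` and `run_query` are assumptions for illustration only.

```python
# Hypothetical toy inventory (not from the slides).
CARS = [
    {"color": "red", "type": "sedan", "brand": "hyundai"},
    {"color": "red", "type": "sedan", "brand": "honda"},
    {"color": "red", "type": "convertible", "brand": "bmw"},
    {"color": "black", "type": "sedan", "brand": "hyundai"},
]

QUERY = {"color": "red", "type": "convertible", "brand": "hyundai"}


def order_predicates(rows, predicates):
    """Sort equality conditions so the one matching the fewest rows comes first."""
    def matching_rows(item):
        attr, value = item
        return sum(1 for row in rows if row[attr] == value)

    return sorted(predicates.items(), key=matching_rows)


def run_query(rows, predicates):
    """Apply the conditions in the chosen order, stopping once nothing is left."""
    candidates = rows
    for attr, value in order_predicates(rows, predicates):
        candidates = [row for row in candidates if row[attr] == value]
        if not candidates:
            break  # no object qualifies, so the remaining filters are skipped
    return candidates


plan = [attr for attr, _ in order_predicates(CARS, QUERY)]
result = run_query(CARS, QUERY)
```

On this data the chosen order is type, then brand, then color, and the result is still empty: a good plan makes the query fast, but it cannot make it useful when nothing matches.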

  7. Our strength

  8. Our strength • Internet shopping, web bulletin boards, Cyworld, … • You are sending SQL queries without knowing it • (at least until you see DB errors) • The DBMS is optimizing your queries for you, also without your knowing it

  9. Our weakness • But do you use a DBMS to manage your Word files, photos, etc.? • What do you use? • File system (browsing) • Google Desktop (searching) • SQL semantics are too strict • No red Hyundai convertible at all! Or too many red Hyundai Elantras?

  10. While Google makes $$$ for

  11. Giving “Artificial Intelligence” • What intelligent behaviors are expected? • Suggesting alternatives: • Red Hyundai • Red convertible • Orange convertible • What can be automated? • Deciding red Hyundai < red convertible
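One simple way to produce alternatives like these is query relaxation: when the full query matches nothing, retry with one condition dropped, then two, and report the largest subsets of conditions that still match something. This is only a sketch under that assumption, not the method of the lecture; the data and the name `relax` are hypothetical. It covers "red Hyundai" and "red convertible", but not "orange convertible", which would need a notion of similarity between values such as red and orange.

```python
from itertools import combinations

# Hypothetical toy inventory (not from the slides).
CARS = [
    {"color": "red", "type": "sedan", "brand": "hyundai"},
    {"color": "red", "type": "convertible", "brand": "bmw"},
    {"color": "orange", "type": "convertible", "brand": "bmw"},
]

QUERY = {"color": "red", "type": "convertible", "brand": "hyundai"}


def relax(rows, predicates):
    """Return (kept_conditions, matching_rows) pairs for the largest
    subsets of conditions that match at least one row."""
    items = sorted(predicates.items())
    for size in range(len(items), 0, -1):
        alternatives = []
        for subset in combinations(items, size):
            hits = [r for r in rows if all(r[a] == v for a, v in subset)]
            if hits:
                alternatives.append((dict(subset), hits))
        if alternatives:
            return alternatives  # stop at the least-relaxed level that works
    return []


alternatives = relax(CARS, QUERY)
kept = [conditions for conditions, _ in alternatives]
```

Relaxation finds the candidates but does not order them: both alternatives keep two of the three conditions, so deciding which should rank higher needs extra information of the kind discussed in the following slides.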

  12. But how? • Any ideas? • The gap: underspecified/overspecified queries

  13. [S1] Borrowing wisdom from data (as Google does) • Useful for both too many results and empty results

  14. Text ranking • tf (term frequency): how often the query term appears in a document • idf (inverse document frequency): how rare the query term is in the document collection • [Figure: sample cars.com listings made up of the terms “red”, “hyundai”, and “convertible”, annotated “low idf” and “high tf”]
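The two definitions can be sketched in a few lines. Many tf-idf variants exist; this one assumes length-normalized tf and idf = log(N / df), and the three tiny "documents" are invented for illustration.

```python
import math

# Hypothetical tokenized documents (not from the slides).
DOCS = [
    ["red", "red", "red", "hyundai"],
    ["red", "convertible"],
    ["red", "hyundai", "hyundai"],
]


def tf(term, doc):
    """Term frequency: share of the document's tokens equal to the term."""
    return doc.count(term) / len(doc)


def idf(term, docs):
    """Inverse document frequency: log(N / number of documents with the term).
    Returning 0.0 for an unseen term is an assumption of this sketch."""
    df = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / df) if df else 0.0


def tf_idf(term, doc, docs):
    """Score of a term in one document: frequent here, rare elsewhere."""
    return tf(term, doc) * idf(term, docs)


red_score = tf_idf("red", DOCS[0], DOCS)
convertible_score = tf_idf("convertible", DOCS[1], DOCS)
```

Here "red" has a high tf in the first document but appears in every document, so its idf, and therefore its score, is zero, while the rarer "convertible" gets a positive score.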

  15. Applying to databases • Red Hyundai = 0.9 • Red Honda = 0.4 • Black Hyundai = 0.8

  16. What is the assumption? • Rare items are preferred • Can you think of exceptions? • ‘purple Pony’ vs. ‘purple Lexus’ • How can we handle this problem?

  17. [S2] Borrowing wisdom of other users

  18. Query frequency • Keyword frequency in prior queries • E.g., car=‘BMW’ appearing in 50% of prior queries • Putting the two together, we can rank highly the cars that were heavily queried before and are rare in stock
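A minimal sketch of combining the two signals follows. The slide does not say how they are combined; multiplying query frequency by idf is an assumption made here, and the stock, the query log, and the function names are all hypothetical.

```python
import math

# Hypothetical stock and prior-query log (not from the slides).
STOCK = [
    {"brand": "hyundai", "color": "red"},
    {"brand": "hyundai", "color": "black"},
    {"brand": "hyundai", "color": "red"},
    {"brand": "bmw", "color": "red"},
]

QUERY_LOG = [
    {"brand": "bmw"},
    {"brand": "bmw", "color": "red"},
    {"brand": "hyundai"},
    {"color": "black"},
]


def idf(value, attr, rows):
    """Rarity in stock: log(N / number of rows with this attribute value)."""
    n = sum(1 for row in rows if row[attr] == value)
    return math.log(len(rows) / n) if n else 0.0


def qf(value, attr, query_log):
    """Query frequency: share of prior queries that asked for this value."""
    if not query_log:
        return 0.0
    hits = sum(1 for query in query_log if query.get(attr) == value)
    return hits / len(query_log)


def score(value, attr, rows, query_log):
    """Assumed combination: popular in past queries AND rare in stock."""
    return qf(value, attr, query_log) * idf(value, attr, rows)


bmw_score = score("bmw", "brand", STOCK, QUERY_LOG)
hyundai_score = score("hyundai", "brand", STOCK, QUERY_LOG)
```

In this toy setup BMW appears in 50% of prior queries and only once in stock, so it scores higher than Hyundai, which is both asked for less often and more common in stock.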

  19. [S3] Borrowing wisdom from domain knowledge

  20. Example 1: color • [Figure: five panels, (a) to (e)]

  21. Example 2: shape=‘retro’

  22. [S4] Borrowing wisdom from a specific user • The notion of similarity differs significantly across users • Shape? • [Figure: three objects labeled A, B, and C]

  23. You cannot expect users to give (or machines to understand) an explicit explanation like: • “I want a photo of a building similar to the Eiffel Tower in terms of shape, but not in terms of the overall shape, but in terms of the shape of the steel material…”

  24. Mindreader? (mediabakery.com)

  25. In our car search example • You can show ‘red BMW’ and ‘Hyundai sedan’ • Based on the user's response (or clicks), you can figure out which factor is more important, e.g., color • Then you can show more red cars to learn more about the user's brand preferences
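The feedback loop in this example can be sketched as a simple vote count: each time the user clicks a car that matches the query on an attribute where the skipped car does not, that attribute gets a vote. This is an illustrative assumption, not the learning method used in the lecture; the names `learn_weights` and `rank`, the uniform starting vote, and the black color given to the Hyundai sedan are hypothetical.

```python
QUERY = {"color": "red", "brand": "hyundai"}
ATTRS = ["color", "brand"]

# The two cars shown to the user; the Hyundai's color is assumed.
RED_BMW = {"color": "red", "brand": "bmw", "type": "sedan"}
HYUNDAI_SEDAN = {"color": "black", "brand": "hyundai", "type": "sedan"}


def learn_weights(query, feedback, attrs):
    """feedback is a list of (clicked_car, skipped_car) pairs.
    Returns attribute weights that sum to 1."""
    votes = {attr: 1.0 for attr in attrs}  # uniform start, so no weight is zero
    for clicked, skipped in feedback:
        for attr in attrs:
            clicked_ok = clicked.get(attr) == query.get(attr)
            skipped_ok = skipped.get(attr) == query.get(attr)
            if clicked_ok and not skipped_ok:
                votes[attr] += 1.0
    total = sum(votes.values())
    return {attr: vote / total for attr, vote in votes.items()}


def rank(cars, query, weights):
    """Order cars by the total weight of the query attributes they match."""
    def match_score(car):
        return sum(w for attr, w in weights.items() if car.get(attr) == query.get(attr))

    return sorted(cars, key=match_score, reverse=True)


# The user clicked the red BMW and skipped the Hyundai sedan.
weights = learn_weights(QUERY, [(RED_BMW, HYUNDAI_SEDAN)], ATTRS)
ranked = rank([HYUNDAI_SEDAN, RED_BMW], QUERY, weights)
```

After a single click on the red BMW, color outweighs brand, so red cars move to the top of the next result page; further clicks among red cars would then refine the brand weight in the same way.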

  26. Summing up • You need to bridge the gap between SQL and ideal results by collecting and analyzing as much information as is available from the data, prior users, the user himself/herself, … • Implicitly and automatically

  27. Another kind of implicit information to think about • Tagging frequency ranking / automatic classification?

  28. Summary • Networks enable access to a large amount of user-created content/information (“Web 2.0”) • http://youtube.com/watch?v=6gmP4nk0EOE (interesting Web 2.0 video) • Intelligent retrieval techniques are the key in the new era • Ranking • Classification • I will then show how AI techniques (that you already know!) got me a PhD in intelligent retrieval research • Rank formulation: machine learning • Rank/classification processing: best-first search, hill climbing

  29. Q&A
