Efficient computation of diverse query results. Presenting: Karina Koifman. Course: DB Seminar. Example: Yahoo! Autos. Maybe a better retrieval. Introduction. The article addresses the problem of efficiently computing diverse query results in online shopping applications.
Existing solutions are inefficient or do not work in all situations. Example:
first retrieve c × k results, then pick a diverse subset of k from these.
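The over-fetch approach can be sketched as follows. This is only an illustration: the function name, the round-robin selection by make, and the sample cars are assumptions, not taken from the article. It shows why the approach does not work in all situations: if the first c × k matches are all alike, no diverse subset can be picked from them.

```python
from collections import defaultdict, deque


def overfetch_then_diversify(matches, k, c, key):
    """Take the first c*k matches, then pick k of them round-robin across groups."""
    window = matches[: c * k]
    groups = defaultdict(deque)
    for item in window:
        groups[key(item)].append(item)

    picked = []
    while len(picked) < k and groups:
        # One pass over the groups takes at most one item from each.
        for group in list(groups):
            if len(picked) == k:
                break
            picked.append(groups[group].popleft())
            if not groups[group]:
                del groups[group]
    return picked


# Hypothetical (make, model) matches in retrieval order.
cars = [
    ("Honda", "Civic"), ("Honda", "Civic"), ("Honda", "Accord"),
    ("Honda", "Civic"), ("Toyota", "Camry"), ("Ford", "Focus"),
]

# With c = 1 the window holds only Hondas, so the result cannot be diverse.
narrow = overfetch_then_diversify(cars, k=3, c=1, key=lambda car: car[0])
# With c = 2 the window happens to reach the other makes.
wide = overfetch_then_diversify(cars, k=3, c=2, key=lambda car: car[0])
```

How large c must be depends on the data, and a larger c means more items retrieved and discarded.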
*There are no Honda Accord convertibles
Find a result set that minimizes
Merged Inverted List:
Let's say Q looks for descriptions containing ‘Low’, with k = 3.
We start from two Civics; then we know that we need only
one more, so we pick the next Civic.
Then we look for another at the next level (Accord): there is none,
because it doesn’t have ‘Low’ in it (and there is no other at that level).
Then we look for another at the next level (make), and prune.
This is maximally diverse, so we stop here.
If we had a Ford, we would continue.
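The goal of the walk-through, a result that spreads its k items as evenly as possible across the levels of the hierarchy, can be sketched like this. It is not the one-pass algorithm itself: this version reads every matching ID up front, whereas the walk-through skips and prunes as it goes. The tuple encoding (make, model, item) and the sample IDs are assumptions made for illustration.

```python
from collections import defaultdict


def pick_diverse(ids, k, level=0):
    """Pick up to k IDs, spreading picks evenly over the children at each level.

    Assumes the IDs are unique tuples of equal length.
    """
    ids = sorted(ids)
    if k <= 0 or not ids:
        return []
    if len(ids) <= k:
        return ids

    # Group the IDs by their component at the current level.
    children = defaultdict(list)
    for dewey in ids:
        children[dewey[level]].append(dewey)

    # Hand out the k slots one at a time, round-robin, skipping full children.
    quota = {child: 0 for child in children}
    remaining = k
    while remaining > 0:
        for child, members in children.items():
            if remaining > 0 and quota[child] < len(members):
                quota[child] += 1
                remaining -= 1

    picked = []
    for child, members in children.items():
        picked.extend(pick_diverse(members, quota[child], level + 1))
    return picked


# Hypothetical matches: make 0 has models 0 (three items) and 1 (one item);
# make 1 has a single item.
matches = [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 1, 0), (1, 0, 0)]
result = pick_diverse(matches, k=3)
```

Here the three slots go to two different makes first, and the two slots inside make 0 go to two different models.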
Give each car a score; the query then takes a score as a parameter: minScore, the smallest score in the result set.
Choose the next ID by:
The smallest ID such that score(id) >= root.minScore.
The algorithm then proceeds as before.
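The rule for choosing the next ID can be sketched as below. The names next_id, ids and scores are hypothetical, and the linear scan stands in for whatever skipping the real index supports; min_score plays the role of root.minScore.

```python
from bisect import bisect_left


def next_id(ids, scores, start, min_score):
    """Return the smallest ID >= start whose score is at least min_score.

    ids must be sorted; scores maps each ID to its score.
    Returns None when no such ID exists.
    """
    pos = bisect_left(ids, start)
    for candidate in ids[pos:]:
        if scores[candidate] >= min_score:
            return candidate
    return None


# Hypothetical IDs and scores.
ids = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
scores = {(0, 0, 0): 0.2, (0, 0, 1): 0.9, (0, 1, 0): 0.4, (1, 0, 0): 0.7}

first = next_id(ids, scores, (0, 0, 0), min_score=0.5)
after_skip = next_id(ids, scores, (0, 0, 2), min_score=0.5)
exhausted = next_id(ids, scores, (1, 0, 1), min_score=0.5)
```

Low-scoring IDs are passed over, so a jump to a given position lands on the first ID from there on that can still belong to the result set.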
Main idea: go over all the cars as if they were on an axis.
We use the WAND algorithm to obtain the top-k list.
The next step is marking all possible nodes to add as MIDDLE.
We also maintain a heap, used to find a node with the minimum child.
At each step we move nodes from tentative to useful.
MultQ: rewriting the query as multiple queries and merging their results.
Naïve: all the results of a query.
Basic: just the first k answers, without diversity.
OnePass, Probe: our algorithms.
U = unscored
S = scored
Thank you!