An Adaptive Algorithm for Finding Frequent Sets in Landmark Windows

Xuan Hong Dang, Kok-Leong Ong and Vincent Lee. Dept. of Computer Science, Aarhus University, Denmark; School of IT, Deakin University, Australia; Faculty of IT, Monash University, Australia.


Presentation Transcript


  1. An Adaptive Algorithm for Finding Frequent Sets in Landmark Windows • Xuan Hong Dang, Kok-Leong Ong and Vincent Lee • Dept. of Computer Science, Aarhus University, Denmark • School of IT, Deakin University, Australia • Faculty of IT, Monash University, Australia

  2. Applications • Sensors of all sorts are generating a lot of data streams • Many applications consume these data streams to discover evolving knowledge about the data stream

  3. Problem • Data rates can exceed compute capacity • Machine must adapt to produce results on time • HOW?

  4. A solution for finding frequent sets • Our method • Computes approximate frequency counts • Builds adaptability into processing through load shedding • Is applicable to landmark, forgetful and sliding windows

  5. StreamL • Given a transaction stream {t1, t2, t3, …, ti, tj, …} • ti = {x1, x2, …}, where xa is a literal • [Figure: landmark window over the stream]
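
As a concrete illustration of the landmark-window model on this slide, the sketch below counts itemset occurrences over every transaction seen since the landmark. It is a minimal, exact-counting example and not the StreamL algorithm itself, which uses approximate counts; the function name `landmark_counts` and the `max_len` cap on itemset size are assumptions introduced here for illustration.

```python
from collections import Counter
from itertools import combinations


def landmark_counts(stream, max_len=2):
    """Count itemsets of size <= max_len over all transactions since the landmark.

    Hypothetical helper for illustration: exact counting, no load shedding.
    """
    counts = Counter()
    n = 0  # number of transactions seen in the landmark window so far
    for transaction in stream:
        n += 1
        items = sorted(set(transaction))  # canonical order so itemsets compare equal
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts, n


if __name__ == "__main__":
    stream = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
    counts, n = landmark_counts(stream)
    # Relative support of {a, b} within the landmark window.
    print(counts[("a", "b")] / n)
```

Because a landmark window only ever grows, the counts are never decremented; the work per transaction grows with the number of itemsets it contains, which is what motivates the capacity estimate on the following slides.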

  6. StreamL • Capacity is bounded by the number of transactions in the window and the size of each transaction • How can this capacity be measured? • A simple way is to use MFS to estimate how many itemsets to process in each transaction, i.e., [equation not preserved in the transcript]

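  7. StreamL • For n transactions in the window, the number of itemsets to process is [equation not preserved in the transcript] • If r is the rate, then the capacity to process each transaction can be expressed as [equation not preserved in the transcript]

  8. StreamL • For n transactions in the window, the number of itemsets to process is [equation not preserved in the transcript] • If r is the rate, then the capacity to process each transaction can be expressed as [equation not preserved in the transcript]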

  9. StreamL • When the rate increases, the idea is to introduce a parameter P such that [equation not preserved in the transcript] so that the system stays in a non-overloaded state • To achieve a load of C, the adjustment made by P is achieved by dropping transactions
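
The load-shedding idea on this slide can be sketched as follows. The slide's own formulas are not available in the transcript, so everything numeric here is an assumption: the per-transaction work is estimated as the number of itemsets up to a maximum length, the demand is taken to be rate times work, and P is chosen as the ratio of capacity to demand, capped at 1. The names `itemsets_per_transaction`, `keep_probability` and `shed` are hypothetical.

```python
import random
from math import comb


def itemsets_per_transaction(size, max_len):
    """Assumed work estimate: itemsets of length 1..max_len in a transaction of `size` items."""
    return sum(comb(size, k) for k in range(1, min(size, max_len) + 1))


def keep_probability(rate, work_per_txn, capacity):
    """Assumed rule: choose P so that rate * work_per_txn * P does not exceed capacity."""
    demand = rate * work_per_txn
    if demand <= capacity:
        return 1.0  # no overload, keep every transaction
    return capacity / demand


def shed(transactions, p, rng=None):
    """Keep each transaction with probability p, i.e. drop it with probability 1 - p."""
    rng = rng or random.Random()
    return [t for t in transactions if rng.random() < p]


if __name__ == "__main__":
    work = itemsets_per_transaction(size=4, max_len=2)
    p = keep_probability(rate=400, work_per_txn=work, capacity=2000)
    kept = shed([{"a", "b"}] * 1000, p, rng=random.Random(0))
    print(p, len(kept))
```

Dropping whole transactions independently with probability 1 - P keeps the retained sample unbiased, which is what makes an error guarantee on the estimated support possible.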

  10. StreamL • When transactions are dropped in a window {t1, t2, t3, …, ti, tj, …}, the frequency of X becomes inaccurate • Qualify this with an error e • [Figure: landmark window over the stream]

  11. StreamL • Qualify this with an error e, which is the result of dropping transactions with probability 1 - P (< 1) • We can use e to compute a guarantee using the Chernoff bounds, i.e., [equation not preserved in the transcript] • That is, how confident we can be that the true support of X deviates from the estimated support of X by at most ±e
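
To make the kind of guarantee described on this slide concrete, the sketch below uses the additive Chernoff-Hoeffding inequality, Pr(|estimated support - true support| >= e) <= 2·exp(-2·n·e²), where n is the number of transactions kept after shedding. This is a standard textbook form chosen here as an assumption; the exact bound used in the paper is not shown in the transcript and may differ. The function names are hypothetical.

```python
import math


def failure_probability(n_kept, e):
    """Upper bound on Pr(|estimated - true support| >= e) from n_kept sampled transactions.

    Assumed form: additive Chernoff-Hoeffding bound, 2 * exp(-2 * n * e^2).
    """
    return min(1.0, 2.0 * math.exp(-2.0 * n_kept * e * e))


def min_sample_size(e, delta):
    """Smallest n for which the assumed bound gives failure probability <= delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * e * e))


if __name__ == "__main__":
    n = min_sample_size(e=0.01, delta=0.05)
    print(n, failure_probability(n, 0.01))
```

Read in the other direction, the same relation indicates how small P can become before the retained sample is too small to support a chosen error e and confidence 1 - delta.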

  12. Details • We presented a sketch of the idea • See the paper for the landmark-window algorithm • The idea can be extended to other windows; see the technical report for forgetful and sliding windows

  13. Thank You
