CFI-Stream: Mining Closed Frequent Itemsets in Data Streams

CFI-Stream: Mining Closed Frequent Itemsets in Data Streams. Nan Jiang, Le Gruenwald. SIGKDD '06. Presenter: 林靜怡, 2006/10/04.


Presentation Transcript


  1. CFI-Stream: Mining Closed Frequent Itemsets in Data Streams. Nan Jiang, Le Gruenwald. SIGKDD '06. Presenter: 林靜怡, 2006/10/04

  2. Introduction • Mining closed frequent itemsets • Computes and maintains closed itemsets online and incrementally • Performs closure checking • Outputs the current closed frequent itemsets in real time, based on user-specified thresholds

  3. Definition • D: a data stream • I = {i_1, i_2, …, i_n}: a set of n elements, called items • T: a subset of the transactions in the data stream • X: a subset of the items appearing in the data stream

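  4. Definition • C(X): the smallest closed set containing X, where g(X) is the set of transactions that contain X and f(T) is the set of items common to all transactions in T • Definition 1: An itemset X is said to be closed if and only if C(X) = f(g(X)) = (f∘g)(X) = X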
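The closure operator in Definition 1 can be sketched in a few lines of Python, taking g(X) to be the transactions that contain X and f(T) to be the items shared by every transaction in T. The function names, the `Itemset` alias, and the four example transactions (read off the worked example on slide 11) are illustrative assumptions, not code from the paper.

```python
from typing import FrozenSet, List, Set

Itemset = FrozenSet[str]  # hypothetical alias used only in this sketch


def g(x: Itemset, transactions: List[Itemset]) -> Set[int]:
    """Indices of the transactions that contain every item of x."""
    return {i for i, t in enumerate(transactions) if x <= t}


def f(tids: Set[int], transactions: List[Itemset]) -> Itemset:
    """Items common to all transactions whose indices are in tids."""
    if not tids:
        # Assumed convention: an empty family maps to the set of all items.
        return frozenset().union(*transactions)
    return frozenset.intersection(*(transactions[i] for i in tids))


def closure(x: Itemset, transactions: List[Itemset]) -> Itemset:
    """C(X) = f(g(X)): the smallest closed itemset containing x."""
    return f(g(x, transactions), transactions)


def is_closed(x: Itemset, transactions: List[Itemset]) -> bool:
    """X is closed if and only if C(X) = X."""
    return closure(x, transactions) == x


transactions = [frozenset("CD"), frozenset("AB"), frozenset("ABC"), frozenset("ABC")]
print(sorted(closure(frozenset("A"), transactions)))  # ['A', 'B']
print(is_closed(frozenset("AB"), transactions))       # True
print(is_closed(frozenset("D"), transactions))        # False
```

Here {A} is not closed because every transaction containing A also contains B, so its closure is {A, B}.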

  5. Algorithm • CFI-Stream algorithm • DIrect Update (DIU) tree • Performs closure checking online over a sliding window of the data stream • Conditions to check for closed itemsets • Checked when performing addition and deletion operations on the DIU tree

  6. DIU tree • Maintains the current closed itemsets • The DIU tree has k levels; each level i stores the closed i-itemsets

  7. DIU tree • Each node in the DIU tree stores • a closed itemset • its current support information • links to its parent and children nodes
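One way to picture the structure described on slides 6 and 7 is the sketch below. The class and field names are hypothetical, and the linking rule (a parent is a proper subset on the level directly above, a child is a proper superset on the level directly below) is an assumption made for illustration; the slides only say that each node links to its parent and children.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List


@dataclass(eq=False)
class DIUNode:
    itemset: FrozenSet[str]   # the closed itemset stored in this node
    support: int = 0          # its current support count
    parents: List["DIUNode"] = field(default_factory=list, repr=False)
    children: List["DIUNode"] = field(default_factory=list, repr=False)


class DIUTree:
    def __init__(self) -> None:
        # levels[i] maps each closed i-itemset to its node
        self.levels: Dict[int, Dict[FrozenSet[str], DIUNode]] = {}

    def insert(self, itemset: Iterable[str], support: int) -> DIUNode:
        key = frozenset(itemset)
        level = self.levels.setdefault(len(key), {})
        if key in level:                      # already stored: just update support
            level[key].support = support
            return level[key]
        node = level[key] = DIUNode(key, support)
        # Assumed linking rule: connect to subsets one level up ...
        for parent in self.levels.get(len(key) - 1, {}).values():
            if parent.itemset < key:
                parent.children.append(node)
                node.parents.append(parent)
        # ... and to supersets one level down.
        for child in self.levels.get(len(key) + 1, {}).values():
            if key < child.itemset:
                node.children.append(child)
                child.parents.append(node)
        return node


tree = DIUTree()
for items, support in [("C", 3), ("CD", 1), ("AB", 3), ("ABC", 2)]:
    tree.insert(items, support)
print(sorted(len(level) for level in tree.levels.values()))  # [1, 1, 2]
```

Keeping one dictionary per level makes "level i stores the closed i-itemsets" a direct lookup by itemset size.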

  8. Add a Transaction to the DIU Tree • T1: the original transaction set • t: the newly arrived transaction • Conditions to Check for Closed Itemsets (1) When t is in T1: if the largest itemset X it contains is not currently in the DIU tree, check all of X's subsets Y that are in T1

  9. (2) When t is not in T1: for each of its subsets Y, if Y is in T1, we need to check Y
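To see why only subsets of the new transaction need checking, the sketch below maintains the closed itemsets on addition using a standard property of closed sets: after adding t, the only candidates for new or changed closed itemsets are t itself and the intersections of t with already-closed itemsets. This is a swapped-in textbook technique, not the exact conditions (1) and (2) from the slides, and it uses a flat dictionary rather than a DIU tree. All names are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable

Itemset = FrozenSet[str]  # hypothetical alias used only in this sketch


def support_before(y: Itemset, closed: Dict[Itemset, int]) -> int:
    """Support of y before the addition: the largest support among its closed supersets."""
    return max((s for c, s in closed.items() if y <= c), default=0)


def add_transaction(closed: Dict[Itemset, int], t: Iterable[str]) -> Dict[Itemset, int]:
    """Return the closed itemsets (with supports) after adding transaction t."""
    t = frozenset(t)
    # Candidates: t itself plus t intersected with each existing closed itemset.
    candidates = {t} | {t & c for c in closed}
    candidates.discard(frozenset())
    updated = dict(closed)
    for y in candidates:  # every candidate is a subset of t, so its support grows by 1
        updated[y] = support_before(y, closed) + 1
    return updated


closed: Dict[Itemset, int] = {}
for t in ["CD", "AB", "ABC", "ABC"]:
    closed = add_transaction(closed, t)
print({"".join(sorted(k)): v for k, v in sorted(closed.items(), key=lambda kv: sorted(kv[0]))})
```

With the four example transactions, {C} first becomes closed when the third transaction arrives, because it is the intersection of {A, B, C} with the existing closed itemset {C, D}.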

  10. Closure Checking for Addition

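  11. [Figure: worked example of closure checking for addition. Transactions: 1 = {C, D}, 2 = {A, B}, 3 = {A, B, C}, 4 = {A, B, C}. DIU tree nodes shown with support counts: C, CD, AB, ABC.]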

  12. Delete a Transaction in DIU Tree • Conditions to Check for Closed Itemsets • When the number of transactions with the same itemset as X drops to zero, every subset Y of X that is a closed itemset in the original transaction set needs to be checked
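The effect of a deletion can be checked against a brute-force reference that recomputes the closed itemsets of the current window from scratch. This is deliberately not the incremental DIU-tree update; it only shows what the result should be. The function names, the choice to delete the oldest transaction, and the threshold value are assumptions for illustration.

```python
from collections import deque
from itertools import combinations
from typing import Dict, FrozenSet, Iterable

Itemset = FrozenSet[str]  # hypothetical alias used only in this sketch


def closed_itemsets(window: Iterable[Iterable[str]]) -> Dict[Itemset, int]:
    """Brute force: x is closed iff it equals the intersection of the transactions containing it."""
    txs = [frozenset(t) for t in window]
    items = sorted(set().union(*txs)) if txs else []
    closed: Dict[Itemset, int] = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            cover = [t for t in txs if x <= t]
            if cover and frozenset.intersection(*cover) == x:
                closed[x] = len(cover)
    return closed


def frequent(closed: Dict[Itemset, int], min_support: int) -> Dict[Itemset, int]:
    """Keep only closed itemsets that meet the user-specified support threshold."""
    return {x: s for x, s in closed.items() if s >= min_support}


window = deque(["CD", "AB", "ABC", "ABC"])
before = closed_itemsets(window)
window.popleft()                      # the oldest transaction {C, D} leaves the window
after = closed_itemsets(window)
print(sorted("".join(sorted(x)) for x in before))  # ['AB', 'ABC', 'C', 'CD']
print(sorted("".join(sorted(x)) for x in after))   # ['AB', 'ABC']
```

Once no transaction equal to {C, D} remains, its subset {C} must be rechecked: it was closed before the deletion but is not afterwards, since every remaining transaction containing C also contains A and B.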

  13. Closure Checking for Deletion

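  14. [Figure: worked example of closure checking for deletion. Transactions: 1 = {C, D}, 2 = {A, B}, 3 = {A, B, C}, 4 = {A, B, C}. DIU tree nodes shown with support counts: C, CD, AB, ABC.]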

  15. Experiment • Synthetic datasets T10.I6.D100K and T5.I4.D100K

  16. Experiment

  17. Experiment
