
Microarray gene expression data association rules mining based on BSC-tree and FIS-tree

Authors: Xiang-Rong Jiang and Le Gruenwald. Source: Data & Knowledge Engineering, vol. 53, 2005, pp. 3-29. Speaker: Shu-Fen Chiou (邱淑芬). Date: 2005/1/20.


Presentation Transcript


  1. Microarray gene expression data association rules mining based on BSC-tree and FIS-tree • Authors: Xiang-Rong Jiang and Le Gruenwald • Source: Data & Knowledge Engineering, vol. 53, 2005, pp. 3-29 • Speaker: Shu-Fen Chiou (邱淑芬) • Date: 2005/1/20

  2. Outline • Introduction • Proposed method • Experimental results • Conclusions • Comment

  3. Introduction • Association rules are used to mine the association relationships among different genes. • Three characteristics must be considered: • The large search space • Uninteresting genes • Data normalization

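  4. Association rule • Itemset: a set of items (products) • Large itemset (frequent itemset): a set of items that are frequently purchased together • Minimum support: the minimum support threshold • Minimum confidence: the minimum confidence threshold • Association rule: a customer who buys X is very likely to also buy Y • Association rules can be derived from the large itemsets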

  5. Association rule (cont'd)

  6. Ex: minsup = 20% • sup{1} = 6/10 = 60% • sup{1,2} = 4/10 = 40% • sup{1,2,3} = 2/10 = 20% • The itemsets above are large itemsets • sup{3,5} = 1/10 = 10% • sup{1,3,5} = 1/10 = 10% • The itemsets above are not large itemsets
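The support check on this slide can be sketched in a few lines of Python. The transaction table behind the example did not survive in the transcript, so the `transactions` list below is a hypothetical database built only to reproduce the quoted support values; the function names `support` and `is_large` are likewise illustrative.

```python
# Hypothetical 10-transaction database, constructed only so that it reproduces
# the support values quoted on the slide; it is NOT the original table.
transactions = [
    {1, 2, 3}, {1, 2, 3}, {1, 2}, {1, 2}, {1, 3, 5},
    {1}, {2, 4}, {4}, {2}, {5},
]


def support(itemset, db):
    """Fraction of transactions in db that contain every item of itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in db) / len(db)


def is_large(itemset, db, minsup):
    """An itemset is large (frequent) when its support reaches minsup."""
    return support(itemset, db) >= minsup


minsup = 0.20
for items in [{1}, {1, 2}, {1, 2, 3}, {3, 5}, {1, 3, 5}]:
    print(sorted(items), support(items, transactions),
          is_large(items, transactions, minsup))
```

Note that the comparison is `>=`, so an itemset whose support equals the threshold exactly, such as {1,2,3}, still counts as large.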

  7. Generating association rules: minconf = 50% • {1} => {2}: rule holds; sup(1) = 60%, sup(1,2) = 40%, conf = 66.7% • {1} => {2,3}: rule does not hold; sup(1) = 60%, sup(1,2,3) = 20%, conf = 33.3% • {1,2} => {3}: rule holds; sup(1,2) = 40%, sup(1,2,3) = 20%, conf = 50%
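As a quick check of the three rules, the sketch below recomputes each confidence as sup(X and Y together) / sup(X), using only the support values quoted on the slides. The helper names `confidence` and `rule_holds` are illustrative, and exact fractions are used to avoid rounding surprises at the 50% boundary.

```python
from fractions import Fraction

# Support values quoted on the slides (out of 10 transactions).
sup = {
    frozenset({1}): Fraction(6, 10),
    frozenset({1, 2}): Fraction(4, 10),
    frozenset({1, 2, 3}): Fraction(2, 10),
}


def confidence(antecedent, consequent):
    """conf(X => Y) = sup(X union Y) / sup(X)."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    return sup[xy] / sup[x]


def rule_holds(antecedent, consequent, minconf):
    """A rule holds when its confidence reaches the minimum confidence."""
    return confidence(antecedent, consequent) >= minconf


minconf = Fraction(1, 2)
for x, y in [({1}, {2}), ({1}, {2, 3}), ({1, 2}, {3})]:
    c = confidence(x, y)
    print(sorted(x), "=>", sorted(y), f"{float(c):.1%}", rule_holds(x, y, minconf))
```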

  8. Proposed method • n: fraction bits, m: exponent bits • G1's bit string: 111011 (this example uses only one bit per value) • A gene whose value is greater than some standard point of comparison (increasing) is encoded as 1 • Otherwise (zero or decreasing) it is encoded as 0
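The one-bit encoding can be illustrated as follows. The expression values and the reference point below are hypothetical, chosen only so that the result matches G1's bit string from the slide; `to_bit_string` is an illustrative name, not one taken from the paper.

```python
def to_bit_string(values, reference):
    """Encode each value as '1' if it exceeds the reference point, else '0'."""
    return "".join("1" if v > reference else "0" for v in values)


# Hypothetical expression values for gene G1 across six samples.
g1_values = [2.1, 1.8, 1.5, 0.7, 1.3, 2.4]
reference = 1.0  # hypothetical standard point of comparison

print(to_bit_string(g1_values, reference))  # 111011
```

A value equal to the reference point falls into the "otherwise" case and is encoded as 0.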

  9. Proposed method • BSC-tree: each node stores • node-level: initial = 1 • bit-type: the bit type of the node (1: all bits are 1; 0: all bits are 0; m: the bits are a mixture of 1 and 0) • 1-bit-count: the number of 1 bits in the node • Example: G1: 111011, so G1's 1-bit count at the root (root count) = 5 • Node labels in the figure: 1|1|1, 1|1|1, 1|1|1, 1|1|1
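The slide lists the node fields but the tree figure is lost, so the exact splitting scheme is not recoverable from the transcript. The sketch below assumes a simple recursive halving of the bit string in which a pure segment (all 1s or all 0s) becomes a leaf; this is an assumption for illustration and may differ from the construction in the paper. The names `Node` and `build` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    level: int       # node-level; the root starts at 1
    bit_type: str    # "1" (all ones), "0" (all zeros) or "m" (mixture)
    one_count: int   # 1-bit-count of the segment this node covers
    children: list = field(default_factory=list)


def build(bits, level=1):
    """Build a tree over a bit string by halving mixed segments (assumed scheme)."""
    if not bits:
        raise ValueError("bit string must not be empty")
    ones = bits.count("1")
    if ones == len(bits):
        return Node(level, "1", ones)   # pure-1 segment: stored as a leaf
    if ones == 0:
        return Node(level, "0", 0)      # pure-0 segment: stored as a leaf
    mid = (len(bits) + 1) // 2          # a mixed segment has length >= 2
    kids = [build(bits[:mid], level + 1), build(bits[mid:], level + 1)]
    return Node(level, "m", ones, kids)


root = build("111011")                  # G1's bit string from the slide
print(root.bit_type, root.one_count)    # m 5
```

Because pure segments are not expanded further, long runs of identical bits collapse into a single node, which is where the space saving of such a compressed tree would come from.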

  10. Proposed method • BSC-tree ANDing algorithm: getting the path code from the BSC-tree (figure annotation: bit-type = 1)

  11. Proposed method • BSC-tree ANDing algorithm: find the subcode • The 1-bit count at the root node (root count) of the ANDing BSC-tree representing the 2-itemset is 1 + 2 = 3
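The tree-based ANDing walks path codes and subcodes, which cannot be reconstructed from the transcript. As a flat reference point, the root count of an ANDed tree should equal the number of positions where both underlying bit strings hold a 1, which the sketch below computes directly. G1's bit string comes from the slides; G2's is hypothetical, chosen to have three 1 bits and to give a root count of 3.

```python
def and_bits(a, b):
    """Bitwise AND of two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return "".join("1" if x == y == "1" else "0" for x, y in zip(a, b))


def root_count(bits):
    """Number of 1 bits, i.e. the count the root node would store."""
    return bits.count("1")


g1 = "111011"  # from the slides
g2 = "110010"  # hypothetical bit string for G2

g1g2 = and_bits(g1, g2)
print(g1g2, root_count(g1g2))  # 110010 3
```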

  12. Proposed method • FIS-tree • Level 1: • Suppose the minimum support (minSup) = 50% • G1: root count = 5, support = 5 / 6 = 83.3% > minSup • G2, G4, G7 and G8: root count = 3, support = 3 / 6 = 50% = minSup • G3: root count = 0, support = 0 < minSup • G5, G6: root count = 2, support = 2 / 6 = 33.3% < minSup • G1, G2, G4, G7 and G8 are the frequent 1-itemsets
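The level-1 filter amounts to dividing each root count by the number of samples and keeping the genes that reach minSup. A minimal sketch using the root counts from the slide (the variable names are illustrative):

```python
# Root counts of the eight genes, as given on the slide (6 samples each).
root_counts = {"G1": 5, "G2": 3, "G3": 0, "G4": 3,
               "G5": 2, "G6": 2, "G7": 3, "G8": 3}
n_samples = 6
min_sup = 0.5

# A gene is a frequent 1-itemset when root_count / n_samples >= min_sup.
frequent_1 = [g for g, c in root_counts.items() if c / n_samples >= min_sup]
print(frequent_1)  # ['G1', 'G2', 'G4', 'G7', 'G8']
```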

  13. Proposed method • FIS-tree • Level 2: build frequent 2-itemsets • Suppose minSup = 50% • Combine the level-1 nodes pairwise, such as G1G2, G1G4,…, and get the ANDing BSC-tree of each pair • The root counts of the ANDing BSC-trees of G1G2, G1G4 and G1G7 are 3, support = minSup, so they are frequent 2-itemsets • The root count of G1G8's ANDing BSC-tree is 2, support < minSup • G1G2, G1G4, G1G7, G2G8 and G4G7 are frequent 2-itemsets
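The pairwise combination step can be sketched with plain bit strings standing in for BSC-trees. Only G1's bit string appears in the slides; the strings for G2, G4 and G8 below are hypothetical and cover just a subset of the genes, so the output illustrates the procedure rather than reproducing the slide's full list. `frequent_pairs` is an illustrative name.

```python
from itertools import combinations

# G1 is taken from the slides; the other bit strings are hypothetical.
bit_strings = {"G1": "111011", "G2": "110010", "G4": "101001", "G8": "100110"}


def frequent_pairs(bit_strings, min_sup):
    """Return {pair name: root count} for every frequent 2-itemset."""
    n = len(next(iter(bit_strings.values())))  # number of samples
    result = {}
    for a, b in combinations(sorted(bit_strings), 2):
        # ANDing the two bit vectors and counting the surviving 1 bits.
        count = bin(int(bit_strings[a], 2) & int(bit_strings[b], 2)).count("1")
        if count / n >= min_sup:
            result[a + b] = count
    return result


print(frequent_pairs(bit_strings, 0.5))  # {'G1G2': 3, 'G1G4': 3}
```

With these made-up strings G1G8 has a root count of 2 and is pruned, mirroring the G1G8 case on the slide.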

  14. Proposed method • Deriving association rules from a FIS-tree • G1G2 is a frequent 2-itemset in the FIS-tree. • Suppose minSup = 50% and the user-defined minimum confidence minConf = 50% • For G1G2, support = 3 / 6 = 50%; for G1, support = 5 / 6 = 83.3% • Confidence = support(G1G2) / support(G1) = 3 / 5 = 60% > minConf • => the rule G1 => G2 holds
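Written out under the standard definition of confidence (assumed here, since the formula itself is missing from the transcript), the number of samples cancels, so the confidence reduces to a ratio of root counts:

```latex
\mathrm{conf}(G1 \Rightarrow G2)
  = \frac{\mathrm{sup}(G1G2)}{\mathrm{sup}(G1)}
  = \frac{3/6}{5/6}
  = \frac{3}{5}
  = 60\% \;>\; \mathrm{minConf} = 50\%
```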

  15. Experimental results

  16. Experimental results

  17. Experimental results

  18. Conclusions • The paper proposes a new association rule mining algorithm. • The BSC-tree and FIS-tree are compressed trees, so they save space. • The FIS-tree mining algorithm performs better than other methods.

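  19. Comments • A simple concept is used to create an efficient method. • The degree of the gene expression response could be considered in order to find the more important genes, rather than using only 0 to represent unchanged or decreased expression and 1 to represent increased expression. • Scale shown on the slide: 3 (high), 2, 1 (low), 0 (no change)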
