Binary Clustering with Missing Values (BCMV)

  1. Binary Clustering with Missing Values (BCMV). Authors: A. Figueroa, J. Borneman, T. Jiang. Presenter: 劉效飛

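  3. Problem Formulation 1. When are two vectors similar? Def: two vectors are similar if no position is contradictory, i.e. there is no position where one vector has 0 and the other has 1 (N marks a missing value and is compatible with both). Think of each vector as a fingerprint sample: two similar samples could have been left by the same criminal. Ex: (1 0 N N 1), (1 N 1 0 1) are similar ∵ (1 0 1 0 1) could have left both fingerprints. (1 0 N N 1), (0 N 1 1 1) are not similar ∵ nobody could have left both fingerprints.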
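The similarity test can be sketched in a few lines of Python. This is only an illustration: the function name `similar` and the string encoding (one character per position, with "N" for a missing value) are assumptions made here, not something the slides specify.

```python
def similar(u, v):
    """Return True if u and v never put a 0 against a 1 at the same position.

    Assumes u and v are equal-length strings over the characters 0, 1, N.
    """
    return all(a == b or "N" in (a, b) for a, b in zip(u, v))


# The two examples from the slide.
print(similar("10NN1", "1N101"))  # True
print(similar("10NN1", "0N111"))  # False (conflict at the first position)
```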

  4. 2. A set of pairwise similar vectors is called a group from here on. 3. BCMV Def: partition the vectors into the minimum number of groups.

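  5. Topic 1: NP-complete? (BCMV(k) denotes BCMV restricted to vectors with at most k missing values N.) • Theorem 1: BCMV(1) is in P • Theorem 2: BCMV(k) is NP-complete for all k >= 3, but the complexity of BCMV(2) is still unknown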

  6. Proof of Theorem 2. Def 4: Minimum Clique Partition (MCP): partition the vertices of a graph into the minimum number of cliques. Lemma: MCP is NP-complete on graphs of bounded degree k, k >= 3. Claim: MCP(k) can be reduced to BCMV(k), k >= 3.

  7. Ex: a graph on nodes 1 to 6, with one vector per node:
     1: 1 N 0 0 0 N
     2: N 1 N 0 N 0
     3: 0 N 1 0 0 N
     4: 0 0 0 1 N 0
     5: 0 N 0 N 1 N
     6: N 0 N 0 N 1
     Reason: two vectors are similar iff the corresponding nodes are adjacent.
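The construction behind this example can be sketched as follows. The rule used here (1 at the node's own position, N at each neighbour, 0 elsewhere) is read off from the six vectors on the slide, and the edge list is likewise inferred from their N entries because the graph drawing itself is not part of the transcript. The names `graph_to_vectors` and `similar` are illustrative.

```python
from itertools import combinations


def graph_to_vectors(n, edges):
    """Map node i to a vector with 1 at position i, N at neighbours, 0 elsewhere."""
    adj = {i: set() for i in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {
        i: "".join(
            "1" if j == i else "N" if j in adj[i] else "0"
            for j in range(1, n + 1)
        )
        for i in range(1, n + 1)
    }


def similar(u, v):
    """True if u and v never put a 0 against a 1 at the same position."""
    return all(a == b or "N" in (a, b) for a, b in zip(u, v))


# Edge list inferred from the N entries on the slide (an assumption).
edges = [(1, 2), (1, 6), (2, 3), (2, 5), (3, 6), (4, 5), (5, 6)]
vectors = graph_to_vectors(6, edges)
edge_set = {frozenset(e) for e in edges}

# Check the slide's claim: similar vectors <=> adjacent nodes.
for i, j in combinations(range(1, 7), 2):
    assert similar(vectors[i], vectors[j]) == (frozenset((i, j)) in edge_set)
```

Every node of degree d yields a vector with exactly d entries N, which is why a degree bound k on the graph carries over to BCMV(k).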

  8. Proof of Theorem 1. Lemma: Vertex Cover on bipartite graphs is in P. Claim: BCMV(1) can be reduced to Vertex Cover on bipartite graphs.

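  9. Another interpretation of BCMV. Input: each 0/N/1 vector is the fingerprint sample of some criminal. Goal: how can these criminals hire the fewest scapegoats to take the blame for all of them? Ex: police records
     A. 0 0 1 1 N
     B. 1 N 1 N 1
     C. 0 0 1 N 1
     D. N 1 N 1 1
     E. N N 1 1 1
     The cheapest way needs only two scapegoats:
     Scapegoat H1: 0 0 1 1 1 can take the blame for criminals A, C, E
     Scapegoat H2: 1 1 1 1 1 can take the blame for criminals B, D, E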

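  10. The criminals can be covered by k scapegoats ⇔ the vectors can be partitioned into k groups. Pf: (=>) A. 0 0 1 1 N B. 1 N 1 N 1 C. 0 0 1 N 1 D. N 1 N 1 1 E. N N 1 1 1. Scapegoat H1: 0 0 1 1 1 can take the blame for {A, C, E}, so {A, C, E} and {B, D, E} must each be a group; assigning every vector to just one of the scapegoats that cover it turns these into a partition. (<=) ∵ for every group there exists at least one scapegoat who can take the blame for the whole group.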
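The (<=) direction can be made concrete with a small sketch: merging a group column by column yields a scapegoat for it. The helper names `merge_group` and `covers` are made up for this illustration, and filling an all-N column with 0 is an arbitrary choice (either bit would do).

```python
def merge_group(group):
    """Return a fully specified 0/1 vector compatible with every vector in
    the group, or None if two vectors conflict at some position."""
    merged = []
    for column in zip(*group):
        bits = set(column) - {"N"}
        if len(bits) > 1:      # a 0 and a 1 in the same column: not a group
            return None
        merged.append(bits.pop() if bits else "0")  # arbitrary fill for all-N
    return "".join(merged)


def covers(scapegoat, pattern):
    """True if the fully specified scapegoat matches every non-N entry."""
    return all(q == "N" or p == q for p, q in zip(scapegoat, pattern))


records = {"A": "0011N", "B": "1N1N1", "C": "001N1", "D": "N1N11", "E": "NN111"}

h1 = merge_group([records[x] for x in "ACE"])
h2 = merge_group([records[x] for x in "BD"])
print(h1, h2)  # 00111 11111
```

Because a conflict is always a 0 against a 1 in a single column, a set of vectors is pairwise similar exactly when this column-wise merge succeeds.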

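  11. Reduction method (from the criminals' point of view): • List every scapegoat that could be used to take the blame • Represent each scapegoat by a node; put scapegoats with an odd number of 1s on one side and those with an even number of 1s on the other • List all criminals and represent each criminal by an edge • If scapegoat H1 can take the blame for criminal A, then attach node H1 to edge A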

  12. Ex:
      Step 1:
      A. 0 1 N 1 0 => H1. 0 1 1 1 0, H2. 0 1 0 1 0
      B. N 1 1 0 0 => H3. 1 1 1 0 0, H4. 0 1 1 0 0
      C. 0 1 1 N 0 => H1, H4
      D. 0 1 N 0 0 => H4, H5. 0 1 0 0 0
      Step 2: odd side: H1, H3, H5; even side: H2, H4
      Steps 3 and 4: edges A = (H1, H2), B = (H3, H4), C = (H1, H4), D = (H4, H5)
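The four steps can be sketched on this example as follows. The sketch assumes every vector has exactly one N (the slides do not say how vectors without any N are handled), and it finds the minimum vertex cover by brute force only to keep the code short; that is a stand-in for the polynomial-time matching-based method that the lemma on bipartite graphs refers to. All function names are illustrative.

```python
from itertools import combinations


def candidates(pattern):
    """The two scapegoats obtained by filling the single N with 0 or 1."""
    assert pattern.count("N") == 1, "sketch assumes exactly one N per vector"
    return pattern.replace("N", "0"), pattern.replace("N", "1")


criminals = {"A": "01N10", "B": "N1100", "C": "011N0", "D": "01N00"}

# Steps 1, 3, 4: each criminal becomes an edge between its two scapegoats.
edges = {name: candidates(p) for name, p in criminals.items()}

# Step 2: the endpoints of an edge differ in one bit, hence in parity,
# so every edge runs between the odd side and the even side.
for u, v in edges.values():
    assert u.count("1") % 2 != v.count("1") % 2

nodes = sorted({x for pair in edges.values() for x in pair})


def min_vertex_cover(nodes, edges):
    """Smallest node set touching every edge (brute force, tiny inputs only)."""
    for size in range(len(nodes) + 1):
        for chosen in combinations(nodes, size):
            if all(u in chosen or v in chosen for u, v in edges.values()):
                return set(chosen)


cover = min_vertex_cover(nodes, edges)
print(len(cover))  # 2
```

Each node in the cover is a scapegoat, so a cover of size 2 means the four vectors can be split into two groups.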

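  13. Why is the result always a bipartite graph? ∵ Two scapegoats on the same side always differ in at least two bits. Ex: suppose some criminal E were an edge between H1 and H3, both on the odd side. Scapegoat H1: 0 1 1 1 0, scapegoat H3: 1 1 1 0 0. Criminal E would then need an N at both positions where they differ (N … N …), which contradicts having at most one N.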

  14. Topic 2: Approximation • Def: minimum set covering problem. Input: a collection C of subsets of a finite set S. Output: choose the fewest sets from C that cover every element of S

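  15. 1. If each element appears in at most k sets, then the problem is approximable within k. Pf idea: ∵ each wrongly chosen set increases the final size of the set cover by at most one ⇒ it suffices to guarantee that among every k sets chosen, at least one is correct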
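One standard way to realize this idea, shown here as a sketch and not necessarily the exact procedure the presenter had in mind, is to pick any uncovered element and take all the sets that contain it. At most k sets are added per round, and at least one of them belongs to any optimal cover, since the optimal cover has to cover that element too. The function name and the toy instance below are illustrative.

```python
def frequency_cover(universe, sets):
    """Set cover within factor k, where k is the maximum number of sets
    that any single element belongs to."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        element = min(uncovered)  # any uncovered element works
        picked = [name for name, members in sets.items() if element in members]
        if not picked:
            raise ValueError(f"element {element!r} is in no set")
        chosen.extend(picked)     # none of these was chosen before
        for name in picked:
            uncovered -= sets[name]
    return chosen


# Toy instance (hypothetical): elements are criminals, sets are scapegoats.
universe = {"A", "B", "C", "D", "E"}
sets = {
    "00111": {"A", "C", "E"},
    "11111": {"B", "D", "E"},
    "00110": {"A"},
}
chosen = frequency_cover(universe, sets)
print(chosen)  # ['00111', '00110', '11111']
```

Here every element lies in at most k = 2 sets, and the result uses 3 sets against an optimum of 2, within the factor of 2.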

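  16. Observation: treat each scapegoat as a set; the problem then becomes minimum set covering.
      Criminal fingerprint    Scapegoat    Set
      A: 1 N 1 1 0            1 1 1 1 0    {A, B}
                              1 0 1 1 0    {A}
      B: N 1 1 1 N            1 1 1 1 1    {B}
                              0 1 1 1 1    {B}
                              0 1 1 1 0    {B}
      C: 1 0 1 N 1            1 0 1 1 1    {C}
      ⇒ An approximation ratio of 2^p is always achievable, since a vector with at most p missing values N belongs to at most 2^p sets.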
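The translation into a set covering instance can be sketched by enumerating every way of filling the N entries. The sketch below uses the five police records from the earlier scapegoat example; `completions` and `build_sets` are names chosen for this illustration.

```python
from itertools import product


def completions(pattern):
    """All fully specified 0/1 vectors compatible with the pattern."""
    slots = [("0", "1") if ch == "N" else (ch,) for ch in pattern]
    return ["".join(bits) for bits in product(*slots)]


def build_sets(records):
    """Map each possible scapegoat to the set of criminals it can cover."""
    sets = {}
    for name, pattern in records.items():
        for scapegoat in completions(pattern):
            sets.setdefault(scapegoat, set()).add(name)
    return sets


records = {"A": "0011N", "B": "1N1N1", "C": "001N1", "D": "N1N11", "E": "NN111"}
sets = build_sets(records)

print(sorted(sets["00111"]))  # ['A', 'C', 'E']
print(sorted(sets["11111"]))  # ['B', 'D', 'E']
```

A vector with p entries N has exactly 2^p completions, so it appears in 2^p of the resulting sets.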

  17. Algorithm 1. Repeat: if some element f belongs to only one set V, add V to the set cover; else go to 2. 2. Choose the set that contains the most elements not yet covered, go to 1. Notes: 1. Approximable within 1 + ln|S| 2. Not approximable within (1 - ε) ln|S| for any ε > 0, unless NP
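A minimal sketch of these two steps follows. It reads "belongs to only one set" as "only one set in the input contains the element" and counts only uncovered elements in step 2; both readings are assumptions, as are the function name and the toy instance.

```python
def greedy_cover(universe, sets):
    """Greedy set cover with the forced-choice rule applied first."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        pick = None
        # Step 1: an uncovered element contained in a single set forces that set.
        for element in sorted(uncovered):
            owners = [name for name, members in sets.items() if element in members]
            if len(owners) == 1:
                pick = owners[0]
                break
        # Step 2: otherwise take the set covering the most uncovered elements.
        if pick is None:
            pick = max(sets, key=lambda name: len(sets[name] & uncovered))
        if not sets[pick] & uncovered:
            raise ValueError("some element is not contained in any set")
        chosen.append(pick)
        uncovered -= sets[pick]
    return chosen


# Toy instance (hypothetical): elements are criminals, sets are scapegoats.
universe = {"A", "B", "C", "D", "E"}
sets = {
    "00111": {"A", "C", "E"},
    "11111": {"B", "D", "E"},
    "00110": {"A"},
    "10101": {"B"},
}
chosen = greedy_cover(universe, sets)
print(chosen)  # ['00111', '11111']
```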

  18. Conclusions • 1. Is BCMV(2) NP-complete? • 2. Is a better approximation ratio possible?
