Positional Association Rules

Positional Association Rules. Dr. Bernard Chen Ph.D. University of Central Arkansas. Central Dogma of Molecular Biology. Amino Acids, the subunits of proteins. Protein Primary, Secondary, and Tertiary Structure. Protein 3D Structure. Protein Sequence Motif.

Presentation Transcript


  1. Positional Association Rules Dr. Bernard Chen Ph.D. University of Central Arkansas

  2. Central Dogma of Molecular Biology

  3. Amino Acids, the subunits of proteins

  4. Protein Primary, Secondary, and Tertiary Structure

  5. Protein 3D Structure

  6. Protein Sequence Motif • Although there are 20 amino acids, a protein's primary structure is not built by choosing among those amino acids at random • Sequence Motif: one of a relatively small number of functionally or structurally conserved sequence patterns that occur repeatedly in a group of related proteins.

  7. Protein Sequence Motif These biologically significant regions or residues are usually: • Enzyme catalytic sites • Prosthetic group attachment sites (heme, pyridoxal-phosphate, biotin…) • Amino acids involved in binding a metal ion • Cysteines involved in disulfide bonds • Regions involved in binding a molecule (ATP/ADP, GDP/GTP, Ca, DNA…)

  8. HSSP-BLOSUM62 Measure

  9. Future Works

  10. Motivation • To obtain DNA/protein sequence motif information, it is usually necessary to fix the length of the sequence segments. • Because of this fixed size, motif discovery may deliver a number of similar motifs that are simply shifted by several bases or that include mismatches.

  11. Example • If there exists a biological sequence motif of length 12 and we set the window size to 9, it is highly likely that we will discover two similar sequence motifs: one covering the front part of the biological sequence motif and the other covering the rear part.
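The window arithmetic in this example can be checked with a short sketch. The helper name `windows` and the 0-based, end-exclusive indexing are illustrative assumptions, not part of the presentation.

```python
def windows(length, size):
    """Return (start, end) pairs, end exclusive, for every window of
    `size` positions that fits inside a span of `length` positions."""
    return [(start, start + size) for start in range(length - size + 1)]


motif_length, window_size = 12, 9
spans = windows(motif_length, window_size)
print(spans)  # [(0, 9), (1, 10), (2, 11), (3, 12)]

# The first window covers the front of the motif, the last one the rear.
front, rear = spans[0], spans[-1]
overlap = front[1] - rear[0]  # positions shared by both windows
print(overlap)  # 6
```

No single window of size 9 can cover all 12 positions, so a fixed-window method can at best report the front and rear fragments as two separate, overlapping motifs.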

  12. Positional Association Rules • The basic association rule gives information of the form A => B. • However, when the order in which items appear matters, the basic association rule is not powerful enough. • We introduce another parameter, called “distance assurance”, to help identify frequent itemsets that also occur at a frequent distance.

  13. Positional Association Rules

  14. Pseudocode of Positional Association Rule with the Apriori concept

      Algorithm: Positional Association Rule with the Apriori Concept
      Input: Database D (protein sequences as transactions and sequence motifs as items), min_support, min_confidence, and min_distance_assurance
      Output: P, the positional association rules in D
      Method:
      (1)  L = find_frequent_itemsets(D, min_support)
      (2)  S = find_strong_association_rules(L, min_confidence)
      (3)  for (k = 2; S_k ≠ Ø; k++)
      (4)    for each strong association rule r ∈ S_k
      (5)      antecedent_motif = Apriori_Motif_Construct(r_ant)
      (6)      consequence_motif = Apriori_Motif_Construct(r_con)
      (7)      if antecedent_motif == NULL or consequence_motif == NULL: goto Step (4)
      (8)      for each protein sequence ps ∈ D
      (9)        for (ant_position = 1; ant_position ≤ |ps|; ant_position++)
      (10)         if antecedent_motif starts at ps[ant_position]:
      (11)           r_ant_count++
      (12)           for (con_position = 1; con_position ≤ |ps|; con_position++)
      (13)             if consequence_motif starts at ps[con_position]:
      (14)               distance = ant_position - con_position
      (15)               r_distance++
      (16)     P_k = { r_distance | r_distance > min_distance_assurance * r_ant_count }

      Apriori_Motif_Construct(itemset)
        if |itemset| == 1: return itemset
        else:
          for each positional association rule in P_|itemset|
            if all items in the itemset appear in the positional association rule:
              return the new motif constructed by the positional association rule
        return NULL
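The distance-counting core of the pseudocode can be sketched in Python as below. This is a simplified illustration, not the author's implementation: it handles one rule at a time, treats each motif as a plain substring, uses 0-based positions, and leaves out `Apriori_Motif_Construct`. The names `find_starts` and `positional_rules` and the toy sequences are hypothetical.

```python
from collections import Counter


def find_starts(seq, motif):
    """All 0-based positions at which `motif` starts in `seq`."""
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq.startswith(motif, i)]


def positional_rules(sequences, antecedent, consequent, min_distance_assurance):
    """Map each distance (ant_position - con_position) to its distance
    assurance, keeping only distances above the threshold."""
    ant_count = 0
    distance_counts = Counter()
    for seq in sequences:
        con_starts = find_starts(seq, consequent)
        for ant_pos in find_starts(seq, antecedent):
            ant_count += 1
            for con_pos in con_starts:
                # Same sign convention as the pseudocode.
                distance_counts[ant_pos - con_pos] += 1
    return {d: n / ant_count for d, n in distance_counts.items()
            if n > min_distance_assurance * ant_count}


# Hypothetical toy sequences, not taken from the presentation.
toy_sequences = ["AXXB", "CAXXB", "AXXBX", "AXB"]
print(positional_rules(toy_sequences, "A", "B", 0.6))  # {-3: 0.75}
```

In the toy data, B starts three positions after A in three of the four occurrences of A, so only the distance -3 exceeds the 60% distance assurance.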

  15. Positional Association Rules Example

  16. Positional Association Rules Example • minimum support = 60%, • minimum confidence = 80%, • minimum distance assurance = 60%

  17. Scan for C1 • A: 3/5, B: 5/5, C: 2/5, D: 4/5, E: 1/5 • Items that pass minimum support: A, B, D • Candidates for the next scan: AB, AD, BD

  18. Scan for C2 • AB: 3/5, AD: 3/5, BD: 4/5 • Itemsets that pass minimum support: AB, AD, BD • Candidate for the next scan: ABD

  19. Scan for C3 • ABD: 3/5 • Itemset that passes minimum support: ABD • No C4 can be generated

  20. Therefore, the itemsets that pass minimum support are {AB, AD, BD, ABD} • Next, we need to compute their confidence
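The three scans above follow the usual Apriori level-wise search, which can be sketched as follows. The transaction table from the example slide is not reproduced in this transcript, so the five transactions below are hypothetical, chosen only so that they yield the same support counts as slides 17 to 19. The function names are also illustrative.

```python
from itertools import combinations

# Hypothetical transactions that reproduce the supports on slides 17-19.
transactions = [set("ABD"), set("ABD"), set("ABCD"), set("BCD"), set("BE")]


def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def apriori(transactions, min_support):
    """Return all frequent itemsets, found level by level."""
    items = sorted(set().union(*transactions))
    level = [frozenset([i]) for i in items
             if support({i}, transactions) >= min_support]
    frequent = []
    k = 1
    while level:
        frequent.extend(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        previous = set(level)
        level = [c for c in candidates
                 if all(frozenset(s) in previous
                        for s in combinations(c, k - 1))
                 and support(c, transactions) >= min_support]
    return frequent


found = sorted("".join(sorted(s)) for s in apriori(transactions, 0.6))
print(found)  # ['A', 'AB', 'ABD', 'AD', 'B', 'BD', 'D']
```

The itemsets with two or more items in the output are AB, AD, BD and ABD, matching the result of the scans.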

  21. First, we work on the 2-itemsets {AB, AD, BD} • A=>B: 3/3 • B=>A: 3/5 • A=>D: 3/3 • D=>A: 3/4 • B=>D: 4/5 • D=>B: 4/4

  22. Then, we work on the 3-itemset {ABD} • A=>BD: 3/3 • B=>AD: 3/5 • D=>AB: 3/4 • AB=>D: 3/3 • AD=>B: 3/3 • BD=>A: 3/4

  23. Thus, the strong association rules we have are: • 2-itemset: A=>B, A=>D, B=>D, D=>B • 3-itemset: A=>BD, AB=>D, AD=>B • Next, we work on Positional Association rules…
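The confidence step can be reproduced directly from the support counts reported in the scans, since confidence(X=>Y) is the count of X∪Y divided by the count of X. The sketch below is illustrative: the function name `strong_rules` and the string encoding of itemsets are assumptions.

```python
from itertools import combinations

# Support counts (out of 5 transactions) taken from the C1, C2 and C3 scans.
counts = {"A": 3, "B": 5, "D": 4, "AB": 3, "AD": 3, "BD": 4, "ABD": 3}


def strong_rules(counts, min_confidence):
    """List rules X=>Y whose confidence count(X ∪ Y) / count(X)
    reaches min_confidence."""
    rules = []
    for itemset, n in counts.items():
        # Every non-empty proper subset of the itemset is a possible antecedent.
        for size in range(1, len(itemset)):
            for subset in combinations(itemset, size):
                antecedent = "".join(subset)
                consequent = "".join(i for i in itemset if i not in antecedent)
                if n / counts[antecedent] >= min_confidence:
                    rules.append(f"{antecedent}=>{consequent}")
    return rules


print(strong_rules(counts, 0.8))
# ['A=>B', 'A=>D', 'B=>D', 'D=>B', 'A=>BD', 'AB=>D', 'AD=>B']
```

B=>D passes with a confidence of exactly 4/5 = 80%, so the threshold is applied here as "greater than or equal to".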

  24. Positional Association Rules: D=>B (minimum distance assurance = 60%) • Pattern 1: 3/4 • Pattern 2: 1/4 • Pattern 3: 1/4 • Only pattern 1 (75%) exceeds the threshold

  25. Positional Association Rules: B=>D (minimum distance assurance = 60%) • Pattern 1: 3/6 • Pattern 2: 1/6 • Pattern 3: 1/6 • No pattern exceeds the threshold

  26. Positional Association Rules: A=>B (minimum distance assurance = 60%) • Pattern 1: 2/4 • Pattern 2: 1/4 • Pattern 3: 1/4 • Pattern 4: 1/4 • No pattern exceeds the threshold

  27. Positional Association Rules: A=>D (minimum distance assurance = 60%) • Pattern 1: 3/4 • Pattern 2: 1/4 • Only pattern 1 (75%) exceeds the threshold

  28. Positional Association Rules: AD=>B (minimum distance assurance = 60%) • Pattern 1: 2/3 • Pattern 2: 1/3 • Only pattern 1 (about 67%) exceeds the threshold

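  29. Positional Association Rules: AB=>D (minimum distance assurance = 60%) • There is no positional association rule on AB (the best A=>B pattern reached only 2/4), so no motif can be constructed for the antecedent and this rule is skipped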

  30. Positional Association Rules: A=>BD (minimum distance assurance = 60%) • Pattern 1: 2/4 • Pattern 2: 1/4 • No pattern exceeds the threshold
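The threshold checks on the last seven slides can be summarized with exact fractions. The distance-assurance values below are copied from the slides. The dictionary layout, the variable names and the use of a strict "greater than" comparison (as written in the pseudocode) are assumptions of this sketch.

```python
from fractions import Fraction as F

# Distance assurance of each candidate pattern, as listed on the slides.
assurances = {
    "D=>B": [F(3, 4), F(1, 4), F(1, 4)],
    "B=>D": [F(3, 6), F(1, 6), F(1, 6)],
    "A=>B": [F(2, 4), F(1, 4), F(1, 4), F(1, 4)],
    "A=>D": [F(3, 4), F(1, 4)],
    "AD=>B": [F(2, 3), F(1, 3)],
    "A=>BD": [F(2, 4), F(1, 4)],
}
min_distance_assurance = F(60, 100)

# Keep, per rule, the 1-based numbers of the patterns above the threshold.
passing = {}
for rule, values in assurances.items():
    kept = [i for i, v in enumerate(values, start=1)
            if v > min_distance_assurance]
    if kept:
        passing[rule] = kept

print(passing)  # {'D=>B': [1], 'A=>D': [1], 'AD=>B': [1]}
```

Under these numbers, positional association rules are obtained for D=>B, A=>D and AD=>B only.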
