
Information-Theoretic Mass Spectral Library Search

Information-Theoretic Mass Spectral Library Search. Arvind Visvanathan. CSCE 990 Seminar in Multi-Dimensional Chromatography Systems, Informatics, and Applications. Outline: Introduction, Related Work, Method, Results and Discussion.


Presentation Transcript


  1. Information-Theoretic Mass Spectral Library Search • Arvind Visvanathan • CSCE 990 Seminar in Multi-Dimensional Chromatography Systems, Informatics, and Applications • CSCE 990 – GCxGC Seminar

  2. Outline • Introduction: mass spectrum search types • Related Work: other techniques (NIST, PBM, DotMap) • Method: probability and information, normalized distribution function • Results • Conclusion

  3. Introduction – Mass Spectrum (figure: mass spectrum of decane; axes: m/z and intensity)

  4. Introduction – Mass Spectrum Search (diagram labels: Unknown Spectrum, MS Library, Search Algorithm, Potential Matches)

  5. Introduction – Search Types • Identity search: the unknown mass spectrum is present in the library; looking for the exact spectrum • Similarity search: the unknown mass spectrum is not present in the library; looking for a similar spectrum

  6. Introduction – MS Search Applications • Steroid detection in athletes • Monitoring patient breath during surgery • Composition of molecular species found in space • Detecting honey adulterated with corn syrup • Locating oil deposits • Monitoring fermentation processes in the biotechnology industry • Detecting dioxins in contaminated fish

  7. Related Work – NIST MS-Search [Stein ’94] • Pre-search the library for the unknown spectrum • Reduces the search domain (160K → 4K compounds) • Compute a match factor for each compound in the pre-search result • Match Factor (MF): range 0-999; higher is better • Sort the pre-search result by MF value • Pick the topmost compounds as possible matches

  8. Related Work – NIST MS-Search [Stein ’94] • Match Factor Computation [Stein ’94] • Term 1 (F1): mass-weighted normalized dot product • Term 2 (F2): relative intensities of adjacent peaks in both spectra • Match factor: combination of F1 and F2
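
The first term above, a mass-weighted normalized dot product, can be sketched as follows. This is a minimal illustration and not the NIST implementation: the function name `weighted_dot_product`, the list-based spectrum layout (index = m/z), and the exponent defaults `mass_power` and `intensity_power` are all assumptions made for the example. The second term and the way the two terms are combined are not covered.

```python
import math


def weighted_dot_product(unknown, library, mass_power=1.0, intensity_power=0.5):
    """Cosine similarity of two spectra after mass/intensity weighting.

    Each spectrum is a list where index i holds the (non-negative)
    intensity at m/z = i. The exponents are illustrative defaults,
    not values taken from the slides.
    """
    if len(unknown) != len(library):
        raise ValueError("spectra must cover the same m/z range")

    # Weight each peak as (m/z ** mass_power) * (intensity ** intensity_power).
    wu = [(mz ** mass_power) * (a ** intensity_power) for mz, a in enumerate(unknown)]
    wl = [(mz ** mass_power) * (a ** intensity_power) for mz, a in enumerate(library)]

    dot = sum(x * y for x, y in zip(wu, wl))
    norm = math.sqrt(sum(x * x for x in wu)) * math.sqrt(sum(y * y for y in wl))
    return dot / norm if norm > 0 else 0.0
```

Identical spectra score 1 (up to rounding) and spectra with no peaks in common score 0; a library search would rescale such a value onto whatever range its match factor uses.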

  9. Related Work – NIST MS-Search [Stein ’94] (figure panels: C-1, C-2)

  10. Related Work – Probability Based Matching [McLafferty et al. ’75] • Confidence value (K) instead of MF • Four components for each m/z • Term 1, U: based on the uniqueness of an m/z value • Term 2, A: intensity contribution to the confidence • Term 3, W: window factor (measure of agreement) • Term 4, D: dilution factor (measure of purity) • K = ∑ (U + A + W – D), summed over each m/z

  11. Related Work – DotMap [Synovec et al. ’04] (figure labels: Fumaric acid, Adipic acid, Lactic acid, DotMap)

  12. Related Work – DotMap [Synovec et al. ’04] • Inverse problem • DotMap is computed across the image • Higher-valued areas indicate the presence of the compound of interest • For multiple compounds of interest, compute a DotMap overlay
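
The per-pixel computation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the GCxGC-MS image is taken to be a nested list `image[row][col]` of spectra, the similarity is a plain normalized dot product, and the names `normalized_dot` and `dot_map` are hypothetical. The published DotMap algorithm may differ in its details.

```python
import math


def normalized_dot(a, b):
    """Cosine similarity between two equal-length intensity lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def dot_map(image, target):
    """Score every pixel's spectrum against one target spectrum.

    `image[row][col]` is the mass spectrum recorded at that pixel
    (an assumed layout). The result has the same row/column shape,
    with higher values where the pixel resembles the target.
    """
    return [[normalized_dot(spectrum, target) for spectrum in row] for row in image]
```

High-valued regions of the returned map are the candidate locations of the compound of interest.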

  13. Related Work – DotMap [Synovec et al. ’04] (figure)

  14. Related Work – DotMap [Synovec et al. ’04] (figure)

  15. Method – Motivation • NIST MS-Search [Stein ’94]: no domain information utilized • PBM [McLafferty et al. ’75]: old technique (’75); ad hoc use of domain information • DotMap: no domain information utilized

  16. Method – Entropy • Entropy-based approach • Entropy: a measure of the amount of uncertainty • Based on probabilities • Include domain-based knowledge (information) in computing the match factor
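
For reference, the standard Shannon definitions behind these bullets are given below. The notation is assumed here rather than taken from the slides: $p_k$ is the probability of outcome $k$, $I(k)$ its information content in bits, and $H$ the entropy of the distribution.

```latex
I(k) = -\log_2 p_k,
\qquad
H = -\sum_k p_k \log_2 p_k
```

Rare outcomes carry more information than common ones, which is how library-wide statistics can enter a match factor.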

  17. Method – Distribution Function • Library: NIST EPA Library, 163K compounds • Compute the distribution function (DF) • Two-dimensional array, m/z vs. intensity • DF[i][j] = number of compounds in the library with intensity j at m/z = i
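
Building such a table can be sketched as follows. This is a minimal illustration under stated assumptions: each library spectrum is a dict mapping m/z to an integer intensity already scaled to `0..max_intensity`, a missing peak counts as intensity 0, and the function name `distribution_function` is hypothetical.

```python
def distribution_function(library, max_mz, max_intensity):
    """Return DF where DF[i][j] counts spectra with intensity j at m/z = i.

    `library` is an iterable of dicts {m/z: integer intensity} (an
    assumed layout). Absent peaks are counted as intensity 0, so every
    row of DF sums to the number of spectra in the library.
    """
    df = [[0] * (max_intensity + 1) for _ in range(max_mz + 1)]
    for spectrum in library:
        for mz in range(max_mz + 1):
            intensity = spectrum.get(mz, 0)
            df[mz][intensity] += 1
    return df
```

Counting absent peaks as zero intensity is a design choice of this sketch; it makes each row total equal to the library size.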

  18. Method – Distribution Function (figure; axes: m/z and intensity)

  19. Method – Normalized Distribution Function (NDF) • NDF[mz][int] = DF[mz][int] / ∑_i DF[mz][i] • where ∑_i DF[mz][i] = 163K • NDF → probabilities in [0, 1]
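
The normalization can be sketched as follows, assuming DF is a list of rows indexed by m/z; the function name `normalize_distribution` is hypothetical.

```python
def normalize_distribution(df):
    """Divide each m/z row of DF by its row total.

    Returns NDF with NDF[mz][j] = DF[mz][j] / sum_i DF[mz][i], so each
    non-empty row sums to 1 and can be read as a probability
    distribution over intensities. An all-zero row stays all zero.
    """
    ndf = []
    for row in df:
        total = sum(row)
        ndf.append([count / total if total else 0.0 for count in row])
    return ndf
```

For example, a row of counts `[1, 1, 2]` becomes `[0.25, 0.25, 0.5]`.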

  20. Method – Assumptions • Assumption: each m/z is treated independently when the match factor is computed from the normalized distribution function
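
One consequence of this independence assumption is that the probability of a whole spectrum factorizes over m/z, so the per-m/z information values add. The notation $I_{mz}$ for the intensity observed at a given m/z is assumed here. The expression only illustrates the assumption; it is not the match factor formula from the slides.

```latex
P(\text{spectrum}) = \prod_{mz} \mathrm{NDF}[mz][I_{mz}],
\qquad
-\log_2 P(\text{spectrum}) = -\sum_{mz} \log_2 \mathrm{NDF}[mz][I_{mz}]
```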

  21. Method – Match Factor (figure)

  22. Results – Overview • Technique: add noise to a compound from the library, then search for the noisy compound in the library • Evaluation metric: average rank • Rank = position of the correct compound in the hit list • Repeat 3000 times and take the average rank • Compared with NIST and NISTDOT (first term in the NIST algorithm)
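
The evaluation metric can be sketched as follows. This is a minimal illustration assuming that each search returns a dict mapping compound identifiers to match scores and that a higher score is better; the names `rank_of` and `average_rank` are hypothetical.

```python
def rank_of(correct_id, scores):
    """1-based position of `correct_id` in the hit list.

    `scores` maps compound id -> match score (assumed layout); the hit
    list is the ids sorted by descending score.
    """
    hit_list = sorted(scores, key=scores.get, reverse=True)
    return hit_list.index(correct_id) + 1


def average_rank(ranks):
    """Mean of the ranks collected over repeated noisy searches."""
    return sum(ranks) / len(ranks)
```

An average rank of 1 means the correct compound was always the top hit; larger values indicate worse retrieval.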

  23. Results – Noise Models • Additive: A_U = A_L + G(0, σ) • Multiplicative: A_U = A_L + A_L · G(0, σ) • Johnson colored: A_U = A_L + G(0, σ·√m) • Random spectrum: A_U = A_L + x · A_R
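
The four noise models can be sketched as follows. This is a minimal illustration under stated assumptions: a spectrum is a list of intensities indexed by m/z, G(0, σ) is drawn per m/z from a zero-mean Gaussian, the `m` in the Johnson colored model is taken to be the m/z index, and all function names are hypothetical. Noisy intensities are not clipped at zero, since the slides do not say whether they should be.

```python
import math
import random


def additive(spectrum, sigma, rng):
    """A_U = A_L + G(0, sigma) for each m/z."""
    return [a + rng.gauss(0.0, sigma) for a in spectrum]


def multiplicative(spectrum, sigma, rng):
    """A_U = A_L + A_L * G(0, sigma): noise scales with peak height."""
    return [a + a * rng.gauss(0.0, sigma) for a in spectrum]


def johnson_colored(spectrum, sigma, rng):
    """A_U = A_L + G(0, sigma * sqrt(m)), with m assumed to be the m/z index."""
    return [a + rng.gauss(0.0, sigma * math.sqrt(m)) for m, a in enumerate(spectrum)]


def random_spectrum(spectrum, other, x):
    """A_U = A_L + x * A_R, where `other` is another library spectrum."""
    return [a + x * r for a, r in zip(spectrum, other)]


# Usage: a seeded generator keeps the experiment repeatable.
rng = random.Random(0)
noisy = additive([0, 10, 0, 50, 100], sigma=2.0, rng=rng)
```

Passing an explicit `random.Random` instance rather than using the module-level functions makes repeated trials reproducible.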

  24. Results – Additive Noise • Compound = Compound + additive noise • Additive Gaussian noise: zero mean, variable standard deviation • For each m/z in the library spectrum: A_U = A_L + G(0, σ)

  25. Results – Additive Noise (Example) (figure)

  26. Results – Additive Noise (Performance) (figure)

  27. Results – Multiplicative Noise • Compound = Compound + multiplicative noise • Multiplicative Gaussian noise: zero mean, variable standard deviation • For each m/z in the library spectrum: A_U = A_L + A_L · G(0, σ)

  28. Results – Multiplicative Noise (Example) (figure)

  29. Results – Multiplicative Noise (Performance) (figure)

  30. Results – Johnson Colored Noise • Compound = Compound + colored noise • Gaussian noise: zero mean, variable standard deviation • For each m/z in the library spectrum: A_U = A_L + G(0, σ·√m)

  31. Results – Johnson Colored Noise (Example) (figure)

  32. Results – Johnson Colored Noise (Performance) (figure)

  33. Results – Random Spectrum Noise • Compound = Compound + random spectrum • Additive spectrum: add x% of another random spectrum • For each m/z in the library or random spectrum: A_U = A_L + x · A_R

  34. Results – Random Spectrum Noise (Example) (figure)

  35. Results – Random Spectrum Noise (Performance) (figure)

  36. Results – Summary of Noise Models • Additive: A_U = A_L + G(0, σ) • Multiplicative: A_U = A_L + A_L · G(0, σ) • Johnson colored: A_U = A_L + G(0, σ·√m) • Random spectrum: A_U = A_L + x · A_R

  37. Results – Summary of Noise Models (figure)

  38. Results – Summary of Noise Models (figure)

  39. Conclusion • MS library search algorithm: information-theoretic, with domain knowledge incorporated • Algorithm works well for various noise models • Future work: must improve performance for the random spectrum noise case

  40. Questions & Suggestions?
