1 / 16

Mining Officially Unrecognized Side effects of drugs by combining Web Search and Machine learning

Mining Officially Unrecognized Side effects of drugs by combining Web Search and Machine learning. Carlo Carino, Yuanyuan Jia, Bruce Lambert, Patricia West and Clement Yu University of Illinois at Chicago. Motivation. Drugs have side effects

alva
Download Presentation

Mining Officially Unrecognized Side effects of drugs by combining Web Search and Machine learning

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Mining Officially Unrecognized Side effects of drugs by combining Web Search and Machine learning Carlo Carino, Yuanyuan Jia, Bruce Lambert, Patricia West and Clement Yu University of Illinois at Chicago

  2. Motivation • Drugs have side effects • Food and Drug Administration (FDA) requires drug companies to do extensive clinical trials before a drug enters the market place • Not all side effects of a given drug are officially recognized by the FDA Recent case: Vioxx

  3. Objective • Find unrecognized side effects of drugs Approach • Submit a query ( drug name, side effect) to a Web search engine • Extract from the retrieved pages all side-effects which are found in those pages

  4. Problems associated with this approach May not retrieve enough relevant documents May retrieve a lot of irrelevant documents May not want to have entire documents

  5. Want to have more relevant documents Modify the query to become < drug name OR active ingredients> Active ingredients = the chemical compounds forming the drug

  6. Want to reduce the number of irrelevant documents Web retrieval | | Classification ( machine learning algorithm)

  7. Machine Learning algorithm Neural network Many features: words such as actual side-effects; “side-effects”. “adverse effects”; “safe”, “not” etc Hundreds of such features

  8. Training phase, then test phase Train on 7 drugs: identify the relevant and irrelevant pages retrieved for these 7 drugs; Test on 20 other drugs

  9. Results Each drug retrieves 100 pages from Google; Using the classification algorithm only 16.4 pages/drug are retained by our system; Average precision-accept: 90.3% Average precision-reject: 87.5% Average precision of top 17 pages for Google: 61.2%

  10. Reduce the amount of manual efforts for collecting training data Generate “positive examples” and “negative examples” automatically with high probabilities

  11. Automatic generation of training data Find a drug, d, such that the diseases it treats, T, are disjoint from its known side-effects, S. A retrieved page in response to <d, t in T>, if does not contain any known side effect S is “negative; A retrieved page in response to <d, s in S>, if does not contain “safe” or “not” in the vicinity of s is “positive”.

  12. Accuracy of generating training examples automatically Accuracy of generating 50 positive examples: 98% Accuracy of generating 50 negative examples: 96%

  13. Validation of unrecognized side-effects Validated by licensed pharmacist and drug information specialist Prilosec: pneumonia Accutane: Watery eye Uroxatral: Yellowing of skin or eyes

  14. Retrieve passages instead of pages For each passage of certain number of words, compute the degree of “relevance”. Output the passage with highest relevance, if it exceeds a threshold. Also output frequencies of side-effects

  15. Summary Proposed system can retrieve unrecognized side-effects of drugs Improve accuracy of retrieve; Need a good medical dictionary to recognize highly related side effects for example, nausea and vomiting

  16. Extensions Our approach, retrieval followed by classification, can be applied to other retrieval problems: Examples: Complications of medical procedures/ operations; Processing of queries of specialized types

More Related