Applications of Text Mining

Ewan Klein, School of Informatics & NeSC

Text Mining
• Goals
  • Extract useful information from large bodies of unstructured or semi-structured documents
  • Looks for patterns in natural language text
  • Driven by application needs
• Three Areas:
  • Adding Metadata
    • E.g., identify Dublin Core elements from document headers
  • Information Extraction
    • Identify nuggets of text data and marshall them into a fixed format
  • Assisting Curation
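As a rough illustration of the "Adding Metadata" area, the sketch below collects Dublin Core elements from the header of an HTML document. It assumes the simplest case, in which the elements are already embedded as `<meta name="DC.…">` tags. The slides do not say how the headers were processed, so the class and function names here are hypothetical.

```python
from html.parser import HTMLParser


class DublinCoreParser(HTMLParser):
    """Collects <meta name="DC.xxx" content="..."> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.elements = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name") or ""
        content = attr.get("content")
        # Keep only names carrying the "DC." prefix, e.g. DC.title.
        if name.lower().startswith("dc.") and content is not None:
            key = name[3:].lower()
            self.elements.setdefault(key, []).append(content)


def extract_dublin_core(html):
    """Return a dict mapping Dublin Core element names to lists of values."""
    parser = DublinCoreParser()
    parser.feed(html)
    return parser.elements
```

Documents without embedded metadata would need the elements inferred from the text itself, which is where text mining proper comes in; that harder case is not shown here.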

Text mining and Curation
• Example workflow:
  • Make an observation
  • Search the research literature for knowledge
  • Incorporate relevant information into database
• Challenges:
  • Current Information Retrieval (IR) techniques often too imprecise
    • Which enzymes act as catalysts in the glycolysis pathway?
    • We want to identify a relation between two entities
  • Move to augmenting IR with more knowledge of text structure
    • Mostly supervised machine learning techniques
    • Still need training data for each domain
  • Need to integrate text mining into Grid applications
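To make "identify a relation between two entities" concrete, here is a deliberately small sketch. It uses hand-written word lists and a trigger pattern, not the supervised machine learning the slide refers to. The entity lists, the relation label `catalyst_in`, and the function name are all assumptions made for illustration.

```python
import re

# Toy gazetteers; a real system would learn or look these up.
ENZYMES = {"hexokinase", "phosphofructokinase", "pyruvate kinase"}
PATHWAYS = {"glycolysis"}

# Matches "catalyst", "catalyzes", "catalyses", "catalysed", ...
TRIGGER = re.compile(r"\bcataly[sz]\w*", re.IGNORECASE)


def find_relations(text):
    """Return (enzyme, 'catalyst_in', pathway) triples found in one sentence."""
    relations = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        if not TRIGGER.search(lowered):
            continue
        enzymes = sorted(e for e in ENZYMES if e in lowered)
        pathways = sorted(p for p in PATHWAYS if p in lowered)
        for enzyme in enzymes:
            for pathway in pathways:
                relations.append((enzyme, "catalyst_in", pathway))
    return relations
```

A plain keyword search would return any document that mentions both an enzyme and glycolysis. The sketch requires both entities and a trigger word in the same sentence, which is the kind of extra text structure the slide argues for.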

BlueDwarf for Text Mining
• BioCreative Competition
  • Joint entry with Stanford
  • Recognition of drug names, chemical names, and protein names in MEDLINE abstracts
• Java maximum entropy tagger
  • Used roughly 700,000 features in the early stages
  • Java memory size of 1950 MB
  • Died on available Informatics and Stanford machines
• BlueDwarf
  • Arrived at 1,247,77 features, memory: 2560 MB
  • Several experiments running in parallel
• Provisional results: we obtained top-scoring results
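The feature counts above are easier to appreciate with a sketch of how a feature-based tagger builds its feature set. The slides do not list the tagger's actual features, so the templates below (word identity, affixes, word shape, neighbouring words) are assumptions chosen only to show how quickly the number of distinct features grows with the corpus.

```python
def word_shape(word):
    """Collapse a token to a coarse shape, e.g. 'COX-2' -> 'X-d'."""
    shape = []
    for ch in word:
        if ch.isupper():
            c = "X"
        elif ch.islower():
            c = "x"
        elif ch.isdigit():
            c = "d"
        else:
            c = ch
        # Merge runs of the same character class.
        if not shape or shape[-1] != c:
            shape.append(c)
    return "".join(shape)


def token_features(tokens, i):
    """Return the set of binary features that fire for tokens[i]."""
    word = tokens[i]
    prev_word = tokens[i - 1].lower() if i > 0 else "<s>"
    next_word = tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>"
    return {
        "word=" + word.lower(),
        "prefix3=" + word[:3].lower(),
        "suffix3=" + word[-3:].lower(),
        "shape=" + word_shape(word),
        "prev=" + prev_word,
        "next=" + next_word,
    }


def feature_vocabulary(sentences):
    """Collect every distinct feature seen across tokenised sentences."""
    vocab = set()
    for tokens in sentences:
        for i in range(len(tokens)):
            vocab |= token_features(tokens, i)
    return vocab
```

Each distinct feature typically gets one weight per output label in a maximum entropy model, so a vocabulary of this kind built over many MEDLINE abstracts can plausibly reach the hundreds of thousands of features mentioned in the slide.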
