Explore how web wrappers automate input queries for bioinformatics searching, extract data, and offer practical solutions. Improve accuracy and efficiency in nucleotide sequence searches across diverse interfaces.
Automatic Resource Detection Enabling Wrappers to Crawl the Web for Bioinformatics Searching Sites
Wrapper Technology • As the Web grows and forms become more prevalent, entering queries at each site by hand becomes too time-consuming • Wrappers can crawl the Web for forms that accept a particular kind of query (e.g. nucleotide sequences) and automatically test each one for function • XML is most often used today, along with Java.
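A minimal sketch of the form-detection step, assuming a page's HTML has already been fetched. The keyword list `SEQUENCE_HINTS`, the class `FormFinder`, and the function `candidate_sequence_forms` are hypothetical names chosen for illustration, not the heuristic used by the wrapper described here. It is written in Python with only the standard library; the slides themselves mention XML and Java.

```python
from html.parser import HTMLParser

# Hypothetical keyword list; a real wrapper would need a richer heuristic.
SEQUENCE_HINTS = ("sequence", "nucleotide", "blast", "fasta")


class FormFinder(HTMLParser):
    """Collects each <form> and the names of its free-text fields."""

    def __init__(self):
        super().__init__()
        self.forms = []        # one dict per completed <form>
        self._current = None   # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action") or "", "fields": []}
        elif self._current is not None and tag in ("textarea", "input"):
            # Keep textareas and plain text inputs; skip passwords, buttons, etc.
            if tag == "textarea" or attrs.get("type", "text") == "text":
                self._current["fields"].append(attrs.get("name") or "")

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def candidate_sequence_forms(html):
    """Return forms whose action or field names hint at sequence searching."""
    finder = FormFinder()
    finder.feed(html)
    finder.close()
    hits = []
    for form in finder.forms:
        text = (form["action"] + " " + " ".join(form["fields"])).lower()
        if form["fields"] and any(hint in text for hint in SEQUENCE_HINTS):
            hits.append(form)
    return hits
```

Forms flagged this way are only candidates; each would still need to be tested with a real query, as the slide notes.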
Practicality of Wrappers • Many can be generated automatically once the type of website has been detected • Extraction of data can also be automated, with the results dumped into a database or handed to a human for curation • As a technology that has been around for at least 6 years, Web wrappers are a well-established way of reconciling different parts of the Web (ShopBot)
Example of BLAST • 160 different interfaces are available for BLAST searching, each offering a different set of options • A wrapper can automatically search across all of these and identify them as potential nucleotide sequence searchers • For those that have no intermediation or unruly processing, queries can be automated through the wrapper.
Possible Applications of This • With so many different interfaces and more being formed every day, this is a way of controlling for possible errors introduced into some of these search engines. • If the results can be automatically compared for content (i.e. run a meta-BLAST search), then any outliers can be excluded and the most probable list presented.
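The comparison step could be sketched as a simple vote across interfaces. This assumes each interface's results have already been reduced to a list of hit identifiers; the function name `consensus_hits` and the 50% threshold are illustrative assumptions, not part of the original work.

```python
from collections import Counter


def consensus_hits(results, min_fraction=0.5):
    """Return hit IDs reported by at least min_fraction of the interfaces.

    results maps an interface name to the list of hit IDs it returned.
    Output is ordered by number of supporting interfaces, then by ID.
    """
    if not results:
        return []
    votes = Counter()
    for hits in results.values():
        votes.update(set(hits))  # each interface votes once per hit
    needed = min_fraction * len(results)
    kept = [(hit, n) for hit, n in votes.items() if n >= needed]
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return [hit for hit, _ in kept]
```

Hits that appear on only one or two interfaces fall below the threshold and are treated as outliers; the threshold would need tuning against real result sets.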
Interesting Current Applications • DiscoveryLink • Dbget • Biokleisli – can anyone find this? It’s referenced, but apparently is only a paper on Citeseer.
Final Results of the Wrapper • Two-thirds of all resources were correctly identified as capable of BLAST nucleotide searches. • While a higher rate would be desirable, this would still allow analysis across different search interfaces to determine ranking and function
Another Possible Application • Determining ranking and function is another possible use of this wrapper. • Once the meta-BLAST search has been completed, the individual results could be compared to the meta-search to gain an idea of how accurate or in-depth each search is. • A tentative ranking scheme could be computed from this data for future searches.
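One way such a tentative ranking might be computed is to score each interface by how closely its hits overlap the meta-search result. The sketch below uses Jaccard similarity, which is an assumption made here for illustration; the slides do not specify a metric, and `rank_interfaces` is a hypothetical name.

```python
def rank_interfaces(results, consensus):
    """Rank interfaces by Jaccard overlap with the meta-search hit list.

    results maps an interface name to its list of hit IDs;
    consensus is the list of hit IDs from the meta-search.
    Returns (name, score) pairs, best first, ties broken by name.
    """
    reference = set(consensus)
    scores = []
    for name, hits in results.items():
        found = set(hits)
        union = found | reference
        # Jaccard similarity: shared hits divided by all distinct hits.
        score = len(found & reference) / len(union) if union else 0.0
        scores.append((name, score))
    scores.sort(key=lambda pair: (-pair[1], pair[0]))
    return scores
```

An interface that misses consensus hits or reports many extra ones scores lower, so the score reflects both accuracy and depth only roughly; a real scheme might weight the two separately.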