Interfaces for Selecting and Understanding Collections

kerri

Presentation Transcript


  1. Interfaces for Selecting and Understanding Collections

  2. Selecting from Collections • Collections are sets of documents that have been coalesced by a human or system. • Traditional collections: • NLM’s MedLine • ACM Digital Library • LEXIS-NEXIS • Library/museum resources from a particular donor • How do people with information needs locate and identify the appropriate collections?

  3. Does it Matter? • Web search engines (e.g. Google) get us the information we need … • well maybe • Web search drops users into the middle of a collection without any understanding of the collection and its overall characteristics. • Web search misses • Lots of more structured materials • “the hidden web” • Subscription-based content • Which is likely the best edited, most accurate, and most valuable in specialized domains

  4. Interfaces over Multiple Collections • Interfaces for Selecting and Understanding Collections • Lists • Overviews • Examples • Automated source selection

  5. Lists of Collections • Usually provide just a list of collection names. • Difficult to select from if the user does not know the collections beforehand • Over time people bookmark collections of value • Need tools for helping users who are outside of their areas of expertise

  6. Example

  7. Example

  8. Examples

  9. Overviews of Collections • Overviews provide a sense of what is in a collection • Overviews can be • Based on a category or directory structure • Automatically derived from the collection • Presentation of an overview is often a form of information visualization

  10. Category-based Overviews • MedLine – biomedical collection • Medical Subject Headings (MeSH) consists of 18,000 categories in a directed acyclic graph • ACM Digital Library – computer science collection • Hierarchy of 1200 category (keyword) labels • Yahoo – the Web • Graph of directories (probably a DAG) • Humans have to place documents in categories • Authors for ACM DL, subject experts for MedLine, surfers for Yahoo
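
A category-based overview typically shows how many documents sit at or below each category. Because the structure is a DAG rather than a tree, a category can be reached through more than one parent, so counts should be taken over sets of documents rather than summed along paths. The sketch below illustrates this; the category names, document ids, and the function name `docs_under` are made up for illustration and are not taken from MeSH, the ACM DL, or Yahoo.

```python
def docs_under(children, docs_by_cat, root):
    """Return the set of document ids filed at or below `root` in a category DAG.

    children:    dict mapping a category to its child categories
    docs_by_cat: dict mapping a category to the set of documents filed directly in it
    """
    seen = set()      # categories already visited (a DAG node may have several parents)
    docs = set()      # a set, so a document reachable by two paths is counted once
    stack = [root]
    while stack:
        cat = stack.pop()
        if cat in seen:
            continue
        seen.add(cat)
        docs |= docs_by_cat.get(cat, set())
        stack.extend(children.get(cat, ()))
    return docs


# Hypothetical toy hierarchy: "Pneumonia" has two parents.
children = {
    "Diseases": ["Infections", "Respiratory"],
    "Infections": ["Pneumonia"],
    "Respiratory": ["Pneumonia"],
}
docs_by_cat = {
    "Infections": {"d3"},
    "Respiratory": {"d2", "d4"},
    "Pneumonia": {"d1", "d2"},
}

# Overview: number of distinct documents under each category.
overview = {
    cat: len(docs_under(children, docs_by_cat, cat))
    for cat in ["Diseases", "Infections", "Respiratory", "Pneumonia"]
}
print(overview)
```

Summing child counts instead would count the documents under "Pneumonia" twice when rolling up to "Diseases".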

  11. MeSH Browser

  12. HiBrowse Browser

  13. ConeTrees

  14. Radial Views

  15. Hyperbolic Views

  16. MediaMetro

  17. Automatically Derived Overviews • Apply clustering algorithms to document collection • Remember Automatic Global Analysis • Use of co-occurrence and co-citation • Use of distance-based clustering approaches like hierarchical agglomerative clustering • Need methods to determine labels for clusters • Could be a document • Identification of centroid (document most similar to all others) • Identification of hubs (document most mentioned by cluster) • Could be one or more terms • Use most common / best differentiator (using TF-IDF) • No human intervention required • but people are likely to be valuable as editors
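
The clustering and labelling steps on this slide can be sketched in a few lines. The example below is a minimal illustration, not the method of any particular system: it assumes whitespace tokenization, TF-IDF weights with cosine similarity, and average-link agglomerative clustering, and it labels each cluster with its highest-weighted term and its most central document. All function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one unit-length TF-IDF vector (dict term -> weight) per document."""
    tokenized = [d.lower().split() for d in docs]
    n = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    vecs = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {t: c * math.log(n / df[t]) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vecs.append({t: w / norm for t, w in vec.items()})
    return vecs


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def agglomerate(vecs, k):
    """Average-link agglomerative clustering down to k clusters of document indices."""
    clusters = [[i] for i in range(len(vecs))]

    def link(c1, c2):
        total = sum(cosine(vecs[i], vecs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        pairs = [(x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))]
        a, b = max(pairs, key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]   # merge the two most similar clusters
        del clusters[b]
    return clusters


def label_cluster(cluster, vecs):
    """Return (term label, index of the most central document) for one cluster."""
    totals = Counter()
    for i in cluster:
        totals.update(vecs[i])                    # sum TF-IDF weight per term
    term = max(totals, key=totals.get)
    central = max(cluster, key=lambda i: sum(cosine(vecs[i], vecs[j]) for j in cluster))
    return term, central


docs = [
    "heart attack treatment drugs",
    "heart surgery recovery",
    "heart disease risk drugs",
    "compiler parsing grammar",
    "compiler optimization parsing",
    "grammar parsing algorithms",
]
vecs = tfidf_vectors(docs)
clusters = agglomerate(vecs, k=2)
for c in clusters:
    term, central = label_cluster(c, vecs)
    print(sorted(c), "term label:", term, "| central doc:", docs[central])
```

The pairwise search makes this cubic or worse in the number of documents, which is acceptable for a toy example only. Term labels chosen this way can favour rare words, which is one reason human editing of labels remains useful.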

  18. Scatter-Gather

  19. Evaluation of Scatter-Gather • Scatter-Gather • Scatter-Gather conveyed overview of collection contents • Scatter-Gather without search was less effective than a basic search • Need to combine clustering with search

  20. Themescapes

  21. More Themescapes

  22. Kohonen Maps

  23. Evaluation of Graphical Overviews • One study found that non-experts found the clustering results difficult to use (worse than text-based views like Scatter-Gather) • Comparison of Kohonen map and Yahoo • 11 of 15 subjects found “interesting” page using Kohonen • 8 were able to find same page using Yahoo • 14 of 16 subjects found “interesting” page using Yahoo • 2 were able to find same page using Kohonen • Subjects liked ability to jump between categories without backing out of current category • Unsupervised thematic overviews probably better for giving a gist of what is in a collection than for search.

  24. Examples, Dialogs, Wizards • Retrieval by reformulation • Start with example queries • Rabbit, Helgon • Can be difficult to find an appropriate starting query • Wizards • Found to help users without the necessary domain knowledge get through multi-step processes • Not helpful when the wizard is not accompanied by help • Not useful when the goal is teaching how to use the interface. • Guided tours • Present a logical sequence of navigation choices for accomplishing a goal (e.g. Walden's Paths) • Not evaluated with regard to information access

  25. Automated Source Selection • Selecting collection automatically (but explicitly) • Need a model of each collection • What it covers, need model of topics • What it is good at, need metric for good • Develop a model of the user’s information need • Match the information need to the most valuable collections for that topic • Used in meta-search – interesting area of research • Could be starting point for interactive collection selection.
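
The steps on this slide (model each collection, model the information need, match the two) can be sketched as follows. This is a minimal illustration under stated assumptions: each collection is modelled by term document-frequencies over a small sample of its documents, the information need is just the query terms, and the "goodness" metric is term coverage weighted by how few collections contain the term. The scoring formula, function names, collection names, and sample documents are all hypothetical.

```python
import math
from collections import Counter


def build_model(sample_docs):
    """Collection model: per-term document frequency over a non-empty sample."""
    df = Counter()
    for doc in sample_docs:
        df.update(set(doc.lower().split()))
    return {"df": df, "n": len(sample_docs)}


def score(query, model, models):
    """Score one collection for a query (assumed metric, see lead-in)."""
    total = 0.0
    for term in query.lower().split():
        # Number of collections whose sample contains the term.
        cf = sum(1 for m in models.values() if m["df"][term] > 0)
        if cf == 0:
            continue                                # term unknown everywhere
        coverage = model["df"][term] / model["n"]   # share of sampled docs with the term
        rarity = math.log(1 + len(models) / cf)     # terms found in few collections weigh more
        total += coverage * rarity
    return total


def select_collections(query, models, top_k=2):
    """Return the names of the top_k collections for the query, best first."""
    ranked = sorted(models, key=lambda name: score(query, models[name], models), reverse=True)
    return ranked[:top_k]


# Hypothetical collections, each represented by a tiny document sample.
models = {
    "medical": build_model(["heart drug trial results", "drug dosage heart patients"]),
    "legal": build_model(["court trial verdict appeal", "contract law appeal"]),
    "computing": build_model(["compiler design", "network protocol design"]),
}

selected = select_collections("heart drug trial", models)
print(selected)
```

Showing the ranked list to the user, rather than silently searching only the top collections, is one way this could serve as a starting point for interactive collection selection.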
