Paper presentation by Mark Sharp 17:610:554 Information Visualization, Prof. Spoerri 11/11/2002


Galaxy of News: An Approach to Visualizing and Understanding Expansive News Landscapes Earl Rennison In UIST `94, ACM Symposium on User Interface Software and Technology. New York: ACM Press, 1994. Paper presentation by Mark Sharp 17:610:554 Information Visualization, Prof. Spoerri 11/11/2002


Presentation Transcript



Galaxy of News: An Approach to Visualizing and Understanding Expansive News Landscapes. Earl Rennison. In UIST '94, ACM Symposium on User Interface Software and Technology. New York: ACM Press, 1994.

Paper presentation by Mark Sharp

17:610:554 Information Visualization, Prof. Spoerri

11/11/2002

554 paper pres.



Paper Summary

  • PROBLEM: Accessing and understanding news information is not well-supported by the information infrastructure.

  • VISION: An intelligent infrastructure that automatically builds the correlations and relationships between news articles and constructs an environment that allows readers to dynamically explore and gain understanding.


How does it work?

  • Parsing algorithms extract features (metadata) from the articles, which are then clustered by ARN (a neural network algorithm) and mapped to a 3D spatial layout.

  • Nodes: keyword hierarchy / headlines / full text

  • Zoom in with left mouse button, out with right. [direct manipulation]

  • Animation (4D) helps user understand what system is doing. [motion: an early/pre-attentive visual cue]
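
The extract-then-cluster steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the ARN from the paper: a plain keyword co-occurrence count and a connected-components grouping stand in for the neural network, the 3D layout step is left out, and the sample articles and the names `keyword_associations` and `cluster_by_shared_keywords` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical articles: a headline plus keywords already extracted as metadata.
articles = [
    {"headline": "Rates rise again", "keywords": {"economy", "banks", "rates"}},
    {"headline": "Banks report profits", "keywords": {"economy", "banks"}},
    {"headline": "Cup final tonight", "keywords": {"sports", "football"}},
]


def keyword_associations(articles):
    """Count how often each pair of keywords appears on the same article."""
    weights = defaultdict(int)
    for article in articles:
        for a, b in combinations(sorted(article["keywords"]), 2):
            weights[(a, b)] += 1
    return dict(weights)


def cluster_by_shared_keywords(articles):
    """Group article indices that are linked by sharing at least one keyword.

    Assumption: a simple union-find stands in for the paper's clustering.
    """
    parent = list(range(len(articles)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    first_seen = {}  # keyword -> index of the first article carrying it
    for index, article in enumerate(articles):
        for keyword in article["keywords"]:
            if keyword in first_seen:
                parent[find(index)] = find(first_seen[keyword])
            else:
                first_seen[keyword] = index

    groups = defaultdict(list)
    for index in range(len(articles)):
        groups[find(index)].append(index)
    return sorted(groups.values())


weights = keyword_associations(articles)
clusters = cluster_by_shared_keywords(articles)
```

Each cluster could then serve as one node at the keyword level, with its headlines and full texts revealed as the reader zooms in.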


Model components

Temporal and behavior interaction: controls level of detail, user orientation cues, and transitions to new views.

Spatial construction: can be 2-, 3-, or n-dimensional; uses relationships; dynamic (appropriate for news).

Relationships: designer-specified; e.g. temporal ordering…

News base: not raw data; objects and annotations (keywords, slugwords, location, time, subject, etc.); manually or automatically derived from raw data.
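
One way to picture the news base and a designer-specified relationship is the sketch below. It is an assumption-laden illustration rather than the paper's data model: the `NewsObject` class, its field types, the `temporal_ordering` helper, and the sample entries are all hypothetical; only the annotation names (keywords, slugwords, location, time, subject) come from the slide.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class NewsObject:
    """A news-base entry: an object plus annotations, not the raw article."""
    headline: str
    time: datetime
    location: str = ""
    subject: str = ""
    keywords: set = field(default_factory=set)
    slugwords: set = field(default_factory=set)


def temporal_ordering(objects):
    """Example relationship: link each object to the next one in time."""
    ordered = sorted(objects, key=lambda obj: obj.time)
    return [(a.headline, b.headline) for a, b in zip(ordered, ordered[1:])]


# Hypothetical sample data, deliberately out of chronological order.
news_base = [
    NewsObject("Storm makes landfall", datetime(1994, 11, 2, 9, 0),
               location="Miami", subject="weather", keywords={"storm"}),
    NewsObject("Storm warning issued", datetime(1994, 11, 1, 18, 0),
               location="Miami", subject="weather", keywords={"storm"}),
    NewsObject("Cleanup begins", datetime(1994, 11, 3, 8, 0),
               location="Miami", subject="weather", keywords={"cleanup"}),
]
links = temporal_ordering(news_base)
```

Other relationships (shared location, shared subject) could be written as further functions over the same objects and fed to the spatial construction step.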



[Slide 11 figure; labels: reading, writing]



Which early / pre-attentive visual processes are leveraged?

Position

Proximity

Motion

Brightness

Size

Color


What is working?

  • Principled (algorithmic) feature extraction and clustering.

  • Direct manipulation.

  • True zooming (seamless exploration of categories, document labels, and full texts).

  • Dynamic updating of content (new articles).


What is not working or clear?

  • Clustering based on skinny metadata rather than full text vectors.

  • Keywords are single words, not terms.

  • “Relationships”?


What surprised you?

  • Naivete about “understanding” and media studies.


Key Insights: what I learned

  • Detailed look into the architecture of a true information visualization system for a large text corpus, with many desirable features.


What is the key contribution?

  • True zooming (seamless integration of all levels) is feasible in large text corpora.


Take-away messages? What can be generalized?

  • Computational feasibility forces some compromises.

    • “What is not working”

    • Human heuristics (“relationships”?)

      BUT help is on the way (bigger iron)


3 questions for group and class discussion.

  • Are volume and lack of organization really our biggest problems with modern news information?

  • Would you use Galaxy of News? Why or why not?

  • What other kinds of text data would you like to see this approach applied to? How might a different domain affect the specification of metadata object representations and/or “relationships”?


TileBars: Visualization of Term Distribution Information in Full Text Information Access. Marti Hearst. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 59-66, Denver, CO, May 1995.

Paper presentation by Mark Sharp

17:610:554 Information Visualization, Prof. Spoerri

11/11/2002


Paper Summary

  • PROBLEM: Traditional IR is focused on text databases consisting of titles and abstracts; assumptions are not necessarily appropriate for full text.

  • VISION: Utilize term distribution within the text as well as overall frequency to model document relevance. Replace opaque ranking with a transparent means for swift appraisal of the query-document relationship.


How does it work?

  • TextTiling algorithm partitions full text into adjacent, non-overlapping, multi-paragraph segments reflecting subtopic structure based on term co-occurrence and repetition.

  • Segments are scored for similarity to query terms.

  • Display shows document length, term frequency, and term distribution across segments.
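
The segment-and-score steps above can be sketched as follows. This is a minimal illustration under loud assumptions: fixed-size groups of sentences stand in for TextTiling (they do not detect subtopic boundaries), scoring is a raw count of query-term hits, and the sample document and the names `split_into_segments` and `tile_rows` are hypothetical.

```python
import re


def split_into_segments(text, sentences_per_segment=2):
    """Stand-in for TextTiling: fixed-size sentence groups, NOT subtopic detection."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        " ".join(sentences[i:i + sentences_per_segment])
        for i in range(0, len(sentences), sentences_per_segment)
    ]


def tile_rows(segments, term_sets):
    """One row per query term set; each cell counts that set's hits in one segment."""
    rows = []
    for terms in term_sets:
        wanted = {t.lower() for t in terms}
        row = []
        for segment in segments:
            words = re.findall(r"[a-z0-9]+", segment.lower())
            row.append(sum(1 for w in words if w in wanted))
        rows.append(row)
    return rows


# Hypothetical six-sentence document and two query term sets.
document = (
    "Osteoporosis weakens bones. Prevention starts early. "
    "Research on prevention continues. Diet matters too. "
    "Osteoporosis research needs funding. Funding is scarce."
)
segments = split_into_segments(document)
rows = tile_rows(segments, [["osteoporosis"], ["prevention", "research"]])
```

The number of columns reflects document length, and a column where every row is nonzero marks a segment in which the term sets overlap.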


Length of rectangle: length of document

Each gray square: 1 tile (segment)

Tile darkness: term frequency

Query term sets: tile rows
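
The legend above can be mimicked in plain text, as a rough sketch only. Characters stand in for gray levels, darkness is scaled relative to the largest count in the bar, and the shade palette, the `render_tilebar` helper, and the sample counts are assumptions made for illustration, not the paper's rendering.

```python
SHADES = " .:*#"  # lightest to darkest; a text stand-in for gray levels


def render_tilebar(rows):
    """Render one line per term set; bar length equals the number of segments."""
    peak = max((count for row in rows for count in row), default=0)
    lines = []
    for row in rows:
        cells = []
        for count in row:
            if peak == 0 or count == 0:
                level = 0  # no hits: blank tile
            else:
                # Scale 1..peak onto shade indices 1..len(SHADES) - 1.
                level = 1 + (count - 1) * (len(SHADES) - 2) // max(peak - 1, 1)
            cells.append(SHADES[level])
        lines.append("[" + "".join(cells) + "]")
    return lines


# Hypothetical term counts: two term sets (rows) across three segments (tiles).
bar = render_tilebar([[1, 0, 1], [1, 2, 1]])
print("\n".join(bar))
```

Scaling against the peak count makes the shades relative rather than absolute, so a reader compares tiles with each other instead of reading off exact frequencies.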


Which early / pre-attentive visual processes are leveraged?

Length

Position

Darkness (gray scale)


What is working?

  • Elegant representation of document length.

  • Adjacency of tiles between term rows => overlap.

  • Gray scale leverages relative (vs. absolute) judgment.

  • Meaningful labels (start of text).

  • Direct click link from tiles to text segments.

  • Starting TREC/TIPSTER evaluation.


What is not working or clear?

  • Depends on skillful Boolean query formulation (e.g. no stopwords).

  • Doesn’t appear to be scalable to large queries (>3 conjunctive terms).


What surprised you?

  • “Because they do have a natural visual hierarchy, varying shades of gray show varying quantities better than color.”


Key Insights: what I learned

  • Relevance ranking is not the only game in town for putting cognitive cues on multi-document retrievals.


What is the key contribution?

  • Text segmentation can enhance traditional (whole-document) IR as well as “fact retrieval.”

  • Novel paradigms for text retrieval can be both principled and computationally efficient.


Take-away messages? What can be generalized?

  • Marti Hearst is a major player in text mining / text visualization.


3 questions for group and class discussion.

  • Instead of integer term frequency, what else could be used to “color” the tiles for relevance?

  • How might documents be ranked?
