
Galaxy of News: An Approach to Visualizing and Understanding Expansive News Landscapes. Earl Rennison. In UIST '94, ACM Symposium on User Interface Software and Technology. New York: ACM Press, 1994.

Paper presentation by Mark Sharp
17:610:554 Information Visualization, Prof. Spoerri
11/11/2002



554 paper pres.

Paper Summary
  • PROBLEM: Accessing and understanding news information is not well-supported by the information infrastructure.
  • VISION: An intelligent infrastructure that automatically builds the correlations and relationships between news articles and constructs an environment that allows readers to dynamically explore and gain understanding.

How does it work?
  • Parsing algorithms extract features (metadata) from the articles, which are then clustered by ARN (a neural network algorithm) and mapped to a 3D spatial layout.
  • Nodes: keyword hierarchy / headlines / full text
  • Zoom in with left mouse button, out with right. [direct manipulation]
  • Animation (4D) helps user understand what system is doing. [motion: an early/pre-attentive visual cue]
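
The node hierarchy above (keywords, then headlines, then full text) can be illustrated with a minimal Python sketch. This is not the paper's ARN clustering: it simply groups articles under shared keywords, and the sample articles, field names, and the `build_hierarchy` and `view` helpers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample articles; the metadata fields are illustrative only.
articles = [
    {"headline": "Central bank raises rates",
     "keywords": {"economy", "banking"},
     "text": "The central bank raised its benchmark rate."},
    {"headline": "Markets slide on rate fears",
     "keywords": {"economy", "markets"},
     "text": "Stocks fell as investors weighed higher rates."},
    {"headline": "New stadium opens downtown",
     "keywords": {"sports"},
     "text": "The stadium hosted its first match."},
]

def build_hierarchy(articles):
    """Group articles under each of their keywords (keyword -> articles).

    A stand-in for clustering: articles sharing a keyword land in the
    same group. The paper's ARN algorithm is not reproduced here.
    """
    hierarchy = defaultdict(list)
    for article in articles:
        for keyword in sorted(article["keywords"]):
            hierarchy[keyword].append(article)
    return dict(hierarchy)

def view(hierarchy, zoom, keyword=None, index=0):
    """Return what is visible at a zoom level.

    zoom 0: keyword nodes; zoom 1: headlines under one keyword;
    zoom 2: full text of one article under that keyword.
    """
    if zoom == 0:
        return sorted(hierarchy)
    if zoom == 1:
        return [a["headline"] for a in hierarchy[keyword]]
    return hierarchy[keyword][index]["text"]

hierarchy = build_hierarchy(articles)
print(view(hierarchy, 0))
print(view(hierarchy, 1, keyword="economy"))
print(view(hierarchy, 2, keyword="sports"))
```

Zooming in corresponds to calling `view` with a higher `zoom` value on the same structure, which is the sense in which the levels are seamless: no separate search step sits between keywords, headlines, and text.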

Model components

Temporal and behavior interaction: controls level-of-detail, user orientation cues, transition to new views.

Spatial construction: can be 2-, 3-, or n-dimensional; uses relationships; dynamic (appropriate for news).

Relationships: designer-specified; e.g. temporal ordering…

News base: not raw data; objects and annotations (keywords, slugwords, location, time, subject, etc.); manually or automatically derived from raw data.


[Figure: diagram labeled "reading" and "writing"]

Which early / pre-attentive visual processes are leveraged?

Position

Proximity

Motion

Brightness

Size

Color

What is working?
  • Principled (algorithmic) feature extraction and clustering.
  • Direct manipulation.
  • True zooming (seamless exploration of categories, document labels, and full texts).
  • Dynamic updating of content (new articles).

What is not working or clear?
  • Clustering based on skinny metadata rather than full text vectors.
  • Keywords are single words, not terms.
  • “Relationships”?

What surprised you?
  • Naivete about “understanding” and media studies.

Key Insights: what I learned
  • Detailed look into the architecture of a true large text corpus info viz system with many desirable features.

What is the key contribution?
  • True zooming (seamless integration of all levels) is feasible in large text corpora.

Take-away messages? What can be generalized?
  • Computational feasibility forces some compromises.
    • “What is not working”
    • Human heuristics (“relationships”?)

BUT help is on the way (bigger iron)

3 questions for group and class discussion.
  • Is volume and lack of organization really our biggest problem with modern news information?
  • Would you use Galaxy of News? Why or why not?
  • What other kinds of text data would you like to see this approach applied to? How might a different domain affect the specification of metadata object representations and/or “relationships”?


TileBars: Visualization of Term Distribution Information in Full Text Information Access. Marti Hearst. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 59-66, Denver, CO, May 1995.

Paper presentation by Mark Sharp

17:610:554 Information Visualization, Prof. Spoerri

11/11/2002

Paper Summary
  • PROBLEM: Traditional IR is focused on text databases consisting of titles and abstracts; assumptions are not necessarily appropriate for full text.
  • VISION: Utilize term distribution within the text as well as overall frequency to model document relevance. Replace opaque ranking with a transparent means for swift appraisal of the query-document relationship.

How does it work?
  • TextTiling algorithm partitions full text into adjacent, non-overlapping, multi-paragraph segments reflecting subtopic structure based on term co-occurrence and repetition.
  • Segments are scored for similarity to query terms.
  • Display shows document length, term frequency, and term distribution across segments.
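
The scoring and display steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the segments are hand-split stand-ins (TextTiling itself is not implemented), the score is a plain term count, and the `tile_counts` and `render` helpers, the sample text, and the `SHADES` character ramp are all hypothetical.

```python
# Character ramp from lightest to darkest; a stand-in for gray levels.
SHADES = " .:#"

def tile_counts(segments, term_sets):
    """Build one row per query term set and one column per segment.

    Each cell holds the number of times that row's terms occur in
    that segment (a simple count, assumed here as the tile score).
    """
    rows = []
    for terms in term_sets:
        wanted = {t.lower() for t in terms}
        row = []
        for segment in segments:
            words = [w.strip(".,;:!?\"'()").lower() for w in segment.split()]
            row.append(sum(1 for w in words if w in wanted))
        rows.append(row)
    return rows

def render(rows):
    """Draw each row as a bar: width = number of segments,
    darker character = higher term count (capped at the darkest shade)."""
    return ["|" + "".join(SHADES[min(c, len(SHADES) - 1)] for c in row) + "|"
            for row in rows]

# Hypothetical document, already split into three segments by hand.
segments = [
    "Osteoporosis affects bone density. Bone loss accelerates with age.",
    "Prevention includes calcium and exercise. Prevention research continues.",
    "Funding for research on osteoporosis prevention is growing.",
]
term_sets = [["osteoporosis"], ["prevention", "research"]]

counts = tile_counts(segments, term_sets)
bars = render(counts)
for bar in bars:
    print(bar)
```

In the output, a column where both rows are non-blank marks a segment in which both term sets occur, which is the overlap a reader scans for.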


TileBars display, as annotated on the slide:
  • Length of rectangle: length of document
  • Each gray square: 1 tile (segment)
  • Tile darkness: term frequency
  • Query term sets: tile rows

Which early / pre-attentive visual processes are leveraged?

Length

Position

Darkness (gray scale)

What is working?
  • Elegant representation of document length.
  • Adjacent tiles across term rows show where query terms overlap.
  • Gray scale leverages relative (vs. absolute) judgment.
  • Meaningful labels (start of text).
  • Direct click link from tiles to text segments.
  • Starting TREC/TIPSTER evaluation.

What is not working or clear?
  • Depends on skillful Boolean query formulation (e.g. no stopwords).
  • Doesn’t appear to be scalable to large queries (>3 conjunctive terms).

What surprised you?
  • “Because they do have a natural visual hierarchy, varying shades of gray show varying quantities better than color.”

Key Insights: what I learned
  • Relevance ranking is not the only game in town for putting cognitive cues on multi-document retrievals.

What is the key contribution?
  • Text segmentation can enhance traditional (whole-document) IR as well as “fact retrieval.”
  • Novel paradigms for text retrieval can be both principled and computationally efficient.

Take-away messages? What can be generalized?
  • Marti Hearst is a major player in text mining / text visualization.

3 questions for group and class discussion.
  • Instead of integer term frequency, what else could be used to “color” the tiles for relevance?
  • How might documents be ranked?
