Analysis Experiences Using Information Visualization
Beth Hetzler, Alan Turner
Realizing Value from Visual Analysis Tools
• Sound algorithms (representation, clustering, projection, etc.)
• Visualization conveys useful information
• Interaction natural and easy to learn
• User able to profit from visualization (*)
• Concepts fit user model(s) and process (*)
• System works acceptably in user environment (*)
Analysts’ Environment
• Lots of information, variety of sources
• Constantly more new information
• Time pressure
• Difficult to tell what is pertinent without reading or skimming
• May be learning new subject area
• May not be expert in computer science
Analytic Environment (cont.)
• Maintain expertise in particular areas
  • broad issues over time
  • changing vocabulary, evolving themes
• “Tyranny of the inbox”
• Ad hoc questions on short fuse
  • little time to hone queries
• Need to provide and support judgement
• Risks of satisficing
Analysts’ Dilemma
[Figure: document funnel for Participant 5 (96 minutes; experience: 17 years). Legend: high profit documents; key documents; key documents that are high profit.]
Query 1: ESA | (european & space & agency)
Query 2: (ESA | (european & space & agency)) > (19960601) Infodate
Figure labels, in extracted order: 2000 in database; 3 high profit; 725; Query 1; 419; Query 2; 28 read; 24 on-topic; 6 high profit; 8 cut and paste; 3 key
©1999 Patterson. With permission of Emily Patterson.
[Figure: queries and result sets for each participant. Legend: high profit documents; key documents; key documents that are high profit.]
S5 (96 minutes):
  ESA | (european & space & agency)
  (ESA | (european & space & agency)) > (19960601) Infodate
S2 (73 minutes):
  esa & ariane*
  (esa & ariane*) & failure
S3 (24 minutes):
  europe 1996
  (europe 1996) & (launch failure)
  (europe 1996) & ((launch failure):%2)
S4 (68 minutes):
  (european space agency):%3 & ariane & failure & (launcher |rocket))
S6 (32 minutes):
  1996 & Ariane
  (1996 & Ariane) & (destr* | explo*)
  (1996 & Ariane) & (destr* | explo*) & (fail*)
S7 (73 minutes):
  software & guidance
S8 (27 minutes):
  esa & ariane
  ariane & 5
  (ariane & 5):%2
  ((ariane & 5):%2) & (launch & failure)
S9 (44 minutes):
  1996 & European Space Agency & satellite
  1996 & European Space Agency & lost
  1996 & European Space Agency & lost & rocket
Result-set counts as extracted (mapping to individual queries not recoverable): 419, 28, 161, 169, 22, 29, 5, 15, 66, 194, 184, 29, 14, 12, 7, 4
©1999 Patterson. With permission of Emily Patterson.
What Could It Mean to “Address Information Overload?”
• Reduce time spent crafting queries
• Reduce risk of eliminating important information
• Increase chance of recognizing important information
• Ability to handle more documents
• Improve ability to structure information perusal
• Reduce amount of reading
• Faster time to get through same information
IN-SPIRE Basic Tools
• Document Viewer
• ThemeView
• Time Slicer
• Galaxy
Also: Query tool, Group tool, ...
Pilot Environment
• Analysts in normal work environment
• IN-SPIRE running on regular workstation
• Analysts use as time allows, on questions pertinent to their work
• Normal data, but alternate query tool
• Assess question most pertinent to analysts: does it help me with my data and my issues?
Example User Value: Less Time on Query Syntax; Lower Risk of Information Loss
• Data collection: news stories matching simple Boolean on Pakistan
• Green dots: documents that would be excluded by “not (cricket or wicket or champion*)”
• Figure label: Cricket-related
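The exclusion clause on this slide can be sketched in code. The idea, as far as the slide suggests it, is to keep every document and only mark the ones a NOT clause would have dropped, so the analyst can see what would be lost. This is a minimal sketch: the function names and sample stories are hypothetical, and the wildcard `champion*` is approximated by a prefix match on lowercase word tokens.

```python
import re

# Terms from the slide's exclusion clause: not (cricket or wicket or champion*)
EXACT_TERMS = {"cricket", "wicket"}
PREFIX_TERMS = ("champion",)  # stands in for the wildcard champion*


def would_be_excluded(text):
    """Return True if the NOT clause would have dropped this document."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(t in EXACT_TERMS or t.startswith(PREFIX_TERMS) for t in tokens)


def flag_documents(docs):
    """Keep every document, but record which ones the exclusion would remove."""
    return [(doc, would_be_excluded(doc)) for doc in docs]


# Hypothetical sample stories, not taken from the original data collection.
sample = [
    "Pakistan wins cricket series",
    "Pakistan budget talks continue",
    "Champion of reform speaks in Pakistan parliament",
]
flagged = flag_documents(sample)
```

In this toy data the third story is not about cricket, yet `champion*` would have removed it. That is the kind of information loss the slide warns about.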
Examples of User Value
• Better structuring of daily reading material
• Easier to identify non-relevant material
• Useful information from speculative large queries
• Thinking about the issue and information in new ways
Examples of Issues
• Novice vs. expert usage and benefits
• Galaxy too static
• Clusters not relevant for some users
• Data glitches
• Pragmatics: print, save, ...
Adapting to User Process: New Analytic Feature
• Common user processes
• Linear path through information
• Convergent/divergent phases
• Static visualization does not support these well
Support Convergent/Divergent Process
• Select or query to choose documents of interest
• Move rest down
• Documents of interest are reclustered and reprojected to show new themes
• Move full set back up and repeat
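The select-then-regroup loop above can be sketched as follows. This is only an illustration under stated assumptions: the slides do not describe IN-SPIRE's clustering or projection algorithms, so `regroup` is a toy stand-in that labels each selected document by its most common term not shared by the whole subset. All names and the sample documents are hypothetical.

```python
from collections import Counter


def select(docs, term):
    """Convergent step: keep only the documents of interest (those containing term)."""
    return {doc_id: words for doc_id, words in docs.items() if term in words}


def regroup(docs):
    """Toy stand-in for reclustering: group each document under its most common
    term within the subset, ignoring terms that every document in the subset shares."""
    counts = Counter(w for words in docs.values() for w in set(words))
    shared = {w for w, c in counts.items() if c == len(docs)}
    groups = {}
    for doc_id, words in docs.items():
        # Most frequent term first; ties broken alphabetically for determinism.
        candidates = sorted(set(words) - shared, key=lambda w: (-counts[w], w))
        label = candidates[0] if candidates else "(no distinct theme)"
        groups.setdefault(label, []).append(doc_id)
    return groups


# Hypothetical documents, each reduced to a list of terms.
docs = {
    "d1": ["ariane", "launch", "failure", "software"],
    "d2": ["ariane", "launch", "failure", "insurance"],
    "d3": ["ariane", "software", "guidance"],
    "d4": ["pakistan", "budget"],
    "d5": ["ariane", "insurance", "satellite"],
}
subset = select(docs, "ariane")   # the rest of the set is "moved down"
themes = regroup(subset)          # new themes emerge within the subset
```

Because `select` returns a new dictionary and leaves `docs` untouched, the full set remains available for the next pass, which corresponds to moving it back up and repeating.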
Adapting to User Process: Interface to Legacy System
• Conserve existing user query
• Add a broader query alongside it
• Combine and show relationship
• Smooth interface to current tools critical
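One way to read "combine and show relationship" is as a set comparison between the results of the existing query and those of the broader one. The sketch below assumes each query returns document identifiers. The function name and the sample identifiers are hypothetical, and nothing here reflects the legacy system's actual interface.

```python
def compare_results(existing_hits, broader_hits):
    """Relate two result sets: what both queries found, and what each found alone."""
    existing = set(existing_hits)
    broader = set(broader_hits)
    return {
        "both": sorted(existing & broader),
        "only_existing": sorted(existing - broader),
        "only_broader": sorted(broader - existing),
    }


# Hypothetical document identifiers returned by the two queries.
existing_query_hits = ["doc03", "doc07", "doc11"]
broader_query_hits = ["doc03", "doc07", "doc11", "doc15", "doc20"]
relationship = compare_results(existing_query_hits, broader_query_hits)
```

The "only_broader" group holds the documents the analyst's established query would never have surfaced, and it could be highlighted separately in a visualization.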
Potential Tension: “Correct” vs. “Useful” Representations
• Themes of interest may not be dominant
  • not dominant within data collection
  • not dominant within documents
• Users need way to “steer” to more interesting themes and relationships
• Minimal demands on user input
• Clear indication that steering is in effect
Support Analytic Flow
• Research or monitoring
  • find important information
  • quicker process
• Analysis
  • convergent/divergent thinking
  • identify new hypotheses
• Drafting/editing reports
  • summarize results
  • capture citation, annotate, print
Bucket of Data Mismatch
• Many tools work on fixed collection
• Users’ data is much more fluid
  • query results this morning
  • more documents this afternoon
  • new query term added
• Users can’t afford to redo work
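The mismatch can be made concrete with a small sketch of an incremental merge: new query results are folded into the working collection without discarding what the analyst has already attached to earlier documents. The data layout (a dictionary of records with a `notes` list) and the function name are assumptions made for illustration only.

```python
def merge_results(collection, new_docs):
    """Fold new query results into an existing collection in place.

    collection maps doc_id -> {"text": str, "notes": list}.
    new_docs maps doc_id -> text.
    Returns the ids that were newly added.
    """
    added = []
    for doc_id, text in new_docs.items():
        if doc_id in collection:
            # Already known: refresh the text but keep the analyst's notes.
            collection[doc_id]["text"] = text
        else:
            collection[doc_id] = {"text": text, "notes": []}
            added.append(doc_id)
    return added


# Morning: query results, with a note the analyst has already written.
collection = {
    "a1": {"text": "Launch failure report", "notes": ["follow up on guidance software"]},
    "a2": {"text": "Insurance impact", "notes": []},
}
# Afternoon: more documents arrive, including one already in the collection.
afternoon = {"a2": "Insurance impact (updated)", "a3": "Inquiry board findings"}
newly_added = merge_results(collection, afternoon)
```

Only `a3` is reported as new, and the note on `a1` survives, so the morning's work does not have to be redone.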
Data: the Good, the Bad, and the Ugly
• Ideal is not real world
• Tags in “wrong” place
• Meta data within text
• Missing field labels
• “Is it useful on my data?”
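A tolerant parser is one way to cope with the data problems listed above. The sketch below assumes a hypothetical record format with optional `TITLE:`, `DATE:`, and `SOURCE:` labels and ISO-style dates. The slides do not specify the real formats, so the patterns and names here are illustrative only.

```python
import re

# Hypothetical field labels; real feeds would need their own patterns.
LABELED = re.compile(r"^(TITLE|DATE|SOURCE):\s*(.*)$")
DATE_IN_TEXT = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")


def parse_record(raw):
    """Parse one record, tolerating misplaced tags and missing field labels."""
    fields = {}
    body_lines = []
    for line in raw.splitlines():
        match = LABELED.match(line.strip())
        if match:
            # Accept a labeled field wherever it appears, even mid-body.
            fields.setdefault(match.group(1).lower(), match.group(2))
        else:
            body_lines.append(line)
    body = "\n".join(body_lines).strip()
    if "date" not in fields:
        # Missing label: fall back to a date embedded in the text, if any.
        found = DATE_IN_TEXT.search(body)
        if found:
            fields["date"] = found.group(1)
    fields["body"] = body
    return fields


# Hypothetical record: tag in the "wrong" place, no DATE label, date in the text.
raw = "Rocket lost after launch\nSOURCE: wire\nThe launch on 1996-06-04 failed."
record = parse_record(raw)
```

Fields that cannot be recovered, such as the title here, are simply absent from the result rather than guessed.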
Conclusions
• Information visualization can provide useful benefits for analysts
• Need features to match user process
• Need careful bridge to other user tools
• Address challenges, even if not central to tool