1 / 26

Lecture 01: Introduction

Lecture 01: Introduction. September 5, 2012 COMP 250-2 Visual Analytics and Provenance. Motivation: What is in a User’s Interactions?. Keyboard, Mouse, etc. Types of Human-Visualization Interactions Text editing (input heavy, little output)

jalen
Download Presentation

Lecture 01: Introduction

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 01:Introduction September 5, 2012 COMP 250-2Visual Analytics and Provenance

  2. Motivation:What is in a User’s Interactions? Keyboard, Mouse, etc • Types of Human-Visualization Interactions • Text editing (input heavy, little output) • Browsing, watching a movie (output heavy, little input) • Visual Analysis (closer to 50-50) Input Visualization Human Output Images (monitor)

  3. Provenance: Definitions • Provenance (according to Webster): • origin, source • the history of ownership of a valued object or work of art or literature • Example: Has anyone traced the provenances of these paintings?

  4. Provenance: Computer Science • Examples in CS relating to provenance: • Undo / Redo • Diff / Revision history • History logging (e.g., Wikipedia) • Interaction logging • Data annotation • Database queries and results (Oracle Flashback) • Screen capturing? • Knowledge management • Etc.

  5. Problems and Challenges • There are lots of people working on problems pertaining to provenance. For example: • Open Provenance Model (OPM): common framework for describing information provenance (e.g., Wikipedia) • Database: (1) how data is derived from large computation, (2) how data is copied / synthesized from one part of the database to another • Knowledge management / Ontology: the study of relationships between data, concepts, processes, information, etc. (e.g., Scientific Workflow)

  6. What is “Analytic Provenance”? • Analytic Provenance (AP): similar to workflow, is the provenance of the analysis (the steps of the analysis). This includes: • The record of the data and data manipulation • The record of the user’s interactions • The record of what the user sees • The record of the computational products (both the visualization and the data) • The record of the user’s analysis results • How is provenance research in visual analytics similar and different from the others?

  7. Similarities • At a high level, both are attempts at keeping track of the changes of data and information over time. • Along the same line, one common goal is to verify and validate the current information or the results of a query / analysis. • Others?

  8. Differences • Exploration with high interactivity. • Many provenance systems utilize state diagrams • Similar to depicting “scientific workflow” • What is a “state” in a highly exploratory system? • Often, the “states” are data dependent • Computation can be costly • Storing only the procedures do not always lead to the same result due to computational powers • Storing of “insight” • How to determine which interactions are useful? (the labeling problem…)

  9. Why Study Analytic Provenance? • Damn good question… • A problem that has plagued me for years • Validation (of the results) –kind of lame • Verification (of the process) – kind of lame • Training – Less lame, but still not very sexy • These can all be (somewhat) solved with video-capturing the screen (with some annotation and/or microphone)

  10. Why Study Analytic Provenance? • More interestingly, as a post hoc analysis: • Redo of a certain task • For example, I just did some analysis on the Census of Massachusetts. Can I reapply the same analysis to Rhode Island Census? • What analysis steps are useful? • For example, in solving a difficult problem, is there a key step (or steps) that lead to success (and to failure)? • Building a knowledge-base • If we believe that interactions encodes reasoning and analysis, can I aggregate all interactions/analyses into a knowledge-base?

  11. Why Study Analytic Provenance? • As an ad hoc analysis (that is, in real time), determine a user’s “analytical needs”: • Related to “adaptive” visualization or interfaces, but goes beyond adapting the UI. For example: • Analysis of search space: what data space has the user explored (and not explored)? • Analysis of bias: is the user favoring some hypothesis and ignoring evidence? • Essentially, anything that can give us some indication of who the user is (in terms of “ICD3”)

  12. Challenges? • What are some challenges that need to be solved in order to accomplish the previously stated goals? • How would you categorize them? • CHI 2010 Analytic Provenance workshop

  13. Questions?

  14. Provenance and Scientific Workflows:Challenges and Opportunities • Susan B.Davidson (Upenn) • Juliana Freire (Utah/NYU) • SIGMOD 2008

  15. Prospective Provenance: Steps taken in the scientific workflow • Retrospective Provenance: • The environment in which the steps were taken • Annotated Provenance: • Some notes entered by the user • Note the small squares in the upper left corner of each box (representing each step)

  16. Ways to Store Provenance • RDF (resource description file) • SPARQL (query language for RDF)

  17. Ways to Store Provenance • OWL (Web Ontology Language)

  18. Yuck… • Knowledge vs. information vs. data • Example about grass, rabbit, and wolf

  19. Opportunities • Provenance and Scientific Publications: • Reproducibility for scientific experiments • Provenance and Data Exploration: • Simplify exploratory processes (graph reduction) • Social Network Analysis: • Not sure how this is different from 1, but applied to SNA (and crowd sourcing) • Provenance in Education: • “Teaching is one of the killer applications of provenance-enabled workflow systems”

  20. User starts with an analogy template (left) • The user then reapplies the workflow to a different dataset (right). The system automatically figures out inconsistencies or problems with the reapplication and either (a) fixes it or (b) warns the user

  21. Open Problems • Information Management Infrastructure: • Usability issue relating to how to use information management systems • Provenance Analytics and Visualization: • Visualize and data-mine provenance. • Interoperability: • Steps of workflow generated from different software and computers • Connecting Database and Workflow: • Does provenance make sense without access to the (a) original or the (b) derived data

  22. Questions?

  23. Date and Time • When and where should we meet??

  24. Structure of the Course • Open-ended • The topic of provenance is both old and new… • Read existing papers (that I think are important), and • You suggest new ones • Identify research opportunities • Daily schedule (tentative): • 1:30 – 2:30 3-4 presentations of 15 minutes each • 2:30 – 2:45 break • 2:45 – 4:00 group discussion

  25. Time Line • See course website • http://www.cs.tufts.edu/comp/250VA/ • Sign up for: • 2 scribes positions (on different weeks) • 2 papers to present (on different weeks) • And on those weeks, you will lead the discussions • 1 week where your group (of 3) will identify papers and lead discussions

  26. Questions?

More Related