
Presentation at IPAW'08: Henning Bergmeyer


Presentation Transcript


  1. "A Python Library for Provenance Recording and Querying"
     "Requirements for a Provenance Visualization Panel"
     Presentation at IPAW'08: Henning Bergmeyer

  2. Overview
     • Brief Overview: Provenance System
     • A Python Library for Provenance Recording and Querying
       • Usage Examples
         • Initializing, Recording, Querying, Extending
       • Architecture
     • Requirements for a Provenance Visualization Panel
       • User Groups and Intentions
       • Graphical Representation and Exploration
       • Requirements

  3. Main Model Concepts of the Provenance System
     "Grid Provenance", PReServ 1.0 (University of Southampton)
     • Interactions between actors
     • Relationships (1 subject, 1..n objects, 1 relation type)
       • are dependencies between interactions (e.g. cause-and-effect)
       • describe internal, otherwise hidden functionality of actors
     • Actor States
       • are assertions about internal states of actors
     • Interaction Records
       • complete documentation of an interaction through assertions of all influencing incidents and dependencies
     • Tracers
       • unique markers that serve to identify individual workflow executions
       • distributed along message paths
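The model concepts on slide 3 can be pictured as plain data types. The following is a minimal sketch, not the PReServ schema: every class and field name is hypothetical and chosen only to mirror the bullets (a relationship has one subject, one or more objects and one relation type; a tracer marks one workflow execution).

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interaction:
    """A message exchanged between two actors (hypothetical fields)."""
    source: str
    sink: str
    tracer: str  # marker identifying one workflow execution


@dataclass(frozen=True)
class Relationship:
    """One subject, 1..n objects, one relation type."""
    subject: Interaction
    objects: tuple
    relation_type: str

    def __post_init__(self):
        # Enforce the "1..n objects" cardinality from the slide.
        if len(self.objects) < 1:
            raise ValueError("a relationship needs at least one object")


@dataclass(frozen=True)
class ActorState:
    """An assertion about the internal state of an actor."""
    actor: str
    content: str


@dataclass
class InteractionRecord:
    """Everything asserted about one interaction."""
    interaction: Interaction
    relationships: list = field(default_factory=list)
    actor_states: list = field(default_factory=list)


def group_by_tracer(records):
    """Collect records per workflow execution, keyed by tracer."""
    executions = defaultdict(list)
    for record in records:
        executions[record.interaction.tracer].append(record)
    return dict(executions)
```

Grouping by tracer is what lets a viewer separate individual workflow executions that passed through the same actors.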

  4. Main System Concepts of the P-System
     • Distribution
     • Several connected P-Stores
     • differentiation of asserter views

  5. "A Python Library for Provenance Recording and Querying" (Roland Gude, Carsten Bochner)

  6. A Python Library for Provenance Recording and Querying
     • Open Source: http://sourceforge.net/projects/provenance-csl/
     • Purpose
       • easy Provenance recording and querying for Python applications or applications with an interface to Python
       • independent of Java on the client side
     • Examples for
       • Initialization
       • Recording
       • Querying
       • Extending own types

  7. Code Examples: Initialization
         from provenance.api import *
     • looks like bad coding style at first
     • but automatic lazy-loading of required modules prevents severe performance losses
         cl = client.Client("http://localhost:8080", asserter="me")
     • That's it!
     • A trace file can be specified to log communication with the P-Store
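The lazy-loading remark on slide 7 can be illustrated with a small proxy that defers the real import until a module member is first used. This is only a sketch of the general technique using the standard library; `LazyModule` is a hypothetical name, and the slides do not show how the library itself implements lazy loading.

```python
import importlib


class LazyModule:
    """Proxy that imports the real module on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    @property
    def loaded(self):
        return self._module is not None

    def __getattr__(self, attr):
        # Only called for names not found on the proxy itself,
        # i.e. for members of the wrapped module.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# Usage: creating the proxy is cheap; the import happens on first use.
json_lazy = LazyModule("json")
print(json_lazy.loaded)          # False
print(json_lazy.dumps([1, 2]))   # [1, 2]
print(json_lazy.loaded)          # True
```

With proxies like this, a wildcard import can expose many names while paying the import cost only for the modules an application actually touches.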

  8. Code Example: Recording
         # pAssID, rel_type, doc_style, sink, source and the *_content_0
         # values are defined elsewhere (not shown on the slide).
         subj = utils.createSubjectId(1, "dataAccessor", "parametername")
         objlist = [utils.createObjectId(
             utils.createInteractionKey("http://sink", "http://source"),
             pAssID, 'anything', 'dataAccessor', 'parameter', 'isSender')]
         keys, response = self.cl.record([
             [utils.createActorState(a_content_0, doc_style),
              utils.createRelationship(subj, rel_type, objlist),
              utils.createInteraction(m_content_0, doc_style),
              utils.createInteraction(xml_content_0)]
         ], "isSender", sink, source)
         res = interfaces.IRecordAck(response)

  9. Code Example: Querying
         queryString = "for $n in $ps:pstruct return $n"
         response = self.cl.query(queryString)
         result = interfaces.IQueryAck(response)
     • Afterwards "result" contains an XML structure containing all "pstructs" available in that store.
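Once a query result is available as XML text, it can be inspected with the standard library. The sketch below assumes a made-up response document: the element names, the `id` attribute and the absence of namespaces are all assumptions for illustration, since the slides do not show the actual P-Store result schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body; real element names and namespaces
# depend on the P-Store schema and are not shown in the slides.
SAMPLE_RESULT = """
<result>
  <pstruct id="a"><interactionRecord/></pstruct>
  <pstruct id="b"><interactionRecord/><interactionRecord/></pstruct>
</result>
"""


def summarize_pstructs(xml_text):
    """Return {pstruct id: number of child elements} for a result document."""
    root = ET.fromstring(xml_text)
    return {p.get("id"): len(p) for p in root.iter("pstruct")}


print(summarize_pstructs(SAMPLE_RESULT))
```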

  10. Architecture
      • SOAP interface translated from WSDL by ZSI
      • pyProtocols
        • Python lacks the OO concept of "interfaces"
        • pyProtocols allows protocol definitions and automatic adaptation
        • used to make the SOAP interface transparent to the user
      • Lazy-loading
        • PEAK framework
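The idea behind "protocol definitions and automatic adaptation" can be sketched without pyProtocols: an interface is declared, adapters are registered for source types, and an `adapt` function picks the right one. Everything below (`register_adapter`, `adapt`, `StringAddress`, the registry) is a hypothetical stand-in written with the standard library, not the pyProtocols API.

```python
from abc import ABC, abstractmethod

_ADAPTERS = {}  # (source type, target interface) -> adapter factory


class IAddress(ABC):
    """Interface: anything that can present itself as an address string."""

    @abstractmethod
    def get_as_string(self):
        ...


def register_adapter(source_type, interface, factory):
    """Declare that factory(obj) turns a source_type into the interface."""
    _ADAPTERS[(source_type, interface)] = factory


def adapt(obj, interface):
    """Return obj if it already provides the interface, else an adapter."""
    if isinstance(obj, interface):
        return obj
    # Walk the class hierarchy so adapters for base types also apply.
    for klass in type(obj).__mro__:
        factory = _ADAPTERS.get((klass, interface))
        if factory is not None:
            return factory(obj)
    raise TypeError(
        f"cannot adapt {type(obj).__name__} to {interface.__name__}")


class StringAddress(IAddress):
    """Adapter that lets a plain str be used where IAddress is expected."""

    def __init__(self, text):
        self._text = text

    def get_as_string(self):
        return self._text


register_adapter(str, IAddress, StringAddress)

# Usage: callers pass plain strings; library code adapts them on entry.
address = adapt("http://localhost:8080", IAddress)
print(address.get_as_string())
```

Adapting at the API boundary is what lets users pass ordinary Python values while the SOAP type codes stay hidden behind the interface.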

  11. Code Example: Extending Types
          class IAddress(IZSITypeCode):
              """ interface for string typecodes """
              def getAsString(self):
                  """ returns a String with the Value of the Stringlike. """

          IString = protocols.protocolForType(basestring, [])

          class AddressAdapter(object):
              protocols.advise(instancesProvide=[IAddress],
                               asAdapterForProtocols=[IString])

              def __init__(self, string):
                  self._delegate = serverAPI.Address(string.__str__())

              def getAsString(self):
                  return self._delegate.__str__()

              def toTypeCode(self):
                  return self._delegate

  12. Requirements for a Provenance Visualization Panel (Markus Kunde, Henning Bergmeyer)

  13. Motivation
      • Determine requirements for a Provenance visualization panel
      • Requirement to document Provenance in our projects (e.g. AeroGrid)
      • No specification for concrete use of the documented provenance, yet
      • => Tool at least for general browsing of low-level documentation is needed
      • Raw provenance data in XML is hard to browse
      • Verification of records
      • Experimental browsing to determine better query and interpretation methods
      • Panel provided by project "Grid Provenance" not suitable

  14. Approach
      • Identify User Groups
        • User interests (What do they want to explore?)
        • User intentions (Why do they want to explore that?)
      • Analyse the Provenance data structure
        • Elements
        • Properties
        • Connections
        • Scale
      • Determine visualization and analysis methods
        • What information to be shown
        • Where to show it
        • When, for how long, static or animated
        • Clear and consistent semantics for visual elements
      • Determine exploration strategy

  15. Identifying User Groups
      • Interest / Scope
        • What documentation is asked for?
        • What documentation is a user allowed to see?
        • Abstraction high-level border, range of access
      • Intention
        • Why is that documentation asked for?
        • Abstraction low-level border, type and level of detail of required documentation

  16. Identified User Groups
      • General User
        • Scientist, Engineer, Portal User
        • Interest: own work, own results, origin of used data
        • Intentions: reliability and authenticity of results, reproducibility
      • Designer
        • Software Engineer, Workflow Developer
        • Project related, all origins, monitored system, partner-made components
        • workflow behavior, service interaction, product evolution
      • Manager
        • Workflow Provider, Provenance Analyst, User Support
        • all assigned user and system Provenance
        • correctness of services, interpretation support, quality of the P-system
      • Administrator
        • Developer / Admin of Provenance System
        • all P-data available in connected P-stores
        • building the P-system and maintaining its function

  17. User Analysis Intentions
      • Process: Evaluation of the approach of a workflow; actors, interactions, sequence of process steps
      • Results: Quality of intermediate and end results of processes; dependencies of inputs and outcome
      • Relationship: Analysis of the evolution of data; relationships of interactions or actors
      • Time Line: Finding performance bottlenecks, improving workflows; evolution of results, actor behavior
      • Participation: Trust in the result; participating actors
      • Comparison: Validate correctness of processes and results by comparing documented executions with reference structures, like processes, views on interactions, results
      • Interpretation: Custom visualization requirements, deriving knowledge from Provenance data; custom, probably all aspects
      => Exploration required

  18. Exploration
      • Difficulty in a large scale graphic exploration system:
        • Where to start?
      • Begin with an overview
      • Select processes, interaction channels or actors
      • Fade out the rest and choose specific detail visualizations.
      • Read application specific content

  19. Elements

  20. Actor / Asserter Views

  21. Focus on Interaction
      Process Map (inspired by tube map)
      • Processes
      • Participating Actors
      • Bottlenecks
      Interaction Stretch
      • Individual Interactions
      • Relationships and order

  22. Combined Flow-Chart
      • Typical Data Flow Graph
      • Shows directions of message flows
      • No notion of time
      => Requires previous selection of recorded process.
      System / Process Context
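A data flow graph of the kind described on slide 22 can be derived by collapsing recorded interactions into directed edges and discarding their order. This is a minimal sketch under the assumption that each interaction is available as a (source, sink) pair; the function names and the sample actors are hypothetical.

```python
from collections import Counter


def build_flow_graph(interactions):
    """Collapse (source, sink) message pairs into weighted directed edges.

    The order of the interactions is deliberately ignored, which is why
    the resulting graph carries no notion of time.
    """
    return Counter((source, sink) for source, sink in interactions)


def successors(graph, actor):
    """Actors that receive messages from `actor`, in sorted order."""
    return sorted({sink for (source, sink) in graph if source == actor})


# Usage with made-up actors from one selected process.
recorded = [("portal", "solver"), ("solver", "store"), ("portal", "solver")]
graph = build_flow_graph(recorded)
print(graph[("portal", "solver")])   # 2
print(successors(graph, "portal"))   # ['solver']
```

Edge weights give a rough indication of how heavily a message path is used, while the time-ordered views cover what this graph leaves out.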

  23. Process Aerial
      • Find individual executions of selected processes
      • Find anomalies
      • Show only interesting actor states and relationships
      • Scrolling up and down along time axis

  24. Visualisation Methods

  25. Graphical and Exploration Requirements
      • distinct, consistent representations of documentation elements to allow intuitive interpretation
      • extensible support of different layout methods
        • adjustment of alignment helps to interpret
      • switching of scope and detail
      • proxy displays for large data sets
        • e.g. navigation maps
      • mixing and migrating of layouts (animated)

  26. Architectural Requirements
      • support of VO management
        • store access
        • actor/asserter views
      • caching and merging of query results
      • extensible architecture
        • layout methods
        • element representations
        • exploration methods
        • "content" support
      • GUI abstraction
        • Web Portals
        • Desktop Applications
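The requirement "caching and merging of query results" could look roughly like the sketch below when several P-Stores are queried. It is an illustration only: `MergingQueryCache`, the store callables and the (key, payload) result shape are assumptions, not part of any design shown in the slides.

```python
class MergingQueryCache:
    """Cache per-store results and merge them, dropping duplicate records."""

    def __init__(self, stores):
        # stores: mapping of store name -> callable(query) returning
        # a list of (record key, payload) pairs (hypothetical shape).
        self._stores = stores
        self._cache = {}

    def _fetch(self, name, query):
        cache_key = (name, query)
        if cache_key not in self._cache:
            self._cache[cache_key] = self._stores[name](query)
        return self._cache[cache_key]

    def query(self, query):
        """Ask every store, reusing cached answers, and merge by record key."""
        merged = {}
        for name in self._stores:
            for key, payload in self._fetch(name, query):
                merged.setdefault(key, payload)  # first store wins
        return merged


# Usage with two fake stores that share one record.
def store_a(query):
    return [("r1", "x"), ("r2", "y")]


def store_b(query):
    return [("r2", "y"), ("r3", "z")]


cache = MergingQueryCache({"a": store_a, "b": store_b})
print(cache.query("all records"))
```

Letting the first store win on duplicate keys is the simplest merge policy; a real panel would also need cache invalidation and a rule for conflicting assertions.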
