Towards a new approach to Document Collaboration

Towards a new approach to Document Collaboration. A Concurrency Control mechanism for XML databases. Jan Hidders U. Antwerp, Belgium. Stijn Dekeyser USQ, Australia. Introduction: XML Running Example Classic CC methods Data Models User Data Model Scheduler Data Model. Path Lock Schemes


Presentation Transcript


  1. Towards a new approach to Document Collaboration: A Concurrency Control mechanism for XML databases. Jan Hidders (U. Antwerp, Belgium), Stijn Dekeyser (USQ, Australia)

  2. Contents. Part I: • Asynchronous collaborative work • cvs, change tracking, cscw, ems • Synchronous collaborative work • A new approach: use cases and clients • Discussion. Part II: • Introduction: XML • Running Example • Classic CC methods • Data Models • User Data Model • Scheduler Data Model • Path Lock Schemes • Propagation (PL-PROP) • Satisfiability (PL-SAT) • Schedulers • Commit scheduler • Conflict scheduler

  3. Part I: Asynchronous collaborative work • CVS • Version repository + concurrency • Text, line-based • Human intervention • Change Tracking • MS Office, OOo et al. • Human management + intervention

  4. Asynchronous collaborative work • CSCW & EMS • Many different systems • Docs distributed wholly at all sites • Messages update docs • Human intervention needed

  5. Synchronous collaborative work • CSCW & EMS • XML-enabled RDBMS • Traditional table locking • Native XML databases • Document-based locking

  6. A new approach: use cases • Document Authoring • May move section while updating • CMS • XForms causes overwrite • Design & art • May update same parts! • Programming • Change function name globally

  7. A new approach: servers • Implementation: • Native or XML-enabled db • Use of Path Locks in transactions • Document collaboration protocol (dcp) • Type: • Enhanced web server, e.g. in Apache • or P2P implementation, e.g. in Word • Extra features: • Access Control (Elena Ferrari et al.) • Elements contain user rights • version management (Epic) • Elements contain version information

  8. A new approach: clients • General purpose XML editors • E.g. Epic or XMLSpy • Specific purpose XML editors • E.g. Autocad or Excel • Issues: • Intelligently query section to be updated • Commit when possible • Refresh content

  9. Discussion

  10. Part II: Introduction: XML • XML is an evolution of document language technology • 1969: GML (Generalized Markup Language) • 1974: SGML (Standard …), Goldfarb • 1986: ISO standard • 1989: HTML, Berners-Lee, Berglund, Cailliau at CERN • XML is much simpler than SGML (10% of the spec) • Now: much more data is stored as XML • Enter the XML-DBMS age…

  11. Running example (1/2)

      <document id="0">
        <person id="1" age="55">
          <name> Peter </name> <addr> Parklane 7 </addr>
          <child>
            <person id="3" age="22">
              <name> John </name> <addr> Unistreet 1 </addr>
              <hobby> swimming </hobby> <hobby> cycling </hobby>
            </person>
          </child>
          <child>
            <person id="4" age="7">
              <name> David </name> <addr> Parklane 7 </addr>
            </person>
          </child>
        </person>
        <person id="2" age="43">
          <name> Mary </name> <addr> Parklane 7 </addr>
          <hobby> painting </hobby>
        </person>
      </document>

  12. Running example (2/2) Queries: • /document/person//hobby • //child//hobby

  13. Classic CC methods (1/3) Table locking • How: On update, the whole table is locked • Precludes phantoms • XML: parent-child relation in 1(*) table • Example: • Query: //child//hobby • Update that should be allowed: change a hobby element not occurring under child • Not possible when the entire table is locked

  14. Classic CC methods (2/3) Predicate locking • How • Locks in the form of a predicate: name="person" • Predicate indicates what has been read • Example: • Query: /document/person//hobby • Update (should be allowed): create person under root element • Update (should be refused): create hobby under this person • Neither is possible, since the first predicate locks all persons under the root

  15. Classic CC methods (3/3) Hierarchical locking • How • Lock granule → intention lock on ancestors • Change granule → exclusive lock on X Tree locking • How • Lock node → lock parent of node • Add node under X → exclusive lock on X And ... query //A//B requires shared locks on the entire tree

  16. User Data Model (1/2) Data Model • (XPath-tree) Tree with labelled nodes • NB: we ignore ordering of children Path Expressions • Sequence of tag names and wild-cards (*) • Separated by / (child) and // (descendants). • person/child • *//person/child

  17. User Data Model (2/2) Node-correctness: Thou shalt only use nodes which you have obtained via an addition or via a query. Query • Q(n,p): yields the set of nodes which are reachable from n via path expression p. • Q(n,*//hobby) Addition • A(n,a): add a node with name a under n • A(n,hobby) • Fails if n is not there; otherwise yields the new node. Deletion • D(n): delete n • Fails if n has children. Commit • C(): end of transaction
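The operations on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Node`, `parse` and `descendants` are hypothetical, attributes and text content of the running example are ignored, path expressions are relative (no leading slash), and `//` is read as "any proper descendant", which is how the read-lock propagation rules later in the deck treat it.

```python
import re


class Node:
    """A labelled tree node; child order is kept but carries no meaning."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        self.present = True  # becomes False once the node is deleted


def parse(path):
    """Split 'a//b/c' into [(label, separator_before_label), ...]."""
    parts = re.split(r"(//|/)", path)
    steps = [(parts[0], "/")]  # the first step is an implicit child step
    for i in range(1, len(parts), 2):
        steps.append((parts[i + 1], parts[i]))
    return steps


def descendants(n):
    """All proper descendants of n."""
    out, stack = [], list(n.children)
    while stack:
        m = stack.pop()
        out.append(m)
        stack.extend(m.children)
    return out


def Q(n, path):
    """Nodes reachable from n via the path expression."""
    frontier = [n]
    for label, sep in parse(path):
        found = {}  # keyed by identity so that no node is returned twice
        for m in frontier:
            for c in (m.children if sep == "/" else descendants(m)):
                if label in ("*", c.label):
                    found[id(c)] = c
        frontier = list(found.values())
    return frontier


def A(n, a):
    """Add a node named a under n; fails if n is not there."""
    if not n.present:
        raise ValueError("parent node is not there")
    child = Node(a, n)
    n.children.append(child)
    return child


def D(n):
    """Delete n; fails if n has children."""
    if n.children:
        raise ValueError("node still has children")
    if n.parent is not None:
        n.parent.children.remove(n)
    n.present = False


# The running example, reduced to element names.
root = Node("root")
doc = A(root, "document")
peter, mary = A(doc, "person"), A(doc, "person")
john = A(A(peter, "child"), "person")
david = A(A(peter, "child"), "person")
for person, tags in ((peter, ["name", "addr"]),
                     (john, ["name", "addr", "hobby", "hobby"]),
                     (david, ["name", "addr"]),
                     (mary, ["name", "addr", "hobby"])):
    for tag in tags:
        A(person, tag)
```

With this tree, `Q(root, "document/person//hobby")` finds every hobby below a top-level person, while `Q(root, "*//child//hobby")` finds only the hobbies that sit below a child element.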

  18. Scheduler’s Data Model (1/2) • Instance Graph • Acyclic graph with labelled nodes • Nodes labelled with a delete set: • Identifiers of transactions which deleted the node. • Actual Instance • Subgraph of instance graph formed by the nodes with an empty delete set • Is always an XPath-tree

  19. Scheduler’s Data Model (2/2) Query • Q(n,p): yields the set of nodes which are, in the actual instance, reachable from n via path expression p. Addition • A(n,a): adds node with name a under n • Empty delete set • Fails if n is not in the actual instance. Deletion • D(n): add transaction to the delete set of n • Fails if n has children in the actual instance. Commit • C(): delete nodes with transaction in delete set
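A minimal sketch of the scheduler-side operations follows, assuming the instance graph can be held as a plain tree with parent pointers (the slides only say it is acyclic). The names `SNode`, `in_actual_instance` and `live_children` are hypothetical, and locking is left out entirely.

```python
class SNode:
    """Node of the instance graph, labelled with a delete set."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        self.delete_set = set()  # ids of transactions that deleted this node


def in_actual_instance(n):
    """A node belongs to the actual instance while its delete set is empty."""
    return not n.delete_set


def live_children(n):
    return [c for c in n.children if in_actual_instance(c)]


def add(n, a):
    """A(n, a): fails if n is not in the actual instance."""
    if not in_actual_instance(n):
        raise ValueError("n is not in the actual instance")
    child = SNode(a, n)  # starts with an empty delete set
    n.children.append(child)
    return child


def delete(txn, n):
    """D(n): only marks the node; fails if n has children in the actual instance."""
    if live_children(n):
        raise ValueError("n has children in the actual instance")
    n.delete_set.add(txn)


def commit(root, txn):
    """C(): physically drop every node that txn has in its delete set."""
    stack = [root]
    while stack:
        n = stack.pop()
        n.children = [c for c in n.children if txn not in c.delete_set]
        stack.extend(n.children)
```

The point of the delete set is that a deleted node stays in the instance graph until the deleting transaction commits, so the scheduler can still reason about it while other transactions are running.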

  20. PL propagation scheme (1/3) Read Locks • rl(n, p) • e.g. rl(n12, //a//b) Required read locks: • For Q(n,p) request rl(n,p) and do .... • Read lock propagation: • rl(n, a/p) -> rl(n', p) if n' is an a-child of n • rl(n, */p) -> rl(n', p) if n' is a child of n • rl(n, a//p) -> rl(n', *//p) and rl(n', p) if n' is an a-child of n • rl(n, *//p) -> rl(n', *//p) and rl(n', p) if n' is a child of n • Recomputing propagation on update is very easy (!)
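The four propagation rules can be applied recursively from the node where the query starts. The sketch below is one possible reading, not the authors' implementation: `Node`, `propagate` and `paths_at` are hypothetical names, and a lock is represented simply as a `(node, path)` pair.

```python
import re


class Node:
    def __init__(self, label, parent=None):
        self.label = label
        self.children = []
        if parent is not None:
            parent.children.append(self)


def propagate(n, path, locks=None):
    """Request rl(n, path) and push it down the tree using the four rules."""
    if locks is None:
        locks = set()
    if (n, path) in locks:
        return locks
    locks.add((n, path))
    m = re.match(r"([^/]+)(//|/)(.+)", path)
    if m is None:  # a single step such as 'hobby': nothing left to propagate
        return locks
    head, sep, rest = m.groups()
    for child in n.children:
        if head in ("*", child.label):
            propagate(child, rest, locks)  # rl(n', p)
            if sep == "//":
                propagate(child, "*//" + rest, locks)  # rl(n', *//p)
    return locks


def paths_at(locks, n):
    """The path expressions locked at node n."""
    return {p for (node, p) in locks if node is n}


# One branch of the running example: root/document/person/child/person/hobby.
root = Node("root")
document = Node("document", root)
person = Node("person", document)
child = Node("child", person)
inner = Node("person", child)
hobby = Node("hobby", inner)

locks = propagate(root, "*//child//hobby")
```

On this branch, the child node ends up holding `*//child//hobby`, `child//hobby`, `*//hobby` and `hobby`, which illustrates why the number of read locks for a query grows with both the path length and the size of the instance.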

  21. PL propagation scheme (2/3) Example: [Figure: the running example tree, from the document root down through document, person, child, age, name, addr and hobby nodes, with nodes annotated by the read locks propagated from *//child//hobby at the document root; the annotations shown are *//child//hobby, child//hobby and *//hobby]

  22. PL propagation scheme (3/3) Write Locks • wl(n, a) or wl(n, *) Required write locks: • For A(n,a) request: • wl(n,a) • For D(n) request: • wl(n, *) if n exists • wl(n', a) if n is an a-child of n' Conflict rules: • wl(n, *) and wl(n, a) conflict with rl(n, *) and rl(n, a) • All others do not. (... write-write conflicts?) Number of locks: • updates: O(1) • queries: O(|p|.|G|)
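The write-lock requirements and the conflict rule can be written down directly. The sketch below rests on an interpretation that the slide does not spell out: a write lock and a read lock conflict only when they sit on the same node, the read lock is a single step (a label or `*`), and the two labels are equal or one of them is the wildcard. Locks are plain tuples and node identifiers are strings; all function names are hypothetical.

```python
def locks_for_add(n, a):
    """A(n, a) requires wl(n, a)."""
    return {("wl", n, a)}


def locks_for_delete(n, parent=None, label=None):
    """D(n) requires wl(n, *) and, if n is an a-child of n', wl(n', a)."""
    locks = {("wl", n, "*")}
    if parent is not None:
        locks.add(("wl", parent, label))
    return locks


def conflicts(write_lock, read_lock):
    """Assumed reading of the conflict rule: same node, one-step read lock,
    and matching labels (equal, or either one is '*')."""
    _, w_node, w_label = write_lock
    _, r_node, r_path = read_lock
    if w_node != r_node or "/" in r_path:
        return False
    return w_label == r_path or "*" in (w_label, r_path)
```

Because propagation has already pushed every query down to one-step locks on the nodes it touches, this check only ever compares two locks on the same node, which is what keeps updates at a constant number of locks.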

  23. Path-lock satisfiability scheme Read locks • rl(n, p) (see PL-PROP) • For Q(n,p) only rl(n,p) is necessary Write locks • wl(n,a) and wl(n,*) • For A(n,a) and D(n) necessary as with PL-PROP. Conflict rules • wl(n,a) and wl(n,*) conflict with rl(n',p) if • there is a path from n' to n with label-list L and • L/a (or L/*) satisfies the path expression p Number of locks: • updates: O(1) • queries: O(1)
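The satisfiability test can be sketched as a small recursive matcher. The assumptions here are mine: the label-list L holds the labels of the nodes strictly below n' down to and including n, `//` means "any proper descendant", and a `*` written by wl(n, *) matches any step of the path expression. The names `label_list`, `satisfies` and `conflicts` are hypothetical.

```python
import re


class Node:
    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent


def parse(path):
    """Split 'a//b/c' into [(label, separator_before_label), ...]."""
    parts = re.split(r"(//|/)", path)
    steps = [(parts[0], "/")]
    for i in range(1, len(parts), 2):
        steps.append((parts[i + 1], parts[i]))
    return steps


def label_list(ancestor, n):
    """Labels from just below ancestor down to n, or None if there is no path."""
    labels = []
    while n is not ancestor:
        if n is None:
            return None
        labels.append(n.label)
        n = n.parent
    return labels[::-1]


def satisfies(labels, steps):
    """Does the label sequence match the whole path expression?"""
    if not steps:
        return not labels
    (label, sep), rest = steps[0], steps[1:]
    # '/' must match the next label; '//' may skip any number of labels first.
    positions = range(len(labels)) if sep == "//" else range(min(1, len(labels)))
    return any(
        (label == "*" or labels[i] in (label, "*")) and satisfies(labels[i + 1:], rest)
        for i in positions
    )


def conflicts(w_node, w_label, r_node, r_path):
    """wl(w_node, w_label) versus rl(r_node, r_path)."""
    prefix = label_list(r_node, w_node)
    if prefix is None:
        return False
    return satisfies(prefix + [w_label], parse(r_path))
```

Under this reading, a read lock `rl(root, *//child//hobby)` blocks adding a hobby to a person nested under a child element but not to a top-level person, which is exactly the update that table locking could not allow.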

  24. Commit scheduler • Transactions make requests for operations: • Query, Addition, Deletion, Commit, Roll-back • Scheduler accepts a request only if • the operation does not fail, and • the required locks do not conflict with existing locks • On Commit, the locks are released that belong to • the committing transaction, and • the nodes deleted by the transaction • On Roll-back, the locks are released that belong to • the transaction being rolled back
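The lock bookkeeping of such a scheduler might look as follows. This is a sketch under assumptions: `CommitScheduler` and `simple_conflict` are hypothetical names, locks are `(kind, node, label)` tuples, a transaction never conflicts with its own locks, and the check that the operation itself does not fail is left to the caller. Any conflict predicate (PL-PROP or PL-SAT) could be passed in instead of the placeholder.

```python
def simple_conflict(write, read):
    """Placeholder rule: a write lock clashes with a one-step read lock
    on the same node when the labels match or one of them is '*'."""
    (w_kind, w_node, w_label), (r_kind, r_node, r_path) = write, read
    if w_kind != "wl" or r_kind != "rl" or w_node != r_node:
        return False
    return w_label == r_path or "*" in (w_label, r_path)


class CommitScheduler:
    def __init__(self, conflicts=simple_conflict):
        self.conflicts = conflicts
        self.held = {}  # transaction id -> set of locks it holds

    def request(self, txn, locks):
        """Grant the locks only if none conflicts with another transaction's."""
        for other, theirs in self.held.items():
            if other == txn:
                continue
            for mine in locks:
                for lock in theirs:
                    if self.conflicts(mine, lock) or self.conflicts(lock, mine):
                        return False
        self.held.setdefault(txn, set()).update(locks)
        return True

    def commit(self, txn, deleted_nodes=()):
        """Release the transaction's locks and all locks on nodes it deleted."""
        self.held.pop(txn, None)
        for t in self.held:
            self.held[t] = {l for l in self.held[t] if l[1] not in deleted_nodes}

    def rollback(self, txn):
        """Release only the locks of the transaction being rolled back."""
        self.held.pop(txn, None)
```

A rejected request leaves the lock table unchanged, so the caller can retry after another transaction commits or rolls back.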

  25. Conflict Scheduler • Scheduler has a Dependency Graph (DG): • arrow t1 --> t2 if a lock of an operation of transaction t1 conflicts with a lock of a preceding operation of t2 • Scheduler accepts request only if • the operation does not fail, and • no cycles appear in the DG • A commit of t1 is not accepted if in the DG arrows depart from t1. • A roll-back of t1 leads to a roll-back of t2 if t2 --> t1 in the DG.
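The dependency-graph bookkeeping can be sketched as below. The class and method names are hypothetical, and one detail is an assumption not stated on the slide: once a transaction commits, the arrows pointing at it are dropped so that the transactions that depended on it can commit in turn.

```python
class DependencyGraph:
    def __init__(self):
        self.edges = {}  # t -> set of transactions that t has an arrow to

    def _reaches(self, src, dst):
        """Is there a directed path from src to dst?"""
        stack, seen = [src], set()
        while stack:
            x = stack.pop()
            if x == dst:
                return True
            if x not in seen:
                seen.add(x)
                stack.extend(self.edges.get(x, ()))
        return False

    def try_add(self, t1, t2):
        """Record t1 --> t2 unless the arrow would close a cycle."""
        if t1 == t2:
            return True
        if self._reaches(t2, t1):
            return False
        self.edges.setdefault(t1, set()).add(t2)
        return True

    def can_commit(self, t):
        """A commit is refused while arrows still depart from t."""
        return not self.edges.get(t)

    def commit(self, t):
        if not self.can_commit(t):
            return False
        self.edges.pop(t, None)
        for deps in self.edges.values():
            deps.discard(t)  # assumption: arrows into a committed txn vanish
        return True

    def rollback(self, t):
        """Roll back t and, transitively, every t2 with t2 --> t."""
        doomed, stack = set(), [t]
        while stack:
            x = stack.pop()
            if x in doomed:
                continue
            doomed.add(x)
            stack.extend(u for u, deps in self.edges.items() if x in deps)
        for x in doomed:
            self.edges.pop(x, None)
        for deps in self.edges.values():
            deps -= doomed
        return doomed
```

Compared with the commit scheduler, conflicting requests are accepted here as long as the graph stays acyclic, at the price of cascading roll-backs.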

  26. Conclusion and Further Research • Commit and conflict schedulers guarantee serialisability • Complexity is determined by the size of the document / instance • Is the order of children a problem? • simulation • write-write conflicts • What about the relocation of subtrees? • Identity of nodes to be taken into account? • No use / knowledge of (entire) instance? • Use of DataGuide • Instance Independent CC for SSD
