
TMSync

TMSync. Topic map-to-topic map updates. Lars Marius Garshol CTO, Ontopia <larsga@ontopia.net> TMRA 2006 2006-10-11. Agenda. Background the problem why TMSync is the solution TMSync in detail what it is how it works Applications what you can do with TMSync Conclusion. The problem


Presentation Transcript


  1. TMSync Topic map-to-topic map updates Lars Marius Garshol CTO, Ontopia <larsga@ontopia.net> TMRA 2006 2006-10-11

  2. Agenda • Background • the problem • why TMSync is the solution • TMSync in detail • what it is • how it works • Applications • what you can do with TMSync • Conclusion

  3. Background • The problem • Solving it with TMSync

  4. The problem • Topic Maps hold out a promise as a great technology for data integration • because of merging, global identifiers, etc • However, dynamic sources are poorly supported at the moment • that is, converting once is easy, but staying in sync is hard • A solution that only supports static integration is near-worthless • in practice, integrated data is nearly always going to need updating from the source • building a one-time conversion is easy • building data integration with update support is hard • so, suddenly data integration with Topic Maps isn’t so easy, after all

  5. Merging is not the solution • Merging in Topic Maps is often thought of in terms of <mergeMap> • this is only useful if you are working from XTM files • <mergeMap> only has an effect when the XTM file is loaded • after that, the only way to use the <mergeMap> is to reload from scratch • reloading from scratch loses all changes... • Real applications are based on databases • here <mergeMap> has no effect

  6. What TMSync is • A simple way to update part of one topic map with part of another • define which part of the target topic map you want, • define which part of the source topic map it is the master for, and • the algorithm does the rest

  7. If the source is not a topic map • Simply do a normal one-time conversion • let TMSync do the update for you • In other words, TMSync reduces the update problem to a conversion problem [Diagram labels: source.xml, convert.xslt, TMSync]

  8. TMSync in depth • What it is • How it works

  9. TMSync in mathematical terms • A function that given • a target topic map, • a source topic map, • a topic selector for the target map (a function), • a characteristic selector for the target map (a function), • a topic selector for the source map (a function), • a characteristic selector for the source map (a function), • produces an updated target map
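
The inputs listed on this slide can be sketched as Python types. This is only an illustration under strong assumptions: a topic map is reduced to a dict from topic id to a set of (type, value) characteristics, and the names `select_fragment`, `TopicSelector` and `CharSelector` are hypothetical, not part of TMSync or of the Q model. The sketch shows how one topic selector and one characteristic selector together pick out a fragment of a map.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical, heavily simplified model (not the Q model or TMDM):
# a characteristic is a (type, value) pair, a topic map maps ids to characteristics.
Characteristic = Tuple[str, str]
TopicMap = Dict[str, Set[Characteristic]]
TopicSelector = Callable[[str, Set[Characteristic]], bool]
CharSelector = Callable[[str, Characteristic], bool]


def select_fragment(tm: TopicMap, topic_sel: TopicSelector,
                    char_sel: CharSelector) -> TopicMap:
    """Return the part of tm picked out by the two selector functions."""
    return {
        topic: {c for c in chars if char_sel(topic, c)}
        for topic, chars in tm.items()
        if topic_sel(topic, chars)
    }


source = {
    "bergen": {("name", "Bergen"), ("occurrence", "http://example.org/bergen")},
    "oslo": {("name", "Oslo")},
}

# Select only the topic "bergen", and of its characteristics only the names.
fragment = select_fragment(
    source,
    topic_sel=lambda topic, chars: topic == "bergen",
    char_sel=lambda topic, char: char[0] == "name",
)
print(fragment)  # {'bergen': {('name', 'Bergen')}}
```

In this reading, TMSync as a whole would take two maps and two such selector pairs (one pair per map) and return the updated target map.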

  10. Mathematical specification • Currently based on the Q model[1] • mainly because this was the only model in existence when I started working • Will translate to the TMRM • since this is better-known, and now has a TMDM mapping [1] Q: A Model for Topic Maps, http://www.ontopia.net/topicmaps/materials/quads.html

  11. The selection process [Diagram: topics with name and occurrence characteristics]

  12. The update process [Diagram: topics with name and occurrence characteristics before and after the update; other labels: NAME, bar]

  13. How to configure the algorithm • How to specify the topics • use a query • this gives great flexibility, while keeping the algorithm simple • it also means that we can efficiently find the set of topics to work on • How to specify the characteristics • use a query, again, or • use a set of types, or • ...

  14. What the algorithm does • For each topic in the sync’ed fragment • remove all sync’ed characteristics not in the source • except associations to non-sync’ed topics • add all characteristics in the source that are not in the target • leave the rest alone • Remove and add topics in the same way
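
The steps on this slide could look roughly like the following Python sketch. It is not the Ontopia implementation. It assumes a toy model in which a topic map is a dict from topic id to a set of (kind, type, value) characteristics, topics are matched by id, and an association is stored on one topic with the other topic's id as its value. The function name `tmsync` and the selector signatures are hypothetical, and removing a topic here does not clean up references to it from other topics.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical toy model: (kind, type, value); for kind "association",
# value is the id of the topic at the other end.
Characteristic = Tuple[str, str, str]
TopicMap = Dict[str, Set[Characteristic]]
TopicSel = Callable[[str, Set[Characteristic]], bool]
CharSel = Callable[[str, Characteristic], bool]


def tmsync(target: TopicMap, source: TopicMap,
           target_topic_sel: TopicSel, target_char_sel: CharSel,
           source_topic_sel: TopicSel, source_char_sel: CharSel) -> TopicMap:
    """Return an updated copy of target; the inputs are not modified."""
    result = {t: set(cs) for t, cs in target.items()}
    # The source fragment: selected topics with their selected characteristics.
    src = {t: {c for c in cs if source_char_sel(t, c)}
           for t, cs in source.items() if source_topic_sel(t, cs)}
    synced = {t for t, cs in target.items() if target_topic_sel(t, cs)}

    for topic in synced:
        if topic not in src:
            del result[topic]  # sync'ed topic no longer in the source
            continue
        for char in list(result[topic]):
            kind, _ctype, value = char
            # Associations to topics outside the sync'ed fragment are kept.
            protected = kind == "association" and value not in synced
            if target_char_sel(topic, char) and char not in src[topic] and not protected:
                result[topic].discard(char)
        result[topic] |= src[topic]  # add what the target lacks

    for topic, chars in src.items():
        if topic not in synced:
            # Assumption: source topics outside the target fragment are added or merged.
            result.setdefault(topic, set()).update(chars)
    return result


target = {
    "kw1": {("name", "name", "Schools"), ("occurrence", "note", "old note"),
            ("association", "used-by", "unit42")},
    "kw2": {("name", "name", "Obsolete")},
    "unit42": {("name", "name", "Service unit 42")},
}
source = {
    "kw1": {("name", "name", "Schools and education")},
    "kw3": {("name", "name", "Kindergartens")},
}
is_keyword = lambda topic, chars: topic.startswith("kw")
accept_all = lambda topic, char: True

updated = tmsync(target, source, is_keyword, accept_all, is_keyword, accept_all)
```

In this example `kw1` loses its old name and note but keeps its association to the non-sync'ed `unit42`, `kw2` is removed, `kw3` is added, and `unit42` is left alone.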

  15. Applications • City of Bergen • US Publisher

  16. The City of Bergen [Diagram labels: Norge.no, Service Unit, Person, LivsIT, City of Bergen]

  17. City of Bergen configuration • On the source side • query to get all instances of “category” and “keyword” • accept all characteristics • On the target side • query to get all instances of “category” and “keyword” • except those with mark-as-local associations • accept all characteristics except local search name and mark-as-local
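
As an illustration only, the configuration on this slide might be written as four selector functions. The slides say the real configuration uses queries; the Python predicates below are a stand-in, and the toy data model, the type identifiers (`instance-of`, `local-search-name`, `mark-as-local`) and all function names are assumptions made for the example.

```python
from typing import Set, Tuple

# Hypothetical toy model: a characteristic is (kind, type, value).
Characteristic = Tuple[str, str, str]

SYNCED_TYPES = {"category", "keyword"}


def is_category_or_keyword(topic: str, chars: Set[Characteristic]) -> bool:
    """True if the topic is an instance of "category" or "keyword"."""
    return any(kind == "association" and ctype == "instance-of" and value in SYNCED_TYPES
               for kind, ctype, value in chars)


def is_marked_local(chars: Set[Characteristic]) -> bool:
    """True if the topic has a mark-as-local association."""
    return any(kind == "association" and ctype == "mark-as-local"
               for kind, ctype, _value in chars)


# Source side: all categories and keywords, all characteristics.
source_topic_sel = is_category_or_keyword


def source_char_sel(topic: str, char: Characteristic) -> bool:
    return True


# Target side: the same topics, except those marked as local.
def target_topic_sel(topic: str, chars: Set[Characteristic]) -> bool:
    return is_category_or_keyword(topic, chars) and not is_marked_local(chars)


# Target side: everything except local search names and mark-as-local.
def target_char_sel(topic: str, char: Characteristic) -> bool:
    kind, ctype, _value = char
    if kind == "name" and ctype == "local-search-name":
        return False
    if kind == "association" and ctype == "mark-as-local":
        return False
    return True


target = {
    "schools": {("association", "instance-of", "keyword"),
                ("name", "name", "Schools"),
                ("name", "local-search-name", "skole")},
    "parking": {("association", "instance-of", "category"),
                ("association", "mark-as-local", "true")},
    "unit42": {("association", "instance-of", "service-unit")},
}
selected = sorted(t for t, cs in target.items() if target_topic_sel(t, cs))
print(selected)  # ['schools']
```

Because the local search name is excluded by the target characteristic selector, an update would never remove it, which is how locally maintained data can survive repeated synchronization.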

  18. Nameless US publisher • Use an automated process to classify documents • documents get reclassified now and then • output of process is an XTM document • If documents did not get reclassified, import would be enough • as it is, they use TMSync [Diagram labels: classified.xtm, TMSync]

  19. Conclusion • Related work • Further work

  20. Related work • RDFSync • algorithm to synchronize two RDF graphs efficiently • no business case focus • TM-Views • one possible way to define fragments for update • TMRAP • uses TMSync for the update-topic request

  21. Further work • Reformulate algorithm to TMRM instead of Q • this will be done in the paper submitted to the proceedings • Improve algorithm to handle delta sets • that is, to only need information about what has changed in the source since the last update • this should not be very difficult • may do this for the final paper
