From Documents to Knowledge Models

Max Völkel voelkel@fzi.de Forschungszentrum Informatik an der Universität Karlsruhe (TH). From Documents to Knowledge Models. Personal Knowledge Management. Definition: knowledge cues [Haller]

Presentation Transcript


  1. Max Völkel, voelkel@fzi.de, Forschungszentrum Informatik an der Universität Karlsruhe (TH). From Documents to Knowledge Models

  2. Personal Knowledge Management
     Definition: knowledge cues [Haller]
     • Any kind of symbol, pattern or artefact which evokes some knowledge in a person's mind when viewed or used.
     • Knowledge cues can be stored and retrieved on a computer, while knowledge itself may or may not be.
     • Strictly speaking, what you store is bits (signals).

  3. What is a Document? A team of 50 French researchers discussed …

  4. Definition: Document
     A team of 50 French researchers could agree on:
     • Document as form
       • A container which assembles and structures the content to make it easier for the reader to understand.
     • Document as sign
       • Emphasizes the argumentative structure of the content.
       • A document can be referenced → it acts as a sign for its meaning.
     • Document as medium
       • "Reading contract" = the author's intention or assumption about what will happen with the document.

  5. Document (my definition) I/II
     • A document consists of information atoms.
       • An information atom is the smallest unit of content which can be interpreted without a document's context (but of course requiring background knowledge). For text, these atoms are single words.
     • Packaging: establishes a context
     • Reference-ability: a reference to a published document can act as a placeholder for the content expressed within.
     • Process metadata: should be sent along
       • such as authors, audience, goal
     [Figure labels: Document; author, audience, goal]

  6. Document (my definition) II/II
     • A document is a knowledge artefact consisting of several layers:
     • Content Semantics: content means something.
       • Building upon logical and argumentative structure, the author encodes statements about a domain within the content.
     • Argumentative Structure: conveys the content to the reader.
       • Argumentative structures appear on all scales. A typical structure is the "Introduction, Related work, Contribution, Conclusion" pattern of scientific articles. On smaller scales, patterns like "claim-proof" and "question-answer" are used.
     • Logical Structure: can reference smaller parts within a document
       • e.g. paragraphs, headlines, footnotes, citations, and title
     • Visual Structure: guides the reader informally and carries additional information
       • type-setting (e.g. bold, italics, different font styles and sizes), placement of figures, pages
     • Linearity: defined order
       • for navigating through all information items

  7. Ted Nelson: "I propose a different document agenda: I believe we need new electronic documents which are transparent, public, principled, and freed from the traditions of hierarchy and paper."

  8. What do people want? Why?

  9. What is a Wiki? What's new compared to CMS?
     • Easy contribution → shorter time-to-publication
       • Wiki pages can be created and edited by any user quickly and easily
     • Easy writing
       • Simple text formatting without the need to learn HTML → wiki syntax
     • Easy linking
       • Automatic linking converts written names of pages, images and websites to links
     • Recent changes
       • See what has happened (awareness)
     • Diff function shows the latest changes
       • Easily check whether changes are OK
     • Full-text search over page titles and text
     • Backlink function shows which pages link to the current page
       • Find the context of this page
     • Directly link deep into a wiki using readable names
     Wikis were the first deployed, collaborative hypertext authoring environments → people want more links
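The backlink function mentioned on this slide can be illustrated with a minimal sketch. It assumes a `[[Page Name]]` link syntax, which the slides do not specify and which differs between wiki engines; the page contents and the names `extract_links` and `build_backlinks` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed wiki link syntax: [[Page Name]]. Real wiki engines differ.
LINK_PATTERN = re.compile(r"\[\[([^\]|]+)\]\]")


def extract_links(text):
    """Return the page names linked from one page's text, in order."""
    return [name.strip() for name in LINK_PATTERN.findall(text)]


def build_backlinks(pages):
    """Map each linked page name to the sorted list of pages linking to it."""
    backlinks = defaultdict(set)
    for source, text in pages.items():
        for target in extract_links(text):
            backlinks[target].add(source)
    return {target: sorted(sources) for target, sources in backlinks.items()}


# Hypothetical example content
pages = {
    "Home": "See [[Documents]] and [[Knowledge Models]].",
    "Documents": "Contrast with [[Knowledge Models]].",
    "Knowledge Models": "Built from items and relations.",
}
backlinks = build_backlinks(pages)
```

Looking up `backlinks["Knowledge Models"]` then answers the slide's question of which pages provide the context of the current page.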

  10. What is a Model?
      My definition based on OMG metamodel MOF: typed entities and typed relations.
      [Figure, labels from top to bottom: TypeA2, TypeB2, TypeC2; (Meta-)Modelling; TypeA1, TypeB1, TypeC1; Modelling; EntityX, EntityY; real world from the viewpoint of the individual: ArtifactX, ArtifactY]

  11. What is a Knowledge Model?

  12. From Documents to Knowledge Models
      • From analogue to digital documents
        • smaller content granularity
        • more interconnected content
        • more explicit structures
      • → Knowledge models
        • very small information atoms, such as single words
        • richly connected items
        • explicit semantics for the links
      Definition
      • A knowledge model is a superset of documents and formal ontologies.
      • Annotated documents, stored together with their annotations, can be seen as a knowledge model.

  13. What is a CDS? Conceptual Data Structures
      [Figure: an Item with the relations context/detail, source/target, before/after and annotation/annotationMember]
      M. Völkel and H. Haller: Conceptual Data Structures (CDS) - Towards an Ontology for Semi-Formal Articulation of Personal Knowledge. In Proc. of the 14th International Conference on Conceptual Structures 2006. Aalborg University, Denmark, July 2006.

  14. What is a CDS-based Knowledge Model?
      • A set of addressable items (text, images, maybe even multimedia elements)
      • Relations between items, classified in four types
        • Source/target: the generic, directed hyperlink
        • Before/after: ordering relations, linear navigation
        • Context/detail: hierarchical relations, document and concept hierarchies
        • Annotation/annotationMember: annotations, which give the ability to type items and relations; items are used as types → meta-modeling
      • Knowledge models must be able to capture work-in-progress
        • CDS is not strict: you can have cycles, untyped items, paradoxical orderings, …
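The structure described on this slide can be sketched as a small data structure. This is only an illustration under stated assumptions, not the CDS implementation: the class `KnowledgeModel`, its methods, the relation kind names (`link`, `order`, `hierarchy`, `annotation`) and the example items are hypothetical. Only the four role pairs come from the slide.

```python
from dataclasses import dataclass, field
from itertools import count

# The four relation types from the slide, each with its pair of role names.
# The kind names used as keys are assumptions made for this sketch.
ROLES = {
    "link": ("source", "target"),
    "order": ("before", "after"),
    "hierarchy": ("context", "detail"),
    "annotation": ("annotation", "annotationMember"),
}


@dataclass
class KnowledgeModel:
    items: dict = field(default_factory=dict)      # item id -> content
    relations: list = field(default_factory=list)  # (kind, first id, second id)
    _ids: count = field(default_factory=count, repr=False)

    def add_item(self, content):
        """Store content as an addressable item and return its id."""
        item_id = next(self._ids)
        self.items[item_id] = content
        return item_id

    def relate(self, kind, first, second):
        """Record that `first` plays the first role of `kind` towards `second`."""
        if kind not in ROLES:
            raise ValueError(f"unknown relation kind: {kind}")
        # Deliberately no check for cycles or contradictory orderings:
        # the model should tolerate work-in-progress.
        self.relations.append((kind, first, second))

    def neighbours(self, item_id, role):
        """Return the items that play `role` relative to `item_id`."""
        result = []
        for kind, first, second in self.relations:
            first_role, second_role = ROLES[kind]
            if role == second_role and first == item_id:
                result.append(second)
            elif role == first_role and second == item_id:
                result.append(first)
        return result


# Hypothetical example content
model = KnowledgeModel()
paper = model.add_item("From Documents to Knowledge Models")
intro = model.add_item("Introduction")
concl = model.add_item("Conclusion")
draft = model.add_item("draft")  # an ordinary item used as a type

model.relate("hierarchy", paper, intro)   # paper is context, intro is detail
model.relate("hierarchy", paper, concl)
model.relate("order", intro, concl)       # intro before concl
model.relate("order", concl, intro)       # contradictory ordering is accepted
model.relate("annotation", draft, intro)  # intro is annotated with "draft"
```

Here `model.neighbours(paper, "detail")` lists the two sections, and `"Conclusion"` is at once before and after `"Introduction"`, which the non-strict model accepts without complaint.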

  15. CDS: A Hierarchy of Relations
      [Figure: relation types arranged from informal to formal. Legend: Relation type: relation/inverse]
      • Undirected Relation: related/related
      • Equivalency: equivalent
      • Directed Linking: source/target
      • Annotation: annotation/annotationMember
      • Order: before/after
      • Hierarchy: detail/context
      • Labelled Links: …/…-inverse
      • Subclassing: is-a/superclass-of
      • Tagging: tag/tagMember
      • Instantiation: type/instance
      [Further figure labels: Task priority, Document order]

  16. Motivation

  17. Examples for Knowledge Models
      [Figure labels: Engineering, Thinking, Simulation, Req. Engineering, Fiction Writing]

  18. How do writing and reading work? "Von der Idee zum Text" [Esselborn 2004]
      Writing / Sending
      • Write down ideas
      • Group them
      • Structure them
      • Add argumentation structures
      • Add references to literature
      • Link pieces in a first draft
      • Add introduction and conclusion
      • Repeat until coherent flow
      • Publish document
      Reading / Receiving
      • Visualise the structure graphically
      • Connect new structures with existing own structures
      [Figure labels, tools: Mind maps, Mind maps, ???, ???, Reference Manager, Text processing]

  19. The tool chains break
      • Create a new slide show out of three old presentations plus one from your colleague
        • Why not have the content in smaller, more logical chunks?
      • Re-use the motivation part of an old paper for a new one
        • If you find a misspelling, why should you have to fix it twice?
      • Search a stack of paper notes with good ideas
        • Why are those not in your computer?
      • Search email archives to find out what the high-level architecture for the new authentication system is
        • Why not browse your PKM and see the relations?

  20. Technological Developments
      Analog → Digital
      • → accelerated distribution by many orders of magnitude
      • → lower costs
      [Figure: communication speed and cost over time; labels: written language, printing press, internet]

  21. Cost of Communication: data transmission is cheap now
      • Total cost of communication to send content to n people:
        | choosing relevant parts of the personal model |
        + | encoding of model parts in document parts |
        + | ordering document parts strictly linearly/hierarchically |
        + n · ( | data transmission |
              + | linear reading of the document |
              + | decoding of model parts from document parts |
              + | creating a networked model out of model parts |
              + | integrating the new model into the existing model | )

  22. Cost of Communication: where can we save, if n is small?
      • Total cost of communication to send content to n people:
        | choosing relevant parts of the personal model |
        + | encoding of model parts in document parts |
        + | ordering document parts strictly linearly/hierarchically |
        + n · ( | data transmission |
              + | linear reading of the document |
              + | decoding of model parts from document parts |
              + | creating a networked model out of model parts |
              + | integrating the new model into the existing model | )

  23. Cost of Communication
      • Total cost of communication to send content to n people:
        | choosing relevant parts of the personal model |
        + | encoding of model parts in document parts |
        + | ordering document parts strictly linearly/hierarchically |
        + n · ( | data transmission |
              + | linear reading of the document |
              + | decoding of model parts from document parts |
              + | creating a networked model out of model parts |
              + | integrating the new model into the existing model | )
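The cost formula on the preceding slides can be made concrete with a small sketch. It treats each |…| term as a non-negative effort value and sums the per-recipient terms, which is an assumed reading of the slide. The function name `total_cost`, the dictionary keys and all numbers are hypothetical, chosen only to show how the sender's one-time effort dominates when n is small.

```python
def total_cost(n, sender_costs, recipient_costs):
    """Sender costs are paid once; recipient costs are paid by each of n readers."""
    return sum(sender_costs.values()) + n * sum(recipient_costs.values())


# Hypothetical effort units, for illustration only
sender = {
    "choose_relevant_parts": 2,
    "encode_in_document_parts": 5,
    "order_linearly": 3,
}
recipient = {
    "data_transmission": 0,  # assumed negligible: "data transmission is cheap now"
    "linear_reading": 2,
    "decode_model_parts": 1,
    "create_networked_model": 2,
    "integrate_with_existing_model": 1,
}

cost_for_one = total_cost(1, sender, recipient)
cost_for_hundred = total_cost(100, sender, recipient)
```

With these made-up numbers the sender accounts for 10 of 16 units when there is one reader, but only 10 of 610 units for a hundred readers, which is why small audiences are where savings on the authoring side matter most.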

  24. Current process: culture is document-centric
      [Figure: cost for sender and recipient(s)]

  25. Ideal process: what if knowledge models, rather than documents, were exchanged between people?
      [Figure: cost for sender and recipient(s)]

  26. Realistic (improved) process: use both
      [Figure: cost for sender and recipient(s)]

  27. Information Management Problems → Solution: Knowledge Models
      • Under-utilisation of the interlinked nature of information [Oren]
        • → the fine-granular nature of knowledge models allows for precise and effective linking and browsing
      • People have problems in using strict hierarchies [Oren]
        • → classification methods like tagging and non-strict taxonomies
      • Keep the context [Oren]
        • → the networked nature of a knowledge model is more suited to represent contextual links than a set of documents
      • Granularity
        • → represent more than the content of just one document

  28. When to use Knowledge Models?
      Fixed domain
      • Use domain-specific tools & languages
        • Standardised representation formalisms
        • Established data exchange processes
      Open domain or multiple domains
      • Use personal knowledge models
        • Unstructured, semi-structured, semi-formal and formal parts
        • Ad-hoc formalisation
        • Cheaper to create, easier to integrate
      • Use documents
        • Costly to create
        • Cheap to read → sometimes the best solution
        • Hard to integrate
      [Figure labels, audience: Myself!, My Team, My Community, Broad audience]

  29. Related Work in Semantic Authoring
      • Initial ideas, although that term was not used, can already be found in V. Bush and D. Engelbart
      • ABCDE Format from Anita de Waard
      • Semantically Annotated LaTeX (SALT) by Tudor Groza
      • Systems allowing end-users to construct ontologies out of their linked information objects
      • L. Ludwig sees redundancy within and among documents as a hurdle to efficient information usage. The traditional notion of a document is replaced by virtual documents, which render parts of the knowledge base as an interactive tree.
      • Bernstein describes Tinderbox, a "personal content management assistant", which offers sophisticated HTML generation via templates.
      • The Gnowsis system by Sauermann allows linking desktop objects and integrates with a wiki
      • iMapping: semantic concept maps by Haller
      • Same direction in the fields of semantic desktop and semantic wiki
      • Semantic Web Content Repository (swecr)

  30. Conclusion
      • Documents
        • Document-centered culture is a costly legacy artefact and bottleneck for our society
      • Personal knowledge models
        • Superset of documents and ontologies
        • Integrate with the semantic desktop
        • Make knowledge workers happier and more productive
      • Authoring is the bottleneck
        • We should bring the power of modeling to the end-user
        • Don't break the tool chain
        • Focus on work-in-progress
      Thank you very much for your attention.
      Contact: Max Völkel, voelkel@fzi.de
