An agile process for the creation of conceptual models from content descriptions

Hans-Werner Sehring, Centre for Sustainable Content Logistics, TuTech Innovation GmbH / Hamburg University of Technology. Joint work with: Sebastian Boßung, Henner Carl, Joachim W. Schmidt.

Presentation Transcript


  1. An agile process for the creation of conceptual models from content descriptions
     Hans-Werner Sehring
     Centre for Sustainable Content Logistics
     TuTech Innovation GmbH / Hamburg University of Technology
     Joint work with: Sebastian Boßung, Henner Carl, Joachim W. Schmidt

  2. Outline
     • Conceptual Content Management
     • Asset expressions and schemata
     • The Asset Schema Inference Process
     • Straight-forward schema inference
     • Cluster-based schema inference
     • Process evaluation
     • Summary and outlook
     An agile modelling process - Hans-Werner Sehring, 2007

  3. 1. Conceptual Content Management
     • Conceptual Content Management (CCM)
       • an approach to domain modelling
       • inspired by epistemology: entity description by classes and instances, called Assets
       • Assets are dual entity descriptions consisting of content visualising it and a conceptual model describing it
       • model-based system generation
     • Features:
       • modelling is carried out by domain experts
       • domain models are open to changes
       • existing work is preserved, even if changes are applied
       • communication between domain experts with individual models is maintained

  4. CCM dynamics
     [Figure: intermediate model (parse tree) with m:Model, a:AssetClass, b:AssetClass and superClass; generated modules med1, med2 (mediation), distrib1, distrib2 (distribution), client1, client2 (client) and DB for the models Political_Iconography (PI), Regents and Artists.]
     • CCM systems (CCMSs) are dynamically generated from domain models:
       • immediately realizing model changes
       • preserving existing Assets
       • maintaining communication
     • Key contributions to this end:
       • modelling language
       • model compiler
       • architecture for evolvable systems
     Example model:
       model Historiography
       from Time import Timestamp
       from Topology import Place
       class Professor {
         content image
         concept
           characteristic n :String
           relationship publs :Work*
       }

  5. Model-driven development
     • All software development starts with a conceptual model
       • model-driven development approaches in particular call for models with a sufficient degree of formality
     • CCM is similar to model-driven development in that software creation is highly automated
       • in CCM, software generation is even dynamic
     • A CCM model is required as a starting point for CCMSs
       • usually, a modelling expert (analyst) is consulted
       • because of the dynamics requirement, such a modelling expert cannot be employed in CCM
       • domain experts are not modelling experts; they usually have problems with, e.g., sufficient formality
       • but: experts can “tell their story” by providing examples

  6. 2. Asset expressions and schemata
     [Figure: graph of sample Asset expressions. Values shown: Ludwig Heydenreich, Georg Thilenius, Erwin Panofsky, Architecture in Italy, Die Sakralbau-Studien Leonardo da Vinci's, 24 Feb 1934. Type annotations shown: Professor, FullProfessor, Book, Dissertation, CareerStep, City. Attribute labels shown: name : Name, title : Name, publications : Work*, concerns : Teacher, reviewer : Professor, issuedBy : Professor, issuedWhen : Timestamp, issued : Place, where : GeoPoint.]
     • In many domains research starts by regarding instances (samples), not concepts

  7. Asset model from the example
     • Models consisting of classes
     • Classes with
       • content handles and
       • attributes (and constraints)
         • characteristics
         • relationships
     • Manually defined classes for the example:
       model Historiography
       from Time import Timestamp
       from Topology import Place
       class Professor {
         content image
         concept
           characteristic name :String
           relationship publications :Work*
       }
       class Work {
         content scan
         concept
           characteristic title :String
           relationship concerns :Professor*
           relationship issued :Issuing
           relationship reviewers :Professor*
       }
       class Issuing {
         concept
           relationship issued :Place
           relationship issuedBy :Professor
           relationship issuedWhen :Timestamp
       }
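To make the structure of such a model concrete, the following is a minimal Python sketch of the three example classes as plain data. It is not part of CCM or its model compiler: the `AssetClass` dataclass, its field names and the `attributes` helper are assumptions introduced here purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AssetClass:
    """Hypothetical in-memory form of one Asset class."""
    name: str
    content: list[str] = field(default_factory=list)               # content handles
    characteristics: dict[str, str] = field(default_factory=dict)  # attribute name -> value type
    relationships: dict[str, str] = field(default_factory=dict)    # attribute name -> target type

    def attributes(self) -> dict[str, str]:
        """All conceptual attributes: characteristics plus relationships."""
        return {**self.characteristics, **self.relationships}


# The three manually defined classes of the Historiography example.
professor = AssetClass("Professor", ["image"], {"name": "String"},
                       {"publications": "Work*"})
work = AssetClass("Work", ["scan"], {"title": "String"},
                  {"concerns": "Professor*", "issued": "Issuing",
                   "reviewers": "Professor*"})
issuing = AssetClass("Issuing", [], {},
                     {"issued": "Place", "issuedBy": "Professor",
                      "issuedWhen": "Timestamp"})

historiography = {c.name: c for c in (professor, work, issuing)}
```

Keeping characteristics and relationships in separate maps mirrors the distinction made on the slide, while `attributes()` gives the combined view that schema comparison typically needs.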

  8. Asset model from the example (cont’d)
     • Example of personalisation: a domain expert introduces the distinction of documents:
       model MyHistoriography
       from Historiography import Work, Professor
       class Work {
         concept
           relationship reviewer unused
       }
       class Dissertation refines Work {
         concept
           relationship reviewer :Professor*
       }
     • Import and redefinition of classes for
       • schema evolution (user communities)
       • personalisation (single users)
       • …

  9. 3. Asset Schema Inference Process (ASIP)
     • Bootstrapping: CCM itself requires an initial model as a starting point for the open dynamic modelling process
     • Required: systematic support for domain experts in finding suitable models
     • Start with Asset Expressions:
       • content abstractions and applications: assigned names and bound values
       • semantic types (concepts): no inner structure
     • Concepts and classes are not distinguished in CCM models (intensional and extensional definitions)
     • Free-form entity descriptions are used as samples; later they become instances of classes

  10. Agile CCMS development
      • Agility:
        • based on the possibility to generate CCMSs dynamically
        • domain experts review their models based on experience with an operational CCMS
        • if changes to the model are required, another iteration of the process is started
        • entity descriptions created within the CCMS can be used as samples for the next iteration of the process
      [Figure: cycle of three activities: Create Asset expressions, Construct schema, Generate CCMS.]

  11. ASIP phases
      • The ASIP has four phases
      [Figure: Phase 1: sample acquisition; Phase 2: schema inference; Phase 3: feedback questions, prototype generation; Phase 4: system generation. Annotations: answer questions; unhappy with schema: modify samples (- modify schema).]

  12. Two schema inference experiments
      • Experiments with alternatives for phases 2 and 3:
        • (traditional) schema inference plus user feedback: straight-forward approach starting from singletons
        • clustering, supervised by domain experts: statistical approach, semi-supervised learning
      • Phase 3 (generation of questions to gather feedback) is determined by the alternative chosen
      • Result of phases 1-3 is a CCM model:
        • prototype generation and system generation (phase 4) are carried out by the CCM model compiler
        • the domain expert can modify the inferred schema (openness and dynamics)

  13. 4. Straight-forward schema inference
      • Schema construction by traditional schema inference
        1. derive naive classes directly from the set of samples
        2. apply simplifications
        3. if changes were applied to the schema, repeat step 2
      • Step 1: for each sample create an Asset class with
        • a content handle whose type is determined by the encoding format of the sample’s content
        • attributes for all abstractions over the content
          • characteristics for certain known types
          • relationships for other types
        • no further constraints
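Step 1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the sample layout (`content_format`, `abstractions`), the generated class names and the set `KNOWN_VALUE_TYPES` that decides between characteristics and relationships are all hypothetical.

```python
# Assumption: types in this set become characteristics, all others relationships.
KNOWN_VALUE_TYPES = {"Name", "String", "Timestamp"}


def naive_class(sample_id, sample):
    """Derive one naive Asset class from a single sample (step 1)."""
    characteristics, relationships = {}, {}
    for attr, type_name in sample["abstractions"].items():
        base = type_name.rstrip("*")  # ignore the many-valued marker
        target = characteristics if base in KNOWN_VALUE_TYPES else relationships
        target[attr] = type_name
    return {
        "name": f"Class_{sample_id}",         # generated, hypothetical name
        "content": sample["content_format"],  # handle type from encoding format
        "characteristics": characteristics,
        "relationships": relationships,
    }


# Two made-up samples in the spirit of the Historiography example.
samples = {
    "s1": {"content_format": "image",
           "abstractions": {"name": "Name", "publications": "Work*"}},
    "s2": {"content_format": "scan",
           "abstractions": {"title": "Name", "reviewer": "Professor"}},
}

schema = [naive_class(sid, s) for sid, s in samples.items()]
```

Each sample yields exactly one class and no constraints are generated, so the resulting schema is as large as the sample set until simplifications are applied.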

  14. Schema simplification
      • Step 2: simplifications, repeatedly applied in the specified order
        • identical class: unify classes with attributes and content handles with identical names and types
        • inheritance: subtype relationship of classes whose sets of attributes are in a subset relationship
        • type match: if two classes have attributes and content handles of identical types, prompt expert for unification
        • inheritance orphan: ask domain expert about removal of classes with only few instances
      • Note:
        • often classes are considered equal if the attributes’ types match
        • here the name is considered, or else feedback is collected
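The first two simplifications need no expert feedback and can be sketched as follows. This is a minimal Python illustration, assuming each class is a mapping from member name (attribute or content handle) to type; the function names and the toy classes are hypothetical, and the two feedback-driven rules (type match, inheritance orphan) are left out.

```python
def unify_identical(classes):
    """'identical class': keep one class per set of identical (name, type) members."""
    representative = {}  # frozen member set -> name of the first class seen
    for name, members in classes.items():
        representative.setdefault(frozenset(members.items()), name)
    return {name: dict(members) for members, name in representative.items()}


def infer_inheritance(classes):
    """'inheritance': sub refines sup if sup's members are a proper subset of sub's."""
    edges = []
    for sub, sub_members in classes.items():
        for sup, sup_members in classes.items():
            if sup_members.items() < sub_members.items():
                edges.append((sub, sup))
    return edges


# Toy naive classes: C1 and C2 are identical, C3 adds one attribute.
classes = {
    "C1": {"image": "content:image", "name": "Name"},
    "C2": {"image": "content:image", "name": "Name"},
    "C3": {"image": "content:image", "name": "Name", "publications": "Work*"},
}

unified = unify_identical(classes)
hierarchy = infer_inheritance(unified)
```

Note that `infer_inheritance` reports every subset pair, including transitive ones; a real schema would keep only the direct subtype edges.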

  15. 5. Cluster-based schema inference
      • Schema construction by clustering:
        • cluster samples, create classes from clusters
        • experiment based on k-means algorithm
      • Clustering steps:
        • classification: assign classes to clusters based on distance measure d: $d(s,c) = \alpha\, d_{sem}(s,c) + (1-\alpha)\, d_{struct}(s,c)$, $\alpha \in [0,1]$
        • optimisation: recompute the cluster centres
        • inheritance hierarchy creation: as in the simple approach
        • feedback: visualise the clusters, allow clusters to be partitioned => semi-supervised learning
      • Less user interaction than in the traditional approach
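The classification and optimisation steps can be sketched in Python as below. The slides do not say how cluster centres are recomputed for non-numeric samples, so this sketch assumes a medoid (the member with the smallest total distance to the others), which makes it a k-medoids-style variant rather than textbook k-means. The toy distance functions and sample layout are likewise assumptions.

```python
def combined_distance(s, c, d_sem, d_struct, alpha):
    """d(s,c) = alpha * d_sem(s,c) + (1 - alpha) * d_struct(s,c)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * d_sem(s, c) + (1 - alpha) * d_struct(s, c)


def assign(samples, centres, dist):
    """Classification: put every sample into the cluster of its nearest centre."""
    clusters = {i: [] for i in range(len(centres))}
    for s in samples:
        nearest = min(range(len(centres)), key=lambda i: dist(s, centres[i]))
        clusters[nearest].append(s)
    return clusters


def recompute_centres(clusters, centres, dist):
    """Optimisation (assumed): new centre = medoid of each non-empty cluster."""
    updated = []
    for i, members in clusters.items():
        if not members:
            updated.append(centres[i])  # keep the old centre of an empty cluster
        else:
            updated.append(min(members, key=lambda m: sum(dist(m, o) for o in members)))
    return updated


def cluster(samples, centres, dist, max_iter=20):
    """Alternate both steps until the centres stop changing."""
    for _ in range(max_iter):
        clusters = assign(samples, centres, dist)
        updated = recompute_centres(clusters, centres, dist)
        if updated == centres:
            break
        centres = updated
    return clusters, centres


# Toy samples: (semantic type, set of attribute names). Toy distances, not the slides' measures.
samples = [
    ("Person", frozenset({"name"})),
    ("Person", frozenset({"name", "publications"})),
    ("Work", frozenset({"title"})),
    ("Work", frozenset({"title", "reviewer"})),
]
toy_sem = lambda s, c: 0.0 if s[0] == c[0] else 1.0
toy_struct = lambda s, c: float(len(s[1] ^ c[1]))
dist = lambda s, c: combined_distance(s, c, toy_sem, toy_struct, alpha=0.5)

clusters, centres = cluster(samples, [samples[0], samples[2]], dist)
```

With alpha = 0.5 both measures contribute equally; moving alpha towards 1 lets the semantic types dominate, towards 0 the attribute structure.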

  16. Structural distance measure
      • d_struct is based on the length of the shortest edit script (similar to string matching)
      • Costs like:
        edit operation                             cost magnitude
        add attribute                              low
        remove attribute                           high
        change attribute name                      low
        broaden attribute type                     medium
        narrow attribute type                      very low
        increase cardinality of attribute value    medium
        decrease cardinality of attribute value    very low
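A simplified version of such a cost-weighted edit distance is sketched below in Python. Several things are assumptions made for illustration: the numeric weights standing in for "very low" to "high", the small type hierarchy in `PARENT`, and the direction of the script (from `source` to `target`). Attributes are matched by name only, so the rename operation is never searched for and the result is an upper bound on the cost of the cheapest script.

```python
# Assumed numeric stand-ins for the cost magnitudes on the slide.
COST = {
    "add": 0.25,                   # low
    "remove": 1.0,                 # high
    "broaden": 0.5,                # medium
    "narrow": 0.1,                 # very low
    "increase_cardinality": 0.5,   # medium
    "decrease_cardinality": 0.1,   # very low
}

# Hypothetical type hierarchy: type -> direct supertype.
PARENT = {"FullProfessor": "Professor", "Professor": "Person", "Person": "Any",
          "Dissertation": "Work", "Book": "Work", "Work": "Any"}


def is_supertype(general, specific):
    """True if `general` is a proper ancestor of `specific` in PARENT."""
    while specific in PARENT:
        specific = PARENT[specific]
        if specific == general:
            return True
    return False


def d_struct(source, target):
    """Cost of a name-matched edit script turning `source` into `target`.

    Both arguments map attribute name -> (type name, is_many).
    """
    cost = COST["add"] * len(target.keys() - source.keys())
    cost += COST["remove"] * len(source.keys() - target.keys())
    for name in source.keys() & target.keys():
        (s_type, s_many), (t_type, t_many) = source[name], target[name]
        if is_supertype(t_type, s_type):
            cost += COST["broaden"]
        elif is_supertype(s_type, t_type):
            cost += COST["narrow"]
        elif s_type != t_type:
            cost += COST["remove"] + COST["add"]  # unrelated types: replace attribute
            continue
        if t_many and not s_many:
            cost += COST["increase_cardinality"]
        elif s_many and not t_many:
            cost += COST["decrease_cardinality"]
    return cost


source = {"name": ("Name", False), "reviewer": ("FullProfessor", False),
          "image": ("Image", False)}
target = {"name": ("Name", False), "reviewer": ("Professor", True),
          "title": ("Name", False)}
example_cost = d_struct(source, target)  # add title, remove image, broaden and widen reviewer
```

The asymmetry of the weights (removing is expensive, adding is cheap) means the measure is not symmetric, so the direction chosen for the script matters.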

  17. Semantic distance measure
      [Figure: example type hierarchy over Any, WorkOfArt, Person, Image, Text and Book, drawn against a height axis h(T) = 0..3 with edge weights 1, 1/2, 1/2 and 1/4; one pair of types is marked "distance?".]
      • d_sem is determined by the shortest paths in the class hierarchy

      $$
      d_{sem}(s,c) =
      \begin{cases}
      1/2^{h(T_1)} & \text{if } T_1 \text{ is a direct supertype of } T_C \\
      d_{sem}(T_1,T_m) + d_{sem}(T_m,T_C) & \text{if } T_1 \text{ is a direct supertype of } T_m \text{ and } T_m \text{ is a supertype of } T_C \\
      d_{sem}(T_S,T_1) + d_{sem}(T_S,T_C) & \text{if } T_S \text{ is the most specific common supertype of } T_1 \text{ and } T_C
      \end{cases}
      $$
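The recursive definition can be evaluated with a short Python sketch. It assumes a single-inheritance hierarchy and reads the first case as an edge weight of 1/2 raised to the height of the supertype. The `PARENT` map is one plausible reading of the example figure (Any above WorkOfArt and Person, WorkOfArt above Image and Text, Text above Book), not a confirmed reproduction, and treating the measure as symmetric is also an assumption.

```python
# Assumed reading of the example hierarchy: type -> direct supertype.
PARENT = {"WorkOfArt": "Any", "Person": "Any",
          "Image": "WorkOfArt", "Text": "WorkOfArt", "Book": "Text"}


def ancestors(t):
    """Path from t up to the root, starting with t itself."""
    path = [t]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def height(t):
    """h(T): number of edges between T and the root."""
    return len(ancestors(t)) - 1


def d_down(general, specific):
    """Cases 1 and 2: sum of edge weights 1/2**h(parent) from a supertype down to a subtype."""
    total = 0.0
    t = specific
    while t != general:
        parent = PARENT[t]  # KeyError if `general` is not an ancestor of `specific`
        total += 1 / 2 ** height(parent)
        t = parent
    return total


def d_sem(t1, tc):
    """Case 3: go through the most specific common supertype of t1 and tc."""
    up1 = ancestors(t1)
    common = next(t for t in ancestors(tc) if t in up1)
    return d_down(common, t1) + d_down(common, tc)


image_to_book = d_sem("Image", "Book")  # via WorkOfArt: 1/2 + (1/2 + 1/4)
```

Because edge weights halve with every level, two types that differ only deep in the hierarchy are closer than two types that already diverge near the root.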

  18. 6. Process evaluation
      • Schema quality:
        • generally difficult to judge
        • for domain modelling: not a schema that describes the samples best, but a model that best represents the application domain
      • Criteria [Cherfi, Akoka, Comyn-Wattiau]:
        • specification:
          • graphical legibility
          • simplicity
          • expressiveness
          • syntactical correctness
          • semantic correctness
        • usage: completeness, understandability
        • implementation: implementability, maintainability

  19. Process evaluation (cont’d)
      • Selected parameters:
        • simplicity: in general depends on
          • the given sample set
          • domain expert’s answers in feedback phase
        • syntactical correctness: granted by model generation
        • semantic correctness: can be negatively impacted by structurally coinciding classes with different meanings
        • understandability:
          • generated class names can be an obstacle
          • but: generated system lowers impact of schema
        • implementability: by generation
        • maintainability: through dynamics

  20. 7. Summary and outlook
      • Summary:
        • Conceptual Content Management allows domain experts to provide and individually change domain models
        • domain experts are usually not modelling experts, and they prefer to start with samples describing observations
        • a process helps domain experts define initial models to start the open dynamic CCM activity
        • as one novel approach, a cluster-based schema inference process has been investigated
      • Outlook: future work will include …
        • the inclusion of the cluster-based approach into the open modelling for extensional concept definitions
        • the employment of reasoning techniques (induction, abduction) to guide the schema construction process
