Ontology Construction & Tools
Atilla ELÇİ, Dept. of Computer Engineering, Eastern Mediterranean University
CmpE 588 Spring 2008 EMU
Ontology Development
The Domain Expert’s Expressway:
• Ontology Development 101: A Guide to Creating Your First Ontology, by Natalya F. Noy and Deborah L. McGuinness.
• Tools used: Protégé with OntoViz API.
• Note that building useful ontologies requires:
  • (i) extensive domain knowledge, and
  • (ii) skill with ontology tools.
• Example: Brusa et al.: A Process for Building a Domain Ontology, AOW 2006.
Ontology Development Through Knowledge Discovery
The (Syntactic) Discovery Approach [Davies et al. Ch. 2]:
• Knowledge discovery
• Ontology definition
• Semi-automatic ontology construction
• Ontology learning scenarios
• Knowledge discovery for ontology learning
Knowledge Discovery
• Knowledge discovery: developing techniques that enable automatic discovery of novel and interesting information from (raw) data.
• Lately, unstructured and semi-structured domains are of interest, such as:
  • Text Mining
  • Web Mining
  • Link Analysis (graphs/networks)
  • Relational Data Mining (relational / first-order form)
  • Stream Mining (analysis of data streams)
=> Semi-Automatic Ontology Construction
Knowledge Discovery (continued)
• KD relates to such research areas as:
  • Computational Learning Theory: theoretical questions about learnability, computability, and learning algorithms.
  • Machine Learning: automated learning and knowledge representation.
  • Data Mining: using learning techniques on large-scale real-life data.
  • Web Mining.
  • Statistics and Statistical Learning: techniques for data analysis.
• Conference:
  • 9th International Conference on Data Warehousing and Knowledge Discovery (DaWaK 2007), Sept. 3-7, 2007, Regensburg, Germany. Proceedings in LNCS.
  • CFP due dates:
    • Submission of abstracts: April 2, 2007
    • Submission of full papers: April 13, 2007
  • Check KD subjects.
  • DaWaK 2008
Ontology Definition
• An ontology is a graph / network structure consisting of:
  • A set of concepts (vertices in a graph)
  • A set of relationships connecting concepts (directed edges in a graph)
  • A set of instances of a particular concept or relationship (data records).
• Formal/theoretical definitions of ontology as an abstract structure:
  • Ehrig et al. (2005): based on similarity measure
  • Bloehdorn et al. (2005): through integration of MLs.
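The graph view of an ontology can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `Ontology`, its methods, and the sample concepts are hypothetical, and for brevity the sketch stores instances of concepts only, not instances of relationships.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Hypothetical sketch: an ontology as a directed, labelled graph."""
    concepts: set = field(default_factory=set)      # vertices
    relations: set = field(default_factory=set)     # directed edges: (source, label, target)
    instances: dict = field(default_factory=dict)   # concept -> set of data records

    def add_concept(self, name):
        self.concepts.add(name)
        self.instances.setdefault(name, set())

    def add_relation(self, source, label, target):
        # Both endpoints must already be vertices of the graph.
        if source not in self.concepts or target not in self.concepts:
            raise KeyError("both endpoints must be known concepts")
        self.relations.add((source, label, target))

    def add_instance(self, concept, record):
        if concept not in self.concepts:
            raise KeyError(f"unknown concept: {concept}")
        self.instances[concept].add(record)

    def neighbours(self, concept):
        """Concepts reachable from `concept` over one outgoing edge."""
        return sorted(t for (s, _, t) in self.relations if s == concept)


# Toy data, invented for illustration only.
onto = Ontology()
for name in ("Course", "Instructor", "Department"):
    onto.add_concept(name)
onto.add_relation("Instructor", "teaches", "Course")
onto.add_relation("Instructor", "memberOf", "Department")
onto.add_instance("Course", "CmpE 588")
```

Keeping edges as `(source, label, target)` triples is one simple choice; it mirrors the directed-edge reading above and makes it easy to query outgoing relations per concept.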
Ontology Engineering: Semi-Automatic Ontology Construction
• Ontology life cycle of the DILIGENT ontology engineering and construction methodology: building, local adaptation, analysis, revision, and local update.
• Semi-automatic ontology construction (à la the CRISP-DM ‘data mining’ methodology):
  • 1. Domain understanding: the area of interest.
  • 2. Data understanding: the available data versus what semi-automatic ontology construction needs.
  • 3. Task definition: tasks of interest that are doable with the available data.
  • 4. Ontology learning: semi-automatic process executing the tasks of step 3.
  • 5. Ontology evaluation: estimating the quality of the solutions to the tasks.
  • 6. Refinement (semi-automatic or manual): human-in-the-loop transformation to improve the ontology.
[Figure labels: Business Domain, Ontology Domain]
Ontology Learning Scenarios
• Typical ones are as follows:
  • Inducing concepts and clustering of instances (given instances)
  • Inducing relations (given concepts and instances)
  • Ontology population (given an ontology and relevant but unassociated instances)
  • Ontology generation (given instances and background information)
  • Ontology updating (given an ontology and new instances).
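As a rough illustration of the ontology population scenario, the sketch below attaches each unassociated instance to the concept whose existing instances share the most words with it. The Jaccard word-overlap measure, the function names, the threshold, and the toy data are all assumptions made for this example; they are not taken from the referenced chapter, and the sketch assumes the ontology has at least one concept.

```python
def tokens(text):
    """Lower-cased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard overlap of two word sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def populate(ontology, new_instances, threshold=0.1):
    """Map each new instance text to its best-matching concept, or None.

    ontology: dict mapping concept name -> list of instance texts.
    """
    # One word profile per concept: the union of its instances' words.
    profiles = {
        concept: set().union(*(tokens(t) for t in texts))
        for concept, texts in ontology.items()
    }
    assignment = {}
    for text in new_instances:
        words = tokens(text)
        best_score, best_concept = max(
            (jaccard(words, profile), concept)
            for concept, profile in profiles.items()
        )
        # Leave the instance unassigned when nothing is similar enough.
        assignment[text] = best_concept if best_score >= threshold else None
    return assignment


# Toy data, invented for illustration only.
ontology = {
    "Tool": ["protege ontology editor", "swoop owl ontology editor"],
    "Conference": ["dawak data warehousing conference",
                   "semtech semantic technology conference"],
}
result = populate(ontology, ["another ontology editor",
                             "a semantic web conference",
                             "weather report"])
```

Instances that fall below the threshold stay unassigned, which leaves room for the human-in-the-loop refinement step to decide whether a new concept is needed.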
Knowledge Discovery for Ontology Learning
• KD aims to extract structure from data, that is, to map unstructured data into an ontological structure.
• Keep scalability in mind: the KD process is necessarily applied to real-life data volumes (~terabytes).
• Some KD techniques used in addressing the ontology learning scenarios:
  • Unsupervised Learning
  • Semi-Supervised, Supervised, and Active Learning
  • Stream Mining & Web Mining
  • Focused Crawling
  • Data Visualization
Unsupervised Learning
• Groups similar instances by comparing them against each other, and suggests labels for the groups that emerge. Methods used are:
  • Document Clustering
  • Latent Semantic Indexing
• Reference Section 2.6.1.
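To make the idea of document clustering with label suggestion concrete, here is a minimal sketch. It assumes a plain bag-of-words representation, cosine similarity, and a greedy single-pass clustering with a fixed threshold; these choices, the function names, and the toy documents are assumptions for illustration and are not necessarily the methods described in Section 2.6.1.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering.

    Each document joins the first cluster whose centroid (summed term
    counts) is similar enough; otherwise it starts a new cluster.
    Returns a list of (centroid, member_indices) pairs.
    """
    clusters = []
    for i, doc in enumerate(docs):
        vec = vectorize(doc)
        for centroid, members in clusters:
            if cosine(vec, centroid) >= threshold:
                centroid.update(vec)   # fold the document into the centroid
                members.append(i)
                break
        else:
            clusters.append((Counter(vec), [i]))
    return clusters


def suggest_label(centroid, k=2):
    """Suggest the k most frequent terms of a cluster as its label."""
    return [term for term, _ in centroid.most_common(k)]


# Toy documents, invented for illustration only.
docs = [
    "ontology editor tools",
    "web usage mining",
    "ontology editor survey",
    "web content mining",
]
clusters = cluster(docs)
groups = [members for _, members in clusters]
labels = [suggest_label(centroid) for centroid, _ in clusters]
```

The result of a single-pass scheme like this depends on document order and on the threshold, so the suggested labels are candidates for a human to review rather than final concept names.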
Semi-Supervised, Supervised, and Active Learning
• Human-in-the-loop, tool-assisted approaches.
• Reference Section 2.6.2.
Stream Mining & Web Mining
• Stream mining: schemes that run continuously over rapidly changing data.
• Web mining:
  • Web content mining
  • Web structure mining
  • Web usage mining
• Reference Section 2.6.3.
Focused Crawling
• Approaches for collecting documents on the Web that are relevant to a given topic.
• Reference Section 2.6.4.
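The best-first idea behind focused crawling can be sketched on a tiny in-memory "web". Everything here is an assumption for illustration: the page identifiers, texts, links, the keyword-fraction relevance score, and the function names are hypothetical. A real crawler would fetch pages over the network and would have to estimate how promising a link is before downloading its target, whereas this sketch simply scores the target's text directly.

```python
import heapq


def relevance(text, topic_terms):
    """Fraction of words in `text` that are topic terms."""
    words = text.lower().split()
    return sum(w in topic_terms for w in words) / len(words) if words else 0.0


def focused_crawl(pages, links, seed, topic_terms, min_score=0.2, limit=10):
    """Best-first crawl over an in-memory graph.

    pages: page id -> text; links: page id -> list of linked page ids.
    The most relevant frontier page is visited first, and links are
    followed only from pages that pass the relevance cut-off.
    """
    # heapq is a min-heap, so scores are negated to pop the best page first.
    frontier = [(-relevance(pages[seed], topic_terms), seed)]
    seen = {seed}
    collected = []
    while frontier and len(collected) < limit:
        neg_score, page = heapq.heappop(frontier)
        if -neg_score < min_score:
            continue  # off-topic: neither collect it nor follow its links
        collected.append(page)
        for target in links.get(page, []):
            if target not in seen and target in pages:
                seen.add(target)
                heapq.heappush(
                    frontier, (-relevance(pages[target], topic_terms), target))
    return collected


# Toy web, invented for illustration only.
pages = {
    "a": "ontology tools survey",
    "b": "semantic web ontology editor",
    "c": "football results today",
    "d": "ontology learning",
    "e": "semantic ontology",
}
links = {"a": ["b", "c"], "b": ["d"], "c": ["e"]}
found = focused_crawl(pages, links, "a", {"ontology", "semantic"})
```

Page "e" is on topic but is linked only from the off-topic page "c", so this sketch never reaches it; that trade-off between coverage and crawl cost is typical of pruning the frontier by relevance.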
Data Visualization
• For obtaining early measures of data quality, content, and distribution.
• Reference Section 2.6.5.
Further References on Ontology Construction
• Reference Section 2.7.
  • Especially note the Fernandez (1999) paper, which analyzes ontology development approaches against the IEEE Standard for Developing Software Life Cycle Processes.
• Reference Section 2.8: note the hints on research directions.
Ontology Development Tools
• Ontology Tools Survey, Revisited, by Michael Denny
• W3C Semantic Web Tools Wiki page
Commercial SemWebTech Conferences
• Semantic Technology Conference (SemTech 2007), 20-24 May, 2007, San Jose, California, USA. A PDF of the conference brochure is available for download at the conference website.
• DAMA Intl Symposium & WILLSHIRE Meta-data Conference, 4-8 March, 2007, Boston, MA, USA. The full conference program and brochure are available as a PDF (1.3 MB); see also the other Willshire Conference tracks.
References
• John Davies, Rudi Studer, Paul Warren (Editors): Semantic Web Technologies: Trends and Research in Ontology-based Systems, John Wiley & Sons (July 11, 2006). ISBN: 0470025964. Ch. 2, pp. 9-25.
• Brusa, G., Caliusco, M.L. and Chiotti, O. (2006). A Process for Building a Domain Ontology: an Experience in Developing a Government Budgetary Ontology. In Proc. Second Australasian Ontology Workshop (AOW 2006), Hobart, Australia. CRPIT, 72. Orgun, M.A. and Meyer, T., Eds., ACS, pp. 7-15.
• Ontology Tools Survey, Revisited, by Michael Denny (published July 14, 2004 on xml.com), along with Michael's famous Ontology Editor Survey 2004 Table.
• W3C Semantic Web Tools Wiki page:
  • Check Jena, SemWeb, Protégé, Swoop, etc.