
The Semantic Web Made Simple




  1. The Semantic Web Made Simple David Price December 2004 david.price@eurostep.com

  2. Agenda • The Current Web • and its technologies • How’s it work now? • The Semantic Web • is adding semantics • How’s it going to work in the future?

  3. The Current Web • Web core concepts • People read Web pages • Web page authors can control basic layout • Web pages need to link to each other • Web pages need to link to online media • that people read, view, listen to or interpret • People use tools that search/recall Web content (Yahoo, Google, Lycos, their own bookmarks)

  4. What’s on a Web Page? • Categories of articles • Text that’s actually graphics • Online shopping link • Article title and link • A photograph • Date and time • Article abstract • Location, temperature and unit

  5. What we saw • Things on the NY Times site • Text that’s actually a graphic • Categories of articles • Online shopping link • A photograph • Article title and link • Article abstract • Date and time • Location, temperature and unit • How did we know that? • Because we are humans who can read English and who can interpret what we see

  6. What did the editors do? • Determined the layout of the pages as a whole • Should it look like a real paper? Should there be advertising? • Wrote the text • Decided on navigation • Article categories called “International”, “National”, “Sports”, etc. • Article category list items link to a separate page for each category, with a list of articles • Users will have to scroll down the page to see the headline articles • Article titles will link directly to a separate page for each article

  7. How did they do that? • They used HTML and graphic images • Hypertext Markup Language (HTML) allows editors to • control presentation and layout • Paragraph, Bold, Table/Column/Row • add links to other pages • Hyperlink Reference • show graphics • Images of many types are natively supported by browsers • link to other media that have software to present them • music, video, PDF, documents, presentations

  8. A peek under the covers

  9. How does that work? • HTML is a standard language • The World Wide Web Consortium standardized it • Companies have written software that reads HTML and presents it to you • These are Web browsers • The presentation capabilities of HTML, the related media and browsers are pretty powerful

  10. How does HTML really work? • What do the browsers understand? • <P>This is a paragraph.</P> • Present the text “This is a paragraph.” as a new paragraph • <A HREF="newsitems.html">News</A> • Present a hyperlink with the text “News” and, if it’s selected, present a new page from the file “newsitems.html” • <TR><TD>dog</TD><TD>cat</TD></TR> • In the current row of the table, present the text “dog” in column 1 and the text “cat” in column 2 • <IMG SRC="p1.jpg" /> • Present an image from whatever is in the file named “p1.jpg”
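
To make the point about what a browser "understands" concrete, here is a minimal sketch using Python's standard `html.parser` module. The `TagLogger` class and `describe` function are hypothetical names introduced for illustration. The parser reports tags, attributes and text, and nothing about what the content means.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records the structural events a parser sees, with no notion of meaning."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased.
        self.events.append(("start", tag, dict(attrs)))

    def handle_data(self, data):
        # Keep only non-empty text runs.
        if data.strip():
            self.events.append(("text", data.strip()))


def describe(html):
    """Parse an HTML fragment and return the list of recorded events."""
    parser = TagLogger()
    parser.feed(html)
    parser.close()
    return parser.events


events = describe('<P>This is a paragraph.</P><A HREF="newsitems.html">News</A>')
for event in events:
    print(event)
```

The program learns that there is a paragraph and a link to "newsitems.html", but not that the link leads to news articles.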

  11. So, What’s the problem? • Only a human being can read a Web page and extract any meaning from it • The Web browser does understand paragraph, image, link • The Web browser does not know it’s linking to a “News Article” or the image is a “picture of photographs” • It’s the meaning that’s really important • Wouldn’t it be powerful if computers could get some of the meaning out of Web pages?

  12. Why is it a powerful idea? • Using our NY Times/newspaper site example… • Suppose you were an Environmental Group • Suppose you want to monitor news stories about the environment or pollution • You could write a program that searches the Web media outlets • That program could trigger a notification about articles on environmental issues • Or, it could contact members of your group in specific locations when it finds legislation related to pollution in particular US states • This would save your members a lot of time searching for themselves, wouldn’t it?

  13. The Semantic Web • Figuring out how to get meaning out of things on the Web using software is what “The Semantic Web” is all about • “using software” means “without humans doing the interpretation” • How would one do that? • Clearly, HTML is not sufficient, so more powerful languages are required • Clearly, we cannot replace everything already on the Web, so ways to add meaning are required • Need to combine better languages/communication, computer science and the study of what things mean

  14. Semantics • People have been studying what things exist and what they mean for centuries • This is called Philosophy • People have been studying how people communicate for decades • This is called Linguistics • People have been studying how computers can “learn” for a few decades • This is called Artificial Intelligence

  15. The Semantic Web • Vision of Web “inventor” Tim Berners-Lee and others • Wrote an article in Scientific American in 2001 • Goals • Go beyond processing by human beings • Make Web content computer processable • How? • Add semantics using ontologies • Use inference/reasoning over ontologies

  16. Ontologies • Ontology • A big word from philosophy, linguistics, and computer science • A formal, machine readable specification of a domain of interest • Names things and adds knowledge about and constraints on the things • Allows relationships between terms within and between different ontologies • Semantic Web researchers and W3C have been working on ontologies for several years now

  17. OWL History • US researchers produced DAML-ONT in 2000 • DARPA Agent Markup Language – Ontology Language • European researchers produced OIL at about the same time • Ontology Inference Layer • DAML-ONT and OIL were merged to produce DAML+OIL, which was submitted to W3C as a Note; the W3C WebOnt group was formed in 2001 • W3C WebOnt Group produced OWL in 2003 • OWL is now a W3C Recommendation • This is not really that important for our purposes… just remember that OWL didn’t appear overnight

  18. What is OWL? • The World Wide Web Consortium (W3C) created the HTML and XML standards • OWL is a next-generation W3C Web standard • its purpose is to add “semantics” to the Web • Therefore, it can be distributed and is Web-enabled and does not assume a single source for everything • In concept, it is very much like other data modelling languages (it calls models or schemas “ontologies”) • class, subclass, property, property type, instance/individual • supports set theory and logic-based statements about the classes and individuals • it has more than one syntax, XML being one

  19. RDF underlies OWL • RDF is another W3C standard, the Resource Description Framework • RDF is simple in concept but sufficient for many basic Semantic Web tasks (e.g. who created this presentation?) • It allows you to assign a property with a value to a Web page (or any Web resource)

  20. RDF underlies OWL • RDF is another W3C standard, the Resource Description Framework • RDF is simple in concept but sufficient for many basic Semantic Web tasks (e.g. who created this presentation?) • RDF is often represented by nodes and arcs • Example: node http://www.eurostep.com/TheSemanticWeb.ppt, arc Creator, value “David Price”
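
The node-and-arc picture can be sketched as a list of (subject, property, value) statements. This is an illustration in plain Python, not an RDF library: the `Triple` type and `values` function are hypothetical names, and the bare label `Creator` follows the slide, whereas in real RDF the property would itself be identified by a URI.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One arc in the graph: subject node, property label, value node."""
    subject: str
    predicate: str
    obj: str


# The single statement shown on the slide.
graph = [
    Triple("http://www.eurostep.com/TheSemanticWeb.ppt", "Creator", "David Price"),
]


def values(graph, subject, predicate):
    """Return every value attached to a subject through the given property."""
    return [t.obj for t in graph if t.subject == subject and t.predicate == predicate]


creators = values(graph, "http://www.eurostep.com/TheSemanticWeb.ppt", "Creator")
print(creators)
```

Asking "who created this presentation?" then becomes a lookup on the graph rather than a human reading a page.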

  21. Back to the NY Times

  22. What we saw… again • Things on the NY Times site • Text that’s actually a graphic • Categories of articles • Online shopping link • A photograph • Article title and link • Article abstract • Date and time • Location, temperature and unit • How did we know that? • Because we are humans who can read English and who can interpret what we see

  23. A peek under the semantic covers • Article title • Newspaper ontology • Article Authors • Date • Article Subjects

  24. Without using an editor … Now these are semantics a software application can understand… Articles and Authors

  25. On Annotating the Web • You might ask: But what about the current Web content, we’re not going to rewrite it all are we? • And we’d answer: Of course not, but you can “annotate” it to add semantics. • What this means is: • Descriptive ontologies like the one for Newspapers are being developed • Descriptions are then linked to already existing Web pages, including any multi-media content (e.g. video) • The Semantic Web community calls this “annotating a Web resource” • You’ll also hear people use the term “metadata”

  26. So, How does OWL Work? • An Ontology • is a formal description of a field of interest • defines Classes – the kinds of things of interest • Article, Person, etc. • defines Properties – the relationships and characteristics related to Classes • Article is WrittenBy Person, Person has Name • Then, based on the Ontology people create content • An author writes articles using software that understands the Newspaper Ontology • The Publisher gathers all the articles, classifieds, etc. and links them into the online version of the NY Times
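
The Classes and Properties idea can be sketched in a few lines of plain Python. This is a simplified illustration rather than OWL syntax or an OWL reasoner: the dictionary layout, the `infer_types` function and the sample identifiers (`article1`, `person1`, "Jane Doe") are all hypothetical. It shows one kind of inference an ontology enables: if a property's subjects are Articles and its values are Persons, then any statement using that property tells software what kind of things it relates.

```python
# Hypothetical mini-ontology using the class and property names from the slide.
CLASSES = {"Article", "Person"}
PROPERTIES = {
    "WrittenBy": {"domain": "Article", "range": "Person"},
    "Name": {"domain": "Person", "range": None},  # None: the value is plain text
}


def infer_types(statements):
    """Derive class membership from each property's domain and range."""
    types = set()
    for subject, prop, value in statements:
        spec = PROPERTIES[prop]
        types.add((subject, spec["domain"]))
        if spec["range"] is not None:
            types.add((value, spec["range"]))
    return types


# Content created against the ontology (illustrative identifiers only).
statements = [
    ("article1", "WrittenBy", "person1"),
    ("person1", "Name", "Jane Doe"),
]

inferred = infer_types(statements)
print(sorted(inferred))
```

Nobody stated that `article1` is an Article; the software worked it out from the ontology.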

  27. But how does that help? • If everyone, or at least a reasonably large community, agreed on an ontology for Newspapers • then sharing articles between sites is possible • presentation can be layered on top of the semantic content of the articles • Web robots, only smarter than Google, can find and relate content about specific subjects, by specific authors, etc. • The key is getting agreement on the ontologies • This is ongoing in various standards bodies, consortia, etc. but remains a major issue for the Semantic Web
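
As a rough sketch of why agreement matters, assume two sites publish statements using the same property names. The site URLs, author names and the `find_articles` function below are invented for illustration. Because both sites say `Subject` and `WrittenBy` the same way, one small function can search across them.

```python
# Statements as (subject, property, value); all URLs and values are hypothetical.
SITE_A = [
    ("http://site-a.example/articles/1", "Subject", "pollution"),
    ("http://site-a.example/articles/1", "WrittenBy", "A. Author"),
    ("http://site-a.example/articles/2", "Subject", "sports"),
]
SITE_B = [
    ("http://site-b.example/news/9", "Subject", "pollution"),
    ("http://site-b.example/news/9", "WrittenBy", "B. Writer"),
]


def find_articles(statements, prop, value):
    """Return the subjects that have the given property value, in first-seen order."""
    found = []
    for subject, p, v in statements:
        if p == prop and v == value and subject not in found:
            found.append(subject)
    return found


# Sharing an ontology means the two sites' statements can simply be combined.
merged = SITE_A + SITE_B
pollution = find_articles(merged, "Subject", "pollution")
print(pollution)
```

If each site used its own private names for subject and author, the same query would need a separate translation step per site.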

  28. In Conclusion • The Semantic Web goal is to make semantic content of Web pages available for software applications • Work has been ongoing for several years • Building on decades of research • The OWL language is a key development • As are the languages upon which it is based, such as RDF Schema • But that’s for another day…
