
Semantic Web infrastructure Trisolda current state and perspectives

Semantic Web infrastructure Trisolda current state and perspectives. Filip Zavoral, Jiří Dokulil, SemWex - KSI MFF UK, http://www.ksi.mff.cuni.cz/semwex/. 10. Mixer 26.11.2008. Semantic web vs. semantization. Semantic web vision, Tim Berners-Lee


Presentation Transcript


  1. Semantic Web infrastructure Trisolda current state and perspectives Filip Zavoral, Jiří Dokulil SemWex - KSI MFF UK http://www.ksi.mff.cuni.cz/semwex/ 10. Mixer 26.11.2008

  2. Semantic web vs. semantization • Semantic web vision • Tim Berners-Lee • “The Semantic Web,” Scientific Am. 2001 • semantic research generously funded • 'hardly one has ever seen ...' • New buzzwords • Web 2.0, Web 3.0, Social web, Web of data, Mashups, … • Semantic web died? • no, not yet born • Web Semantization

  3. Semantic technologies Browser Security HTML HTTP TCP/IP

  4. Technical details

  5. Semantic web services

  6. Trisolda • Motto • 'hardly one has ever seen ...' the semantic web • data from real life • incomplete, duplicated, inaccurate, more than 20 million triples • Jena • very slow load, crashes above 1 million triples • Sesame • unable to load more than 200 000 triples • exponential complexity for loading • where is a working platform for semantic web research? • Technology background • Repository – data integration • DataPile

  7. Trisolda • Trisolda Architecture • Import interfaces • Repository • Querying & Executors

  8. Repository • Trisolda Repository • Stores incoming data • Retrieves results for queries • Stores used ontology • DataPile structure • holds data in any format • Applications server • Not all data and knowledge available when imported • the knowledge is not accurate • Background worker • inferencing • data unifications • reasoner • Framework for plug-ins

  9. Import • Direct import • data in data sources • converters to the used ontology • Crawling wild Web • Egothor web crawler • AgentMat • parsed pages stored • deductors deduce data and ontology • real life data • incomplete, duplicated, inaccurate • Import modes • batch insert • immediate insert 

  10. Querying • Query API • Based on simple graph matching • query: set of RDF triples with var. • result: multiset of possible variable mapping – a relation • Not another SQL-like language • set of C++ classes and operators • Query evaluation • levels of support by q engines • Query environments • present outputs • examples: rep. browser, RDF visualizer, semantic executors • service composition - conductors
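
The graph-matching idea on this slide can be illustrated with a minimal Python sketch. It is not the Trisolda query API (which the slide describes as a set of C++ classes and operators); the function names, the `ex:` identifiers and the sample data are all assumptions made for illustration. A query is a list of triples in which terms starting with `?` are variables, and the result is a list (a multiset) of variable mappings.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    """Terms starting with '?' are treated as variables (an assumed convention)."""
    return term.startswith("?")


def match_triple(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` matches `triple`; return None on conflict."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check that an earlier binding agrees.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def match_query(patterns: List[Triple], data: List[Triple]) -> List[Binding]:
    """Naive nested-loop matching of every pattern against every data triple."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        extended_bindings = []
        for b in bindings:
            for triple in data:
                extended = match_triple(pattern, triple, b)
                if extended is not None:
                    extended_bindings.append(extended)
        bindings = extended_bindings
    return bindings


# Hypothetical sample data, not taken from the Trisolda repository.
DATA: List[Triple] = [
    ("ex:john", "ex:firstname", "John"),
    ("ex:john", "ex:lastname", "Doe"),
    ("ex:jane", "ex:firstname", "Jane"),
    ("ex:jane", "ex:lastname", "Doe"),
]

QUERY: List[Triple] = [
    ("?person", "ex:firstname", "?firstname"),
    ("?person", "ex:lastname", "?lastname"),
]

rows = match_query(QUERY, DATA)
```

Each element of `rows` is one variable mapping, so the whole result can be read as a relation with one column per variable. A real engine would use indexes rather than nested loops; the sketch only shows the semantics.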

  11. AgentMat - data semantization framework

  12. AgentMat - data extraction

  13. Future work • Conclusions • working infrastructure • currently not working - re-deployment, AgentMat & TriQ integration • gathering, storing and querying of semantic data • platform for research and experiments • Future work & long-term goals • specialized semantic data storage • semantic acquisition, data semantization • interface-based loosely coupled network of Semantic Web repositories • semantic computing, services, composition, executors ...

  14. Selected Publications • Beňo, Míšek, Zavoral: AgentMat: Framework for Data Scraping and Semantization, 3rd International Conference on Research Challenges in Information Science, IEEE, 2009 • Dokulil, Yaghob, Zavoral: Trisolda: The Environment for Semantic Data Processing, International Journal On Advances in Software, IARIA, 2009 • Podzimek, Dokulil, Yaghob, Zavoral: Mám hlad: pomůže mi Sémantický web?, Informačné technológie - Aplikácia a Teória, ITAT 2008 • Dokulil, Tykal, Yaghob, Zavoral: Semantic Web Repository And Interfaces, International Conference on Advances in Semantic Processing, SEMAPRO 2007, IEEE Computer Society Press - Best Paper Award • Dokulil, Tykal, Yaghob, Zavoral: Semantic Web Infrastructure, IEEE International Conference on Semantic Computing ICSC, IEEE Computer Society Press 2007 • Yaghob, Zavoral: Semantic Web Infrastructure using DataPile, Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, Hong Kong, IEEE Computer Society Press 2006

  15. Web (3TB of *.cz) Crawler Galamboš Repository WIE Dědek Semantic processing Zavoral, Dokulil User profile Eckhardt User (agent, services) Mlýnková, Nečaský Beňo, Míšek Formal models Bednárek Trust and reputation Novotný Parallel Yaghob Lobotic Ploblems Obdržálek Security Tykal, Knap

  16. PART II: Tables in RDF querying - do we really need them?

  17. SPARQL syntax SQL-like – at first look “simple language” but complex grammar {?x ?y ?z . OPTIONAL { ?a ?b ?c . } . ?k ?l ?m . } {?x ?y ?z OPTIONAL { ?a ?b ?c } ?k ?l ?m }

  18. SPARQL semantics lot of changes – now stable based on algebra works with sets of variable mappings – i.e. tables very different from SQL “closed” no compositionality
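
To make "sets of variable mappings, i.e. tables" concrete, here is a small Python sketch of the join used by the SPARQL algebra: two mappings are merged when they agree on every variable they share. The function names and the sample mappings are assumptions for illustration only.

```python
from typing import Dict, List

Mapping = Dict[str, str]


def compatible(m1: Mapping, m2: Mapping) -> bool:
    """Two mappings are compatible if they agree on all shared variables."""
    return all(m2[var] == value for var, value in m1.items() if var in m2)


def join(left: List[Mapping], right: List[Mapping]) -> List[Mapping]:
    """Merge every compatible pair of mappings from the two inputs."""
    return [{**m1, **m2} for m1 in left for m2 in right if compatible(m1, m2)]


# Hypothetical results of two separate triple patterns.
names = [
    {"?p": "ex:john", "?first": "John"},
    {"?p": "ex:jane", "?first": "Jane"},
]
mails = [
    {"?p": "ex:john", "?mail": "john@doe.com"},
    {"?p": "ex:john", "?mail": "johndoe@work.com"},
]

joined = join(names, mails)
```

The output is again a list of mappings, which is why the result behaves like a table: the graph structure of the input is gone after the first operation.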

  19. SPARQL RDF is a graph SPARQL provides pattern (subgraph) matching – no other graph handling SPARQL handles only fixed-size graphs RDFS supports arbitrary hierarchy of classes SPARQL has no aggregate functions, no “group by” no constructors

  20. Seasoned SQL developer

  21. Seasoned SQL developer

  22. Idea… ? make the language SQL-like inside not just outside joins, selection, projection, grouping, aggregation relational algebra works with relations, i.e. sets of tuples, the database is made of relations RDF data is made of… RDF graphs maybe we should work with RDF graphs

  23. Tables – Graphs John Smith John Doe Jane Doe Bill Jackson

  24. Basic pattern variables -> “columns” ?firstname ex:firstname ?person ex:lastname ?lastname
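
The "variables become columns" step can be sketched in a few lines of Python. The names come from the previous slide; the `ex:p1` ... `ex:p4` identifiers and the `to_table` helper are hypothetical and only stand in for whatever the matching step would return.

```python
from typing import Dict, List


def to_table(bindings: List[Dict[str, str]], columns: List[str]) -> List[List[str]]:
    """Turn variable mappings into a header row plus one row per mapping."""
    header = [c.lstrip("?") for c in columns]
    rows = [[b.get(c, "") for c in columns] for b in bindings]
    return [header] + rows


# Assumed matches of the pattern
#   ?person ex:firstname ?firstname . ?person ex:lastname ?lastname
bindings = [
    {"?person": "ex:p1", "?firstname": "John", "?lastname": "Smith"},
    {"?person": "ex:p2", "?firstname": "John", "?lastname": "Doe"},
    {"?person": "ex:p3", "?firstname": "Jane", "?lastname": "Doe"},
    {"?person": "ex:p4", "?firstname": "Bill", "?lastname": "Jackson"},
]

table = to_table(bindings, ["?firstname", "?lastname"])
for row in table:
    print(" | ".join(row))
```

Leaving `?person` out of the column list is a projection: the variable is still needed to connect the two edges of the pattern, but it does not have to appear in the output.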

  25. Further operations selection, joins, aggregation, projection group by

  26. Multiple values john@doe.com ex:mail ex:john ex:mail johndoe@work.com

  27. Local and global aggregations more values in one “column” maximal number of mails total count of mails
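
One possible reading of local versus global aggregation, sketched in Python: a local aggregate is computed over the several values that sit in one "column" of a single result (the mails of one person), while a global aggregate is computed across all results. This interpretation, the variable names, and the address `jane@doe.com` are assumptions; only the two addresses for `ex:john` appear on the slides.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

triples: List[Tuple[str, str, str]] = [
    ("ex:john", "ex:mail", "john@doe.com"),
    ("ex:john", "ex:mail", "johndoe@work.com"),
    ("ex:jane", "ex:mail", "jane@doe.com"),  # hypothetical extra person
]

# One "column" may hold several values: collect all mails per person.
mails_by_person: Dict[str, List[str]] = defaultdict(list)
for subject, predicate, obj in triples:
    if predicate == "ex:mail":
        mails_by_person[subject].append(obj)

# Local aggregation: count the values inside each person's mail column.
mail_counts = {person: len(mails) for person, mails in mails_by_person.items()}

# Global aggregations over all persons.
max_mails = max(mail_counts.values())    # maximal number of mails
total_mails = sum(mail_counts.values())  # total count of mails
```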

  28. What’s more? optional parts of the graph regular expressions textual representation (language)

  29. Conclusion current state is bad try something different ?

  30. PART III: Let’s have a look – RDF visualizer

  31. RDF subject – the thing we are describing predicate – the property of the thing object – the value of the property a graph (directed, labeled)
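
As a small illustration of the triple model, the following Python sketch stores triples as a directed, labeled graph: subjects and objects are nodes, and each predicate labels an edge from subject to object. The `build_graph` helper and the sample triples are assumptions, not part of the RDF visualizer.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def build_graph(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each subject to its outgoing (predicate, object) edges."""
    graph: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


# Hypothetical triples describing one person.
triples: List[Triple] = [
    ("ex:john", "ex:firstname", "John"),
    ("ex:john", "ex:lastname", "Doe"),
    ("ex:john", "ex:mail", "john@doe.com"),
]

graph = build_graph(triples)
for predicate, obj in graph["ex:john"]:
    print(f"ex:john --{predicate}--> {obj}")
```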

  32. Visualization triangle layout layered drawing for trees node merging more information for a node navigation the way to handle huge data

  33. Let’s have a look A picture is worth a thousand words…
