An architecture for peer-to-peer reasoning

George Anadiotis, Spyros Kotoulas and Ronny Siebes, VU University Amsterdam.


Presentation Transcript


  1. An architecture for peer-to-peer reasoning George Anadiotis, Spyros Kotoulas and Ronny Siebes VU University Amsterdam

  2. Obsolete motivation slides • Why do we need distribution… • Why do we need anytime behavior… • Why it should be (very) scalable… • Why should we drop consistency and completeness… • Why do we need trust/ontology ranking… • etc.

  3. Talk outline • What is P2P? (1 slide) • Relationship between P2P and the Semantic Web (SW) (3 slides) • Our Goal (1 slide) • Distributed SW stores (1 slide) • Structured P2P stores (3 slides) • Federated stores (2 slides) • Our approach (6 slides) • Future work (1 slide)

  4. Peer-to-Peer • Class of distributed systems • Most important characteristics • Same functionality across peers • Peer autonomy • Formation of overlay networks • Common interface • They respect some agreed-upon way to organize • File-sharing networks are NOT the only Peer-to-Peer systems.

  5. What is the relationship between Peer-to-Peer and the Semantic Web?

  6. Peer-to-Peer benefits from the Semantic Web in: • Source of semantic information to self-organize • Interoperability

  7. Semantic Web benefits from Peer-to-Peer in: • Scalable infrastructure for • Storage • Reasoning • Collaboration • Self-organization • Autonomy: control of data • Privacy • Scalable algorithms • Robustness • No censorship • No preferential treatment of information • Common misconception: all Peer-to-Peer systems can offer the above

  8. Our Goal • Global-scale semantic web storage and reasoning • Scalability • Computation • Administration

  9. Two strands of distributed semantic web stores • Structured peer-to-peer • Use DHTs • One global distributed store • Peers do not maintain their own data • Federated stores • Each peer maintains its own store • Stores are interconnected • Either global schema or mappings between schemata

  10. Distributed Hash Tables • The mathematical abstraction for hash tables is a Map • Functionality: • put(key, value) • get(key) • Similar to normal hash tables, with the difference that each bucket is now a peer • Accessing different buckets involves network traffic • Routing to a bucket involves contacting approximately log(N) peers, where N is the network size

  11. Toy DHT • (Figure: ring of peers with IDs a to x) • Example: <Key=horse, Value=the horse is an animal> • Values are stored in the peer whose ID starts with the first letter of the key
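The toy scheme on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `ToyDHT` and its methods are hypothetical, one peer is created per lowercase letter (the slide shows peers a to x), and the roughly log(N) routing hops of a real DHT are not modeled at all.

```python
import string


class ToyDHT:
    """Toy DHT (hypothetical sketch): a key lives at the peer whose ID
    equals the first letter of the key."""

    def __init__(self, peer_ids=string.ascii_lowercase):
        # Each peer plays the role of one hash-table bucket.
        self.peers = {peer_id: {} for peer_id in peer_ids}

    def responsible_peer(self, key):
        # The "hash function" of the toy example: first letter of the key.
        return key[0].lower()

    def put(self, key, value):
        bucket = self.peers[self.responsible_peer(key)]
        bucket.setdefault(key, []).append(value)

    def get(self, key):
        bucket = self.peers[self.responsible_peer(key)]
        return bucket.get(key, [])


dht = ToyDHT()
dht.put("horse", "the horse is an animal")
print(dht.responsible_peer("horse"), dht.get("horse"))
```

In a real deployment, `put` and `get` would each trigger network messages to reach the responsible peer; here every "peer" is just a local dictionary.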

  12. RDF storage on top of DHTs • Peer 1 publishes: <rabbit, subClassOf, animal>, <seal, subClassOf, animal>, <animal, lives_in, habitat> • Peer 2 publishes: <monk_seal, subClassOf, seal>, <mseal1, type, monk_seal> • (Figure: ring of peers a to x; a copy of each triple is placed at the peers matching the first letters of its subject, predicate and object, e.g. <animal, lives_in, habitat> at a, h and l)
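A minimal sketch of the placement the figure suggests, assuming the first-letter routing of the toy DHT and assuming that every triple is stored once per term (subject, predicate, object). The function names `store_triples` and `lookup` are hypothetical.

```python
from collections import defaultdict

TRIPLES = [
    ("monk_seal", "subClassOf", "seal"),
    ("mseal1", "type", "monk_seal"),
    ("rabbit", "subClassOf", "animal"),
    ("seal", "subClassOf", "animal"),
    ("animal", "lives_in", "habitat"),
]


def store_triples(triples):
    """Return {peer_id: {term: set of triples}} using first-letter routing."""
    peers = defaultdict(lambda: defaultdict(set))
    for triple in triples:
        for term in triple:
            # The peer responsible for a term is named after its first letter.
            peers[term[0]][term].add(triple)
    return peers


def lookup(peers, term):
    """Fetch all triples mentioning `term` from the responsible peer."""
    return peers.get(term[0], {}).get(term, set())


peers = store_triples(TRIPLES)
print(sorted(peers))           # peers that hold at least one triple
print(lookup(peers, "seal"))   # triples with seal as subject or object
```

Storing each triple three times is what lets a query with any single known term be answered with one lookup, at the price of tripling the storage.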

  13. Reasoning and inferred triples • RDFS class axioms • (1) <X, subClassOf, Z> <- <X, subClassOf, Y>, <Y, subClassOf, Z> • (2) <X, type, Z> <- <X, type, Y>, <Y, subClassOf, Z> • (Figure: the DHT ring from the previous slide, now also holding three copies of the inferred triple <monk_seal, subClassOf, animal>)

  14. Reasoning and inferred triples • RDFS class axioms • (1) FORALL O,V O[rdfs:subClassOf->V] <- EXISTS W (O[rdfs:subClassOf->W] AND W[rdfs:subClassOf->V]). • (2) FORALL O,T O[rdf:type->T] <- EXISTS S (S[rdfs:subClassOf->T] AND O[rdf:type->S]). • (Figure: the same DHT ring, now holding the inferred triples <monk_seal, subClassOf, animal> and <mseal1, type, animal>, three copies each)
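The two class axioms can be applied by naive forward chaining until no new triple appears. The sketch below runs on a single in-memory set rather than on a DHT, and the function name `rdfs_closure` is hypothetical; it is meant only to show what "calculating the closure" involves.

```python
TRIPLES = {
    ("monk_seal", "subClassOf", "seal"),
    ("mseal1", "type", "monk_seal"),
    ("rabbit", "subClassOf", "animal"),
    ("seal", "subClassOf", "animal"),
    ("animal", "lives_in", "habitat"),
}


def rdfs_closure(triples):
    """Apply rules (1) and (2) repeatedly until a fixpoint is reached."""
    closure = set(triples)
    while True:
        sub = {(s, o) for s, p, o in closure if p == "subClassOf"}
        typ = {(s, o) for s, p, o in closure if p == "type"}
        new = set()
        # Rule (1): subClassOf is transitive.
        for x, y in sub:
            for y2, z in sub:
                if y == y2:
                    new.add((x, "subClassOf", z))
        # Rule (2): instances of a class are instances of its superclasses.
        for x, y in typ:
            for y2, z in sub:
                if y == y2:
                    new.add((x, "type", z))
        if new <= closure:
            return closure
        closure |= new


inferred = rdfs_closure(TRIPLES) - TRIPLES
print(sorted(inferred))
```

Besides the two inferred triples shown in the figure, rule (2) also yields <mseal1, type, seal> in this sketch. In the DHT setting every inferred triple would again be inserted once per term.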

  15. Limitations • As shown, the transitive closure has to be calculated, because backward chaining would require many DHT messages • But it does not scale to a large number of ontologies. • E.g. an animal hierarchy: adding the triple <animal, subClassOf, living_organism> means that for all triples with animal, we need to insert an additional triple. • Control over ontologies • Provenance of information • Ontologies and instance data are made public • Publishers are not in control of their ontologies/data • One super-user inserts all data

  16. Federated Stores • Each peer maintains its ontology and instance data • Mappings are (manually) defined between ontologies • Thus, a semantic topology is created • Queries are posted according to such a schema and forwarded following these mappings • Semantic Web counterpart of Federated Databases

  17. Limitations • Bootstrapping • New peers have to manually map their ontologies to the ontology of a peer already in the network • Finding relevant ontologies requires flooding • Routing • The overlay is created according to the ontologies understood by peers, not the data they contain. Possible scalability problem. • Searching for instances requires flooding

  18. Our approach • Effort to combine both approaches • Use a DHT to efficiently find ontologies and instance data • Exploit semantic locality by keeping ontologies local to the publisher • Whenever possible, perform reasoning peer-to-peer

  19. Indexing • Peer 1 holds: <rabbit, subClassOf, animal>, <seal, subClassOf, animal>, <animal, lives_in, habitat> • Peer 2 holds: <monk_seal, subClassOf, seal>, <mseal1, type, monk_seal> • DHT index (term: peers holding triples that mention it): animal:P1; habitat:P1; lives_in:P1; monk_seal:P2; mseal1:P2; rabbit:P1; seal:P1,P2; subClassOf:P1,P2
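A minimal sketch of such an index, assuming it maps every term to the set of peers whose local store mentions it. The names `PEER_STORES` and `build_index` are hypothetical, and the sketch indexes all terms, so it also produces an entry for `type`, which the figure does not list.

```python
from collections import defaultdict

# Triples stay in the local store of the peer that published them.
PEER_STORES = {
    "P1": [
        ("rabbit", "subClassOf", "animal"),
        ("seal", "subClassOf", "animal"),
        ("animal", "lives_in", "habitat"),
    ],
    "P2": [
        ("monk_seal", "subClassOf", "seal"),
        ("mseal1", "type", "monk_seal"),
    ],
}


def build_index(peer_stores):
    """Map each term to the set of peers that hold triples mentioning it."""
    index = defaultdict(set)
    for peer, triples in peer_stores.items():
        for triple in triples:
            for term in triple:
                index[term].add(peer)
    return dict(index)


index = build_index(PEER_STORES)
for term in sorted(index):
    print(f"{term}:{','.join(sorted(index[term]))}")
```

Only these small term-to-peer entries would be published on the DHT; the triples themselves never leave their publisher.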

  20. Querying • Peer 3 issues the query <seal, subClassOf, X?>, <Y?, subClassOf, seal> • Index lookup for seal returns P1, P2 • Peer 1 holds <rabbit, subClassOf, animal>, <seal, subClassOf, animal>, <animal, lives_in, habitat> and answers <seal, subClassOf, animal> • Peer 2 holds <monk_seal, subClassOf, seal>, <mseal1, type, monk_seal> and answers <monk_seal, subClassOf, seal> • DHT index: animal:P1; habitat:P1; lives_in:P1; monk_seal:P2; mseal1:P2; rabbit:P1; seal:P1,P2; subClassOf:P1,P2

  21. Querying 2 • Peer 3 issues the query <monk_seal, subClassOf, X?> • Index lookup for monk_seal returns P2 • Peer 2 holds <monk_seal, subClassOf, seal>, <mseal1, type, monk_seal> and answers <monk_seal, subClassOf, seal> • Follow-up query <seal, subClassOf, X?>: index lookup for seal leads to P1 • Peer 1 holds <rabbit, subClassOf, animal>, <seal, subClassOf, animal>, <animal, lives_in, habitat> and answers <seal, subClassOf, animal> • DHT index: animal:P1; habitat:P1; lives_in:P1; monk_seal:P2; mseal1:P2; rabbit:P1; seal:P1,P2; subClassOf:P1,P2
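The two querying slides can be approximated as follows. This is a sketch under assumptions: variables are written as `None`, a pattern is forwarded only to peers indexed under every constant term it contains, and the follow-up queries are driven by a single loop. The slides do not say which peer issues the follow-up, and all function names here are hypothetical.

```python
from collections import defaultdict

STORES = {
    "P1": [("rabbit", "subClassOf", "animal"),
           ("seal", "subClassOf", "animal"),
           ("animal", "lives_in", "habitat")],
    "P2": [("monk_seal", "subClassOf", "seal"),
           ("mseal1", "type", "monk_seal")],
}

# Term -> peers index, standing in for the lookups made on the DHT.
INDEX = defaultdict(set)
for peer, triples in STORES.items():
    for triple in triples:
        for term in triple:
            INDEX[term].add(peer)


def peers_for(pattern):
    """Peers indexed under every constant term of the pattern."""
    candidates = set(STORES)
    for term in pattern:
        if term is not None:
            candidates &= INDEX.get(term, set())
    return candidates


def query(pattern):
    """Forward the pattern to the candidate peers and merge their answers."""
    results = set()
    for peer in peers_for(pattern):
        for triple in STORES[peer]:
            if all(p is None or p == t for p, t in zip(pattern, triple)):
                results.add(triple)
    return results


def superclasses(cls):
    """Follow subClassOf links by issuing one query per class found."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for _, _, parent in query((current, "subClassOf", None)):
            if parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


print(peers_for(("monk_seal", "subClassOf", None)))
print(superclasses("monk_seal"))
```

No closure is materialized in advance here: the superclasses of monk_seal are gathered at query time by contacting only the peers that the index points to.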

  22. Advantages • Control • Access Control • Select which data is published on the index • Trust – ban spammers, remember good peers • Privacy • It is possible to obfuscate descriptors stored in the DHT • Responsibility • Publisher has the responsibility to maintain own data • Scalability • DHTs can scale to millions of nodes • Data is up-to-date

  23. Performance indicators • Based on the data of Swoogle, there is currently little overlap between ontologies • The distribution of ontology popularity follows a power-law pattern • If most answers reside on the same peer, our approach outperforms those that rely on triple distribution on top of a DHT

  24. Current and future work • Simulations using SWD from Swoogle and Watson (around 25,000) • Integration of privacy in the index • Selecting the right ontologies/peers

  25. The end… ?
