
U-P2P

U-P2P. A Peer-to-peer System for Description and Discovery of Resource-sharing Communities Aloke Mukherjee, Carleton University August 28, 2003. Peer-to-peer File-sharing. Exploit storage capability of the edge Balance load Robustness to failure Weaknesses: Search and Communities.


Presentation Transcript


  1. U-P2P: A Peer-to-peer System for Description and Discovery of Resource-sharing Communities. Aloke Mukherjee, Carleton University, August 28, 2003

  2. Peer-to-peer File-sharing • Exploit storage capability of the edge • Balance load • Robustness to failure • Weaknesses: Search and Communities

  3. Search Problem • Lack of structured metadata • Filenames, Keyword matching • Opaque identifiers • Support for popular formats • Ignoring structured metadata • Implicit indicators • Collaborative filtering

  4. State of the Art: Search

  5. Community Problem • Not simple to create a community for sharing a new file format • Current state • Different protocols/apps (gnutella, fasttrack, jxtasearch) • Inadequate metadata (filename matching, limited schemas) • Ad-hoc attempts aimed at specific domains • Scattered and isolated – there is no easy way to discover communities

  6. State of the Art: Communities

  7. Improving Search • Standard metadata layer • Explicit structured metadata • All resources are XML files • XML Schema used to describe format (e.g. MP3, design pattern)
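The difference between filename keyword matching and explicit structured metadata can be sketched as follows. This is a minimal illustration, not U-P2P code: the filenames, the `RESOURCES` table, and both function names are hypothetical, and only the `designpattern` fields come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared resources: filename -> XML metadata record.
RESOURCES = {
    "gof_singleton_v2.xml":
        "<designpattern><name>singleton</name>"
        "<author>gang of four</author></designpattern>",
    "observer-notes.xml":
        "<designpattern><name>observer</name>"
        "<author>gang of four</author></designpattern>",
    "singleton_rant.xml":
        "<designpattern><name>service locator</name>"
        "<author>anonymous</author></designpattern>",
}


def filename_search(keyword):
    """Keyword matching on filenames only; the file contents are ignored."""
    return sorted(f for f in RESOURCES if keyword.lower() in f.lower())


def metadata_search(field, value):
    """Match on an explicit metadata field inside each XML resource."""
    hits = []
    for filename, xml_text in RESOURCES.items():
        root = ET.fromstring(xml_text)
        if root.findtext(field) == value:
            hits.append(filename)
    return sorted(hits)
```

Here `filename_search("singleton")` also returns the file that merely mentions the word in its name, while `metadata_search("name", "singleton")` returns only the resource whose `name` field matches.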

  8. Schema instantiates resource

     <schema>
       <element name="designpattern">
         <sequence>
           <element name="name" type="string"/>
           <element name="author" type="string"/>
           <element name="context" type="string"/>
           <element name="problem" type="string"/>
           <element name="design" type="string"/>
           <element name="diagram" type="anyURI"/>
         </sequence>
       </element>
     </schema>

     <designpattern>
       <name>singleton</name>
       <author>gang of four</author>
       <context>when creating a new class…</context>
       <problem>ensure a class only has…</problem>
       <design>make the class itself responsible…</design>
       <diagram>http://example.com/singleton.jpg</diagram>
     </designpattern>
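The relationship between the schema and a resource on this slide can be checked with a short sketch. It assumes the slide's simplified, namespace-free schema notation and only compares element names and order; it is not a real XML Schema validator, and the function names `declared_fields` and `conforms` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified schema in the slide's notation (not full W3C XML Schema).
SCHEMA = """<schema>
  <element name="designpattern">
    <sequence>
      <element name="name" type="string"/>
      <element name="author" type="string"/>
      <element name="context" type="string"/>
      <element name="problem" type="string"/>
      <element name="design" type="string"/>
      <element name="diagram" type="anyURI"/>
    </sequence>
  </element>
</schema>"""

# The resource from the slide; the elided text is kept as elided.
RESOURCE = """<designpattern>
  <name>singleton</name>
  <author>gang of four</author>
  <context>when creating a new class…</context>
  <problem>ensure a class only has…</problem>
  <design>make the class itself responsible…</design>
  <diagram>http://example.com/singleton.jpg</diagram>
</designpattern>"""


def declared_fields(schema_text):
    """Return the root element name and its ordered child field names."""
    top = ET.fromstring(schema_text).find("element")
    fields = [e.get("name") for e in top.find("sequence").findall("element")]
    return top.get("name"), fields


def conforms(schema_text, resource_text):
    """True if the resource has exactly the declared elements, in order."""
    root_name, fields = declared_fields(schema_text)
    resource = ET.fromstring(resource_text)
    return resource.tag == root_name and [c.tag for c in resource] == fields
```

Type checking (for example, that `diagram` holds a valid URI) is left out; a full validator would be needed for that.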

  9. Automated interface generation [Diagram labels: resource XML schema, XSLT, resource create form, resource search form, resource view, resource (instantiates)]

  10. [Diagram, continued: same labels as slide 9]

  11. [Diagram, continued: same labels as slide 9]
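The idea of deriving a create form from the resource schema can be sketched as below. The slides indicate that U-P2P does this with XSLT stylesheets; this sketch swaps in plain Python as a stand-in because the standard library has no XSLT processor. The function name `create_form`, the `data-resource` attribute, and the mapping of `anyURI` to a URL input are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Shortened schema in the slide's simplified notation.
SCHEMA = """<schema>
  <element name="designpattern">
    <sequence>
      <element name="name" type="string"/>
      <element name="author" type="string"/>
      <element name="diagram" type="anyURI"/>
    </sequence>
  </element>
</schema>"""


def create_form(schema_text):
    """Emit one HTML input per field declared in the schema."""
    top = ET.fromstring(schema_text).find("element")
    resource_name = escape(top.get("name"), quote=True)
    lines = [f'<form method="post" data-resource="{resource_name}">']
    for field in top.find("sequence").findall("element"):
        name = escape(field.get("name"), quote=True)
        # Assumption: URI-typed fields get a URL input, everything else text.
        input_type = "url" if field.get("type") == "anyURI" else "text"
        lines.append(
            f'  <label>{name} <input type="{input_type}" name="{name}"></label>'
        )
    lines.append("</form>")
    return "\n".join(lines)


form = create_form(SCHEMA)
```

A search form or a resource view could be produced the same way from the same schema, which is the point of the slide: one schema drives all three interfaces.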

  12. Community Creation and Discovery: What is a Community? • Concrete object with defined tuple of attributes • Simplest form: (format, protocol, …) • Known examples: (mp3, napster) (video, kazaa) • Examples that don’t exist: (design patterns, gnutella) (p2p papers, jxtasearch) • Tuple is specified as an XML file

  13. Simplifying Community Creation • User-designed communities • Compose schema to describe format • Compose community XML file

     <community>
       <name>designpatterns</name>
       <schema>designpattern.xsd</schema>
       <protocol>gnutella</protocol>
       <display>designpattern.stylesheet</display>
     </community>
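Reading a community XML file back into the attribute tuple described on the previous slides might look like this. It is a minimal sketch: the `Community` type and `load_community` function are hypothetical names, and only the four elements shown on the slide are assumed to exist.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple


class Community(NamedTuple):
    """The community tuple, with the four attributes shown on the slide."""
    name: str
    schema: str
    protocol: str
    display: str


COMMUNITY_XML = """<community>
  <name>designpatterns</name>
  <schema>designpattern.xsd</schema>
  <protocol>gnutella</protocol>
  <display>designpattern.stylesheet</display>
</community>"""


def load_community(xml_text):
    """Parse a community file; reject it if any attribute is missing."""
    root = ET.fromstring(xml_text)
    if root.tag != "community":
        raise ValueError(f"expected <community>, got <{root.tag}>")
    values = {}
    for field in Community._fields:
        text = root.findtext(field)
        if text is None:
            raise ValueError(f"missing <{field}> element")
        values[field] = text.strip()
    return Community(**values)


community = load_community(COMMUNITY_XML)
```

The resulting `(community.schema, community.protocol)` pair corresponds to the "(format, protocol, …)" form of the tuple.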

  14. Community as class [Diagram labels: mp3 community, mp3 class, mp3, mp3]

  15. Metaclass analogy [Diagram labels: community community, class class, mp3 community, mp3 class, mp3, mp3]
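The class and metaclass analogy maps fairly directly onto a language with first-class metaclasses. The sketch below uses Python's metaclass mechanism purely to illustrate the analogy; `CommunityMeta`, `Mp3`, and the `registry` dictionary are hypothetical names and say nothing about how U-P2P is implemented.

```python
class CommunityMeta(type):
    """Plays the 'community community': its instances are communities."""

    registry = {}

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # Creating a community also makes it discoverable by name.
        mcls.registry[name] = cls
        return cls


class Mp3(metaclass=CommunityMeta):
    """Plays the 'mp3 community': its instances are mp3 resources."""

    def __init__(self, title):
        self.title = title


song = Mp3("example track")
```

In this reading, `song` is an instance of `Mp3` just as an mp3 file belongs to the mp3 community, and `Mp3` is an instance of `CommunityMeta` just as the mp3 community is itself a resource in the community of communities.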

  16. Community discovery is File discovery • MP3 community shares MP3 files • Community community shares communities [Diagram of the mp3 community and the community community omitted]

  17. Simplifying Community Discovery • A Community for Communities: The Root Community • Communities are files shared in a real community • Root Community includes schema for communities (format, protocol) = (community, centralized db)

  18. Schema for Communities

     <schema>
       <element name="community">
         <complexType>
           <sequence>
             <element name="name" type="xsd:string"/>
             <element name="protocol" type="protocolTypes"/>
             <element name="schema" type="xsd:anyURI"/>
             <element name="display" type="xsd:anyURI"/>
           </sequence>
         </complexType>
       </element>
     </schema>

     The Root Community:

     <community>
       <name>root community</name>
       <schema>community.xsd</schema>
       <protocol>central-db</protocol>
       <display>community.stylesheet</display>
     </community>
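Because communities are themselves resources shared in the Root Community, the same search routine can serve both community discovery and file discovery. The sketch below illustrates this with an in-memory dictionary standing in for the central database; the `store`, `publish`, `search`, and `community_file` names are hypothetical, as is the exact-match query.

```python
import xml.etree.ElementTree as ET

ROOT = "root community"

# Hypothetical stand-in for the central database:
# community name -> list of parsed XML resources shared in it.
store = {ROOT: []}


def publish(community, xml_text):
    """Share an XML resource in the named community."""
    store.setdefault(community, []).append(ET.fromstring(xml_text))


def search(community, field, value):
    """Return resources in a community whose field equals value."""
    return [r for r in store.get(community, []) if r.findtext(field) == value]


def community_file(name, schema, protocol, display):
    """Build a community descriptor in the format shown on the slide."""
    return (
        f"<community><name>{name}</name><schema>{schema}</schema>"
        f"<protocol>{protocol}</protocol><display>{display}</display>"
        "</community>"
    )


# The root community describes itself and lists the other communities.
publish(ROOT, community_file("root community", "community.xsd",
                             "central-db", "community.stylesheet"))
publish(ROOT, community_file("designpatterns", "designpattern.xsd",
                             "gnutella", "designpattern.stylesheet"))
publish("designpatterns",
        "<designpattern><name>singleton</name>"
        "<author>gang of four</author></designpattern>")

# Step 1: discover a community by searching the root community.
found = search(ROOT, "name", "designpatterns")
# Step 2: search inside the discovered community with the same function.
patterns = search(found[0].findtext("name"), "author", "gang of four")
```

Both steps call the same `search` function; only the community being searched changes.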

  19. What is U-P2P? • A framework that breathes life into these ideas • Explicit metadata search and creation for every Community • Creation of Community tuples • (format, protocol, etc.) • Discovery of Community tuples

  20. Design

  21. Technologies • Java • Tomcat Servlet Container • Java Server Pages (JSP) + Servlets • XSLT (transforms), XPath (queries) • Java components for XSLT, XPath (Xerces, Xalan) • eXist XML Database • Log4j (logging infrastructure), JUnit (unit testing)

  22. Evaluation and Validation: Areas of Interest • Publish and Search times as Community size increases • Breaking down Publish and Search operations • Community effect • Multiple central servers

  23. Publish

  24. Search

  25. Community Effect

  26. Multiple Central Servers

  27. Publish with Multiple Servers

  28. Vs. Without Multiple Central Servers

  29. Contributions • Standard Metadata Layer • All communities include support for explicit metadata search and creation • User-designed Communities • Users can easily share new formats with full support for metadata • Community for Communities • Prevents fragmented, isolated communities by providing metadata about communities and a standard method for discovering them • Performance and Scalability Gains • Communities can improve performance and scalability vs. systems where resources are undifferentiated

  30. Future Work • Performance improvements • Protocol independence (adapters for Gnutella, Freenet, etc.) • Community-aware Gnutella routing • More Community parameters (security, authentication, etc.)

  31. Future Work continued • Trust metrics (to differentiate between communities, metadata quality) • Community evolution • Inheritance and multiple inheritance for Communities

  32. U-P2P Publications • A. Mukherjee, B. Esfandiari, N. Arthorne, "U-P2P: A Peer-to-peer System for Description and Discovery of Resource-sharing Communities", ICDCS Workshops 2002: 701-705, July 2002. • N. Arthorne, B. Esfandiari, A. Mukherjee, "U-P2P: A Peer-to-peer Framework for Universal Resource Sharing and Discovery", Proceedings of Freenix track of Usenix 2003, 29-38, June 2003. • http://u-p2p.sourceforge.net

  33. Backup slides

  34. WebAdapter: User Interaction Model

  35. Repository Design

  36. Repository Design: Resource IDs

  37. Repository Design: XML Database • Requirements • Flexibility to store wide variety of formats • Handle powerful queries over all metadata • XML Database better suited than RDBMS • Difficult to map fields to rows and columns • Chose eXist XML database • Open source • Written in Java • Support for XML:DB API

  38. Network Adapter Design • Abstract interface to Peer-to-peer Network • Routing search requests, handling results, handling incoming search requests, etc. • Only implemented Hybrid model (Napster model) • All peers can act as client and/or server
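An abstract network adapter with a hybrid (central index) implementation could be sketched as follows. The slides say U-P2P is written in Java and do not show the adapter's interface, so every class and method name here (`NetworkAdapter`, `CentralIndex`, `HybridAdapter`, `publish`, `search`) is a hypothetical illustration of the design, written in Python for brevity.

```python
from abc import ABC, abstractmethod


class NetworkAdapter(ABC):
    """Abstract interface to a peer-to-peer network (hypothetical)."""

    @abstractmethod
    def publish(self, resource_id, metadata):
        """Announce a locally stored resource to the network."""

    @abstractmethod
    def search(self, field, value):
        """Return (peer_id, resource_id) pairs matching the query."""


class CentralIndex:
    """Central server in the hybrid model: it holds metadata only."""

    def __init__(self):
        self.entries = []  # (peer_id, resource_id, metadata)


class HybridAdapter(NetworkAdapter):
    """Napster-style adapter: search via the index, fetch from peers."""

    def __init__(self, peer_id, index):
        self.peer_id = peer_id
        self.index = index

    def publish(self, resource_id, metadata):
        self.index.entries.append((self.peer_id, resource_id, dict(metadata)))

    def search(self, field, value):
        return [
            (peer, rid)
            for peer, rid, meta in self.index.entries
            if meta.get(field) == value
        ]


index = CentralIndex()
alice = HybridAdapter("peer-a", index)
bob = HybridAdapter("peer-b", index)
alice.publish("r1", {"name": "singleton", "author": "gang of four"})
hits = bob.search("name", "singleton")
```

A search result names the peer that holds the resource, so the transfer itself would go peer to peer. An adapter for another protocol such as Gnutella would subclass `NetworkAdapter` without changing its callers.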

  39. Network Adapter: Protocol

  40. Evaluation and Validation: Challenges • Finding large XML collections • Berkeley Drosophila Genome Project: genome annotations • Other sources: DBLP (CS papers), EDGAR (SEC filings), GeneOntology (gene-related concepts) • Transforming DTDs to XML Schema (DTDXS package) • Automation • XML-RPC interface for publish and search
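Automating publish and search through an XML-RPC interface means each operation travels as an XML-RPC method call. The sketch below only builds and decodes such calls with Python's standard `xmlrpc.client` module, without contacting a server. The method names `publish` and `search` and their parameter lists are assumptions; the slides do not give the actual U-P2P XML-RPC signatures.

```python
import xmlrpc.client

resource = "<designpattern><name>singleton</name></designpattern>"

# Hypothetical call shapes: (community, resource) and (community, field, value).
publish_request = xmlrpc.client.dumps(
    ("designpatterns", resource), methodname="publish"
)
search_request = xmlrpc.client.dumps(
    ("designpatterns", "name", "singleton"), methodname="search"
)

# Decoding a request recovers the parameters and the method name,
# which is what a server-side dispatcher would do on receipt.
params, method = xmlrpc.client.loads(publish_request)
```

A test driver could send many such requests in a loop and record the elapsed time per call to measure how publish and search scale with community size.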

  41. Publish: Breakdown of Operations

  42. Publish: Client Timings

  43. Publish: Server Timings

  44. Network Adapter: Protocol

  45. Search: Breakdown of Operations

  46. Search: Total vs. Server Timings
