PeerDB: A P2P-based System for Distributed Data Sharing

Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou

Shawn Jeffery

CS294-4 Peer-to-Peer Systems

11/05/03

Overview

  • A P2P “database” system

    • Allows content-based search

  • No global schema

  • Utilizes mobile agents

    • Provides flexibility and extensibility

  • Dynamically adjusts topology

Shawn Jeffery PeerDB


  • Generic P2P platform

  • Mobile Agents

    • Carry code and data

    • Collect stats

    • Security issues?

  • Dynamic Reconfiguration

    • How does this compare to Gia?

  • Location Independent Global Names Lookup (LIGLO) Servers

    • Small number

    • Provides a global identity for peers and peer status

    • Why not use a DHT/KBR/DOLR?

BestPeer Security

  • Private and sharable data

    • Agents only able to access sharable data

    • Does this adequately restrict the power of mobile agents?

  • Communications on the wire also encrypted

  • What’s missing?

[Figure labels: Sharable Data, Local Data]

Schema “Mediation”

  • Problems with supporting SQL queries:

    • No global schema information

    • Different nodes could name the same table/attribute differently (“len”, “length”)

  • Solution: User supplies metadata for each relation name and attribute

    • Users expected to do a lot

  • Formula based on matching relation keywords and attribute keywords to determine if a query matches a table

  • What about other schema mediation work (such as Piazza)?
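
The keyword-matching idea above can be sketched in a few lines of Python. This is only an illustration: the slides do not give PeerDB's actual formula, so the overlap score, the `attr_weight` parameter, and all function and field names here are assumptions.

```python
def keyword_overlap(query_keywords, stored_keywords):
    """Fraction of query keywords that also appear in the stored metadata."""
    query = {k.lower() for k in query_keywords}
    stored = {k.lower() for k in stored_keywords}
    if not query:
        return 0.0
    return len(query & stored) / len(query)


def match_score(query, relation, attr_weight=0.5):
    """Weighted mix of relation-keyword and attribute-keyword overlap (assumed form)."""
    rel_score = keyword_overlap(query["relation_keywords"],
                                relation["relation_keywords"])
    attr_score = keyword_overlap(query["attribute_keywords"],
                                 relation["attribute_keywords"])
    return (1 - attr_weight) * rel_score + attr_weight * attr_score


query = {"relation_keywords": ["protein"],
         "attribute_keywords": ["length", "name"]}

# Owner supplied synonyms ("len", "length") as metadata for the attribute.
well_described = {"relation_keywords": ["protein", "kinase"],
                  "attribute_keywords": ["len", "length", "name"]}

# Owner supplied only the local attribute name, so "length" does not match.
poorly_described = {"relation_keywords": ["protein"],
                    "attribute_keywords": ["len"]}

good_score = match_score(query, well_described)
poor_score = match_score(query, poorly_described)
```

The second relation scores lower only because its owner supplied less metadata, which is the "users expected to do a lot" concern in concrete form.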

Local Query Processing – Phase I

  • “Master Agent” coordinates the entire affair

  • Check Local Dictionary for matching relations

    • Use the relation matching strategy even for the local DB

  • Create “Relation Matching Agents” and flood to all neighbors

  • Wait for responses

    • Display results to user as they arrive

Local Query Processing – Phase II

  • User selects the relations he/she wants

    • Create a “Data Retrieval Agent”

    • Rewrite query in terms of new relations

  • If local, submit SQL to local db

  • Contact remote nodes directly to access the data

    • Creates remote join plans locally - optimization?
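
A minimal sketch of the "rewrite query in terms of new relations" step, assuming the user's selection yields a simple name-to-name mapping. The function `rewrite_query` and the example table names are hypothetical, and the sketch ignores string literals and quoted identifiers.

```python
import re


def rewrite_query(sql, name_map):
    """Replace whole identifiers in `sql` using `name_map`; other tokens are untouched."""
    def swap(match):
        word = match.group(0)
        return name_map.get(word, word)

    # Match identifier-like tokens only, so numbers and operators are left alone.
    return re.sub(r"[A-Za-z_][A-Za-z0-9_]*", swap, sql)


# Hypothetical mapping from the user's names to one remote peer's names.
remote_names = {"protein": "prot_tbl", "length": "len"}
rewritten = rewrite_query(
    "SELECT name, length FROM protein WHERE length > 100", remote_names
)
```

Here `rewritten` is the query expressed in the remote peer's vocabulary, ready to be carried by a Data Retrieval Agent.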

Remote Query Processing

  • Phase I: Find relations

    • Relation Matching Agents flood with TTL

    • Check Export Dictionary for a match

      • Return matches directly

  • Phase II: Get data

    • Data Retrieval Agent submits SQL to DBMS

    • Return data to the requesting node directly

    • Run further data processing before returning

      • Again, security issues
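
Phase I can be approximated with a small TTL-limited flood. The sketch below is a simulation over in-memory dictionaries, not PeerDB's agent code; the graph layout, the shape of the export dictionaries, and the function name are assumptions.

```python
from collections import deque


def flood_relation_match(neighbors, export_dicts, start, keyword, ttl):
    """Flood from `start`, decrementing TTL per hop; return (peer, relation) matches."""
    matches = []
    visited = {start}
    queue = deque([(start, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if peer != start:
            # A remote peer checks its export dictionary for the keyword.
            for relation, keywords in export_dicts.get(peer, {}).items():
                if keyword in keywords:
                    matches.append((peer, relation))
        if remaining == 0:
            continue  # TTL exhausted: do not forward any further
        for nxt in neighbors.get(peer, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, remaining - 1))
    return matches


neighbors = {"A": ["B"], "B": ["C"], "C": ["D"]}
export_dicts = {
    "B": {"prot": {"protein"}},
    "C": {"genes": {"gene"}},
    "D": {"proteins2": {"protein"}},
}
near = flood_relation_match(neighbors, export_dicts, "A", "protein", ttl=2)
far = flood_relation_match(neighbors, export_dicts, "A", "protein", ttl=3)
```

With `ttl=2` the flood stops at C and never sees the matching relation on D; raising the TTL to 3 finds it at the cost of more traffic.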


  • Master Agents monitor stats in the network

  • Keywords for some relations returned during Phase I

    • Update metadata

  • Number of objects returned for selected relations

    • Can be used for topology change decisions

  • Use most recently returned results as metric to determine who to connect with

    • Frequent updates – might need to change neighbors after each result returned
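
One way to picture the reconfiguration decision: rank peers by how many objects they returned for the most recent query and keep the top few as direct neighbors. The ranking rule, the tie-break, and the name `choose_neighbors` are assumptions made for illustration.

```python
def choose_neighbors(current_neighbors, answers_returned, max_neighbors):
    """Keep the peers that returned the most objects for the latest query."""
    candidates = set(current_neighbors) | set(answers_returned)
    # Sort by answers (descending), then by peer id for a deterministic tie-break.
    ranked = sorted(candidates, key=lambda p: (-answers_returned.get(p, 0), p))
    return ranked[:max_neighbors]


current = ["B", "C"]
latest_answers = {"D": 12, "B": 3, "E": 7}  # objects returned per peer
new_neighbors = choose_neighbors(current, latest_answers, max_neighbors=2)
```

In this example both existing neighbors are dropped after a single query, which illustrates the concern above about neighbors changing after every result.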


  • Cache all query results locally

  • Soft state

  • LRU replacement

  • Users choose which copy they want

    • Only provided with peer id and an indication of which is the source

    • What about a timestamp, etc.?

    • Again, user heavily involved
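
LRU replacement for cached query results can be sketched with `collections.OrderedDict`. The class name, the use of the query text as the cache key, and the capacity measured in entries are assumptions; the slides do not say how PeerDB sizes or keys its cache.

```python
from collections import OrderedDict


class ResultCache:
    """Small LRU cache: the least recently used entry is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, key, rows):
        self._entries[key] = rows
        self._entries.move_to_end(key)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used


cache = ResultCache(capacity=2)
cache.put("q1", ["row-a"])
cache.put("q2", ["row-b"])
cache.get("q1")             # q1 becomes most recently used
cache.put("q3", ["row-c"])  # evicts q2
```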

Relation Matching Performance

  • Significant tradeoff between precision and recall

  • Which is more important?

  • Is their approach acceptable?
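
For reference, precision and recall for a set of returned relations can be computed as below. The relation ids are made up for the example and are not taken from the paper's experiments.

```python
def precision_recall(returned, relevant):
    """Precision = hits / returned, recall = hits / relevant (0.0 when undefined)."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical outcome: 4 relations returned, 3 truly relevant, 2 in common.
precision, recall = precision_recall(
    returned={"r1", "r2", "r3", "r4"},
    relevant={"r1", "r2", "r5"},
)
```

A looser matching threshold returns more relations, which tends to raise recall and lower precision; that is the tradeoff the slide refers to.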

Experimental Methodology

  • Compare P2P Model vs Client/Server model

    • Client/server returns results via the search path (?)

  • Compare static vs reconfigurable networks

  • Compare agent vs message based approach

  • 32 Nodes

    • Is this enough?

Evaluation Scenarios (Metrics?)

  • Fixed set of nodes

    • Easily test P2P protocols, Reconfiguration strategies

  • Latency

  • Quality and Quantity

  • What else is important?


  • As the amount of storage on each node increases, latency decreases

    • Due to caching

  • In general, reconfiguration performs better

  • Response times on the order of 1 minute

    • Is this acceptable?

  • Agent based shown to be better

    • What if agent produces more data than it processes?

Discussion: A P2P DBMS?

  • PeerDB represents a tiny step towards a P2P DB (also PIER, Piazza)

    • What does it do right?

    • What else is needed?

    • Is it ideal to have a P2P DB?

    • Is it feasible?
