## A Step Back: Reflections on P2P


### 2 P2P or Not 2 P2P?

### Scooped, Again

### The Capacity of Wireless Networks

Mema Roussopoulos, Mary Baker, David S. H. Rosenthal, TJ Giuli, Petros Maniatis, Jeff Mogul

Ideal P2P properties

- Self Organizing
- P2P routing
- Discovery
- Symmetric communication
- Peers are approximately equal
- Decentralized control
- No single point of failure

Constraints

- Budget
- Resource relevance to participants
- Trust
- Rate of system change
- Criticality
- Accountability
- Fault tolerance

Candidate problems

- Routing Problems
- Internet Routing (RON)
- Ad hoc Routing in Disaster Recovery
- Metropolitan-area Cell Phone Forwarding
- Backup
- Internet Backup
- Corporate Backup
- Distributed Monitoring

Candidate problems

- Data Sharing
- File sharing
- Censorship Resistance
- Data Dissemination
- Usenet
- Non-critical Content Distribution
- Critical Flash Crowds
- Auditing
- Digital Preservation
- Distributed Time Stamping

Budget

Low

- Lowest possible cost per peer, rather than lowest global cost
- BitTorrent, Gnutella, Freenet, etc.
- SETI@home

Effect

- Dictates how many peers join
- Decides if P2P is viable for the problem

High

- Worries less about performance criticality
- Favors centralized approaches, P2P irrelevant
- Clusters, high-performance computing

Relevance

Low

- Personal data
- Private data
- Internet backup
- Corporate backup
- Web caching

Effect

- Relevance of resources encourages peers to join
- “When resource relevance is high, cooperation in a P2P solution evolves naturally”

High

- File sharing
- Freenet
- Content distribution
- Internet routing
- BitTorrent
- Gnutella
- Kazaa

Trust

Low

- Encryption
- Anonymity
- Freenet
- OceanStore
- Ivy
- Timestamping
- MojoNation

Effect

- Mutual trust
- Risks

High

- Gnutella
- Napster
- Overlays
- File sharing
- Usenet

Rate of Change

Low

- Tangler
- Freenet
- LOCKSS
- Time stamping
- Content distribution
- Usenet
- Flash crowds

Effect

- Churn
- Timeliness
- Consistency

High

- Internet routing
- Online net monitoring

Criticality

Low

- Usenet
- Content distribution
- Offline net study
- File sharing

Effect

- Centralized control
- Accountability
- Fault tolerance

High

- Ad hoc disaster recovery
- Flash crowds
- Internet monitoring
- Routing

Conclusion

- Framework for analyzing P2P applications
- Captures constraints and app requirements
- Limited budget is motivating factor
- Problems with low relevance are inappropriate for P2P

Critique

- Strengths
- Quantifies application requirements and suitable use cases
- Generically describes suitability of classes of P2P apps
- Weaknesses
- Incomplete view of requirements
- Fuzzy requirements not accounted for

Service Capacity

- # of peers available to serve a document
- Throughput of P2P system
- Average delay
- Rate of dissemination
- What factors govern how well a system scales?

Research problem

- Analyze behavior of P2P systems
- Describe and model capacity behavior
- Transient regime
- Steady state
- Analyze conditions
- Does system scale as modeled?
- Are delays and throughput bounded?

Throughput

- Transient
- Steady state

Service capacity model

- Steady state
- Impact of peer join/departure
- Performance
- Factors
- Peer selection
- Data management
- Multipart downloads
- Size of parts
- Admission and scheduling
- Traffic

Analysis

- 2 States
- Transient (branching process model)
- Steady state
- Deterministic
- Branching process
- Markov chain

Deterministic

- N − 1 users want a document, with $N = 2^k$
- s bits per request, so $s(N-1)$ bits in total
- Each transfer takes one time interval of s/b seconds (b is the upload bandwidth of a peer)
- Exponential growth: the number of peers holding the document doubles every interval
- Ability to serve large bursts
- Average delay scales as $\log_2 N$

[Figure: animation of the doubling tree over time slots 0 to 3, tracking Time, Rate, and Count]
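The doubling argument can be checked with a few lines of code. This is a minimal sketch, assuming every peer that holds the document serves exactly one new peer per interval; the function name `deterministic_spread` and the default `tau=1.0` (one interval of s/b seconds) are illustrative choices, not taken from the slides.

```python
def deterministic_spread(k, tau=1.0):
    """Simulate k intervals of the doubling model (N = 2**k peers in total)."""
    holders = 1                 # only the source holds the document at time 0
    completion_times = []       # one entry per served peer
    for slot in range(1, k + 1):
        newly_served = holders  # each holder serves exactly one new peer
        completion_times.extend([slot * tau] * newly_served)
        holders += newly_served
    return holders, completion_times


holders, times = deterministic_spread(k=4)
average_delay = sum(times) / len(times)
print(holders, len(times), average_delay)
```

With k = 4 the last peer finishes after 4 intervals, which is log2 of the 16 peers, and the average delay stays below that value.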

Multipart

- The document is split into m chunks of identical size
- Chunk transfers complete every $s/(mb) = \tau/m$ seconds, where $\tau = s/b$
- Optimization: peers favor others that hold no chunks yet
- At time slot k, the system is partitioned into k sets $A_i$, $i = 1, \dots, k$
- $|A_i| = 2^{k-i}$
- $A_i$ corresponds to the peers that have received only the i-th chunk

[Figure: peers grouped into sets A1 to A4 at time slot k]
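The set sizes on this slide can be sanity-checked numerically. The sketch below only evaluates the stated sizes 2^(k−i) for i = 1…k and confirms that they add up to 2^k − 1, that is, every peer except the original source; the helper name `partition_sizes` is hypothetical.

```python
def partition_sizes(k):
    """Sizes of the sets A_1 .. A_k at time slot k, using |A_i| = 2**(k - i)."""
    return [2 ** (k - i) for i in range(1, k + 1)]


sizes = partition_sizes(4)
total_peers_with_a_chunk = sum(sizes)
print(sizes, total_peers_with_a_chunk)
```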

Multipart

- Delay is in effect reduced by a factor of m
- Large values of m better, but require more network overhead
- Congestion, bandwidth bottleneck ignored in this model

Branching Process Model

- Let $N_d(t)$ = number of peers serving document d at time t
- $T_i$ is a random variable: the transfer time
- $E[T] = \tau = 1/\mu$
- Age-dependent branching process model, v = 2

Branching Process Model

- The mean of $N_d(t)$ grows exponentially, roughly as $e^{\beta t}$; the growth exponent β and the leading constant are the growth characteristics
- If T is exponentially distributed, $\beta = \mu$
- If T is deterministic, $\beta = \mu \ln 2$
- An exponential distribution increases the growth exponent

Effect of v on Growth

- … is inversely proportional to v
- Large fanout decreases the growth exponent
- Intuition: limit the number of downloads at each peer

Peer Churn

- Theorem II
- Peers exit the system with a given probability upon completion
- If v < 1, the system becomes extinct
- When peers exit, allowing multiple uploads ensures document availability and system growth

System grows slowly with increasing v
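The two growth exponents can be reproduced from the standard result for age-dependent branching processes: the exponent β solves v · E[exp(−βT)] = 1. The sketch below assumes that this is the exponent the slides refer to. The symbols `mu` (the rate 1/E[T]) and `beta`, and the helper `solve_beta`, are naming assumptions made for illustration.

```python
import math


def solve_beta(laplace, v=2, lo=1e-9, hi=100.0):
    """Bisection on v * E[exp(-beta * T)] = 1.

    `laplace(beta)` must return E[exp(-beta * T)], which decreases in beta.
    """
    for _ in range(200):
        mid = (lo + hi) / 2
        if v * laplace(mid) > 1:
            lo = mid   # beta is still too small
        else:
            hi = mid
    return (lo + hi) / 2


mu = 1.0
# Exponential transfer time with rate mu: E[exp(-beta*T)] = mu / (mu + beta)
beta_exponential = solve_beta(lambda b: mu / (mu + b))
# Deterministic transfer time 1/mu: E[exp(-beta*T)] = exp(-beta / mu)
beta_deterministic = solve_beta(lambda b: math.exp(-b / mu))
print(beta_exponential, beta_deterministic)
```

For v = 2 the exponential case yields an exponent of about μ and the deterministic case about μ ln 2, so the exponential distribution gives the larger exponent.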

Effect of m

- Allowing multipart downloads increases performance by factor m
- Growth rate increased by factor m
- Delay factor is reduced by 1/m
- Assumes peers are not simultaneously sharing multiple parts of files

Summary

Deterministic

- Time interval τ = s/b for a transfer
- $N = 2^k$
- Delays bounded by log N
- Exponential growth

Multipart

- Time interval τ/m
- Delays bounded by (τ/m) log N
- Space partitioned into sets
- More chunks is faster
- Network overhead is high

Branching

- Time interval is a random variable
- Delays bounded by log N
- Parameters determine operation
- Accounts for congestion, churn

Markov Chain

- Distant past irrelevant with knowledge of recent past
- Sequence of random variables, X1…Xn
- Transition matrix
- The eigenvector of the transition matrix with eigenvalue 1 gives the steady-state distribution

Markov Chain

[Figure: two-state weather Markov chain with states Sunny and Rainy and transition probabilities P(Rainy|Sunny), P(Sunny|Rainy), P(Sunny|Sunny), P(Rainy|Rainy), unrolled over days 0, 1, 2, …, n]
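To make the weather example concrete, the sketch below iterates a two-state chain until the distribution stops changing. The transition probabilities (0.1 for Sunny to Rainy, 0.5 for Rainy to Sunny) are made-up values chosen only for illustration; repeated multiplication is used instead of an eigenvector routine to keep the example dependency-free.

```python
def step(dist, P):
    """One day forward: multiply the row vector `dist` by transition matrix P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def stationary(P, iters=1000):
    """Approximate the steady-state distribution by repeated stepping."""
    dist = [1.0 / len(P)] * len(P)
    for _ in range(iters):
        dist = step(dist, P)
    return dist


# Rows: today's weather (Sunny, Rainy); columns: tomorrow's weather.
P = [[0.9, 0.1],   # hypothetical P(Sunny|Sunny), P(Rainy|Sunny)
     [0.5, 0.5]]   # hypothetical P(Sunny|Rainy), P(Rainy|Rainy)
pi = stationary(P)
print(pi)
```

The result is the distribution that is unchanged by one more step, which is the left eigenvector of P for eigenvalue 1.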

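Markov Model

- Requests arrive as a Poisson process
- State (x, y):
- x = number of peers requesting the document
- y = number of peers hosting the document
- Multipart files: peers that hold only part of the file also contribute upload capacity
- Service times are exponentially distributed

[Figure: state-transition diagram labeled with the total service rate, the full service rate, and the exit rate]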
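One way to encode such a two-dimensional chain is as a function that lists the transitions out of a state together with their rates. The rate expressions on the slide did not survive extraction, so everything below is an assumed parameterization in the style of common seed/downloader models: `lam` (arrival rate), `mu` (service rate of a full copy), `eta` (relative upload contribution of a partial peer), and `gamma` (exit rate of a hosting peer) are hypothetical names and are not recovered from the slides.

```python
def transitions(state, lam, mu, eta, gamma):
    """Outgoing transitions of state (x, y) as {next_state: rate}.

    x = peers requesting, y = peers hosting. All rate formulas here are
    assumptions made for illustration.
    """
    x, y = state
    moves = {(x + 1, y): lam}                        # a new request arrives
    if x > 0:
        moves[(x - 1, y + 1)] = mu * (eta * x + y)   # a download completes
    if y > 0:
        moves[(x, y - 1)] = gamma * y                # a hosting peer leaves
    return moves


example = transitions((2, 1), lam=1.0, mu=0.5, eta=0.5, gamma=0.25)
print(example)
```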

Performance

- Seeds/downloaders
- A key parameter is the upload ratio of a downloader to a seed
- A system with a high ratio leverages more capacity
- Marginal change in system performance is low when offered load is high

BitTorrent

- Multipart download
- Chunk size 1 mb
- Credit system
- Updates every 5 min
- 150-200 file insertions

Service capacity

Throughput

Delay

Conclusions

- The credit system and growth work against each other
- Offered load linearly scales with number of peers
- Large multi-part files spread better
- Peer churn reduces throughput to constant
- Delays decrease with offered load

Jonathan Ledlie, Jeff Shneidman, Margo Seltzer, John Huth

Outline

- Introduction
- Grid Computing: what they are, goals, manifestations
- P2P Systems: what they are, goals, manifestations
- Fallacies preventing cooperation
- Shared and Disjoint Problems
- Conclusions

Introduction

- Background, Motivation
- Peer-to-Peer vs. Grid Computing
- Overlapping problem domain
- P2P focuses on research
- Grid is concerned with concrete, tangible solutions
- History, repeated – the Web

Introduction – cont.

- Current trends
- Divergent, parallel development
- Duplication of work
- Grid: risk of non-optimal solutions
- Missing out on P2P’s strong achievements (search and storage scalability, decentralization, anonymity, denial of service prevention)
- Cooperation is the key

Grids

- What is the Grid?

“a type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across multiple administrative domains based on the resources’ availability, capability, performance, cost, and user’s QoS requirements”

- Short version: virtualizing computer resources
- Large scale heterogeneous resource sharing (different platforms, hardware/software architectures, and computer languages)
- Functional classification:
- Computational grids (run batch jobs during idle times)
- Data grids

Grid Goals

- Design goal:
- Solve problems too big for a single supercomputer, but retain the flexibility to work on multiple smaller problems
- Self-configuring, self-tuning, self-healing
- Allow data sharing and support computation across administrative domains
- Standardized programming interface
- GGF (Global Grid Forum)
- Globus toolkit – the de facto standard for grid middleware

Grid Manifestations

- Protocols:
- Resource management:
- Grid Resource Allocation & Management Protocol (GRAM)
- Information services:
- Monitoring and Discovery Service (MDS)
- Security services:
- Grid Security Infrastructure (GSI)
- Data movement and management:
- Global Access to Secondary Storage (GASS), GridFTP
- Tools:
- Grid Portal Software (GridPort, OGCE)
- Grid Packaging Toolkit
- Grid-enabled MPI (MPICH-G2)
- Network Weather Service
- Condor (CPU cycle scavenging) and Condor-G (job submission)
- APIs:
- Web Services: Open Grid Services Architecture (OGSA)

P2P

- What is P2P?

“…a class of applications that take advantage of resources – storage, cycles, content, human presence – available at the edges of the Internet”

- Decentralized, non-hierarchical node organization
- Inherently untrusted

P2P Goals

- Cost sharing / reduction
- Every peer is responsible for its own costs
- Reduction of file storage costs
- Reduction of computation costs
- Improved scalability / reliability
- Lack of centralization allows new algorithms (CAN, Chord, etc.) to be designed for improved scalability
- Resource Aggregation
- Every peer lends its own resources to the network
- Increased Autonomy
- Tasks are performed locally – no central service provider

P2P Goals – cont.

- Anonymity / Privacy
- Freenet
- Dynamism
- Nodes enter and leave the system in a transparent way
- Ad-hoc communication
- Members can join and leave based on their physical location or interests

Summary

Grids

- Parallel, distributed systems concerned with resource sharing, selection, aggregation
- Resource availability, capability, performance, cost, and user QoS requirements are considered
- Self-configuring, self-tuning, self-healing
- Idle cycle and storage utilization

P2P

- Distributed systems that take advantage of resources scattered throughout the Internet
- Decentralized, non-hierarchical node organization
- Concerned with fault tolerance, scalability, availability, etc.
- Idle cycle and storage utilization

Summary – cont.

Grid

- Distributed computation
- distributed.net
- SETI@home
- Data production / aggregation

P2P

- Distributed file sharing
- Gnutella, KaZaA
- Distributed computation
- distributed.net
- Anonymity
- Freenet, Publius

Fallacies preventing cooperation

- “The technical problems in Grid systems are different from those in p2p systems”
- Usage misconception: Grid for computing problems, P2P for file sharing
- Data handling and data production in Grid systems have become important
- P2P used in desktop collaboration and network computation
- “open problems” in both camps have striking similarities

Fallacies preventing cooperation

- “While the technical problems are similar, the architectures (physical topology, bandwidth availability and use, trust model, etc.) demand that the specific solutions be fundamentally different”
- Solving common problems through sharing good ideas from each community
- Application dependent: special requirements are tailored to application needs; however, the technical approaches for solving a particular problem could benefit both communities

Fallacies preventing cooperation

- “Grid projects do not have the flexibility to try new algorithms/ideas because they have to get real work done. P2P research is all about this flexibility”
- Grid has room for flexible research, too
- Testing new applications and protocols
- Users willing to adopt different technologies to get the work done


Shared problems

- Topology Formation
- Node join and neighbor discovery
- Work has been done by both groups:
- Grid: “On fully decentralized resource discovery in grid environments”
- P2P: “Self-organization in p2p systems”
- Grid infrastructure is not flexible – hard coded
- Could benefit from P2P research prototypes

Shared problems – cont.

- Utilization
- Resource discovery, data retrieval
- P2P hash-based look-up schemes are useful
- Resource management / optimization
- How to “best” utilize resources in a network
- Data replication/caching examined by both communities
- Scheduling and handling of contention
- P2P focus: bandwidth usage (e.g. Gnutella)
- Grid focus: scheduling
- Load balancing: break large tasks into distributed smaller ones
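The hash-based lookup idea mentioned above can be illustrated with a tiny consistent-hashing ring, in which a key is owned by the first node whose hash follows the key's hash. This is a simplified sketch in the spirit of schemes such as Chord, not an implementation of any particular system; the class `Ring`, the helper `h`, and the node and file names are all hypothetical.

```python
import bisect
import hashlib


def h(key, bits=32):
    """Map a string to a point on the ring using a truncated SHA-1 digest."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest[: bits // 8], "big")


class Ring:
    def __init__(self, nodes):
        self.points = sorted((h(n), n) for n in nodes)
        self.keys = [p for p, _ in self.points]

    def lookup(self, key):
        """Return the node responsible for `key` (its successor on the ring)."""
        i = bisect.bisect_left(self.keys, h(key)) % len(self.keys)
        return self.points[i][1]


nodes = ["node-a", "node-b", "node-c"]
ring = Ring(nodes)
owner = ring.lookup("report.pdf")
print(owner)
```

A useful property of this layout is that removing a node which does not own a key leaves that key's owner unchanged, so little data has to move when membership changes.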

Shared problems – cont.

- Coping with Failure
- P2P: lossy storage model (Freenet, Gnutella)
- Considerations for Grid adaptability:
- Different common loss model
- Storage size (order of half a petabyte/month)
- Security-related issues
- Authenticity: verification of data/computation
- Availability: resilience to DoS attacks
- Authorization: ACLs

Shared problems – cont.

- Maintenance
- P2P: essentially no standards or APIs
- Efforts by Berkeley BOINC, Google Compute, overlay standardization
- Grid: pushes for a standardized API
- GGF (Global Grid Forum)
- OGSA (Open Grid Services Architecture)
- Web services oriented API – Globus as reference implementation

Disjoint Problems

- Anonymity
- Not really useful for Grid systems, yet

Conclusions

- A lot of overlap between the goals and research interests of the two communities
- P2P community needs to consider the needs of the Grid users to see how existing research can be applied successfully to Grid problems
- Aim for common standards as much as possible

Piyush Gupta, P. R. Kumar

Outline

- Introduction
- Arbitrary Networks
- Protocol and Physical Model
- Upper bound on transport capacity
- Constructive lower bound on transport capacity
- Random Networks
- Protocol and Physical Model
- Constructive lower bound on throughput capacity
- Possible Implications
- Discussion of tradeoffs
- Conclusion

Introduction

- Ad-hoc wireless networks
- No centralized control
- Each node involved in routing scheme
- Problems:
- Network layer: routing
- MAC: varying network topology, decentralization
- TDMA too complex – no centralized control
- FDMA inefficient in dense networks
- CDMA difficult to implement
- Random access preferred

Introduction – cont.

- Sharing channels -> “hidden” and “exposed” terminal problem: MACA, MACAW – use handshake signals to alleviate part of these problems
- Physical layer
- power regulated to minimize interference
- Exploring the capacity of wireless networks
- n nodes deployed in a 1 sq. meter region
- average distance between source and destination is L-bar
- Bandwidth: each node can transmit at W bps over common wireless channel
- Multi-hop transmission with buffering
- Two types:
- Arbitrary Networks
- Random Networks

Arbitrary Networks

- Nodes are arbitrarily distributed over a unit area disc
- Destination is arbitrary
- Rate is arbitrary
- Transmission range is arbitrary
- How can we model if a transmission was received successfully by the receiver?
- Two models: Protocol Model, Physical Model

Protocol Model

- Transmission from $X_i$ to $X_j$ is successful if, for every other node $X_k$ transmitting simultaneously:

$$|X_k - X_j| \ge (1 + \Delta)\,|X_i - X_j|$$

where $X_i$ denotes the location of a node, and Δ > 0 is the guard zone specified by the protocol
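The guard-zone condition of the Protocol Model, in which every other simultaneous transmitter must be at least (1 + Δ) times the link length away from the receiver, translates directly into a small check. This is a minimal sketch with nodes given as planar coordinates; the function name `protocol_model_ok` and the sample positions are illustrative assumptions.

```python
import math


def protocol_model_ok(x_i, x_j, other_senders, delta):
    """True if the transmission x_i -> x_j succeeds under the guard-zone rule.

    Every other simultaneous sender x_k must satisfy
    |x_k - x_j| >= (1 + delta) * |x_i - x_j|.
    """
    link = math.dist(x_i, x_j)
    return all(math.dist(x_k, x_j) >= (1 + delta) * link for x_k in other_senders)


far_interferer = protocol_model_ok((0, 0), (1, 0), [(3, 0)], delta=0.5)
near_interferer = protocol_model_ok((0, 0), (1, 0), [(2, 0)], delta=0.5)
print(far_interferer, near_interferer)
```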

Physical Model

- $\{X_k : k \in \mathcal{T}\}$ – subset of nodes transmitting simultaneously at some time instant over a certain sub-channel
- $P_k$ – power level chosen at node $X_k$
- A transmission originating at node $X_i$, $i \in \mathcal{T}$, is successfully received at node $X_j$ if:

$$\frac{P_i / |X_i - X_j|^{\alpha}}{N + \sum_{k \in \mathcal{T},\, k \ne i} P_k / |X_k - X_j|^{\alpha}} \ge \beta$$

- β = minimum signal-to-interference ratio necessary for successful reception
- N = ambient noise power level
- α > 2
- Signal power decays with distance r as $1/r^{\alpha}$
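The Physical Model compares the received signal power with the noise plus the interference from all other simultaneous transmitters, each attenuated as 1/r^α. The sketch below computes that ratio for planar coordinates; the function names and the sample numbers are assumptions chosen so the arithmetic is easy to follow.

```python
import math


def sinr(x_i, x_j, p_i, interferers, noise, alpha):
    """Signal-to-interference-plus-noise ratio at receiver x_j.

    `interferers` is a list of (position, power) pairs for the other senders.
    """
    signal = p_i / math.dist(x_i, x_j) ** alpha
    interference = sum(p_k / math.dist(x_k, x_j) ** alpha for x_k, p_k in interferers)
    return signal / (noise + interference)


def physical_model_ok(x_i, x_j, p_i, interferers, noise, alpha, beta):
    """True if the ratio meets the minimum threshold beta."""
    return sinr(x_i, x_j, p_i, interferers, noise, alpha) >= beta


ratio = sinr((0, 0), (1, 0), 1.0, [((3, 0), 1.0)], noise=0.0625, alpha=4)
print(ratio)
```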

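Transport Capacity of Arbitrary Networks

- Bit-meter = one bit transported a distance of 1 m
- used as the indicator of a network’s transport capacity
- Protocol Model
- Main result: the transport capacity is $\Theta(W\sqrt{n})$ bit-meters per second
- If this capacity is divided equally between the n nodes, we have $\Theta(W/\sqrt{n})$ bit-meters per second for each node
- For equidistant destinations (each about 1 m from its source), the throughput capacity is $\Theta(W/\sqrt{n})$ bits per second per node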
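As a quick check of how a per-node figure follows from a network-wide one, take the known order of the transport capacity under the Protocol Model, $\Theta(W\sqrt{n})$ bit-meters per second, and divide it equally among n nodes. The values $n = 10^4$ and $W = 10^6$ bit/s below are illustrative only, and constants hidden by Θ are ignored.

```latex
\[
\frac{\Theta\!\left(W\sqrt{n}\right)}{n} = \Theta\!\left(\frac{W}{\sqrt{n}}\right),
\qquad
n = 10^{4},\; W = 10^{6}\ \text{bit/s}
\;\Longrightarrow\;
\frac{W}{\sqrt{n}} = 10^{4}\ \text{bit-meters/s per node (up to constants)}.
\]
```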

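Upper bound on transport capacity

Assumptions:

- There are n nodes arbitrarily located in a planar disk of unit area
- The network transports λnT bits over T seconds (each node generates bits at rate λ bps)
- The average distance between source and destination of a bit is $\bar{L}$
- Transmissions are slotted into synchronized slots of length τ seconds

Upper bound on transport capacity

- Protocol Model: $\lambda n \bar{L} = O(W\sqrt{n})$ bit-meters per second
- Physical Model: $\lambda n \bar{L} = O\!\left(W n^{(\alpha - 1)/\alpha}\right)$ bit-meters per second
- If $P_{\max}/P_{\min} < \beta$, then $\lambda n \bar{L} = O(W\sqrt{n})$ bit-meters per second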

Constructive lower bound on transport capacity

- Theorems and lemmas construct a scenario in which the order of the upper bound presented earlier is achieved
- There exists a placement of nodes and an assignment of traffic patterns such that the network can achieve a transport capacity on the order of $W\sqrt{n}$ bit-meters per second under the Protocol Model, and also on the order of $W\sqrt{n}$ bit-meters per second under the Physical Model
- Proofs are in the paper


Random Networks

- n nodes randomly located on the surface of a sphere of area 1 sq. meter (S2), or disk of area 1 sq. meter in the plane
- independently and uniformly distributed
- randomly chosen destination with send rate λ(n) bps
- assumptions: all nodes are homogeneous (all transmissions employ the same nominal range or power)
- Two models:
- Protocol model, Physical model

Protocol Model

- A transmission from $X_i$ reaches $X_j$ successfully if both of the following hold:

1. $|X_i - X_j| \le r$

2. For every other node $X_k$ transmitting simultaneously: $|X_k - X_j| \ge (1 + \Delta)\,r$

where $X_i$ represents the location of a node and r is the common range

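Physical Model

- $\{X_k : k \in \mathcal{T}\}$ – subset of nodes transmitting simultaneously at some time instant over a certain sub-channel
- Let P be the common power level

Then, a transmission from a node $X_i$, $i \in \mathcal{T}$, is successfully received by node $X_j$ if:

$$\frac{P / |X_i - X_j|^{\alpha}}{N + \sum_{k \in \mathcal{T},\, k \ne i} P / |X_k - X_j|^{\alpha}} \ge \beta$$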

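Throughput Capacity of Random Networks

- Feasible throughput
- λ(n) bits/sec is feasible if a transmission schedule exists such that every node can send λ(n) bits/sec on average to its destination node
- depends on the location of nodes (random)
- Result:
- Protocol model: $\lambda(n) = \Theta\!\left(W/\sqrt{n \log n}\right)$ bits/sec
- Physical model: $c\,W/\sqrt{n \log n}$ bits/sec is feasible and $c'\,W/\sqrt{n}$ bits/sec is not, with probability approaching 1 as $n \to \infty$, for suitable constants c and c'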

Constructive lower bound on throughput capacity

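- Goal: show that the virtual channel capacity guaranteed to each source-destination pair of randomly located nodes is

$$\lambda(n) = \frac{c\,W}{(1 + \Delta)^2 \sqrt{n \log n}} \text{ bits/sec}$$

with probability approaching 1 as $n \to \infty$, for some constant c > 0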

- Steps:
- define a Voronoi tessellation of S² where each cell is carefully chosen in relation to the number of nodes
- bound the number of interfering neighbors of a Voronoi cell
- bound the length of an all-cell transmission schedule
- define the routes of a packet in the Voronoi tessellation
- prove that each cell contains at least one node
- calculate the expected routes that pass through a cell and infer the expected traffic of each node


Possible Implications

- Results allow for a perfect scheduling algorithm that knows the location of all nodes and traffic demands, and coordinates the wireless transmissions temporally and spatially to avoid collisions (however, if the nodes are mobile or location information is not available, the capacity can only be smaller)
- As the number of nodes increases, the throughput decreases
- Feasible scenario: If communication occurs only between nearby nodes, the bit rate does not decrease with n
- Scaled distance between source and destination is O(1/sqrt(n)) meters
- Power consumption
- Faster rate of decay of signal power with distance allows greater transport and throughput capacity
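The scaling behind these implications can be tabulated quickly. The sketch below evaluates only the orders of growth: per-node throughput of a random network as W / sqrt(n log n), and the O(1/sqrt(n)) source-destination distance of the nearby-traffic scenario. All constants are omitted, so the numbers show trends rather than actual rates; the function names and the default W = 1.0 are illustrative.

```python
import math


def random_network_rate(n, W=1.0):
    """Order of per-node throughput, W / sqrt(n log n); constants omitted."""
    return W / math.sqrt(n * math.log(n))


def nearby_distance(n):
    """Order of the source-destination distance for nearby-only traffic."""
    return 1.0 / math.sqrt(n)


for n in (100, 10_000, 1_000_000):
    print(n, random_network_rate(n), nearby_distance(n))
```

As n grows, the first column of values falls, which is the throughput penalty of adding nodes, while the second shows how short the typical distance must become for the per-node bit rate to stay constant.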

Implications – cont.

- Division of labor is possible
- One node in a cell can be designated to relay multi-hop packets, if desired
- Tradeoffs behind the upper bound on throughput
- Conflict between reducing the number of hops and increasing spatial concurrency and frequency reuse
- Must reduce r(n) to smallest value possible without losing connectivity

Tradeoffs – cont.

- Arbitrary Networks under the Protocol model
- Constraints that determine the transport capacity to be at most on the order of $W\sqrt{n}$ bit-meters per second:
- The length of routes
- Consumption of two-dimensional area by transmission
- Total number of nodes

Conclusions

- Designers may want to consider designing networks with small number of nodes
- Communication with nearby nodes at constant bit rates can be provided in dense clusters of nodes, since the source–destination distance shrinks as O(1/sqrt(n))

Appendix: A spatial tessellation

- Voronoi tessellation of the surface of the S² sphere
- A Voronoi cell is the set of all points which are closer to ai than to any of the other aj’s
- Adjacent cells – share a common point
- Every node in a cell is within distance r(n) of every node in own cell or adjacent cell
- Interfering neighbors – a point in one cell is within a distance (2+Δ)r(n) of some point in the other cell
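Cell membership in a Voronoi tessellation reduces to a nearest-site query. The sketch below works in the plane rather than on the sphere, which is a simplifying assumption; the function name `voronoi_cell` and the sample sites are hypothetical.

```python
import math


def voronoi_cell(point, sites):
    """Index of the site whose Voronoi cell contains `point` (the nearest site)."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))


sites = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
cell_of_a = voronoi_cell((0.1, 0.2), sites)
cell_of_b = voronoi_cell((0.9, 0.1), sites)
print(cell_of_a, cell_of_b)
```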

Tessellation Properties

- For each ε > 0, there is a Voronoi tessellation such that each cell contains a disk of radius ε and is contained in a disk of radius 2ε
- Every Voronoi cell contains a disk of area $100 \log n / n$, with radius ρ(n)
- Every Voronoi cell is contained in a disk of radius 2ρ(n)

Bound on number of interfering neighbors of a cell

- Every cell in Vn has no more than c1 interfering neighbors
- c1 = f(Δ) and grows linearly in (1+Δ)²
- Allows construction of a schedule of bounded length
- Each cell in the tessellation Vn has an opportunity to transmit every 1+c1 slots such that transmission is successful within a r(n) distance from the transmitter (in the Protocol Model)
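A schedule of bounded length can be obtained by greedy coloring of the interference graph: if no cell has more than c1 interfering neighbors, giving each cell the lowest slot not used by its neighbors never needs more than 1 + c1 slots. The sketch below shows that idea on a toy graph. It is not the construction from the paper; the function name `greedy_schedule` and the ring-shaped neighbor map are made up for illustration.

```python
def greedy_schedule(neighbors):
    """Assign each cell the smallest slot not used by its interfering neighbors.

    `neighbors` maps a cell id to the list of cells it interferes with.
    """
    slot = {}
    for cell in neighbors:
        used = {slot[nb] for nb in neighbors[cell] if nb in slot}
        s = 0
        while s in used:
            s += 1
        slot[cell] = s
    return slot


# Toy example: five cells in a ring, each interfering with two neighbors (c1 = 2).
ring_cells = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
schedule = greedy_schedule(ring_cells)
print(schedule)
```

Each cell then transmits once per round of at most 1 + c1 slots, and no two interfering cells share a slot.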
