Modeling data transfer in Content-Centric Networks

G. Carofiglio (Bell Labs) joint work with M. Gallo (Orange labs), L.Muscariello (Orange labs), D.Perino (Bell labs) Modeling data transfer in Content-Centric Networks Towards the definition of transport control mechanisms for CCN

Content-centric networking by Jacobson et al. PARC, USA 2009-on Named-Data Networking (NDN) NFS project 2010-on Content-based Networking by Carzaniga et al. Univ. of Colorado (2001-2004) Data Oriented Network Architecture (DONA) by Shenker et al. UC Berkley, 2007 Publish/Subscribe Internet Routing Paradigm 2009-on (PSIRP-PURSUIT EU project) Network of information (NetInf) 4WARD-SAIL EU projects 2008-on Content-Oriented Networking: a New Experience for Content Transfer (CONNECT), French project (2010-on) Content-centric networking proposals

Content Centric Networking CCN advantages Today’s Internet Before… …After

Content names and packets Content items are identified by a unique name and are split into self-identified chunks (packets) CCN names are hierarchically structured to allow aggregation of routing and forwarding table entries. E.g. /BellLabs/ICN/talks/reality.avi/chunk0 CCN packets are of two types:

CCN Communication model imdb.com www.imdb.com/title/tt12242 www.imdb.com/title/tt12242 • Interests are forwarded along named-based routes (RIB/FIB) • Data are retrieved from the first hitting cache (Content Store) • Chunks of a given content may be retrieved from multiple locations • Interest aggregation on a pending interest (PIT)

Name-based Routing Routing can be done in a similar fashion to today’s IP routing: • Routers announce “name prefixes” covering data they are willing to serve • Announcements are propagated through the network via a routing protocol • Every router builds its Forwarding Information Base (FIB) Component-wise longest prefix match forwarding based on FIB information

CCN vs IP Forwarding CCN IP • CCN FIB (Forwarding Information Base) is used to forward Interest packets toward potential sources. Almost identical to IP FIB. • Content Store enhances buffer memory in an IP router and has a caching function: in-transit Data are stored in order to serve future requests and replaced only when the cache is full. • PIT (Pending Interest table) keeps track of forwarded Interests. Only one entry per “similar” requests is kept. This entry is used to forward Data packets toward interested consumers.

CCN node operations Interest packet Lookup Content Store Lookup+Update(Add/Change entry) Pending Interest Table (PIT) Lookup Forwarding Information Base (FIB) Forwarding Data packet Lookup+Update(Remove entry) Pending Interest Table (PIT) Store in Content Store Forwarding Additional operations may be required to verify packets

CCN transport principles receiver based Receiver controls Interest rate for a content retrieval based on Data rate (window-based controller), performs loss recovery and congestion control. natively multipath Interest may be forwarded along different routes according to different forwarding strategies. Chunks of the same content may be retrieved from different locations. coupled with in-path caching The sender endpoint does not exist. On a path between user and a permanent copy of the content every intermediate node can send the data in response to an Interest.

CCN Data Transfer Model

Goal Performance evaluation of the CCN transport paradigm analytically characterize average throughput/delivery time account for the interplay between receiver-driven transport layer and in-network caching Explicit data transfer rate formula To develop an analytical tool for the design of a CCN window-based flow control network dimensioning tool

System description M different content items organized in K popularity classes, m=M/K in each class Content popularity distribution: Zipf Average content size  [chunks] Content store size x [chunks] Shortest path routing Link delays Topologies (a) linear topology (b) binary tree No bandwidth limitations

Virtual round trip time Miss probability for class k at node i Round trip delay at node i VRTT represents the average time that elapses between the expression of an interest and data delivery Similarly to RTT for TCP, VRTT is the average distance in time between the user and the requested data

Request Process Two level request process modeled through a Markov Modulated Rate Process (MMRP) VRTT is assumed to have converged to a stationary value. Exp(1/k) k=q(k) VRTTk Geo() Exp(1/)

Related work CACHING MODELS single LRU cache dynamics (King 71, Flajolet 92) miss probability exact formulae (computational expensive) combinatorial approaches (Coffman 99, Starobinski-Tse 01) probabilistic models: Jelenkovic (92-08) asymptotics of miss probability for large caches closed-form characterization under Poisson content request arrival process approximation of miss process Kurose-Towsley 09 extends single cache model in Dan-Towsley 90 to the network case Numerical approximation All previous models are at content level, most ignore request correlation.

Single cache asymptotics Stationary miss probability where Miss rate The output process is a renewal process that we approximate with a MMRP (as in Jelenkovic 08) with intensity μk at content level. lk mk

Sketch of the proof • A request for chunk i generates a miss at tnwhen more than x different chunks are requested after its previous request at tn-1 • Thanks to the memory-less property of the Poisson process (content level) and of the geometric size distribution (chunk level), the # of different chunk requests in (tn-1, tn) is independent from that in (tn-2, tn-1), (where {tn} is the miss sequence for class k ). • The distribution of the # of requests in between two consecutive requests for the same chunk can be computed from that of the # of requests among two subsequent requests for any two chunks of class k.

Model/Simulation comparison: miss probability We developed an ad hoc C++ event-driven simulator at chunk level Parameters: M= 20000, K=400, m=50, =2, chunk size= 10KB, =690 chunks (6.9MB), x=100k÷400k chunks (1GB÷4GB), =40 content/s, W=1, no filtering.

Request aggregation We assume a timeout of Δ sec. for PIT entries. In practice, requests are ‘filtered’ on a time interval equal to min(Δ, RVRTTk(i)), where RVRTT denotes the residual VRTT at node i for class k. Note that if a chunk request is filtered, every following request for chunks of the same content is equally filtered The filtering probability at the first cache is

Network of caches : linear topology Under the assumption of a MMRP output process at the first cache and w/o filtering, the miss probability at node i is where xi is the cache size at node i. In presence of request filtering,

Network of caches: binary tree For a binary tree topology and in absence of request filtering, where pk(i)=f(qk(i)), and • The miss rate at node i is where pfilt,k(1) accounts for the probability to aggregate requests for the same content in the PIT.

Miss probabilities in the binary tree Parameters: M= 20000, K=400, m=50, =0.7, chunk size= 10KB, =690 chunks (6.9MB), x=200000 (2GB), =40 content/s, W=1, no filtering. RTT=2ms at each hop.

Miss rates with and without filtering Request rate reduction up to 20% at 3rd node with filtering.

Average throughput The miss probabilities contribute to the definition of the VRTT for contents in class k. The average throughput in steady state is Letting W vary over time we can define a control over the interest rate such to achieve a target average throughput Average delivery time: Tk= /Xk The throughput formula is a powerful tool for cache sizing under throughput or delivery time guarantees.

VRTT and Throughput The filtering does not have a large impact of VRTT/throughput values, but it helps reduce overall request traffic. Throughput gain due to parallel downloading is more important for most popular classes due to a smaller VRTT

Dimensioning and Trade-offs The model can be used as a tool to quantify network cost to guarantee a certain performance or predict end user performance under a given per-service resource allocation. Traffic/Storage capacity trade-off Performance/Network cost trade-off Parameters: Binary tree, same parameters as before except =2.3 (YouTube, Daily Motion)

Model extension to the case of finite downlink bandwith in CCN

Bandwidth sharing in CCN N(t) D I I I I I I I I D D D D D D D D D D D D D D D D D D Repository t VRTT(t)

Max-min bandwidth allocation every route is characterized by a number of parallel transfers document requests arrive according to a Poisson process of rate transfers in the same route perfectly share available bandwidth transfers in different routes share bandwidth in the max-min sense a transfer gets a rate from route when capacity constraints there exists a bottleneck for route that determines the fair rate

Impact on user’s performance A given data transfer makes use of multiples sub-routes according to the miss probabilities the delay to retrieve data from cache i is According to the VRTT formula we obtain The average data delivery time being the traffic load on link i

Linear network Case I Case II Bottleneck (C1 ,C2, C3 )=(10,20,30)Mbps (C1 ,C2, C3 )=(30,10,20)Mbps Content items: M=20k different files, K=2000 classes, packets of size 10kB, packets (90% of the most popular files in the catalog)

Tree-like hierarchical network branch 1 branch 2 Symmetrical 20Mbps 20Mbps 1000 20Mbps 1000 30Mbps 5000 30Mbps 30Mbps Asymmetrical • Symmetrical case: • Branch 1 = Branch 2 Case III • Asymmetrical case: • Branch 1= Case III • Branch 2=Case IV

ICP: an Interest Control Protocol for Content-Centric Networking

Goal Definition of a CCN transport control mechanism We design a receiver-driven Interest Control Protocol for CCN (ICP), whose definition still lacks in literature. Efficient and Fair use of network resources We prove that ICP realizes the fair and fully efficient resource sharing described in our previous work we provide an analytical characterization of average rate, expected data transfer delay and queue dynamics in steady state on a single and multi-bottleneck network topology.

Interest Control Protocol • AIMD window-based interest control • interest Window increase • W is increased by (default = 1)

Interest Control Protocol (cont’d) • interest Window decrease • each expressed Interest, is associated with a timer • timer expiration congestion detection • W is multiplied by a decreasing factor (0.5 by default) • no more than one decrease every

Single Bottleneck n Content Repository • C link capacity • R constant propagation delay • nnumber of parallel download • queue occupancy • Interest window size • receiver rate and • , , average stationary values of X(t) ,Q(t) and W(t) • Evolution equations C,R

Single bottleneck example = 10ms, C=100 Mbps,R~0 • Fair sharing of available bandwidth among active flows • Frequency of oscillations is proportional to the number of active flows n The steady state solution is a periodical limit cycle of duration with time averages

number of hops Multi-bottleneck scenario down-link capacity of link i repository • cache size at hop i, in packets • round trip delay between user and node i • miss probability for class k at node i • we define VRTTk is the average time that elapses between the expression of a query and data delivery :

Multi-bottleneck equations System dynamics are described by: where

Stationary Solution • the solution of the previous equations is a limit cycle of duration and time average: • where is the max-min fair rate associated to the route i

Two bottlenecks example σ = 5000 chunks, α = 1.7, x(i) = 50.000 chunks, M = 20.000 files, K = 5 classes = 4ms (C1 ,C2)=(100,50)Mbps • total queuing delay is bounded by

Simulation results σ = 5000 chunks, α = 1.7, x(i) = 50.000 chunks, M = 20.000 files, K = 5 classes Case II fixed , (C1 ,C2, C3 )=(100,40,100)Mbps A fixed timer value results in an inefficient regime

Adaptive Interest Timer • must be larger than RTTmin • RTTmin is the min packet delivery time • ICP estimates the round trip time : • - delivery time estimation at every data packet reception • - ignore packets requested more than once • - RTTmin and RTTmax over an history of samples (e.g. last 20 packets) • initial value of is 10 ms and if there is one sample RTTmax = 2 * RTTmin

A three links example σ = 5000 packets, α = 1.7, x(i) = 50.000 packets, M = 20.000 files, K = 4.000 classes, R ≈ 0, l=1 Content/s , infinite buffer size Case I (C1 ,C2, C3 )=(50,100,100)Mbps Case II (C1 ,C2, C3 )=(100,40,100)Mbps • converges to a value slightly larger of , function of the popularity class

«Modeling data transfer in Content-centric networking », G.Carofiglio, M.Gallo, L.Muscariello, D.Perino, in Proc. of IEEE/ACM ITC 2011. «Bandwidth and Storage Sharing in Information-Centric Networks », G.Carofiglio, M.Gallo, L.Muscariello, in Proc. of ACM ICN Sigcomm 2011. «Experimental evaluation of storage management in Content-Centric Networking », G. Carofiglio, V. Gehlen, D. Perino, in Proc. of IEEE ICC 2011. «ICP: Interest Control Protocol in Content-Centric Networking », G. Carofiglio, M.Gallo, L.Muscariello, in Proc. of IEEE Infocom Nomen 2012. «Evaluating per-application storage management in content-centric networks»,G.Carofiglio, M.Gallo, L.Muscariello, D.Perino, under submission. http://www.anr-connect.org References

Research Challenges & Future Directions in CCN transport management • Multipath routing & interest forwarding strategies : • the receiver is the unique end-point in CCN: receiver-driver transport protocols • load balancing is fully distributed into the network, no notion of end to end connection design of optimal (min delay, min network cost) interest forwarding with no or little state per FIB entry per node • Traffic management & congestion control • different notion of fairness among applications • Interest shaping/discarding • overload control (proactive congestion detection, back pressure) • coupled receiver and hop by hop congestion control

Research Challenges & Future Directions in CCN storage management • Implicit/Lightweight cache coordination techniques (Probabilistic caching, …) • Per-application caching • Strategy layer definition (interaction with forwarding, multi-path routing, multicast) • Dynamic catalog/content popularity

Questions

Modeling data transfer in Content-Centric Networks

Modeling data transfer in Content-Centric Networks

Presentation Transcript

Modeling Data-Centric Routing in Wireless Sensor Networks

Data Centric Issues

Challenges in the Design and Evaluation of Content -Centric Networks

Content-based Routing for Information Centric Networks

Secure Content Delivery in Information-Centric Networks: Design, Implementation, and Analyses

PIT Overload Analysis in Content Centric Networks

Data-Centric Security

Interference Centric Wireless Networks

Parallelizing FIB Lookup in Content Centric Networking

User-Centric Data Dissemination in Disruption Tolerant Networks

Data-Centric Storage in Sensornets

Data-Centric Storage in Sensor Networks With GHT

Data centric Storage In Sensor networks

Decomposing Data-Centric Storage Query Hot-Spots in Sensor Networks

Data-centric Networking Through Adaptive Content-based Routing

Data-Centric Security

Information-Centric Networks

Data-Centric Active Querying in Sensor Networks: The ACQUIRE Project

Beyond 2020 Content Centric Networking

Data-centric XML

Content Partners and Data Transfer

Data Architecture, Modeling, and Networks