A Framework for Lazy Replication in P2P VoD
Bin Cheng¹, Lex Stein², Hai Jin¹, Zheng Zhang²
¹ Huazhong University of Science & Technology (HUST)
² Microsoft Research Asia (MSRA)
NOSSDAV 2008, Braunschweig, Germany, May 30, 2008
Background
• VoD, a popular Internet service
  - YouTube, Hulu
• P2P, a useful technology
  - File sharing, live streaming
  - BitTorrent, PPLive
• Can P2P help VoD?
  - Feasibility
  - Performance improvement
• GridCast with caching
  - 36% decrease
  - 43% departure misses
• Replication in P2P VoD
Outline
1 Motivation
2 Replication algorithms
3 Performance evaluation
4 Conclusions
Motivation - what does GridCast look like?
http://www.gridcast.cn
Motivation - GridCast system overview
Hybrid architecture (client-server + P2P)
• Tracker: indexes all joined peers
• Source Server: stores a complete copy of every video
• Peer: fetches chunks from source servers or other peers
• Web Portal: provides the video catalog
[Figure: architecture diagram showing the tracker, Web portal, and source server]
Motivation - trace collection
GridCast has been deployed on CERNET since May 2006
• Network (CERNET)
  - 1,500 universities, 20 million hosts
  - Good bandwidth, 2 to 100 Mbps to the desktop (core is complicated)
• Content
  - 2,000 videos
  - 48 minutes on average
  - 400 to 800 Kbps, 610 Kbps on average
Motivation - trace analysis
Classify misses by their causes: chunk X does not hit in the peer cache. Why?
• New content
  - Never fetched by any peer
• Peer departed
  - Fetched by some peers, but all of them are offline
• Peer evicted
  - Fetched by an online peer, but evicted
• Cannot connect
  - Cached by some online peer that is not in the neighborhood
• Insufficient bandwidth
  - Cached by some neighbor, but cannot be retrieved
Departure misses become a big issue (43%)
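The five causes above can be read as an ordered decision procedure. The following is a minimal sketch of that reading, not GridCast's actual trace-analysis code: the `ChunkState` record, the `classify_miss` function, and the peer-id sets are all hypothetical names introduced for illustration, and the function assumes it is only called once a miss has already occurred.

```python
from dataclasses import dataclass, field
from enum import Enum


class MissCause(Enum):
    NEW_CONTENT = "new content"
    PEER_DEPARTED = "peer departed"
    PEER_EVICTED = "peer evicted"
    CANNOT_CONNECT = "cannot connect"
    INSUFFICIENT_BANDWIDTH = "insufficient bandwidth"


@dataclass
class ChunkState:
    """Hypothetical per-chunk bookkeeping at the moment of a miss."""
    fetched_by: set = field(default_factory=set)  # peers that ever fetched the chunk
    cached_by: set = field(default_factory=set)   # peers that still hold the chunk


def classify_miss(chunk, online, neighbors):
    """Classify a peer-cache miss, checking the causes in the order listed above.

    online:    set of peer ids currently online
    neighbors: set of peer ids in the requesting peer's neighborhood
    """
    if not chunk.fetched_by:
        return MissCause.NEW_CONTENT
    if not (chunk.fetched_by & online):
        return MissCause.PEER_DEPARTED
    holders = chunk.cached_by & online
    if not holders:
        return MissCause.PEER_EVICTED
    if not (holders & neighbors):
        return MissCause.CANNOT_CONNECT
    # A neighbor holds the chunk, yet the request still missed.
    return MissCause.INSUFFICIENT_BANDWIDTH
```

Checking the causes in this order makes each category exclusive, so per-cause percentages computed this way would sum to 100%.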
Motivation - challenges and chances
Caching is not enough. Can we do better?
Replication
• Challenges
  - Short user sessions
  - Peers depart at any time
• Chances
  - Unused network resources: 72% (down), 81% (up)
  - Disk space: 37% of disk available
Replication - three key questions
• When?
• Where?
• What?
Framework
Replication - fundamental tradeoff
• Benefit:
  - Reduce departure misses
  - Reduce some eviction misses if the cache is not full
• Cost:
  - Increase network traffic
  - Increase bandwidth misses
  - Increase some eviction misses if the cache is full
Replication - eager replication
• Replicate all missed chunks
• Use all of the unused bandwidth
[Figure: peers A, B, and C in a neighborhood]
Replication - lazy replication
• Based on two predictors
  - Peer departure predictor
  - Chunk request predictor
• Lazy-oracle and lazy-simple
• Lazy factor
  - How much of the remaining bandwidth can be used
• Target peer selection
  - Random, sequential, file locality first
[Figure: peers A, B, and C in a neighborhood, annotated with increasing chunk requests and increasing online time]
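The lazy factor and the three target-selection policies can be illustrated as follows. This is a sketch under stated assumptions rather than the authors' implementation: it assumes the lazy factor is a fraction in [0, 1] of the unused bandwidth, and it interprets "file locality first" as preferring a neighbor that already caches part of the same file. The function names, parameters, and the `cached_files` mapping are hypothetical.

```python
import random


def replication_budget(unused_bandwidth_kbps, lazy_factor):
    """Bandwidth replication may use (assumption: lazy_factor is a fraction in [0, 1])."""
    if not 0.0 <= lazy_factor <= 1.0:
        raise ValueError("lazy_factor must be between 0 and 1")
    return unused_bandwidth_kbps * lazy_factor


def select_target(neighbors, policy, file_id=None, cached_files=None,
                  cursor=0, rng=None):
    """Pick a neighbor to receive a replica.

    neighbors:    list of peer ids
    cached_files: dict mapping peer id -> set of file ids it caches (hypothetical)
    cursor:       position counter used by the sequential policy
    """
    if not neighbors:
        return None
    if policy == "random":
        return (rng or random).choice(neighbors)
    if policy == "sequential":
        return neighbors[cursor % len(neighbors)]
    if policy == "file_locality_first":
        cached_files = cached_files or {}
        for peer in neighbors:
            if file_id in cached_files.get(peer, set()):
                return peer
        return neighbors[0]  # no neighbor holds the file: fall back to the first one
    raise ValueError(f"unknown policy: {policy}")
```

For example, `replication_budget(1000, 0.25)` leaves three quarters of the unused bandwidth untouched, which is the sense in which a smaller lazy factor is "lazier".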
Replication - peer departure predictor
Based on observations of online time
- 50% of user sessions last less than 10 minutes
- A peer that has been online longer is likely to stay longer
Simple departure predictor
- Online time <= 10 minutes: predict leave
- Online time > 10 minutes: predict stay
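The simple departure predictor is a single threshold test. A minimal sketch, with a hypothetical function name and online time measured in seconds:

```python
DEPARTURE_THRESHOLD_SECONDS = 10 * 60  # the 10-minute cutoff from the slide


def predict_departure(online_seconds, threshold=DEPARTURE_THRESHOLD_SECONDS):
    """Return True if the peer is predicted to leave, False if predicted to stay."""
    return online_seconds <= threshold
```

A peer that has been online for exactly 10 minutes is still predicted to leave; only strictly longer sessions are predicted to stay.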
Replication - chunk request predictor
Chunks requested recently are more likely to be requested in the near future
Simple chunk request predictor
- Use the chunk access history of the last several hours
- Give higher weight to recent requests
[Figure: chunk popularity over time t, with request history before "now" and the future after it]
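One way to realize a recency-weighted history is sketched below. The slide does not give the weighting function, so the linear decay used here is an assumption, as are the function names and the `history` mapping from chunk id to request timestamps in seconds.

```python
def request_score(request_times, now, window_seconds=3600.0):
    """Score a chunk by its requests inside the history window.

    Assumption: a request's weight decays linearly from 1 (just now)
    to 0 (at the edge of the window); older requests are ignored.
    """
    score = 0.0
    for t in request_times:
        age = now - t
        if 0 <= age < window_seconds:
            score += 1.0 - age / window_seconds
    return score


def rank_chunks(history, now, window_seconds=3600.0):
    """Return chunk ids ordered from most to least likely to be requested."""
    return sorted(
        history,
        key=lambda chunk: request_score(history[chunk], now, window_seconds),
        reverse=True,
    )
```

With this weighting, one request a minute ago can outrank several requests from near the start of the window, which matches the intent of favoring recent accesses.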
Performance Evaluation - simulation setup
• Trace-driven
• 1GB
• Realized bandwidth
• Last 1 hour of history for the chunk request predictor
• 10-minute interval for the peer departure predictor
• Use the existing neighborhood
• Metrics
  - Benefit: decrease in chunks served by the source servers
  - Cost: increase in chunks replicated between peers
  - Efficiency: Benefit / Cost
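The three metrics can be computed from chunk counts of two simulation runs. This sketch assumes the baseline is a caching-only run with no replication, so the cost equals the number of replicated chunks; the function name and parameters are hypothetical.

```python
def replication_metrics(server_chunks_baseline, server_chunks_with_replication,
                        replicated_chunks):
    """Return (benefit, cost, efficiency) for one replication configuration.

    Assumption: the baseline run replicates nothing, so the increase in
    replicated chunks equals replicated_chunks.
    """
    benefit = server_chunks_baseline - server_chunks_with_replication
    cost = replicated_chunks
    # Efficiency is undefined when nothing was replicated; report 0.0 in that case.
    efficiency = benefit / cost if cost else 0.0
    return benefit, cost, efficiency
```

As an illustration with made-up counts, a run that saves 150 server chunks by replicating 300 chunks between peers has an efficiency of 0.5.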
Performance Evaluation - exploring configurations
File locality first achieves the best performance
Performance Evaluation - lazy factor
A lower lazy factor is better
- More chunks have their replication delayed until the peer leaves
- The smaller the lazy factor, the more efficient the replication
Performance Evaluation - comparison
• Lazy-simple is close to lazy-oracle in terms of benefit
• Lazy-simple is better than eager in terms of efficiency
• Lazy-simple decreases server load by 15%
Conclusions
1 We identify departure misses as a major issue for P2P VoD with caching
2 With two simple predictors, lazy replication can decrease server load by 15%
3 Lazy replication is more efficient than eager replication
Thank you! Any questions?
Bin Cheng, Lex Stein, Hai Jin, and Zheng Zhang
Huazhong University of Science & Technology (HUST) and Microsoft Research Asia (MSRA)
NOSSDAV 2008, Braunschweig, Germany