
Example: Rumor Performance Evaluation

Andy Wang, CIS 5930-03 Computer Systems Performance Analysis.


Presentation Transcript


  1. Example: Rumor Performance Evaluation Andy Wang CIS 5930-03 Computer Systems Performance Analysis

  2. Motivation • Optimistic peer replication is popular • Intermittent connectivity • Availability of replicas for concurrent updates • Convergence and correctness for updates • Examples: Rumor, Coda, Ficus, Lotus Notes, Outlook Calendar, CVS

  3. Background • Replication provides high availability • Optimistic replication allows immediate access to any replicated item, at the risk of permitting concurrent updates • A reconciliation process makes replicas consistent (two replicas at a time in the peer-to-peer case)

  4. Background Continued • Conflicts occur when different replicas of the same file are updated subsequent to the previous reconciliation

  5. Optimistic Replication Example • While connected: the log on the desktop and the log on the portable both contain the 10:00 and 10:25 updates • While disconnected: the desktop log gains a 10:40 update; the portable log gains a 10:51 update

  6. Example Continued • While disconnected: the desktop log holds the 10:00, 10:25, and 10:40 updates; the portable log holds the 10:00, 10:25, and 10:51 updates • Once connected: run reconciliation, detect a conflict, propagate updates • Afterwards: both logs hold the 10:00, 10:25, 10:40, and 10:51 updates
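
The desktop and portable example above can be sketched in a few lines. This is only an illustration under stated assumptions, not Rumor's actual mechanism: the function name `reconcile` and the choice to model a log as a set of timestamps for a single file are hypothetical. A conflict is flagged when each replica holds an update the other lacks, and updates are then propagated in both directions.

```python
def reconcile(desktop, portable):
    """Reconcile two update logs (sets of timestamps) for one file.

    Hypothetical helper for illustration only.
    Returns (merged_log, conflict_detected).
    """
    only_desktop = desktop - portable    # updates seen only on the desktop
    only_portable = portable - desktop   # updates seen only on the portable
    # Both replicas changed since the last reconciliation: concurrent updates.
    conflict = bool(only_desktop) and bool(only_portable)
    merged = sorted(desktop | portable)  # propagate updates both ways
    return merged, conflict


desktop_log = {"10:00", "10:25", "10:40"}
portable_log = {"10:00", "10:25", "10:51"}
merged, conflict = reconcile(desktop_log, portable_log)
print(merged, conflict)
```

If only one side had new updates, the same function would propagate them without reporting a conflict.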

  7. Goal • Understand the cost characteristics of the reconciliation process for Rumor

  8. Services • Reconciliation • Exchange file system states • Detect new and conflicting versions • If possible, automatically resolve conflicts • Else, prompt user to resolve conflicts • Propagate updates

  9. Outcomes • Two reconciled replicas become consistent for all files and directories • Some files remain inconsistent and require user to resolve conflicts

  10. Metrics • Time • Elapsed time • From the beginning to the completion of a reconciliation request • User time (time spent using CPU) • System time (time spent in the kernel) • Failure rate • Number of incomplete reconciliations and infinite loops (none observed)
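
The three time metrics above are the ones a `time`-style wrapper reports. As a rough sketch of how they could be collected, the code below times a child process with Python's `os.times()`. The helper name `measure` and the stand-in workload are assumptions; the actual study used its own monitor, and on platforms that do not report child CPU times the user and system values stay at zero.

```python
import os
import subprocess
import sys
import time


def measure(command):
    """Run `command` and return (elapsed, user, system) seconds for the child."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(command, check=True)  # waits for the child to finish
    elapsed = time.perf_counter() - start
    after = os.times()
    # CPU time charged to waited-for children, split into user and kernel time.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return elapsed, user, system


# Placeholder workload: a short Python loop stands in for a reconciliation run.
stand_in = [sys.executable, "-c", "sum(range(200000))"]
elapsed, user, system = measure(stand_in)
print(f"elapsed={elapsed:.3f}s user={user:.3f}s system={system:.3f}s")
```

Elapsed time includes waiting on disk and network, which is why it can differ substantially from the sum of user and system time.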

  11. Metrics not Measured • Disk access time • Requires complex instrumentation • E.g., buffering, logging, etc. • Network and memory resources • Not heavily used • Correctness • Difficult to evaluate

  12. Monitor Implementation • Diagram of the reconciliation process (figure not reproduced); labels: top-level, Perl, time command, Perl library, spool-to-dump, recon, spool-to-dump, C++, scanner, rfindstored, rrecon, server

  13. Parameters • System parameters • CPU (speed of local and remote servers) • Disk (bandwidth, fragmentation level) • Network (type, bandwidth, reliability) • Memory (size, caching effects, speed) • Operating system (type, version, VM management, etc.)

  14. Parameters (Continued) • Workload parameters • Number of replicas • Number of files and directories • Number of conflicts and updates • Size of volumes (file size)

  15. Workloads • Update characteristics extracted from Geoff Kuenning’s traces

  16. Experimental Settings • Machine model: Dell Latitude XP • CPU: x486 100 MHz • RAM: 36MB • Ethernet: 10Mb • Operating system: Linux 2.0.x • File system: ext3

  17. Experimental Settings • Should have documented the following as well • CPU: L1 and L2 cache sizes • RAM: Brand and type • Disk: brand, model, capacity, RPM, and the size of on-disk cache • File system version

  18. Experimental Design • 2^5 full factorial design • Linear regression or multivariate linear regression to model major factors • Target: 95% confidence interval

  19. 2^5 Full Factorial Design • Number of replicas: 2 and 6 • Number of files: 10 and 1,000 • File size: 100 and 22,000 bytes • Number of directories: 10 and 100 • Number of updates: 10 and 450 • Capped at 10 updates for 10 files • Number of conflicts: 0 /* typical */
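
Five factors at two levels each give 32 configurations. The sketch below enumerates them from the levels listed on the slide. The dictionary keys and the function name `design_points` are illustrative, and the way the update cap is applied (using 10 updates whenever there are only 10 files) is one reading of the slide's note, not a confirmed detail.

```python
from itertools import product

# Two levels per factor, taken from the slide.
levels = {
    "replicas": (2, 6),
    "files": (10, 1000),
    "file_size_bytes": (100, 22000),
    "directories": (10, 100),
    "updates": (10, 450),
}
CONFLICTS = 0  # held constant in this design


def design_points():
    """Return one dict per run of the full factorial design."""
    runs = []
    for combo in product(*levels.values()):
        run = dict(zip(levels, combo))
        # Assumed interpretation: with only 10 files, updates are capped at 10.
        if run["files"] == 10:
            run["updates"] = min(run["updates"], 10)
        run["conflicts"] = CONFLICTS
        runs.append(run)
    return runs


runs = design_points()
print(len(runs))  # 2**5 = 32 runs
```

Under this reading, the high and low update levels coincide for the 10-file runs, which is worth keeping in mind when interpreting the effect of updates.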

  20. 2^5 Full Factorial Analysis • Experiment errors < 3%

  21. Variation of Effects • All major effects significant at the 95% confidence level
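
The slides do not show how the effects were computed. One standard approach for two-level factorial designs is the sign-table method, sketched here for an unreplicated design. The function names `effects` and `variation_shares` are hypothetical, and the four responses are illustrative 2^2 numbers only, not Rumor measurements. With replicated runs, an experimental-error term would also enter the total variation.

```python
from itertools import combinations, product
from math import prod


def effects(responses, names):
    """Estimate the mean, main effects, and interactions of a 2^k design.

    `responses` must follow the order of product((-1, 1), repeat=k),
    i.e. the last factor changes fastest.
    """
    k = len(names)
    sign_rows = list(product((-1, 1), repeat=k))
    n = len(sign_rows)
    result = {}
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            label = "*".join(names[i] for i in subset) or "mean"
            # Sign column for this effect: product of the factor signs.
            column = [prod(row[i] for i in subset) for row in sign_rows]
            result[label] = sum(s * y for s, y in zip(column, responses)) / n
    return result


def variation_shares(q, n):
    """Fraction of total variation attributed to each non-mean effect."""
    parts = {name: n * value ** 2 for name, value in q.items() if name != "mean"}
    total = sum(parts.values())
    return {name: part / total for name, part in parts.items()}


# Illustrative responses for runs (A-,B-), (A-,B+), (A+,B-), (A+,B+).
y = [15, 25, 45, 75]
q = effects(y, ["A", "B"])
shares = variation_shares(q, len(y))
print(q)
print(shares)
```

For the 2^5 design, the same functions would take 32 responses and five factor names.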

  22. Residuals vs. Predicted Time • Clusters caused by dominating effects of files

  23. Residuals vs. Experiment Numbers • Residuals show homoscedasticity, almost

  24. Quantile-Quantile Plot • Residuals are normally distributed, almost

  25. Multivariate Regression • Number of replicas: 2 • Number of files: 4 levels, 10-600 • File size: 22,000 bytes • Number of directories: 4 levels, 10-60 • Number of updates: 0 • Number of conflicts: 0 /* typical */ • Number of repetitions: 5 per data point
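
With five repetitions per data point, a 95% confidence interval for the mean can be computed from the Student t distribution with 4 degrees of freedom. The sketch below assumes independent, roughly normal measurements. The helper name `ci95_five_runs` is hypothetical and the sample times are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

T_975_DF4 = 2.776  # two-sided 95% Student t quantile for 4 degrees of freedom


def ci95_five_runs(samples):
    """Return the 95% confidence interval (low, high) for the mean of 5 runs."""
    if len(samples) != 5:
        raise ValueError("the hard-coded t quantile is only valid for 5 samples")
    centre = mean(samples)
    half_width = T_975_DF4 * stdev(samples) / sqrt(len(samples))
    return centre - half_width, centre + half_width


# Made-up elapsed times in seconds, for illustration only.
times = [12.1, 11.8, 12.4, 12.0, 11.9]
low, high = ci95_five_runs(times)
print(f"95% CI: [{low:.2f}, {high:.2f}] s")
```

A different number of repetitions needs the t quantile for the matching degrees of freedom.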

  26. Multivariate Regression • Experiment errors < 7% • All coefficients are significant
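
A multivariate linear model of the form time = b0 + b1 * files + b2 * directories can be fitted by ordinary least squares. The sketch below solves the normal equations in plain Python. The function names `solve` and `fit`, the intermediate factor levels, and the synthetic coefficients are all assumptions chosen for illustration. The slides give only the ranges 10-600 and 10-60, and no measured times are reproduced here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(xs, ys):
    """Least-squares coefficients [b0, b1, ...] for y = b0 + b1*x1 + ..."""
    rows = [[1.0] + list(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)


# Assumed levels within the ranges on the slide; synthetic noise-free times.
file_levels = (10, 200, 400, 600)
dir_levels = (10, 30, 45, 60)
points = [(f, d) for f in file_levels for d in dir_levels]
times = [2.0 + 0.05 * f + 0.1 * d for f, d in points]

b0, b1, b2 = fit(points, times)
print(b0, b1, b2)
```

Because the synthetic times are exactly linear, the fit recovers the chosen coefficients up to rounding. Real measurements would leave residuals to inspect, as the following slides do.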

  27. Residuals vs. Predicted Time • Elapsed time shows a bi-modal trend • User time shows an exponential trend

  28. Residuals vs. Experiment Numbers • Not so good for elapsed time and user time

  29. Quantile-Quantile Plot • Residuals are not normally distributed for elapsed time and user time

  30. Log Transform (User Time) • ANOVA tests failed miserably

  31. Residual Analyses (User Time) • No indications that transforms can help…

  32. Possible Explanations • i-node related factors • Number of files per directory block • Crossing block boundary may cause anomalies • Caching effects • Reboot needed across experiments

  33. Linear Regression • Number of files: 100, 150, 200, 250, 252, 253, 300, 350, 400, 450 • Test for the boundary-crossing condition as the number of files exceeds one block • Note that Rumor has hidden files • Number of repetitions: 5 per data point • Flush cache (reboot) before each run

  34. Linear Regression • R^2 > 80% • All coefficients are significant
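
The R^2 figure and the residual plots that follow come from a single-factor fit of time against the number of files. A minimal sketch of that computation is given below, using the file counts from the previous slide. The function name `linear_fit` is hypothetical and the times are synthetic placeholders (a straight line with an alternating offset), not the measured data.

```python
def linear_fit(xs, ys):
    """Return (intercept, slope, r_squared, residuals) for y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)      # unexplained variation
    sst = sum((y - y_bar) ** 2 for y in ys)   # total variation
    return intercept, slope, 1 - sse / sst, residuals


files = [100, 150, 200, 250, 252, 253, 300, 350, 400, 450]
# Synthetic placeholder times: a line plus an alternating +/-1 offset.
times = [5 + 0.1 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(files)]

intercept, slope, r2, residuals = linear_fit(files, times)
print(f"slope={slope:.4f} intercept={intercept:.4f} R^2={r2:.4f}")
```

Plotting `residuals` against the predicted values or against run order is what reveals trends such as the bi-modal pattern reported next. A high R^2 alone does not rule them out.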

  35. Residuals vs. Predicted Time • Elapsed time shows a bi-modal trend • User time shows an exponential trend

  36. Residuals vs. Experiment Numbers • Elapsed time shows a rising bi-modal trend • Randomization of experiments may help

  37. Quantile-Quantile Plot • Error residuals for elapsed time are not normal • Perhaps piece-wise normal

  38. Possible Explanations • i-node related factors: No • Caching effects: No • Hidden factors: Maybe • Bugs: Maybe

  39. Conclusion • Identified the number of files as the dominating factor for Rumor running time • Observed the existence of an unknown factor in the Rumor performance model

