Published by Magdalen Sara Welch. Modified over 9 years ago.
2
Example: Rumor Performance Evaluation
Andy Wang
CIS 5930 Computer Systems Performance Analysis
3
Motivation
Optimistic peer replication is popular
– Intermittent connectivity
– Availability of replicas for concurrent updates
– Convergence and correctness for updates
Examples: Rumor, Coda, Ficus, Lotus Notes, Outlook Calendar, CVS
4
Background
Replication provides high availability
Optimistic replication allows immediate access to any replicated item, at the risk of permitting concurrent updates
The reconciliation process makes replicas consistent (two replicas at a time in the peer-to-peer case)
5
Background Continued
Conflicts occur when different replicas of the same file are updated after the previous reconciliation
6
Optimistic Replication Example
While connected:
  Log on Desktop: 10:00 Update, 10:25 Update
  Log on Portable: 10:00 Update, 10:25 Update
While disconnected:
  Log on Desktop: 10:00 Update, 10:25 Update, 10:40 Update
  Log on Portable: 10:00 Update, 10:25 Update, 10:51 Update
7
Example Continued
While disconnected:
  Log on Desktop: 10:00 Update, 10:25 Update, 10:40 Update
  Log on Portable: 10:00 Update, 10:25 Update, 10:51 Update
Once connected again:
– Run reconciliation
– Detect a conflict
– Propagate updates
  Log on Desktop: 10:00 Update, 10:25 Update, 10:40 Update, 10:51 Update
  Log on Portable: 10:00 Update, 10:25 Update, 10:40 Update, 10:51 Update
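The desktop/portable example can be sketched in a few lines of Python. This is only an illustration of the idea, not Rumor's actual algorithm: the function name `reconcile` and the representation of a log as a list of zero-padded "HH:MM" strings for a single file are assumptions made for the sketch.

```python
def reconcile(log_a, log_b):
    """Merge two update logs for one file and report a conflict.

    Each log is a list of zero-padded "HH:MM" timestamps (hypothetical
    representation), so plain string sorting gives time order.
    """
    # The shared prefix is the state as of the previous reconciliation.
    common = 0
    while (common < len(log_a) and common < len(log_b)
           and log_a[common] == log_b[common]):
        common += 1

    new_a = log_a[common:]  # updates made only on replica A
    new_b = log_b[common:]  # updates made only on replica B

    # Both replicas changed since the last reconciliation: a conflict.
    conflict = bool(new_a) and bool(new_b)

    # Propagate: both replicas end up with every update, in time order.
    merged = log_a[:common] + sorted(new_a + new_b)
    return merged, conflict


desktop = ["10:00", "10:25", "10:40"]
portable = ["10:00", "10:25", "10:51"]
merged, conflict = reconcile(desktop, portable)
print(merged, conflict)
```

With the logs from the slide, both replicas have updates after 10:25, so the sketch flags a conflict and both logs converge to the same four entries.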
8
Goal
Understand the cost characteristics of the reconciliation process for Rumor
9
Services
Reconciliation
– Exchange file system states
– Detect new and conflicting versions
    If possible, automatically resolve conflicts
    Else, prompt user to resolve conflicts
– Propagate updates
10
Outcomes
Two reconciled replicas become consistent for all files and directories
Some files remain inconsistent and require the user to resolve conflicts
11
Metrics
Time
– Elapsed time: from the beginning to the completion of a reconciliation request
– User time (CPU time spent in user mode)
– System time (CPU time spent in the kernel)
Failure rate
– Number of incomplete reconciliations and infinite loops (none observed)
12
Metrics not Measured
Disk access time
– Requires complex instrumentation (e.g., of buffering and logging)
Network and memory resources
– Not heavily used
Correctness
– Difficult to evaluate
13
Monitor Implementation
[Diagram. Labels: Reconciliation Process; components Spool-to-dump, Recon, Scanner, Rfindstored, Rrecon, Server; implementation layers Perl library and C++; Top-level Perl; time command]
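The elapsed, user, and system times reported by a `time`-style wrapper can be approximated from Python as sketched below. The helper name `time_command` is hypothetical, and the child process is a stand-in loop rather than a real reconciliation; on platforms that do not report child CPU times, the user and system values come back as zero.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv as a child process; return (elapsed, user, system) seconds."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    elapsed = time.perf_counter() - start
    after = os.times()
    # children_* fields accumulate CPU time of waited-for child processes.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return elapsed, user, system


# Stand-in workload (an assumption): a short Python loop.
elapsed, user, system = time_command(
    [sys.executable, "-c", "sum(i * i for i in range(200000))"])
print(f"elapsed={elapsed:.3f}s user={user:.3f}s system={system:.3f}s")
```

Wrapping the whole top-level command this way measures every component together, which matches measuring at the outermost layer rather than instrumenting each stage.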
14
Parameters
System parameters
– CPU (speed of local and remote servers)
– Disk (bandwidth, fragmentation level)
– Network (type, bandwidth, reliability)
– Memory (size, caching effects, speed)
– Operating system (type, version, VM management, etc.)
15
Parameters (Continued)
Workload parameters
– Number of replicas
– Number of files and directories
– Number of conflicts and updates
– Size of volumes (file size)
16
Workloads
Update characteristics extracted from Geoff Kuenning's traces
[Diagram: classification tree of file accesses]
File access
  Read-only access
  Read-write access
    Nonshared access
      Read access
      Write access
    Shared access
      2-way sharing
        Read access
        Write access
      3+way sharing
        Read access
        Write access
17
Experimental Settings
Machine model: Dell Latitude XP
CPU: 486, 100 MHz
RAM: 36 MB
Ethernet: 10 Mb/s
Operating system: Linux 2.0.x
File system: ext3
18
Experimental Settings
Should have documented the following as well
– CPU: L1 and L2 cache sizes
– RAM: brand and type
– Disk: brand, model, capacity, RPM, and the size of the on-disk cache
– File system version
19
Experimental Design
2^5 5 full factorial design (read here as 2^k r notation: k = 5 two-level factors, r = 5 replications)
Linear regression or multivariate linear regression to model major factors
Target: 95% confidence interval
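A 95% confidence interval for the mean of a small number of repeated measurements can be computed with the Student t distribution, as in the sketch below. The timing values are invented for illustration and are not measurements from the study; the function name and the small lookup table of t critical values are likewise just conveniences for the example.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, indexed by degrees of freedom.
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}


def confidence_interval_95(samples):
    """Return (low, high) bounds of the 95% CI for the sample mean."""
    n = len(samples)
    mean = statistics.mean(samples)
    std_err = statistics.stdev(samples) / math.sqrt(n)
    half_width = T_975[n - 1] * std_err
    return mean - half_width, mean + half_width


# Hypothetical elapsed times (seconds) for five repetitions of one setup.
times = [12.1, 11.8, 12.4, 12.0, 11.9]
low, high = confidence_interval_95(times)
print(f"mean 95% CI: [{low:.2f}, {high:.2f}]")
```

If the intervals of two configurations do not overlap, the difference between them is significant at the 95% level; overlapping intervals call for a closer test.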
20
2^5 5 Full Factorial Design
Number of replicas: 2 and 6
Number of files: 10 and 1,000
File size: 100 and 22,000 bytes
Number of directories: 10 and 100
Number of updates: 10 and 450
– Capped at 10 updates for 10 files
Number of conflicts: 0 /* typical */
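The factor levels listed above can be expanded into the full set of experiment configurations with a few lines of Python. This is a sketch of the enumeration only; the dictionary keys and the function name are made up for the example, and the update cap is applied literally as stated on the slide.

```python
from itertools import product

# Two levels per factor, taken from the slide.
LEVELS = {
    "replicas": (2, 6),
    "files": (10, 1000),
    "file_size_bytes": (100, 22000),
    "directories": (10, 100),
    "updates": (10, 450),
}


def factorial_configs(levels):
    """Enumerate every combination of factor levels (2^k configurations)."""
    names = list(levels)
    configs = []
    for combo in product(*(levels[name] for name in names)):
        cfg = dict(zip(names, combo))
        # Slide rule: with only 10 files, updates are capped at 10.
        if cfg["files"] == 10:
            cfg["updates"] = min(cfg["updates"], 10)
        cfg["conflicts"] = 0  # held constant in this design
        configs.append(cfg)
    return configs


configs = factorial_configs(LEVELS)
print(len(configs), "configurations")
```

Note that the cap makes the two update levels coincide whenever the file count is 10, so those configurations differ only in name.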
21
2^5 5 Full Factorial Analysis
Experiment errors < 3%
22
Variation of Effects
All major effects are significant at the 95% confidence level
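For readers unfamiliar with how effects and their share of variation are obtained in a two-level factorial design, the sign-table method can be sketched as follows. The example uses a tiny 2^2 design with made-up responses, not Rumor data, and it ignores the experimental-error term that a replicated design would also account for. Function names are hypothetical.

```python
from itertools import combinations, product
from math import prod


def factorial_effects(names, responses):
    """Estimate 2^k effects with the sign-table method.

    responses[i] is the mean response for the i-th row of
    itertools.product((-1, +1), repeat=k), so the LAST factor in
    names varies fastest.
    """
    k = len(names)
    rows = list(product((-1, 1), repeat=k))
    assert len(rows) == len(responses)
    effects = {}
    for size in range(1, k + 1):
        for subset in combinations(range(k), size):
            label = "".join(names[i] for i in subset)
            # Column sign for an interaction is the product of factor signs.
            total = sum(prod(row[i] for i in subset) * y
                        for row, y in zip(rows, responses))
            effects[label] = total / len(rows)
    return effects


def variation_explained(effects):
    """Fraction of variation attributed to each effect (no error term)."""
    total = sum(q * q for q in effects.values())
    return {label: q * q / total for label, q in effects.items()}


# Made-up responses for (A, B) = (-1,-1), (-1,+1), (+1,-1), (+1,+1).
responses = [15, 25, 45, 75]
effects = factorial_effects(["A", "B"], responses)
shares = variation_explained(effects)
print(effects)
print({label: round(share, 3) for label, share in shares.items()})
```

In this toy example factor A accounts for about three quarters of the variation, which is the kind of dominance the slides later report for the number of files.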
23
Residuals vs. Predicted Time
Clusters are caused by the dominating effect of the number of files
24
Residuals vs. Experiment Numbers
Residuals are almost homoscedastic
25
Quantile-Quantile Plot
Residuals are almost normally distributed
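The points of a normal quantile-quantile plot can be computed without a plotting library, as in this sketch. The residual values are hypothetical, the plotting positions (i - 0.5)/n are one common convention among several, and the function names are invented for the example.

```python
from statistics import NormalDist, mean


def qq_points(residuals):
    """Pair each sorted residual with its standard normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def correlation(points):
    """Pearson correlation of the (x, y) pairs; near 1 suggests normality."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical residuals from a fitted model.
residuals = [-1.2, 0.4, 0.1, -0.3, 0.9, -0.6, 1.5, -0.8, 0.2, 0.0]
points = qq_points(residuals)
r = correlation(points)
print(f"Q-Q correlation: {r:.3f}")
```

Points that fall close to a straight line (correlation near 1) support the normality assumption; systematic bends or separate straight segments point to skew, heavy tails, or a mixture of populations.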
26
Multivariate Regression
Number of replicas: 2
Number of files: 4 levels, 10-600
File size: 22,000 bytes
Number of directories: 4 levels, 10-60
Number of updates: 0
Number of conflicts: 0 /* typical */
Number of repetitions: 5 per data point
27
Multivariate Regression
Experiment errors < 7%
All coefficients are significant
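A multivariate linear model of the form time = b0 + b1 * files + b2 * directories can be fitted by least squares, sketched here in plain Python via the normal equations. The intermediate factor levels and the response values are assumptions: the slides give only the ranges (10-600 files, 10-60 directories), and the responses are generated from known coefficients purely to show that the fit recovers them.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(rows, ys):
    """Least squares for y = b0 + b1*x1 + ... via the normal equations."""
    xs = [[1.0] + list(row) for row in rows]
    p = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical levels within the ranges stated on the slide.
files_levels = (10, 200, 400, 600)
dir_levels = (10, 30, 45, 60)
rows = [(f, d) for f in files_levels for d in dir_levels]
# Synthetic, noise-free responses from assumed coefficients 2.0, 0.05, 0.01.
ys = [2.0 + 0.05 * f + 0.01 * d for f, d in rows]
b0, b1, b2 = fit_linear(rows, ys)
print(b0, b1, b2)
```

With real measurements the residuals left over after this fit are what the following diagnostic plots examine.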
28
Residuals vs. Predicted Time
Elapsed time shows a bimodal trend
User time shows an exponential trend
29
Residuals vs. Experiment Numbers
Not so good for elapsed time and user time
30
Quantile-Quantile Plot
Residuals are not normally distributed for elapsed time and user time
31
Log Transform (User Time)
ANOVA tests failed miserably
32
Residual Analyses (User Time)
No indications that transforms can help…
33
Possible Explanations
i-node related factors
– Number of files per directory block
– Crossing block boundary may cause anomalies
Caching effects
– Reboot needed across experiments
34
Linear Regression
Number of files: 100, 150, 200, 250, 252, 253, 300, 350, 400, 450
– Test for the boundary-crossing condition as the number of files exceeds one block
– Note that Rumor has hidden files
Number of repetitions: 5 per data point
Flush cache (reboot) before each run
35
Linear Regression
R^2 > 80%
All coefficients are significant
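A single-factor linear fit and its coefficient of determination can be computed as sketched below. The file counts are the ones used in the experiment; the elapsed times are invented placeholders, not the study's measurements, and `linear_fit` is a hypothetical helper name.

```python
from statistics import mean


def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus R^2."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # R^2 = 1 - SSE/SST: share of variation explained by the line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    return intercept, slope, 1 - sse / sst


files = [100, 150, 200, 250, 252, 253, 300, 350, 400, 450]
# Hypothetical mean elapsed times in seconds, one per file count.
elapsed = [8.1, 10.4, 13.2, 15.3, 15.9, 15.5, 18.2, 20.3, 23.1, 25.4]
intercept, slope, r2 = linear_fit(files, elapsed)
print(f"time ~ {intercept:.2f} + {slope:.4f} * files, R^2 = {r2:.3f}")
```

A high R^2 alone does not validate the model; the residual plots still need to look patternless, which is exactly where the slides find trouble.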
36
Residuals vs. Predicted Time
Elapsed time shows a bimodal trend
User time shows an exponential trend
37
Residuals vs. Experiment Numbers
Elapsed time shows a rising bimodal trend
– Randomization of experiments may help
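Randomizing the run order is straightforward to script. The sketch below builds one entry per repetition and shuffles the whole schedule, so that any drift over time (for example, a warming cache or a filling disk) is not confounded with the number of files. The function name and the fixed seed are assumptions for reproducibility of the example.

```python
import random
from collections import Counter


def randomized_schedule(levels, repetitions, seed=None):
    """Return every (level x repetition) run in a shuffled order."""
    runs = [level for level in levels for _ in range(repetitions)]
    random.Random(seed).shuffle(runs)
    return runs


file_counts = [100, 150, 200, 250, 252, 253, 300, 350, 400, 450]
schedule = randomized_schedule(file_counts, repetitions=5, seed=1)
print(schedule[:10])
print(Counter(schedule))
```

Each file count still appears exactly five times; only the order in which the runs execute changes.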
38
Quantile-Quantile Plot
Residuals for elapsed time are not normally distributed
– Perhaps piecewise normal
39
Possible Explanations
i-node related factors: No
Caching effects: No
Hidden factors: Maybe
Bugs: Maybe
40
Conclusion
Identified the number of files as the dominating factor for Rumor running time
Observed the existence of an unknown factor in the Rumor performance model