MassConf: Automatic Configuration Tuning By Leveraging User Community Information
Wei Zheng, Ricardo Bianchini, Thu Nguyen
Computer Science, Rutgers University

Introduction
Large software is complex
  – May have hundreds of configuration parameters
  – Selecting proper values is important
Configuring software is difficult
  – Depends on hardware, workload, load intensity, and target
  – Hard to understand the relationship between them
  – Large configuration space
Existing approaches are far from ideal
  – Hard to find related parameters
  – Tuning performance involves many time-consuming experiments

MassConf
Our approach: the vendor helps new users through the configuration process
  – Collect configurations of existing users for new users to try
  – Rank configurations to minimize the number of experiments
Key observations:
  – A configuration may work well for many users
  – Multiple configurations may work well for each user
Main challenges:
  – Ranking configurations from most to least promising
  – Incomplete information about how well each configuration would work

Incomplete Information
[Figure: two panels relating the configuration space (C2, C4, C9) to the user space (U1, U3, U5, U7).]
MassConf wants to rank C4 highly.

MassConf Overview
[Diagram: Existing Users 1..N, the Vendor, and New Users 1..M, connected by the numbered steps below.]
1. Inform environment and configuration
2. Inform environment and target
3. Rank configurations
4. Provide ranked list of configurations
5. Try configurations in turn (resort to Simplex, if needed)
6. Change ranked list

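Step 5 above can be illustrated with a minimal sketch. The function name `try_in_turn`, the `meets_target` callback, and the `fallback` hook are hypothetical names introduced here; the slides only say that a new user tries configurations in ranked order and resorts to Simplex if needed, so the fallback is left as a caller-supplied function rather than an actual Simplex implementation.

```python
from typing import Callable, Optional, Sequence, Tuple


def try_in_turn(
    ranked: Sequence[str],
    meets_target: Callable[[str], bool],
    fallback: Optional[Callable[[], Optional[str]]] = None,
) -> Tuple[Optional[str], int]:
    """Try ranked configurations in order (hypothetical helper).

    Returns the chosen configuration and the number of experiments
    spent on the ranked list.
    """
    experiments = 0
    for config in ranked:
        experiments += 1  # running one configuration counts as one experiment
        if meets_target(config):
            return config, experiments
    # Nothing in the list met the target: hand over to a search method
    # (the slides name Simplex; here it is whatever the caller supplies).
    chosen = fallback() if fallback is not None else None
    return chosen, experiments


if __name__ == "__main__":
    good = {"C4"}  # illustrative: only C4 meets this user's target
    print(try_in_turn(["C7", "C2", "C4"], lambda c: c in good))
```

The earlier a working configuration appears in the list, the fewer experiments the new user pays for, which is why the ranking step matters.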
Adaptive Ranking
Dynamically adapt to place good configurations at the top
Three approaches: slow, fast, and fastest
[Figure: an original ranked list, from first to last configuration: C7, C2, C3, C5, C9, C1, C8, C4, C6, and the lists that result under the slow, fast, and fastest approaches after the 1st and 2nd users.]

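The slides do not spell out the three update rules, so the sketch below is only one plausible reading, stated as an assumption: when a configuration works for a user, "slow" moves it up one position, "fast" moves it halfway to the top, and "fastest" moves it straight to the top. The function name `promote` and the mode strings are hypothetical.

```python
from typing import List


def promote(ranking: List[str], config: str, mode: str) -> List[str]:
    """Return a new ranking with `config` moved toward the top.

    The three rules are assumptions for illustration, not taken
    from the slides.
    """
    i = ranking.index(config)
    if mode == "slow":
        target = max(i - 1, 0)  # one position up
    elif mode == "fast":
        target = i // 2         # halfway to the top
    elif mode == "fastest":
        target = 0              # straight to the top
    else:
        raise ValueError(f"unknown mode: {mode}")
    updated = ranking[:i] + ranking[i + 1:]  # remove config from its old slot
    updated.insert(target, config)           # re-insert at the new slot
    return updated


if __name__ == "__main__":
    original = ["C7", "C2", "C3", "C5", "C9", "C1", "C8", "C4", "C6"]
    for mode in ("slow", "fast", "fastest"):
        print(mode, promote(original, "C4", mode))
```

Under these assumed rules, a configuration that keeps working for successive users climbs the list at a rate set by the mode, while the relative order of the other configurations is preserved.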
Case Study: Apache Performance
Synthetic population of users due to lack of real data
Workloads: small files, large files, dynamic CGI scripts
A “user” is a combination of workload & performance target
219 existing users
  – Evenly spread in the space of workloads, intensity, and target
195 new users
  – Evenly spread but not overlapping with existing users

Configuration Popularity
Some configurations work well for many users.

Popularity vs Meeting Users’ Target
Some good configurations are not popular.

Evaluation
MassConf: adaptive ranking (slow, fast, and fastest)
Popularity ranking: the intuitive and obvious approach
Simplex: a well-known optimization algorithm
Metric: number of experiments to satisfy new users

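As a rough illustration of the popularity baseline and the metric, the sketch below ranks configurations by how many existing users run them and then summarizes the number of experiments across new users. All names (`popularity_ranking`, `experiments_needed`, `summarize`) and the sample data are hypothetical; the total/average/maximum summary is an assumption about how the per-user counts might be aggregated.

```python
from collections import Counter
from typing import Dict, List, Optional, Set


def popularity_ranking(existing: Dict[str, str]) -> List[str]:
    """Rank configurations by number of existing users, most popular first."""
    counts = Counter(existing.values())
    # Descending count; ties broken by name so the order is deterministic.
    return sorted(counts, key=lambda c: (-counts[c], c))


def experiments_needed(ranking: List[str], works: Set[str]) -> Optional[int]:
    """Experiments until the first working configuration, or None if none works."""
    for n, config in enumerate(ranking, start=1):
        if config in works:
            return n
    return None


def summarize(counts: List[int]) -> Dict[str, float]:
    """Total, average, and maximum number of experiments."""
    return {"total": sum(counts), "avg": sum(counts) / len(counts), "max": max(counts)}


if __name__ == "__main__":
    # Illustrative data only: which configuration each existing user runs.
    existing = {"U1": "C2", "U3": "C2", "U5": "C9", "U7": "C4"}
    ranking = popularity_ranking(existing)
    # Illustrative new users and the configurations that meet their targets.
    new_users = [{"C2"}, {"C9"}, {"C4", "C9"}]
    counts = [experiments_needed(ranking, w) for w in new_users]
    print(ranking, counts, summarize([c for c in counts if c is not None]))
```

A new user whose only working configuration is unpopular pays for every experiment ranked above it, which is the weakness of ranking by popularity alone.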
Results
MassConf successfully reached all performance targets
Adaptive ranking beats popularity-based ranking
Adaptive ranking: the faster, the better
MassConf reaches more users’ targets than Simplex
MassConf is also faster than Simplex
[Table: number of experiments (rows: Total, Avg, Max) for Popularity Ranking, MassConf Adapt-slow, MassConf Adapt-fast, and MassConf Adapt-fastest. Only one cell value survives: 84, in the Max row.]

Conclusions
MassConf uses existing configurations to help new users
The case study shows that MassConf efficiently achieves the performance targets
MassConf can be applied to other software and other types of targets
Future work: multi-tier systems
In the paper and technical report: bootstrapping; optimized MassConf; more experiments, analysis, and results

MassConf Overview
[Diagram: Existing Users 1..N, the Vendor, and New Users 1..M, connected by the numbered steps below.]
1. Inform environment and configuration
2. Cluster environments
3. Inform environment and target
4. Rank configurations
5. Provide ranked list of configurations
6. Try configurations in turn (resort to Simplex, if needed)
7. Store selected configuration
8. Change ranked list
9. Warn about configuration