Learning and Leveraging the Relationship between Architecture-Level Measurements and Individual User Satisfaction
Alex Shye, Berkin Ozisikyilmaz, Arindam Mallik, Gokhan Memik, Peter A. Dinda, Robert P. Dick, and Alok N. Choudhary
Northwestern University, EECS
International Symposium on Computer Architecture (ISCA), June 2008, Beijing, China
Overall Summary
Claim: Any optimization ultimately exists to satisfy the end user
Claim: Current architectures largely ignore the individual user
Findings/Contributions:
User satisfaction is correlated with CPU performance
User satisfaction is non-linear, application-dependent, and user-dependent
We can use hardware performance counters to learn and leverage user satisfaction, optimizing power consumption while maintaining satisfaction
Justin Rattner: What matters? How is information technology used? End-user-meaningful platform performance matters… he went on to note that "this is not a typical design point"
Why care about the user?
1. User-centric applications
2. Architectural trade-offs exposed to the user
3. Optimization opportunity: user variation = optimization potential
Performance vs. User Satisfaction
What is the relationship between your favorite metric (IPS, throughput, etc.) and user satisfaction?
Current Architectures
Current architectures target a performance level without knowing how it relates to user satisfaction
Our Goal
1. Learn the relationship between user satisfaction and hardware performance
2. Leverage that knowledge for optimization
Measuring Performance
Hardware performance counters (HPCs) are supported on all modern processors: low overhead, non-intrusive
Sampled through the WinPAPI interface at 100 Hz
For each HPC we record five statistics: maximum, minimum, standard deviation, range, and average
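The per-counter summarization is simple enough to sketch. Below is a minimal Python sketch (not the paper's code) that reduces one counter's 100 Hz sample stream to the five statistics listed above; `samples` is assumed to be a list of per-sample counter readings.

```python
# Minimal sketch: reduce one HPC's 100 Hz sample stream to the five
# statistics the study records per counter. Not the paper's code.
import statistics

def summarize(samples):
    """samples: list of per-sample counter readings for one HPC."""
    hi, lo = max(samples), min(samples)
    return {
        "maximum": hi,
        "minimum": lo,
        "stdev": statistics.pstdev(samples),   # population std. deviation
        "range": hi - lo,
        "average": statistics.fmean(samples),
    }
```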
User Study Setup
IBM Thinkpad T43p: Pentium M with Intel SpeedStep, supporting 6 frequencies (800 MHz – 2.2 GHz)
Two user studies, 20 users each:
First, to learn about user satisfaction
Second, to show we can leverage user satisfaction
Three multimedia/interactive applications:
Java game: a first-person-shooter tank game
Shockwave: a 3D Shockwave animation
Video: DVD-quality MPEG video
First User Study
Goal: learn the relationship between HPCs and user satisfaction
How: randomly change the performance level (frequency), collect HPCs, and ask the user for a satisfaction rating (a sketch of one epoch follows)
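A minimal sketch of one epoch of the study protocol, assuming hypothetical helpers `set_frequency`, `collect_hpcs`, and `ask_rating`; the intermediate frequency steps listed are illustrative (the slide only specifies 6 steps spanning 800 MHz to 2.2 GHz).

```python
# Sketch of one epoch of the first user study: pick a random frequency,
# gather counter statistics, then ask the user to rate satisfaction.
# set_frequency/collect_hpcs/ask_rating are hypothetical stand-ins.
import random

FREQS_MHZ = [800, 1000, 1200, 1500, 1800, 2200]  # 6 steps, 800 MHz - 2.2 GHz

def study_epoch(set_frequency, collect_hpcs, ask_rating):
    f = random.choice(FREQS_MHZ)
    set_frequency(f)
    hpc_stats = collect_hpcs()     # max/min/stdev/range/avg per counter
    rating = ask_rating()          # user's satisfaction rating
    return {"freq_mhz": f, "hpcs": hpc_stats, "rating": rating}
```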
Correlation to HPCs
Compare each set of HPC values with the user satisfaction ratings
Collected 360 satisfaction ratings (20 users x 6 frequencies x 3 applications), with 45 metrics (the 5 statistics over each counter) per rating
Pearson's product-moment correlation coefficient (r): -1 = perfect negative linear correlation, 1 = perfect positive linear correlation
Strong correlation: 21 of the 45 metrics have r values above 0.7
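Pearson's r is the standard product-moment formula; a self-contained sketch:

```python
# Pearson's product-moment correlation coefficient between one HPC metric
# and the satisfaction ratings. Pure-Python sketch; r lies in [-1, 1].
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# A metric counts as strongly correlated when r exceeds 0.7, as on the slide.
```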
Correlation to the Individual User
Combine all user data and fit a neural network: inputs are the HPC statistics and the user ID; output is user satisfaction
Observe the relative importance factor of each input: the user ID is more than twice as important as the second-most-important input
User satisfaction is highly user-specific! (A sketch of this setup follows.)
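A hedged sketch of this combined model using scikit-learn; the paper does not specify a library, and the feature layout, file names, and hyperparameters here are assumptions. Input importance is estimated here via permutation importance as a stand-in for the paper's relative importance factor.

```python
# Sketch: one network over all users, inputs = HPC statistics + user ID,
# output = satisfaction. Library, hyperparameters, and file names are
# assumptions, not the paper's setup.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.inspection import permutation_importance

X = np.load("features.npy")   # hypothetical: rows = ratings, last col = user ID
y = np.load("ratings.npy")    # hypothetical: satisfaction ratings

nn = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000,
                  random_state=0).fit(X, y)
imp = permutation_importance(nn, X, y, n_repeats=30, random_state=0)
print(imp.importances_mean)   # slide's finding: the user-ID input dominates,
                              # >2x the second-most-important input
```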
Performance vs. User Satisfaction
User satisfaction is often non-linear
User satisfaction is application-specific
Most importantly, user satisfaction is user-specific
Leveraging User Satisfaction
Observations: user satisfaction is non-linear, application-dependent, and user-dependent; all three represent optimization potential!
Based on these observations, we construct Individualized DVFS (iDVFS)
Dynamic voltage and frequency scaling (DVFS) is effective for reducing power consumption
Common DVFS schemes (e.g., Windows XP DVFS, the Linux ondemand governor) are based on CPU utilization (a baseline sketch follows)
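For contrast with iDVFS, here is a minimal sketch of the kind of utilization-driven policy those schemes use; the thresholds, period, and `read_utilization`/`set_frequency` helpers are illustrative assumptions, not any OS's actual implementation.

```python
# Sketch of a conventional CPU-utilization-based DVFS policy, the style of
# scheme used by Windows XP DVFS and the Linux ondemand governor.
import time

def utilization_governor(freqs, read_utilization, set_frequency,
                         up=0.80, down=0.30, period_s=0.5):
    level = len(freqs) - 1                    # start at the highest frequency
    while True:
        util = read_utilization()             # CPU utilization in [0.0, 1.0]
        if util > up and level < len(freqs) - 1:
            level += 1                        # busy: raise frequency
        elif util < down and level > 0:
            level -= 1                        # mostly idle: lower frequency
        set_frequency(freqs[level])
        time.sleep(period_s)
```

Note the design gap this exposes: utilization is a proxy for demand, not for the user, which is exactly what the Video example later shows.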
Individualized DVFS (iDVFS)
Learning/Modeling Stage: build a user-aware performance prediction model from hardware counter stats and user satisfaction feedback
Runtime Power Management: predictive, user-aware dynamic frequency scaling
iDVFS – Learning/Modeling
Train a neural network from HPCs to user satisfaction, per user and per application, with a small training set
Two modifications to neural network training:
Limit the inputs to the two highest-correlation HPC statistics: BTAC_M-average and TOT_CYC-average
Repeat training and keep the most accurate network
We train per-user and per-application in this work, but finer granularities are possible (e.g., Video is more application-specific)
(A sketch of this training loop follows.)
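A sketch of the two modifications, again with an assumed library and hyperparameters: restrict the inputs to the two named statistics and repeat training, keeping the best network.

```python
# Sketch of the per-user, per-application training with both modifications:
# inputs limited to [BTAC_M-average, TOT_CYC-average], and training repeated
# with the most accurate network kept. Library/hyperparameters are assumptions.
from sklearn.neural_network import MLPRegressor

def train_best(X2, y, trials=10):
    """X2: n x 2 matrix of (BTAC_M-average, TOT_CYC-average) per rating."""
    best, best_score = None, float("-inf")
    for seed in range(trials):
        nn = MLPRegressor(hidden_layer_sizes=(4,), max_iter=5000,
                          random_state=seed).fit(X2, y)
        score = nn.score(X2, y)               # R^2 on the small training set
        if score > best_score:
            best, best_score = nn, score
    return best
```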
iDVFS – Control Algorithm
ρ: user satisfaction tradeoff threshold; αf: per-frequency threshold; M: maximum user satisfaction
Greedy approach: make a satisfaction prediction every 500 ms
If the predicted satisfaction is within αf·ρ of M twice in a row, decrease the frequency
If not, increase the frequency and decrease αf to prevent ping-ponging between frequencies
(Summarized here; details are in the paper. A sketch of the loop follows.)
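A minimal sketch of the greedy loop, assuming a trained per-user/per-application predictor; the exact αf update rule here is a simplification of the paper's.

```python
# Sketch of the greedy iDVFS control loop. predict_satisfaction() stands in
# for the trained per-user/per-app network evaluated on current HPC stats;
# the alpha_f decay rule is a simplified assumption (details in the paper).
import time

def idvfs_loop(freqs, predict_satisfaction, set_frequency, M, rho,
               alpha0=1.0, decay=0.5):
    level = len(freqs) - 1                # start at the highest frequency
    alpha = [alpha0] * len(freqs)         # per-frequency threshold alpha_f
    hits = 0
    while True:
        set_frequency(freqs[level])
        time.sleep(0.5)                   # predict every 500 ms
        s = predict_satisfaction()
        if s >= M - alpha[level] * rho:   # within alpha_f * rho of M
            hits += 1
            if hits >= 2 and level > 0:   # satisfied twice in a row
                level -= 1                # -> decrease frequency
                hits = 0
        else:
            hits = 0
            if level < len(freqs) - 1:
                level += 1                # dissatisfied -> increase frequency
            alpha[level] *= decay         # shrink alpha_f to avoid ping-pong
```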
Second User Study
Goal: evaluate iDVFS with real users
How: users use each application with iDVFS and with Windows XP DVFS, in random order; afterwards, users are asked to rate each one
Frequency logs are maintained throughout the experiments and replayed through a National Instruments DAQ to measure system power
Example Trace – Shockwave
iDVFS scales frequency effectively based on predicted user satisfaction
In this case, we slightly decrease power compared to Windows XP DVFS
Example Trace – Video
iDVFS significantly reduces power consumption
Here, CPU utilization does not track user satisfaction
Results – Video No change in user satisfaction, significant power savings
Results – Java
Same user satisfaction, same power savings
Red: users who gave high ratings to lower frequencies
Dashed black: users for whom the neural network predicted poorly
These are not anomalous users; satisfaction is simply non-monotonic in frequency. In a few cases, sticking to the user's model loses; there, combining iDVFS with other techniques (e.g., Vertigo) may help. The Java game is bound by CPU performance. Still, in many cases we maintain user satisfaction and save power.
Results – Shockwave
Lowered user satisfaction, improved power
We save power in almost every case; a few users are less satisfied, but overall users remain satisfied and we save power
Blue: users who gave constant ratings during training
Energy-Satisfaction Product (ESP)
We need a metric for the energy/satisfaction tradeoff; ESP is one possibility
ESP increases slightly for Shockwave: the benefits in energy reduction outweigh the loss in user satisfaction under ESP, so one could argue we improve ESP even there
Conclusion
We explore user satisfaction relative to actual hardware performance
We show that HPCs correlate with user satisfaction for interactive applications
We show that user satisfaction is generally non-linear, application-specific, and user-specific
We demonstrate an example of leveraging user satisfaction to reduce power consumption by over 25%
Thank you Questions? For more information, please visit: http://www.empathicsystems.org