Leveraging Big Data to Develop Next Generation Demand Side Management Programs and Energy Regulations
Daniel Young, Energy Solutions
Mike McGaraghan, Energy Solutions
Nate Dewart, Energy Solutions
Pat Eilert, PG&E
Dan Hopper, SCE
Moneyball Analogy #1
“Some of the scouts still believed they could tell by the structure of a young man’s face not only his character but his future in pro ball.” -Michael Lewis, Moneyball
“All I have is the box scores.” -Bill James, Baseball Abstract (sabermetrics statistician)
Conclusion: We can do better with more data, and the data we need is out there.
The Need for Data
The development of successful utility programs and energy codes and standards requires a LOT of data:
- Base-case product performance
- Technology options for higher efficiency/performance
- Forecasts of future product performance trends
- Incremental cost of improvement
Example: Where Data Can Help
[Figure. Image source: DOE study, Incorporating Experience Curves in Appliance Standards Analysis. Chart annotations: More Stringent; More Cost Effective; Highest NPV?; Less Conservative]
Case Study: LED Lamps
Goal: Ensure minimum performance across several operating parameters for LED lamps: light color, light quality, efficacy, lifetime, dimmability, etc.
Opportunity: LEDs and big data
- LED technology is rapidly improving, while costs are rapidly decreasing
- Several existing databases to track product performance
- Many existing industry forecasts to calibrate against
- Looking beyond efficiency
Moneyball Analogy #2
“A hitter should be measured by his success in that which he is trying to do, and that which he is trying to do is create runs.” -Bill James, Baseball Abstract (sabermetrics statistician)
Conclusion: Focus on the right metrics and keep the end goal in mind.
2012 Analysis
Approach:
- 700 unique price points were manually collected for over 500 unique lamp models (not new, definitely not big data)
- Multi-variable regression model to analyze the dataset (a little new)
Variables: ENERGY STAR?, CRI, CCT, power factor, wattage, efficacy, light output, bulb shape, dimmability, lifetime
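
As a rough illustration of what a multi-variable regression on lamp prices involves, the sketch below fits price against a few lamp attributes using ordinary least squares in plain Python. It is not the model used in the analysis: the function name `fit_ols`, the choice of three predictors, and all lamp and price values are assumptions, and the prices are synthetic, generated from a known rule so the fitted coefficients can be checked.

```python
def fit_ols(rows, targets):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # leading 1.0 = intercept
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; drop one of them")
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Synthetic lamps: (lumens, CRI, dimmable flag). Not data from the study.
lamps = [(450, 80, 0), (800, 80, 1), (800, 90, 0),
         (1100, 82, 1), (1600, 90, 1), (450, 92, 0)]
# Synthetic prices built from a known rule so the fit can be checked.
prices = [1.0 + 0.01 * lm + 0.25 * cri + 2.0 * dim for lm, cri, dim in lamps]

intercept, per_lumen, per_cri_point, dimmable_premium = fit_ols(lamps, prices)
print(f"per lumen: {per_lumen:.4f}, per CRI point: {per_cri_point:.4f}, "
      f"dimmable premium: {dimmable_premium:.2f}")
```

Each fitted coefficient reads as the price change associated with one unit of that attribute, holding the others fixed. With real retail data the fit would not be exact, and categorical attributes such as bulb shape would need to be encoded as indicator columns.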
Price Modeling – 2012 Data
Note: Results are based on online retailer prices, which we found to be significantly higher on average than in-store prices.
Moneyball Analogy #3
“The power of statistical analysis depends on sample size…a right-handed hitter who has gone two for ten against left-handed pitching, cannot as reliably be predicted to hit .200 against lefties as a hitter who has gone 200 for 1,000.” -Michael Lewis, Moneyball
Conclusion: We could use some more data.
Next Step: Bringing in Big Data
Retailer-based web crawler tool:
- Screen-scraping methods
- Retailer-provided APIs (Application Programming Interfaces)
Scope of data collection:
- Nine online retailers
- 3,000 unique price points
- 1,000 unique LED lamp models
- 50 different manufacturers
- Data collected weekly
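
To make the screen-scraping idea concrete, here is a minimal sketch that pulls a price out of a product-page fragment using Python's standard `html.parser` module. It is not the crawler described above: the `PriceParser` class, the `class="price"` markup, and the retailer and model names are all hypothetical, and real retailer pages will be structured differently.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collects the text of every <span class="price"> element as a float."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_data(self, data):
        text = data.strip()
        if self._in_price and text:
            self.prices.append(float(text.lstrip("$").replace(",", "")))

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False


# Hypothetical product-page fragment; real retailer markup will differ.
page = '<div><h2>Example A19 LED lamp</h2><span class="price">$12.97</span></div>'

parser = PriceParser()
parser.feed(page)
parser.close()

# One "price point" record, as it might be stored for a weekly snapshot.
price_point = {"retailer": "example-retailer", "model": "EX-A19-800",
               "price": parser.prices[0]}
print(price_point)
```

A retailer-provided API would typically return structured fields directly, so this parsing step would only be needed for retailers without one.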
2012 Data vs 2014 Data
Note: 2014 data is refreshed every week
Benefits of Big Data
More data -> improvements to the regression analysis:
- Individual models could be created for each lamp type
- Additional independent variables analyzed
- Comparable or improved explanatory power for each model
New data is collected each week with minimal effort:
- Ability to monitor real-time performance and price changes
- Observe trends in performance and price
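
The idea of one model per lamp type can be sketched as follows: group the price points by lamp type, then fit a separate line to each group. This is an illustrative simplification, assuming a single predictor (lumens) and made-up records; the helper names `fit_line` and `fit_by_type` are hypothetical, and the lamp types and prices are synthetic.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Simple least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_type(records):
    """Fit price vs. lumens separately for each lamp type."""
    groups = defaultdict(list)
    for lamp_type, lumens, price in records:
        groups[lamp_type].append((lumens, price))
    return {t: fit_line(*zip(*points)) for t, points in groups.items()}


# Synthetic (lamp type, lumens, price) records; not data from the study.
records = [
    ("A19", 450, 6.5), ("A19", 800, 10.0), ("A19", 1600, 18.0),
    ("PAR38", 700, 19.0), ("PAR38", 1000, 25.0), ("PAR38", 1300, 31.0),
]

models = fit_by_type(records)
for lamp_type, (slope, intercept) in models.items():
    print(f"{lamp_type}: price = {intercept:.2f} + {slope:.4f} * lumens")
```

Fitting per type lets the price per lumen differ between lamp types instead of forcing one slope across the whole dataset, which is only practical once each type has enough price points.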
Example Regression Results
Best fit model is based on:
- Lumens
- Brand
- Energy Star Qualified
Metrics not independently impacting price include:
- Dimmable
- Color Temperature
- CRI
- Wattage
- Beam Angle
- Warranty Length
- Diameter
- Efficacy
- Lumen Maintenance
Observed Trends
Implications for Incremental Measure Cost (IMC)
No more incremental cost for CRI?
Back to the Future
Key questions for IDSM program development and codes and standards advocacy/evaluation:
- What’s the baseline performance?
- How do the best products perform?
- How is performance changing over time?
- What’s the incremental cost?
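
One way to approach the incremental cost question with a fitted price model is to predict the price of a baseline lamp and of an upgraded lamp, then take the difference. The sketch below assumes a linear model; the coefficient values, the `predict_price` helper, and both lamp specifications are hypothetical and are not results from this work.

```python
# Hypothetical coefficients for illustration only; not results from the study.
COEFS = {"intercept": 4.0, "lumens": 0.01, "energy_star": 2.5}


def predict_price(lamp):
    """Linear price prediction; attributes without a coefficient contribute nothing."""
    return COEFS["intercept"] + sum(
        COEFS[name] * value for name, value in lamp.items() if name in COEFS
    )


baseline = {"lumens": 800, "energy_star": 0, "cri": 80}
upgraded = {"lumens": 800, "energy_star": 1, "cri": 90}

incremental_cost = predict_price(upgraded) - predict_price(baseline)
print(f"incremental cost: ${incremental_cost:.2f}")
```

In this sketch the higher CRI adds nothing to the predicted price because CRI has no coefficient in the model, which is how an attribute that does not independently affect price would show up in an incremental cost estimate.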
Summary
Major Benefits
- Major increase in data volume and accuracy
- Better data for more effective programs and codes
- Saves significant time and resources over existing methods
Outstanding Issues
- How to use the data most effectively
- Linking to product performance databases
- Inconsistent retailer info and labeling
- Legality of web crawling
Moneyball Analogy #4
“Statcast, a 3-D tracking system that provides detailed metrics on the locations and movements of the ball, the players, and even the umpires…will proliferate not just through the ranks of all professional sports but to youth sports, affecting everything from how games are taught to the statistical nomenclature of sport” -Billy Beane, “A Tech-Driven Revolution,” Wall Street Journal, July 7
Conclusion: The opportunities for big data have only just begun.