Slide 1 Approximate Dynamic Programming for High-Dimensional Problems in Energy Modeling. Ohio St. University, October 7, 2009. Warren Powell, CASTLE Laboratory, Princeton University. http://www.castlelab.princeton.edu © 2009 Warren B. Powell, Princeton University
Slide 2 Goals for an energy policy model
Potential questions
» Policy questions: How do we design policies to achieve energy goals (e.g., 20% renewables by 2015) with a given probability? How does the imposition of a carbon tax change the likelihood of meeting this goal? What might happen if ethanol subsidies are reduced or eliminated? What is the impact of a breakthrough in batteries?
» Energy economics: What is the best mix of energy generation technologies? How is the economic value of wind affected by the presence of storage? What is the best mix of storage technologies? How would climate change impact our ability to use hydroelectric reservoirs as a regulating source of power?
Slide 3 Goals for an energy policy model
Designing energy supply and storage portfolios to work with wind:
» The marginal value of wind and solar farms depends on the ability to work with intermittent supply.
» The impact of intermittent supply will be mitigated by the use of storage.
» Different storage technologies (batteries, flywheels, compressed air, pumped hydro) are each designed to serve different types of variations in supply and demand.
» The need for storage (and the value of wind and solar) depends on the entire portfolio of energy producing technologies.
Slide 4 Intermittent energy sources Wind speed Solar energy
Slide 5 Wind 30 days 1 year
Slide 6 Storage Batteries Ultracapacitors Flywheels Hydroelectric
Slide 7 Long-term uncertainties, 2010–2030: tax policy, batteries, solar panels, carbon capture and sequestration, price of oil, climate change.
Slide 8 Goals for an energy policy model
Model capabilities we are looking for
» Multi-scale
Multiple time scales (hourly, daily, seasonal, annual, decade)
Multiple spatial scales
Multiple technologies (different coal-burning technologies, new wind turbines, …)
Multiple markets
– Transportation (commercial, commuter, home activities)
– Electricity use (heavy industrial, light industrial, business, residential)
– …
» Stochastic (handles uncertainty)
Hourly fluctuations in wind, solar and demands
Daily variations in prices and rainfall
Seasonal changes in weather
Yearly changes in supplies, technologies and policies
Slide 9 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model
Slide 10 A resource allocation model Attribute vectors:
Slide 11 A resource allocation model Modeling resources: »The attributes of a single resource: »The resource state vector: »The information process:
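As a concrete illustration of the notation above (a minimal sketch in Python; the attribute fields and counts are hypothetical, not taken from the model), each resource is described by an attribute vector a, and the resource state vector R_t records how many resources have each attribute:

from collections import Counter
from typing import NamedTuple

class Attribute(NamedTuple):
    """Illustrative attribute vector 'a' for a single resource."""
    resource_type: str   # e.g. "wind", "battery", "hydro"
    location: str        # spatial index
    age: int             # periods since acquisition

# Resource state vector R_t: R_ta = number of resources with attribute a
R_t = Counter({
    Attribute("wind",    "zone1", 3): 120,
    Attribute("battery", "zone1", 1): 40,
    Attribute("hydro",   "zone2", 10): 2,
})

# Information process: exogenous changes to the resource vector arriving
# between t and t+1 (e.g. newly built wind capacity)
R_hat = Counter({Attribute("wind", "zone1", 0): 10})
R_t.update(R_hat)  # add the new arrivals to the resource state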
Slide 12 A resource allocation model Modeling demands: »The attributes of a single demand: »The demand state vector: »The information process:
Slide 13 Energy resource modeling The system state:
Slide 14 Energy resource modeling The decision variables:
Slide 15 Energy resource modeling Exogenous information:
Slide 16 Energy resource modeling The transition function Known as the: “Transition function” “Transfer function” “System model” “Plant model” “Model”
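In the notation standard for this modeling framework (a sketch, since the equation on the slide did not survive extraction), the transition function maps the current state, the decision, and the exogenous information arriving next period to the next state:

\[ S_{t+1} = S^M\left(S_t, x_t, W_{t+1}\right) \]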
Slide 17 Energy resource modeling Demands Resources
Slide 18 Energy resource modeling t t+1 t+2
Slide 19 Energy resource modeling t t+1 t+2 Optimizing at a point in time Optimizing over time
Slide 20 Energy resource modeling
The objective function
» How do we find the best policy? Myopic policies, rolling horizon policies, simulation-optimization, dynamic programming.
(The equation on this slide labels the decision function (policy), the state variable, the contribution function, and the expectation over all random outcomes in the problem of finding the best policy.)
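Those labels annotate an objective of the standard form; a sketch consistent with the rest of the deck:

\[ \max_{\pi} \; \mathbb{E}\left[\sum_{t=0}^{T} \gamma^{t}\, C\!\left(S_t, X^{\pi}_t(S_t)\right)\right] \]

where X^\pi_t(S_t) is the decision function (policy), S_t is the state variable, C is the contribution function, and the expectation is taken over all random outcomes.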
Slide 21 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model
Slide 22 Introduction to dynamic programming Bellman’s optimality equation: assume the value function at t+1 is known; compute this for each state S.
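The equation itself did not survive extraction; its standard form, matching those annotations, is:

\[ V_t(S_t) = \max_{x_t \in \mathcal{X}_t} \Big( C(S_t, x_t) + \gamma\, \mathbb{E}\big[\, V_{t+1}(S_{t+1}) \mid S_t, x_t \,\big] \Big) \]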
Slide 23 Introduction to dynamic programming Bellman’s optimality equation: Problem: Curse of dimensionality Three curses State space Outcome space Action space (feasible region)
Slide 24 Introduction to dynamic programming The computational challenges: How do we find the value function for the next time period? How do we compute the expectation? How do we find the optimal solution?
Slide 25 Introduction to ADP Classical ADP »Most applications of ADP focus on the challenge of handling multidimensional state variables »Start with »Now replace the value function with some sort of approximation »May draw from the entire field of statistics/machine learning.
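One generic way to write that replacement step (a sketch of the usual linear architecture; the basis functions are whatever the statistical model chooses, not something specified on the slide):

\[ V_t(S_t) \;\approx\; \bar{V}_t(S_t) = \sum_{f \in \mathcal{F}} \theta_{tf}\, \phi_f(S_t) \]

where the \phi_f are features (basis functions) of the state and the coefficients \theta_{tf} are fit with regression or any other machine learning method.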
Slide 26 Introduction to ADP Other statistical methods
» Regression trees: combine regression with techniques for discrete variables.
» Data mining: good for categorical data.
» Neural networks: engineers like these for low-dimensional continuous problems.
» Kernel/local polynomial regression: approximates portions of the value function locally using simple functions.
» Dirichlet mixture models: aggregate portions of the function and fit approximations around these aggregations.
Slide 27 Introduction to ADP But this does not solve our problem »Assume we have an approximate value function. »We still have to solve a problem that looks like »This means we still have to deal with a maximization problem (might be a linear, nonlinear or integer program) with an expectation.
Slide 28 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model
Slide 29 A decision-tree example: should we schedule or cancel an outdoor game, and should we use the weather report first?
» Do not use the weather report: scheduling the game pays -$2000 if it rains (prob. .2), $1000 if it is cloudy (prob. .3), and $5000 if it is sunny (prob. .5); canceling the game pays -$200 in every outcome.
» Use the weather report: the forecast is sunny with probability .6 (then rain .1, clouds .2, sun .7), cloudy with probability .3 (rain .1, clouds .5, sun .4), or rain with probability .1 (rain .8, clouds .2, sun .0). After each forecast we again choose between scheduling the game (payoffs -$2000 / $1000 / $5000) or canceling it (-$200).
Decision nodes and outcome nodes alternate through the tree: information, action, information, action; each node is a state.
Slide 30 The post-decision state New concept: »The “pre-decision” state variable: Same as a “decision node” in a decision tree. »The “post-decision” state variable: Same as an “outcome node” in a decision tree.
Slide 31 The post-decision state An inventory problem: »Our basic inventory equation: »Using pre- and post-decision states:
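The two equations referenced on this slide have the standard form (a sketch): with inventory R_t, order quantity x_t, and random demand \hat D_{t+1},

\[ R_{t+1} = \max\{0,\; R_t + x_t - \hat{D}_{t+1}\} \]

and splitting the same dynamics around the post-decision state R^x_t,

\[ R^x_t = R_t + x_t, \qquad R_{t+1} = \max\{0,\; R^x_t - \hat{D}_{t+1}\}. \]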
Slide 32 The post-decision state Pre-decision, state-action, and post-decision Pre-decision state State Action Post-decision state
Slide 33 The post-decision state Pre-decision: resources and demands
Slide 34 The post-decision state
Slide 35 The post-decision state
Slide 36 The post-decision state
Slide 37 The post-decision state Classical form of Bellman’s equation: Bellman’s equations around pre- and post-decision states: »Optimization problem (making the decision): Note: this problem is deterministic! »Expectation problem (incorporating uncertainty):
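Written out in the standard notation (a sketch; the slide's own equations were not extracted), the classical form and the pre-/post-decision split are:

\[ V_t(S_t) = \max_{x_t}\Big( C(S_t, x_t) + \gamma\, \mathbb{E}\big[\, V_{t+1}(S_{t+1}) \mid S_t, x_t \,\big] \Big) \]
\[ \text{Making the decision (deterministic):}\quad V_t(S_t) = \max_{x_t}\Big( C(S_t, x_t) + \gamma\, V^x_t(S^x_t) \Big) \]
\[ \text{Incorporating uncertainty:}\quad V^x_t(S^x_t) = \mathbb{E}\big[\, V_{t+1}(S_{t+1}) \mid S^x_t \,\big] \]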
Slide 38 Introduction to ADP We first use the value function around the post-decision state variable, removing the expectation: We then replace the value function with an approximation that we estimate using machine learning techniques:
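Combining the two steps on this slide, each decision is obtained from a purely deterministic optimization in which the approximation \bar V replaces the exact post-decision value function (sketch):

\[ x_t = \arg\max_{x \in \mathcal{X}_t} \Big( C(S_t, x) + \bar{V}_t\big(S^x_t\big) \Big) \]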
Slide 39 The post-decision state Value function approximations: »Linear (in the resource state): »Piecewise linear, separable: »Indexed PWL separable:
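For a resource state vector R_t = (R_{ta})_a, the first two approximations listed above are usually written as (sketch):

\[ \text{Linear:}\quad \bar{V}_t(R_t) = \sum_{a} \bar{v}_{ta}\, R_{ta} \qquad \text{Piecewise linear, separable:}\quad \bar{V}_t(R_t) = \sum_{a} \bar{V}_{ta}(R_{ta}) \]

where each \bar V_{ta} is a piecewise-linear, concave function of the scalar R_{ta}; the indexed version lets these functions also depend on other information in the state.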
Slide 40 The post-decision state Value function approximations: »Ridge regression (Klabjan and Adelman) »Benders cuts
Slide 41 Making decisions Following an ADP policy
Slide 42 Making decisions Following an ADP policy
Slide 43 Making decisions Following an ADP policy
Slide 44 Making decisions Following an ADP policy
Slide 45 Approximate dynamic programming With luck, the objective function will improve steadily
Slide 46 The post-decision state Comparison to other methods: »Classical MDP (value iteration) »Classical ADP (pre-decision state): »Updating around post-decision state: Expectation No expectation
Slide 47 Approximate dynamic programming
Step 1: Start with a pre-decision state.
Step 2: Solve the deterministic optimization using an approximate value function to obtain a decision. (Deterministic optimization)
Step 3: Update the value function approximation. (Recursive statistics)
Step 4: Obtain a Monte Carlo sample of the exogenous information and compute the next pre-decision state. (Simulation)
Step 5: Return to step 1.
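A minimal sketch of this five-step loop in Python. The callables solve_deterministic, sample_exogenous and transition, and the vfa object with its update method, are hypothetical placeholders for the problem-specific pieces; they are not part of any library or of the SMART code.

def adp_forward_pass(S0, vfa, T, solve_deterministic, sample_exogenous, transition,
                     n_iterations=100):
    """Generic ADP loop around the post-decision state (sketch)."""
    for n in range(n_iterations):
        S, prev_post = S0, None            # Step 1: start with a pre-decision state
        for t in range(T):
            # Step 2: deterministic optimization using the approximate value function,
            # returning the decision, its value v_hat, and the post-decision state
            x, v_hat, S_post = solve_deterministic(S, vfa, t)
            # Step 3: update the approximation around the previous post-decision
            # state using the observed value v_hat (recursive statistics)
            if prev_post is not None:
                vfa.update(t - 1, prev_post, v_hat)
            # Step 4: Monte Carlo sample of the exogenous information W_{t+1},
            # then compute the next pre-decision state with the transition function
            W = sample_exogenous(t + 1)
            S, prev_post = transition(S_post, W), S_post
        # Step 5: return to step 1 (run another forward pass)
    return vfa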
Slide 49 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model
Slide 50 Blood management Managing blood inventories
Slide 51 Blood management Managing blood inventories over time, with decision epochs t = 0, 1, 2, 3 corresponding to Weeks 0 through 3.
Slides 52–56 Blood management network (figure): blood inventories, indexed by blood type and age (e.g., AB+,0 through O-,3), either satisfy a demand of a compatible blood type (AB+, AB-, A+, A-, B+, B-, O+, O-) or are held, aging by one period into the next week's inventory. Solve this assignment problem as a linear program. The dual variables give the value of an additional unit of blood of each type and age.
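A toy sketch of the assignment step in Python using scipy.optimize.linprog. The supplies, demands and rewards below are made up for illustration (they are not the model's data), and instead of reading the LP duals directly, the marginal value of one additional unit of blood is estimated by perturbing the supply and re-solving, which is the gradient information used to build the value function approximation.

import numpy as np
from scipy.optimize import linprog

def assignment_value(supply, demand, reward):
    """Maximize total reward of assigning supply nodes to demand nodes,
    subject to supply and demand limits (a tiny transportation LP)."""
    m, n = reward.shape
    c = -reward.ravel()                       # linprog minimizes, so negate the rewards
    A_ub, b_ub = [], []
    for i in range(m):                        # supply node i: sum_j x_ij <= supply_i
        row = np.zeros(m * n); row[i * n:(i + 1) * n] = 1
        A_ub.append(row); b_ub.append(supply[i])
    for j in range(n):                        # demand node j: sum_i x_ij <= demand_j
        row = np.zeros(m * n); row[j::n] = 1
        A_ub.append(row); b_ub.append(demand[j])
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=(0, None))
    return -res.fun

# Two blood supplies (say O-,0 and AB+,0) and two demands; a zero reward marks
# an incompatible blood type.
supply = np.array([10.0, 5.0])
demand = np.array([12.0, 8.0])
reward = np.array([[50.0, 20.0],
                   [ 0.0, 45.0]])

base = assignment_value(supply, demand, reward)
# Finite-difference estimate of the dual: the value of one more unit of the
# first blood type, obtained by perturbing its supply and re-solving.
v_hat = assignment_value(supply + np.array([1.0, 0.0]), demand, reward) - base
print(f"objective = {base:.1f}, marginal value of an extra unit = {v_hat:.1f}")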
Slide 57 Updating the value function approximation Estimate the gradient at
Slide 58 Updating the value function approximation Update the value function at
Slide 59 Updating the value function approximation Update the value function at
Slide 60 Updating the value function approximation Update the value function at
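The update on these slides is typically a stochastic-approximation (smoothing) step applied to the slope of the value function at the post-decision resource level just visited; a sketch of the standard form, with stepsize \alpha_{n-1}:

\[ \bar{v}^{\,n}_{ta} = (1 - \alpha_{n-1})\, \bar{v}^{\,n-1}_{ta} + \alpha_{n-1}\, \hat{v}^{\,n}_{ta} \]

For piecewise-linear approximations, the updated slopes may then need to be projected (re-sorted) so that the function remains concave.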
Slide 61 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model
Slide 62 SMART: A Stochastic, Multiscale Allocation model for energy Resources, Technology and policy
» Stochastic – able to handle different types of uncertainty:
Fine-grained – daily fluctuations in wind, solar, demand, prices, …
Coarse-grained – major climate variations, new government policies, technology breakthroughs
» Multiscale – able to handle different levels of detail:
Time scales – hourly to yearly
Spatial scales – aggregate to fine-grained disaggregate
Activities – different types of demand patterns
» Decisions
Hourly dispatch decisions
Yearly investment decisions
Takes as input parameters characterizing government policies, performance of technologies, and assumptions about climate.
Slide 63 The annual investment problem 2008 2009
Slide 64 The hourly dispatch problem Hourly electricity “dispatch” problem
Slide 65 The hourly dispatch problem Hourly model »Decisions at time t impact t+1 through the amount of water held in the reservoir. Hour t Hour t+1
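A sketch of that coupling (the symbols here are illustrative, not the model's actual variable names): the water in the reservoir links hour t to hour t+1 through a simple storage balance,

\[ R^{\text{water}}_{t+1} = R^{\text{water}}_{t} - x^{\text{release}}_{t} + \hat{P}_{t+1} \]

where x^{release}_t is the water released to generate power in hour t and \hat P_{t+1} is the random inflow (rainfall) arriving before the next hour.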
Slide 66 The hourly dispatch problem Hourly model »Decisions at time t impact t+1 through the amount of water held in the reservoir. Value of holding water in the reservoir for future time periods. Hour t
Slide 67 The hourly dispatch problem
Slide 68 The hourly dispatch problem 2008 Hour 1 2 3 4 8760 2009 1 2
Slide 69 The hourly dispatch problem 2008 Hour 1 2 3 4 8760 2009 1 2
Slide 70 SMART-Stochastic, multiscale model 2008 2009
Slide 71 SMART-Stochastic, multiscale model 2008 2009
Slide 72 SMART-Stochastic, multiscale model 2008 2009 2010 2011 2038
Slide 73 SMART-Stochastic, multiscale model 2008 2009 2010 2011 2038
Slide 74 SMART-Stochastic, multiscale model 2008 2009 2010 2011 2038 ~5 seconds
Slide 75 SMART-Stochastic, multiscale model Use statistical methods to learn the value of resources in the future. Resources may be: »Stored energy Hydro Flywheel energy … »Storage capacity Batteries Flywheels Compressed air »Energy transmission capacity Transmission lines Gas lines Shipping capacity »Energy production sources Wind mills Solar panels Nuclear power plants Amount of resource Value
Slide 76 SMART-Stochastic, multiscale model Approximating continuous functions: the algorithm performs a very fine discretization over the small range of the function that is visited most often.
Slide 77 SMART-Stochastic, multiscale model
Benchmarking
» Compare ADP to the optimal LP for a deterministic problem
Annual model – 8,760 hours over a single year, focusing on the ability to match hydro storage decisions
20-year model – 24-hour time increments over 20 years, focusing on investment decisions
» Comparisons on the stochastic model
Stochastic rainfall analysis – how does the ADP solution compare to the LP?
Carbon tax policy analysis – demonstrate nonanticipativity
Slide 78 Benchmarking on hourly dispatch (figure): percentage error of the ADP objective function relative to the optimal LP, plotted against iterations, converging to 0.06% over optimal.
Slide 79 Benchmarking on hourly dispatch Optimal from linear program Reservoir level Demand Rainfall
Slide 80 Benchmarking on hourly dispatch ADP solution Reservoir level Demand Rainfall Approximate dynamic programming
Slide 81 Benchmarking on hourly dispatch Optimal from linear program Reservoir level Demand Rainfall
Slide 82 Benchmarking on hourly dispatch ADP solution Reservoir level Demand Rainfall Approximate dynamic programming
Slide 83 Multidecade energy model Optimal vs. ADP – daily model over 20 years 0.24% over optimal
Slide 84 Energy policy modeling Traditional optimization models tend to produce all-or-nothing solutions. (Figure: investment in IGCC versus the cost differential between IGCC and pulverized coal, with regions where pulverized coal is cheaper and where IGCC is cheaper, comparing traditional optimization with approximate dynamic programming.)
Slide 85 Stochastic rainfall (figure): sample paths of precipitation over time.
Slide 86 Stochastic rainfall (figure): reservoir level over time, comparing the ADP solution with the optimal solutions for individual scenarios.
Slide 87 Energy policy modeling Following sample paths
» Demands, prices, weather, technology, policies, …
(Figure: a metric such as % renewable simulated out to 2030; the goal is achieved with probability 0.70.)
Need to consider: fine-grained noise (wind, rain, demand, prices, …) and coarse-grained noise (technology, policy, climate, …).
Slide 88 Energy policy modeling Policy study: what is the effect of a potential (but uncertain) carbon tax in year 8? (Figure: carbon tax level by year, years 1 through 9.)
Slide 89 Energy policy modeling Renewable technologies Carbon-based technologies No carbon tax
Slide 90 Energy policy modeling With carbon tax Carbon-based technologies Renewable technologies Carbon tax policy unknown Carbon tax policy determined
Slide 91 Energy policy modeling With carbon tax Carbon-based technologies Renewable technologies
Slide 92 Conclusions Capabilities
» SMART can handle problems with over 300,000 time periods, so it can model hourly variations in a long-term energy investment model.
» It can simulate virtually any form of uncertainty, either provided through an exogenous scenario file or sampled from a probability distribution.
» Accurate modeling of climate, technology and markets requires access to exogenously provided scenarios.
» It properly models storage processes over time.
» Current tests are on an aggregate model, but the modeling framework (and library) is set up for spatially disaggregate problems.
Slide 93 Conclusions Limitations »More research is needed to test the ability of the model to use multiple storage technologies. »Extension to spatially disaggregate model will require significant engineering and data. »Run times will start to become an issue for a spatially disaggregate model. »Value function approximations capture the resource state vector, but are limited to very simple exogenous state variations.
Slide 94 Outline Modeling stochastic resource allocation problems An introduction to ADP ADP and the post-decision state variable A blood management example The SMART energy policy model Merging machine learning and optimization
Slide 95 Merging machine learning and optimization The challenge of coarse-grained uncertainty
» Fine-grained uncertainty can generally be modeled as memoryless (even if it is not).
» Coarse-grained uncertainty affects what might be called the “state of the world.”
» The value of a resource depends on the “state of the world.” Is there a carbon tax? What is the state of battery research? Have there been major new oil discoveries? What is the price of oil? Did the international community adopt strict limits on carbon emissions? Have there been advances in our understanding of climate change?
Slide 96 Merging machine learning and optimization Modeling the “state of the world”
» Instead of one piecewise linear value function for each resource and time period, we need one for each state of the world, and there can be thousands of these.
» We can use powerful machine learning algorithms to overcome these new curses of dimensionality.
Slide 97 Merging machine learning and optimization Strategy 1: Local polynomial regression
» Widely used in statistics
» Approximates complex functions locally using simple functions.
» The estimate of the function is a weighted sum of these local approximations.
» But it cannot handle categorical variables.
Slide 98 Merging machine learning and optimization Strategy 2: Dirichlet process mixtures of generalized linear models
Slide 99 Merging machine learning and optimization Strategy 3: Hierarchical learning models »Estimate piecewise constant functions at different levels of aggregation:
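One standard construction for this (a sketch, not necessarily the exact estimator used in this work): maintain an estimate \bar v^{(g)}_a of the function at each level of aggregation g, and combine them with weights that sum to one,

\[ \bar{v}_a = \sum_{g \in \mathcal{G}} w^{(g)}_a\, \bar{v}^{(g)}_a \]

with the weights typically chosen inversely proportional to an estimate of each level's total error (bias plus variance), so that coarse levels dominate early and fine levels dominate as more observations arrive.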
Slide 100 Merging machine learning and optimization Next steps: »We need to transition these machine learning techniques into an ADP setting: Can they be adapted to work within a linear or nonlinear optimization algorithm? All three methods are asymptotically unbiased, but this depends on unbiased observations. In an ADP algorithm, observations are biased. We need to design an effective exploration strategy so that the solution does not become stuck. »Other issues Will the methods provide fast, robust solutions for effective policy analysis?
Slide 101 © 2009 Warren B. Powell
Slide 102
Slide 103
Slide 104 Demand modeling: commercial electric demand over 7 days. © 2009 Warren B. Powell