Amr Ayad, Ziad Shawwash, and Alaa Abdalla

A Multi-agent Reinforcement Learning Approach to Develop the Water Value Function for Multireservoir Hydroelectric Systems
Optimization Days, Montreal, Energy and Environment Session TB5, May 8th, 2012

Introduction
- The purpose of MARLOMMR is to establish the marginal value of water and the value of water in storage for multireservoir hydroelectric power systems, as well as the optimal policy (releases).
- The new algorithm uses the multi-agent reinforcement learning (MARL) technique to build a long/medium-term reservoir operation optimization model.
- To validate the new model, a stochastic dynamic programming algorithm (SDPOM2R) was developed and will be used to benchmark the MARLOMMR model. SDPOM2R will be used to validate other models as well.
* MARLOMMR: Multi-agent Reinforcement Learning Optimization Model for Multiple Reservoirs
* SDPOM2R: Stochastic Dynamic Programming Optimization Model for 2 Reservoirs
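A minimal sketch of how a stochastic dynamic programming benchmark can produce a water value table. It assumes a single reservoir with integer storage levels, a release chosen before the inflow is known, revenue proportional to the release, spill above the top level, and a zero terminal value. None of these assumptions comes from the slides, the function names are hypothetical, and this is not SDPOM2R itself. The marginal value of water is approximated as the difference between adjacent storage levels.

```python
def water_value_table(n_levels, horizon, inflows, price, max_release):
    """Backward recursion; returns value[t][s] for stages t = 0..horizon.

    inflows: list of (inflow_units, probability) pairs (assumed scenario set).
    price:   assumed energy price per unit released, one entry per stage.
    """
    value = [[0.0] * n_levels for _ in range(horizon + 1)]  # terminal value is zero
    for t in range(horizon - 1, -1, -1):
        for s in range(n_levels):
            best = float("-inf")
            # The release cannot exceed current storage or the release cap.
            for u in range(min(s, max_release) + 1):
                expected = 0.0
                for q, prob in inflows:
                    s_next = min(s - u + q, n_levels - 1)  # excess water spills
                    expected += prob * (price[t] * u + value[t + 1][s_next])
                best = max(best, expected)
            value[t][s] = best
    return value


def marginal_values(value_row):
    """Finite-difference marginal value of one extra unit of storage."""
    return [value_row[s + 1] - value_row[s] for s in range(len(value_row) - 1)]


if __name__ == "__main__":
    table = water_value_table(
        n_levels=3, horizon=2, inflows=[(0, 0.5), (1, 0.5)],
        price=[1.0, 3.0], max_release=2,
    )
    print(table[0], marginal_values(table[0]))
```

In this toy setting the second-stage price is higher, so the first-stage values reward holding water back; the marginal value drops near the top level because extra inflow would spill.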

OUTLINE OF THE PRESENTATION:
- BC Hydro System
- Research Project
- MARL Technique
- Problem Definition
- Main Model: MARLOMMR
- Conclusions

BC Hydro System
- Commercial Crown corporation owned by the Province of British Columbia
- Serves approximately 95% of the province’s population and approximately 1.8 million customers
- Clean or renewable generation accounts for 90% of total supply
- Responsible for reliably generating between 42,000 and 52,000 GWh of electricity per year
- Peak load ~ 11,000 MW
- Transmission network of over 18,500 kilometres and 57,000 kilometres of distribution lines
- Among the lowest electricity rates in North America

BC Hydro System
- 61 dams
- 37 hydroelectric stations (10,500 MW)
- Peace River system provides 34% of the energy requirement
- Columbia River system provides 31% of the energy requirement
- 1 gas-fired thermal plant (912 MW)
- 3 combustion turbine plants (110 MW)
- Many run-of-river (ROR), biomass, etc., ~ 1450 MW (soon more ROR)
- Wind, 222 MW (soon ~ 740 MW+)
- 100+ generating units

BC Hydro System

Research Project: “Water Value Capital Project” at BC Hydro
Amr Ayad, Ph.D. Student, September-11-18
- Jointly funded by NSERC and BC Hydro.
- Principal Investigator is Prof. Ziad Shawwash.
- The main purpose is to create, compare, and test several models that use different techniques, in order to determine the best model or models for assigning the value of water in storage, especially for the large multi-year storage reservoirs. This value is used as a planning and decision-making tool.
- The work will not start entirely from scratch.

Research Project: “Water Value Capital Project” at BC Hydro
Other than determining the water value and the marginal value of water, the focus is on:
- Deriving the optimal operation policy for the planning horizons
- Forecasting expected revenue, energy, and market transactions
- Capturing more of the system complexity
- Better representing the stochasticity/uncertainty involved
- Incorporating the CRT flood constraints and others

Multi-agent Reinforcement Learning Technique: Background
- Busoniu et al. (2008) define a multi-agent system as “a group of autonomous, interacting entities sharing a common environment, which they perceive with sensors and upon which they act with actuators”.
- MARL can be regarded as a fusion of temporal-difference reinforcement learning, game theory, and more general direct policy search techniques.
- The main issues with MARL are:
  - The stability of the agents’ learning dynamics
  - The type of interaction between the agents
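For readers unfamiliar with the temporal-difference component, the standard tabular Q-learning update can be sketched as follows. The slides do not state which update rule the authors use, so this is only an illustration; the state and action labels, the learning rate `alpha`, and the discount `gamma` are all assumed.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning update in place and return the TD error.

    q is a dict mapping (state, action) to a value; missing entries count as 0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    td_error = reward + gamma * best_next - old  # temporal-difference error
    q[(state, action)] = old + alpha * td_error
    return td_error


if __name__ == "__main__":
    q_table = {}
    actions = ["hold", "release"]  # hypothetical action labels
    q_update(q_table, "low", "release", 10.0, "low", actions)
    print(q_table)
```

In a multi-agent setting each agent could run an update like this on its own table, which is where the stability questions listed above come from: the other agents change the environment each agent is learning about.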

MARL Technique: Tasks
- Fully cooperative: requires proper coordination and tie-breaking in the joint-action value function
  - Indirect coordination: agents are indirectly guided toward biased actions that maximize the common return
  - Coordination-based methods: the global Q-function is decomposed
- Fully competitive: unlike the nature of our problem
- Mixed task: for the stateless cases
  - Agent-independent methods: each agent has its own Q-table and needs to replicate the other agents’ tables
  - Agent tracking: a module to track the other agents
  - Agent-aware methods: adapt by heuristics
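One way to picture a decomposed global Q-function with tie-breaking is sketched below. It assumes an additive form (one local term per agent plus an optional coupling term on the joint action) and a brute-force search over joint actions. The slides do not specify the decomposition, so the additive form, the names, and the numbers are all hypothetical.

```python
from itertools import product


def best_joint_action(local_q, coupling_q, actions):
    """Pick the joint action maximizing an additively decomposed global Q.

    local_q[i][a]      : agent i's local value for action a (assumed form)
    coupling_q[joint]  : optional correction for a specific joint action
    Ties go to the first joint action in enumeration order, which acts as a
    fixed convention shared by all agents.
    """
    def global_q(joint):
        local_sum = sum(local_q[i][a] for i, a in enumerate(joint))
        return local_sum + coupling_q.get(joint, 0.0)

    return max(product(actions, repeat=len(local_q)), key=global_q)


if __name__ == "__main__":
    actions = ["hold", "release"]
    local_q = [{"hold": 1.0, "release": 2.0}, {"hold": 1.5, "release": 1.0}]
    coupling = {("release", "release"): -3.0}  # e.g. a penalty when both release
    print(best_joint_action(local_q, coupling, actions))
```

The brute-force search is only practical for a handful of agents, since the number of joint actions grows exponentially with the number of agents.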

MARL Technique: Related Issues
- Benefits
- Learning in MARL
- Reward allocation
- Communication of multiple agents
- Centralized vs. decentralized control in MARL
- Function approximation
Benefits
- Speedup/efficiency of the computation process
- Sharing of experience between agents
- Unlike single-agent RL, the failure of one agent in its task does not mean failure of the optimization

MARL Technique: Challenges
- In single-agent RL the number of state-action pairs, and with it the computational complexity, grows exponentially with the number of state and action variables; the same issue exists in MARL.
- It is difficult to define a well-structured goal for multiple agents, because the optimization cannot be performed without taking the correlation of the agents’ returns into account.
- As the agents learn simultaneously, each agent has to follow the other agents’ non-stationary behavior.
- The scalability of the algorithm to realistic problem sizes is a concern, as it is in single-agent RL.
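The growth described above can be made concrete with a little arithmetic. The sketch assumes 5 reservoirs, each with 10 discrete storage levels and 5 release levels; the discretization sizes are made up for illustration. It compares one joint Q-table against one small table per agent.

```python
def joint_table_size(n_agents, states_per_agent, actions_per_agent):
    """Entries in a single Q-table over the joint state and joint action."""
    return (states_per_agent * actions_per_agent) ** n_agents


def independent_tables_size(n_agents, states_per_agent, actions_per_agent):
    """Total entries when each agent keeps a table over its own state and action."""
    return n_agents * states_per_agent * actions_per_agent


if __name__ == "__main__":
    # Assumed discretization: 10 storage levels and 5 release levels per reservoir.
    print(joint_table_size(5, 10, 5))         # 312500000
    print(independent_tables_size(5, 10, 5))  # 250
```

The independent tables are far smaller, but each agent then sees only part of the system, which is exactly where the coordination and non-stationarity challenges come in.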

MARL Technique: Challenges
- The exploration/exploitation balance problem is even harder in MARL than in single-agent RL.
- Convergence to a strategy regardless of what the other agents are doing.
- Rationality and best response to the other agents’ behavior.
- Q-tables for the multiple agents need storage; otherwise, function approximation must be used.
- To date, this technique has been applied in areas such as AI, game theory, robotics, ITS, and others, but not to a problem similar to the one at hand.
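A common way to handle the exploration/exploitation balance is epsilon-greedy action selection, sketched below for one agent. The slides do not say which exploration scheme the authors use, so this is an assumed, generic choice, and the action labels are hypothetical.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Choose an action from q_row (a dict mapping action to value).

    With probability epsilon a uniformly random action is explored;
    otherwise the highest-valued action is exploited. Actions are sorted
    first so that ties and random draws do not depend on dict order.
    """
    candidates = sorted(q_row)
    if rng.random() < epsilon:
        return rng.choice(candidates)
    return max(candidates, key=q_row.get)


if __name__ == "__main__":
    rng = random.Random(0)
    row = {"hold": 1.0, "release": 2.5}
    print([epsilon_greedy(row, 0.2, rng) for _ in range(5)])
```

With several agents exploring at once, a random action by one agent changes the rewards the others observe, which is one reason the balance is harder in MARL.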

Problem Definition
Large and complex system:
- Reservoirs and plants
- State space
- Decision space
- Planning horizon
- Main constraints
- Objective function
- Stochastic variables: implementation and representation (uncertainty)
Main constraints:
- Maximum and minimum limits on turbine flow
- Maximum and minimum limits on total plant discharge
- Trade limits on exports and imports (transmission limits)
- Maximum and minimum limits on generation
- Environmental and other non-power constraints
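The plant-level limits listed above can be sketched as a simple feasibility check. The class, field names, units, and numbers are hypothetical; trade limits and non-power constraints are left out because the slides give no detail on how they are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlantLimits:
    turbine_flow: tuple     # (min, max) flow through the turbines, assumed m3/s
    plant_discharge: tuple  # (min, max) turbine flow plus spill, assumed m3/s
    generation: tuple       # (min, max) plant output, assumed MW


def violated_constraints(limits, turbine_flow, spill, generation):
    """Return the names of the limits that a candidate decision violates."""
    checks = {
        "turbine_flow": (turbine_flow, limits.turbine_flow),
        "plant_discharge": (turbine_flow + spill, limits.plant_discharge),
        "generation": (generation, limits.generation),
    }
    return [
        name
        for name, (value, (low, high)) in checks.items()
        if not low <= value <= high
    ]


if __name__ == "__main__":
    limits = PlantLimits((0, 100), (20, 150), (0, 500))  # made-up numbers
    print(violated_constraints(limits, 120, 40, 300))
```

In a learning setting a check like this could either filter the actions an agent is allowed to take or feed a penalty into the reward; the slides do not say which approach is used.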

Main Model: MARLOMMR
Decomposition: Dantzig and Wolfe

Main Model: MARLOMMR
- Each plant is represented by a decentralized single agent.
- Plants/agents are divided into groups according to the river system they are on. For example, MCA, REV, and ARD will be in one group, GMS and PCN will be in another group, and so on.
- Plants that are not in any group might be added to an existing group or represented by single agents.
- All the plants in each group (river system) are considered neighbors and will have a level of indirect communication with each other.
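The grouping rule above can be sketched as a small data structure. The plant codes come from the slide; assigning the first group to the Columbia system and the second to the Peace system is an assumption made here for illustration, and the function names are hypothetical.

```python
from collections import defaultdict

# Plant codes from the slide; the river labels are an assumption.
PLANT_RIVER = {
    "MCA": "Columbia",
    "REV": "Columbia",
    "ARD": "Columbia",
    "GMS": "Peace",
    "PCN": "Peace",
}


def build_groups(plant_river):
    """Group plant agents by river system."""
    groups = defaultdict(list)
    for plant, river in plant_river.items():
        groups[river].append(plant)
    return dict(groups)


def neighbors(plant, plant_river):
    """Plants on the same river system, excluding the plant itself."""
    river = plant_river[plant]
    return [p for p, r in plant_river.items() if r == river and p != plant]


if __name__ == "__main__":
    print(build_groups(PLANT_RIVER))
    print(neighbors("REV", PLANT_RIVER))
```

A plant outside any river system could be handled by mapping it to its own one-member group, which matches the "represented by single agents" option.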

Main Model: MARLOMMR
Each group (squad) communicates with the others in one of two ways: through direct communication between individual agents in different groups, or through a central agent module for each group that communicates on the group’s behalf with the corresponding central agents of the other groups.

Main Model: MARLOMMR
- The environment could be GOM or another model, depending on the final structure of the problem.
- Decomposition (Dantzig and Wolfe) and parallel processing are ideas under consideration.
- Focus areas are automating the learning parameters, the stability and dynamics of learning, and using the most efficient state-space discretization.
- Start off with one stochastic variable and 5 reservoirs.
* GOM: Generalized Optimization Model
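As an illustration of state-space discretization, the sketch below maps a continuous storage volume to a bin index that a Q-table could use as its state. The bin edges and units are invented, and uniform bins are only one possible choice; the slides leave the discretization open.

```python
from bisect import bisect_right


def storage_bin(storage, edges):
    """Return the index of the bin containing storage.

    edges is an ascending list of bin boundaries, so there are
    len(edges) - 1 bins. Values outside the range are clamped to the
    first or last bin.
    """
    clamped = min(max(storage, edges[0]), edges[-1])
    return min(bisect_right(edges, clamped) - 1, len(edges) - 2)


if __name__ == "__main__":
    edges = [0, 10, 20, 30]  # hypothetical storage boundaries
    print([storage_bin(v, edges) for v in (-5, 5, 10, 29.9, 30, 35)])
```

Finer bins give a better approximation of the value of water in storage but enlarge every agent's table, so the choice of edges trades accuracy against learning time.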

Conclusions
- A stochastic dynamic programming model (SDPOM2R) was developed to handle the two-reservoir problem (GMS and MCA); it has also been tried successfully with three reservoirs.
- The MARL model is currently under development.
- It is considered an extension of the currently applicable models such as RLROM (by Abdalla).
- The first version is expected this month (May).
- Still, there are challenges in applying the MARL technique.
* RLROM: Reinforcement Learning Reservoir Optimization Model

QUESTIONS?