Optimal Battery Storage Scheduling for Grid Stabilization: A Reinforcement Learning Approach with Real-Time Price Signals
Authors: Grace Park*, Henry Liu, Iris Wang
Abstract
Energy grids face increasing variability from renewable sources (solar, wind), creating demand for flexible storage resources. Battery energy storage systems (BESS) can provide grid services such as peak shaving, load leveling, and frequency regulation, but only if their charging/discharging schedules are well optimized. Traditional optimization assumes perfect forecasts; real-world scheduling must adapt to uncertain renewable generation and time-varying electricity prices. This study develops a reinforcement learning (RL) framework for real-time battery scheduling that maximizes revenue while maintaining grid stability. We train deep Q-networks (DQN) and actor-critic methods on realistic grid simulations built from 1-hour-resolution CAISO data, incorporating solar/wind variability, demand profiles, wholesale prices, and ancillary service prices. The agent's state representation comprises (1) the current battery state of charge (SOC), (2) 4-hour-ahead price forecasts, (3) renewable generation forecast uncertainty, and (4) frequency deviation from the nominal 60 Hz. The action space is charge/discharge power in 50 kW increments (-200 to +200 kW for a 1 MWh battery), subject to efficiency losses (90%), degradation costs, and ramp-rate constraints. Simulations spanning 2 years (730 days) compare the agent against (1) rule-based heuristics (charge off-peak, discharge on-peak), (2) day-ahead optimization assuming perfect forecasts, and (3) myopic greedy scheduling. RL achieves 15-25% higher revenue than the rule-based baselines and 5-10% more than day-ahead optimization despite imperfect forecasts. RL's adaptive advantage grows with renewable penetration (the gain rises from 20% to 40% under high wind/solar). Under frequency disturbances such as sudden generator outages, RL responds in roughly 100 ms versus 5 s for the rule-based scheme, helping prevent cascading blackouts. Transfer learning enables rapid deployment: pretraining on CAISO data transfers to other ISO grids at 80-90% efficiency.
Multi-agent simulations show that RL-scheduled batteries reduce grid-wide costs by 8-12% while improving frequency-stability metrics, and real-world deployment on 2-5 MW BESS systems shows a sustained 12-18% revenue improvement over one year of operation. This work demonstrates that learned, adaptive battery scheduling delivers substantial grid and economic benefits beyond traditional optimization.
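To make the scheduling problem concrete, the following is a minimal sketch of the hourly battery-arbitrage environment the abstract describes (SOC state with a 4-hour price window, ±200 kW actions in 50 kW steps, 90% charging efficiency, a degradation charge), together with the rule-based charge-off-peak/discharge-on-peak baseline. All class and function names, the price thresholds, and the toy degradation cost are illustrative assumptions, not the authors' implementation.

```python
# Toy sketch of the battery-scheduling MDP from the abstract.
# Names, thresholds, and the degradation cost are assumptions for illustration.

CAPACITY_KWH = 1000.0                            # 1 MWh battery
ACTIONS_KW = list(range(-200, 201, 50))          # charge (+) / discharge (-) in 50 kW steps
EFFICIENCY = 0.90                                # loss applied when charging
DEGRADATION_COST = 0.01                          # $/kWh throughput (placeholder value)

class BatteryEnv:
    """Minimal hourly battery-arbitrage environment (sketch, not the paper's code)."""

    def __init__(self, prices):
        self.prices = prices                     # $/kWh wholesale price per hour
        self.t = 0
        self.soc = 0.5 * CAPACITY_KWH            # start half full

    def state(self):
        # SOC fraction plus a 4-hour-ahead price window, per the abstract's state space
        window = list(self.prices[self.t:self.t + 4])
        last = window[-1] if window else self.prices[-1]
        window += [last] * (4 - len(window))     # pad near the horizon end
        return (self.soc / CAPACITY_KWH, tuple(window))

    def step(self, power_kw):
        """Apply one hour of charging (+) or discharging (-); return (state, revenue, done)."""
        assert power_kw in ACTIONS_KW
        energy = float(power_kw)                 # kWh over a 1-hour step
        if energy > 0:                           # charging: buy energy, store with losses
            energy = min(energy, CAPACITY_KWH - self.soc)
            self.soc += energy * EFFICIENCY
        else:                                    # discharging: sell stored energy
            energy = max(energy, -self.soc)
            self.soc += energy
        revenue = -energy * self.prices[self.t]  # pay when buying, earn when selling
        revenue -= abs(energy) * DEGRADATION_COST
        self.t += 1
        return self.state(), revenue, self.t >= len(self.prices)

def rule_based_policy(env):
    """Baseline from the abstract: charge when the price is low, discharge when high."""
    price = env.prices[env.t]
    if price < 0.05:
        return 200
    if price > 0.15:
        return -200
    return 0
```

A usage example under this sketch: running the rule-based policy over a toy off-peak/on-peak price series accumulates positive arbitrage revenue, giving a floor that the learned policy would need to beat.

```python
prices = ([0.03] * 6 + [0.20] * 6) * 2           # two toy off-peak/on-peak cycles
env = BatteryEnv(prices)
total, done = 0.0, False
while not done:
    _, revenue, done = env.step(rule_based_policy(env))
    total += revenue
print(f"rule-based revenue: ${total:.2f}")
```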