QUICK REVIEW

[Paper Review] Reinforcement Learning-based Home Energy Management with Heterogeneous Batteries and Stochastic EV Behaviour

Meng Yuan, Ye Emma Wang|arXiv (Cornell University)|Feb 4, 2026

Electric Vehicles and Infrastructure | Citations: 0

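One-line summary

This paper proposes a constrained deep reinforcement learning framework based on Lagrangian soft actor-critic (SAC) to optimise energy management for households with heterogeneous stationary and EV batteries under stochastic EV usage. It improves cost and degradation metrics while maintaining occupant comfort.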

ABSTRACT

The widespread adoption of photovoltaic (PV), electric vehicles (EVs), and stationary energy storage systems (ESS) in households increases system complexity while simultaneously offering new opportunities for energy regulation. However, effectively coordinating these resources under uncertainties remains challenging. This paper proposes a novel home energy management framework based on deep reinforcement learning (DRL) that can jointly minimise energy expenditure and battery degradation while guaranteeing occupant comfort and EV charging requirements. Distinct from existing studies, we explicitly account for the heterogeneous degradation characteristics of stationary and EV batteries in the optimisation, alongside stochastic user behaviour regarding arrival time, departure time, and driving distance. The energy scheduling problem is formulated as a constrained Markov decision process (CMDP) and solved using a Lagrangian soft actor-critic (SAC) algorithm. This approach enables the agent to learn optimal control policies that enforce physical constraints, including indoor temperature bounds and target EV state of charge upon departure, despite stochastic uncertainties. Numerical simulations over a one-year horizon demonstrate the effectiveness of the proposed framework in satisfying physical constraints while eliminating thermal oscillations and achieving significant economic benefits. Specifically, the method reduces the cumulative operating cost substantially compared to two standard rule-based baselines while simultaneously decreasing battery degradation costs by 8.44%.

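Motivation and objectives

Minimise net grid electricity cost and battery degradation.
Keep occupants' thermal comfort within an acceptable range.
Coordinate the ESS, PV, and HVAC while meeting EV charging requirements.
Model heterogeneous battery degradation and stochastic EV behaviour.
Validate the learning framework in a Swedish household setting.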

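Proposed method

Formulates the scheduling problem as a constrained Markov decision process (CMDP).
Solves it with a Lagrangian soft actor-critic (SAC) algorithm that uses dual variables to handle the constraints.
Incorporates semi-empirical degradation models that distinguish LFP (stationary) from NMC (EV) batteries.
Models stochastic EV behaviour by fitting distributions of arrival time, departure time, and daily driving distance to Swedish travel survey data.
Uses a high-fidelity Swedish household environment to verify learning performance and constraint satisfaction.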
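
The dual-variable idea behind a Lagrangian treatment of a CMDP can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the review does not give the paper's update rule, step sizes, or cost definitions, so the class name `DualVariable`, the function `penalised_reward`, and all numeric values below are hypothetical. Each constraint (for example, indoor temperature bounds or the target EV state of charge at departure) gets a non-negative multiplier that grows while the measured constraint cost exceeds its budget and shrinks otherwise; the actor-critic then learns on the reward minus the weighted costs.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class DualVariable:
    """Lagrange multiplier for one constraint (illustrative names and values)."""
    limit: float        # assumed constraint budget d
    lr: float = 0.01    # assumed dual step size
    value: float = 0.0  # multiplier, kept non-negative

    def update(self, avg_cost: float) -> float:
        # Projected gradient ascent on the dual:
        # lambda <- max(0, lambda + lr * (avg_cost - limit))
        self.value = max(0.0, self.value + self.lr * (avg_cost - self.limit))
        return self.value


def penalised_reward(reward: float,
                     costs: Sequence[float],
                     duals: Sequence[DualVariable]) -> float:
    # Reward seen by the policy: task reward minus multiplier-weighted costs.
    return reward - sum(d.value * c for d, c in zip(duals, costs))


# Hypothetical usage with a single comfort constraint.
comfort = DualVariable(limit=0.0, lr=0.5)
comfort.update(avg_cost=2.0)                    # violation: multiplier rises
shaped = penalised_reward(-3.0, [2.0], [comfort])
comfort.update(avg_cost=-4.0)                   # slack: multiplier clipped at 0
```

In a full training loop, the multiplier update would typically run alongside the SAC actor and critic updates, using constraint costs averaged over recent episodes or sampled batches.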

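Experimental results

Research questions

RQ1: Can a CMDP solved with Lagrangian SAC jointly optimise HVAC, ESS, EV, and household appliances under stochastic EV usage?
RQ2: How do heterogeneous battery degradation models affect the HEMS control policy and cost savings?
RQ3: Can the framework maintain occupant comfort and EV charging requirements under uncertainty?
RQ4: How large are the economic gains and degradation reductions relative to rule-based baselines?

Key findings

The method substantially reduces cumulative operating cost compared with two standard rule-based baselines.
Battery degradation costs fall by 8.44%.
The framework keeps the indoor temperature within bounds and meets the target EV SOC at departure despite uncertainty.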

This review was generated by AI and checked by a human editor.