Abstract: In the optimal dispatch (OD) of multiple integrated energy systems (MIES), purely data-driven reinforcement learning (RL) methods often encounter challenges such as transient data boundaries, ...