Abstract: While the integration of Multi-modal Large Language Models (MLLMs) with robotic systems has significantly improved robots’ ability to understand and execute natural language instructions, ...