In the first terminal, I successfully ran CUDA_VISIBLE_DEVICES=1 LMCACHE_CONFIG_FILE=example.yaml vllm serve mistral-community/Mistral-7B-v0.2 --max-model-len ...
I reproduced the clear example by following the instructions in examples/cache_controller/clear/README.md. Process: CUDA_VISIBLE_DEVICES=0 LMCACHE_CONFIG_FILE=example ...