Google Colab is a free online tool from Google that lets you write and run Python code directly in your browser.
We are using https://huggingface.co/nvidia/Llama-3.1-405B-Instruct-FP4. You are also welcome to try nvidia/Llama-3.3-70B-Instruct-FP4; the issue is the same. python ...
xformers attention in VAE
VAE load device: cuda:0, offload device: cpu, dtype: torch.float32
CLIP/text encoder model load device: cuda:0, offload device: cpu, current ...