If anyone is interested, here's how I run SGLang on the AMD RX 7900 XTX (gfx1100) with ROCm 6.2.4. Currently, the attention backend is based on Triton. It seems that FlashInfer support is under ...