xhystos ~/llama.cpp/build/bin>./llama-bench --hf-repo unsloth/gemma-4-E4B-it-GGUF
ggml_cuda_init: found 1 ROCm devices (Total VRAM: 15609 MiB):
  Device 0: AMD Radeon 860M Graphics, gfx1150 (0x1150), VMM: no, Wave Size: 32, VRAM: 15609 MiB
| model                          |       size |     params | backend    | ngl |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| gemma4 E4B Q4_K - Medium       |   4.62 GiB |     7.52 B | ROCm       |  99 |           pp512 |       291.48 ± 49.07 |
| gemma4 E4B Q4_K - Medium       |   4.62 GiB |     7.52 B | ROCm       |  99 |           tg128 |          6.65 ± 0.48 |

build: 82764d8f4 (8770)