Setup
From [Github] - AMD Strix Halo Llama.cpp Toolboxes:
- set up the kernel to allow dynamic allocation of RAM to the GPU:
  add "amd_iommu=off amdgpu.gttsize=126976 ttm.pages_limit=32505856" to your grub kernel line:

      sudo vi /etc/default/grub

      # general: amdgpu.gttsize=GB*1024 (MiB)  ttm.pages_limit=GB*262144 (4 KiB pages)
      # e.g.:
      #   104G: amdgpu.gttsize=106496 ttm.pages_limit=27262976
      #   120G: amdgpu.gttsize=122880 ttm.pages_limit=31457280
      GRUB_CMDLINE_LINUX="rd.luks.uuid=<partition-id> rhgb quiet \
      amd_iommu=off amdgpu.gttsize=126976 ttm.pages_limit=32505856" # 124G

  run "sudo grub2-mkconfig -o /boot/grub2/grub.cfg" (without -o the config is only printed to stdout; the path shown is the usual Fedora location), then reboot to activate the new config
  - more info: [Github] - Framework-strix-halo-llm-setup
- install and run nvtop to see available RAM/VRAM
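The kernel-parameter arithmetic from the first setup step can be sketched as a small shell helper. This is only an illustration: the function names gtt_kernel_params and params_active are hypothetical and not part of the toolboxes. It assumes amdgpu.gttsize is given in MiB and ttm.pages_limit in 4 KiB pages, which matches the example values above.

```shell
#!/bin/sh
# Hypothetical helper: print the amdgpu/ttm kernel parameters
# for a desired GTT size given in GiB.
gtt_kernel_params() {
    gib="$1"
    gttsize=$(( gib * 1024 ))        # GiB -> MiB
    pages_limit=$(( gib * 262144 ))  # GiB -> 4 KiB pages (1 GiB = 262144 pages)
    echo "amdgpu.gttsize=${gttsize} ttm.pages_limit=${pages_limit}"
}

# Hypothetical check: succeed only if every computed parameter appears in the
# kernel command line. $2 is an optional file to read instead of /proc/cmdline.
params_active() {
    cmdline=$(cat "${2:-/proc/cmdline}")
    for p in $(gtt_kernel_params "$1"); do
        case " $cmdline " in
            *" $p "*) ;;          # parameter present, keep checking
            *) return 1 ;;        # parameter missing
        esac
    done
    return 0
}

# Parameters for 124 GiB, as used in the GRUB line above.
gtt_kernel_params 124

# After a reboot, check whether the running kernel picked them up.
if params_active 124; then
    echo "124G parameters are active"
else
    echo "124G parameters are not active"
fi
```

Running params_active after the reboot is a quick cross-check before looking at the actual memory figures in nvtop.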
Models
- LLM Stats
- LLM Stats - SWE-Bench Verified
- Local LLMs on Strix Halo 128GB Shared Ram: My Tests
- AMD Ryzen AI MAX+ 395 “Strix Halo” — Benchmark Grid
Qwen3.6-35B-A3B
[huggingface/unsloth] - Qwen3.6-35B-A3B-GGUF
wget https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF/resolve/main/Qwen3.6-35B-A3B-UD-Q8_K_XL.gguf
params: temp=1.0, top_p=0.95, context=200K
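A minimal sketch of how these params could be passed to llama.cpp's llama-server. Several things here are assumptions rather than something stated above: that "200K" means a context size of 200000 tokens, that the model file sits in the current directory, and that all layers should be offloaded to the GPU with -ngl 99. The script only prints the command; the last line is left commented out so nothing starts by accident.

```shell
#!/bin/sh
# Model file downloaded with the wget command above (assumed to be in the current directory).
MODEL="Qwen3.6-35B-A3B-UD-Q8_K_XL.gguf"

# Assumption: "200K" is read as a 200000-token context window.
CTX=200000

# Sampling params from the notes: temp=1.0, top_p=0.95.
# -ngl 99 (offload all layers to the GPU) is an assumption, not from the notes.
cmd="llama-server -m $MODEL --temp 1.0 --top-p 0.95 -c $CTX -ngl 99"

# Print the command for review.
echo "$cmd"

# Uncomment to actually start the server:
# $cmd
```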
links