notetoself - AI Framework

2026-04-19

Setup

From [Github] - AMD Strix Halo Llama.cpp Toolboxes:

set up the kernel to allow dynamic allocation of RAM to the GPU:

add amd_iommu=off amdgpu.gttsize=126976 ttm.pages_limit=32505856 to your grub kernel line:

sudo vi /etc/default/grub

# general (GB = target size in GiB): amdgpu.gttsize=GB*1024 (MiB), ttm.pages_limit=GB*262144 (4 KiB pages)
# e.g.:
# 104G: amdgpu.gttsize=106496 ttm.pages_limit=27262976
# 120G: amdgpu.gttsize=122880 ttm.pages_limit=31457280
GRUB_CMDLINE_LINUX="rd.luks.uuid=<partition-id> rhgb quiet \
    amd_iommu=off amdgpu.gttsize=126976 ttm.pages_limit=32505856" # 124G

run sudo grub2-mkconfig -o /boot/grub2/grub.cfg to write the new config (path as used on Fedora; adjust for your distro), then reboot

more info: [Github] - Framework-strix-halo-llm-setup

install and run nvtop to see available RAM/VRAM
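
The sizing rule behind the example values above (104G, 120G, 124G) can be checked with a few lines of shell. This is a minimal sketch, not part of the referenced toolboxes: `gtt_params` is a hypothetical helper name, and it assumes the target size is given in whole GiB and that the kernel uses 4 KiB pages (262144 pages per GiB).

```bash
#!/bin/sh
# Hypothetical helper: print the amdgpu/ttm kernel parameters for a GTT size
# given in whole GiB. Assumes 4 KiB pages (1 GiB = 262144 pages).
gtt_params() {
    gtt_gib=$1
    # Reject empty input, non-digits, and leading zeros.
    case "$gtt_gib" in
        ''|*[!0-9]*|0*)
            echo "usage: gtt_params <size-in-GiB>" >&2
            return 1
            ;;
    esac
    gtt_mib=$(( gtt_gib * 1024 ))      # amdgpu.gttsize is given in MiB
    gtt_pages=$(( gtt_gib * 262144 ))  # ttm.pages_limit is given in pages
    echo "amdgpu.gttsize=${gtt_mib} ttm.pages_limit=${gtt_pages}"
}

# Reproduces the 124G values used in the grub line above.
gtt_params 124
```

After rebooting, `cat /proc/cmdline` should show the same two parameters if the new grub config was picked up.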

Models

Qwen3.6-35B-A3B

[huggingface/unsloth] - Qwen3.6-35B-A3B-GGUF

wget https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF/resolve/main/Qwen3.6-35B-A3B-UD-Q8_K_XL.gguf

params: temp=1.0, top_p=0.95, context=200K
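
One way to apply these parameters, as a hedged sketch: the snippet below assembles a `llama-server` command line (from llama.cpp) using its `-m`, `--ctx-size`, `--temp`, `--top-p`, and `-ngl` options. Assumptions not in the notes: the 200K figure is taken as a context size of 200000 tokens, the model file sits in the current directory, all layers are offloaded to the GPU with `-ngl 99`, and `llama_cmd` is a hypothetical helper name. It only prints the command, so it can be inspected before running.

```bash
#!/bin/sh
# Hypothetical helper: print a llama-server command line for a GGUF model.
# Sampling values come from the notes above; reading "200K" as 200000 tokens
# and offloading all layers (-ngl 99) are assumptions.
llama_cmd() {
    llama_model=${1:-Qwen3.6-35B-A3B-UD-Q8_K_XL.gguf}
    llama_ctx=${2:-200000}
    echo "llama-server -m ${llama_model} --ctx-size ${llama_ctx} --temp 1.0 --top-p 0.95 -ngl 99"
}

llama_cmd
```

Run the printed line as-is once it looks right; pass a different model path or context size as the first and second argument.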

links

Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7

Page source

## Setup

From [\[Github\] - AMD Strix Halo Llama.cpp Toolboxes](https://github.com/kyuz0/amd-strix-halo-toolboxes):

* set up the kernel to allow dynamic allocation of RAM to the GPU:

  add `amd_iommu=off amdgpu.gttsize=126976 ttm.pages_limit=32505856` to your grub kernel line:

  `sudo vi /etc/default/grub`

  ```bash
  # general (GB = target size in GiB): amdgpu.gttsize=GB*1024 (MiB), ttm.pages_limit=GB*262144 (4 KiB pages)
  # e.g.:
  # 104G: amdgpu.gttsize=106496 ttm.pages_limit=27262976
  # 120G: amdgpu.gttsize=122880 ttm.pages_limit=31457280
  GRUB_CMDLINE_LINUX="rd.luks.uuid=<partition-id> rhgb quiet \
      amd_iommu=off amdgpu.gttsize=126976 ttm.pages_limit=32505856" # 124G
  ```

  run `sudo grub2-mkconfig -o /boot/grub2/grub.cfg` to write the new config (path as used on Fedora; adjust for your distro), then reboot

* more info: [\[Github\] - Framework-strix-halo-llm-setup](https://github.com/Gygeek/Framework-strix-halo-llm-setup)
* install and run `nvtop` to see available RAM/VRAM

## Models

* [LLM Stats](https://llm-stats.com/benchmarks)
* [LLM Stats - SWE-Bench Verified](https://llm-stats.com/benchmarks/swe-bench-verified)
* [Local LLMs on Strix Halo 128GB Shared Ram: My Tests](https://blog.t1m.me/blog/local-llms-on-strix-halo-128gb-shared-ram)
* [AMD Ryzen AI MAX+ 395 “Strix Halo” — Benchmark Grid](https://kyuz0.github.io/amd-strix-halo-toolboxes/)

### Qwen3.6-35B-A3B

[\[huggingface/unsloth\] - Qwen3.6-35B-A3B-GGUF](https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF)

```bash
wget https://huggingface.co/unsloth/Qwen3.6-35B-A3B-GGUF/resolve/main/Qwen3.6-35B-A3B-UD-Q8_K_XL.gguf
```

**params**: temp=1.0, top_p=0.95, context=200K

**links**

* [Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7](https://simonwillison.net/2026/Apr/16/qwen-beats-opus/)