# llama.cpp with ROCm

> [!WARNING]
> This is a technical guide and assumes a certain level of technical knowledge. If there are confusing parts or you run into issues, I recommend using a strong LLM with research/grounding and reasoning abilities (e.g., Claude Sonnet 4) to assist.

If you are looking for pre-built llama.cpp ROCm binaries, first check out:
- Lemonade's [llamacpp-rocm](https://github.com/lemonade-sdk/llamacpp-rocm) builds
- kyuz0's pre-built [AMD Strix Halo Llama.cpp Toolboxes](https://github.com/kyuz0/amd-strix-halo-toolboxes) container builds

## Building llama.cpp with ROCm

You can basically just follow the [llama.cpp build guide](https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md#hipblas):
```
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

# build w/o rocWMMA
cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1151 -DCMAKE_BUILD_TYPE=Release && cmake --build build --config Release -- -j$(nproc)

# or, really, you want to build w/ rocWMMA
cmake -B build -S . -DGGML_HIP=ON -DAMDGPU_TARGETS="gfx1151" -DGGML_HIP_ROCWMMA_FATTN=ON && time cmake --build build --config Release -j$(nproc)

# after about 2 minutes you should have a freshly baked llama.cpp in build/bin:
build/bin/llama-bench --mmap 0 -fa 1 -m /models/gguf/llama-2-7b.Q4_K_M.gguf
```
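
The commands above hard-code `gfx1151` as the GPU target. If you are not sure what your own GPU reports, a small helper along these lines can pull the target name out of `rocminfo` output. This is only a sketch: `detect_gfx_targets` is a hypothetical helper (not part of llama.cpp or ROCm), and it assumes `rocminfo` prints agent lines of the form `Name: gfx1151`. It does nothing if `rocminfo` is not on your `PATH`.

```shell
# Hypothetical helper: read rocminfo-style text on stdin and print the
# unique GPU targets found on "Name: gfx..." lines.
detect_gfx_targets() {
  awk '$1 == "Name:" && $2 ~ /^gfx/ { print $2 }' | sort -u
}

# Only query the GPU if rocminfo is actually available.
if command -v rocminfo >/dev/null 2>&1; then
  rocminfo | detect_gfx_targets
fi
```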

Of course, to build, you need some dependencies sorted.

## ROCm
You'll need ROCm installed first before you can build. For best performance you'll want to use the latest ROCm/TheRock nightlies. See: [[Guides/AI-Capabilities#rocm]]

To build, you may need to make sure your environment variables are properly set. If so, take a look at [https://github.com/lhl/strix-halo-testing/blob/main/rocm-therock-env.sh](https://github.com/lhl/strix-halo-testing/blob/main/rocm-therock-env.sh) for an example of what this might look like. Change `ROCM_PATH` to whatever your ROCm path is.
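
As a rough idea of what such an environment setup involves, here is a minimal sketch. It is not a copy of the linked script: the `/opt/rocm` default, the `bin`/`lib` layout under it, and the choice of which variables to set are all assumptions, so adjust `ROCM_PATH` to match your install.

```shell
# Assumed install location; change this to wherever your ROCm lives.
export ROCM_PATH="${ROCM_PATH:-/opt/rocm}"
# HIP_PATH is commonly pointed at the same prefix (assumption).
export HIP_PATH="${HIP_PATH:-$ROCM_PATH}"
# Let the shell find ROCm tools and the loader find ROCm shared libraries.
export PATH="$ROCM_PATH/bin:$PATH"
export LD_LIBRARY_PATH="$ROCM_PATH/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# Help CMake locate the ROCm packages at configure time.
export CMAKE_PREFIX_PATH="$ROCM_PATH${CMAKE_PREFIX_PATH:+:$CMAKE_PREFIX_PATH}"
```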

## rocWMMA
Your ROCm probably has the rocWMMA libraries installed already. If not, you'll want them in your ROCm folder. This is relatively straightforward (we only need the library installed), but you can refer to [https://github.com/lhl/strix-halo-testing/blob/main/arch-torch/02-build-rocwwma.sh](https://github.com/lhl/strix-halo-testing/blob/main/arch-torch/02-build-rocwwma.sh) for building this.

If you are using a TheRock nightly build of ROCm, you may get some errors compiling. In that case, take a look at [https://github.com/lhl/strix-halo-testing/blob/main/llm-bench/apply-rocwmma-fix.sh](https://github.com/lhl/strix-halo-testing/blob/main/llm-bench/apply-rocwmma-fix.sh) to apply the fixes necessary for a compile.
- See also: https://github.com/ggml-org/llama.cpp/pull/15239
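
To check whether rocWMMA is already present before picking a build variant, something like the following may help. It is a sketch under assumptions: `has_rocwmma` is a hypothetical helper, it assumes the header lives at `include/rocwmma/rocwmma.hpp` under your ROCm prefix, and it falls back to `/opt/rocm` if `ROCM_PATH` is unset. It only prints the configure command it would suggest rather than running it.

```shell
# Hypothetical helper: succeed if the rocWMMA header exists under the given prefix.
has_rocwmma() {
  [ -f "$1/include/rocwmma/rocwmma.hpp" ]
}

ROCM_PATH="${ROCM_PATH:-/opt/rocm}"  # assumed default; adjust to your install

# Add the rocWMMA flash-attention flag only when the header is found.
if has_rocwmma "$ROCM_PATH"; then
  EXTRA_FLAGS="-DGGML_HIP_ROCWMMA_FATTN=ON"
else
  EXTRA_FLAGS=""
fi

# Print (not run) the resulting configure command.
echo "cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx1151 $EXTRA_FLAGS"
```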