Releases · ggml-org/llama.cpp

12 Mar 06:44

bf69cfe

b4874 Latest

vulkan: fix bug in coopmat1 mul_mat_id (#12316)

* tests: run mul_mat_id with a larger N

* vulkan: fix bug in coopmat1 mul_mat_id

Assets 26

cudart-llama-bin-win-cu11.7-x64.zip      303 MB    2025-03-12T06:44:01Z
cudart-llama-bin-win-cu12.4-x64.zip      373 MB    2025-03-12T06:44:10Z
llama-b4874-bin-macos-arm64.zip          23.9 MB   2025-03-12T06:44:20Z
llama-b4874-bin-macos-x64.zip            25.5 MB   2025-03-12T06:44:21Z
llama-b4874-bin-ubuntu-arm64.zip         26.1 MB   2025-03-12T06:44:22Z
llama-b4874-bin-ubuntu-vulkan-x64.zip    32.1 MB   2025-03-12T06:44:23Z
llama-b4874-bin-ubuntu-x64.zip           27.7 MB   2025-03-12T06:44:25Z
llama-b4874-bin-win-avx-x64.zip          16.7 MB   2025-03-12T06:44:26Z
llama-b4874-bin-win-avx2-x64.zip         16.7 MB   2025-03-12T06:44:27Z
llama-b4874-bin-win-avx512-x64.zip       16.7 MB   2025-03-12T06:44:28Z
Source code (zip)                                  2025-03-12T05:59:19Z
Source code (tar.gz)                               2025-03-12T05:59:19Z

11 Mar 20:00

github-actions

b4873

10f2e81

CUDA/HIP: refactor mmqv to unify the calculation of nwarps and rows …

Assets 26

11 Mar 14:59

github-actions

b4872

ba76543

ggml-backend : fix backend search path (#12330)

* Fix backend search path

* replace .native() with '/'

* reverted .native()

Assets 26

11 Mar 12:36

github-actions

b4871

6ab2e47

metal : Cache the Metal library at the device context level (#12265)

Assets 26

11 Mar 09:08

github-actions

b4870

96e1280

clip : bring back GPU support (#12322)

* clip : bring back GPU support

* use n_gpu_layers param

* fix double free

* ggml_backend_init_by_type

* clean up

Assets 26

10 Mar 20:10

github-actions

b4869

2c9f833

mat vec double buffer (#12188)

Assets 26

10 Mar 18:10

github-actions

b4868

2513645

musa: support new arch mp_31 and update doc (#12296)

Signed-off-by: Xiaodong Ye

Assets 26

10 Mar 17:59

github-actions

b4867

8acdacb

opencl: use OpenCL C standard supported by the device (#12221)

This patch adjusts llama.cpp so that it can run on PoCL, which
does not support OpenCL C 2.0 (CL2.0). It queries the device for
the OpenCL C versions it supports and uses the highest one
available.
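
The selection step described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the `CLCVersion` struct and both helper functions are hypothetical names, and the actual device query (for example through `clGetDeviceInfo`) is left out so that the snippet compiles without the OpenCL headers. It assumes the versions have already been collected into a vector and only shows how the highest one could be turned into a `-cl-std=` build option.

```cpp
#include <string>
#include <vector>

// Hypothetical plain struct for one OpenCL C version reported by a device.
struct CLCVersion {
    int major;
    int minor;
};

// Return the highest version in the list (assumption: list is non-empty).
inline CLCVersion highest_version(const std::vector<CLCVersion>& versions) {
    CLCVersion best = versions.front();
    for (const CLCVersion& v : versions) {
        const bool newer = v.major > best.major ||
                           (v.major == best.major && v.minor > best.minor);
        if (newer) {
            best = v;
        }
    }
    return best;
}

// Format the compiler option for the chosen version, e.g. "-cl-std=CL1.2".
inline std::string cl_std_option(const CLCVersion& v) {
    return "-cl-std=CL" + std::to_string(v.major) + "." + std::to_string(v.minor);
}
```

With a device that reports only 1.0, 1.1 and 1.2, `cl_std_option(highest_version(...))` yields `-cl-std=CL1.2` instead of a hard-coded `-cl-std=CL2.0`, which is the kind of fallback the commit message describes.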

Assets 26

10 Mar 13:04

github-actions

b4865

e128a1b

tests : fix test-quantize-fns to init the CPU backend (#12306)

ggml-ci

Assets 25

10 Mar 12:46

github-actions

b4864

6ef79a6

common : refactor '-o' option (#12278)

As discussed in PR 'llama-tts : add -o option' (#12042):

* common_params : 'out_file' string is the only output file name parameter left in common_params. It's intended to be used in all example programs implementing an '-o' option.

* cvector-generator, export-lora, imatrix : default output filenames moved from 'common_params' to the 'main()' of each example program.
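
The pattern described in the two points above might look roughly like this. It is a sketch under assumptions, not the code from the PR: `params_sketch` is a hypothetical stand-in for `common_params` that keeps only the `out_file` field, `resolve_out_file` is an invented helper, and the default filename used in the usage note is a placeholder.

```cpp
#include <string>

// Hypothetical stand-in for the shared parameter struct: one output
// file name, left empty when '-o' was not given on the command line.
struct params_sketch {
    std::string out_file;
};

// Each example program passes its own default from its main(), so the
// shared struct no longer needs per-program default filenames.
inline std::string resolve_out_file(const params_sketch& params,
                                    const std::string& program_default) {
    return params.out_file.empty() ? program_default : params.out_file;
}
```

A program's `main()` would then call something like `resolve_out_file(params, "example-output.bin")` after argument parsing, so that an explicit `-o` value wins and the program-specific default applies otherwise.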

Assets 25
