Releases: ngxson/llama.cpp

b5835

06 Jul 11:00
6491d6e
vulkan: increase LOAD_VEC_A to 8 (IQ1/IQ2) or 4 (IQ3) (#14485)

Commit taken from remyoudompheng's PR https://github.com/ggml-org/llama.cpp/pull/12260

Co-authored-by: Rémy Oudompheng <[email protected]>

b5834

06 Jul 09:08
e592be1
vulkan: fix rms_norm+mul fusion (#14545)

The fused operation was grabbing the epsilon value from the wrong place.

Add an env var to disable fusion.

Add some missing checks for supported shapes/types.

Handle fused rms_norm+mul in check_results.
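
For context on why the epsilon source matters, here is a minimal CPU-side sketch of what a fused rms_norm+mul computes for one row. It is not the Vulkan shader from this commit. The function name `rms_norm_mul` and the use of `std::vector` are assumptions made for illustration only. The point is that epsilon belongs to the normalization step, so a fused kernel has to take it from the rms_norm operation's parameters and not from the multiply.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical reference for one row: y[i] = x[i] / sqrt(mean(x^2) + eps) * w[i].
// eps is a parameter of the normalization; the multiply has no epsilon of its own.
std::vector<float> rms_norm_mul(const std::vector<float>& x,
                                const std::vector<float>& w,
                                float eps) {
    assert(!x.empty() && x.size() == w.size());

    // Accumulate the sum of squares in double to limit rounding error.
    double sum_sq = 0.0;
    for (float v : x) {
        sum_sq += static_cast<double>(v) * v;
    }

    const float mean_sq = static_cast<float>(sum_sq / static_cast<double>(x.size()));
    const float scale   = 1.0f / std::sqrt(mean_sq + eps);

    // Normalize and apply the element-wise weight in a single pass (the "fusion").
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = x[i] * scale * w[i];
    }
    return y;
}
```

A reference like this is also the kind of thing a result checker could compare a fused GPU output against, within a floating-point tolerance.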

b5833

05 Jul 07:56
a0374a6
vulkan: Handle updated FA dim2/3 definition (#14518)

* vulkan: Handle updated FA dim2/3 definition

Pack mask boolean and n_head_log2 into a single dword to keep the push
constant block under the 128B limit.

* handle null mask for gqa

* allow gqa with dim3>1
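
The packing mentioned above can be sketched as follows. Vulkan only guarantees 128 bytes of push constant space, so folding a boolean and a small integer into one 32-bit word saves a dword. The bit layout here (flag in bit 16, n_head_log2 in the low 16 bits) and the helper names are assumptions for illustration; the actual layout used by the commit may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: bit 16 = "mask present" flag, bits 0..15 = n_head_log2.
constexpr uint32_t MASK_FLAG_SHIFT  = 16;
constexpr uint32_t N_HEAD_LOG2_MASK = 0xFFFFu;

// Pack the flag and the value into a single dword for the push constant block.
inline uint32_t pack_mask_n_head_log2(bool has_mask, uint32_t n_head_log2) {
    assert(n_head_log2 <= N_HEAD_LOG2_MASK);  // value must fit in the low 16 bits
    return (static_cast<uint32_t>(has_mask) << MASK_FLAG_SHIFT) | n_head_log2;
}

// The shader side would perform the equivalent of these two unpack steps.
inline bool unpack_has_mask(uint32_t packed) {
    return ((packed >> MASK_FLAG_SHIFT) & 1u) != 0;
}

inline uint32_t unpack_n_head_log2(uint32_t packed) {
    return packed & N_HEAD_LOG2_MASK;
}
```

Two separate 4-byte fields become one, which frees 4 bytes of the push constant block for other parameters.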

b5832

05 Jul 07:47
ddef995
server : fix assistant prefilling when content is an array (#14360)

b5831

05 Jul 06:42
6681688
opencl: add GELU_ERF (#14476)

b5830

05 Jul 04:40
bac8bed
eval-callback : check for empty input (#14539)

b5829

05 Jul 04:32
b81510a
test-backend-ops: add support for specifying output format (#14368)

* test-backend-ops: add support for specifying output format

Signed-off-by: Xiaodong Ye <[email protected]>

* Address review comments

Signed-off-by: Xiaodong Ye <[email protected]>

* Add build_commit and build_number in test_result

Signed-off-by: Xiaodong Ye <[email protected]>

* Address review comments

Signed-off-by: Xiaodong Ye <[email protected]>

* refactor

Signed-off-by: Xiaodong Ye <[email protected]>

* Get build commit from ggml_commit()

Signed-off-by: Xiaodong Ye <[email protected]>

* Merge errors into test_operation_info && address review comments

Signed-off-by: Xiaodong Ye <[email protected]>

* Address review comments

Signed-off-by: Xiaodong Ye <[email protected]>

* Address review comments

Signed-off-by: Xiaodong Ye <[email protected]>

* remove visitor nonsense

* remove visitor comment

Signed-off-by: Xiaodong Ye <[email protected]>

* Address review comments

Signed-off-by: Xiaodong Ye <[email protected]>

---------

Signed-off-by: Xiaodong Ye <[email protected]>
Co-authored-by: slaren <[email protected]>

b5828

04 Jul 16:40
ef797db
metal : disable fast math in all quantize kernels (#14528)

ggml-ci

b5827

04 Jul 06:33
67d1ef2
batch : add optional for sequential equal split (#14511)

ggml-ci

b5826

04 Jul 06:40
7b50f7c
graph : prepare for 4D mask (#14515)

ggml-ci