Add AMX support for brute force search #1210

xtangxtang · 2025-06-04T06:07:02Z

Hi reviewers, Intel Advanced Matrix Extensions (AMX) is a set of specialized instructions designed to accelerate matrix operations, which are fundamental in many areas of modern computing such as machine learning, scientific computing, and graphics processing. AMX uses a dedicated tile-based architecture within the CPU to perform these operations more efficiently than traditional scalar or SIMD (Single Instruction, Multiple Data) methods.

During our analysis of brute force search in Knowhere, we found that AMX can speed up performance by up to 2x on the dbpedia_openai_1M dataset with the bf16 data type. All tests were performed on an Intel(R) Xeon(R) 6980P processor, using 32 physical cores and 32 threads.

sre-ci-robot · 2025-06-04T06:07:13Z

Welcome @xtangxtang! It looks like this is your first PR to zilliztech/knowhere 🎉

mergify · 2025-06-04T06:07:37Z

@xtangxtang 🔍 Important: PR Classification Needed!

For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:

If you're fixing a bug, label it as kind/bug.
For small tweaks (less than 20 lines without altering any functionality), please use kind/improvement.
Significant changes that don't modify existing functionalities should be tagged as kind/enhancement.
Adjusting APIs or changing functionality? Go with kind/feature.

For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”.

Thanks for your efforts and contribution to the community!

alexanderguzhva

cleanup required

benchmark/hdf5/benchmark_float.cpp

alexanderguzhva · 2025-06-16T14:12:46Z

src/simd/distances_amx.cc

+#if defined(USE_AMX)
+float amx_inner_product_matrix_bf16( char **floatLibraryMatrix, char  *floatQueryMatrix, uint64_t dims,uint64_t batchSizeA,
+                              uint64_t batchSizeB, float *results_ptr){
+    int DIM=32;


comments are needed for the magic numbers
also, please use constexpr

OK, we will use constexpr for the magic numbers.
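For illustration, here is a minimal sketch of what named constants could look like. It assumes that `DIM=32` refers to the number of bf16 elements that fit in one 64-byte tile row; the namespace and identifier names are hypothetical and are not taken from the PR.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical names for illustration; the PR's final identifiers may differ.
namespace amx_consts {

// Architectural limits of one AMX tile register.
constexpr std::size_t kMaxTileRows = 16;      // rows per tile
constexpr std::size_t kMaxTileRowBytes = 64;  // bytes per tile row

// bf16 values are 2 bytes wide, so one tile row holds 64 / 2 = 32 of them.
constexpr std::size_t kBf16Bytes = sizeof(std::uint16_t);
constexpr std::size_t kBf16PerTileRow = kMaxTileRowBytes / kBf16Bytes;

// A full fp32 accumulator tile holds 16 rows x 16 floats = 256 results.
constexpr std::size_t kAccumulatorElems =
    kMaxTileRows * (kMaxTileRowBytes / sizeof(float));

// Number of tile-row-sized chunks needed to cover `dims` bf16 elements.
constexpr std::size_t chunks_per_vector(std::size_t dims) {
    return (dims + kBf16PerTileRow - 1) / kBf16PerTileRow;
}

}  // namespace amx_consts

static_assert(amx_consts::kBf16PerTileRow == 32, "one tile row holds 32 bf16 values");
static_assert(amx_consts::kAccumulatorElems == 256, "16 x 16 fp32 accumulator");
```

Deriving `32` and `16*16` from the tile geometry documents where the numbers come from and lets the compiler check them.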

alexanderguzhva · 2025-06-16T14:16:24Z

src/simd/distances_amx.cc

+    float results[16*16] __attribute__((aligned(64)))={0};
+
+    if(!init_mem){
+        cfg[0]=1;


I would prefer to use defined structs, for example

struct TileConfig {
    // must be 1
    uint8_t paletteId;
    // must be 0
    uint8_t startRow;
    uint8_t reserved[14];
    // measured in bytes
    uint16_t colsb[16];
    // measured in rows
    uint8_t rows[16];
};

Good idea, we have replaced it.
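As a sketch of how the struct maps onto the raw byte array it replaces: the 64-byte layout can be checked at compile time, and `cfg[48+2] = 16` from the original diff becomes `rows[2] = 16`. The helper `make_full_tile_config` is hypothetical, and configuring every tile at full size (16 rows of 64 bytes) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct TileConfig {
    std::uint8_t paletteId;     // must be 1
    std::uint8_t startRow;      // must be 0
    std::uint8_t reserved[14];
    std::uint16_t colsb[16];    // bytes per row, one entry per tile
    std::uint8_t rows[16];      // number of rows, one entry per tile
};

// The tile configuration block is exactly 64 bytes; these checks tie the
// struct fields to the byte offsets used by the raw `cfg[...]` writes.
static_assert(sizeof(TileConfig) == 64, "tile config must be 64 bytes");
static_assert(offsetof(TileConfig, colsb) == 16, "colsb starts at byte 16");
static_assert(offsetof(TileConfig, rows) == 48, "rows starts at byte 48");

// Hypothetical helper: configure the first `num_tiles` tiles at full size.
inline TileConfig make_full_tile_config(std::size_t num_tiles) {
    assert(num_tiles <= 8);  // AMX exposes 8 tile registers (tmm0..tmm7)
    TileConfig cfg{};        // zero-initialize every field, including reserved
    cfg.paletteId = 1;
    cfg.startRow = 0;
    for (std::size_t t = 0; t < num_tiles; ++t) {
        cfg.colsb[t] = 64;   // 64 bytes per row
        cfg.rows[t] = 16;    // 16 rows
    }
    return cfg;
}
```

A pointer to such a struct can be passed wherever the code previously passed the raw `cfg` buffer, subject to the same alignment requirements.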

alexanderguzhva · 2025-06-16T14:17:34Z

src/simd/distances_amx.cc

+        cfg[48+2] = 16;
+        init_mem = true;
+
+        _tile_loadconfig((void *)cfg);


fyi, thread_local approach here leads to a missing _tile_release() instruction.

Yes, but as we understand it, the tile state is released when the thread stops or the process exits. If you have an idea of where we should release the tile config, please let us know.
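One possible direction, sketched under assumptions: an RAII guard that pairs the config load with the release. To keep the sketch compilable without AMX hardware, the intrinsics are replaced by a stand-in `FakeTileOps`; in real code its `load` and `release` would call `_tile_loadconfig` and `_tile_release`. The guard and kernel names are hypothetical.

```cpp
// Stand-in for the AMX intrinsics so the sketch builds anywhere.
// Real code would call _tile_loadconfig(cfg) and _tile_release() instead.
struct FakeTileOps {
    static int loads;
    static int releases;
    static void load(const void* /*cfg*/) { ++loads; }
    static void release() { ++releases; }
};
int FakeTileOps::loads = 0;
int FakeTileOps::releases = 0;

// Hypothetical RAII guard: loads the tile config on construction and
// releases the tile state on destruction, so the two always stay paired.
template <typename Ops>
class TileConfigGuard {
public:
    explicit TileConfigGuard(const void* cfg) { Ops::load(cfg); }
    ~TileConfigGuard() { Ops::release(); }
    TileConfigGuard(const TileConfigGuard&) = delete;
    TileConfigGuard& operator=(const TileConfigGuard&) = delete;
};

// Hypothetical kernel entry point showing the guard's scope.
inline void run_kernel_once(const void* cfg) {
    TileConfigGuard<FakeTileOps> guard(cfg);
    // ... tile loads, dot products and stores would go here ...
}   // guard's destructor releases the tile state here
```

Loading and releasing on every call has a cost, so the guard's scope is a design choice. Declaring one guard object as `thread_local` would keep the load-once behavior while tying the release to thread exit.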

alexanderguzhva · 2025-06-16T14:22:34Z

src/simd/distances_amx.cc

+      }
+    }
+    _tile_stored(2, results, batchSizeB*2*2);
+    _tile_zero(2);


I would move _tile_zero(2); one line above for(int i=0;i<blockCount;i++){
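The suggested ordering (zero the accumulator, accumulate over all blocks, then store) can be sketched with a scalar stand-in. This is not the AMX kernel: `acc` plays the role of accumulator tile 2, each loop pass plays the role of one tile dot-product step, and the function name and float inputs are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Scalar stand-in for the blocked inner-product kernel.
// `library` holds n_lib vectors of length `dims`, stored row-major.
inline std::vector<float> block_inner_products(const std::vector<float>& library,
                                               const std::vector<float>& query,
                                               std::size_t dims,
                                               std::size_t block) {
    const std::size_t n_lib = library.size() / dims;
    const std::size_t blockCount = (dims + block - 1) / block;

    std::vector<float> acc(n_lib);
    // Zero the accumulator right before accumulation starts
    // (the analogue of placing _tile_zero(2) just above the loop).
    std::fill(acc.begin(), acc.end(), 0.0f);

    for (std::size_t i = 0; i < blockCount; ++i) {
        const std::size_t begin = i * block;
        const std::size_t end = std::min(begin + block, dims);
        for (std::size_t r = 0; r < n_lib; ++r) {
            for (std::size_t d = begin; d < end; ++d) {
                acc[r] += library[r * dims + d] * query[d];
            }
        }
    }
    // Returning the accumulator is the analogue of _tile_stored(2, ...).
    return acc;
}
```

With the zeroing placed before the loop, the function does not depend on a previous call having left the accumulator clean.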

alexanderguzhva · 2025-06-16T14:31:29Z

src/simd/distances_amx_intr.h

+    return 1;
+}
+
+static void amx_bf16_mul(void *cfg, void *ma, void *mb, int64_t a_stride, int64_t b_stride, void *mc) {


alexanderguzhva · 2025-06-16T14:35:46Z

src/simd/distances_ref.cc

+                                    float& dis8, float& dis9, float& dis10, float& dis11,
+                                    float& dis12, float& dis13, float& dis14, float& dis15
+                                    ) {
+    enable_amx();


this function needs to be invoked only once per process, please alter src/simd/hook.cc appropriately
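A minimal sketch of the once-per-process pattern, assuming a function-local static is acceptable here. `enable_amx_stub` is a hypothetical stand-in for the PR's `enable_amx()`, and how `src/simd/hook.cc` would call the wrapper is not shown.

```cpp
// Counts how often the (stand-in) enable routine actually runs.
static int g_enable_calls = 0;

// Hypothetical stand-in for the PR's enable_amx(); always "succeeds" here.
inline bool enable_amx_stub() {
    ++g_enable_calls;
    return true;
}

// The function-local static is initialized exactly once, and C++11
// guarantees that this initialization is thread-safe. Every later call
// returns the cached result without invoking the enable routine again.
inline bool amx_enabled_once() {
    static const bool enabled = enable_amx_stub();
    return enabled;
}
```

The hot distance functions can then drop their own `enable_amx()` call and rely on the hook having run the one-time setup.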

alexanderguzhva · 2025-06-16T14:37:01Z

src/simd/distances_amx_intr.h

+    unsigned long bitmask = 0;
+    long status = syscall(SYS_arch_prctl, ARCH_GET_XCOMP_PERM, &bitmask);
+    if (0 != status) {
+        std::cout << "SYS_arch_prctl(READ) error" << std::endl;


please alert about the problems via the error code rather than via std::cout

if (status != 0) { return ...; }
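As a hedged sketch of what reporting through a return value could look like on Linux x86-64: the `arch_prctl` codes and the `XTILEDATA` feature number are the values defined by the Linux kernel, while the enum and function names are hypothetical. On other platforms the function simply reports that AMX is unsupported.

```cpp
#if defined(__linux__) && defined(__x86_64__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

// Hypothetical status codes; the PR may use a different error type.
enum class AmxStatus { kOk, kUnsupportedPlatform, kQueryFailed, kRequestDenied };

#if defined(__linux__) && defined(__x86_64__)
// Values from the Linux kernel's x86 arch_prctl interface.
constexpr int kArchGetXcompPerm = 0x1022;
constexpr int kArchReqXcompPerm = 0x1023;
constexpr unsigned long kXfeatureXtiledata = 18;
constexpr unsigned long kXtiledataMask = 1UL << kXfeatureXtiledata;
#endif

// Asks the kernel for permission to use AMX tile data and reports the
// outcome through the return value instead of printing to std::cout.
inline AmxStatus request_amx_permission() {
#if defined(__linux__) && defined(__x86_64__)
    unsigned long bitmask = 0;
    if (syscall(SYS_arch_prctl, kArchGetXcompPerm, &bitmask) != 0) {
        return AmxStatus::kQueryFailed;
    }
    if (bitmask & kXtiledataMask) {
        return AmxStatus::kOk;  // permission was already granted
    }
    if (syscall(SYS_arch_prctl, kArchReqXcompPerm, kXfeatureXtiledata) != 0) {
        return AmxStatus::kRequestDenied;
    }
    bitmask = 0;
    if (syscall(SYS_arch_prctl, kArchGetXcompPerm, &bitmask) != 0) {
        return AmxStatus::kQueryFailed;
    }
    return (bitmask & kXtiledataMask) ? AmxStatus::kOk : AmxStatus::kRequestDenied;
#else
    return AmxStatus::kUnsupportedPlatform;
#endif
}
```

The caller can then decide how to log or fall back, which keeps the low-level helper free of I/O.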

alexanderguzhva · 2025-06-16T14:39:03Z

src/simd/hook.cc

    std::lock_guard<std::mutex> lock(patch_bf16_mutex);
 #if defined(__x86_64__)
-    if (use_avx512 && cpu_support_avx512()) {
+    if (use_amx && cpu_support_amx()) {


if (use_amx && cpu_support_amx()) {
    if (enable_amx() != SUCCESS) {
        // fall back to avx512 and alert users
    } else {
        // use amx
    }
}

Actually, the invocation of AMX and AVX512 should be the same here, except that AMX requires an additional enable_amx call.
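The fallback logic discussed in this thread can be sketched as a small selection function. Everything here is illustrative: `Bf16Backend`, `CpuCaps` and `select_bf16_backend` are hypothetical names, and the enable step is passed in as a function pointer so the sketch runs without AMX hardware.

```cpp
// Hypothetical backend tags; the real hook assigns function pointers instead.
enum class Bf16Backend { kRef, kAvx512, kAmx };

// Hypothetical bundle of the flags consulted by the hook.
struct CpuCaps {
    bool use_amx;      // user setting
    bool has_amx;      // result of a cpu_support_amx()-style check
    bool use_avx512;   // user setting
    bool has_avx512;   // result of a cpu_support_avx512()-style check
};

// Picks AMX only when it is requested, supported, and successfully enabled.
// If enabling fails, `*amx_enable_failed` is set so the caller can alert
// users, and selection falls through to AVX512 (or the reference path).
inline Bf16Backend select_bf16_backend(const CpuCaps& caps,
                                       bool (*enable_amx)(),
                                       bool* amx_enable_failed) {
    *amx_enable_failed = false;
    if (caps.use_amx && caps.has_amx) {
        if (enable_amx()) {
            return Bf16Backend::kAmx;
        }
        *amx_enable_failed = true;
    }
    if (caps.use_avx512 && caps.has_avx512) {
        return Bf16Backend::kAvx512;
    }
    return Bf16Backend::kRef;
}
```

This keeps the AMX and AVX512 branches structurally the same, with the extra enable step as the only difference on the AMX path.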

alexanderguzhva · 2025-06-16T14:39:57Z

thirdparty/faiss/faiss/utils/distances_if.h

        typename Apply>
 void internal_bf16_vec_inner_products_ny_if(
-        const knowhere::bf16* __restrict x,
-        const knowhere::bf16* __restrict y,


why was __restrict removed?

Sorry, we have added it back.

sre-ci-robot · 2025-07-09T07:30:14Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: xtangxtang
To complete the pull request process, please assign zhengbuqian after the PR has been reviewed.
You can assign the PR to them by writing /assign @zhengbuqian in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

ruclz · 2025-07-09T07:40:10Z

Hi, I have updated the code according to your comments. Please take another look.

alexanderguzhva · 2025-08-18T22:54:03Z

/reopen

sre-ci-robot · 2025-08-18T22:54:07Z

@alexanderguzhva: Reopened this PR.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

abenmao and others added 9 commits May 20, 2025 06:44

add amx feature

8b7a994

add amx support

268ce69

Merge branch 'zilliztech:main' into main

55b6259

synch to main stream

e09389e

Merge branch 'main' of https://github.com/epeshared/knowhere

2b14dc7

merge to main stream

cd97b85

refine amx code format

767e026

remove amx prefetch and amx 32 batch

8033520

remove bf16_vec_inner_product_batch_4_ref_amx

a4a8262

sre-ci-robot requested review from alexanderguzhva and liliu-z June 4, 2025 06:07

sre-ci-robot added the size/XL label Jun 4, 2025

mergify bot added the needs-dco label Jun 4, 2025

mergify bot added the do-not-merge/missing-related-issue label Jun 4, 2025

xtangxtang changed the title ~~Add amx support for brute force search~~ Add AMX support for brute force search Jun 4, 2025

alexanderguzhva reviewed Jun 16, 2025

ruclz added 2 commits July 9, 2025 15:28

update code according the reviewer's comment

79d7e4c

update code according the reviewer's comment

7aafffd

xtangxtang requested a review from alexanderguzhva July 10, 2025 05:44

github-actions bot added the stale label Aug 10, 2025

github-actions bot closed this Aug 18, 2025

sre-ci-robot reopened this Aug 18, 2025

github-actions bot removed the stale label Aug 19, 2025

github-actions bot added the stale label Sep 19, 2025
