Conversation

p-debski2
Contributor

Summary

  • imported GGML functions into nntr_ggml_impl directory and used those instead of the actual GGML in the ggml_interface functions
  • removed GGML includes from ggml_interface

Signed-off-by: p-debski2 [email protected]

@jijoongmoon
Collaborator

I think it is OK to include the code to remove the ggml submodule.

Contributor

@gkisalapl gkisalapl left a comment

Please squash this into a single-commit PR.

@p-debski2 p-debski2 force-pushed the work/ggml-integration branch from b95b19b to c2e18b2 Compare September 4, 2025 08:42
@p-debski2 p-debski2 force-pushed the work/ggml-integration branch 2 times, most recently from 9ad293e to af4b4e0 Compare September 4, 2025 12:02
Contributor

@djeong20 djeong20 left a comment

This patch would not work correctly on all platforms.
For instance, you need to modify the Android.mk files under jni and test/jni, which still contain ggml dependencies.

For example:

MESON_HAS_GGML := @MESON_HAS_GGML@
ifeq ($(MESON_HAS_GGML),1)
LOCAL_MODULE := ggml
LOCAL_SRC_FILES := @MESON_GGML_ROOT@/src/ggml-backend.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/ggml-cpu-hbm.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/unary-ops.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/vec.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/ggml-cpu-traits.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/llamafile/sgemm.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/ops.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/amx/mmq.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/amx/amx.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/binary-ops.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/ggml-cpu-aarch64.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/cpu-feats-x86.cpp \
@MESON_GGML_ROOT@/src/ggml-backend-reg.cpp \
@MESON_GGML_ROOT@/src/ggml-opt.cpp \
@MESON_GGML_ROOT@/src/gguf.cpp \
@MESON_GGML_ROOT@/src/ggml-threading.cpp \
@MESON_GGML_ROOT@/src/ggml-alloc.c \
@MESON_GGML_ROOT@/src/ggml-quants.c \
@MESON_GGML_ROOT@/src/ggml-cpu/ggml-cpu.cpp \
@MESON_GGML_ROOT@/src/ggml-cpu/ggml-cpu_c.c \
@MESON_GGML_ROOT@/src/ggml-cpu/ggml-cpu-quants.c \
@MESON_GGML_ROOT@/src/ggml.c
LOCAL_CXXFLAGS += -std=c++17 -O3 -fexceptions
LOCAL_C_INCLUDES := @MESON_GGML_ROOT@/include \
@MESON_GGML_ROOT@/src \
@MESON_GGML_ROOT@/src/ggml-cpu
LOCAL_EXPORT_C_INCLUDES := $(LOCAL_C_INCLUDES)
include $(BUILD_SHARED_LIBRARY)
endif # MESON_HAS_GGML

include $(CLEAR_VARS)
LOCAL_MODULE := ggml
LOCAL_SRC_FILES := $(NNTRAINER_ROOT)/builddir/jni/$(TARGET_ARCH_ABI)//libggml.so
include $(PREBUILT_SHARED_LIBRARY)

@djeong20
Contributor

djeong20 commented Sep 4, 2025

Additionally, I don't think the enable_ggml option is needed anymore. Let's remove the meson option and the ENABLE_GGML macro as well.
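
As a rough illustration of what dropping the ENABLE_GGML macro means at a call site, here is a minimal sketch. The function names and returned strings are hypothetical and are not taken from the nntrainer sources; the only assumption carried over from this thread is that code guarded by ENABLE_GGML becomes unconditional once the functions live in nntr_ggml_impl.

```cpp
#include <string>

// Hypothetical "before" shape: the path is chosen at compile time,
// depending on whether the build system defined ENABLE_GGML.
std::string sketch_backend_before() {
#ifdef ENABLE_GGML
  return "ggml";      // external GGML submodule was available
#else
  return "fallback";  // build without GGML
#endif
}

// Hypothetical "after" shape: the needed functions are bundled in-tree,
// so there is no option to test and no guard to maintain.
std::string sketch_backend_after() {
  return "nntr_ggml_impl";
}
```

With the guard gone, every platform build compiles the same path, which is why the build option and the macro can be removed together.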

@@ -442,11 +436,6 @@ ln -sf %{_libdir}/pkgconfig/capi-nnstreamer.pc %{_libdir}/pkgconfig/capi-ml-comm
# Setup Ruy
tar -xf packaging/ruy.tar.gz -C subprojects

# Setup GGML
%if 0%{?enable_ggml}
tar -xf packaging/ggml.tar.gz -C subprojects
Contributor

Please delete the packaging/ggml.tar.gz file as well.

Removed GGML as a dependency and introduced the 'nntr_ggml_impl' directory with implementation of necessary functions

Signed-off-by: p-debski2 <[email protected]>
@p-debski2 p-debski2 force-pushed the work/ggml-integration branch 5 times, most recently from 065a679 to d175017 Compare September 5, 2025 13:08
@p-debski2
Contributor Author

p-debski2 commented Sep 5, 2025

All PR comments have been addressed, and I believe it should work correctly for all platforms now.

However, I keep having trouble with the Ubuntu Meson tests on CI after removing the enable-ggml flag. The build succeeds, but when the unit tests run, the cpu-backend test always times out.

Locally it works fine for me, and other builds with similar configurations also pass the unit test phase (see Ubuntu pdebuild & Ubuntu Meson with Clang). It's also concerning that in the workflows that pass CI this test usually finishes in ~20s, while here it hits the 60s timeout.

@p-debski2 p-debski2 force-pushed the work/ggml-integration branch from d175017 to 0e88783 Compare September 8, 2025 07:15
Removed the enable-ggml option from Meson and use of ENABLE_GGML macro from code according to PR suggestions

Signed-off-by: p-debski2 <[email protected]>
@p-debski2 p-debski2 force-pushed the work/ggml-integration branch from 0e88783 to d4809fd Compare September 8, 2025 07:50
@p-debski2
Contributor Author

I've investigated the timeouts on Ubuntu Meson locally and run a series of tests on different configurations, before and after the changes made by this PR.

Here are the unit test results as they were run on CI before this PR (note that GGML is disabled in this configuration):

debskip@AMDN6357:~/repos/nntrainer$ meson test -C builddir/ --suite unittests
ninja: Entering directory `/home/debskip/repos/nntrainer/builddir'
ninja: no work to do.
 1/28 nntrainer:unittests / unittest_tizen_capi_layer                OK               0.07s
 2/28 nntrainer:unittests / unittest_nntrainer_exe_order             OK               0.07s
 3/28 nntrainer:unittests / unittest_base_properties                 OK               0.07s
 4/28 nntrainer:unittests / unittest_nntrainer_appcontext            OK               0.07s
 5/28 nntrainer:unittests / unittest_nntrainer_quantizer             OK               0.08s
 6/28 nntrainer:unittests / unittest_util_func                       OK               0.08s
 7/28 nntrainer:unittests / unittest_common_properties               OK               0.08s
 8/28 nntrainer:unittests / unittest_tizen_capi_dataset              OK               0.10s
 9/28 nntrainer:unittests / unittest_nntrainer_activations           OK               0.11s
10/28 nntrainer:unittests / unittest_nntrainer_internal              OK               0.11s
11/28 nntrainer:unittests / unittest_tizen_capi_lr_scheduler         OK               0.12s
12/28 nntrainer:unittests / unittest_nntrainer_lr_scheduler          OK               0.12s
13/28 nntrainer:unittests / unittest_nntrainer_lazy_tensor           OK               0.18s
14/28 nntrainer:unittests / unittest_tizen_capi_optimizer            OK               0.19s
15/28 nntrainer:unittests / integration_tests                        OK               0.19s
16/28 nntrainer:unittests / unittest_nntrainer_cpu_backend           OK               0.25s
17/28 nntrainer:unittests / unittest_nntrainer_tensor_pool           OK               0.24s
18/28 nntrainer:unittests / unittest_compiler                        OK               0.38s
19/28 nntrainer:unittests / unittest_layers                          OK               0.40s
20/28 nntrainer:unittests / unittest_nntrainer_graph                 OK               0.74s
21/28 nntrainer:unittests / unittest_nntrainer_models                OK               0.90s
22/28 nntrainer:unittests / unittest_memory                          OK               0.98s
23/28 nntrainer:unittests / unittest_nntrainer_modelfile             OK               1.12s
24/28 nntrainer:unittests / unittest_models                          OK               1.43s
25/28 nntrainer:unittests / unittest_nntrainer_tensor                OK               1.75s
26/28 nntrainer:unittests / unittest_datasets                        OK               1.88s
27/28 nntrainer:unittests / unittest_tizen_capi                      OK               3.22s
28/28 nntrainer:unittests / unittest_ccapi                           OK               4.82s


Ok:                 28
Expected Fail:      0
Fail:               0
Unexpected Pass:    0
Skipped:            0
Timeout:            0

Here are the results from the same commit before my changes, but this time with GGML enabled:

debskip@AMDN6357:~/repos/nntrainer$ meson test -C builddir/ --suite unittests
ninja: Entering directory `/home/debskip/repos/nntrainer/builddir'
ninja: no work to do.
 1/28 nntrainer:unittests / unittest_tizen_capi_optimizer            OK               0.42s
 2/28 nntrainer:unittests / unittest_tizen_capi_lr_scheduler         OK               0.42s
 3/28 nntrainer:unittests / unittest_tizen_capi_dataset              OK               0.43s
 4/28 nntrainer:unittests / unittest_nntrainer_activations           OK               0.43s
 5/28 nntrainer:unittests / unittest_nntrainer_exe_order             OK               0.43s
 6/28 nntrainer:unittests / unittest_nntrainer_internal              OK               0.43s
 7/28 nntrainer:unittests / unittest_nntrainer_lazy_tensor           OK               0.43s
 8/28 nntrainer:unittests / unittest_nntrainer_quantizer             OK               0.42s
 9/28 nntrainer:unittests / unittest_util_func                       OK               0.42s
10/28 nntrainer:unittests / unittest_nntrainer_graph                 OK               0.42s
11/28 nntrainer:unittests / unittest_base_properties                 OK               0.40s
12/28 nntrainer:unittests / unittest_common_properties               OK               0.36s
13/28 nntrainer:unittests / unittest_nntrainer_tensor_pool           OK               0.32s
14/28 nntrainer:unittests / unittest_nntrainer_lr_scheduler          OK               0.29s
15/28 nntrainer:unittests / unittest_nntrainer_appcontext            OK               0.26s
16/28 nntrainer:unittests / unittest_tizen_capi_layer                OK               0.44s
17/28 nntrainer:unittests / unittest_compiler                        OK               0.20s
18/28 nntrainer:unittests / integration_tests                        OK               0.12s
19/28 nntrainer:unittests / unittest_layers                          OK               0.31s
20/28 nntrainer:unittests / unittest_nntrainer_models                OK               0.71s
21/28 nntrainer:unittests / unittest_memory                          OK               0.54s
22/28 nntrainer:unittests / unittest_nntrainer_modelfile             OK               1.09s
23/28 nntrainer:unittests / unittest_models                          OK               1.15s
24/28 nntrainer:unittests / unittest_datasets                        OK               1.97s
25/28 nntrainer:unittests / unittest_nntrainer_tensor                OK               2.63s
26/28 nntrainer:unittests / unittest_tizen_capi                      OK               3.44s
27/28 nntrainer:unittests / unittest_ccapi                           OK               5.73s
28/28 nntrainer:unittests / unittest_nntrainer_cpu_backend           OK               7.12s


Ok:                 28
Expected Fail:      0
Fail:               0
Unexpected Pass:    0
Skipped:            0
Timeout:            0

And finally, the default configuration with changes from this PR:

debskip@AMDN6357:~/repos/nntrainer$ meson test -C builddir/ --suite unittests
ninja: Entering directory `/home/debskip/repos/nntrainer/builddir'
ninja: no work to do.
 1/28 nntrainer:unittests / unittest_tizen_capi_layer                OK               0.08s
 2/28 nntrainer:unittests / unittest_util_func                       OK               0.07s
 3/28 nntrainer:unittests / unittest_nntrainer_internal              OK               0.07s
 4/28 nntrainer:unittests / unittest_nntrainer_quantizer             OK               0.07s
 5/28 nntrainer:unittests / unittest_tizen_capi_lr_scheduler         OK               0.09s
 6/28 nntrainer:unittests / unittest_nntrainer_lr_scheduler          OK               0.07s
 7/28 nntrainer:unittests / unittest_nntrainer_lazy_tensor           OK               0.09s
 8/28 nntrainer:unittests / unittest_nntrainer_exe_order             OK               0.10s
 9/28 nntrainer:unittests / unittest_common_properties               OK               0.09s
10/28 nntrainer:unittests / unittest_base_properties                 OK               0.11s
11/28 nntrainer:unittests / unittest_nntrainer_tensor_pool           OK               0.15s
12/28 nntrainer:unittests / unittest_tizen_capi_dataset              OK               0.17s
13/28 nntrainer:unittests / unittest_tizen_capi_optimizer            OK               0.17s
14/28 nntrainer:unittests / unittest_nntrainer_activations           OK               0.17s
15/28 nntrainer:unittests / unittest_nntrainer_appcontext            OK               0.15s
16/28 nntrainer:unittests / integration_tests                        OK               0.23s
17/28 nntrainer:unittests / unittest_compiler                        OK               0.34s
18/28 nntrainer:unittests / unittest_layers                          OK               0.43s
19/28 nntrainer:unittests / unittest_nntrainer_graph                 OK               0.79s
20/28 nntrainer:unittests / unittest_memory                          OK               0.94s
21/28 nntrainer:unittests / unittest_nntrainer_models                OK               1.01s
22/28 nntrainer:unittests / unittest_nntrainer_modelfile             OK               1.05s
23/28 nntrainer:unittests / unittest_models                          OK               1.29s
24/28 nntrainer:unittests / unittest_datasets                        OK               1.83s
25/28 nntrainer:unittests / unittest_nntrainer_tensor                OK               2.83s
26/28 nntrainer:unittests / unittest_tizen_capi                      OK               3.35s
27/28 nntrainer:unittests / unittest_ccapi                           OK               5.35s
28/28 nntrainer:unittests / unittest_nntrainer_cpu_backend           OK               7.48s


Ok:                 28
Expected Fail:      0
Fail:               0
Unexpected Pass:    0
Skipped:            0
Timeout:            0

We can see that the last two configurations, both with GGML enabled, give very similar timings, but they also run longer than the first one, which is what previously ran on CI for the Ubuntu Meson workflow.

This behavior seems to be expected. GitHub Actions runners may not have enough resources to finish the tests within 60s, so I tried raising the test-timeout value from 60s to 90s, and now it passes, with the longest-running test case taking ~78s. If you're not okay with this modification, please let me know and I'll revert it, but I don't see another way to solve this.

Contributor

@EunjuYang EunjuYang left a comment

Could you add dequantize_row_q4_0 as well?

Added dequantize_row_q4_0 and dequantize_row_q8_0 to the nntr_ggml_impl

Signed-off-by: p-debski2 <[email protected]>
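
For readers unfamiliar with the two functions named in this commit, the following is a minimal sketch of how Q4_0 and Q8_0 row dequantization works in upstream GGML. It is not the code added by this PR: the struct and function names are hypothetical, and the per-block scale is stored as a plain float for self-containment, whereas upstream GGML stores it as fp16.

```cpp
#include <cassert>
#include <cstdint>

// Number of values per quantization block (32 for both Q4_0 and Q8_0).
constexpr int kBlockSize = 32;

// Hypothetical stand-in for a Q4_0 block: one scale plus 32 packed 4-bit values.
struct SketchBlockQ4_0 {
  float d;                           // per-block scale (fp16 upstream; float here by assumption)
  std::uint8_t qs[kBlockSize / 2];   // two 4-bit quants per byte
};

// Hypothetical stand-in for a Q8_0 block: one scale plus 32 signed 8-bit values.
struct SketchBlockQ8_0 {
  float d;
  std::int8_t qs[kBlockSize];
};

// Expands k quantized values (k must be a multiple of 32) into floats.
// Low nibbles fill the first half of each block, high nibbles the second half;
// each nibble is shifted by -8 to recover a signed value in [-8, 7].
void sketch_dequantize_row_q4_0(const SketchBlockQ4_0 *x, float *y, std::int64_t k) {
  assert(k % kBlockSize == 0);
  const std::int64_t nb = k / kBlockSize;
  for (std::int64_t i = 0; i < nb; ++i) {
    const float d = x[i].d;
    for (int j = 0; j < kBlockSize / 2; ++j) {
      const int lo = (x[i].qs[j] & 0x0F) - 8;
      const int hi = (x[i].qs[j] >> 4) - 8;
      y[i * kBlockSize + j] = lo * d;
      y[i * kBlockSize + j + kBlockSize / 2] = hi * d;
    }
  }
}

// Q8_0 is simpler: each stored int8 is multiplied by the block scale.
void sketch_dequantize_row_q8_0(const SketchBlockQ8_0 *x, float *y, std::int64_t k) {
  assert(k % kBlockSize == 0);
  const std::int64_t nb = k / kBlockSize;
  for (std::int64_t i = 0; i < nb; ++i) {
    for (int j = 0; j < kBlockSize; ++j) {
      y[i * kBlockSize + j] = x[i].qs[j] * x[i].d;
    }
  }
}
```

For example, a Q4_0 block with scale 0.5 whose bytes are all 0xF0 expands to sixteen values of -4.0 followed by sixteen values of 3.5.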
@EunjuYang
Contributor

EunjuYang commented Sep 12, 2025

I checked that this PR is compatible with our recent applications, CausalLM (Linux & Android).

Contributor

@EunjuYang EunjuYang left a comment

LGTM

5 participants