Update to add several fixes (seeds, updated environments) and metric collection plots (from Mark) #30

Open
wants to merge 33 commits into
base: main
from

Conversation

imallona
Member

@imallona imallona commented May 29, 2025

⚠️ reviewer: #31 has a similar aim and works on forked clustbench modules; it is best to review the two side by side

This originally aimed to incorporate (parts of) btraven00@e0989fb and #29, but step by step, to better understand the many features involved.

But currently it mostly deals with pinning software environments and fixing random seeds in some of the methods, as well as plotting results (from Mark).

⚠️ The current understanding is that this simplified PR could work well as a clustering example benchmark (e.g. three simple YAMLs to onboard users), while a more complex project exploring multiple software environments, seeds, etc. will be developed at https://github.com/omnibenchmark/omnibenchmark_paper and is not necessarily meant for tutorials/teaching/onboarding

First iteration

btraven00 and others added 5 commits May 21, 2025 21:21
* run from post-0.2.0 tag, main branch
* docs: use public repo URIs
* chore: add convenience target to build environments
* add top-level Makefile to prepare env
* feat: parametrize num of cores on the makefile
* chore: ignore common temporary outputs and image build artifacts
* update .eb files to easybuild 5.0
* remove remote storage
* do not run artifacts if not in main repo
* inject checksums to rmarkdown easyconfig
* update sklearn singularity definition
* feat: add microbenchmark for numpy operations
* chore: bump clustering-benchmarks to 1.1.6
* feat: templatize the definitions
* feat: mv output folders to timestamped names
* feat: add --yes flag
* docs: update README
@imallona
Member Author

@imallona extend the makefile and the yaml-izer starting with conda on some versions, e.g. to reproduce Mark's singularity recipe results (with conda)

@imallona imallona marked this pull request as ready for review May 29, 2025 11:16
@imallona imallona requested a review from Copilot May 29, 2025 11:16
Copilot

This comment was marked as resolved.

@btraven00
Contributor

btraven00 commented May 30, 2025

Surely this is known and planned for, but noting that this PR is not able to run the whole clustbench example (some of the envs are missing, etc.). Should I review the current state, or wait? In other words: is the idea to merge changes incrementally?

@imallona

This comment was marked as resolved.

@imallona
Member Author

imallona commented Jun 3, 2025

FCPS instability is fixed (it was the seed), see imallona/clustering_report#3 (comment). This fixes it for repeated runs, but not for repeated ks within a run.
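
To illustrate the caveat above (stable across repeated runs, but not across repeated ks within a run), here is a minimal Python sketch. It is not the fix applied in this PR: `BASE_SEED`, `fake_clustering` and `derived_seed` are hypothetical names, and the "clustering" is only a random labelling stand-in for a stochastic method.

```python
import random

BASE_SEED = 42  # hypothetical benchmark-wide seed, not taken from the PR


def fake_clustering(k, n_points, rng):
    """Stand-in for a stochastic clustering method: random labels in [0, k)."""
    return [rng.randrange(k) for _ in range(n_points)]


def run_shared_rng(ks, n_points=10):
    """Seed once per run: reproducible across runs, but a repeated k
    draws from an RNG whose state has already advanced."""
    rng = random.Random(BASE_SEED)
    return [fake_clustering(k, n_points, rng) for k in ks]


def derived_seed(k):
    """Deterministic per-k seed (plain integer arithmetic, no hash())."""
    return BASE_SEED * 1_000_003 + k


def run_per_k_rng(ks, n_points=10):
    """Reseed for every k: a repeated k yields the same labels."""
    return [
        fake_clustering(k, n_points, random.Random(derived_seed(k)))
        for k in ks
    ]
```

With `ks = [2, 3, 2]`, `run_shared_rng` returns identical output on every run, yet its two `k=2` entries will generally differ from each other; `run_per_k_rng` makes the two `k=2` entries identical as well.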

@imallona imallona requested a review from Copilot June 4, 2025 10:30
Copilot

This comment was marked as resolved.

@imallona imallona changed the title from "Systematize clustbench run (software grid)" to "Update to add several fixes (seeds, updated environments) and metric collection plots (from Mark)" Jun 18, 2025
@btraven00 btraven00 self-requested a review June 18, 2025 10:35
Contributor

@btraven00 btraven00 left a comment

  • envs/README could use an update (remove references to nonexistent env files)
  • maybe add a script to do the easybuild builds?
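
A script for the EasyBuild builds could look roughly like the sketch below. It assumes the easyconfigs live as `*.eb` files in a single directory and reuses the `eb <file> --robot` invocation shown elsewhere in this thread; `eb_commands` and `build_all` are hypothetical helper names, and nothing here is part of the PR.

```python
import subprocess
from pathlib import Path


def eb_commands(envs_dir):
    """Return one `eb <file> --robot` argv per easyconfig, in sorted order."""
    return [
        ["eb", str(path), "--robot"]
        for path in sorted(Path(envs_dir).glob("*.eb"))
    ]


def build_all(envs_dir, dry_run=True):
    """Print each build command; unless dry_run, also execute it and
    stop at the first failing build (check=True raises on non-zero exit)."""
    for cmd in eb_commands(envs_dir):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `build_all("envs")` only prints the commands; `build_all("envs", dry_run=False)` would run them one after another.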

@btraven00
Contributor

btraven00 commented Jun 18, 2025

FWIW at d414d26

(omnibenchmark) ✘-INT ~/clustering_example/envs [imallona|✔] 
12:45 $ eb clustbench.eb --robot
== Temporary log file in case of crash /tmp/eb-65zdr4j2/easybuild-6yjcgpbh.log
== found valid index for /home/ben/micromamba/envs/omnibenchmark/easybuild/easyconfigs, so using it...
== found valid index for /home/ben/micromamba/envs/omnibenchmark/easybuild/easyconfigs, so using it...
== resolving dependencies ...
== processing EasyBuild easyconfig /home/ben/clustering_example/envs/clustbench.eb
== building and installing clustbench/1-foss-2023b...
  >> installation prefix: /home/ben/.local/easybuild/software/clustbench/1-foss-2023b
== fetching files and verifying checksums...
== ... (took < 1 sec)
== FAILED: Installation ended unsuccessfully: Checksum verification for extension source genieclust-1.1.6.tar.gz failed (took 0 secs)
== Results of the build can be found in the log file(s) /tmp/eb-65zdr4j2/easybuild-clustbench-1-20250618.124547.mKhtA.log
== Summary:
   * [FAILED]  clustbench/1-foss-2023b
ERROR: Installation of clustbench.eb failed: 'Checksum verification for extension source genieclust-1.1.6.tar.gz failed'
(omnibenchmark) ✘-42 ~/clustering_example/envs [imallona|✔] 
12:45 $ eb --version
This is EasyBuild 5.1.0 (framework: 5.1.0, easyblocks: 5.1.0) on host omnibenchmark.
12:46 $ git rev-parse --short HEAD
d414d26

I'd be ok with merging the branch, though, and debug building of the easyconfigs in a separate issue, more isolated from everything else.
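
For debugging a checksum mismatch like the one above, the digest of the downloaded tarball can be recomputed and compared with the value listed in the easyconfig. The sketch below assumes the easyconfig lists SHA-256 checksums; `sha256_of` and `checksum_matches` are hypothetical helper names.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hex SHA-256 digest of a file, read in chunks to keep memory flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected):
    """Compare a file's digest with the expected hex string."""
    return sha256_of(path) == expected.strip().lower()
```

A mismatch for an unchanged version number usually means the upstream source archive differs from the one the checksum was recorded for, so it is worth checking where the tarball came from before simply updating the value.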

@imallona
Member Author

does it work ignoring checksums?

@btraven00
Contributor

btraven00 commented Jun 18, 2025

Ignoring, or fixing, the checksum lets it get further:

  >> command completed: exit 0, ran in < 1s
==      configuring...
==      building...
==      testing...
==      installing...
  >> running shell command:
        /home/ben/.local/easybuild/software/Python/3.11.5-GCCcore-13.2.0/bin/python -m pip install --prefix=/home/ben/.local/easybuild/software/clustbench/1-foss-2023b  --no-deps --ignore-installed --no-build-isolation .
        [started at: 2025-06-18 13:14:03]
        [working dir: /home/ben/.local/easybuild/build/clustbench/1/foss-2023b/genieclust/genieclust-1.1.6]
        [output and state saved to /tmp/eb-w38w6nna/run-shell-cmd-output/python-dug_f_x_]
==      ... (took 32 secs)

ERROR: Shell command failed!
    full command              ->  /home/ben/.local/easybuild/software/Python/3.11.5-GCCcore-13.2.0/bin/python -m pip install --prefix=/home/ben/.local/easybuild/software/clustbench/1-foss-2023b  --no-deps --ignore-installed --no-build-isolation .
    exit code                 ->  1
    called from               ->  'install_step' function in /home/ben/micromamba/envs/omnibenchmark/lib/python3.12/site-packages/easybuild/easyblocks/generic/pythonpackage.py (line 909)
    working directory         ->  /home/ben/.local/easybuild/build/clustbench/1/foss-2023b/genieclust/genieclust-1.1.6
    output (stdout + stderr)  ->  /tmp/eb-w38w6nna/run-shell-cmd-output/python-dug_f_x_/out.txt
    interactive shell script  ->  /tmp/eb-w38w6nna/run-shell-cmd-output/python-dug_f_x_/cmd.sh

== ... (took 18 mins 3 secs)
== FAILED: Installation ended unsuccessfully: shell command 'python ...' failed with exit code 1 in extensions step for clustbench.eb (took 18 mins 5 secs)
== Results of the build can be found in the log file(s) /tmp/eb-w38w6nna/easybuild-clustbench-1-20250618.125628.arxIM.log
== Summary:
   * [FAILED]  clustbench/1-foss-2023b

The problem, to me, seems to be an improper configuration of the include flags in the genieclust package:

      [1/1] Cythonizing genieclust/cluster_validity.pyx
      building 'genieclust.cluster_validity' extension
      gcc -DNDEBUG -g -fwrapv -O3 -Wall -O2 -ftree-vectorize -march=native -fno-math-errno -fPIC -O2 -ftree-vectorize -march=native -fno-math-errno -fPIC -O2 -ftree-vectorize -march=native -fno-math-errno -I/home/ben/.local/easybuild/software/FFTW/3.3.10-GCC-13.2.0/include -I/home/ben/.local/easybuild/software/FlexiBLAS/3.3.1-GCC-13.2.0/include -I/home/ben/.local/easybuild/software/FlexiBLAS/3.3.1-GCC-13.2.0/include/flexiblas -fPIC -I/home/ben/.local/easybuild/software/clustbench/1-foss-2023b/lib/python3.11/site-packages/numpy/core/include -Isrc/ -I../src/ -I/home/ben/.local/easybuild/software/Python/3.11.5-GCCcore-13.2.0/include/python3.11 -c genieclust/cluster_validity.cpp -o build/temp.linux-x86_64-cpython-311/genieclust/cluster_validity.o -fopenmp -std=c++11
      In file included from /home/ben/.local/easybuild/software/clustbench/1-foss-2023b/lib/python3.11/site-packages/numpy/core/include/numpy/ndarraytypes.h:1929,
                       from /home/ben/.local/easybuild/software/clustbench/1-foss-2023b/lib/python3.11/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                       from /home/ben/.local/easybuild/software/clustbench/1-foss-2023b/lib/python3.11/site-packages/numpy/core/include/numpy/arrayobject.h:5,
                       from genieclust/cluster_validity.cpp:1296:
      /home/ben/.local/easybuild/software/clustbench/1-foss-2023b/lib/python3.11/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
         17 | #warning "Using deprecated NumPy API, disable it with " \
            |  ^~~~~~~
      In file included from genieclust/cluster_validity.cpp:1302:
      genieclust/../src/c_cvi.h:30:10: fatal error: cvi.h: No such file or directory
         30 | #include "cvi.h"
            |          ^~~~~~~
      compilation terminated.
      error: command '/tmp/eb-w38w6nna/tmpump7ifyt/rpath_wrappers/gcc_wrapper/gcc' failed with exit code 1
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for genieclust
Failed to build genieclust

Contributor

@DanInci DanInci left a comment

Looks good to merge, apart from the conda env question left.

envs/fcps.yml Outdated
Comment on lines 18 to 26
- conda-forge::r-rmarkdown
- conda-forge::r-cairo
- conda-forge::r-svglite
- conda-forge::r-ggplot2
- conda-forge::r-tidyr
- bioconda::bioconductor-complexheatmap
- conda-forge::r-jsonlite
- conda-forge::r-dplyr
- conda-forge::r-r.utils
Contributor

Shouldn't these be pinned to a version?

Member Author

thanks 65262c9
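
As a rough way to catch unpinned entries like the ones flagged in this review, the sketch below scans a list of conda dependency strings (as they appear under `dependencies:` in an environment file) for specs without a version constraint. `is_pinned` and `unpinned` are hypothetical helpers, the check is a heuristic rather than a full conda match-spec parser, and the version `1.2.3` in the usage note is a placeholder.

```python
def is_pinned(spec):
    """True if a conda spec such as 'channel::name=1.2.3' carries a
    version constraint ('=', '<', '>' or the space-separated form)."""
    name_and_version = spec.split("::", 1)[-1].strip()
    if any(op in name_and_version for op in ("=", "<", ">")):
        return True
    # 'name 1.2.3' is also accepted by conda as a versioned spec
    return len(name_and_version.split()) > 1


def unpinned(dependencies):
    """Return the string specs that carry no version constraint.
    Non-string entries (e.g. a nested pip section) are skipped."""
    return [
        spec for spec in dependencies
        if isinstance(spec, str) and not is_pinned(spec)
    ]
```

For example, `unpinned(["conda-forge::r-rmarkdown", "conda-forge::r-cairo=1.2.3"])` returns only the first entry.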

@btraven00
Contributor

btraven00 commented Jun 18, 2025

Doing a fresh build of the clustbench.eb environment fails with an error in one of the submodules in the perl bundle, details below:

                                               
ERROR: Shell command failed!              
    full command              ->  make test
    exit code                 ->  2        
    called from               ->  'test_step' function in /home/easybuild01/micromamba/envs/omnibenchmark/lib/python3.12/site-packages/easybuild/easyblocks/generic/configuremake.py (line 401)
    working directory         ->  /opt/cache/easybuild01/easybuild/build/PerlbundleCPAN/5.38.0/GCCcore-13.2.0/IOSocketSSL/IO-Socket-SSL-2.083
    output (stdout + stderr)  ->  /tmp/eb-dkohnfk9/run-shell-cmd-output/make-ml07jn7o/out.txt
    interactive shell script  ->  /tmp/eb-dkohnfk9/run-shell-cmd-output/make-ml07jn7o/cmd.sh
                                               
== ... (took 12 mins 16 secs)               
== FAILED: Installation ended unsuccessfully: shell command 'make ...' failed with exit code 2 in extensions step for Perl-bundle-CPAN-5.38.0-GCCcore-13.2.0.eb (took 12 mins 42 secs)
== Results of the build can be found in the log file(s) /tmp/eb-dkohnfk9/easybuild-Perl-bundle-CPAN-5.38.0-20250618.203905.NRtFl.log

From the log file:

[snip]
== 2025-06-18 20:51:48,515 environment.py:93 INFO Environment variable EBVARCPPFLAGS set to -I/home/easybuild01/.local/easybuild/software/OpenSSL/1.1/include -I/home/easybuild01/.local/easybuild/software/libreadline/8.2-GCCcore-13.2.0/include -I/home/easybuild01/.local/easybuild/software/ncurses/6.4-GCCcore-13.2.0/include -I/home/easybuild01/.local/easybuild/software/expat/2.5.0-GCCcore-13.2.0/include -I/home/easybuild01/.local/easybuild/software/zlib/1.2.13-GCCcore-13.2.0/include -I/home/easybuild01/.local/easybuild/software/binutils/2.40-GCCcore-13.2.0/include (previously undefined)
== 2025-06-18 20:51:48,515 environment.py:93 INFO Environment variable INSTALLDIRS set to site (previously undefined)
== 2025-06-18 20:51:48,522 build_log.py:226 ERROR EasyBuild encountered an error (at easybuild/tools/build_log.py:166 in caller_info): shell command 'make ...' failed with exit code 2 in extensions step for Perl-bundle-CPAN-5.38.0-GCCcore-13.2.0.eb (at easybuild/framework/easyblock.py:4856 in run_all_steps)

Do note this is from a shared environment if you want to debug.

This is EasyBuild 5.1.0 (framework: 5.1.0, easyblocks: 5.1.0) on host omnibenchmark.

@imallona
Member Author

imallona commented Jun 19, 2025

Perhaps premature, but it's running well for me (user easybuild01, tmux session imallona) if you'd like to keep an eye on it, or to retry with the following (full paths to ease browsing):

mkdir -p /home/easybuild01/imallona/src
## work inside src so that the full paths used below resolve
cd /home/easybuild01/imallona/src

## retrieve `develop` easyconfigs as of today, because Perl 5.x CPAN errors
##   were common in the past but issues have been fixed already (I hope)
##   please see https://github.com/easybuilders/easybuild-easyconfigs/issues?q=is%3Aissue%20Perl-bundle-CPAN
git clone https://github.com/easybuilders/easybuild-easyconfigs.git

## clone the clustering example repo - current state (branch `imallona`)
git clone git@github.com:omnibenchmark/clustering_example.git
cd clustering_example
git checkout imallona

## specify recent easyconfigs as robot path, hoping someone fixed the Perl 5 CPAN easyconfig (I think it's fixed)
eb --robot-paths=/home/easybuild01/imallona/src/easybuild-easyconfigs/easybuild/easyconfigs/ \
    --robot \
    /home/easybuild01/imallona/src/clustering_example/envs/clustbench.eb

Edit: it runs. Please note I've updated genieclust and clustbench to 1.6 and clarified that they come from the author's repo and are built from source (dab8639). Now trying fcps.eb using the same tmux/user/path.

@imallona
Member Author

imallona commented Jun 19, 2025

Something's going on with pydantic-core (oras recipe), conda works but singularity doesn't anymore. Oddly enough it used to work with older omnibenchmark versions, same apptainer.

@btraven00
Contributor

hmm should we document or script the fetching of easybuilders/easybuild-easyconfigs ?
perhaps pinning to a given commit?
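
One way to script a pinned fetch is sketched below, under the assumption that a plain clone followed by a detached checkout is acceptable. The repository URL is the one used earlier in this thread; `pinned_clone_commands` and `fetch_pinned` are hypothetical helper names, and the commit to pin is left to the caller.

```python
import subprocess

EASYCONFIGS_URL = "https://github.com/easybuilders/easybuild-easyconfigs.git"


def pinned_clone_commands(commit, dest, url=EASYCONFIGS_URL):
    """Return the git argv lists that clone `url` into `dest` and check
    out `commit` in detached-HEAD state."""
    return [
        ["git", "clone", url, dest],
        ["git", "-C", dest, "checkout", "--detach", commit],
    ]


def fetch_pinned(commit, dest):
    """Run the commands in order, raising on the first failure."""
    for cmd in pinned_clone_commands(commit, dest):
        subprocess.run(cmd, check=True)
```

The resulting directory's `easybuild/easyconfigs/` subfolder could then be passed to `eb --robot-paths=...` as in the commands above, so that every rebuild resolves dependencies against the same snapshot.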

@btraven00
Contributor

btraven00 commented Jun 23, 2025

Another fresh build that took several hours to complete. I had to manually remove a bogus source file in the Doxygen build to let it continue.

== Results of the build can be found in the log file(s) /tmp/eb-nz9hamr2/easybuild-Perl-bundle-CPAN-5.38.0-20250623.040132.jTmpV.log
== Summary:
   * [SUCCESS] Doxygen/1.9.8-GCCcore-13.2.0
   * [SUCCESS] giflib/5.2.1-GCCcore-13.2.0
   * [SUCCESS] gzip/1.13-GCCcore-13.2.0
   * [SUCCESS] LittleCMS/2.15-GCCcore-13.2.0
   * [SUCCESS] lz4/1.9.4-GCCcore-13.2.0
   * [SUCCESS] groff/1.23.0-GCCcore-13.2.0
   * [SUCCESS] zstd/1.5.5-GCCcore-13.2.0
   * [FAILED]  Perl-bundle-CPAN/5.38.0-GCCcore-13.2.0
   * [SKIPPED] LibTIFF/4.6.0-GCCcore-13.2.0
   * [SKIPPED] libwebp/1.3.2-GCCcore-13.2.0
   * [SKIPPED] intltool/0.51.0-GCCcore-13.2.0
   * [SKIPPED] OpenJPEG/2.5.0-GCCcore-13.2.0
   * [SKIPPED] Pillow/10.2.0-GCCcore-13.2.0
   * [SKIPPED] gperf/3.1-GCCcore-13.2.0
   * [SKIPPED] util-linux/2.39-GCCcore-13.2.0
   * [SKIPPED] fontconfig/2.14.2-GCCcore-13.2.0
   * [SKIPPED] X11/20231019-GCCcore-13.2.0
   * [SKIPPED] Tk/8.6.13-GCCcore-13.2.0
   * [SKIPPED] Tkinter/3.11.5-GCCcore-13.2.0
   * [SKIPPED] matplotlib/3.8.2-gfbf-2023b
   * [SKIPPED] clustbench/1.6-foss-2023b
ERROR: Installation of Perl-bundle-CPAN-5.38.0-GCCcore-13.2.0.eb failed: "shell command 'make ...' failed with exit code 2 in extensions step for Perl-bundle-CPAN-5.38.0-GCCcore-13.2.0.eb"

(omnibenchmark) easybuild02@omnibenchmark:~/easybuild-easyconfigs$ git show
commit 78a0a64a53d921fcf3c3bc83ac66966626bfa15f (HEAD -> develop, origin/develop, origin/HEAD)
Merge: b2b6c0db2e 474fef3b3d
Author: Simon Branford <[email protected]>
Date:   Sat Jun 21 11:58:05 2025 +0100

    Merge pull request #23178 from SebastianAchilles/20250621104359_new_pr_Ninja1130

    {tools}[GCCcore/14.3.0] Ninja v1.13.0, Meson v1.8.2, libpciaccess v0.18.1

(omnibenchmark) easybuild02@omnibenchmark:~/easybuild-easyconfigs$

(omnibenchmark) easybuild02@omnibenchmark:~/clustering_example$ git log -1
commit 057c248af14de0d2464553d538a1ed2649fcd05a (HEAD -> imallona, origin/imallona)
Author: Izaskun Mallona <[email protected]>
Date:   Fri Jun 20 11:29:16 2025 +0200

    Move graph sources from CRAN to bioC (they moved the package)
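
For long builds like this one, the summary block can be condensed programmatically. The sketch below assumes the `* [STATUS] name/version` line format seen in the log excerpts in this thread (it is not taken from EasyBuild documentation); `parse_summary` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Matches lines such as "   * [FAILED]  Perl-bundle-CPAN/5.38.0-GCCcore-13.2.0"
SUMMARY_LINE = re.compile(r"^\s*\*\s+\[(SUCCESS|FAILED|SKIPPED)\]\s+(\S+)")


def parse_summary(log_text):
    """Group module names by build status, preserving log order."""
    by_status = defaultdict(list)
    for line in log_text.splitlines():
        match = SUMMARY_LINE.match(line)
        if match:
            status, module = match.groups()
            by_status[status].append(module)
    return dict(by_status)
```

Applied to the summary above, the `FAILED` entry would contain only `Perl-bundle-CPAN/5.38.0-GCCcore-13.2.0`, which makes it easier to see that every skipped module is downstream of that single failure.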

Development

Successfully merging this pull request may close these issues.

fcps|atom and other datasets with true cardinality <4 show inconsistent results when repeatedly running the same k
3 participants