This file shows how to generate a new Bazel module encapsulating a GCC compiler suite.
For this example, we want to build from the stable releases of
binutils, gcc, and glibc for x86_64 and riscv architectures.
We'll use the x86_64 and riscv64 compiler suites as examples. These will become the `gcc_x86_64_suite` and `gcc_riscv_suite` Bazel modules.
These modules will use Bazel module versioning, with a patch number following the GCC version number. For the example below, we will be creating
`gcc_x86_64_suite` and `gcc_riscv_suite` version 15.2.0.0, using a snapshot of GCC 15.2.0 as the baseline source. The Bazel version is 8.4.
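As a concrete sketch, the module declaration at the top of each suite's MODULE.bazel would carry that four-part version; the snippet below is illustrative rather than the shipped metadata:

```
# MODULE.bazel for the RISC-V suite
module(
    name = "gcc_riscv_suite",
    version = "15.2.0.0",
)
```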
The example use case for these crosscompiler tools is a 64-bit RISC-V network appliance with some AI or inference-engine capabilities. That means the RISC-V toolchain is the dominant toolchain, with a closely aligned x86_64 toolchain used for unit tests and internal integration tests.
Toolchains run over kernels, importing quite a lot of include files and linker/loader data from the kernel build process. We won't build a kernel from scratch to
get those sysroot files. Instead we will import them from Ubuntu 25 system images.
It takes several steps to generate each Bazel crosscompiler toolchain:
- Git pull or clone sources for `binutils`, `gcc`, and `glibc`. Detach the Git head to select `binutils-2_45`, `releases/gcc-15.2.0`, and `glibc-2.42` respectively. For this example, these are found under `/home2/vendor` (a git sketch follows this list).
- Import sysroot files for a minimal server. This takes several steps. For the RISC-V case these are:
  - Download installation media for Ubuntu for RISC-V.
  - Move it to `/opt/riscv_vm/ubuntu-25.04-live-server-riscv64.iso`.
  - Start a new VM with this ISO, using a recent `u-boot.bin` to get it launched:

    ```
    DIR=/opt/riscv_vm
    qemu-system-riscv64 \
      -machine virt -nographic -m 8192 -smp 4 \
      -kernel $DIR/u-boot.bin \
      -device virtio-net-device,netdev=eth0 -netdev user,id=eth0 \
      -device virtio-rng-pci \
      -drive file=$DIR/ubuntu-25.04-live-server-riscv64.iso,format=raw,if=virtio \
      -drive file=$DIR/disk,format=raw,if=virtio
    ```

  - Complete the VM installation steps and reboot without the ISO image:

    ```
    DIR=/opt/riscv_vm
    QEMU_CPU=rv64,v=true,zba=true,zbb=true,zbc=true,zbkb=true,zbkc=true,zbkx=true,zvbb=true,zvbc=true,vlen=256,vext_spec=v1.0 \
      qemu-system-riscv64 -L $DIR/lib \
      -machine virt -cpu max,zfbfmin=false,zvfbfmin=false,zvfbfwma=false -nographic -m 8192 -smp 4 \
      -kernel $DIR/u-boot.bin \
      -device virtio-net-device,netdev=eth0 \
      -netdev user,id=eth0,hostfwd=tcp::5555-:22 \
      -device virtio-rng-pci \
      -drive file=$DIR/disk,format=raw,if=virtio
    ```

  - Survey for key files connecting the kernel to `gcc` and `glibc`:

    ```
    riscvm:/usr$ find . -name libc.so.6 -ls
       1545   1536 -rwxr-xr-x   1 root  root  1571096 Jul  9 16:42 ./lib/riscv64-linux-gnu/libc.so.6
    riscvm:/usr$ find . -name crt1.o -ls
      23101      4 -rw-r--r--   1 root  root     3592 Jul  9 16:42 ./lib/riscv64-linux-gnu/crt1.o
    riscvm:/usr$ find . -name stdio.h
    ./include/stdio.h
    ./include/riscv64-linux-gnu/bits/stdio.h
    ```

  - Update and upgrade any installed Ubuntu packages.
  - Generate a tarball from the existing sysroot:

    ```
    riscvm:/$ tar cJf ~/riscv_sysroot.tar.xz usr
    ```

  - `scp` this to our host Linux system, and install it as `/opt/riscv`. Find and patch any full path name references buried within `.so` files:

    ```
    $ find . -name \*.so -size -10b -ls | grep -v '\->'
     21970344      4 -rw-r--r--   295 Jul  9 12:42 ./usr/lib/riscv64-linux-gnu/libc.so
    ```
- Configure and compile those sources into `/home2/build_riscv`. The configuration step specifies the intermediate install directory `/opt/riscv/sysroot`.
  - Clean the `binutils` build directory and configure with:

    ```
    /home2/vendor/binutils-gdb/configure --prefix=/opt/riscv/sysroot \
      --with-sysroot=/opt/riscv/sysroot --target=riscv64-linux-gnu
    make -j4
    make install
    ```

  - Some files are not exactly where the gcc build environment expects them, so add some links:

    ```
    mkdir /opt/riscv/sysroot/lib/riscv64-linux-gnu
    ln /opt/riscv/sysroot/lib/libc.so.6 /opt/riscv/sysroot/lib/riscv64-linux-gnu/libc.so.6
    cd /opt/riscv/sysroot/usr/include
    ln -s riscv64-linux-gnu/asm asm
    ```

  - Clean the gcc build directory and configure with:

    ```
    /home2/vendor/gcc/configure --prefix=/opt/riscv/sysroot \
      --with-sysroot=/opt/riscv/sysroot --enable-languages=c,c++ --disable-multilib \
      --target=riscv64-linux-gnu
    make -j4
    make install
    ```

  - Clean the glibc build directory and configure with:

    ```
    /home2/vendor/glibc/configure --host=riscv64-linux-gnu --prefix=/opt/riscv/sysroot \
      CC=/opt/riscv/sysroot/bin/riscv64-linux-gnu-gcc LD=/opt/riscv/sysroot/bin/riscv64-linux-gnu-ld \
      AR=/opt/riscv/sysroot/bin/riscv64-linux-gnu-ar \
      --with-headers=/opt/riscv/sysroot/usr/include \
      --disable-multilib --enable-languages=c,c++
    make -j4
    make install
    ```

  - Search for and replace any full paths with relative paths. This mostly applies to sharable object files and some loader metadata files. `libc.so` and `libm.so` are common files needing edits.

    ```
    $ find . -name \*.so -size -10b -ls | grep -v '\->'
     21970344      4 -rw-r--r--   295 Jul  9 12:42 ./usr/lib/riscv64-linux-gnu/libc.so
    ```

    The file `libc.so` is a text file. Edit the `GROUP` line to look like:

    ```
    GROUP ( ./libc.so.6 ./libc_nonshared.a  AS_NEEDED ( ./ld-linux-riscv64-lp64d.so.1 ) )
    ```

  - Test the linkages with commands like:

    ```
    $ /opt/riscv/sysroot/bin/riscv64-linux-gnu-gcc examples/helloworld.c
    $ file a.out
    a.out: ELF 64-bit LSB executable, UCB RISC-V, RVC, double-float ABI, version 1 (SYSV), \
      dynamically linked, interpreter /lib/ld-linux-riscv64-lp64d.so.1, for GNU/Linux 4.15.0, \
      with debug_info, not stripped
    $ /opt/riscv/sysroot/bin/riscv64-linux-gnu-gcc -lstdc++ examples/helloworld.cc
    $ file a.out
    a.out: ELF 64-bit LSB executable, UCB RISC-V, RVC, double-float ABI, version 1 (SYSV), \
      dynamically linked, interpreter /lib/ld-linux-riscv64-lp64d.so.1, for GNU/Linux 4.15.0, \
      with debug_info, not stripped
    ```

    The linkages most likely to fail involve `gcc` finding `as`, `ld`, and `collect2`, or the linker finding kernel-generated include files or dynamic loader files.
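For the first step above (fetching and pinning the three source trees), the commands might look like the following. The upstream repository URLs are assumptions; the tags are the ones named above:

```
cd /home2/vendor
git clone https://sourceware.org/git/binutils-gdb.git
git clone https://gcc.gnu.org/git/gcc.git
git clone https://sourceware.org/git/glibc.git
# Detach each working tree at the release used for this example.
git -C binutils-gdb checkout --detach binutils-2_45
git -C gcc checkout --detach releases/gcc-15.2.0
git -C glibc checkout --detach glibc-2.42
```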
We now have a RISC-V crosscompiler installed on our local workstation. The next steps collect
a subset of those crosscompiler files into portable Bazel modules. Scripts like `scripts/gcc_riscv.py`
do most of that work.
- Identify the subset of files in the intermediate install directories intended for the Bazel tarballs. These files are specified in Python scripts like `scripts/gcc_riscv.py`. The files and their locations depend on specific suite releases, so these scripts likely need to be updated.
- Run the desired script to collect needed files, strip unnecessary debugging information, replace duplicates with hard links, and generate the compressed Bazel module tarballs. The scripts will install the new Bazel modules (tarball plus metadata) under `/opt/bazel/bzlmod/`.
- Test the new modules to verify all desired files are present and nothing references host directories like `/usr/` or `/opt/riscv/`. This is usually an iterative process, especially making sure that all of the obscure files needed by the linker/loader are present and on a relative file path.
- Move the installation directory `/opt/riscv/` to `/opt/riscv_save` before exercising the new modules within Bazel (see the sketch after this list). This helps test hermeticity, so that `bazel build` cannot easily find the local compiler suite components.
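To exercise the newly installed modules, a consuming project can point Bazel at the local registry and declare the suites as dependencies. A minimal sketch, assuming `/opt/bazel/bzlmod/` is laid out as a standard Bazel registry:

```
# .bazelrc: consult the local registry first, then fall back to the Bazel Central Registry.
common --registry=file:///opt/bazel/bzlmod
common --registry=https://bcr.bazel.build
```

```
# MODULE.bazel of the consuming project
bazel_dep(name = "gcc_riscv_suite", version = "15.2.0.0")
bazel_dep(name = "gcc_x86_64_suite", version = "15.2.0.0")
```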
A development shop might need three or more coordinated toolchains. We have the first one, a crosscompiler toolchain ready to build binaries for deployment. The shop would also need at least one more toolchain, one that runs on an integration server or a developer's workstation to test compilations, run unit tests, and run fully mocked local integration tests. Those toolchains likely run on an x86_64 hardware platform instead of the riscv64 hardware platform, and with a different sysroot configuration. The compiler version and system library versions should be the same.
For our example, we want an x86_64 toolchain with gcc 15.2, regardless of the native gcc version available on the integration server or the developers'
workstations. Our sysroot can be copied - with pruning - from an existing server's `/usr` directory, where we clone a selection of `/usr/include` and
`/usr/lib64` into the new sysroot `/opt/x86_64`.
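A sketch of that copy, assuming the donor server is reachable as `donor` (a hypothetical host name) and the same `<prefix>/sysroot` layout used for the RISC-V build; the exclude patterns are illustrative only:

```
mkdir -p /opt/x86_64/sysroot/usr
# Headers and 64-bit libraries from the donor server; prune what the toolchain does not need.
rsync -a donor:/usr/include/ /opt/x86_64/sysroot/usr/include/
rsync -a --exclude='python3*' --exclude='locale/' \
      donor:/usr/lib64/ /opt/x86_64/sysroot/usr/lib64/
```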
Building the x86_64 toolchain is similar to building the riscv64 toolchain, except there is no `--target` option in the configuration.
File locations within `/opt/x86_64/sysroot` are likely to differ, since this is not a crosscompiler build so much as a direct upgrade
to a single-platform sysroot.
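For example, the gcc configure step might look like the following, by analogy with the riscv64 commands above; the build directory name is an assumption and this is a sketch rather than a verified recipe:

```
# Run from a clean build directory, e.g. /home2/build_x86_64/gcc
/home2/vendor/gcc/configure --prefix=/opt/x86_64/sysroot \
  --with-sysroot=/opt/x86_64/sysroot --enable-languages=c,c++ --disable-multilib
make -j4
make install
```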
The host crosscompiler components expect to find a number of shared libraries at runtime. If we want a portable toolchain, we need those to be found in a portable system Bazel module, not from the local host system.
For example, the compiler component `cc1` depends on these system libraries at specific versions:
- libisl.so.15.1.1
- libmpc.so.3
- libmpfr.so.6
- libgmp.so.10
- libzstd.so.1
- libstdc++.so.6
- libm.so.6
- libgcc_s.so.1
- libc.so.6
These versions are those found on a Fedora 42 workstation at the time the compiler suites were built. We need to package these as a Bazel module and register that module as a dependency of the compiler suites built alongside it.
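One way to enumerate (or re-check) that list is to run `ldd` against the installed `cc1` on the build host; a sketch, assuming the riscv64 install prefix used above (the exact version directory may differ):

```
# cc1 lives under gcc's libexec tree for the target and version.
ldd /opt/riscv/sysroot/libexec/gcc/riscv64-linux-gnu/15.2.0/cc1
```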
For example, in each suite's MODULE.bazel:

```
bazel_dep(name="fedora_syslibs", version="42.0.0")
```

Each of the toolchain wrappers should use these libraries. For example, `toolchains/riscv/gcc-risc/imported/gcc`:

```
#!/bin/bash
set -euo pipefail
LD_LIBRARY_PATH=external/fedora_syslibs+ \
  external/gcc_riscv_suite+/bin/riscv64-linux-gnu-gcc "$@"
```

Remaining work:

- [ ] Test with more complex compilations
- [ ] Understand and remove multiple versions of the same file, especially between `/usr/include` and `/include`
- [ ] Verify that the loader scripts are current and hermetic
- [ ] Document the search paths for complicated process invocations and include path searches, such as gcc -> ld -> loader scripts (see the sketch below).
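For that last item, a few standard gcc/ld options expose the relevant search paths; they are shown here against the riscv64 crosscompiler as an illustration:

```
# Directories searched for programs (as, ld, collect2) and libraries.
/opt/riscv/sysroot/bin/riscv64-linux-gnu-gcc -print-search-dirs
# The exact programs the gcc driver will invoke.
/opt/riscv/sysroot/bin/riscv64-linux-gnu-gcc -print-prog-name=ld
/opt/riscv/sysroot/bin/riscv64-linux-gnu-gcc -print-prog-name=collect2
# Default include search paths, without compiling anything.
/opt/riscv/sysroot/bin/riscv64-linux-gnu-gcc -xc -E -v /dev/null
# Ask the linker to print its linker script and every file it opens.
/opt/riscv/sysroot/bin/riscv64-linux-gnu-gcc examples/helloworld.c -Wl,--verbose
```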