Description
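I tried this on both x86 and ARM virtual machines. Following the Dockerfile in the docker directory, the container starts on both: docker run -it -v ~/zhihu:/mnt nvidia/cuda:12.5.1-devel-ubuntu20.04, but I ran into the same problems on each:
- The nvidia/cuda:12.5.1-devel-ubuntu20.04 image does not seem to ship with Python preinstalled on either ARM or x86, so after starting the container you have to additionally run: apt install python3 python3-pip
- Inside the container, running root@8140d29f1925:/mnt/ZhiLight# pip3 install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
fails on both machines (I don't know why numpy 2.1.3 cannot be found; changing the pin to the latest 2.2.0 fails too, and dropping the Tsinghua mirror gives the same error):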
Requirement already satisfied: torch==2.4.1 in /usr/local/lib/python3.8/dist-packages (from -r requirements.txt (line 1)) (2.4.1)
ERROR: Could not find a version that satisfies the requirement numpy==2.1.3 (from -r requirements.txt (line 2)) (from versions: 1.3.0, 1.4.1, 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1, 1.7.2, 1.8.0, 1.8.1, 1.8.2, 1.9.0, 1.9.1, 1.9.2, 1.9.3, 1.10.0.post2, 1.10.1, 1.10.2, 1.10.4, 1.11.0, 1.11.1, 1.11.2, 1.11.3, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 1.13.3, 1.14.0, 1.14.1, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6, 1.15.0, 1.15.1, 1.15.2, 1.15.3, 1.15.4, 1.16.0, 1.16.1, 1.16.2, 1.16.3, 1.16.4, 1.16.5, 1.16.6, 1.17.0, 1.17.1, 1.17.2, 1.17.3, 1.17.4, 1.17.5, 1.18.0, 1.18.1, 1.18.2, 1.18.3, 1.18.4, 1.18.5, 1.19.0, 1.19.1, 1.19.2, 1.19.3, 1.19.4, 1.19.5, 1.20.0, 1.20.1, 1.20.2, 1.20.3, 1.21.0, 1.21.1, 1.21.2, 1.21.3, 1.21.4, 1.21.5, 1.21.6, 1.22.0, 1.22.1, 1.22.2, 1.22.3, 1.22.4, 1.23.0, 1.23.1, 1.23.2, 1.23.3, 1.23.4, 1.23.5, 1.24.0, 1.24.1, 1.24.2, 1.24.3, 1.24.4)
ERROR: No matching distribution found for numpy==2.1.3 (from -r requirements.txt (line 2))
- Ignoring the second error and continuing with packaging inside the container: root@8140d29f1925:/mnt/ZhiLight# python3 setup.py bdist_wheel fails with:
error: [Errno 2] No such file or directory: 'cmake'
Running apt install -y cmake does not solve this; the build then reports:
CMake 3.18 or higher is required. You are running version 3.16.3
I removed it first: apt remove cmake
and then switched to a newer version: wget https://cmake.org/files/v3.31/cmake-3.31.2-linux-x86_64.tar.gz && tar xzvf cmake-3.31.2-linux-x86_64.tar.gz && export PATH=$PATH:/mnt/cmake-3.31.2-linux-x86_64/bin
- Continuing with packaging inside the container: root@8140d29f1925:/mnt/ZhiLight# python3 setup.py bdist_wheel
running bdist_wheel
running build
running build_py
running egg_info
writing zhilight.egg-info/PKG-INFO
writing dependency_links to zhilight.egg-info/dependency_links.txt
writing requirements to zhilight.egg-info/requires.txt
writing top-level names to zhilight.egg-info/top_level.txt
reading manifest file 'zhilight.egg-info/SOURCES.txt'
writing manifest file 'zhilight.egg-info/SOURCES.txt'
running build_ext
-- The C compiler identification is GNU 9.4.0
-- The CXX compiler identification is GNU 9.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- The CUDA compiler identification is NVIDIA 12.5.82 with host compiler GNU 9.4.0
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
CUDA Version: 12.5.82
-- Found Python: /usr/bin/python3.8 (found version "3.8.10") found components: Interpreter Development Development.Module Development.Embed
Will link against CUDA 12 complied libraries
CMAKE_CXX_FLAGS -D_GLIBCXX_USE_CXX11_ABI=0
-- CMAKE_INSTALL_RPATH:
-- Submodule update
-- USE_STATIC_NCCL is set. Linking with static NCCL library.
-- Found NCCL: /usr/include
-- Determining NCCL version from /usr/include/nccl.h...
-- Looking for NCCL_VERSION_CODE
-- Looking for NCCL_VERSION_CODE - not found
-- NCCL version < 2.3.5-5
-- Found NCCL (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libnccl_static.a)
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'pybind11'
CMake Error at CMakeLists.txt:68 (find_package):
By not providing "Findpybind11.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "pybind11",
but CMake did not find one.
Could not find a package configuration file provided by "pybind11" with any
of the following names:
pybind11Config.cmake
pybind11-config.cmake
Add the installation prefix of "pybind11" to CMAKE_PREFIX_PATH or set
"pybind11_DIR" to a directory containing one of the above files. If
"pybind11" provides a separate development package or SDK, be sure it has
been installed.
-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
File "setup.py", line 138, in <module>
setup(
File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 144, in setup
return distutils.core.setup(**attrs)
File "/usr/lib/python3.8/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/lib/python3.8/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/usr/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/wheel/bdist_wheel.py", line 223, in run
self.run_command('build')
File "/usr/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3.8/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/usr/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 87, in run
_build_ext.run(self)
File "/usr/lib/python3.8/distutils/command/build_ext.py", line 340, in run
self.build_extensions()
File "/usr/lib/python3.8/distutils/command/build_ext.py", line 449, in build_extensions
self._build_extensions_serial()
File "/usr/lib/python3.8/distutils/command/build_ext.py", line 474, in _build_extensions_serial
self.build_extension(ext)
File "setup.py", line 102, in build_extension
subprocess.check_call(["cmake", ext.sourcedir] + cmake_args, cwd=build_temp)
File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/mnt/ZhiLight', '-DCMAKE_CXX_STANDARD=17', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/mnt/ZhiLight/build/lib.linux-x86_64-3.8/zhilight/', '-DPYTHON_EXECUTABLE=/usr/bin/python3', '-DPYTHON_VERSION=3.8', '-DCMAKE_BUILD_TYPE=Release', '-DWITH_TESTING=ON', '-DEXAMPLE_VERSION_INFO=0.4.7', '-GNinja', '-DCMAKE_MAKE_PROGRAM:FILEPATH=ninja', '-DPython_ROOT_DIR=/usr']' returned non-zero exit status 1.
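
For anyone reproducing this, the failures in the logs above (the numpy pin, the old cmake, the missing pybind11) could be checked up front before running setup.py. Below is a minimal preflight sketch. It is not part of ZhiLight, and the function names are hypothetical. The CMake minimum of 3.18 comes from the error message above. The Python minimum of 3.10 is my assumption about what the numpy==2.1.3 pin needs; it would be consistent with pip on Python 3.8 only offering versions up to 1.24.4.

```python
import importlib.util
import re
import shutil
import subprocess
import sys

# Thresholds: MIN_CMAKE is quoted from the build error above;
# MIN_PYTHON is an assumption about what the numpy==2.1.3 pin requires.
MIN_PYTHON = (3, 10)
MIN_CMAKE = (3, 18)


def fmt(version):
    """Render a version tuple such as (3, 16, 3) as '3.16.3'."""
    return ".".join(str(part) for part in version)


def parse_version(text):
    """Return the first dotted version in text as a tuple of ints, or None."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(0).split("."))


def check_python(min_version=MIN_PYTHON, current=None):
    """Compare the running interpreter (or `current`) against min_version."""
    if current is None:
        current = tuple(sys.version_info[:2])
    ok = tuple(current) >= tuple(min_version)
    return ok, "python {} (need >= {})".format(fmt(current), fmt(min_version))


def check_cmake(min_version=MIN_CMAKE, version_output=None):
    """Check the cmake on PATH; `version_output` lets callers supply the text."""
    if version_output is None:
        exe = shutil.which("cmake")
        if exe is None:
            return False, "cmake not found on PATH"
        result = subprocess.run(
            [exe, "--version"], capture_output=True, text=True, check=False
        )
        version_output = result.stdout
    found = parse_version(version_output)
    if found is None:
        return False, "could not parse cmake version"
    ok = found >= tuple(min_version)
    return ok, "cmake {} (need >= {})".format(fmt(found), fmt(min_version))


def check_module(name):
    """Report whether a top-level Python module can be located for import."""
    ok = importlib.util.find_spec(name) is not None
    return ok, "module {} {}".format(name, "found" if ok else "missing")


def main():
    checks = [check_python(), check_cmake(), check_module("pybind11")]
    for ok, message in checks:
        print("[ok]  " if ok else "[FAIL]", message)
    return all(ok for ok, _ in checks)


if __name__ == "__main__":
    main()
```

Saved under a name of your choice and run with python3 inside the container, this prints one line per check. If the pybind11 line fails, installing it with pip3 install pybind11 is the first thing I would try, though I have not confirmed that this alone gets the ZhiLight build through.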