
Installation errors #2

@zhudy

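I tried this on both x86 and ARM virtual machines. Following the Dockerfile in the docker directory, the container starts on both with: docker run -it -v ~/zhihu:/mnt nvidia/cuda:12.5.1-devel-ubuntu20.04, but I hit the same problems on each: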

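  1. The nvidia/cuda:12.5.1-devel-ubuntu20.04 image does not appear to ship with Python preinstalled on either ARM or x86, so after starting the container an extra step is needed inside it: apt install python3 python3-pip
  2. Inside the container, running root@8140d29f1925:/mnt/ZhiLight# pip3 install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
    always fails (I don't know why numpy 2.1.3 cannot be found; changing it to the latest 2.2.0 is not found either, and dropping the Tsinghua mirror gives the same error):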
    Requirement already satisfied: torch==2.4.1 in /usr/local/lib/python3.8/dist-packages (from -r requirements.txt (line 1)) (2.4.1)
    ERROR: Could not find a version that satisfies the requirement numpy==2.1.3 (from -r requirements.txt (line 2)) (from versions: 1.3.0, 1.4.1, 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1, 1.7.2, 1.8.0, 1.8.1, 1.8.2, 1.9.0, 1.9.1, 1.9.2, 1.9.3, 1.10.0.post2, 1.10.1, 1.10.2, 1.10.4, 1.11.0, 1.11.1, 1.11.2, 1.11.3, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 1.13.3, 1.14.0, 1.14.1, 1.14.2, 1.14.3, 1.14.4, 1.14.5, 1.14.6, 1.15.0, 1.15.1, 1.15.2, 1.15.3, 1.15.4, 1.16.0, 1.16.1, 1.16.2, 1.16.3, 1.16.4, 1.16.5, 1.16.6, 1.17.0, 1.17.1, 1.17.2, 1.17.3, 1.17.4, 1.17.5, 1.18.0, 1.18.1, 1.18.2, 1.18.3, 1.18.4, 1.18.5, 1.19.0, 1.19.1, 1.19.2, 1.19.3, 1.19.4, 1.19.5, 1.20.0, 1.20.1, 1.20.2, 1.20.3, 1.21.0, 1.21.1, 1.21.2, 1.21.3, 1.21.4, 1.21.5, 1.21.6, 1.22.0, 1.22.1, 1.22.2, 1.22.3, 1.22.4, 1.23.0, 1.23.1, 1.23.2, 1.23.3, 1.23.4, 1.23.5, 1.24.0, 1.24.1, 1.24.2, 1.24.3, 1.24.4)
    ERROR: No matching distribution found for numpy==2.1.3 (from -r requirements.txt (line 2))
  3. Ignoring the second error and continuing with the build inside the container: root@8140d29f1925:/mnt/ZhiLight# python3 setup.py bdist_wheel
    fails with: error: [Errno 2] No such file or directory: 'cmake'
    Running apt install -y cmake does not fix it; the build then reports: CMake 3.18 or higher is required. You are running version 3.16.3
    So remove it first: apt remove cmake
    and switch to a newer version: wget https://cmake.org/files/v3.31/cmake-3.31.2-linux-x86_64.tar.gz && tar xzvf cmake-3.31.2-linux-x86_64.tar.gz && export PATH=$PATH:/mnt/cmake-3.31.2-linux-x86_64/bin
  4. Continuing with the build inside the container: root@8140d29f1925:/mnt/ZhiLight# python3 setup.py bdist_wheel
    running bdist_wheel
    running build
    running build_py
    running egg_info
    writing zhilight.egg-info/PKG-INFO
    writing dependency_links to zhilight.egg-info/dependency_links.txt
    writing requirements to zhilight.egg-info/requires.txt
    writing top-level names to zhilight.egg-info/top_level.txt
    reading manifest file 'zhilight.egg-info/SOURCES.txt'
    writing manifest file 'zhilight.egg-info/SOURCES.txt'
    running build_ext
    -- The C compiler identification is GNU 9.4.0
    -- The CXX compiler identification is GNU 9.4.0
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working C compiler: /usr/bin/cc - skipped
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Check for working CXX compiler: /usr/bin/c++ - skipped
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- The CUDA compiler identification is NVIDIA 12.5.82 with host compiler GNU 9.4.0
    -- Detecting CUDA compiler ABI info
    -- Detecting CUDA compiler ABI info - done
    -- Check for working CUDA compiler: /usr/local/cuda/bin/nvcc - skipped
    -- Detecting CUDA compile features
    -- Detecting CUDA compile features - done
    CUDA Version: 12.5.82
    -- Found Python: /usr/bin/python3.8 (found version "3.8.10") found components: Interpreter Development Development.Module Development.Embed
    Will link against CUDA 12 complied libraries
    CMAKE_CXX_FLAGS -D_GLIBCXX_USE_CXX11_ABI=0
    -- CMAKE_INSTALL_RPATH:
    -- Submodule update
    -- USE_STATIC_NCCL is set. Linking with static NCCL library.
    -- Found NCCL: /usr/include
    -- Determining NCCL version from /usr/include/nccl.h...
    -- Looking for NCCL_VERSION_CODE
    -- Looking for NCCL_VERSION_CODE - not found
    -- NCCL version < 2.3.5-5
    -- Found NCCL (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libnccl_static.a)
    Traceback (most recent call last):
    File "<string>", line 1, in <module>
    ModuleNotFoundError: No module named 'pybind11'
    CMake Error at CMakeLists.txt:68 (find_package):
    By not providing "Findpybind11.cmake" in CMAKE_MODULE_PATH this project has
    asked CMake to find a package configuration file provided by "pybind11",
    but CMake did not find one.

Could not find a package configuration file provided by "pybind11" with any
of the following names:

pybind11Config.cmake
pybind11-config.cmake

Add the installation prefix of "pybind11" to CMAKE_PREFIX_PATH or set
"pybind11_DIR" to a directory containing one of the above files. If
"pybind11" provides a separate development package or SDK, be sure it has
been installed.

-- Configuring incomplete, errors occurred!
Traceback (most recent call last):
File "setup.py", line 138, in <module>
setup(
File "/usr/lib/python3/dist-packages/setuptools/__init__.py", line 144, in setup
return distutils.core.setup(**attrs)
File "/usr/lib/python3.8/distutils/core.py", line 148, in setup
dist.run_commands()
File "/usr/lib/python3.8/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/usr/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/wheel/bdist_wheel.py", line 223, in run
self.run_command('build')
File "/usr/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3.8/distutils/command/build.py", line 135, in run
self.run_command(cmd_name)
File "/usr/lib/python3.8/distutils/cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "/usr/lib/python3.8/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/usr/lib/python3/dist-packages/setuptools/command/build_ext.py", line 87, in run
_build_ext.run(self)
File "/usr/lib/python3.8/distutils/command/build_ext.py", line 340, in run
self.build_extensions()
File "/usr/lib/python3.8/distutils/command/build_ext.py", line 449, in build_extensions
self._build_extensions_serial()
File "/usr/lib/python3.8/distutils/command/build_ext.py", line 474, in _build_extensions_serial
self.build_extension(ext)
File "setup.py", line 102, in build_extension
subprocess.check_call(["cmake", ext.sourcedir] + cmake_args, cwd=build_temp)
File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['cmake', '/mnt/ZhiLight', '-DCMAKE_CXX_STANDARD=17', '-DCMAKE_LIBRARY_OUTPUT_DIRECTORY=/mnt/ZhiLight/build/lib.linux-x86_64-3.8/zhilight/', '-DPYTHON_EXECUTABLE=/usr/bin/python3', '-DPYTHON_VERSION=3.8', '-DCMAKE_BUILD_TYPE=Release', '-DWITH_TESTING=ON', '-DEXAMPLE_VERSION_INFO=0.4.7', '-GNinja', '-DCMAKE_MAKE_PROGRAM:FILEPATH=ninja', '-DPython_ROOT_DIR=/usr']' returned non-zero exit status 1.
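
The three failures above share a pattern: the tools present in the Ubuntu 20.04 container are either older than what the build asks for or missing entirely. The following is a minimal preflight sketch, not part of the project, that checks for this before running pip3 or setup.py. The helper names `version_ge` and `report` are hypothetical. The 3.18 minimum for CMake is taken from the error message in step 3. The 3.10 minimum for Python is an assumption based on NumPy 2.1.x supporting only Python 3.10 and newer, which would explain why pip on Python 3.8 lists nothing newer than 1.24.4. The script relies on GNU `sort -V` for version ordering.

```shell
#!/bin/sh
# Preflight sketch: compare installed tool versions with the minimums
# suggested by the logs above. Helper names are hypothetical.

# version_ge A B: succeeds when version A >= version B (uses GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# report NAME HAVE NEED: print one status line; never aborts the script.
report() {
    if [ -z "$2" ]; then
        echo "$1: not found (need >= $3)"
    elif version_ge "$2" "$3"; then
        echo "$1: $2 ok (need >= $3)"
    else
        echo "$1: $2 too old (need >= $3)"
    fi
}

# Python: assumed minimum 3.10, since numpy==2.1.3 is pinned in requirements.txt.
py_ver=$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null)
report "python3" "$py_ver" "3.10"

# CMake: the build reported "CMake 3.18 or higher is required".
cmake_ver=$(cmake --version 2>/dev/null | head -n 1 | awk '{print $3}')
report "cmake" "$cmake_ver" "3.18"

# pybind11: the log shows a failed Python import right before find_package fails.
if python3 -c 'import pybind11' 2>/dev/null; then
    echo "pybind11: importable"
else
    echo "pybind11: missing"
fi
```

With the versions from this report (Python 3.8.10, CMake 3.16.3 from apt, no pybind11), all three checks would flag a problem, which matches the errors in steps 2 to 4.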
