Multiple issues installing for AMD GPU (Radeon RX7600XT) #1519

mcondarelli · 2025-02-16T14:12:05Z

System Info

I am under Linux Mint Xia (based on Ubuntu 24.04).
CPU: AMD Ryzen 9 5950X 16-Core Processor with 64GiB RAM.
GPU: Advanced Micro Devices, Inc. [AMD/ATI] Navi 33 [Radeon RX 7600/7600 XT/7600M XT/7600S/7700S / PRO W7600] (rev c0) (Actually: RX 7600 XT, if it matters)
Python: Python 3.10.16
Application: I am testing with InvokeAI

Reproduction

I tried following the recipe but ran into several problems:

The project has been converted to .toml, so pip install -r requirements-dev.txt no longer works.
cmake -DCOMPUTE_BACKEND=hip -S . && make completes with no errors (just a few warnings such as "kernels.hip:2857:17: warning: loop not unrolled:...").

Installing into the venv did not complain, but usage resulted in a hard error:

Could not load bitsandbytes native library: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
Traceback (most recent call last):
File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 107, in <module>
  lib = get_native_library()
File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 86, in get_native_library
  dll = ct.cdll.LoadLibrary(str(binary_path))
File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
  return self._dlltype(name)
File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
  self._handle = _dlopen(self._name, mode)
OSError: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi

ROCm Setup failed despite ROCm being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate ROCm libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues

I also tried:

(invoke) mcon@ikea:~/tmp/t$ ROCM_PATH=/opt/rocm LD_LIBRARY_PATH=/opt/rocm/lib python -m bitsandbytes
Could not load bitsandbytes native library: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
Traceback (most recent call last):
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 107, in <module>
    lib = get_native_library()
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 86, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi

ROCm Setup failed despite ROCm being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate ROCm libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
ROCm specs: rocm_version_string='63', rocm_version_tuple=(6, 3)
PyTorch settings found: ROCM_VERSION=63
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Checking that the library is importable and ROCm is callable...
Couldn't load the bitsandbytes library, likely due to missing binaries.
Please ensure bitsandbytes is properly installed.

For source installations, compile the binaries with `cmake -DCOMPUTE_BACKEND=hip -S .`.
See the documentation for more details if needed.

Trying a simple check anyway, but this will likely fail...
Traceback (most recent call last):
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/diagnostics/main.py", line 73, in main
    sanity_check()
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/diagnostics/main.py", line 37, in sanity_check
    p1 = p.data.sum().item()
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

Above we output some debug information.
Please provide this info when creating an issue via https://github.com/TimDettmers/bitsandbytes/issues/new/choose
WARNING: Please be sure to sanitize sensitive info from the output before posting it.

Expected behavior

I expected to be able to use bitsandbytes.
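
For anyone who wants to check a freshly built library before wiring it into an application, here is a minimal sketch that reproduces the loader failure in isolation and pulls the missing symbol out of the message. The helper names `load_error` and `undefined_symbol` are made up for illustration and are not part of bitsandbytes; the regex simply assumes the message has the same "undefined symbol: NAME" shape as the output above.

```python
import ctypes
import re
from typing import Optional


def load_error(path: str) -> Optional[str]:
    """Try to load the shared library; return the loader message on failure, else None."""
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        return str(exc)
    return None


def undefined_symbol(message: str) -> Optional[str]:
    """Extract NAME from a loader message containing 'undefined symbol: NAME'."""
    match = re.search(r"undefined symbol: (\S+)", message)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical path: replace with the .so inside your own site-packages.
    message = load_error("/path/to/site-packages/bitsandbytes/libbitsandbytes_rocm63.so")
    if message is None:
        print("library loaded fine")
    else:
        print("loader said:", message)
        print("missing symbol:", undefined_symbol(message))
```

The mangled name can then be passed through a demangler such as c++filt; judging from the name alone, it looks like a host-side stub for a kOptimizer32bit1State kernel instantiated for hip_bfloat16.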


matthewdouglas · 2025-02-19T00:45:35Z

Thanks for reporting. You're right that the requirements-dev.txt has been removed and we need to update the docs.

Unfortunately the preview branch is in a broken state at the moment on ROCm. We're working on it! The commit at a0a95fd might be the best bet in the meantime.

mcondarelli · 2025-02-19T01:06:18Z

Unfortunately the preview branch is in a broken state at the moment on ROCm. We're working on it! The commit at a0a95fd might be the best bet in the meantime.

Thanks, I am now using the fork by AMD (https://github.com/ROCm/bitsandbytes), which seems to be working. What do you advise: use that, or the commit at a0a95fd?

matthewdouglas · 2025-02-19T15:39:34Z

I think either should be fine but if you've already achieved a working build then I would personally stick to that.

visionscaper · 2025-02-20T21:55:02Z

I had exactly the same issue:

Could not load bitsandbytes native library: bitsandbytes/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi

Reverting to commit a0a95fd seems to have resolved this particular issue; however, another one surfaced:

$ python -m bitsandbytes
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.10/runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/home/freddy/workspace/bitsandbytes/bitsandbytes/__init__.py", line 70, in <module>
    from .nn import modules
  File "/home/freddy/workspace/bitsandbytes/bitsandbytes/nn/__init__.py", line 21, in <module>
    from .triton_based_modules import (
  File "/home/freddy/workspace/bitsandbytes/bitsandbytes/nn/triton_based_modules.py", line 7, in <module>
    from bitsandbytes.triton.int8_matmul_mixed_dequantize import (
  File "/home/freddy/workspace/bitsandbytes/bitsandbytes/triton/int8_matmul_mixed_dequantize.py", line 12, in <module>
    from triton.ops.matmul_perf_model import early_config_prune, estimate_matmul_time
ModuleNotFoundError: No module named 'triton.ops'

This seems to relate to issue #1492 . Is this correct?

Anything else I can do to make it work in the meantime?
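
A quick way to tell whether an environment is affected, without importing bitsandbytes at all, is to ask the import system whether `triton.ops` can be found. This is only a sketch: `module_available` is a hypothetical helper name, and it relies on nothing beyond the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the import system can locate `name`.

    For a dotted name, find_spec imports the parent package but not the
    submodule itself, so this stays cheap for the leaf module.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package (for example `triton` itself) is missing.
        return False


if __name__ == "__main__":
    for name in ("triton", "triton.ops"):
        print(name, "->", "found" if module_available(name) else "missing")
```

If `triton` is found but `triton.ops` is missing, the traceback above is the expected result of importing bitsandbytes at that commit.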

visionscaper · 2025-02-20T22:05:29Z

I also checked the AMD fork of bitsandbytes, but this resulted in the same error: ModuleNotFoundError: No module named 'triton.ops'.

I'm using the latest PyTorch nightly:

pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.3

$ pip show torch
Name: torch
Version: 2.7.0.dev20250220+rocm6.3
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3-Clause
Location: lib/python3.10/site-packages
Requires: filelock, fsspec, jinja2, networkx, pytorch-triton-rocm, sympy, typing-extensions
Required-by: accelerate, bitsandbytes, lion-pytorch, torchaudio, torchvision

Lier0 · 2025-02-21T11:54:08Z

triton-lang/triton#5471
pip install --force-reinstall triton==3.1.0 may work around this.

mcondarelli · 2025-02-21T12:42:00Z

triton-lang/triton#5471 pip install --force-reinstall triton==3.1.0 may work around.

I assume you mean pytorch-triton[-rocm]==3.1.0 as plain triton is at v2.1.0, right?

Lier0 · 2025-02-21T14:04:27Z

Yeah.
But there are other ROCm-related dependencies.
python-triton-rocm-3.1.0 goes with python-torch-2.5.1, which is for rocm6.2, so xformers may complain.

mcondarelli · 2025-02-22T00:30:56Z

FYI:
This incomplete and convoluted script produces an apparently working installation.
I will try to simplify it.

#!/bin/bash
set -e

script_path=$(readlink -f "$0" 2>/dev/null || realpath "$0" 2>/dev/null || echo "$0")
sdir="$(dirname "${script_path}")"
sdir="$(cd "$sdir" && pwd)"
echo "The path of this script is: $script_path ($sdir)"
here=$(pwd)
user=$(ls -ld "$script_path" | awk '{print $3}')
home=$(getent passwd "$user" | cut -d: -f6)
echo "Home directory of $user is $home"

VENV="torch_venv"
PYTHON="python3.11"
REPO=https://download.pytorch.org/whl/nightly/rocm6.3

# Function to extract and install missing packages
install_missing_packages() {
    local EXTRA=$1
    local continue=Y
    while [ "$continue" == Y ]
    do
        # Run pip check and capture the output
        output=$($VENV/bin/pip check 2>&1 || :)
        echo "$output"

        # Extract missing packages
        missing_packages=()
        while IFS= read -r line
        do
            echo "$line"
            if [[ $line =~ requires\ ([^,]+),\ which\ is\ not\ installed ]]
            then
                missing_packages+=("${BASH_REMATCH[1]}")
                echo "${missing_packages[*]}"
            fi
        done <<< "$output"

        # If no missing packages, exit the loop
        if [ ${#missing_packages[@]} -eq 0 ]; then
            echo "No more missing packages found."
            break
        fi

        continue=N
        # Install missing packages with --ignore-installed --no-deps
        echo "Installing missing packages: ${missing_packages[*]}"
        for pkg in "${missing_packages[@]}"
        do
            echo "Running: pip install --ignore-installed --no-deps $EXTRA $pkg"
            if $VENV/bin/pip install --ignore-installed --no-deps $EXTRA "$pkg"
            then
                continue=Y
            else
                echo "Failed to install $pkg. Continuing with the next package..."
            fi
        done
    done
}

check_base () {
    if [ -d "$VENV" ]
    then
        # check Virtual Environment exists
        if [ -x "$VENV/bin/python" ]
        then
            echo "Virtual Environment at '$VENV' already present, skipping..."
        else
            echo "Directory at '$VENV' exists but doesn't look like a Virtual Environment: bailing out."
            exit 1
        fi
    else
        echo "Creating basic Virtual Environment at '$VENV'..."
        # prepare environment
        $PYTHON -m venv $VENV
        $VENV/bin/pip install -U pip wheel
    fi
}

check_torch () {
    modules='pytorch-triton-rocm==3.1.0 torch torchvision torchaudio'
    # shellcheck disable=SC2043
    for module in $modules
    do
        name=${module%%==*}
        name=${name//-/_}
        
        [ -n "$(ls -dl $VENV/lib/*/site-packages/${name}{-*,.*,} 2>/dev/null)" ] || $VENV/bin/pip install --index-url $REPO --no-deps $module
    done
    install_missing_packages "--index-url $REPO"
    install_missing_packages
}

check_bitsandbytes () {
    repository='https://github.com/ROCm/bitsandbytes'
    tag='rocm_enabled_multi_backend'
    arch='gfx1100;gfx1102'
    module='bitsandbytes'
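    # A quoted glob never expands, so the original test was always true;
    # let ls expand the pattern instead, as check_torch does.
    if [ -z "$(ls -d "$VENV"/lib/*/site-packages/"$module" 2>/dev/null)" ]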
    then
        echo "Module '$module' not present in $VENV: installing..."
        if [ ! -d "$sdir/$module" ]
        then
            echo "Sources for '$module' not present in cache: rebuilding..."
            (
                cd "$sdir"
                git clone "$repository"
                cd "$module"
                git checkout "$tag"
                cmake -DCOMPUTE_BACKEND=hip -DBNB_ROCM_ARCH="$arch" -S .
                make
            )
        fi
        $VENV/bin/pip install --no-deps "$sdir/$module"
        install_missing_packages
    fi
}

check_base
check_torch
check_bitsandbytes

export PYTORCH_ROCM_ARCH=gfx1102
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export PYTORCH_HIP_ALLOC_CONF=expandable_segments:True
export TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1
export INVOKEAI_ROOT=~/invokeai
export GPU_DRIVER=rocm
$VENV/bin/python -m bitsandbytes
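
The core of install_missing_packages is the regular expression applied to the `pip check` output. For anyone who prefers to do that step outside bash, a rough Python equivalent of just the parsing could look like the sketch below. The function name `missing_packages` is invented for illustration; the pattern is the same one the script uses, and unlike the script this version skips duplicates.

```python
import re
from typing import List

# Same pattern as the bash script: "... requires NAME, which is not installed"
_MISSING = re.compile(r"requires ([^,]+), which is not installed")


def missing_packages(pip_check_output: str) -> List[str]:
    """Collect requirement names that `pip check` reports as not installed."""
    found: List[str] = []
    for line in pip_check_output.splitlines():
        match = _MISSING.search(line)
        # Keep first-seen order and drop repeats (the bash version keeps them).
        if match and match.group(1) not in found:
            found.append(match.group(1))
    return found


if __name__ == "__main__":
    # Illustrative input only, not output captured from a real environment.
    sample = (
        "torch 2.7.0 requires sympy, which is not installed.\n"
        "torch 2.7.0 requires filelock, which is not installed.\n"
    )
    print(missing_packages(sample))
```

The returned names could then be handed to pip install --no-deps one at a time, which is what the loop in the script does.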

visionscaper · 2025-02-23T10:53:48Z

@Lier0 @mcondarelli Did you mean pytorch-triton-rocm==2.1.0? 3.1.0 doesn't exist ...

Edit: I now see I have pytorch-triton-rocm==3.2.0+git4b3bb1f8 after installing PyTorch nightly:

pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.3

But that results in the issue No module named 'triton.ops' mentioned before.

How to get pytorch-triton-rocm==3.1.0?

visionscaper · 2025-02-23T11:24:24Z

@Lier0 I figured it out, thanks to @mcondarelli's script. The correct command is:

pip install --force-reinstall pytorch-triton-rocm==3.1.0 --index-url https://download.pytorch.org/whl/nightly/rocm6.3

In your reply you didn't add the --index-url https://download.pytorch.org/whl/nightly/rocm6.3 part, which threw me off.
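
To confirm which versions actually ended up in the environment after the reinstall, something like the following sketch may help. It only uses the standard library's `importlib.metadata`; the helper name `installed_versions` is made up for this example, and the distribution names are the ones mentioned in this thread.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    result: Dict[str, Optional[str]] = {}
    for name in names:
        try:
            result[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    wanted = ["torch", "pytorch-triton-rocm", "bitsandbytes"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

Going by the reports in this thread, pytorch-triton-rocm at 3.1.0 is the combination that avoids the triton.ops import error, while 3.2.0+git4b3bb1f8 triggers it.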

msyzzm · 2025-02-24T06:29:43Z

@Lier0 I figured it out, thanks to @mcondarelli script. The correct command is:
pip install --force-reinstall pytorch-triton-rocm==3.1.0 --index-url https://download.pytorch.org/whl/nightly/rocm6.3
In your reply you didn't add the --index-url https://download.pytorch.org/whl/nightly/rocm6.3 part, which threw me off.

Works for me. Thank you!

TimDettmers · 2025-02-28T14:41:57Z

Thank you for bringing this up and the discussion. We will try to understand the issues here and get back to you.

mcondarelli · 2025-03-03T23:13:43Z

The error still occurs with the latest wheel, installed with:

pip install https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_multi-backend-refactor/bitsandbytes-1.0.0-py3-none-manylinux_2_24_x86_64.whl

Please fix before releasing v1.0.0.
I am available for testing, if useful.

ERROR    Could not load bitsandbytes native library: /home/mcon/LLaMaConda/FluxGym/fluxgym/env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
Traceback (most recent call last):
  File "/home/mcon/LLaMaConda/FluxGym/fluxgym/env/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 107, in <module>
    lib = get_native_library()
  File "/home/mcon/LLaMaConda/FluxGym/fluxgym/env/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 86, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/mcon/LLaMaConda/FluxGym/fluxgym/env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi

mcondarelli mentioned this issue Feb 18, 2025

[Issue]: Unable to compile ROCm5.7 on recent Ubuntu (InvokeAI + fp16 + RX7600 incompatibility) ROCm/ROCm#4358

Closed

matthewdouglas added the documentation and ROCm labels Feb 19, 2025

matthewdouglas self-assigned this Feb 19, 2025

TimDettmers added the high priority and Low Risk labels Feb 28, 2025

Looong01 mentioned this issue Mar 2, 2025

[AMD ROCm] OSError: ~/miniconda3/envs/geneformer/lib/python3.11/site-packages/bitsandbytes/libbitsandbytes_rocm62.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi #1550

Closed
