Multiple issues installing for AMD GPU (Radeon RX7600XT) #1519

Open
mcondarelli opened this issue Feb 16, 2025 · 14 comments
Labels: documentation, high priority, Low Risk, ROCm

@mcondarelli

mcondarelli commented Feb 16, 2025

System Info

I am running Linux Mint Xia (based on Ubuntu 24.04).
CPU: AMD Ryzen 9 5950X 16-Core Processor with 64GiB RAM.
GPU: Advanced Micro Devices, Inc. [AMD/ATI] Navi 33 [Radeon RX 7600/7600 XT/7600M XT/7600S/7700S / PRO W7600] (rev c0) (Actually: RX 7600 XT, if it matters)
Python: Python 3.10.16
Application: I am testing with InvokeAI

Reproduction

I tried following the installation recipe, but I ran into several errors:

  • The project has been converted to .toml, so pip install -r requirements-dev.txt no longer works.
  • cmake -DCOMPUTE_BACKEND=hip -S . && make completes with no errors (just a few warnings such as "kernels.hip:2857:17: warning: loop not unrolled:...").
  • Installing into a venv raised no complaints, but using the library resulted in a hard error:
    Could not load bitsandbytes native library: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
    Traceback (most recent call last):
    File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 107, in <module>
      lib = get_native_library()
    File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 86, in get_native_library
      dll = ct.cdll.LoadLibrary(str(binary_path))
    File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
      return self._dlltype(name)
    File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
      self._handle = _dlopen(self._name, mode)
    OSError: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
    
    ROCm Setup failed despite ROCm being available. Please run the following command to get more information:
    
    python -m bitsandbytes
    
    Inspect the output of the command and see if you can locate ROCm libraries. You might need to add them
    to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
    and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues
    

I also tried:

(invoke) mcon@ikea:~/tmp/t$ ROCM_PATH=/opt/rocm LD_LIBRARY_PATH=/opt/rocm/lib python -m bitsandbytes
Could not load bitsandbytes native library: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
Traceback (most recent call last):
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 107, in <module>
    lib = get_native_library()
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 86, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi

ROCm Setup failed despite ROCm being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate ROCm libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/bitsandbytes-foundation/bitsandbytes/issues

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++ BUG REPORT INFORMATION ++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
ROCm specs: rocm_version_string='63', rocm_version_tuple=(6, 3)
PyTorch settings found: ROCM_VERSION=63
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Checking that the library is importable and ROCm is callable...
Couldn't load the bitsandbytes library, likely due to missing binaries.
Please ensure bitsandbytes is properly installed.

For source installations, compile the binaries with `cmake -DCOMPUTE_BACKEND=hip -S .`.
See the documentation for more details if needed.

Trying a simple check anyway, but this will likely fail...
Traceback (most recent call last):
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/diagnostics/main.py", line 73, in main
    sanity_check()
  File "/home/mcon/tmp/t/invoke/lib/python3.10/site-packages/bitsandbytes/diagnostics/main.py", line 37, in sanity_check
    p1 = p.data.sum().item()
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.

Above we output some debug information.
Please provide this info when creating an issue via https://github.com/TimDettmers/bitsandbytes/issues/new/choose
WARNING: Please be sure to sanitize sensitive info from the output before posting it.

Expected behavior

I expected to be able to use bitsandbytes.
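
The loader message in this report names a mangled C++ symbol. As a rough reading aid, here is a minimal Python sketch that pulls the symbol out of the error text and decodes its leading, length-prefixed identifier under the Itanium C++ name-mangling convention. The helper names `undefined_symbol` and `leading_name` are hypothetical and not part of bitsandbytes, and the sketch only handles the simple `_Z<length><name>` prefix, not full demangling.

```python
import re

def undefined_symbol(message):
    """Return the symbol named in an 'undefined symbol: ...' loader error, or None."""
    match = re.search(r"undefined symbol: (\S+)", message)
    return match.group(1) if match else None

def leading_name(mangled):
    """Decode the first length-prefixed identifier of a simple _Z<len><name> symbol."""
    match = re.match(r"_Z(\d+)", mangled)
    if match is None:
        return None  # nested (_ZN...) or unmangled names are out of scope here
    length = int(match.group(1))
    start = match.end()
    return mangled[start:start + length]

# Error text copied from the report above (library path shortened).
error_text = (
    "libbitsandbytes_rocm63.so: undefined symbol: "
    "_Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi"
)
symbol = undefined_symbol(error_text)
print(leading_name(symbol))  # __device_stub__kOptimizer32bit1State
```

The part that follows the name (`I12hip_bfloat16Li2E`) encodes the template arguments `hip_bfloat16` and the integer `2`, so the missing symbol appears to be the launch stub for the bfloat16 instantiation of that optimizer kernel.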

@matthewdouglas
Member

Thanks for reporting. You're right that the requirements-dev.txt has been removed and we need to update the docs.

Unfortunately the preview branch is in a broken state at the moment on ROCm. We're working on it! The commit at a0a95fd might be the best bet in the meantime.

@matthewdouglas self-assigned this Feb 19, 2025
@mcondarelli
Author

Unfortunately the preview branch is in a broken state at the moment on ROCm. We're working on it! The commit at a0a95fd might be the best bet in the meantime.

Thanks, I am now using the fork by AMD (https://github.com/ROCm/bitsandbytes), which seems to be working. What do you advise: use that, or the commit at a0a95fd?

@matthewdouglas
Member

I think either should be fine but if you've already achieved a working build then I would personally stick to that.

@visionscaper

I had exactly the same issue:

Could not load bitsandbytes native library: bitsandbytes/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi

Reverting to commit a0a95fd seems to have resolved this particular issue, but another one surfaced:

$ python -m bitsandbytes
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 187, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.10/runpy.py", line 146, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/lib/python3.10/runpy.py", line 110, in _get_module_details
    __import__(pkg_name)
  File "/home/freddy/workspace/bitsandbytes/bitsandbytes/__init__.py", line 70, in <module>
    from .nn import modules
  File "/home/freddy/workspace/bitsandbytes/bitsandbytes/nn/__init__.py", line 21, in <module>
    from .triton_based_modules import (
  File "/home/freddy/workspace/bitsandbytes/bitsandbytes/nn/triton_based_modules.py", line 7, in <module>
    from bitsandbytes.triton.int8_matmul_mixed_dequantize import (
  File "/home/freddy/workspace/bitsandbytes/bitsandbytes/triton/int8_matmul_mixed_dequantize.py", line 12, in <module>
    from triton.ops.matmul_perf_model import early_config_prune, estimate_matmul_time
ModuleNotFoundError: No module named 'triton.ops'

This seems to be related to issue #1492. Is that correct?

Anything else I can do to make it work in the meantime?
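
Before rebuilding anything, it can help to confirm whether the installed Triton actually ships the `triton.ops` subpackage that the traceback above fails to import. The following is a minimal sketch using only the standard library; `has_module` is a hypothetical helper name, not a bitsandbytes or Triton API.

```python
import importlib.util

def has_module(name):
    """Return True if `name` can be located without importing the leaf module itself."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package (here it would be `triton`) is missing entirely.
        return False

for candidate in ("triton", "triton.ops"):
    print(candidate, "available" if has_module(candidate) else "missing")
```

If `triton` is reported as available but `triton.ops` is missing, the failure comes from the installed Triton version rather than from the bitsandbytes build.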

@visionscaper

visionscaper commented Feb 20, 2025

I also checked the AMD fork of bitsandbytes, but it resulted in the same error: ModuleNotFoundError: No module named 'triton.ops'.

I'm using the latest PyTorch nightly:

pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.3
$ pip show torch
Name: torch
Version: 2.7.0.dev20250220+rocm6.3
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3-Clause
Location: lib/python3.10/site-packages
Requires: filelock, fsspec, jinja2, networkx, pytorch-triton-rocm, sympy, typing-extensions
Required-by: accelerate, bitsandbytes, lion-pytorch, torchaudio, torchvision

@Lier0

Lier0 commented Feb 21, 2025

See triton-lang/triton#5471.
Running pip install --force-reinstall triton==3.1.0 may work around it.

@mcondarelli
Author

triton-lang/triton#5471 pip install --force-reinstall triton==3.1.0 may work around.

I assume you mean pytorch-triton[-rocm]==3.1.0 as plain triton is at v2.1.0, right?

@Lier0

Lier0 commented Feb 21, 2025

Yeah.
But there are other ROCm-related dependencies to consider.
python-triton-rocm-3.1.0 goes with python-torch-2.5.1 (around rocm6.2), so xformers may complain.
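
Because the working combination depends on which versions of several packages are installed together, a quick inventory can be useful when comparing setups. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is hypothetical, and the package list is simply the set of names mentioned in this thread.

```python
from importlib import metadata

def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Distribution names taken from this thread; adjust to your own environment.
for name, version in installed_versions(
    ["torch", "pytorch-triton-rocm", "triton", "bitsandbytes"]
).items():
    print(f"{name}: {version or 'not installed'}")
```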

@mcondarelli
Author

FYI:
This incomplete and convoluted script produces an apparently working installation.
I will try to simplify it.

#!/bin/bash
set -e

script_path=$(readlink -f "$0" 2>/dev/null || realpath "$0" 2>/dev/null || echo "$0")
sdir="$(dirname "${script_path}")"
sdir="$(cd "$sdir" && pwd)"
echo "The path of this script is: $script_path ($sdir)"
here=$(pwd)
user=$(ls -ld "$script_path" | awk '{print $3}')
home=$(getent passwd "$user" | cut -d: -f6)
echo "Home directory of $user is $home"

VENV="torch_venv"
PYTHON="python3.11"
REPO=https://download.pytorch.org/whl/nightly/rocm6.3

# Function to extract and install missing packages
install_missing_packages() {
    local EXTRA=$1
    local continue=Y
    while [ "$continue" == Y ]
    do
        # Run pip check and capture the output
        output=$($VENV/bin/pip check 2>&1 || :)
        echo "$output"

        # Extract missing packages
        missing_packages=()
        while IFS= read -r line
        do
            echo "$line"
            if [[ $line =~ requires\ ([^,]+),\ which\ is\ not\ installed ]]
            then
                missing_packages+=("${BASH_REMATCH[1]}")
                echo "${missing_packages[*]}"
            fi
        done <<< "$output"

        # If no missing packages, exit the loop
        if [ ${#missing_packages[@]} -eq 0 ]; then
            echo "No more missing packages found."
            break
        fi

        continue=N
        # Install missing packages with --ignore-installed --no-deps
        echo "Installing missing packages: ${missing_packages[*]}"
        for pkg in "${missing_packages[@]}"
        do
            echo "Running: pip install --ignore-installed --no-deps $EXTRA $pkg"
            if $VENV/bin/pip install --ignore-installed --no-deps $EXTRA "$pkg"
            then
                continue=Y
            else
                echo "Failed to install $pkg. Continuing with the next package..."
            fi
        done
    done
}

check_base () {
    if [ -d "$VENV" ]
    then
        # check Virtual Environment exists
        if [ -x "$VENV/bin/python" ]
        then
            echo "Virtual Environment at '$VENV' already present, skipping..."
        else
            echo "Directory at '$VENV' exists but doesn't look like a Virtual Environment: bailing out."
            exit 1
        fi
    else
        echo "Creating basic Virtual Environment at '$VENV'..."
        # prepare environment
        $PYTHON -m venv $VENV
        $VENV/bin/pip install -U pip wheel
    fi
}

check_torch () {
    modules='pytorch-triton-rocm==3.1.0 torch torchvision torchaudio'
    # shellcheck disable=SC2043
    for module in $modules
    do
        name=${module%%==*}
        name=${name//-/_}
        
        [ -n "$(ls -dl $VENV/lib/*/site-packages/${name}{-*,.*,} 2>/dev/null)" ] || $VENV/bin/pip install --index-url $REPO --no-deps $module
    done
    install_missing_packages "--index-url $REPO"
    install_missing_packages
}

check_bitsandbytes () {
    repository='https://github.com/ROCm/bitsandbytes'
    tag='rocm_enabled_multi_backend'
    arch='gfx1100;gfx1102'
    module='bitsandbytes'
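    # A quoted glob is never expanded by [ -d ], so let compgen -G test the pattern instead
    if ! compgen -G "$VENV/lib/*/site-packages/$module" > /dev/null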
    then
        echo "Module '$module' not present in $VENV: installing..."
        if [ ! -d "$sdir/$module" ]
        then
            echo "Sources for '$module' not present in cache: rebuilding..."
            (
                cd "$sdir"
                git clone "$repository"
                cd "$module"
                git checkout "$tag"
                cmake -DCOMPUTE_BACKEND=hip -DBNB_ROCM_ARCH="$arch" -S .
                make
            )
        fi
        $VENV/bin/pip install --no-deps "$sdir/$module"
        install_missing_packages
    fi
}

check_base
check_torch
check_bitsandbytes

export PYTORCH_ROCM_ARCH=gfx1102
export HSA_OVERRIDE_GFX_VERSION=11.0.0
export PYTORCH_HIP_ALLOC_CONF=expandable_segments:True
export TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1
export INVOKEAI_ROOT=~/invokeai
export GPU_DRIVER=rocm
$VENV/bin/python -m bitsandbytes
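
For readers who prefer Python over bash, two pieces of the script above translate fairly directly: the regular expression that extracts missing requirements from `pip check` output, and the set of ROCm-related environment variables exported before the final check. The sketch below is only an approximation of the script's logic; the function names are hypothetical, and the sample text is a made-up input in the shape the script's pattern expects.

```python
import os
import re

# Same pattern the bash function matches against `pip check` output.
MISSING = re.compile(r"requires ([^,]+), which is not installed")

def missing_from_pip_check(output):
    """Return the requirement names that `pip check` reports as not installed."""
    return [m.group(1) for m in map(MISSING.search, output.splitlines()) if m]

def rocm_env(base=None):
    """Copy an environment and add the ROCm-related variables exported by the script."""
    env = dict(os.environ if base is None else base)
    env.update({
        "PYTORCH_ROCM_ARCH": "gfx1102",
        "HSA_OVERRIDE_GFX_VERSION": "11.0.0",
        "PYTORCH_HIP_ALLOC_CONF": "expandable_segments:True",
        "TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL": "1",
    })
    return env

# Made-up sample input, not real output from this thread.
sample = (
    "pkg-a 1.0 requires sympy, which is not installed.\n"
    "pkg-b 2.0 requires filelock, which is not installed.\n"
)
print(missing_from_pip_check(sample))  # ['sympy', 'filelock']
```

The environment could then be passed to a child process, for example `subprocess.run(["torch_venv/bin/python", "-m", "bitsandbytes"], env=rocm_env())`, which mirrors the last line of the script. The architecture values are the ones the author used for an RX 7600 XT and would presumably differ on other GPUs.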

@visionscaper

visionscaper commented Feb 23, 2025

@Lier0 @mcondarelli Did you mean pytorch-triton-rocm==2.1.0? 3.1.0 doesn't exist ...

Edit: I now see I have pytorch-triton-rocm==3.2.0+git4b3bb1f8 after installing PyTorch nightly:

pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.3

But that results in the issue No module named 'triton.ops' mentioned before.

How do I get pytorch-triton-rocm==3.1.0?

@visionscaper

@Lier0 I figured it out, thanks to @mcondarelli's script. The correct command is:

pip install --force-reinstall pytorch-triton-rocm==3.1.0 --index-url https://download.pytorch.org/whl/nightly/rocm6.3

In your reply you didn't add the --index-url https://download.pytorch.org/whl/nightly/rocm6.3 part, which threw me off.
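
Going only by the reports in this thread (3.2.0+git4b3bb1f8 fails with the triton.ops error, 3.1.0 works), a version guard could look like the sketch below. The function names are hypothetical, and the 3.2 cutoff is an assumption drawn from these comments rather than from Triton's release notes.

```python
def release_tuple(version):
    """Turn '3.2.0+git4b3bb1f8' into (3, 2, 0), ignoring the local '+...' suffix."""
    public = version.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        if not piece.isdigit():
            break  # stop at pre-release pieces such as 'dev20250220'
        parts.append(int(piece))
    return tuple(parts)

def predates_reported_breakage(version):
    """True if the version is below 3.2, the first release reported as failing here."""
    return release_tuple(version) < (3, 2)

for v in ("3.1.0", "3.2.0+git4b3bb1f8"):
    print(v, predates_reported_breakage(v))
```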

@msyzzm

msyzzm commented Feb 24, 2025

@Lier0 I figured it out, thanks to @mcondarelli script. The correct command is:

pip install --force-reinstall pytorch-triton-rocm==3.1.0 --index-url https://download.pytorch.org/whl/nightly/rocm6.3

In your reply you didn't add the --index-url https://download.pytorch.org/whl/nightly/rocm6.3 part, which threw me off.

Works for me. Thank you!

@TimDettmers added the high priority and Low Risk labels Feb 28, 2025
@TimDettmers
Collaborator

Thank you for bringing this up, and for the discussion. We will try to understand the issues here and get back to you.

@mcondarelli
Author

The error still occurs with the latest wheel, installed as:

pip install https://github.com/bitsandbytes-foundation/bitsandbytes/releases/download/continuous-release_multi-backend-refactor/bitsandbytes-1.0.0-py3-none-manylinux_2_24_x86_64.whl

Please fix before releasing v1.0.0.
I am available for testing, if useful.

ERROR    Could not load bitsandbytes native library: /home/mcon/LLaMaConda/FluxGym/fluxgym/env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
Traceback (most recent call last):
  File "/home/mcon/LLaMaConda/FluxGym/fluxgym/env/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 107, in <module>
    lib = get_native_library()
  File "/home/mcon/LLaMaConda/FluxGym/fluxgym/env/lib/python3.10/site-packages/bitsandbytes/cextension.py", line 86, in get_native_library
    dll = ct.cdll.LoadLibrary(str(binary_path))
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 452, in LoadLibrary
    return self._dlltype(name)
  File "/usr/local/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /home/mcon/LLaMaConda/FluxGym/fluxgym/env/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_rocm63.so: undefined symbol: _Z36__device_stub__kOptimizer32bit1StateI12hip_bfloat16Li2EEvPT_S2_PfS3_ffffffiffbi
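
To test a prebuilt wheel like this one in isolation, the native library can be opened directly with `ctypes`, which is the same kind of call that fails in the traceback above. This is a minimal sketch; `try_load_native` is a hypothetical helper, and the path shown is a placeholder to be replaced with the `.so` inside your own site-packages directory.

```python
import ctypes

def try_load_native(path):
    """Try to dlopen a shared library; return None on success or the error text."""
    try:
        ctypes.CDLL(str(path))
    except OSError as exc:
        return str(exc)
    return None

# Hypothetical path: substitute the .so inside your own site-packages directory.
error = try_load_native("/path/to/site-packages/bitsandbytes/libbitsandbytes_rocm63.so")
print("loaded OK" if error is None else f"load failed: {error}")
```

An "undefined symbol" message at this stage means the problem is in the compiled library itself, independent of PyTorch or Triton.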
