Metal kernel error #1116

Open
Paulescu opened this issue Jan 30, 2025 · 0 comments
Labels
bug Something isn't working

Describe the bug

I built mistral.rs from source on my MacBook M2, with support for Metal kernels:

git clone https://github.com/EricLBuehler/mistral.rs.git
cd mistral.rs
cargo build --release --features metal

and started an interactive session:

./mistralrs-server -i --isq Q4K plain -m deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

but got this error:

2025-01-30T22:29:41.412313Z ERROR mistralrs_core::engine: prompt step - Model failed with error: Metal(KernelError(LoadLibraryError("In module 'monolithic_metal' imported from <built-in>:1:\n/System/Library/PrivateFrameworks/GPUCompiler.framework/Versions/31001/Libraries/lib/clang/31001.720/include/metal/metal_matrix:20:83: error: failed requirement 'is_scalar<_MLX_BFloat16>::value'; 'enable_if' cannot be used to disable this declaration\ntemplate <typename T, int Cols, int Rows = Cols, typename _E = typename enable_if<is_scalar<T>::value && is_floating_point<T>::value && Cols >=2 && Rows >= 2>::type>\n                                                                                  ^~~~~~~~~~~~~~~~~~~\nprogram_source:263:9: note: in instantiation of default argument for 'matrix<_MLX_BFloat16, 4, 4>' required here\ntypedef matrix<bfloat16_t, 4, 4> bfloat4x4;\n        ^~~~~~~~~~~~~~~~~~~~~~~~\nprogram_source:2191:19: warning: unused variable 'ncs'\n    const int64_t ncs = ne00;\n                  ^\nprogram_source:2192:19: warning: unused variable 'nr'\n    const int64_t nr  = ne01;\n                  ^\nprogram_source:2193:19: warning: unused variable 'n_t'\n    const int64_t n_t = ne1;\n                  ^\nprogram_source:2194:19: warning: unused variable 'n_s'\n    const int64_t n_s = ne2;\n                  ^\nprogram_source:2248:19: warning: unused variable 'nr'\n    const int64_t nr  = d_inner;\n                  ^\nprogram_source:2250:19: warning: unused variable 'n_s'\n    const int64_t n_s = n_seqs;\n                  ^\nprogram_source:7630:106: error: use of undeclared identifier 'simdgroup_bfloat8x8'; did you mean 'simdgroup_load'?\ntemplate [[host_name(\"kernel_mul_mm_bf16_f32\")]]    kernel mat_mm_t kernel_mul_mm<bfloat16_t, bfloat4x4, simdgroup_bfloat8x8, bfloat4x4,     1,     dequantize_bf16>;\n                                                                                                         ^~~~~~~~~~~~~~~~~~~\n                                  
                                                                       simdgroup_load\n/System/Library/PrivateFrameworks/GPUCompiler.framework/Versions/31001/Libraries/lib/clang/31001.720/include/metal/metal_simdgroup_matrix:88:17: note: 'simdgroup_load' declared here\nMETAL_FUNC void simdgroup_load(thread simdgroup_matrix<T, Cols, Rows> &d, const threadgroup T *src, ulong elements_per_row = Cols, ulong2 matrix_origin = ulong2(0, 0), bool transpose_matrix = false)\n                ^\nprogram_source:7630:69: error: explicit instantiation of 'kernel_mul_mm' does not refer to a function template, variable template, member function, member class, or static data member\ntemplate [[host_name(\"kernel_mul_mm_bf16_f32\")]]    kernel mat_mm_t kernel_mul_mm<bfloat16_t, bfloat4x4, simdgroup_bfloat8x8, bfloat4x4,     1,     dequantize_bf16>;\n                                                                    ^\nprogram_source:7245:13: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'simdgroup_T8x8'\nkernel void kernel_mul_mm(device const  uchar * src0,\n            ^\n")))
2025-01-30T22:29:41.412479Z ERROR mistralrs_server::interactive_mode: Got a model error: "Metal error Error while loading library: In module 'monolithic_metal' imported from <built-in>:1:\n/System/Library/PrivateFrameworks/GPUCompiler.framework/Versions/31001/Libraries/lib/clang/31001.720/include/metal/metal_matrix:20:83: error: failed requirement 'is_scalar<_MLX_BFloat16>::value'; 'enable_if' cannot be used to disable this declaration\ntemplate <typename T, int Cols, int Rows = Cols, typename _E = typename enable_if<is_scalar<T>::value && is_floating_point<T>::value && Cols >=2 && Rows >= 2>::type>\n                                                                                  ^~~~~~~~~~~~~~~~~~~\nprogram_source:263:9: note: in instantiation of default argument for 'matrix<_MLX_BFloat16, 4, 4>' required here\ntypedef matrix<bfloat16_t, 4, 4> bfloat4x4;\n        ^~~~~~~~~~~~~~~~~~~~~~~~\nprogram_source:2191:19: warning: unused variable 'ncs'\n    const int64_t ncs = ne00;\n                  ^\nprogram_source:2192:19: warning: unused variable 'nr'\n    const int64_t nr  = ne01;\n                  ^\nprogram_source:2193:19: warning: unused variable 'n_t'\n    const int64_t n_t = ne1;\n                  ^\nprogram_source:2194:19: warning: unused variable 'n_s'\n    const int64_t n_s = ne2;\n                  ^\nprogram_source:2248:19: warning: unused variable 'nr'\n    const int64_t nr  = d_inner;\n                  ^\nprogram_source:2250:19: warning: unused variable 'n_s'\n    const int64_t n_s = n_seqs;\n                  ^\nprogram_source:7630:106: error: use of undeclared identifier 'simdgroup_bfloat8x8'; did you mean 'simdgroup_load'?\ntemplate [[host_name(\"kernel_mul_mm_bf16_f32\")]]    kernel mat_mm_t kernel_mul_mm<bfloat16_t, bfloat4x4, simdgroup_bfloat8x8, bfloat4x4,     1,     dequantize_bf16>;\n                                                                                                         ^~~~~~~~~~~~~~~~~~~\n                                    
                                                                     simdgroup_load\n/System/Library/PrivateFrameworks/GPUCompiler.framework/Versions/31001/Libraries/lib/clang/31001.720/include/metal/metal_simdgroup_matrix:88:17: note: 'simdgroup_load' declared here\nMETAL_FUNC void simdgroup_load(thread simdgroup_matrix<T, Cols, Rows> &d, const threadgroup T *src, ulong elements_per_row = Cols, ulong2 matrix_origin = ulong2(0, 0), bool transpose_matrix = false)\n                ^\nprogram_source:7630:69: error: explicit instantiation of 'kernel_mul_mm' does not refer to a function template, variable template, member function, member class, or static data member\ntemplate [[host_name(\"kernel_mul_mm_bf16_f32\")]]    kernel mat_mm_t kernel_mul_mm<bfloat16_t, bfloat4x4, simdgroup_bfloat8x8, bfloat4x4,     1,     dequantize_bf16>;\n                                                                    ^\nprogram_source:7245:13: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'simdgroup_T8x8'\nkernel void kernel_mul_mm(device const  uchar * src0,\n            ^\n", response: ChatCompletionResponse { id: "1", choices: [Choice { finish_reason: "error", index: 0, message: ResponseMessage { content: Some(""), role: "assistant", tool_calls: [] }, logprobs: None }], created: 1738276181, model: "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B", system_fingerprint: "local", object: "chat.completion", usage: Usage { completion_tokens: 0, prompt_tokens: 12, total_tokens: 12, avg_tok_per_sec: 38.834953, avg_prompt_tok_per_sec: inf, avg_compl_tok_per_sec: NaN, total_time_sec: 0.309, total_prompt_time_sec: 0.0, total_completion_time_sec: 0.0 } }
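Most of the log above is unused-variable warnings, so the actual compiler errors are easy to miss. Here is a minimal sketch for filtering them out, assuming the log has been saved as text with newlines escaped as a literal `\n`, as in the output above. The helper name `extract_errors` is hypothetical and not part of mistral.rs.

```python
import re


def extract_errors(log: str) -> list[str]:
    """Return the distinct compiler 'error:' lines found in a pasted log."""
    # The Rust Debug output escapes newlines as the two characters '\' and 'n',
    # so turn them back into real line breaks first.
    text = log.replace("\\n", "\n")
    seen: list[str] = []
    for line in text.splitlines():
        # Match clang-style diagnostics: <file>:<line>:<col>: error: <message>
        m = re.search(r"(\S+:\d+:\d+): error: (.*)", line)
        if m:
            entry = f"{m.group(1)}: {m.group(2)}"
            if entry not in seen:  # the server prints the same log twice
                seen.append(entry)
    return seen


if __name__ == "__main__":
    # Two diagnostics copied from the log above, with escaped newlines.
    sample = (
        "program_source:2191:19: warning: unused variable 'ncs'\\n"
        "program_source:7630:106: error: use of undeclared identifier "
        "'simdgroup_bfloat8x8'; did you mean 'simdgroup_load'?\\n"
    )
    for err in extract_errors(sample):
        print(err)
```

Run on the full log, this should leave only the three `error:` diagnostics: the failed `is_scalar<_MLX_BFloat16>` requirement, the undeclared `simdgroup_bfloat8x8`, and the invalid explicit instantiation of `kernel_mul_mm`. All three involve the bf16 kernels.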

Latest commit or version

Which commit or version you ran with.

@Paulescu Paulescu added the bug Something isn't working label Jan 30, 2025