Describe the bug

Start mistralrs-server with:

RUST_LOG=trace mistralrs-server --port 20005 --isq Q4K --truncate-sequence plain -m /data/ai/huggingface/DeepSeek-R1-Distill-Qwen-14B

then send a large chat completion request:

http POST http://gpu:20005/v1/chat/completions < /tmp/big-data.json

big-data.json
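For anyone trying to reproduce without the original attachment: a sketch of how an oversized request body could be generated. The field names follow the OpenAI-compatible chat completions schema that the server exposes; the model name and message content here are placeholders, not the original data.

```python
import json

# Build a deliberately oversized chat-completions request body so the
# prompt exceeds the model's context length and the --truncate-sequence
# path is exercised. "default" is a placeholder model name.
long_text = "hello world " * 50_000

payload = {
    "model": "default",
    "messages": [
        {"role": "user", "content": long_text},
    ],
}

# Write the payload where the httpie command above expects it.
with open("/tmp/big-data.json", "w") as f:
    json.dump(payload, f)
```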
Expected

mistralrs-server truncates the request and replies with a response.

What happened

It prints this log, pins the CPU at 100%, uses 10.716 GB of VRAM, and never replies:

mistral.log
Latest commit or version

b73e2e9, built with:

export CUDA_NVCC_FLAGS=-fPIE
cargo b -r --features='cuda cudnn'

GPU is a Tesla T4.