Insights: NVIDIA/TensorRT-Model-Optimizer
Overview
- 1 Merged pull request
- 0 Open pull requests
- 2 Closed issues
- 0 New issues
1 Pull request merged by 1 person
- Update README.md news NVFP4 Blog
  #223 merged Jun 27, 2025
2 Issues closed by 2 people
- SmoothQuant int8_sq weight_quantizer seems not working properly ?
  #222 closed Jun 26, 2025
- B200x8 Run DeepSeek-V3-0324-FP4 Error
  #220 closed Jun 24, 2025
9 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- [refactor] Update Ln-norm logic for upcoming PyTorch update
  #206 commented on Jun 28, 2025 • 1 new comment
- Explicit quantization in PyTorch before ONNX leads to slower TRT engine than ONNX PTQ
  #207 commented on Jun 22, 2025 • 0 new comments
- [RFC] TensorRT Model Optimizer - Product Roadmap
  #146 commented on Jun 23, 2025 • 0 new comments
- can not restore model
  #216 commented on Jun 24, 2025 • 0 new comments
- bug for torch/quantization/tensor_quant.py
  #221 commented on Jun 24, 2025 • 0 new comments
- Qwen2_MoE AWQ(w4a16/w4a8) quantization failed with Nan AssertionError
  #182 commented on Jun 24, 2025 • 0 new comments
- support for python 3.13
  #217 commented on Jun 24, 2025 • 0 new comments
- 5090 error
  #218 commented on Jun 26, 2025 • 0 new comments
- [BUG] FP8 real_quantization doesnt work with block_sizes
  #193 commented on Jun 26, 2025 • 0 new comments