
Commit 2017cd9
committed Mar 3, 2025
Update for 0.25.0 release
Tags: 0.31.0, 0.25.0
1 parent: 25090b0

File tree: 177 files changed, +6646 -1950 lines


.pre-commit-config.yaml

Lines changed: 33 additions & 21 deletions
@@ -6,42 +6,43 @@ exclude: >
 
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v4.6.0
+    rev: v5.0.0
     hooks:
-      - id: trailing-whitespace
-      - id: mixed-line-ending
-        args: [--fix=lf]
-      - id: end-of-file-fixer
-      - id: check-merge-conflict
-      - id: requirements-txt-fixer
-      - id: debug-statements
-      - id: check-json
-        exclude: ^.vscode/.*.json  # vscode files can take comments
-      - id: check-yaml
-        args: [--allow-multiple-documents]
-      - id: check-toml
       - id: check-added-large-files
         args: [--maxkb=500, --enforce-all]
         exclude: >
           (?x)^(
             examples/diffusers/quantization/assets/.*.png|
             examples/diffusers/cache_diffusion/assets/.*.png|
           )$
+      - id: check-json
+        exclude: ^.vscode/.*.json  # vscode files can take comments
+      - id: check-merge-conflict
+      - id: check-symlinks
+      - id: check-toml
+      - id: check-yaml
+        args: [--allow-multiple-documents]
+      - id: debug-statements
+      - id: end-of-file-fixer
+      - id: mixed-line-ending
+        args: [--fix=lf]
+      - id: requirements-txt-fixer
+      - id: trailing-whitespace
 
   - repo: https://github.com/executablebooks/mdformat
-    rev: 0.7.17
+    rev: 0.7.21
    hooks:
       - id: mdformat
 
   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.6.4
+    rev: v0.9.4
     hooks:
       - id: ruff
         args: [--fix, --exit-non-zero-on-fix]
       - id: ruff-format
 
   - repo: https://github.com/pre-commit/mirrors-mypy
-    rev: v1.11.2
+    rev: v1.14.1
     hooks:
       - id: mypy
 
@@ -88,25 +89,27 @@ repos:
           (?x)^(
             modelopt/onnx/quantization/operators.py|
             modelopt/onnx/quantization/ort_patching.py|
+            modelopt/torch/_deploy/utils/onnx_utils.py|
             modelopt/torch/export/transformer_engine.py|
             modelopt/torch/quantization/export_onnx.py|
             modelopt/torch/quantization/plugins/attention.py|
-            modelopt/torch/speculative/plugins/transformers.py|
             modelopt/torch/speculative/eagle/utils.py|
-            modelopt/torch/_deploy/utils/onnx_utils.py|
+            modelopt/torch/speculative/plugins/transformers.py|
             examples/chained_optimizations/bert_prune_distill_quantize.py|
-            examples/diffusers/quantization/onnx_utils/export.py|
             examples/diffusers/cache_diffusion/pipeline/models/sdxl.py|
+            examples/diffusers/quantization/onnx_utils/export.py|
             examples/llm_eval/gen_model_answer.py|
             examples/llm_eval/humaneval.py|
             examples/llm_eval/lm_eval_hf.py|
             examples/llm_eval/mmlu.py|
             examples/llm_eval/modeling.py|
-            examples/llm_sparsity/finetune.py|
             examples/llm_qat/main.py|
+            examples/llm_sparsity/finetune.py|
             examples/speculative_decoding/main.py|
             examples/speculative_decoding/medusa_utils.py|
             examples/speculative_decoding/vllm_generate.py|
+            examples/deepseek/quantize_to_nvfp4.py|
+            examples/deepseek/ptq.py|
           )$
 
 # Default hook for Apache 2.0 in core c/c++/cuda files
@@ -132,7 +135,7 @@ repos:
         types_or: [shell]
 
   - repo: https://github.com/keith/pre-commit-buildifier
-    rev: 6.4.0
+    rev: 8.0.1
     hooks:
       - id: buildifier
       - id: buildifier-lint
@@ -143,3 +146,12 @@ repos:
       - id: bandit
         args: ["-c", "pyproject.toml", "-q"]
         additional_dependencies: ["bandit[toml]"]
+
+  # Link checker
+  - repo: https://github.com/lycheeverse/lychee.git
+    rev: v0.15.1
+    hooks:
+      - id: lychee
+        args: ["--no-progress", "--exclude-loopback"]
+        stages: [manual]  # Only run with `pre-commit run --all-files --hook-stage manual lychee`
+        exclude: internal/
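For reference, here is the newly added lychee hook assembled as it would appear in the final `.pre-commit-config.yaml` (same `rev`, `args`, and `exclude` as in the commit; the hook is gated to the manual stage, so it only runs when invoked as `pre-commit run --all-files --hook-stage manual lychee`):

```yaml
repos:
  # Link checker, skipped during normal commits; run it on demand with
  # `pre-commit run --all-files --hook-stage manual lychee`
  - repo: https://github.com/lycheeverse/lychee.git
    rev: v0.15.1
    hooks:
      - id: lychee
        args: ["--no-progress", "--exclude-loopback"]
        stages: [manual]
        exclude: internal/
```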

CHANGELOG.rst

Lines changed: 22 additions & 0 deletions
@@ -1,6 +1,28 @@
 Model Optimizer Changelog (Linux)
 =================================
 
+0.25 (2025-03-03)
+^^^^^^^^^^^^^^^^^
+
+**Backward Breaking Changes**
+
+- Deprecate Torch 2.1 support.
+- Deprecate ``humaneval`` benchmark in ``llm_eval`` examples. Please use the newly added ``simple_eval`` instead.
+- Deprecate ``fp8_naive`` quantization format in ``llm_ptq`` examples. Please use ``fp8`` instead.
+
+**New Features**
+
+- Support fast Hadamard transform in :class:`TensorQuantizer <modelopt.torch.quantization.nn.modules.TensorQuantizer>`.
+  It can be used for rotation-based quantization methods, e.g. QuaRot. Users need to install the `fast_hadamard_transform <https://github.com/Dao-AILab/fast-hadamard-transform>`_ package to use this feature.
+- Add affine quantization support for the KV cache, resolving the low accuracy issue in models such as Qwen2.5 and Phi-3/3.5.
+- Add FSDP2 support. FSDP2 can now be used for QAT.
+- Add `LiveCodeBench <https://livecodebench.github.io/>`_ and `Simple Evals <https://github.com/openai/simple-evals>`_ to the ``llm_eval`` examples.
+- Disable saving the modelopt state in the unified HF export APIs by default, i.e., add a ``save_modelopt_state`` flag to the ``export_hf_checkpoint`` API that defaults to False.
+- Add FP8 and NVFP4 real quantization support with an LLM QLoRA example.
+- :class:`modelopt.deploy.llm.LLM` now supports the :class:`tensorrt_llm._torch.LLM` backend for quantized HuggingFace checkpoints.
+- Add an `NVFP4 PTQ example for DeepSeek-R1 <https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples/deepseek>`_.
+- Add an end-to-end `AutoDeploy example for AutoQuant LLM models <https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples/llm_autodeploy>`_.
+
 0.23 (2025-01-29)
 ^^^^^^^^^^^^^^^^^
 
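The fast Hadamard transform mentioned in the 0.25 changelog underpins rotation-based quantization methods such as QuaRot: weights and activations are multiplied by an orthogonal Hadamard matrix, which spreads per-channel outliers across channels before quantization. A minimal NumPy sketch of the transform itself (an illustration only, not ModelOpt's implementation, which relies on the CUDA `fast_hadamard_transform` kernel):

```python
import numpy as np

def fwht(x: np.ndarray) -> np.ndarray:
    """Fast Walsh-Hadamard transform along the last axis, O(n log n).

    The length n must be a power of two. Unnormalized, the transform is an
    involution up to a factor of n: fwht(fwht(x)) == n * x.
    """
    x = np.array(x, dtype=np.float64)  # work on a copy
    n = x.shape[-1]
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    flat = x.reshape(-1, n)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = flat[:, i : i + h].copy()
            b = flat[:, i + h : i + 2 * h].copy()
            flat[:, i : i + h] = a + b          # butterfly: sums
            flat[:, i + h : i + 2 * h] = a - b  # butterfly: differences
        h *= 2
    return flat.reshape(x.shape)

# Dividing by sqrt(n) makes the transform orthogonal: the rotation preserves
# the norm while flattening outliers, which is what helps quantization.
w = np.array([10.0, 0.1, 0.1, 0.1])   # one large outlier channel
w_rot = fwht(w) / np.sqrt(w.size)     # rotated weights, same norm, smaller max
```

The quantizer then operates on `w_rot`, whose dynamic range is much tighter than that of `w`; the inverse rotation is the same transform again (scaled by `1/sqrt(n)`).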

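The "affine quantization support for the KV cache" item in the 0.25 changelog refers to asymmetric quantization, which adds a zero-point so that a skewed value distribution maps onto the full integer grid instead of wasting half the range. A generic sketch of the idea (hypothetical helper names, not ModelOpt's API):

```python
import numpy as np

def affine_quantize(x: np.ndarray, num_bits: int = 8):
    """Asymmetric (affine) quantization: q = round(x / scale) + zero_point.

    Unlike symmetric quantization (zero_point fixed at the grid center),
    the zero_point lets a skewed range, e.g. strictly non-negative
    KV-cache values, use all 2**num_bits levels.
    """
    qmin, qmax = 0, 2**num_bits - 1
    xmin, xmax = float(x.min()), float(x.max())
    scale = (xmax - xmin) / (qmax - qmin) or 1.0  # guard constant inputs
    zero_point = int(round(qmin - xmin / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int32)
    return q, scale, zero_point

def affine_dequantize(q: np.ndarray, scale: float, zero_point: int) -> np.ndarray:
    return (q.astype(np.float64) - zero_point) * scale

kv = np.array([0.0, 0.5, 1.0, 7.5])   # skewed, non-negative cache values
q, s, z = affine_quantize(kv)
kv_hat = affine_dequantize(q, s, z)   # reconstruction error bounded by scale/2
```

With a symmetric scheme the negative half of the integer range would go unused for data like this; the affine zero-point reclaims it, which is the accuracy fix the changelog describes for models such as Qwen2.5 and Phi-3/3.5.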