vllm/tests/quantization

Latest commit: 7974736740, "Add support for loading torchao models with `AOPerModuleConfig`" (#17826)
Author: Jerry Zhang
Signed-off-by: Jerry Zhang <jerryzh168@gmail.com>
Date: 2025-05-14 16:24:59 -07:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | [CI/Build] Move `test_utils.py` to `tests/utils.py` (#4425) | 2024-05-13 23:50:09 +09:00 |
| `test_bitsandbytes.py` | [Misc] Auto detect bitsandbytes pre-quantized models (#16027) | 2025-04-04 23:30:45 -07:00 |
| `test_compressed_tensors.py` | [Misc] Add compressed-tensors NVFP4A16 emulation support (#17914) | 2025-05-11 15:58:38 +08:00 |
| `test_configs.py` | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00 |
| `test_cpu_offload.py` | [V1] Fully Transparent Implementation of CPU Offloading (#15354) | 2025-03-31 20:22:34 +08:00 |
| `test_experts_int8.py` | [Misc] Add SPDX-License-Identifier headers to python source files (#12628) | 2025-02-02 11:58:18 -08:00 |
| `test_fp8.py` | [FEAT][ROCm] Integrate Fused MoE Kernels from AITER (#14967) | 2025-03-26 16:30:30 +08:00 |
| `test_gptq_dynamic.py` | [V1] V1 Enablement Oracle (#13726) | 2025-03-14 22:02:20 -07:00 |
| `test_ipex_quant.py` | [Misc] Add SPDX-License-Identifier headers to python source files (#12628) | 2025-02-02 11:58:18 -08:00 |
| `test_lm_head.py` | [V1] V1 Enablement Oracle (#13726) | 2025-03-14 22:02:20 -07:00 |
| `test_ptpc_fp8.py` | [ROCm] [Feature] [Doc] [Dockerfile] [BugFix] Support Per-Token-Activation Per-Channel-Weight FP8 Quantization Inferencing (#12501) | 2025-02-07 08:13:43 -08:00 |
| `test_quark.py` | [Bugfix] Fix quark fp8 format loading on AMD GPUs (#12612) | 2025-05-08 02:53:53 -07:00 |
| `test_register_quantization_config.py` | Improve configs - `ModelConfig` (#17130) | 2025-04-30 10:38:22 +08:00 |
| `test_torchao.py` | Add support for loading torchao models with `AOPerModuleConfig` (#17826) | 2025-05-14 16:24:59 -07:00 |
| `utils.py` | [Misc] Add SPDX-License-Identifier headers to python source files (#12628) | 2025-02-02 11:58:18 -08:00 |
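
The tests listed above exercise vLLM's quantized-checkpoint loading paths (bitsandbytes, compressed-tensors, FP8, GPTQ, quark, torchao, and others). As a rough sketch of the workflow they cover, the snippet below loads a pre-quantized checkpoint through vLLM's `LLM` entry point; the model id is a hypothetical placeholder, and the quantization backend is assumed to be inferred from the checkpoint's config rather than passed explicitly.

```python
# Minimal sketch: loading a pre-quantized checkpoint with vLLM.
# The model id is a hypothetical placeholder, not a file from this directory.
from vllm import LLM, SamplingParams

# vLLM reads the quantization method declared in the checkpoint's config
# (e.g. bitsandbytes, compressed-tensors, torchao) and selects a matching backend.
llm = LLM(model="example-org/example-quantized-model")

params = SamplingParams(temperature=0.0, max_tokens=32)
outputs = llm.generate(["The capital of France is"], params)
print(outputs[0].outputs[0].text)
```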