Lu Fang
|
c6703d1e0d
|
[MISC] Remove unused variables in C++ (#19609)
Signed-off-by: Lu Fang <lufang@fb.com>
|
2025-06-15 20:05:28 -07:00 |
Lu Fang
|
d3ccbd6350
|
Fix CUDA kernel index data type in vllm/csrc/quantization/fused_kernels/layernorm_utils.cuh +10 (#15159)
Signed-off-by: Lu Fang <lufang@fb.com>
Co-authored-by: Richard Barnes <rbarnes@meta.com>
|
2025-03-21 10:01:11 +08:00 |
Lu Fang
|
8c0d15d5c5
|
[Misc][Easy] Annotate unused vars in the csrc files (#14798)
Signed-off-by: Lu Fang <lufang@fb.com>
|
2025-03-15 12:40:09 +08:00 |
bnellnm
|
5467ac3196
|
[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)
|
2024-06-09 16:23:30 -04:00 |
Michael Goin
|
5f6d10c14c
|
[CI/Build] Enforce style for C++ and CUDA code with `clang-format` (#4722)
|
2024-05-22 07:18:41 +00:00 |
Antoni Baum
|
a10d3056da
|
[Core] Set `linear_weights` directly on the layer (#3977)
|
2024-04-11 16:35:51 -04:00 |
CHU Tianxiang
|
01a5d18a53
|
Add Support for 2/3/8-bit GPTQ Quantization Models (#2330)
|
2024-02-28 21:52:23 -08:00 |
Woosuk Kwon
|
6ef00b03a2
|
Enable CUDA graph for GPTQ & SqueezeLLM (#2318)
|
2024-01-03 09:52:29 -08:00 |
kliuae
|
1b7c791d60
|
[ROCm] Fixes for GPTQ on ROCm (#2180)
|
2023-12-18 10:41:04 -08:00 |
CHU Tianxiang
|
0fbfc4b81b
|
Add GPTQ support (#916)
|
2023-12-15 03:04:22 -08:00 |