vllm/csrc/core
Lucas Wilkinson 288cc6c234
[Attention] MLA with chunked prefill (#12639)
Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Co-authored-by: Patrick Horn <patrick.horn@gmail.com>
Co-authored-by: simon-mo <xmo@berkeley.edu>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
2025-02-21 15:30:12 -08:00
..
exception.hpp [Bugfix] Fix Marlin MoE act order when is_k_full == False (#8741) 2024-09-28 18:19:40 -07:00
math.hpp [Attention] MLA with chunked prefill (#12639) 2025-02-21 15:30:12 -08:00
registration.h [CI/Build] Per file CUDA Archs (improve wheel size and dev build times) (#8845) 2024-10-03 22:55:25 -04:00
scalar_type.hpp Move linting to `pre-commit` (#11975) 2025-01-20 14:58:01 +08:00