Name | Last commit | Date
core | [Bugfix] Respect num-gpu-blocks-override in v1 (#19503) | 2025-06-12 11:00:23 +00:00
e2e | [CI] change spell checker from codespell to typos (#18711) | 2025-06-11 19:57:10 -07:00
engine | [Bugfix] Fix TP inference for Flex attention backend (#19657) | 2025-06-16 11:21:37 +00:00
entrypoints | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00
kv_connector | [CI] change spell checker from codespell to typos (#18711) | 2025-06-11 19:57:10 -07:00
metrics | Fix ValueError: Missing value for tag key(s): model_name,engine. (#19113) | 2025-06-04 17:10:45 +08:00
sample | [CI] change spell checker from codespell to typos (#18711) | 2025-06-11 19:57:10 -07:00
shutdown | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00
spec_decode | [Bugfix] Fix EAGLE vocab embedding construction for Llama 70B (#19033) | 2025-06-05 19:10:08 -07:00
structured_output | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00
tpu | [TPU] support attention head dim smaller than 128 (#19620) | 2025-06-16 06:40:53 +00:00
worker | Revert "[v1] Add fp32 support to v1 engine through flex attn" (#19404) | 2025-06-10 01:30:20 -07:00
__init__.py | [V1] `AsyncLLM` Implementation (#9826) | 2024-11-11 23:05:38 +00:00
test_async_llm_dp.py | [Core] Raise when non-multi-instance DP clients target a DP rank (#19227) | 2025-06-06 19:03:01 +08:00
test_metrics_reader.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00
test_oracle.py | [v1] Support mamba2 (#19327) | 2025-06-18 20:34:15 +00:00
test_request.py | [Misc] Add __str__ for RequestStatus (#19780) | 2025-06-18 03:03:01 +00:00
test_serial_utils.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00
test_utils.py | [Misc] Add SPDX-FileCopyrightText (#19100) | 2025-06-03 11:20:17 -07:00