async_engine | [V1] DP scale-out (2/N): Decouple engine process management and comms (#15977) | 2025-05-13 10:48:21 -07:00 |
basic_correctness | [Core][Feature] Input metadata dump on crash (#13407) | 2025-05-07 22:15:09 +00:00 |
benchmarks | Add `vllm bench [latency, throughput]` CLI commands (#16508) | 2025-04-14 23:10:35 -07:00 |
compile | [AMD][torch.compile] Enable silu+fp8_quant fusion for rocm (#18082) | 2025-05-13 22:13:56 -07:00 |
config | [Feature] specify model in config.yaml (#15798) | 2025-04-01 01:20:06 -07:00 |
core | [Core] [Bugfix] Add Input Embeddings (#15428) | 2025-05-02 01:06:39 -07:00 |
detokenizer | [V1] V1 Enablement Oracle (#13726) | 2025-03-14 22:02:20 -07:00 |
distributed | [Feature] Support Pipeline Parallism in torchrun SPMD offline inference for V1 (#17827) | 2025-05-15 22:28:27 -07:00 |
encoder_decoder | [V1] V1 Enablement Oracle (#13726) | 2025-03-14 22:02:20 -07:00 |
engine | Allow users to pass arbitrary JSON keys from CLI (#18208) | 2025-05-15 21:05:34 -07:00 |
entrypoints | add tools into TokenizeChatRequest (#18187) | 2025-05-15 04:01:49 -07:00 |
fastsafetensors_loader | [Core] Integrate `fastsafetensors` loader for loading model weights (#10647) | 2025-03-24 08:08:02 -07:00 |
kernels | [Bugfix] fix rotary embedding test for _get_padded_tensor_shape (#18229) | 2025-05-16 01:32:45 +00:00 |
kv_transfer | [CI] Actually run tests/kv_transfer/test_disagg.py in CI (#17555) | 2025-05-02 04:05:04 +00:00 |
lora | fix: typos (#18151) | 2025-05-15 02:16:15 -07:00 |
metrics | [V1][Spec Decode] Remove deprecated spec decode config params (#15466) | 2025-03-31 09:19:35 -07:00 |
mistral_tool_use | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00 |
model_executor | fix: typos (#18151) | 2025-05-15 02:16:15 -07:00 |
models | [Misc] Consolidate Audio tests into multimodal common generation tests (#18214) | 2025-05-16 09:18:08 +00:00 |
mq_llm_engine | [Misc] Replace os environ to monkeypatch in test suite (#14516) | 2025-03-16 20:35:57 -07:00 |
multi_step | [Misc] Replace os environ to monkeypatch in test suite (#14516) | 2025-03-16 20:35:57 -07:00 |
multimodal | Support custom implementations of VideoLoader backends. (#18091) | 2025-05-15 13:26:49 +08:00 |
neuron | Make key optional for rotary embedding (#17566) | 2025-05-07 00:11:46 -07:00 |
plugins | [Lora][Frontend]Add default local directory LoRA resolver plugin. (#16855) | 2025-05-12 10:39:10 -07:00 |
plugins_tests | [V1] Scheduler Refactoring [1/N] - Add Scheduler Interface (#15250) | 2025-03-20 17:50:43 -07:00 |
prefix_caching | [Misc] Replace os environ to monkeypatch in test suite (#14516) | 2025-03-16 20:35:57 -07:00 |
prompt_adapter | [Misc] Add SPDX-License-Identifier headers to python source files (#12628) | 2025-02-02 11:58:18 -08:00 |
prompts | [BugFix] Fix input positions for long context with sliding window (#2088) | 2023-12-13 12:28:13 -08:00 |
quantization | Add support for loading torchao models with `AOPerModuleConfig` (#17826) | 2025-05-14 16:24:59 -07:00 |
reasoning | [Bugfix] add qwen3 reasoning-parser fix content is None when disable … (#17369) | 2025-04-29 16:32:40 +00:00 |
runai_model_streamer_test | [Misc] Split model loader (#17712) | 2025-05-07 12:42:26 +08:00 |
samplers | [Sampler] Adapt to FlashInfer 0.2.3 sampler API (#15777) | 2025-05-16 15:14:03 -07:00 |
spec_decode | [CI] Disable Failing Tests (#18165) | 2025-05-14 13:49:56 -07:00 |
standalone_tests | [Build] Make sure local main branch is synced when VLLM_USE_PRECOMPILED=1 (#13921) | 2025-03-03 16:43:14 +08:00 |
system_messages | [V1] Implement Cascade Attention (#11635) | 2025-01-01 21:56:46 +09:00 |
tensorizer_loader | [CI/Build] Automatically retry flaky tests (#17856) | 2025-05-09 09:55:17 -06:00 |
tokenization | Add full API docs and improve the UX of navigating them (#17485) | 2025-05-03 19:42:43 -07:00 |
tool_use | Add chat template for Llama 4 models (#16428) | 2025-04-24 20:19:36 +00:00 |
tpu | [Hardware][TPU][V1] Multi-LoRA implementation for the V1 TPU backend (#14238) | 2025-05-07 16:28:47 -04:00 |
tracing | [Misc] Replace os environ to monkeypatch in test suite (#14516) | 2025-03-16 20:35:57 -07:00 |
v1 | [Sampler] Adapt to FlashInfer 0.2.3 sampler API (#15777) | 2025-05-16 15:14:03 -07:00 |
vllm_test_utils | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00 |
weight_loading | [Bugfix] fix `an illegal memory access was encountered` of marlin kernel + act_order (#18245) | 2025-05-16 16:02:44 -07:00 |
worker | [Core] Gate `prompt_embeds` behind a feature flag (#17607) | 2025-05-04 00:19:20 +08:00 |
__init__.py | [Small] Formatter only checks lints in changed files (#1528) | 2023-10-31 15:39:38 -07:00 |
build_cython.py | [Build] Cython compilation support fix (#14296) | 2025-03-24 23:37:54 +00:00 |
conftest.py | [Model] Broadcast Ovis2 implementation to fit Ovis1.6 (#17861) | 2025-05-11 17:56:30 -07:00 |
test_cache_block_hashing.py | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00 |
test_config.py | Add `pt_load_map_location` to allow loading to cuda (#16869) | 2025-05-01 23:23:42 -07:00 |
test_embedded_commit.py | [Misc] Add SPDX-License-Identifier headers to python source files (#12628) | 2025-02-02 11:58:18 -08:00 |
test_inputs.py | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00 |
test_logger.py | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00 |
test_logits_processor.py | Update deprecated Python 3.8 typing (#13971) | 2025-03-02 17:34:51 -08:00 |
test_regression.py | [Misc] Replace os environ to monkeypatch in test suite (#14516) | 2025-03-16 20:35:57 -07:00 |
test_sampling_params.py | [Bugfix][Frontend] respect provided default guided decoding backend (#15476) | 2025-04-09 05:11:10 -07:00 |
test_scalartype.py | [Misc] Fix ScalarType float4 naming (#17690) | 2025-05-06 01:07:15 -07:00 |
test_seed_behavior.py | [Bugfix] fix flaky test (#13089) | 2025-02-11 14:41:20 +00:00 |
test_sequence.py | [Misc] Add SPDX-License-Identifier headers to python source files (#12628) | 2025-02-02 11:58:18 -08:00 |
test_sharded_state_loader.py | [Misc] Split model loader (#17712) | 2025-05-07 12:42:26 +08:00 |
test_triton_utils.py | [Bugfix] Fix triton import with local TritonPlaceholder (#17446) | 2025-05-06 17:53:09 +08:00 |
test_utils.py | Allow users to pass arbitrary JSON keys from CLI (#18208) | 2025-05-15 21:05:34 -07:00 |
test_version.py | [Metrics] Add `--show-hidden-metrics-for-version` CLI arg (#13295) | 2025-02-22 00:20:45 -08:00 |
test_vllm_port.py | Throw better error for when running into k8s service discovery issue (#18209) | 2025-05-15 21:07:28 -07:00 |
utils.py | [Misc] Split model loader (#17712) | 2025-05-07 12:42:26 +08:00 |