Harry Mellor
|
27bebcd897
|
Convert `examples` to `ruff-format` (#18400)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-26 16:57:54 +00:00 |
Lukas Geiger
|
e7523c2e03
|
[V1][Sampler] Improve performance of FlashInfer sampling by sampling logits instead of probs (#18608)
|
2025-05-26 11:49:36 -04:00 |
Cyrus Leung
|
a869baca73
|
[Bugfix] Fix Llama GGUF initialization (#18717)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-26 07:49:22 -07:00 |
Cyrus Leung
|
82e2339b06
|
[Doc] Move examples and further reorganize user guide (#18666)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-26 07:38:04 -07:00 |
Cyrus Leung
|
9553fdb41e
|
[Doc] Improve API docs (#18713)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-26 07:33:34 -07:00 |
dylan
|
243eb9199f
|
[Bugfix]: handle hf-xet CAS error when loading Qwen3 weights in vLLM (#18701)
|
2025-05-26 07:10:56 -07:00 |
Reid
|
0665e29998
|
[Misc] add AutoGen integration (#18712)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-05-26 13:56:18 +00:00 |
Łukasz Durejko
|
e76be06550
|
[Hardware][Intel-Gaudi] [CI/Build] Add tensor parallel size = 2 test to HPU CI (#18709)
Signed-off-by: Lukasz Durejko <ldurejko@habana.ai>
|
2025-05-26 05:26:07 -07:00 |
Isotr0py
|
0877750029
|
[CI/Build] Split pooling and generation extended language models tests in CI (#18705)
Signed-off-by: Isotr0py <2037008807@qq.com>
|
2025-05-26 04:00:08 -07:00 |
Naveassaf
|
6d68030f1c
|
[Model] Add support for YARN in NemotronNAS models (#18427)
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
|
2025-05-26 10:31:49 +00:00 |
Ning Xie
|
5a2c76cbe1
|
[CI] fix dump_input for str type (#18697)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-05-26 18:23:35 +08:00 |
Cyrus Leung
|
38b13dfe78
|
[CI/Build] Replace `math.isclose` with `pytest.approx` (#18703)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-26 02:05:17 -07:00 |
Cyrus Leung
|
61a45e7a72
|
[Bugfix] Fix Mistral-format models with sliding window (#18693)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-26 01:44:04 -07:00 |
Cyrus Leung
|
65523a0995
|
[Doc] Fix issue template format (#18699)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-26 00:45:39 -07:00 |
Cyrus Leung
|
4b7740a105
|
[GH] Add issue template for reporting CI failures (#18696)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-26 00:42:04 -07:00 |
Ning Xie
|
4ea62c0ea0
|
[CI] add missing argument (#18694)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-05-26 00:22:04 -07:00 |
Maximilien de Bayser
|
561b77a0d6
|
[Bugfix] Fix the lm_head in gpt_bigcode in lora mode (#6357)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
|
2025-05-26 14:52:25 +08:00 |
CYJiang
|
abd4030d94
|
refactor: simplify request handler, use positive condition check for handler assignment (#18690)
Signed-off-by: googs1025 <googs1025@gmail.com>
|
2025-05-26 06:32:28 +00:00 |
AlexZhao
|
8820821b59
|
[Misc] Fixed the abnormally high TTFT issue in the PD disaggregation example (#18644)
Signed-off-by: zhaohaidao <zhaohaidao2008@hotmail.com>
Signed-off-by: zhaohaiyuan <zhaohaiyuan@xiaohongshu.com>
Co-authored-by: zhaohaiyuan <zhaohaiyuan@xiaohongshu.com>
|
2025-05-26 13:51:27 +08:00 |
Cyrus Leung
|
fba0642704
|
[CI/Build][Doc] Update `gte-Qwen2-1.5B-instruct` usage (#18683)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
|
2025-05-25 20:27:50 -07:00 |
Lukas Geiger
|
6071e989df
|
[Core][Multimodal] Convert PIL Image to array without data copy when hashing (#18682)
Signed-off-by: Lukas Geiger <lukas.geiger94@gmail.com>
|
2025-05-25 17:33:35 +00:00 |
Cyrus Leung
|
57fd13a707
|
[Bugfix] Fix profiling dummy data for Pixtral (#18677)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-25 14:05:30 +00:00 |
Reid
|
3a886bd58c
|
[Misc] small improve (#18680)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-25 06:05:38 -07:00 |
Reid
|
35be8fad62
|
[CI/build] fix no regex (#18676)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-25 10:10:51 +00:00 |
Yuqi Zhang
|
f2faac745d
|
[Bugfix] Fix cpu usage and cache hit stats reporting on cpu environment (#18674)
Signed-off-by: zzzyq <zhangyuqi94@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-05-25 02:36:06 -07:00 |
Reid
|
279f854519
|
[doc] improve readability (#18675)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-25 01:40:31 -07:00 |
Reid
|
624b77a2b3
|
[doc] fix broken links (#18671)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-25 01:36:33 -07:00 |
Cyrus Leung
|
503f8487c2
|
[Misc] Reduce logs on startup (#18649)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 23:03:53 -07:00 |
Ning Xie
|
44073a7ac3
|
[BUGFIX] catch subclass first for try...except (#18672)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-05-25 05:34:24 +00:00 |
Michael Goin
|
63934543a0
|
Speed up the `kernels/quantization/` tests (#18669)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-05-25 05:02:59 +00:00 |
Isotr0py
|
75f81750f3
|
[VLM] Initialize video input support for InternVL models (#18499)
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-05-25 04:51:25 +00:00 |
Mengqing Cao
|
6ab681bcbe
|
[Misc][ModelScope] Change to use runtime VLLM_USE_MODELSCOPE (#18655)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
Signed-off-by: Isotr0py <2037008807@qq.com>
Co-authored-by: Isotr0py <2037008807@qq.com>
|
2025-05-25 04:51:21 +00:00 |
Chenguang Li
|
cebc22f3b6
|
[Misc]Replace `cuda` hard code with `current_platform` in Ray (#14668)
Signed-off-by: noemotiovon <757486878@qq.com>
|
2025-05-24 20:26:31 -07:00 |
Ning Xie
|
6c6dcd8611
|
[MISC] correct signature for LoaderFunction (#18670)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-05-24 20:17:47 -07:00 |
Seiji Eicher
|
7891fdf0c6
|
[V1] Fix _pickle.PicklingError: Can't pickle <class 'transformers_modules.deepseek-ai.DeepSeek-V2-Lite... (#18640)
Signed-off-by: Seiji Eicher <seiji@anyscale.com>
|
2025-05-24 20:07:20 -07:00 |
Woosuk Kwon
|
6825d9a998
|
[BugFix][Spec Decode] Improve Prefix Caching Logic in Speculative Decoding (#18668)
Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
|
2025-05-24 17:33:46 -07:00 |
Reid
|
b554ab736e
|
[CI/Build] fix permission denied issue (#18645)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
|
2025-05-24 16:09:10 +00:00 |
Aaron Pham
|
9ea7f1abf3
|
fix(regression): clone from reference items (#18662)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-05-24 15:25:20 +00:00 |
Aaron Pham
|
2807271c86
|
[CI] enforce import regex instead of re (#18665)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2025-05-24 08:04:14 -07:00 |
wangxiyuan
|
b9018a3f9f
|
[BugFix] Fix import error for fused_moe (#18642)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2025-05-24 07:53:36 -07:00 |
Ning Xie
|
4ceafb6299
|
[MISC] typo fix and clean import (#18664)
Signed-off-by: Andy Xie <andy.xning@gmail.com>
|
2025-05-24 07:52:09 -07:00 |
Cyrus Leung
|
2e6705784f
|
[CI/Build] `chmod +x` to `cleanup_pr_body.sh` (#18650)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 07:26:45 -07:00 |
Cyrus Leung
|
1cb194a018
|
[Doc] Reorganize user guide (#18661)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 07:25:33 -07:00 |
ztang2370
|
2cd4d58df4
|
[Model] use AutoWeightsLoader for gpt2 (#18625)
Signed-off-by: zt2370 <ztang2370@gmail.com>
|
2025-05-24 13:36:13 +00:00 |
Cyrus Leung
|
6d166a8d35
|
[Doc] Add community links (#18657)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 06:06:38 -07:00 |
Cyrus Leung
|
ef1dd6870f
|
[Doc] Fix indentation problems in V0 Paged Attention docs (#18659)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 06:06:35 -07:00 |
Mengqing Cao
|
e77dc4bad8
|
[MISC][pre-commit] Add pre-commit check for triton import (#17716)
Signed-off-by: Mengqing Cao <cmq0113@163.com>
|
2025-05-24 20:09:15 +08:00 |
Cyrus Leung
|
07458a51ce
|
[Doc] Update README links, mark external links (#18635)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-24 09:57:15 +00:00 |
qizixi
|
c1e4a4052d
|
[V1][Spec Decode] Support multi-layer eagle draft model (#18030)
Signed-off-by: qizixi <qizixi@meta.com>
|
2025-05-24 09:45:34 +00:00 |
Yuanhao WU
|
a859320575
|
[Model] Add support for Qwen2.5-Omni-7B-AWQ (Qwen2_5OmniForConditionalGeneration) (#18647)
|
2025-05-24 09:15:36 +00:00 |