hustxiayang
|
451da4bcbd
|
add tools into TokenizeChatRequest (#18187)
Signed-off-by: yangxia <yangxiast@gmail.com>
|
2025-05-15 04:01:49 -07:00 |
Cyrus Leung
|
d066e52013
|
[Bugfix] Fix chat utils tests (#18139)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-14 05:38:21 -07:00 |
Cyrus Leung
|
8f5dc41481
|
[Bugfix] Fix entrypoints audio test failure (#18111)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-14 09:08:07 +00:00 |
lkchen
|
6685890d11
|
[Fix] Move "model_config" as keyword args in chat_utils.py (#18098)
Signed-off-by: Linkun <github@lkchen.net>
|
2025-05-13 23:27:26 -07:00 |
Chauncey
|
dc1a821768
|
[Feature][V1] Support `tool_choice: required` when using Xgrammar as the `StructuredOutputBackend`. (#17845)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-05-12 23:01:31 -07:00 |
Maximilien de Bayser
|
05a4324f8e
|
Initialize the delta tool call fields explicitly (#17340)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: igmainc <igmainc@icloud.com>
|
2025-05-12 13:28:58 +00:00 |
Frieda Huang
|
9cea90eab4
|
[Frontend] Add /classify endpoint (#17032)
Signed-off-by: Frieda (Jingying) Huang <jingyingfhuang@gmail.com>
|
2025-05-11 07:57:07 +00:00 |
Ximo Guanter
|
fc4441a4ee
|
Add missing content type headers to /ping and /health (#17036) (#17786)
Signed-off-by: Ximo Guanter <ximo.guanter@gmail.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-05-10 07:13:32 +01:00 |
Russell Bryant
|
ec54d73c31
|
[CI] Fix test_collective_rpc (#17858)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-05-08 16:47:12 +00:00 |
Cyrus Leung
|
96722aa81d
|
[Frontend] Chat template fallbacks for multimodal models (#17805)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-07 23:05:54 -07:00 |
Cyrus Leung
|
8a15c2603a
|
[Frontend] Add missing chat templates for various MLLMs (#17758)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-05-07 00:10:01 -07:00 |
Chauncey
|
98060b001d
|
[Feature][Frontend]: Deprecate --enable-reasoning (#17452)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-05-01 06:46:16 -07:00 |
Cyrus Leung
|
afb4429b4f
|
[CI/Build] Reorganize models tests (#17459)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-30 23:03:08 -07:00 |
Marko Rosenmueller
|
77073c77bc
|
[Core] Prevent side-channel attacks via cache salting (#17045)
Signed-off-by: Marko Rosenmueller <5467316+dr75@users.noreply.github.com>
|
2025-04-30 20:27:21 +08:00 |
Gabriel Marinho
|
1c2bc7ead0
|
Truncation control for embedding models (#14776)
Signed-off-by: Gabriel Marinho <gmarinho@ibm.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
|
2025-04-30 09:24:57 +08:00 |
Harry Mellor
|
a6977dbd15
|
Simplify (and fix) passing of guided decoding backend options (#17008)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-29 19:02:23 +00:00 |
Cyrus Leung
|
88ad9ec6b2
|
[Frontend] Support `chat_template_kwargs` in `LLM.chat` (#17356)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-29 22:03:35 +08:00 |
Nick Hill
|
70116459c3
|
[BugFix][Frontend] Fix `LLM.chat()` tokenization (#16081)
Signed-off-by: Nick Hill <nhill@redhat.com>
|
2025-04-25 22:20:05 +00:00 |
Sangyeon Cho
|
6aae216b4e
|
[Bugfix] remove fallback in guided_json (int range, patterns) (#16725)
Signed-off-by: csy1204 <josang1204@gmail.com>
Co-authored-by: 조상연[플레이스 AI] <sang-yeon.cho@navercorp.com>
|
2025-04-25 06:54:43 +00:00 |
wang.yuqi
|
67309a1cb5
|
[Frontend] Using matryoshka_dimensions control the allowed output dimensions. (#16970)
|
2025-04-24 07:06:28 -07:00 |
Travis Johnson
|
3cde34a4a4
|
[Frontend] Support guidance:no-additional-properties for compatibility with xgrammar (#15949)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
|
2025-04-23 18:34:41 +00:00 |
Guillaume Calmettes
|
36fe78769f
|
[Bugfix] validate urls object for multimodal content parts (#16990)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
|
2025-04-23 09:43:06 +08:00 |
Nicolò Lucchesi
|
9d4ca19d50
|
[Misc] Benchmarks for audio models (#16505)
Signed-off-by: NickLucche <nlucches@redhat.com>
|
2025-04-19 02:24:14 -07:00 |
Nicolò Lucchesi
|
2ef0dc53b8
|
[Frontend] Add sampling params to `v1/audio/transcriptions` endpoint (#16591)
Signed-off-by: Jannis Schönleber <joennlae@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Jannis Schönleber <joennlae@gmail.com>
|
2025-04-19 07:03:54 +00:00 |
wang.yuqi
|
3d3ab3689f
|
[New Model]: Snowflake Arctic Embed (Family) (#16649)
|
2025-04-18 08:11:57 -07:00 |
Harry Mellor
|
686623c5e7
|
Fix `nullable_kvs` fallback (#16837)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-18 05:58:39 -07:00 |
Harry Mellor
|
e78587a64c
|
Improve-mm-and-pooler-and-decoding-configs (#16789)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-04-17 22:13:32 -07:00 |
Tarun Kumar
|
e37073efd7
|
Add property-based testing for vLLM endpoints using an API defined by an OpenAPI 3.1 schema (#16721)
Signed-off-by: Tarun Kumar <takumar@redhat.com>
Signed-off-by: Nick Hill <nhill@redhat.com>
Co-authored-by: Nick Hill <nhill@redhat.com>
|
2025-04-17 21:08:27 -07:00 |
Angky William
|
fdcb850f14
|
[Misc] Enable vLLM to Dynamically Load LoRA from a Remote Server (#10546)
Signed-off-by: Angky William <angkywilliam@Angkys-MacBook-Pro.local>
Co-authored-by: Angky William <angkywilliam@Angkys-MacBook-Pro.local>
|
2025-04-15 22:31:38 +00:00 |
Russell Bryant
|
dc1b4a6f13
|
[Core][V0] Enable regex support with xgrammar (#13228)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-04-14 10:13:38 +08:00 |
Ryan McConville
|
6c11ecf8d3
|
[Bugfix] Validate logit biases to prevent out of vocab ids crashing engine (#16529)
Signed-off-by: Ryan McConville <ryan@ryanmcconville.com>
|
2025-04-12 20:19:19 +00:00 |
Cyrus Leung
|
d9fc8cd9da
|
[V1] Enable multi-input by default (#15799)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-12 08:52:39 +00:00 |
Cyrus Leung
|
c5bc0e7fcc
|
[Misc] Update chat utils tests (#16520)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-12 06:48:43 +00:00 |
wang.yuqi
|
fbf722c6e6
|
[Frontend] support matryoshka representation / support embedding API dimensions (#16331)
|
2025-04-11 23:23:10 -07:00 |
Russell Bryant
|
9665313c39
|
[V1] Set structured output backend to `auto` by default (#15724)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-04-10 17:53:26 +00:00 |
Michael Goin
|
87b4ac56c2
|
[CI][Bugfix] Fix bad tolerance for test_batch_base64_embedding (#16221)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-04-09 04:14:46 +00:00 |
Cyrus Leung
|
4ebc0b9640
|
[Bugfix] Proper input validation for multi-modal encoder-decoder models (#16156)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-04-08 09:45:21 -07:00 |
Michael Goin
|
b99733d092
|
[Bugfix] Do not skip "empty" parts of chats that are parsable (#16219)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-04-08 05:14:15 +00:00 |
leon-seidel
|
24f1c01e0f
|
[Bugfix][V0] XGrammar structured output supports Enum (#15878)
Signed-off-by: Leon Seidel <leon.seidel@fau.de>
|
2025-04-07 22:38:25 +00:00 |
iefgnoix
|
b6be6f8d1e
|
[TPU] Support sliding window and logit soft capping in the paged attention kernel for TPU. (#15732)
Signed-off-by: Xiongfei Wei <isaacwxf23@gmail.com>
|
2025-04-03 14:23:28 -07:00 |
Matthias Matt
|
cefb9e5a28
|
[Frontend] Implement Tool Calling with `tool_choice='required'` (#13483)
Signed-off-by: Liangfu Chen <liangfc@amazon.com>
Signed-off-by: Matt, Matthias <matthias.matt@tuwien.ac.at>
Co-authored-by: Liangfu Chen <liangfc@amazon.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
|
2025-04-02 07:45:45 -07:00 |
Mark McLoughlin
|
98d7367b61
|
[Metrics] Hide deprecated metrics (#15458)
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
|
2025-04-02 07:37:19 -07:00 |
Chauncey
|
594a8b9030
|
[Bugfix] Fix the issue where the model name is empty string, causing no response with the model name. (#15938)
Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
|
2025-04-02 06:33:52 -07:00 |
Eric Tang
|
ddb94c2605
|
[core] Add tags parameter to wake_up() (#15500)
Signed-off-by: Eric <erictang000@gmail.com>
|
2025-04-02 01:59:27 -07:00 |
Varun Sundar Rabindranath
|
79455cf421
|
[Misc] Enable V1 LoRA by default (#15320)
Signed-off-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2025-04-01 16:53:56 +08:00 |
Alexander Matveev
|
9a2160fa55
|
[V1] TPU CI - Add basic perf regression test (#15414)
Signed-off-by: Alexander Matveev <amatveev@redhat.com>
|
2025-03-31 13:25:20 -04:00 |
pansicheng
|
7fd8c0f85c
|
fix test_phi3v (#15321)
Signed-off-by: pansicheng <sicheng.pan.chn@gmail.com>
|
2025-03-30 02:01:34 -07:00 |
Varun Sundar Rabindranath
|
1286211f57
|
[Bugfix] LoRA V1: add and fix entrypoints tests (#15715)
Signed-off-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
|
2025-03-28 21:10:41 -07:00 |
Russell Bryant
|
7329ff5468
|
[V1] Support disable_any_whtespace for guidance backend (#15584)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2025-03-28 23:46:45 +08:00 |
Lize Cai
|
a10314c6b3
|
[Misc] Fix test_sleep to use query parameters (#14373)
Signed-off-by: Lize Cai <lize.cai@sap.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2025-03-28 18:00:14 +08:00 |