Commit Graph

54 Commits

Author SHA1 Message Date
Trevor Royer 55f1a468d9
Move cli args docs to its own page (#18228) (#18264)
Signed-off-by: Trevor Royer <troyer@redhat.com>
2025-05-16 19:43:45 -07:00
Frieda Huang 9cea90eab4
[Frontend] Add /classify endpoint (#17032)
Signed-off-by: Frieda (Jingying) Huang <jingyingfhuang@gmail.com>
2025-05-11 07:57:07 +00:00
Nicolò Lucchesi 2ef0dc53b8
[Frontend] Add sampling params to `v1/audio/transcriptions` endpoint (#16591)
Signed-off-by: Jannis Schönleber <joennlae@gmail.com>
Signed-off-by: NickLucche <nlucches@redhat.com>
Co-authored-by: Jannis Schönleber <joennlae@gmail.com>
2025-04-19 07:03:54 +00:00
Cyrus Leung 61a44a0b22
[Doc] Add more tips to avoid OOM (#16765)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-04-17 09:54:34 +00:00
Christian Sears c09632a66c
Update openai_compatible_server.md (#16507)
Signed-off-by: Christian Sears <csears@redhat.com>
2025-04-11 22:54:58 +00:00
Wei Zeng 30d6a015e0
[Feature] specify model in config.yaml (#15798)
Signed-off-by: weizeng <weizeng@roblox.com>
2025-04-01 01:20:06 -07:00
Roger Wang 9c5c81b0da
[Misc][Doc] Add note regarding loading `generation_config` by default (#15281)
Signed-off-by: Roger Wang <ywang@roblox.com>
2025-03-23 14:00:55 -07:00
Cyrus Leung baec0d4de9
Revert "[Feature] specify model in config.yaml (#14855)" (#15293)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-03-21 08:30:23 -07:00
Wei Zeng 0fa3970deb
[Feature] specify model in config.yaml (#14855)
Signed-off-by: weizeng <weizeng@roblox.com>
2025-03-21 00:26:03 -07:00
Harry Mellor 6edbfa924d
Mention `extra_body` as a way top pass vLLM only parameters using the OpenAI client (#15240)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-03-20 19:18:36 -07:00
Nicolò Lucchesi fa82b93853
[Frontend][Docs] Transcription API streaming (#13301)
Signed-off-by: NickLucche <nlucches@redhat.com>
2025-03-06 10:39:35 +00:00
Cyrus Leung 1088f06242
[Doc] Move multimodal Embedding API example to Online Serving page (#14017)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-02-28 07:12:04 +00:00
Gabriel Marinho 1c3c975766
[FEATURE] Enables /score endpoint for embedding models (#12846) 2025-02-20 22:09:47 -08:00
Nicolò Lucchesi d84cef76eb
[Frontend] Add `/v1/audio/transcriptions` OpenAI API endpoint (#12909) 2025-02-13 07:23:45 -08:00
Harry Mellor dd6a3a02cb
[Doc] Convert docs to use colon fences (#12471)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-29 11:38:29 +08:00
Kyle Mistele 0034b09ceb
[Frontend] Rerank API (Jina- and Cohere-compatible API) (#12376)
Signed-off-by: Kyle Mistele <kyle@mistele.com>
2025-01-26 19:58:45 -07:00
Rafael Vasquez 43f3d9e699
[CI/Build] Add markdown linter (#11857)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
2025-01-12 00:17:13 -08:00
Cyrus Leung 6cd40a5bfe
[Doc][4/N] Reorganize API Reference (#11843)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-08 21:34:44 +08:00
Harry Mellor aba8d6ee00
[Doc] Move examples into categories (#11840)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-01-08 13:09:53 +00:00
Cyrus Leung 8ceffbf315
[Doc][3/N] Reorganize Serving section (#11766)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-07 11:20:01 +08:00
Cyrus Leung ee77fdb5de
[Doc][2/N] Reorganize Models and Usage sections (#11755)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-01-06 21:40:31 +08:00
Robert Shaw 0c0c2015c5
Update openai_compatible_server.md (#11536)
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-12-26 16:26:18 -08:00
Cyrus Leung 6ad909fdda
[Doc] Improve GitHub links (#11491)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-25 14:49:26 -08:00
Cyrus Leung 9edca6bf8f
[Frontend] Online Pooling API (#11457)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-24 17:54:30 +08:00
Rafael Vasquez 32aa2059ad
[Docs] Convert rST to MyST (Markdown) (#11145)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
2024-12-23 22:35:38 +00:00
kYLe 66d4b16724
[Frontend] Add OpenAI API support for input_audio (#11027)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-16 22:09:58 -08:00
Cyrus Leung 0920ab9131
[Doc] Reorganize online pooling APIs (#11172)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-14 00:22:22 +08:00
Michael Goin 6d525288c1
[Docs] Add dedicated tool calling page to docs (#10554)
Signed-off-by: mgoin <michael@neuralmagic.com>
Co-authored-by: Tyler Michael Smith <tyler@neuralmagic.com>
2024-12-09 20:15:34 -05:00
Cyrus Leung aa39a8e175
[Doc] Create a new "Usage" section (#10827)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-12-05 11:19:35 +08:00
Maximilien de Bayser 214efc2c3c
Support Cross encoder models (#10400)
Signed-off-by: Max de Bayser <maxdebayser@gmail.com>
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Signed-off-by: Flavia Beo <flavia.beo@ibm.com>
Co-authored-by: Flavia Beo <flavia.beo@ibm.com>
2024-11-24 18:56:20 -08:00
Cyrus Leung 32e46e000f
[Frontend] Automatic detection of chat content format from AST (#9919)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-16 13:35:40 +08:00
Mike Depinet f67ce05d0b
[Frontend] Pythonic tool parser (#9859)
Signed-off-by: Mike Depinet <mike@fixie.ai>
2024-11-14 04:14:34 +00:00
Guillaume Calmettes 36c513a076
[BugFix] Do not raise a `ValueError` when `tool_choice` is set to the supported `none` option and `tools` are not defined. (#10000)
Signed-off-by: Guillaume Calmettes <gcalmettes@scaleway.com>
2024-11-12 11:13:46 +00:00
cjackal d88bff1b96
[Frontend] add `add_request_id` middleware (#9594)
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
2024-11-09 10:18:29 +00:00
Maximilien de Bayser ae62fd17c0
[Frontend] Tool calling parser for Granite 3.0 models (#9027)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
2024-11-07 07:09:02 -08:00
Cyrus Leung 06386a64dd
[Frontend] Chat-based Embeddings API (#9759) 2024-11-01 08:13:35 +00:00
Will Eaton 882a1ad0de
[Model] tool calling support for ibm-granite/granite-20b-functioncalling (#8339)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Max de Bayser <mbayser@br.ibm.com>
Co-authored-by: Maximilien de Bayser <maxdebayser@gmail.com>
2024-10-29 15:07:37 -07:00
Vinay R Damodaran 33bab41060
[Bugfix]: Make chat content text allow type content (#9358)
Signed-off-by: Vinay Damodaran <vrdn@hey.com>
2024-10-24 05:05:49 +00:00
tomeras91 d2b1bf55ec
[Frontend][Feature] Add jamba tool parser (#9154) 2024-10-18 10:27:48 +00:00
Andy Dai 5df1834895
[Bugfix] Fix order of arguments matters in config.yaml (#8960) 2024-10-05 17:35:11 +00:00
代君 3dbb215b38
[Frontend][Feature] support tool calling for internlm/internlm2_5-7b-chat model (#8405) 2024-10-04 10:36:39 +08:00
Maximilien de Bayser 344cd2b6f4
[Feature] Add support for Llama 3.1 and 3.2 tool use (#8343)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
2024-09-26 17:01:42 -07:00
Kyle Mistele e02ce498be
[Feature] OpenAI-Compatible Tools API + Streaming for Hermes & Mistral models (#5649)
Co-authored-by: constellate <constellate@1-ai-appserver-staging.codereach.com>
Co-authored-by: Kyle Mistele <kyle@constellate.ai>
2024-09-04 13:18:13 -07:00
Kaunil Dhruv 058344f89a
[Frontend]-config-cli-args (#7737)
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Co-authored-by: Kaunil Dhruv <kaunil_dhruv@intuit.com>
2024-08-30 08:21:02 -07:00
Cyrus Leung 5bf35a91e4
[Doc][CI/Build] Update docs and tests to use `vllm serve` (#6431) 2024-07-17 07:43:21 +00:00
Ethan Xu dbfe254eda
[Feature] vLLM CLI (#5090)
Co-authored-by: simon-mo <simon.mo@hey.com>
2024-07-14 15:36:43 -07:00
Roger Wang 7a9cb294ae
[Frontend] Add OpenAI Vision API Support (#5237)
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-06-07 11:23:32 -07:00
Breno Faria f775a07e30
[FRONTEND] OpenAI `tools` support named functions (#5032) 2024-06-03 18:25:29 -05:00
Cyrus Leung 5ae5ed1e60
[Core] Consolidate prompt arguments to LLM engines (#4328)
Co-authored-by: Roger Wang <ywang@roblox.com>
2024-05-28 13:29:31 -07:00
Cyrus Leung 4bfa7e7f75
[Doc] Add API reference for offline inference (#4710) 2024-05-13 17:47:42 -07:00