Jun Duan
|
925d2f1908
|
[Doc] Fix typo for x86 CPU installation (#12514)
Signed-off-by: Jun Duan <jun.duan.phd@outlook.com>
|
2025-01-28 16:37:10 +00:00 |
Mohit Deopujari
|
9a0f3bdbe5
|
[Hardware][Gaudi][Doc] Add missing step in setup instructions (#12382)
|
2025-01-24 09:43:49 +00:00 |
Cyrus Leung
|
d07efb31c5
|
[Doc] Troubleshooting errors during model inspection (#12351)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-23 22:46:58 +08:00 |
youkaichao
|
511627445e
|
[doc] explain common errors around torch.compile (#12340)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-23 14:56:02 +08:00 |
Hongxia Yang
|
09ccc9c8f7
|
[Documentation][AMD] Add information about prebuilt ROCm vLLM docker for perf validation purpose (#12281)
Signed-off-by: Hongxia Yang <hongxyan@amd.com>
|
2025-01-22 07:49:22 +08:00 |
Gregory Shtrasberg
|
d4b62d4641
|
[AMD][Build] Porting dockerfiles from the ROCm/vllm fork (#11777)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
|
2025-01-21 12:22:23 +08:00 |
Hongxia Yang
|
c09503ddd6
|
[AMD][CI/Build][Bugfix] use pytorch stale wheel (#12172)
Signed-off-by: hongxyan <hongxyan@amd.com>
|
2025-01-18 11:15:53 +08:00 |
Harry Mellor
|
e8c23ff989
|
[Doc] Organise installation documentation into categories and tabs (#11935)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-13 12:27:36 +00:00 |
Rafael Vasquez
|
43f3d9e699
|
[CI/Build] Add markdown linter (#11857)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2025-01-12 00:17:13 -08:00 |
Li, Jiang
|
aa1e77a19c
|
[Hardware][CPU] Support MOE models on x86 CPU (#11831)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2025-01-10 11:07:58 -05:00 |
Harry Mellor
|
482cdc494e
|
[Doc] Rename offline inference examples (#11927)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-10 23:50:29 +08:00 |
Harry Mellor
|
d85c47d6ad
|
Replace "online inference" with "online serving" (#11923)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-10 12:05:56 +00:00 |
Michael Goin
|
730e9592e9
|
[Doc] Recommend uv and python 3.12 for quickstart guide (#11849)
Signed-off-by: mgoin <michael@neuralmagic.com>
|
2025-01-09 11:37:48 +08:00 |
Cyrus Leung
|
6cd40a5bfe
|
[Doc][4/N] Reorganize API Reference (#11843)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-08 21:34:44 +08:00 |
Harry Mellor
|
aba8d6ee00
|
[Doc] Move examples into categories (#11840)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-08 13:09:53 +00:00 |
Wallas Henrique
|
cfd3219f58
|
[Hardware][Apple] Native support for macOS Apple Silicon (#11696)
Signed-off-by: Wallas Santos <wallashss@ibm.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
|
2025-01-08 16:35:49 +08:00 |
youkaichao
|
ad9f1aa679
|
[doc] update wheels url (#11830)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-08 14:36:49 +08:00 |
Harry Mellor
|
5950f555a1
|
[Doc] Group examples into categories (#11782)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-01-08 09:20:12 +08:00 |
youkaichao
|
d9fa1c05ad
|
[doc] update how pip can install nightly wheels (#11806)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-01-07 21:42:58 +08:00 |
youkaichao
|
869e829b85
|
[doc] add doc to explain how to use uv (#11773)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
|
2025-01-07 18:41:17 +08:00 |
Cyrus Leung
|
8ceffbf315
|
[Doc][3/N] Reorganize Serving section (#11766)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-07 11:20:01 +08:00 |
Cyrus Leung
|
402d378360
|
[Doc] [1/N] Reorganize Getting Started section (#11645)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-01-06 02:18:33 +00:00 |
Cyrus Leung
|
32b4c63f02
|
[Doc] Convert list tables to MyST (#11594)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-29 15:56:22 +08:00 |
Cyrus Leung
|
d427e5cfda
|
[Doc] Minor documentation fixes (#11580)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-28 21:53:59 +08:00 |
Cyrus Leung
|
6ad909fdda
|
[Doc] Improve GitHub links (#11491)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-12-25 14:49:26 -08:00 |
Rafael Vasquez
|
32aa2059ad
|
[Docs] Convert rST to MyST (Markdown) (#11145)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-12-23 22:35:38 +00:00 |
Yuan Tang
|
2e726680b3
|
[Bugfix] torch nightly version in ROCm installation guide (#11423)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
|
2024-12-23 17:20:22 +00:00 |
youkaichao
|
5d2248d81a
|
[doc] explain nccl requirements for rlhf (#11381)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-20 13:00:56 -08:00 |
youkaichao
|
1ecc645b8f
|
[doc] backward compatibility for 0.6.4 (#11359)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-12-19 21:33:53 -08:00 |
Russell Bryant
|
4863e5fba5
|
[Core] V1: Use multiprocessing by default (#11074)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
|
2024-12-13 16:27:32 -08:00 |
Daniele
|
e4c34c23de
|
[CI/Build] improve python-only dev setup (#9621)
Signed-off-by: Daniele Trifirò <dtrifiro@redhat.com>
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-12-04 21:48:13 +00:00 |
Kevin H. Luu
|
c92acb9693
|
[ci/build] Update vLLM postmerge ECR repo (#10887)
|
2024-12-04 09:01:20 +00:00 |
wangxiyuan
|
7e4bbda573
|
[doc] format fix (#10789)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2024-11-30 11:38:40 +00:00 |
Sage Moore
|
9a88f89799
|
custom allreduce + torch.compile (#10121)
Signed-off-by: youkaichao <youkaichao@gmail.com>
Co-authored-by: youkaichao <youkaichao@gmail.com>
|
2024-11-25 22:00:16 -08:00 |
Sanket Kale
|
a6760f6456
|
[Feature] vLLM ARM Enablement for AARCH64 CPUs (#9228)
Signed-off-by: Sanket Kale <sanketk.kale@fujitsu.com>
Co-authored-by: Sanket Kale <sanketk.kale@fujitsu.com>
Co-authored-by: mgoin <michael@neuralmagic.com>
|
2024-11-25 18:32:39 -08:00 |
Li, Jiang
|
63f1fde277
|
[Hardware][CPU] Support chunked-prefill and prefix-caching on CPU (#10355)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2024-11-20 10:57:39 +00:00 |
wchen61
|
7629a9c6e5
|
[CI/Build] Support compilation with local cutlass path (#10423) (#10424)
|
2024-11-19 21:35:50 -08:00 |
Michael Green
|
4f168f69a3
|
[Docs] Misc updates to TPU installation instructions (#10165)
|
2024-11-15 13:26:17 -08:00 |
youkaichao
|
377b74fe87
|
Revert "[ci][build] limit cmake version" (#10271)
|
2024-11-12 15:06:48 -08:00 |
youkaichao
|
18081451f9
|
[doc] improve debugging doc (#10270)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-12 14:43:52 -08:00 |
youkaichao
|
d1c6799b88
|
[doc] update debugging guide (#10236)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-11 15:21:12 -08:00 |
youkaichao
|
f0f2e5638e
|
[doc] improve debugging code (#10206)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-10 17:49:40 -08:00 |
youkaichao
|
9fa4bdde9d
|
[ci][build] limit cmake version (#10188)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-09 16:27:26 -08:00 |
youkaichao
|
e7b84c394d
|
[doc] add back Python 3.8 ABI (#10100)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2024-11-06 21:06:41 -08:00 |
Li, Jiang
|
a4b3e0c1e9
|
[Hardware][CPU] Update torch 2.5 (#9911)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
|
2024-11-07 04:43:08 +00:00 |
Russell Bryant
|
098f94de42
|
[CI/Build] Drop Python 3.8 support (#10038)
Signed-off-by: Russell Bryant <rbryant@redhat.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2024-11-06 14:31:01 +00:00 |
Konrad Zawora
|
a02a50e6e5
|
[Hardware][Intel-Gaudi] Add Intel Gaudi (HPU) inference backend (#6143)
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Signed-off-by: Chendi.Xue <chendi.xue@intel.com>
Signed-off-by: Bob Zhu <bob.zhu@intel.com>
Signed-off-by: zehao-intel <zehao.huang@intel.com>
Signed-off-by: Konrad Zawora <kzawora@habana.ai>
Co-authored-by: Kunshang Ji <kunshang.ji@intel.com>
Co-authored-by: Sanju C Sudhakaran <scsudhakaran@habana.ai>
Co-authored-by: Michal Adamczyk <madamczyk@habana.ai>
Co-authored-by: Marceli Fylcek <mfylcek@habana.ai>
Co-authored-by: Himangshu Lahkar <49579433+hlahkar@users.noreply.github.com>
Co-authored-by: Vivek Goel <vgoel@habana.ai>
Co-authored-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Dominika Olszewska <dolszewska@habana.ai>
Co-authored-by: barak goldberg <149692267+bgoldberg-habana@users.noreply.github.com>
Co-authored-by: Michal Szutenberg <37601244+szutenberg@users.noreply.github.com>
Co-authored-by: Jan Kaniecki <jkaniecki@habana.ai>
Co-authored-by: Agata Dobrzyniewicz <160237065+adobrzyniewicz-habana@users.noreply.github.com>
Co-authored-by: Krzysztof Wisniewski <kwisniewski@habana.ai>
Co-authored-by: Dudi Lester <160421192+dudilester@users.noreply.github.com>
Co-authored-by: Ilia Taraban <tarabanil@gmail.com>
Co-authored-by: Chendi.Xue <chendi.xue@intel.com>
Co-authored-by: Michał Kuligowski <mkuligowski@habana.ai>
Co-authored-by: Jakub Maksymczuk <jmaksymczuk@habana.ai>
Co-authored-by: Tomasz Zielinski <85164140+tzielinski-habana@users.noreply.github.com>
Co-authored-by: Sun Choi <schoi@habana.ai>
Co-authored-by: Iryna Boiko <iboiko@habana.ai>
Co-authored-by: Bob Zhu <41610754+czhu15@users.noreply.github.com>
Co-authored-by: hlin99 <73271530+hlin99@users.noreply.github.com>
Co-authored-by: Zehao Huang <zehao.huang@intel.com>
Co-authored-by: Andrzej Kotłowski <Andrzej.Kotlowski@intel.com>
Co-authored-by: Yan Tomsinsky <73292515+Yantom1@users.noreply.github.com>
Co-authored-by: Nir David <ndavid@habana.ai>
Co-authored-by: Yu-Zhou <yu.zhou@intel.com>
Co-authored-by: Ruheena Suhani Shaik <rsshaik@habana.ai>
Co-authored-by: Karol Damaszke <kdamaszke@habana.ai>
Co-authored-by: Marcin Swiniarski <mswiniarski@habana.ai>
Co-authored-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>
Co-authored-by: Jacek Czaja <jacek.czaja@intel.com>
Co-authored-by: Jacek Czaja <jczaja@habana.ai>
Co-authored-by: Yuan <yuan.zhou@outlook.com>
|
2024-11-06 01:09:10 -08:00 |
Aaron Pham
|
21063c11c7
|
[CI/Build] drop support for Python 3.8 EOL (#8464)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
|
2024-11-06 07:11:55 +00:00 |
Richard Liu
|
cd34029e91
|
Refactor TPU requirements file and pin build dependencies (#10010)
Signed-off-by: Richard Liu <ricliu@google.com>
|
2024-11-05 16:48:44 +00:00 |
Michael Green
|
1d4cfe2be1
|
[Doc] Updated tpu-installation.rst with more details (#9926)
Signed-off-by: Michael Green <mikegre@google.com>
|
2024-11-02 10:06:45 -04:00 |
Cyrus Leung
|
06386a64dd
|
[Frontend] Chat-based Embeddings API (#9759)
|
2024-11-01 08:13:35 +00:00 |
Woosuk Kwon
|
211fe91aa8
|
[TPU] Correctly profile peak memory usage & Upgrade PyTorch XLA (#9438)
|
2024-10-30 09:41:38 +00:00 |
Yan Ma
|
04a3ae0aca
|
[Bugfix] Fix multi nodes TP+PP for XPU (#8884)
Signed-off-by: YiSheng5 <syhm@mail.ustc.edu.cn>
Signed-off-by: yan ma <yan.ma@intel.com>
Co-authored-by: YiSheng5 <syhm@mail.ustc.edu.cn>
|
2024-10-29 21:34:45 -07:00 |
Rafael Vasquez
|
228cfbd03f
|
[Doc] Improve quickstart documentation (#9256)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-10-25 14:32:10 -07:00 |
Yuan
|
32a1ee74a0
|
[Hardware][Intel CPU][DOC] Update docs for CPU backend (#6212)
Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>
Co-authored-by: Rafael Vasquez <rafvasq21@gmail.com>
Co-authored-by: Gubrud, Aaron D <aaron.d.gubrud@intel.com>
Co-authored-by: adgubrud <96072084+adgubrud@users.noreply.github.com>
|
2024-10-22 10:38:04 -07:00 |
Rafael Vasquez
|
f7db5f0fa9
|
[Doc] Use shell code-blocks and fix section headers (#9508)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-10-22 06:43:24 +00:00 |
youkaichao
|
d621c43df7
|
[doc] fix format (#9562)
|
2024-10-21 13:54:57 -07:00 |
Li, Jiang
|
5eda21e773
|
[Hardware][CPU] compressed-tensor INT8 W8A8 AZP support (#9344)
|
2024-10-17 12:21:04 -04:00 |
Yunmeng
|
2b184ddd4f
|
[Misc][Installation] Improve source installation script and doc (#9309)
Co-authored-by: youkaichao <youkaichao@126.com>
|
2024-10-12 09:36:40 -07:00 |
omrishiv
|
f990bab2a4
|
[Doc][Neuron] add note to neuron documentation about resolving triton issue (#9257)
Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
|
2024-10-10 23:36:32 +00:00 |
Rafael Vasquez
|
055f3270d4
|
[Doc] Improve debugging documentation (#9204)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-10-10 10:48:51 -07:00 |
Rafael Vasquez
|
de24046fcd
|
[Doc] Improve contributing and installation documentation (#9132)
Signed-off-by: Rafael Vasquez <rafvasq21@gmail.com>
|
2024-10-08 20:22:08 +00:00 |
Sergey Shlyapnikov
|
f58d4fccc9
|
[OpenVINO] Enable GPU support for OpenVINO vLLM backend (#8192)
|
2024-10-02 17:50:01 -04:00 |
youkaichao
|
cc276443b5
|
[doc] organize installation doc and expose per-commit docker (#8931)
|
2024-09-28 17:48:41 -07:00 |
youkaichao
|
d86f6b2afb
|
[misc] fix wheel name (#8919)
|
2024-09-27 22:10:44 -07:00 |
youkaichao
|
70de39f6b4
|
[misc][installation] build from source without compilation (#8818)
|
2024-09-26 13:19:04 -07:00 |
Hongxia Yang
|
1c046447a6
|
[CI/Build][Bugfix][Doc][ROCm] CI fix and doc update after ROCm 6.2 upgrade (#8777)
|
2024-09-25 22:26:37 +08:00 |
Hongxia Yang
|
530821d00c
|
[Hardware][AMD] ROCm6.2 upgrade (#8674)
|
2024-09-23 18:52:39 -07:00 |
Daniele
|
ee5f34b1c2
|
[CI/Build] use setuptools-scm to set __version__ (#4738)
Co-authored-by: youkaichao <youkaichao@126.com>
|
2024-09-23 09:44:26 -07:00 |
Yan Ma
|
d23679eb99
|
[Bugfix] fix docker build for xpu (#8652)
|
2024-09-22 22:54:18 -07:00 |
youkaichao
|
d4a2ac8302
|
[build] enable existing pytorch (for GH200, aarch64, nightly) (#8713)
|
2024-09-22 12:47:54 -07:00 |
Andy Dai
|
4dfdf43196
|
[Doc] Fix typo in AMD installation guide (#8689)
|
2024-09-21 00:24:12 -07:00 |
omrishiv
|
7c8566aa4f
|
[Doc] neuron documentation update (#8671)
Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
|
2024-09-20 15:04:37 -07:00 |
youkaichao
|
fa0c114fad
|
[doc] improve installation doc (#8550)
Co-authored-by: Andy Dai <76841985+Imss27@users.noreply.github.com>
|
2024-09-17 16:24:06 -07:00 |
youkaichao
|
2759a43a26
|
[doc] update doc on testing and debugging (#8514)
|
2024-09-16 12:10:23 -07:00 |
Isotr0py
|
f57092c00b
|
[Doc] Add oneDNN installation to CPU backend documentation (#8467)
|
2024-09-13 18:06:30 +00:00 |
youkaichao
|
cab69a15e4
|
[doc] recommend pip instead of conda (#8446)
|
2024-09-12 23:52:41 -07:00 |
Cyrus Leung
|
288a938872
|
[Doc] Indicate more information about supported modalities (#8181)
|
2024-09-05 10:51:53 +00:00 |
Woosuk Kwon
|
61f4a93d14
|
[TPU][Bugfix] Use XLA rank for persistent cache path (#8137)
|
2024-09-03 18:35:33 -07:00 |
Woosuk Kwon
|
eeffde1ac0
|
[TPU] Upgrade PyTorch XLA nightly (#7967)
|
2024-08-28 13:10:21 -07:00 |
Ilya Lavrenov
|
398521ad19
|
[OpenVINO] Updated documentation (#7687)
|
2024-08-20 07:33:56 -06:00 |
youkaichao
|
199adbb7cf
|
[doc] update test script to include cudagraph (#7501)
|
2024-08-13 21:52:58 -07:00 |
Woosuk Kwon
|
a08df8322e
|
[TPU] Support multi-host inference (#7457)
|
2024-08-13 16:31:20 -07:00 |
tomeras91
|
02b1988b9f
|
[Doc] building vLLM with VLLM_TARGET_DEVICE=empty (#7403)
|
2024-08-11 14:38:17 -07:00 |
Woosuk Kwon
|
90bab18f24
|
[TPU] Use mark_dynamic to reduce compilation time (#7340)
|
2024-08-10 18:12:22 -07:00 |
Ilya Lavrenov
|
80cbe10c59
|
[OpenVINO] migrate to latest dependencies versions (#7251)
|
2024-08-07 09:49:10 -07:00 |
Simon Mo
|
4db5176d97
|
bump version to v0.5.4 (#7139)
|
2024-08-05 14:39:48 -07:00 |
Michael Goin
|
b482b9a5b1
|
[CI/Build] Add support for Python 3.12 (#7035)
|
2024-08-02 13:51:22 -07:00 |
Jee Jee Li
|
7ecee34321
|
[Kernel][RFC] Refactor the punica kernel based on Triton (#5036)
|
2024-07-31 17:12:24 -07:00 |
Ilya Lavrenov
|
5895b24677
|
[OpenVINO] Updated OpenVINO requirements and build docs (#6948)
|
2024-07-30 11:33:01 -07:00 |
Woosuk Kwon
|
fad5576c58
|
[TPU] Reduce compilation time & Upgrade PyTorch XLA version (#6856)
|
2024-07-27 10:28:33 -07:00 |
omrishiv
|
3c3012398e
|
[Doc] add VLLM_TARGET_DEVICE=neuron to documentation for neuron (#6844)
Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>
|
2024-07-26 20:20:16 -07:00 |
Woosuk Kwon
|
ced36cd89b
|
[ROCm] Upgrade PyTorch nightly version (#6845)
|
2024-07-26 20:16:13 -07:00 |
Li, Jiang
|
3bbb4936dc
|
[Hardware] [Intel] Enable Multiprocessing and tensor parallel in CPU backend and update documentation (#6125)
|
2024-07-26 13:50:10 -07:00 |
youkaichao
|
85ad7e2d01
|
[doc][debugging] add known issues for hangs (#6816)
|
2024-07-25 21:48:05 -07:00 |
Hongxia Yang
|
d88c458f44
|
[Doc][AMD][ROCm]Added tips to refer to mi300x tuning guide for mi300x users (#6754)
|
2024-07-24 14:32:57 -07:00 |
Woosuk Kwon
|
ccc4a73257
|
[Docs][ROCm] Detailed instructions to build from source (#6680)
|
2024-07-24 01:07:23 -07:00 |
Matt Wong
|
06d6c5fe9f
|
[Bugfix][CI/Build][Hardware][AMD] Fix AMD tests, add HF cache, update CK FA, add partially supported model notes (#6543)
|
2024-07-20 09:39:07 -07:00 |
Simon Mo
|
30efe41532
|
[Docs] Update docs for wheel location (#6580)
|
2024-07-19 12:14:11 -07:00 |
Cyrus Leung
|
5bf35a91e4
|
[Doc][CI/Build] Update docs and tests to use `vllm serve` (#6431)
|
2024-07-17 07:43:21 +00:00 |