Update installation doc URLs (#40)

Follow up to https://github.com/vllm-project/vllm/pull/14556.

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Author: Harry Mellor, 2025-03-10 18:43:02 +01:00 (committed by GitHub)
Parent: 7e828ff7a5
Commit: 4d264ee9a0
4 changed files with 6 additions and 6 deletions


@@ -108,7 +108,7 @@ This utilization of vLLM has also significantly reduced operational costs. With
 ### Get started with vLLM
-Install vLLM with the following command (check out our [installation guide](https://docs.vllm.ai/en/latest/getting_started/installation/index.html) for more):
+Install vLLM with the following command (check out our [installation guide](https://docs.vllm.ai/en/latest/getting_started/installation.html) for more):
 ```bash
 $ pip install vllm
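As a quick sanity check after `pip install vllm`, a minimal sketch of starting the bundled OpenAI-compatible server and querying it; the model name and port are arbitrary examples, not something taken from the linked guide:

```bash
# Start the OpenAI-compatible API server (model name and port are illustrative)
vllm serve Qwen/Qwen2.5-1.5B-Instruct --port 8000

# In another shell, send a completion request
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/Qwen2.5-1.5B-Instruct", "prompt": "Hello, my name is", "max_tokens": 16}'
```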


@@ -150,7 +150,7 @@ Importantly, we will also focus on improving the core of vLLM to reduce the comp
 ### Get Involved
-If you haven't, we highly recommend you to update the vLLM version (see instructions [here](https://docs.vllm.ai/en/latest/getting_started/installation/index.html)) and try it out for yourself\! We always love to learn more about your use cases and how we can make vLLM better for you. The vLLM team can be reached out via [vllm-questions@lists.berkeley.edu](mailto:vllm-questions@lists.berkeley.edu). vLLM is also a community project, if you are interested in participating and contributing, we welcome you to check out our [roadmap](https://roadmap.vllm.ai/) and see [good first issues](https://github.com/vllm-project/vllm/issues?q=is:open+is:issue+label:%22good+first+issue%22) to tackle. Stay tuned for more updates by [following us on X](https://x.com/vllm\_project).
+If you haven't, we highly recommend you to update the vLLM version (see instructions [here](https://docs.vllm.ai/en/latest/getting_started/installation.html)) and try it out for yourself\! We always love to learn more about your use cases and how we can make vLLM better for you. The vLLM team can be reached out via [vllm-questions@lists.berkeley.edu](mailto:vllm-questions@lists.berkeley.edu). vLLM is also a community project, if you are interested in participating and contributing, we welcome you to check out our [roadmap](https://roadmap.vllm.ai/) and see [good first issues](https://github.com/vllm-project/vllm/issues?q=is:open+is:issue+label:%22good+first+issue%22) to tackle. Stay tuned for more updates by [following us on X](https://x.com/vllm\_project).
 If you are in the Bay Area, you can meet the vLLM team at the following events: [vLLM's sixth meetup with NVIDIA(09/09)](https://lu.ma/87q3nvnh), [PyTorch Conference (09/19)](https://pytorch2024.sched.com/event/1fHmx/vllm-easy-fast-and-cheap-llm-serving-for-everyone-woosuk-kwon-uc-berkeley-xiaoxuan-liu-ucb), [CUDA MODE IRL meetup (09/21)](https://events.accel.com/cudamode), and [the first ever vLLM track at Ray Summit (10/01-02)](https://raysummit.anyscale.com/flow/anyscale/raysummit2024/landing/page/sessioncatalog?search.sessiontracks=1719251906298001uzJ2).


@@ -29,7 +29,7 @@ For those who prefer a faster package manager, [**uv**](https://github.com/astra
 uv pip install vllm
 ```
-Refer to the [documentation](https://docs.vllm.ai/en/latest/getting_started/installation/gpu/index.html?device=cuda#create-a-new-python-environment) for more details on setting up [**uv**](https://github.com/astral-sh/uv). Using a simple server-grade setup (Intel 8th Gen CPU), we observe that [**uv**](https://github.com/astral-sh/uv) is 200x faster than pip:
+Refer to the [documentation](https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html?device=cuda#create-a-new-python-environment) for more details on setting up [**uv**](https://github.com/astral-sh/uv). Using a simple server-grade setup (Intel 8th Gen CPU), we observe that [**uv**](https://github.com/astral-sh/uv) is 200x faster than pip:
 ```sh
 # with cached packages, clean virtual environment
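For readers following the uv path, a minimal sketch of creating a fresh environment and installing vLLM into it; the Python version is an assumption, and the linked documentation remains the reference for the exact flags:

```bash
# Create and seed a virtual environment with uv (Python version is an example)
uv venv --python 3.12 --seed
source .venv/bin/activate

# Install vLLM into the new environment
uv pip install vllm
```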
@@ -77,11 +77,11 @@ VLLM_USE_PRECOMPILED=1 pip install -e .
 The `VLLM_USE_PRECOMPILED=1` flag instructs the installer to use pre-compiled CUDA kernels instead of building them from source, significantly reducing installation time. This is perfect for developers focusing on Python-level features like API improvements, model support, or integration work.
-This lightweight process runs efficiently, even on a laptop. Refer to our [documentation](https://docs.vllm.ai/en/latest/getting_started/installation/gpu/index.html?device=cuda#build-wheel-from-source) for more advanced usage.
+This lightweight process runs efficiently, even on a laptop. Refer to our [documentation](https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html?device=cuda#build-wheel-from-source) for more advanced usage.
 ### C++/Kernel Developers
-For advanced contributors working with C++ code or CUDA kernels, we incorporate a compilation cache to minimize build time and streamline kernel development. Please check our [documentation](https://docs.vllm.ai/en/latest/getting_started/installation/gpu/index.html?device=cuda#build-wheel-from-source) for more details.
+For advanced contributors working with C++ code or CUDA kernels, we incorporate a compilation cache to minimize build time and streamline kernel development. Please check our [documentation](https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html?device=cuda#build-wheel-from-source) for more details.
 ## Track Changes with Ease
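To make the `VLLM_USE_PRECOMPILED=1` workflow concrete, a minimal sketch of an editable install that reuses pre-built CUDA kernels; the clone location is illustrative, and the build-from-source docs linked above remain the reference:

```bash
# Clone the source and install it in editable mode,
# reusing pre-compiled CUDA kernels instead of building them locally
git clone https://github.com/vllm-project/vllm.git
cd vllm
VLLM_USE_PRECOMPILED=1 pip install -e .
```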


@@ -49,7 +49,7 @@ huggingface-cli login --token <YOUR-HF-TOKEN>
 huggingface-cli download meta-llama/Llama-3.2-1B-Instruct --local-dir /tmp/test-vllm-llama-stack/.cache/huggingface/hub/models/Llama-3.2-1B-Instruct
 ```
-Next, let's build the vLLM CPU container image from source. Note that while we use it for demonstration purposes, there are plenty of [other images available for different hardware and architectures](https://docs.vllm.ai/en/latest/getting_started/installation/index.html).
+Next, let's build the vLLM CPU container image from source. Note that while we use it for demonstration purposes, there are plenty of [other images available for different hardware and architectures](https://docs.vllm.ai/en/latest/getting_started/installation.html).
 ```
 git clone git@github.com:vllm-project/vllm.git /tmp/test-vllm-llama-stack
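A sketch of how such a CPU-only image can be built from the cloned tree; the Dockerfile name, image tag, and shared-memory size are assumptions that may differ between vLLM releases, so check the linked installation pages for the current invocation:

```bash
# Build a CPU-only vLLM image from the checked-out source
# (Dockerfile path, tag, and --shm-size are assumptions; see the installation docs for the current command)
cd /tmp/test-vllm-llama-stack
docker build -f Dockerfile.cpu -t vllm-cpu-env --shm-size=4g .
```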