mirror of https://github.com/vllm-project/vllm.git
[Doc] small fix (#17277)
Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
parent 9053d0b134
commit f211331c48
@@ -59,7 +59,7 @@ A code example can be found here: <gh-file:examples/offline_inference/basic/basi
 
 ### `LLM.beam_search`
 
-The {class}`~vllm.LLM.beam_search` method implements [beam search](https://huggingface.co/docs/transformers/en/generation_strategies#beam-search-decoding) on top of {class}`~vllm.LLM.generate`.
+The {class}`~vllm.LLM.beam_search` method implements [beam search](https://huggingface.co/docs/transformers/en/generation_strategies#beam-search) on top of {class}`~vllm.LLM.generate`.
 For example, to search using 5 beams and output at most 50 tokens:
 
 ```python
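The code example itself is cut off by the diff view. For context, a minimal runnable sketch of a 5-beam, 50-token search, assuming the `BeamSearchParams` helper from `vllm.sampling_params`; the model name, prompt, and output-access pattern below are placeholders, not part of this diff:

```python
from vllm import LLM
from vllm.sampling_params import BeamSearchParams

# Placeholder model; any generative checkpoint vLLM supports would do.
llm = LLM(model="facebook/opt-125m")

# 5 beams, at most 50 new tokens, matching the docs sentence above.
params = BeamSearchParams(beam_width=5, max_tokens=50)
outputs = llm.beam_search([{"prompt": "Hello, my name is"}], params)

for out in outputs:
    # Each result carries the finished beams; take the highest-scoring one.
    print(out.sequences[0].text)
```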
@@ -793,6 +793,8 @@ or `--limit-mm-per-prompt` (online serving). For example, to enable passing up t
 Offline inference:
 
 ```python
+from vllm import LLM
+
 llm = LLM(
     model="Qwen/Qwen2-VL-7B-Instruct",
     limit_mm_per_prompt={"image": 4},
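The hunk cuts off mid-call; a plausible completion of the offline example, assuming the constructor simply closes after `limit_mm_per_prompt` (the closing parenthesis is not shown in the diff):

```python
from vllm import LLM

# Accept up to 4 images per prompt, matching the docs example above.
llm = LLM(
    model="Qwen/Qwen2-VL-7B-Instruct",
    limit_mm_per_prompt={"image": 4},
)
```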