Merge pull request #19 from terrytangyuan/patch-1
Update 2025-01-27-intro-to-llama-stack-with-vllm.md
commit 01664a2767
```diff
@@ -16,7 +16,7 @@ Llama Stack defines and standardizes the set of core building blocks needed to b
 
 Llama Stack focuses on making it easy to build production applications with a variety of models - ranging from the latest Llama 3.3 model to specialized models like Llama Guard for safety and other models. The goal is to provide pre-packaged implementations (aka “distributions”) which can be run in a variety of deployment environments. The Stack can assist you in your entire app development lifecycle - start iterating on local, mobile or desktop and seamlessly transition to on-prem or public cloud deployments. At every point in this transition, the same set of APIs and the same developer experience are available.
 
-Each specific implementation of an API is called a "Provider" in this architecture. Users can swap providers via configuration. `vLLM` is a prominent example of a high-performance API backing the inference API.
+Each specific implementation of an API is called a "Provider" in this architecture. Users can swap providers via configuration. vLLM is a prominent example of a high-performance API backing the inference API.
 
 # vLLM Inference Provider
```
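The changed paragraph describes swapping providers via configuration. As a rough illustration of what that looks like in practice, a Llama Stack run configuration can point the inference API at a running vLLM server. This is a minimal sketch only: the `remote::vllm` provider type and `url` key reflect the remote vLLM inference provider as commonly documented, but exact keys and values vary by Llama Stack version and are assumptions here.

```yaml
# Hypothetical excerpt from a Llama Stack run configuration (run.yaml).
# Assumes a vLLM server with an OpenAI-compatible endpoint is already
# running at the given URL; adjust for your deployment.
version: '2'
image_name: remote-vllm
apis:
- inference
providers:
  inference:
  - provider_id: vllm
    provider_type: remote::vllm        # remote vLLM inference provider
    config:
      url: http://localhost:8000/v1    # base URL of the vLLM server
```

Swapping to a different inference backend would then mean replacing this provider entry in the configuration, while application code keeps using the same Llama Stack inference API.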