diff --git a/_posts/2025-01-27-intro-to-llama-stack-with-vllm.md b/_posts/2025-01-27-intro-to-llama-stack-with-vllm.md
index 99a6e64..29f1cb6 100644
--- a/_posts/2025-01-27-intro-to-llama-stack-with-vllm.md
+++ b/_posts/2025-01-27-intro-to-llama-stack-with-vllm.md
@@ -2,14 +2,16 @@
 layout: post
 title: "Introducing vLLM Inference Provider in Llama Stack"
 author: "Yuan Tang (Red Hat) and Ashwin Bharambe (Meta)"
-image: /assets/logos/vllm-logo-only-light.png
+image: /assets/figures/llama-stack/llama-stack.png
 ---
 
 We are excited to announce that the vLLM inference provider is now available in [Llama Stack](https://github.com/meta-llama/llama-stack) through a collaboration between the Red Hat AI Engineering team and the Llama Stack team at Meta. This article introduces the integration and provides a tutorial to help you get started using it locally or deploying it in a Kubernetes cluster.
 
 # What is Llama Stack?
 
-![llama-stack-diagram](…)
+<div align="center">
+<img src="/assets/figures/llama-stack/llama-stack.png" alt="Icon">
+</div>
 
 Llama Stack defines and standardizes the set of core building blocks needed to bring generative AI applications to market. These building blocks are presented in the form of interoperable APIs, with a broad set of Service Providers providing their implementations.
 
diff --git a/assets/figures/llama-stack/llama-stack.png b/assets/figures/llama-stack/llama-stack.png
new file mode 100644
index 0000000..0a45e18
Binary files /dev/null and b/assets/figures/llama-stack/llama-stack.png differ
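The hunk above only shows the top of the post, but the "interoperable APIs" point it introduces is easiest to see from the client side. Below is a minimal sketch of calling the stack's Inference API once a Llama Stack server is running with vLLM configured as its inference provider; the base URL `http://localhost:5000` and the model ID `meta-llama/Llama-3.2-1B-Instruct` are illustrative assumptions, not values taken from this diff.

```python
# Sketch: talking to a Llama Stack server that routes inference to vLLM.
# Assumes `pip install llama-stack-client` and a server already running
# on localhost:5000 (assumed port) with a vLLM-backed inference provider.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

# Inspect which models the stack has registered.
for model in client.models.list():
    print(model.identifier)

# The chat completion goes through the standardized Inference API; the
# server forwards it to vLLM, but the client code never mentions vLLM.
response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.2-1B-Instruct",  # assumed model name
    messages=[{"role": "user", "content": "Write a haiku about GPUs."}],
)
print(response.completion_message.content)
```

Because the provider sits behind the stable Inference API, swapping vLLM for a different inference provider is a server-side configuration change; the client code above stays the same.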