diff --git a/_posts/2025-01-21-stack-release.md b/_posts/2025-01-21-stack-release.md
index f6d1a51..6fd6e09 100644
--- a/_posts/2025-01-21-stack-release.md
+++ b/_posts/2025-01-21-stack-release.md
@@ -1,10 +1,10 @@
---
layout: post
title: "High Performance and Easy Deployment of vLLM in K8S with “vLLM production-stack”"
-thumbnail-img: /assets/img/stack-thumbnail.png
-share-img: /assets/img/stack-thumbnail.png
+thumbnail-img: /assets/figure/stack/stack-thumbnail.png
+share-img: /assets/figure/stack/stack-thumbnail.png
author: LMCache Team
-image: /assets/img/stack-thumbnail.png
+image: /assets/figure/stack/stack-thumbnail.png
---
@@ -27,7 +27,7 @@ image: /assets/img/stack-thumbnail.png
How do we extend its power into a **full-stack** inference system that any organization can deploy at scale with *high reliability*, *high throughput*, and *low latency*? That question is exactly what led the LMCache team and the vLLM team to build **vLLM production-stack**.