trainer/manifests
Shao Wang abc4749f39
KEP-2401: Create LLM Training Runtimes for Llama 3.2 model family (#2590)
* chore(manifests): add manifests for llama3_2_1B and llama3_2_3B.

Signed-off-by: Electronic-Waste <2690692950@qq.com>

* chore(manifests): Add detailed configuration for llama3_2_1B.

Signed-off-by: Electronic-Waste <2690692950@qq.com>

* chore(manifests): Add detailed configuration for llama3_2_3B.

Signed-off-by: Electronic-Waste <2690692950@qq.com>

* chore(manifests): load pvc to trainer node & add initializers.

Signed-off-by: Electronic-Waste <2690692950@qq.com>

* fix(manifest): fix DependsOn error in CI.

Signed-off-by: Electronic-Waste <2690692950@qq.com>

* fix(runtime): fix launching command for CTRs.

Signed-off-by: Electronic-Waste <2690692950@qq.com>

* fix(manifest): remove torchtune temporarily in base/runtimes.

Signed-off-by: Electronic-Waste <2690692950@qq.com>

* fix(manifest): provide torchtune-trainer image tag override in overlays.

Signed-off-by: Electronic-Waste <2690692950@qq.com>

* fix(manifest): update image for data-initializer and model-initializer.

Signed-off-by: Electronic-Waste <2690692950@qq.com>

---------

Signed-off-by: Electronic-Waste <2690692950@qq.com>
2025-05-10 05:44:41 +00:00
base KEP-2401: Create LLM Training Runtimes for Llama 3.2 model family (#2590) 2025-05-10 05:44:41 +00:00
overlays KEP-2401: Create LLM Training Runtimes for Llama 3.2 model family (#2590) 2025-05-10 05:44:41 +00:00
third-party/jobset Bump JobSet to v0.8.0 (#2463) 2025-03-01 02:46:38 +00:00