trainer

Shao Wang abc4749f39 KEP-2401: Create LLM Training Runtimes for Llama 3.2 model family (#2590)	2025-05-10 05:44:41 +00:00
* chore(manifests): add manifests for llama3_2_1B and llama3_2_3B.
* chore(manifests): Add detailed configuration for llama3_2_1B.
* chore(manifests): Add detailed configuration for llama3_2_3B.
* chore(manifests): load pvc to trainer node & add initializers.
* fix(manifest): fix DependsOn error in CI.
* fix(runtime): fix launching command for CTRs.
* fix(manifest): remove torchtune temporally in base/runtimes.
* fix(manifest): provide torchtune-trainer image tag override in overlays.
* fix(manifest): update image for data-initializer and model-initializer.
Signed-off-by: Electronic-Waste <2690692950@qq.com>
base	KEP-2401: Create LLM Training Runtimes for Llama 3.2 model family (#2590)	2025-05-10 05:44:41 +00:00
overlays	KEP-2401: Create LLM Training Runtimes for Llama 3.2 model family (#2590)	2025-05-10 05:44:41 +00:00
third-party/jobset	Bump JobSet to v0.8.0 (#2463)	2025-03-01 02:46:38 +00:00