Emily Casey
e607ddcc49
Update pkg/inference/models/manager.go
Signed-off-by: Emily Casey <emily.casey@docker.com>
Co-authored-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 14:44:08 -04:00
Emily Casey
5a2117a505
Add endpoint for tagging a model
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-16 16:21:32 -04:00
Ignasi
77acaee9dc
AIE-87 Returns 404 on ErrModelNotFound (#10)
* Bump model-distribution
* Return 404 on ErrModelNotFound
2025-04-15 12:04:15 +02:00
Ignasi
055b8e451b
Initialize model distribution in model-runner ( #8 )
* Initialize model distribution in model-runner. This allows removing the model-distribution dependency in Pinata
* Fix main.go
* Bump model-distribution
* gofumpt -l -extra -w .
2025-04-10 17:34:56 +02:00
Dorin Geman
77683e0cce
Add main.go for testing
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-09 19:13:09 +03:00
Dorin Geman
51f11dd1f8
Extract the routes in a separate function
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-09 19:11:18 +03:00
Dorin Geman
a217ae6000
Support registry/namespace/repository as model name
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-09 18:40:55 +03:00
Jacob Howard
36ae1e3b30
inference: adjust for lack of logger and paths packages
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 18:01:42 -06:00
Jacob Howard
95ad19a481
deps: vendor utility dependencies
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:13 -06:00
Jacob Howard
eab81f859f
inference: use system proxy and enforce RAM for model pulls
2025-03-28 17:53:13 -06:00
Ignasi
97bc6085e6
List models following OpenAI API spec
2025-03-28 17:53:13 -06:00
Ignasi
eef6a7df7c
flusher can't be nil at this point
2025-03-28 17:53:12 -06:00
Ignasi
9232cb2634
Ensure flusher is not nil
2025-03-28 17:53:12 -06:00
Ignasi
0ea001602d
Error handling when pulling models:
- Handle invalid reference
- Handle model not found
2025-03-28 17:53:12 -06:00
Ignasi
2ddd3a57d8
Update usages of model-distribution
2025-03-28 17:53:12 -06:00
Ignasi
2bbd26fdeb
Move prefix paths to inference package
2025-03-28 17:53:12 -06:00
Ignasi
60bc8a5641
Add models prefix to fix telemetry
2025-03-28 17:53:11 -06:00
Ignasi
348832f7d0
- Removes inference prefix from model-manager-related endpoints
- Adds model_manager_test.go
- Removes json suffix from path
2025-03-28 17:53:11 -06:00
Ignasi
15afbbac49
Extract a common variable to avoid repeating the prefix
2025-03-28 17:53:11 -06:00
Ignasi
7878bc7c69
Rename ml.docker.internal to model-runner.docker.internal
2025-03-28 17:53:11 -06:00
Dorin Geman
b33e90ec75
inference: models: Fix download progress streaming
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-03-28 17:53:11 -06:00
Ignasi
4865beb2c3
Fix IT
2025-03-28 17:53:10 -06:00
Ignasi
7a1cf900d1
fix format
2025-03-28 17:53:10 -06:00
Ignasi
70e59ffd36
Potential fix for code scanning alert no. 459: Reflected cross-site scripting
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-03-28 17:53:10 -06:00
Ignasi
1c24c91ea9
Sort imports
2025-03-28 17:53:10 -06:00
Ignasi
c5904eaec9
Show progress on pulling to caller
2025-03-28 17:53:10 -06:00
Ignasi
e64f4a9343
Removes repeated types
2025-03-28 17:53:10 -06:00
Ignasi
2b22896c95
Handle case where distributionClient cannot be initialized
2025-03-28 17:53:09 -06:00
Ignasi
c7dd57bbc2
Logs pull progress
2025-03-28 17:53:09 -06:00
Ignasi
99af02a4f9
No need for different pull-model implementations between Windows and Unix for now
2025-03-28 17:53:09 -06:00
Ignasi
ab0738b920
GetModel must be public
2025-03-28 17:53:09 -06:00
Ignasi
c46c97c14d
Simplify getModels
2025-03-28 17:53:09 -06:00
Ignasi
a83b8e49bf
Applies gofumpt
2025-03-28 17:53:09 -06:00
Ignasi
d6423dfae7
Fixes e2e
2025-03-28 17:53:08 -06:00
Ignasi
b69d84f8aa
Using model distribution client
2025-03-28 17:53:08 -06:00
Ignasi
f1c25bd9a2
Use get models from distribution client
2025-03-28 17:53:08 -06:00
Ignasi
1f90aa3456
Adds model distribution client
2025-03-28 17:53:08 -06:00
Jacob Howard
fe715cd9e6
inference: add routes for a default inference backend
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:08 -06:00
Jacob Howard
910f9350f9
inference: disable pulls on Windows pending docker/model-distribution
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:07 -06:00
Jacob Howard
348f46991c
inference: wire up model deletion endpoint
This endpoint's implementation will wait until we have our official
local model store.
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:06 -06:00
Jacob Howard
d6b1191a01
inference: refactor scheduler to a more modular design
This new design will allow for concurrent runner operation (eventually)
on systems that support it.
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:06 -06:00
Dorin Geman
a14517d6bf
inference: Add stub for llama.cpp backend
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-03-28 17:53:05 -06:00
Dorin Geman
450b828845
inference: Register /models/{namespace}/{name}
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-03-28 17:53:05 -06:00
Jacob Howard
f8cdbc4d81
inference: refactor service and implement scheduling mechanism
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:05 -06:00