Commit Graph

44 Commits

Emily Casey e607ddcc49
Update pkg/inference/models/manager.go
Signed-off-by: Emily Casey <emily.casey@docker.com>

Co-authored-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 14:44:08 -04:00
Emily Casey 5a2117a505
Add endpoint for tagging a model
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-16 16:21:32 -04:00
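
The tagging endpoint added in 5a2117a505 is not shown in this log; purely as an illustration, a handler for such an endpoint might look like the sketch below. The route shape, the modelStore interface, and its Tag method are assumptions for the sketch, not model-runner's actual API.

```go
package models

import "net/http"

// modelStore is a hypothetical stand-in for whatever store model-runner
// actually uses; only a Tag operation matters for this sketch.
type modelStore interface {
	Tag(source, target string) error
}

// handleTagModel sketches a tagging handler: read the existing reference from
// the path, the new tag from the query string, and apply it via the store.
func handleTagModel(store modelStore) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		source := r.PathValue("name")      // existing model reference (Go 1.22+ path value)
		target := r.URL.Query().Get("tag") // tag to apply
		if target == "" {
			http.Error(w, "missing tag parameter", http.StatusBadRequest)
			return
		}
		if err := store.Tag(source, target); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusCreated)
	}
}
```
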
Ignasi 77acaee9dc
AIE-87 Returns 404 on ErrModelNotFound (#10)
* Bump model-distribution

* On ErrModelNotFound returns 404
2025-04-15 12:04:15 +02:00
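
AIE-87 (77acaee9dc) maps model-distribution's ErrModelNotFound to an HTTP 404. A minimal sketch of that mapping follows; the local sentinel stands in for the real error exported by the vendored distribution package, and the surrounding helper is illustrative.

```go
package models

import (
	"errors"
	"net/http"
)

// errModelNotFound stands in for model-distribution's ErrModelNotFound
// sentinel; the real value lives in the vendored distribution package.
var errModelNotFound = errors.New("model not found")

// writeGetModelError sketches the AIE-87 behavior: a lookup error that wraps
// ErrModelNotFound becomes a 404, anything else a 500.
func writeGetModelError(w http.ResponseWriter, err error) {
	if errors.Is(err, errModelNotFound) {
		http.Error(w, err.Error(), http.StatusNotFound)
		return
	}
	http.Error(w, err.Error(), http.StatusInternalServerError)
}
```
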
Ignasi 055b8e451b
Initialize model distribution in model-runner (#8)
* Initialize model distribution in model-runner. This allows removing the model-distribution dependency in Pinata

* Fix main.go

* bump model-distribution

* gofumpt -l -extra -w .
2025-04-10 17:34:56 +02:00
Dorin Geman 77683e0cce
Add main.go for testing
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-09 19:13:09 +03:00
Dorin Geman 51f11dd1f8
Extract the routes into a separate function
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-09 19:11:18 +03:00
Dorin Geman a217ae6000
Support registry/namespace/repository as model name
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-09 18:40:55 +03:00
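
Commit a217ae6000 accepts registry/namespace/repository as a model name. As a rough sketch only (the real parsing and any defaulting rules live in model-runner/model-distribution and are not visible in this log), splitting such a reference might look like:

```go
package models

import "strings"

// splitModelRef is an illustrative splitter for "registry/namespace/repository"
// style model names; shorter forms leave the missing parts empty, and any
// defaulting is assumed to happen elsewhere.
func splitModelRef(ref string) (registry, namespace, repository string) {
	switch parts := strings.Split(ref, "/"); len(parts) {
	case 3:
		return parts[0], parts[1], parts[2]
	case 2:
		return "", parts[0], parts[1]
	default:
		// Anything else is treated as a bare repository name for this sketch.
		return "", "", ref
	}
}
```
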
Jacob Howard 36ae1e3b30
inference: adjust for lack of logger and paths packages
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 18:01:42 -06:00
Jacob Howard 95ad19a481
deps: vendor utility dependencies
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:13 -06:00
Jacob Howard eab81f859f
inference: use system proxy and enforce RAM for model pulls
2025-03-28 17:53:13 -06:00
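
eab81f859f covers two things: honoring the system proxy and enforcing RAM limits for pulls. The proxy half commonly reduces to wiring http.ProxyFromEnvironment into the pull client's transport, sketched below; the helper name is illustrative and the RAM check is not shown here.

```go
package inference

import "net/http"

// pullHTTPClient sketches an HTTP client whose transport picks up
// HTTP_PROXY/HTTPS_PROXY/NO_PROXY from the environment, which is the
// standard-library way to respect the system proxy configuration.
func pullHTTPClient() *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			Proxy: http.ProxyFromEnvironment,
		},
	}
}
```
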
Ignasi 97bc6085e6
List models following OpenAI API spec
2025-03-28 17:53:13 -06:00
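
97bc6085e6 formats the model list like OpenAI's GET /v1/models response, i.e. {"object":"list","data":[...]}. A sketch of that shape follows; how model-runner actually fills in IDs, timestamps, and ownership is not part of this log, and the "docker" owner below is a placeholder.

```go
package models

import (
	"encoding/json"
	"net/http"
)

// openAIModel mirrors one entry of OpenAI's GET /v1/models response.
type openAIModel struct {
	ID      string `json:"id"`
	Object  string `json:"object"`  // always "model"
	Created int64  `json:"created"` // Unix timestamp
	OwnedBy string `json:"owned_by"`
}

// openAIModelList is the enclosing list object.
type openAIModelList struct {
	Object string        `json:"object"` // always "list"
	Data   []openAIModel `json:"data"`
}

// writeModelList sketches serializing locally stored model IDs into the
// OpenAI list format.
func writeModelList(w http.ResponseWriter, ids []string) {
	list := openAIModelList{Object: "list"}
	for _, id := range ids {
		list.Data = append(list.Data, openAIModel{ID: id, Object: "model", OwnedBy: "docker"})
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(list)
}
```
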
Ignasi eef6a7df7c
flusher can't be nil at this point
2025-03-28 17:53:12 -06:00
Ignasi 9232cb2634
Ensure flusher is not nil
2025-03-28 17:53:12 -06:00
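
The two flusher commits above (9232cb2634, eef6a7df7c) revolve around the standard-library pattern of checking whether a ResponseWriter supports http.Flusher before streaming; a minimal sketch:

```go
package models

import "net/http"

// flushIfPossible shows the guarded type assertion: not every
// http.ResponseWriter implements http.Flusher (some wrappers do not), so the
// assertion result is checked before calling Flush.
func flushIfPossible(w http.ResponseWriter) {
	if flusher, ok := w.(http.Flusher); ok {
		flusher.Flush()
	}
	// If the writer cannot flush, streamed progress simply arrives buffered.
}
```
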
Ignasi 0ea001602d
Error handling when pulling models:
- Handle invalid reference
- Handle model not found
2025-03-28 17:53:12 -06:00
Ignasi 2ddd3a57d8
Update usages of model-distribution
2025-03-28 17:53:12 -06:00
Ignasi 2bbd26fdeb
Move prefix paths to inference package
2025-03-28 17:53:12 -06:00
Ignasi 60bc8a5641
Add models prefix to fix telemetry
2025-03-28 17:53:11 -06:00
Ignasi 348832f7d0
- Removes inference prefix from model manager related endpoints
- Adds model_manager_test.go
- Removes json suffix from path
2025-03-28 17:53:11 -06:00
Ignasi 15afbbac49
Extract a common variable to avoid repeating the prefix
2025-03-28 17:53:11 -06:00
Ignasi 7878bc7c69
From ml.docker.internal to model-runner.docker.internal
2025-03-28 17:53:11 -06:00
Dorin Geman b33e90ec75
inference: models: Fix download progress streaming
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-03-28 17:53:11 -06:00
Ignasi 4865beb2c3
Fix IT
2025-03-28 17:53:10 -06:00
Ignasi 7a1cf900d1
fix format
2025-03-28 17:53:10 -06:00
Ignasi 70e59ffd36
Potential fix for code scanning alert no. 459: Reflected cross-site scripting
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-03-28 17:53:10 -06:00
Ignasi 1c24c91ea9
Sort imports
2025-03-28 17:53:10 -06:00
Ignasi c5904eaec9
Show progress on pulling to caller
2025-03-28 17:53:10 -06:00
Ignasi e64f4a9343
Removes repeated types
2025-03-28 17:53:10 -06:00
Ignasi 2b22896c95
Handle case where distributionClient can not be initialized
2025-03-28 17:53:09 -06:00
Ignasi c7dd57bbc2
Logs pull progress
2025-03-28 17:53:09 -06:00
Ignasi 99af02a4f9
No need to use different implementations for pulling models between Windows and Unix for now
2025-03-28 17:53:09 -06:00
Ignasi ab0738b920
GetModel must be public
2025-03-28 17:53:09 -06:00
Ignasi c46c97c14d
Simplify getModels
2025-03-28 17:53:09 -06:00
Ignasi a83b8e49bf
Applies gofumpt
2025-03-28 17:53:09 -06:00
Ignasi d6423dfae7
Fixes e2e
2025-03-28 17:53:08 -06:00
Ignasi b69d84f8aa
Using model distribution client
2025-03-28 17:53:08 -06:00
Ignasi f1c25bd9a2
Use get models from distribution client
2025-03-28 17:53:08 -06:00
Ignasi 1f90aa3456
Adds model distribution client
2025-03-28 17:53:08 -06:00
Jacob Howard fe715cd9e6
inference: add routes for a default inference backend
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:08 -06:00
Jacob Howard 910f9350f9
inference: disable pulls on Windows pending docker/model-distribution
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:07 -06:00
Jacob Howard 348f46991c
inference: wire up model deletion endpoint
This endpoint's implementation will wait until we have our official
local model store.

Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:06 -06:00
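
348f46991c wires up the deletion route while the official local model store is still pending. One way such a placeholder handler could look (the actual stub behavior in the commit is not shown in this log):

```go
package inference

import "net/http"

// handleDeleteModel is an illustrative stub: the route exists so clients can
// already target it, but deletion is refused until the local model store lands.
func handleDeleteModel(w http.ResponseWriter, r *http.Request) {
	http.Error(w, "model deletion is not implemented yet", http.StatusNotImplemented)
}
```
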
Jacob Howard d6b1191a01
inference: refactor scheduler to a more modular design
This new design will allow for concurrent runner operation (eventually)
on systems that support it.

Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:06 -06:00
Dorin Geman a14517d6bf
inference: Add stub for llama.cpp backend
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-03-28 17:53:05 -06:00
Dorin Geman 450b828845
inference: Register /models/{namespace}/{name}
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-03-28 17:53:05 -06:00
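
450b828845 registers /models/{namespace}/{name}. With Go 1.22+'s pattern-matching ServeMux, that registration could look like the sketch below; whether model-runner uses the standard mux or another router is an assumption here, and the handler body is a placeholder.

```go
package inference

import (
	"fmt"
	"net/http"
)

// registerModelRoutes sketches registering the /models/{namespace}/{name}
// pattern using net/http path parameters (Go 1.22+).
func registerModelRoutes(mux *http.ServeMux) {
	mux.HandleFunc("GET /models/{namespace}/{name}", func(w http.ResponseWriter, r *http.Request) {
		namespace := r.PathValue("namespace")
		name := r.PathValue("name")
		fmt.Fprintf(w, "model %s/%s\n", namespace, name) // placeholder response
	})
}
```
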
Jacob Howard f8cdbc4d81
inference: refactor service and implement scheduling mechanism
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:05 -06:00