Emily Casey
e607ddcc49
Update pkg/inference/models/manager.go
Signed-off-by: Emily Casey <emily.casey@docker.com>
Co-authored-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 14:44:08 -04:00
Emily Casey
5a2117a505
Add endpoint for tagging a model
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-16 16:21:32 -04:00
Ignasi
77acaee9dc
AIE-87 Returns 404 on ErrModelNotFound (#10)
* Bump model-distribution
* Return 404 on ErrModelNotFound
2025-04-15 12:04:15 +02:00
Ignasi
055b8e451b
Initialize model distribution in model-runner ( #8 )
* Initialize model distribution in model-runner. This allows removing the model-distribution dependency in Pinata
* Fix main.go
* Bump model-distribution
* gofumpt -l -extra -w .
2025-04-10 17:34:56 +02:00
Dorin Geman
77683e0cce
Add main.go for testing
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-09 19:13:09 +03:00
Dorin Geman
51f11dd1f8
Extract the routes in a separate function
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-09 19:11:18 +03:00
Dorin Geman
a217ae6000
Support registry/namespace/repository as model name
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-09 18:40:55 +03:00
Jacob Howard
36ae1e3b30
inference: adjust for lack of logger and paths packages
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 18:01:42 -06:00
Jacob Howard
95ad19a481
deps: vendor utility dependencies
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:13 -06:00
Jacob Howard
eab81f859f
inference: use system proxy and enforce RAM for model pulls
2025-03-28 17:53:13 -06:00
Ignasi
97bc6085e6
List models following OpenAI API spec
2025-03-28 17:53:13 -06:00
Ignasi
eef6a7df7c
flusher can't be nil at this point
2025-03-28 17:53:12 -06:00
Ignasi
9232cb2634
Ensure flusher is not nil
2025-03-28 17:53:12 -06:00
Ignasi
0ea001602d
Error handling when pulling models:
- Handle invalid reference
- Handle model not found
2025-03-28 17:53:12 -06:00
Ignasi
2ddd3a57d8
Update usages of model-distribution
2025-03-28 17:53:12 -06:00
Ignasi
2bbd26fdeb
Move prefix paths to inference package
2025-03-28 17:53:12 -06:00
Ignasi
60bc8a5641
Add models prefix to fix telemetry
2025-03-28 17:53:11 -06:00
Ignasi
348832f7d0
- Removes inference prefix from model-manager-related endpoints
- Adds model_manager_test.go
- Removes json suffix from path
2025-03-28 17:53:11 -06:00
Ignasi
15afbbac49
Extract a common variable to avoid repeating the prefix
2025-03-28 17:53:11 -06:00
Ignasi
7878bc7c69
Rename ml.docker.internal to model-runner.docker.internal
2025-03-28 17:53:11 -06:00
Dorin Geman
b33e90ec75
inference: models: Fix download progress streaming
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-03-28 17:53:11 -06:00
Ignasi
4865beb2c3
Fix IT
2025-03-28 17:53:10 -06:00
Ignasi
7a1cf900d1
fix format
2025-03-28 17:53:10 -06:00
Ignasi
70e59ffd36
Potential fix for code scanning alert no. 459: Reflected cross-site scripting
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2025-03-28 17:53:10 -06:00
Ignasi
1c24c91ea9
Sort imports
2025-03-28 17:53:10 -06:00
Ignasi
c5904eaec9
Show progress on pulling to caller
2025-03-28 17:53:10 -06:00
Ignasi
e64f4a9343
Removes repeated types
2025-03-28 17:53:10 -06:00
Ignasi
2b22896c95
Handle case where distributionClient cannot be initialized
2025-03-28 17:53:09 -06:00
Ignasi
c7dd57bbc2
Logs pull progress
2025-03-28 17:53:09 -06:00
Ignasi
99af02a4f9
No need for different pull-model implementations between Windows and Unix for now
2025-03-28 17:53:09 -06:00
Ignasi
ab0738b920
GetModel must be public
2025-03-28 17:53:09 -06:00
Ignasi
c46c97c14d
Simplify getModels
2025-03-28 17:53:09 -06:00
Ignasi
a83b8e49bf
Applies gofumpt
2025-03-28 17:53:09 -06:00
Ignasi
d6423dfae7
Fixes e2e
2025-03-28 17:53:08 -06:00
Ignasi
b69d84f8aa
Using model distribution client
2025-03-28 17:53:08 -06:00
Ignasi
f1c25bd9a2
Use get models from distribution client
2025-03-28 17:53:08 -06:00
Ignasi
1f90aa3456
Adds model distribution client
2025-03-28 17:53:08 -06:00
Jacob Howard
fe715cd9e6
inference: add routes for a default inference backend
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:08 -06:00
Jacob Howard
910f9350f9
inference: disable pulls on Windows pending docker/model-distribution
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:07 -06:00
Jacob Howard
348f46991c
inference: wire up model deletion endpoint
This endpoint's implementation will wait until we have our official
local model store.
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:06 -06:00
Jacob Howard
d6b1191a01
inference: refactor scheduler to a more modular design
This new design will allow for concurrent runner operation (eventually)
on systems that support it.
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:06 -06:00
Dorin Geman
a14517d6bf
inference: Add stub for llama.cpp backend
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-03-28 17:53:05 -06:00
Dorin Geman
450b828845
inference: Register /models/{namespace}/{name}
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-03-28 17:53:05 -06:00
Jacob Howard
f8cdbc4d81
inference: refactor service and implement scheduling mechanism
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:05 -06:00