Commit Graph

322 Commits

Author SHA1 Message Date
Jacob Howard 3d8c73c355
[AIE-151] native: support dynamic detection of OpenCL
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-30 13:48:31 -06:00
Dorin-Andrei Geman 4c3ffbfa53
Merge pull request #26 from doringeman/support-native-platform
2025-04-30 19:30:45 +03:00
Dorin Geman 596d2f7a94
Support native platform (include linux/arm64)
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-30 18:17:49 +03:00
Ignasi dbbb7afe9f
Dockerize (#22)
* Adds Makefile for local development

* Fix chat completions example request

* Added delete example

* Dockerize model-runner

* WIP Run container with host access to socket

* Debugging

* Run in Docker container with TCP port access

* mounted model storage

* - Remove duplication in .gitignore
- Do not use alpine in builder image
- NVIDIA seems to use Ubuntu in all of their CDI docs and produces Ubuntu tags for nvidia/cuda but not Debian. So use Ubuntu for our final image
For more details: https://github.com/docker/model-runner/pull/22

* - Add MODELS_PATH environment variable to configure model storage location
- Default to $HOME/.docker/models when MODELS_PATH is not set
- Update Docker container to use /models as the default storage path
- Update Makefile to pass MODELS_PATH to container
- Update Dockerfile to create and set permissions for /models directory

This change allows users to:
- Override the model storage location via MODELS_PATH
- Maintain backward compatibility with default $HOME/.docker/models path
- Use a more idiomatic folder for /models

* Removes unneeded logs
2025-04-29 18:03:12 +02:00
Piotr Stankiewicz 4239791795 Basic param tuning on windows/arm64
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-25 16:26:35 +02:00
Piotr Stankiewicz 978875e99c Enable basic windows/arm64 support
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-25 16:26:35 +02:00
Ignasi 7545f180cd
Adds makefile (#20)
* Adds Makefile for local development

* Fix chat completions example request

* Added delete example
2025-04-25 09:38:54 +02:00
Dorin-Andrei Geman b02cdd9568
Merge pull request #21 from docker/ps-fix-auto-update
Fix llama-server auto update
2025-04-24 19:02:49 +03:00
Piotr Stankiewicz 87fd6f6466 Fix llama-server auto update
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-24 17:31:42 +02:00
Emily Casey 4c6c6a3da4
Merge pull request #18 from docker/force-delete
Allow force deletion of models
2025-04-18 18:33:35 -04:00
Emily Casey 858b905c5d Allow force deletion of multiply tagged models
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-18 18:28:49 -04:00
Emily Casey 5fe701d3f7
Merge pull request #17 from docker/handle-missing-dir
Gracefully handle models directory deletion
2025-04-18 18:26:50 -04:00
Emily Casey 247f63df65 Handle missing models dir
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-18 18:02:32 -04:00
Emily Casey 235aaf289f
Merge pull request #16 from docker/untagged-openai-models
Set openAI ID to model ID when untagged
2025-04-18 13:19:09 -04:00
Emily Casey 6a781b0eeb Set openAI id to model ID when untagged
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-18 13:02:16 -04:00
Emily Casey 9010d23cef
Merge pull request #15 from docker/tag-and-push
Tag and push
2025-04-17 22:31:39 -04:00
Emily Casey eab9cf8f2b make tag required
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-17 21:57:25 -04:00
Emily Casey cac9d9dd8b Update pkg/inference/models/manager.go
Signed-off-by: Emily Casey <emily.casey@docker.com>

Co-authored-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 20:09:36 -04:00
Emily Casey 3a7a483d68 Make pull errors symmetrical with push
Return Unauthorized status when appropriate.

Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-17 20:08:50 -04:00
Emily Casey 43fdb4cd50 Handle unauthorized error
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-17 20:08:50 -04:00
Emily Casey fdf81fe571 Adds push endpoint
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-17 20:08:50 -04:00
Emily Casey c2f030c9df Fix tag routes
Handle any number of path elements in model name.

Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-17 20:08:47 -04:00
Emily Casey 8d986932e4 Revert "Revert "Add endpoint for tagging a model""
This reverts commit 739247ee02.
2025-04-17 19:59:25 -04:00
Jacob Howard 586c6df18f
Merge pull request #14 from docker/revert-11-tag-model
Revert "Add endpoint for tagging a model"
2025-04-17 14:43:47 -06:00
Emily Casey 739247ee02
Revert "Add endpoint for tagging a model"
2025-04-17 16:42:02 -04:00
Jacob Howard 3b88d22cbd
Merge pull request #13 from docker/ps-win-support
Windows support
2025-04-17 14:08:17 -06:00
Jacob Howard ca5fbbd8e8
chore: run go fmt
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 14:04:02 -06:00
Jacob Howard ed476dcbb8
chore: code review suggestions and go mod tidy
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 13:59:32 -06:00
Emily Casey d1d8e7a3c1
Merge pull request #11 from docker/tag-model
Add endpoint for tagging a model
2025-04-17 15:54:42 -04:00
Emily Casey e607ddcc49
Update pkg/inference/models/manager.go
Signed-off-by: Emily Casey <emily.casey@docker.com>

Co-authored-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 14:44:08 -04:00
Dorin Geman 40f7438308 Lock ShouldUseGPUVariant
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:11:23 +02:00
Dorin Geman e5d5ccf2dd Add Status to Backend interface
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:11:20 +02:00
Dorin Geman 5e4719501a Reset on Install
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:10:16 +02:00
Dorin Geman 75f963a112 Show the GPU-backed setting only if it is available
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:10:11 +02:00
Dorin Geman eb0dba0cc8 No need to use the updated llama.cpp if the bundled one is up to date
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:10:05 +02:00
Dorin Geman a3fb86a0bb Force a re-installation if EnableInferenceGPUVariant has changed
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:10:00 +02:00
Dorin Geman 5d56ba5ad3 Kill process on Windows
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:09:52 +02:00
Piotr Stankiewicz 6f3b1b4f25 Dynamic download of CUDA binaries
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-17 19:09:33 +02:00
Piotr Stankiewicz a9e72bf303 Enable Windows support
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-17 19:07:16 +02:00
Emily Casey 5a2117a505 Add endpoint for tagging a model
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-16 16:21:32 -04:00
Ignasi 77acaee9dc
AIE-87 Returns 404 on ErrModelNotFound (#10)
* Bump model-distribution

* On ErrModelNotFound returns 404
2025-04-15 12:04:15 +02:00
Jacob Howard 56f1f14e39
Merge pull request #9 from docker/readme-license
[AIE-97] chore: add README.md and LICENSE in preparation for open sourcing
2025-04-14 13:15:03 -06:00
Jacob Howard c370316eb4
chore: add README.md and LICENSE in preparation for open sourcing
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-14 10:44:26 -06:00
Ignasi 055b8e451b
Initialize model distribution in model-runner (#8)
* Initialize model distribution in model-runner. This allows removing the model-distribution dependency in Pinata

* Fix main.go

* bump model-distribution

* gofumpt -l -extra -w .
2025-04-10 17:34:56 +02:00
Dorin-Andrei Geman dc011af9e2
Merge pull request #7 from docker/improvements
main.go: Add llama.cpp and scheduler for full experience
2025-04-10 12:18:41 +03:00
Ignasi e7b4657d0b
bump model-distribution (#6)
2025-04-10 10:45:53 +02:00
Dorin Geman 9c16427ada main.go: Add llama.cpp and scheduler for full experience
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-10 11:36:19 +03:00
Dorin Geman 566873860c scheduler: Extract the routes in a separate function
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-10 11:33:52 +03:00
Dorin Geman 6c16447985 main.go: Socket is removed on Close()
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-10 11:32:40 +03:00
Dorin Geman b302636906 llama.cpp: Remove socket on exit
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-10 11:29:35 +03:00