Jacob Howard
3d8c73c355
[AIE-151] native: support dynamic detection of OpenCL
...
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-30 13:48:31 -06:00
Dorin-Andrei Geman
4c3ffbfa53
Merge pull request #26 from doringeman/support-native-platform
2025-04-30 19:30:45 +03:00
Dorin Geman
596d2f7a94
Support native platform (include linux/arm64)
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-30 18:17:49 +03:00
Ignasi
dbbb7afe9f
Dockerize (#22)
...
* Adds Makefile for local development
* Fix chat completions example request
* Added delete example
* Dockerize model-runner
* WIP Run container with host access to socket
* Debugging
* Run in Docker container with TCP port access
* mounted model storage
* - Remove duplication in .gitignore
  - Do not use alpine in builder image
  - NVIDIA seems to use Ubuntu in all of their CDI docs and produces Ubuntu tags for nvidia/cuda but not Debian. So use Ubuntu for our final image
  For more details: https://github.com/docker/model-runner/pull/22
* - Add MODELS_PATH environment variable to configure model storage location
  - Default to $HOME/.docker/models when MODELS_PATH is not set
  - Update Docker container to use /models as the default storage path
  - Update Makefile to pass MODELS_PATH to container
  - Update Dockerfile to create and set permissions for /models directory
  This change allows users to:
  - Override the model storage location via MODELS_PATH
  - Maintain backward compatibility with default $HOME/.docker/models path
  - Use a more idiomatic folder for /models
* Removes unneeded logs
2025-04-29 18:03:12 +02:00
Piotr Stankiewicz
4239791795
Basic param tuning on windows/arm64
...
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-25 16:26:35 +02:00
Piotr Stankiewicz
978875e99c
Enable basic windows/arm64 support
...
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-25 16:26:35 +02:00
Ignasi
7545f180cd
Adds makefile (#20)
...
* Adds Makefile for local development
* Fix chat completions example request
* Added delete example
2025-04-25 09:38:54 +02:00
Dorin-Andrei Geman
b02cdd9568
Merge pull request #21 from docker/ps-fix-auto-update
...
Fix llama-server auto update
2025-04-24 19:02:49 +03:00
Piotr Stankiewicz
87fd6f6466
Fix llama-server auto update
...
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-24 17:31:42 +02:00
Emily Casey
4c6c6a3da4
Merge pull request #18 from docker/force-delete
...
Allow force deletion of models
2025-04-18 18:33:35 -04:00
Emily Casey
858b905c5d
Allow force deletion of multiply tagged models
...
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-18 18:28:49 -04:00
Emily Casey
5fe701d3f7
Merge pull request #17 from docker/handle-missing-dir
...
Gracefully handle models directory deletion
2025-04-18 18:26:50 -04:00
Emily Casey
247f63df65
Handle missing models dir
...
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-18 18:02:32 -04:00
Emily Casey
235aaf289f
Merge pull request #16 from docker/untagged-openai-models
...
Set OpenAI ID to model ID when untagged
2025-04-18 13:19:09 -04:00
Emily Casey
6a781b0eeb
Set OpenAI ID to model ID when untagged
...
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-18 13:02:16 -04:00
Emily Casey
9010d23cef
Merge pull request #15 from docker/tag-and-push
...
Tag and push
2025-04-17 22:31:39 -04:00
Emily Casey
eab9cf8f2b
make tag required
...
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-17 21:57:25 -04:00
Emily Casey
cac9d9dd8b
Update pkg/inference/models/manager.go
...
Signed-off-by: Emily Casey <emily.casey@docker.com>
Co-authored-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 20:09:36 -04:00
Emily Casey
3a7a483d68
Make pull errors symmetrical with push
...
Return Unauthorized status when appropriate.
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-17 20:08:50 -04:00
Emily Casey
43fdb4cd50
Handle unauthorized error
...
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-17 20:08:50 -04:00
Emily Casey
fdf81fe571
Adds push endpoint
...
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-17 20:08:50 -04:00
Emily Casey
c2f030c9df
Fix tag routes
...
Handle any number of path elements in model name.
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-17 20:08:47 -04:00
Emily Casey
8d986932e4
Revert "Revert "Add endpoint for tagging a model""
...
This reverts commit 739247ee02.
2025-04-17 19:59:25 -04:00
Jacob Howard
586c6df18f
Merge pull request #14 from docker/revert-11-tag-model
...
Revert "Add endpoint for tagging a model"
2025-04-17 14:43:47 -06:00
Emily Casey
739247ee02
Revert "Add endpoint for tagging a model"
2025-04-17 16:42:02 -04:00
Jacob Howard
3b88d22cbd
Merge pull request #13 from docker/ps-win-support
...
Windows support
2025-04-17 14:08:17 -06:00
Jacob Howard
ca5fbbd8e8
chore: run go fmt
...
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 14:04:02 -06:00
Jacob Howard
ed476dcbb8
chore: code review suggestions and go mod tidy
...
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 13:59:32 -06:00
Emily Casey
d1d8e7a3c1
Merge pull request #11 from docker/tag-model
...
Add endpoint for tagging a model
2025-04-17 15:54:42 -04:00
Emily Casey
e607ddcc49
Update pkg/inference/models/manager.go
...
Signed-off-by: Emily Casey <emily.casey@docker.com>
Co-authored-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 14:44:08 -04:00
Dorin Geman
40f7438308
Lock ShouldUseGPUVariant
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:11:23 +02:00
Dorin Geman
e5d5ccf2dd
Add Status to Backend interface
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:11:20 +02:00
Dorin Geman
5e4719501a
Reset on Install
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:10:16 +02:00
Dorin Geman
75f963a112
Show the GPU-backed setting only if it is available
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:10:11 +02:00
Dorin Geman
eb0dba0cc8
No need to use the updated llama.cpp if the bundled one is up to date
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:10:05 +02:00
Dorin Geman
a3fb86a0bb
Force a re-installation if EnableInferenceGPUVariant has changed
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:10:00 +02:00
Dorin Geman
5d56ba5ad3
Kill process on Windows
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:09:52 +02:00
Piotr Stankiewicz
6f3b1b4f25
Dynamic download of CUDA binaries
...
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-17 19:09:33 +02:00
Piotr Stankiewicz
a9e72bf303
Enable Windows support
...
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-17 19:07:16 +02:00
Emily Casey
5a2117a505
Add endpoint for tagging a model
...
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-04-16 16:21:32 -04:00
Ignasi
77acaee9dc
AIE-87 Returns 404 on ErrModelNotFound (#10)
...
* Bump model-distribution
* On ErrModelNotFound returns 404
2025-04-15 12:04:15 +02:00
Jacob Howard
56f1f14e39
Merge pull request #9 from docker/readme-license
...
[AIE-97] chore: add README.md and LICENSE in preparation for open sourcing
2025-04-14 13:15:03 -06:00
Jacob Howard
c370316eb4
chore: add README.md and LICENSE in preparation for open sourcing
...
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-14 10:44:26 -06:00
Ignasi
055b8e451b
Initialize model distribution in model-runner (#8)
...
* Initialize model distribution in model-runner. This allows removing the model-distribution dependency in Pinata
* Fix main.go
* bump model-distribution
* gofumpt -l -extra -w .
2025-04-10 17:34:56 +02:00
Dorin-Andrei Geman
dc011af9e2
Merge pull request #7 from docker/improvements
...
main.go: Add llama.cpp and scheduler for full experience
2025-04-10 12:18:41 +03:00
Ignasi
e7b4657d0b
bump model-distribution (#6)
2025-04-10 10:45:53 +02:00
Dorin Geman
9c16427ada
main.go: Add llama.cpp and scheduler for full experience
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-10 11:36:19 +03:00
Dorin Geman
566873860c
scheduler: Extract the routes in a separate function
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-10 11:33:52 +03:00
Dorin Geman
6c16447985
main.go: Socket is removed on Close()
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-10 11:32:40 +03:00
Dorin Geman
b302636906
llama.cpp: Remove socket on exit
...
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-10 11:29:35 +03:00