Commit Graph

163 Commits

Author SHA1 Message Date
Piotr Stankiewicz c6e45727bc scheduler: Increase runtime start timeout to 300s
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-05-22 10:12:11 +02:00
Dorin-Andrei Geman c818aab53e
Merge pull request #46 from doringeman/unload
Add /engines/unload
2025-05-21 18:26:08 +03:00
Dorin-Andrei Geman 7ddea9dde4
Merge pull request #45 from doringeman/df
Add /engines/df
2025-05-21 18:25:22 +03:00
Dorin-Andrei Geman 105562eade
Merge pull request #42 from doringeman/ps
Add /engines/ps
2025-05-21 18:23:21 +03:00
Dorin Geman 47a0fae220
Add /engines/unload
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-05-21 14:14:04 +03:00
Dorin Geman b881521c88
Add /engines/df
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-05-21 09:59:37 +03:00
Dorin Geman 13c093ca1e
Add /engines/ps
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-05-20 16:00:27 +03:00
Ignasi e6fd394300
Do not html encode pull/push progress if accept header is set to text/json (#38)
2025-05-14 09:37:35 +02:00
Jacob Howard 6e32aa492c
Merge pull request #39 from docker/hide-image-on-cloud
standalone: hide supporting images on cloud
2025-05-13 14:32:20 -06:00
Jacob Howard d9b56ede3c
standalone: hide supporting images on cloud
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-05-13 14:29:10 -06:00
Jacob Howard c20b2b9af9
Merge pull request #37 from docker/default-port
image: set default TCP port
2025-05-13 09:24:15 -06:00
Jacob Howard df8bc82281
image: set default TCP port
When running as a container, we'll always want to listen on TCP, so
we'll set the default port to match the standard model runner default.

Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-05-12 19:17:50 -06:00
Emily Casey 96af7b750f
Merge pull request #36 from docker/bump-model-distribution
Bump model distribution
2025-05-12 13:04:13 -06:00
Emily Casey f79622e1ad Bump model distribution
Signed-off-by: Emily Casey <emily.casey@docker.com>
2025-05-12 13:01:27 -06:00
Piotr Stankiewicz 84ed5fdb94 llamacpp: Use --host instead of DD_INF_UDS
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-05-12 07:45:26 +02:00
Dorin-Andrei Geman b61a82b779
Merge pull request #34 from doringeman/misc
llama.cpp: linux: Set running status on errLlamaCppUpToDate
2025-05-09 00:08:39 +03:00
Dorin Geman 359af9c951 llama.cpp: linux: Set running status on errLlamaCppUpToDate
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-05-08 19:46:36 +03:00
Dorin-Andrei Geman f9a9cd3865
Merge pull request #33 from doringeman/misc
Remove unused env vars from Makefile and update README
2025-05-08 16:55:39 +03:00
Dorin Geman f826ff1830
chore: Update default LLAMA_SERVER_VERSION to latest
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-05-08 16:53:23 +03:00
Dorin Geman 7bec0a3789
chore: Update README with LLAMA_SERVER_VERSION and LLAMA_SERVER_VARIANT
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-05-08 16:46:28 +03:00
Dorin Geman 80bd685064
chore: Remove unused ACCEL
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-05-08 16:45:38 +03:00
Dorin Geman 271550f12b
chore: Remove unused TARGET_OS
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-05-08 16:34:43 +03:00
Piotr Stankiewicz 342b1606a6 standalone: Allow customising the base image
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-05-08 15:33:19 +02:00
Dorin-Andrei Geman 8ce9641765
Merge pull request #32 from doringeman/gomod
Run go mod tidy
2025-05-07 20:21:44 +03:00
Dorin-Andrei Geman 1665669f8e
Merge pull request #31 from doringeman/gitignore
gitignore: Ignore models/ which is the default MODELS_PATH in Makefile
2025-05-07 20:21:37 +03:00
Dorin-Andrei Geman 749caa9ac1
Merge pull request #30 from doringeman/normalized-server-mux
Add NormalizedServeMux
2025-05-07 20:21:23 +03:00
Dorin Geman 82bb009ec5
Add NormalizedServeMux
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-05-07 19:39:41 +03:00
Dorin Geman c37ee7fcce
Run go mod tidy
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-05-07 19:00:31 +03:00
Dorin Geman 775a5becde
gitignore: Ignore models/ which is the default MODELS_PATH in Makefile
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-05-07 18:56:12 +03:00
Jacob Howard 5e3cf52e84
Merge pull request #28 from docker/multiarch
Make the `docker-build` target create a multiarch image
2025-05-06 08:05:30 -06:00
Jacob Howard 19a107987a
Change target image name, make multi-arch, and remove binary path ARG
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-05-05 19:03:22 -06:00
Jacob Howard f30a4ce4ed
Merge pull request #24 from docker/opencl-detection
[AIE-151] native: support dynamic detection of OpenCL
2025-05-05 11:06:46 -06:00
Jacob Howard c16f0997ef
Merge pull request #27 from docker/dockerfile-tweaks
standalone: Makefile and Dockerfile tweaks
2025-05-05 08:47:13 -06:00
Jacob Howard 3df299bc66
standalone: fix portability for arm64 platforms
On Linux, uname -m reports "aarch64" (vs. the "arm64" reported on
macOS).

Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-05-01 18:13:26 -06:00
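
The portability note in commit 3df299bc66 can be illustrated with a small sketch. The commit itself only says that Linux reports "aarch64" where macOS reports "arm64"; the helper name `normalizeArch` and the extra `x86_64` to `amd64` mapping are assumptions added for illustration and are not taken from the repository.

```go
package main

import "fmt"

// normalizeArch is a hypothetical helper (not from the repository) that maps
// the machine name printed by `uname -m` to the architecture name used in
// Docker platform strings such as linux/arm64.
func normalizeArch(machine string) string {
	switch machine {
	case "aarch64", "arm64":
		// Linux reports "aarch64"; macOS reports "arm64".
		return "arm64"
	case "x86_64", "amd64":
		// Assumption: 64-bit x86 hosts are normalized the same way.
		return "amd64"
	default:
		// Unknown values are passed through unchanged.
		return machine
	}
}

func main() {
	for _, m := range []string{"aarch64", "arm64", "x86_64"} {
		fmt.Printf("%s -> %s\n", m, normalizeArch(m))
	}
}
```

A build script could apply the same mapping before composing a platform string, so that one target works on both Linux and macOS arm64 hosts.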
Jacob Howard 8ac3bc01da
standalone: set default MODELS_PATH environment variable
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-05-01 18:03:48 -06:00
Jacob Howard 7634048cb8
standalone: extract the backend binary from its packaging directories
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-05-01 18:02:57 -06:00
Jacob Howard 7b020e345b
standalone: give write permissions on /app for backend socket listeners
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-05-01 17:59:12 -06:00
Jacob Howard 3c6008e869
llamacpp: fix socket removal error detection and formatting
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-05-01 16:35:02 -06:00
Jacob Howard 0f1ffa2c19
llamacpp: disable CUDA check for Windows/ARM64
We don't have the com.docker.nv-gpu-info.exe tool there (yet).

Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-05-01 14:20:18 -06:00
Jacob Howard 1fef60e334
native: don't apply version restrictions on Adreno GPUs
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-30 13:48:34 -06:00
Jacob Howard 3b5bf559db
native: add Adreno device constraints for OpenCL backend
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-30 13:48:34 -06:00
Jacob Howard 3d8c73c355
[AIE-151] native: support dynamic detection of OpenCL
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-30 13:48:31 -06:00
Dorin-Andrei Geman 4c3ffbfa53
Merge pull request #26 from doringeman/support-native-platform
2025-04-30 19:30:45 +03:00
Dorin Geman 596d2f7a94
Support native platform (include linux/arm64)
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-30 18:17:49 +03:00
Ignasi dbbb7afe9f
Dockerize (#22)
* Adds Makefile for local development

* Fix chat completions example request

* Added delete example

* Dockerize model-runner

* WIP Run container with host access to socket

* Dockerize model-runner

* WIP Run container with host access to socket

* Debugging

* Run in Docker container with TCP port access

* mounted model storage

* - Remove duplication in .gitignore
- Do not use alpine in builder image
- NVIDIA seems to use Ubuntu in all of their CDI docs and produces Ubuntu tags for nvidia/cuda but not Debian. So use Ubuntu for our final image
For more details: https://github.com/docker/model-runner/pull/22

* - Add MODELS_PATH environment variable to configure model storage location
- Default to $HOME/.docker/models when MODELS_PATH is not set
- Update Docker container to use /models as the default storage path
- Update Makefile to pass MODELS_PATH to container
- Update Dockerfile to create and set permissions for /models directory

This change allows users to:
- Override the model storage location via MODELS_PATH
- Maintain backward compatibility with default $HOME/.docker/models path
- Use a more idiomatic folder for /models

* Removes unneeded logs
2025-04-29 18:03:12 +02:00
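
The MODELS_PATH behaviour described in commit dbbb7afe9f (use the variable when set, otherwise fall back to $HOME/.docker/models) could look roughly like the sketch below. The function name `resolveModelsPath` and its signature are hypothetical; the actual implementation in the repository may differ.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveModelsPath is a hypothetical helper (not from the repository).
// It returns envValue when MODELS_PATH was set to a non-empty value, and
// otherwise falls back to <homeDir>/.docker/models as the commit describes.
func resolveModelsPath(envValue, homeDir string) string {
	if envValue != "" {
		return envValue
	}
	return filepath.Join(homeDir, ".docker", "models")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine home directory:", err)
		os.Exit(1)
	}
	// Inside the container, the commit sets MODELS_PATH to /models instead.
	fmt.Println(resolveModelsPath(os.Getenv("MODELS_PATH"), home))
}
```

Taking the environment value and home directory as parameters keeps the fallback logic testable without touching the real environment.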
Piotr Stankiewicz 4239791795 Basic param tuning on windows/arm64
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-25 16:26:35 +02:00
Piotr Stankiewicz 978875e99c Enable basic windows/arm64 support
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-25 16:26:35 +02:00
Ignasi 7545f180cd
Adds makefile (#20)
* Adds Makefile for local development

* Fix chat completions example request

* Added delete example
2025-04-25 09:38:54 +02:00
Dorin-Andrei Geman b02cdd9568
Merge pull request #21 from docker/ps-fix-auto-update
Fix llama-server auto update
2025-04-24 19:02:49 +03:00
Piotr Stankiewicz 87fd6f6466 Fix llama-server auto update
Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@docker.com>
2025-04-24 17:31:42 +02:00