Commit Graph

  • 257c9cccd4
    Merge pull request #2 from thaJeztah/migrate_containerd_v2 Dorin-Andrei Geman 2025-04-08 12:03:32 +0300
  • 795693e560
    pkg/internal/dockerhub: migrate to containerd v2 Sebastiaan van Stijn 2025-04-07 17:24:27 +0200
  • 473b9df71e
    llama.cpp: Pass server storage paths as parameters Dorin Geman 2025-04-02 17:29:10 +0300
  • dccbd7ee30
    llama.cpp: Pass server storage paths as parameters (dorin.geman/llamacpp-paths) Dorin Geman 2025-04-02 17:29:10 +0300
  • 184f9c426b
    inference: remove platform support detection Jacob Howard 2025-03-28 18:26:04 -0600
  • 36ae1e3b30
    inference: adjust for lack of logger and paths packages Jacob Howard 2025-03-28 17:41:41 -0600
  • 10a7de56cd
    deps: update go.mod and add go.sum Jacob Howard 2025-03-28 16:03:03 -0600
  • 95ad19a481
    deps: vendor utility dependencies Jacob Howard 2025-03-28 15:23:38 -0600
  • bd68cb37ef
    chore: add go.mod Jacob Howard 2025-03-28 15:23:56 -0600
  • f7cec84173
    deps: remove errordef package references Jacob Howard 2025-03-28 15:09:10 -0600
  • eab81f859f
    inference: use system proxy and enforce RAM for model pulls Jacob Howard 2025-03-25 14:53:04 -0600
  • 1d903ff29d
    inference: Retry backend install if it failed due to context.Canceled Dorin Geman 2025-03-22 12:11:31 +0200
  • fbf1e0f579
    inference: Add temporary mechanism for dynamic installation Dorin Geman 2025-03-22 10:48:37 +0200
  • 97bc6085e6
    List models following OpenAI API spec Ignasi 2025-03-20 13:51:35 +0100
  • eef6a7df7c
    flusher can't be nil at this point Ignasi 2025-03-19 18:27:53 +0100
  • 9232cb2634
    Ensure flusher is not nil Ignasi 2025-03-19 17:13:58 +0100
  • 0ea001602d
    Error handling when pulling models: handle invalid reference; handle model not found Ignasi 2025-03-19 14:17:09 +0100
  • 2ddd3a57d8
    Update usages of model-distribution Ignasi 2025-03-18 23:37:52 +0100
  • 4c46f589d7
    inference: Supported on darwin/arm64 only Dorin Geman 2025-03-14 11:13:21 +0200
  • 2bbd26fdeb
    Move prefix paths to inference package Ignasi 2025-03-12 11:36:36 +0100
  • 60bc8a5641
    Add models prefix to fix telemetry Ignasi 2025-03-12 10:47:21 +0100
  • 348832f7d0
    Removes inference prefix from model manager related endpoints; adds model_manager_test.go; removes json suffix from path Ignasi 2025-03-11 22:32:22 +0100
  • 15afbbac49
    Extract to common variable to not repeat the prefix Ignasi 2025-03-11 13:12:48 +0100
  • 7878bc7c69
    From ml.docker.internal to model-runner.docker.internal Ignasi 2025-03-10 16:50:51 +0100
  • 55843c8685
    Adds --jinja to provide tool calling support Ignasi 2025-03-12 14:11:30 +0100
  • b33e90ec75
    inference: models: Fix download progress streaming Dorin Geman 2025-03-07 13:09:01 +0200
  • 4865beb2c3
    Fix IT Ignasi 2025-03-06 17:09:02 +0100
  • 7a1cf900d1
    fix format Ignasi 2025-03-06 16:03:22 +0100
  • 70e59ffd36
    Potential fix for code scanning alert no. 459: Reflected cross-site scripting Ignasi 2025-03-06 15:54:51 +0100
  • 1c24c91ea9
    Sort imports Ignasi 2025-03-06 15:38:36 +0100
  • c5904eaec9
    Show progress on pulling to caller Ignasi 2025-03-06 15:33:31 +0100
  • 837da6d3a7
    Fix type Ignasi 2025-03-06 15:24:31 +0100
  • e64f4a9343
    Removes repeated types Ignasi 2025-03-06 14:03:46 +0100
  • 2b22896c95
    Handle case where distributionClient can not be initialized Ignasi 2025-03-06 13:47:26 +0100
  • c7dd57bbc2
    Logs pull progress Ignasi 2025-03-06 13:39:39 +0100
  • 99af02a4f9
    No need to use different implementations for pull models between win/unix for now Ignasi 2025-03-06 12:10:37 +0100
  • ab0738b920
    GetModel must be public Ignasi 2025-03-05 21:40:05 +0100
  • c46c97c14d
    Simplify getModels Ignasi 2025-03-05 20:13:14 +0100
  • a83b8e49bf
    Applies gofumpt Ignasi 2025-03-05 16:18:26 +0100
  • d6423dfae7
    Fixes e2e Ignasi 2025-03-05 15:05:44 +0100
  • b69d84f8aa
    Using model distribution client Ignasi 2025-03-04 22:00:57 +0100
  • f1c25bd9a2
    Use get models from distribution client Ignasi 2025-03-04 13:09:24 +0100
  • 1f90aa3456
    Adds model distribution client Ignasi 2025-03-04 10:41:30 +0100
  • ac5324bd3a
    [AIE-52] inference: add separate completion/embedding backend modes Jacob Howard 2025-03-05 11:52:37 -0700
  • fe715cd9e6
    inference: add routes for a default inference backend Jacob Howard 2025-03-03 15:58:30 -0700
  • dba8db4f8f
    [AIE-41] inference: disable automatic model pulls on inference calls Jacob Howard 2025-03-04 12:27:03 -0700
  • 4403a2a9f9
    inference: hide and disable inference services on unsupported platforms Jacob Howard 2025-02-28 17:49:10 -0700
  • 910f9350f9
    inference: disable pulls on Windows pending docker/model-distribution Jacob Howard 2025-02-28 11:16:50 -0700
  • 9abc853ec3
    inference: Bump llama.cpp runtime to 0.0.0-experimental2 Piotr Stankiewicz 2025-02-28 08:38:10 +0100
  • 3201fb5049
    inference: Update telemetry Dorin Geman 2025-02-27 16:28:19 +0200
  • 7a93a6e3db
    inference: disable llama.cpp installs on unsupported platforms Jacob Howard 2025-02-27 09:31:00 -0700
  • 7c351f6aa0
    inference: add minor optimization to loader Jacob Howard 2025-02-27 04:32:38 -0700
  • 5c7f902bfe
    inference/llamacpp: Remove socket before starting Dorin Geman 2025-02-27 13:15:27 +0200
  • ae9d65a364
    inference: Installer only runs once Dorin Geman 2025-02-27 13:14:35 +0200
  • 0f6f2f863c
    inference: Tiny loader fix Dorin Geman 2025-02-27 13:13:37 +0200
  • 7f7aa129fa
    inference/llamacpp: Remove debug log Dorin Geman 2025-02-27 11:03:02 +0200
  • 69971fb598
    inference: two small fixes to the scheduler Jacob Howard 2025-02-27 03:35:21 -0700
  • 348f46991c
    inference: wire up model deletion endpoint Jacob Howard 2025-02-26 16:45:42 -0700
  • d6b1191a01
    inference: refactor scheduler to a more modular design Jacob Howard 2025-02-26 16:03:32 -0700
  • a14517d6bf
    inference: Add stub for llama.cpp backend Dorin Geman 2025-02-26 19:30:58 +0200
  • 8dd1f8dbce
    inference/scheduler: Cancel backend context to avoid leaks Dorin Geman 2025-02-25 15:32:27 +0200
  • c8a97ae68d
    inference: Require "model" field for completion or embedding Dorin Geman 2025-02-25 14:27:20 +0200
  • 842ce2ddbb
    inference: Handle /v1/completions Dorin Geman 2025-02-25 14:09:31 +0200
  • 450b828845
    inference: Register /models/{namespace}/{name} Dorin Geman 2025-02-25 13:55:03 +0200
  • f8cdbc4d81
    inference: refactor service and implement scheduling mechanism Jacob Howard 2025-02-24 20:05:35 -0700
  • 21e10c378a
    inference: move to modular backend structure and implement stubs Jacob Howard 2025-02-19 15:16:30 -0700