Commit Graph

  • 257c9cccd4
    Merge pull request #2 from thaJeztah/migrate_containerd_v2 Dorin-Andrei Geman 2025-04-08 12:03:32 +0300
  • 795693e560
    pkg/internal/dockerhub: migrate to containerd v2 Sebastiaan van Stijn 2025-04-07 17:24:27 +0200
  • 473b9df71e
    llama.cpp: Pass server storage paths as parameters Dorin Geman 2025-04-02 17:29:10 +0300
  • dccbd7ee30
    llama.cpp: Pass server storage paths as parameters (dorin.geman/llamacpp-paths) Dorin Geman 2025-04-02 17:29:10 +0300
  • 184f9c426b
    inference: remove platform support detection Jacob Howard 2025-03-28 18:26:04 -0600
  • 36ae1e3b30
    inference: adjust for lack of logger and paths packages Jacob Howard 2025-03-28 17:41:41 -0600
  • 10a7de56cd
    deps: update go.mod and add go.sum Jacob Howard 2025-03-28 16:03:03 -0600
  • 95ad19a481
    deps: vendor utility dependencies Jacob Howard 2025-03-28 15:23:38 -0600
  • bd68cb37ef
    chore: add go.mod Jacob Howard 2025-03-28 15:23:56 -0600
  • f7cec84173
    deps: remove errordef package references Jacob Howard 2025-03-28 15:09:10 -0600
  • eab81f859f
    inference: use system proxy and enforce RAM for model pulls Jacob Howard 2025-03-25 14:53:04 -0600
  • 1d903ff29d
    inference: Retry backend install if it failed due to context.Canceled Dorin Geman 2025-03-22 12:11:31 +0200
  • fbf1e0f579
    inference: Add temporary mechanism for dynamic installation Dorin Geman 2025-03-22 10:48:37 +0200
  • 97bc6085e6
    List models following OpenAI API spec Ignasi 2025-03-20 13:51:35 +0100
  • eef6a7df7c
    flusher can't be nil at this point Ignasi 2025-03-19 18:27:53 +0100
  • 9232cb2634
    Ensure flusher is not nil Ignasi 2025-03-19 17:13:58 +0100
  • 0ea001602d
    Error handling when pulling models: handle invalid reference; handle model not found Ignasi 2025-03-19 14:17:09 +0100
  • 2ddd3a57d8
    Update usages of model-distribution Ignasi 2025-03-18 23:37:52 +0100
  • 4c46f589d7
    inference: Supported on darwin/arm64 only Dorin Geman 2025-03-14 11:13:21 +0200
  • 2bbd26fdeb
    Move prefix paths to inference package Ignasi 2025-03-12 11:36:36 +0100
  • 60bc8a5641
    Add models prefix to fix telemetry Ignasi 2025-03-12 10:47:21 +0100
  • 348832f7d0
    Removes inference prefix from model manager related endpoints; adds model_manager_test.go; removes json suffix from path Ignasi 2025-03-11 22:32:22 +0100
  • 15afbbac49
    Extract to common variable to not repeat the prefix Ignasi 2025-03-11 13:12:48 +0100
  • 7878bc7c69
    From ml.docker.internal to model-runner.docker.internal Ignasi 2025-03-10 16:50:51 +0100
  • 55843c8685
    Adds --jinja to provide tool calling support Ignasi 2025-03-12 14:11:30 +0100
  • b33e90ec75
    inference: models: Fix download progress streaming Dorin Geman 2025-03-07 13:09:01 +0200
  • 4865beb2c3
    Fix IT Ignasi 2025-03-06 17:09:02 +0100
  • 7a1cf900d1
    fix format Ignasi 2025-03-06 16:03:22 +0100
  • 70e59ffd36
    Potential fix for code scanning alert no. 459: Reflected cross-site scripting Ignasi 2025-03-06 15:54:51 +0100
  • 1c24c91ea9
    Sort imports Ignasi 2025-03-06 15:38:36 +0100
  • c5904eaec9
    Show progress on pulling to caller Ignasi 2025-03-06 15:33:31 +0100
  • 837da6d3a7
    Fix type Ignasi 2025-03-06 15:24:31 +0100
  • e64f4a9343
    Removes repeated types Ignasi 2025-03-06 14:03:46 +0100
  • 2b22896c95
    Handle case where distributionClient can not be initialized Ignasi 2025-03-06 13:47:26 +0100
  • c7dd57bbc2
    Logs pull progress Ignasi 2025-03-06 13:39:39 +0100
  • 99af02a4f9
    No need to use different implementations for pull models between win/unix for now Ignasi 2025-03-06 12:10:37 +0100
  • ab0738b920
    GetModel must be public Ignasi 2025-03-05 21:40:05 +0100
  • c46c97c14d
    Simplify getModels Ignasi 2025-03-05 20:13:14 +0100
  • a83b8e49bf
    Applies gofumpt Ignasi 2025-03-05 16:18:26 +0100
  • d6423dfae7
    Fixes e2e Ignasi 2025-03-05 15:05:44 +0100
  • b69d84f8aa
    Using model distribution client Ignasi 2025-03-04 22:00:57 +0100
  • f1c25bd9a2
    Use get models from distribution client Ignasi 2025-03-04 13:09:24 +0100
  • 1f90aa3456
    Adds model distribution client Ignasi 2025-03-04 10:41:30 +0100
  • ac5324bd3a
    [AIE-52] inference: add separate completion/embedding backend modes Jacob Howard 2025-03-05 11:52:37 -0700
  • fe715cd9e6
    inference: add routes for a default inference backend Jacob Howard 2025-03-03 15:58:30 -0700
  • dba8db4f8f
    [AIE-41] inference: disable automatic model pulls on inference calls Jacob Howard 2025-03-04 12:27:03 -0700
  • 4403a2a9f9
    inference: hide and disable inference services on unsupported platforms Jacob Howard 2025-02-28 17:49:10 -0700
  • 910f9350f9
    inference: disable pulls on Windows pending docker/model-distribution Jacob Howard 2025-02-28 11:16:50 -0700
  • 9abc853ec3
    inference: Bump llama.cpp runtime to 0.0.0-experimental2 Piotr Stankiewicz 2025-02-28 08:38:10 +0100
  • 3201fb5049
    inference: Update telemetry Dorin Geman 2025-02-27 16:28:19 +0200
  • 7a93a6e3db
    inference: disable llama.cpp installs on unsupported platforms Jacob Howard 2025-02-27 09:31:00 -0700
  • 7c351f6aa0
    inference: add minor optimization to loader Jacob Howard 2025-02-27 04:32:38 -0700
  • 5c7f902bfe
    inference/llamacpp: Remove socket before starting Dorin Geman 2025-02-27 13:15:27 +0200
  • ae9d65a364
    inference: Installer only runs once Dorin Geman 2025-02-27 13:14:35 +0200
  • 0f6f2f863c
    inference: Tiny loader fix Dorin Geman 2025-02-27 13:13:37 +0200
  • 7f7aa129fa
    inference/llamacpp: Remove debug log Dorin Geman 2025-02-27 11:03:02 +0200
  • 69971fb598
    inference: two small fixes to the scheduler Jacob Howard 2025-02-27 03:35:21 -0700
  • 348f46991c
    inference: wire up model deletion endpoint Jacob Howard 2025-02-26 16:45:42 -0700
  • d6b1191a01
    inference: refactor scheduler to a more modular design Jacob Howard 2025-02-26 16:03:32 -0700
  • a14517d6bf
    inference: Add stub for llama.cpp backend Dorin Geman 2025-02-26 19:30:58 +0200
  • 8dd1f8dbce
    inference/scheduler: Cancel backend context to avoid leaks Dorin Geman 2025-02-25 15:32:27 +0200
  • c8a97ae68d
    inference: Require "model" field for completion or embedding Dorin Geman 2025-02-25 14:27:20 +0200
  • 842ce2ddbb
    inference: Handle /v1/completions Dorin Geman 2025-02-25 14:09:31 +0200
  • 450b828845
    inference: Register /models/{namespace}/{name} Dorin Geman 2025-02-25 13:55:03 +0200
  • f8cdbc4d81
    inference: refactor service and implement scheduling mechanism Jacob Howard 2025-02-24 20:05:35 -0700
  • 21e10c378a
    inference: move to modular backend structure and implement stubs Jacob Howard 2025-02-19 15:16:30 -0700