Jacob Howard
ed476dcbb8
chore: code review suggestions and go mod tidy
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-04-17 13:59:32 -06:00
Dorin Geman
e5d5ccf2dd
Add Status to Backend interface
Signed-off-by: Dorin Geman <dorin.geman@docker.com>
2025-04-17 19:11:20 +02:00
Jacob Howard
ac5324bd3a
[AIE-52] inference: add separate completion/embedding backend modes
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:08 -06:00
Jacob Howard
fe715cd9e6
inference: add routes for a default inference backend
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:08 -06:00
Jacob Howard
d6b1191a01
inference: refactor scheduler to a more modular design
This new design will allow for concurrent runner operation (eventually)
on systems that support it.
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:06 -06:00
Jacob Howard
f8cdbc4d81
inference: refactor service and implement scheduling mechanism
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:05 -06:00
Jacob Howard
21e10c378a
inference: move to modular backend structure and implement stubs
Signed-off-by: Jacob Howard <jacob.howard@docker.com>
2025-03-28 17:53:00 -06:00