Daniel J Walsh
3f8e31a073
Merge pull request #1714 from slp/install-virglrenderer
...
container-images: add virglrenderer to vulkan
2025-07-19 06:35:54 -04:00
Daniel J Walsh
08722738cf
Merge pull request #1718 from containers/konflux/references/main
...
Update Konflux references
2025-07-19 06:34:54 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
ab7adbb430
Update Konflux references
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-19 08:03:10 +00:00
Mike Bonnet
72504179fc
Merge pull request #1716 from containers/fix-sentencepiece-build
...
build_rag.sh: install cmake
2025-07-18 10:54:06 -07:00
Mike Bonnet
dcfeee8538
build_rag.sh: install cmake
...
cmake is required to build sentencepiece.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-18 09:17:33 -07:00
Daniel J Walsh
1d903e746c
Merge pull request #1677 from containers/vllm-cpu
...
Add vllm to cpu inferencing Containerfile
2025-07-18 06:05:26 -04:00
Daniel J Walsh
13a22f6671
Merge pull request #1708 from containers/konflux-more-images
...
konflux: add pipelines for asahi, cann, intel-gpu, llama-stack, musa, openvino, and ramalama-cli
2025-07-18 06:04:12 -04:00
Daniel J Walsh
1d6aa51cd7
Merge pull request #1712 from tonyjames/main
...
Add support for Intel Iris Xe Graphics (46AA, 46A6, 46A8)
2025-07-18 06:03:34 -04:00
Tony James
50d01f177b
Add support for Intel Iris Xe Graphics (46AA, 46A6, 46A8)
...
Signed-off-by: Tony James <3128081+tonyjames@users.noreply.github.com>
2025-07-17 18:58:07 -04:00
Eric Curtin
234134b5cc
Add vllm to cpu inferencing Containerfile
...
To be built upon "ramalama" image
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-17 21:09:20 +01:00
Daniel J Walsh
64ca9cfb4a
Merge pull request #1709 from containers/fix-cuda-gpu
...
fix GPU selection and pytorch URL when building rag images
2025-07-17 11:31:41 -04:00
Eric Curtin
e3dda75ec6
Merge pull request #1707 from rhatdan/install
...
README: remove duplicate statements
2025-07-17 15:57:12 +01:00
Daniel J Walsh
075df4bb87
Merge pull request #1617 from jwieleRH/check_nvidia
...
Improve NVIDIA GPU detection.
2025-07-17 06:29:40 -04:00
Daniel J Walsh
5b46b23f2e
README: remove duplicate statements
...
Simplify ramalama's top-level description. Remove the duplicate
statements.
Also make sure all references to PyPI are spelled this way.
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-07-17 06:26:55 -04:00
Daniel J Walsh
1fe1b20c8c
Merge pull request #1711 from carlwgeorge/include-config-in-wheel
...
Included ramalama.conf in wheel
2025-07-17 06:21:47 -04:00
Mike Bonnet
f5512c8f65
build_rag.sh: install sentencepiece via pip
...
python3-sentencepiece was pulling in an older version of protobuf.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 23:27:54 -07:00
Mike Bonnet
7132d5a7f8
build_rag.sh: disable pip cache
...
pip's caching behavior was causing errors when downloading huge (4.5G) torch wheels during
the rocm-ubi-rag build.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 23:27:54 -07:00
Mike Bonnet
2d3f8dfe28
fix GPU selection and pytorch URL when building rag images
...
A previous commit changed the second argument to add_rag() from the image name to the
full repo path. Update the case statement accordingly, so the "GPU" variable is set correctly.
The "cuda" directory is no longer available on download.pytorch.org. When building for cuda,
pull wheels from the "cu128" directory, which contains binaries built for CUDA 12.8.
When building rocm* images, download binaries from the "rocm6.3" directory, which are built
for ROCm 6.3.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 23:27:54 -07:00
Carl George
1d8a2e5b6c
Included ramalama.conf in wheel
...
Currently other data files such as shortnames.conf, man pages, and shell
completions are included in the Python wheel. Including ramalama.conf
as well means we can avoid several calls to make in the RPM spec file,
instead relying on the wheel mechanisms to put these files in place. As
long as `make docs` is run before the wheel generation, all the
necessary files are included.
Signed-off-by: Carl George <carlwgeorge@gmail.com>
2025-07-17 01:20:28 -05:00
Eric Curtin
42ac787686
Merge pull request #1710 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752625787
2025-07-17 01:15:43 +01:00
red-hat-konflux-kflux-prd-rh03[bot]
18c560fff6
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752625787
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-17 00:03:57 +00:00
John Wiele
ce35ccb4c3
Remove pyYAML as a dependency.
...
Extract information directly from the CDI YAML file by making some
simplifying assumptions instead of doing a complete YAML parse.
Default to all devices known to nvidia-smi.
Fix the signature of check_nvidia().
Remove some debug logging.
Signed-off-by: John Wiele <jwiele@redhat.com>
2025-07-16 16:11:39 -04:00
John Wiele
b97177b408
Apply suggestions from code review
...
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Signed-off-by: John Wiele <jwiele@redhat.com>
2025-07-16 16:08:32 -04:00
John Wiele
14c4aaca39
Improve NVIDIA GPU detection.
...
Allow GPUs to be specified by UUID as well as index since the index is
not guaranteed to persist across reboots.
Crosscheck requested GPUs with nvidia-smi and CDI configuration. If
any requested GPUs lack corresponding CDI configuration, print a
message with a pointer to documentation.
If the only GPU specified in the CDI configuration is "all", as
appears to be the case on WSL2, use "all" as the default.
Add an optional encoding argument to run_cmd() to facilitate checking
the output of the command.
Add pyYAML as a dependency for parsing the CDI configuration.
Signed-off-by: John Wiele <jwiele@redhat.com>
2025-07-16 16:08:32 -04:00
Mike Bonnet
bf4fd56106
konflux: add pipelines for asahi, cann, intel-gpu, llama-stack, musa, openvino, and ramalama-cli
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 12:23:12 -07:00
Mike Bonnet
1373a8e7ba
konflux: don't trigger pipelines on PR transition to "Ready for Review"
...
By default, Konflux triggers new pipelines when a PR moves from Draft to
"Ready for Review". Because the commit SHA hasn't changed, no new builds
are performed. However, a new integration test is also triggered, and because
no builds were performed it is unable to find the URL and digest of the images,
causing the integration test to fail. Updating the "on-cel-expression" to exclude
the transition to "Ready to Review" avoids the unnecessary pipelines and the
false integration test failures.
Update the whitespace of the "on-cel-expression" in the push pipelines for consistency.
No functional change.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 12:11:54 -07:00
Sergio Lopez
74584d0b5e
container-images: add virglrenderer to vulkan
...
When running in a krun-isolated container, we need
"/usr/libexec/virgl_render_server" to be present in the container
image to launch it before entering the microVM.
Install the virglrenderer package in addition to mesa-vulkan-drivers.
Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-07-16 18:53:11 +02:00
Daniel J Walsh
4dea2ee02f
Merge pull request #1687 from containers/konflux-cuda-arm64
...
konflux: build cuda on arm64, and simplify testing
2025-07-16 12:01:45 -04:00
Mike Bonnet
069e98c095
fix unit tests to be independent of environment
...
Setting RAMALAMA_IMAGE would cause some unit tests to fail. Make those
tests independent of the calling environment.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 06:44:00 -07:00
Mike Bonnet
f57b8eb284
konflux: copy source into the bats image
...
Including the source in the bats image ensures that we're always testing with the same
version of the code that was used to build the images. It also eliminates the need for
repeated checkouts of the repo and simplifies testing, avoiding additional volumes and
artifact references.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 06:44:00 -07:00
Mike Bonnet
299d3b9b75
konflux: build cuda and layered images on arm64
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 06:44:00 -07:00
Stephen Smoogen
683b8fb8a0
Minor fixes to rpm builds by packit and spec file. ( #1704 )
...
* This removes epel9 from packit rules as epel9 does not currently
build without many additional packages added to the distro.
* This fixes a breakage in epel10 by adding mailcap as a buildrequires.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
Co-authored-by: Stephen Smoogen <ssmoogen@redhat.com>
2025-07-16 09:37:00 -04:00
Mike Bonnet
64e22ee0aa
Merge pull request #1700 from containers/test-optimization-and-fixup
...
reduce unnecessary image pulls during testing, and re-enable a couple tests
2025-07-15 11:34:59 -07:00
Mike Bonnet
651fc503bd
implement "ps --noheading" for docker using --format
...
"docker ps" does not support the "--noheading" option. Use the --format
option to emulate the behavior.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-15 10:32:53 -07:00
Daniel J Walsh
384cad7161
Merge pull request #1696 from containers/renovate/quay.io-konflux-ci-build-trusted-artifacts-latest
...
chore(deps): update quay.io/konflux-ci/build-trusted-artifacts:latest docker digest to f7d0c51
2025-07-15 13:17:33 -04:00
Daniel J Walsh
3dec0d7487
Merge pull request #1699 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752587049
2025-07-15 13:16:27 -04:00
Daniel J Walsh
d7763ad1c5
Merge pull request #1698 from containers/mistral
...
Mistral should point to lmstudio gguf
2025-07-15 13:15:11 -04:00
Mike Bonnet
b550cc97d2
bats: re-enable a couple tests, and minor cleanup
...
Fix the "serve and stop" test by passing the correct (possibly random) port to "ramalama chat".
Fix the definition of "ramalama_runtime".
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-15 09:03:49 -07:00
Mike Bonnet
927d2f992a
bats: allow the container to use the overlay driver when possible
...
Remove the STORAGE_DRIVER env var from the container so it doesn't force use
of the vfs driver in all cases.
Mount /dev/fuse into the container when running locally.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-15 09:03:49 -07:00
Mike Bonnet
f176bb3926
add a dryrun field to Config, and set it early
...
accel_image() is called to set option defaults, before options are even parsed.
This can cause images to be pulled even if they will not actually be used, slowing
down testing and making the cli less responsive. Set the "dryrun" option before
the first call to accel_image() to avoid unnecessary image pulls.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-15 09:03:49 -07:00
renovate[bot]
f38c736d23
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752587049
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-07-15 15:47:38 +00:00
Eric Curtin
fa2f485175
Mistral should point to lmstudio gguf
...
I don't know who MaziyarPanahi is, but I know who lmstudio are
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-15 15:04:54 +01:00
Mike Bonnet
f8c41b38c1
avoid unnecessary image pulls
...
Don't pull images in _get_rag() and _get_source_model() if pull == "never"
or if running with "--dryrun".
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-14 14:42:21 -07:00
renovate[bot]
b7323f7972
chore(deps): update quay.io/konflux-ci/build-trusted-artifacts:latest docker digest to f7d0c51
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-07-14 20:13:34 +00:00
Daniel J Walsh
53e38dea8f
Merge pull request #1694 from rhatdan/VERSION
...
Bump to 0.11.0
2025-07-14 10:59:06 -04:00
Daniel J Walsh
bf68cfddd3
Bump to 0.11.0
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-14 10:32:32 -04:00
Stephen Smoogen
8ab242f820
Move rpms ( #1693 )
...
* Start adding rpm/ramalama.spec for Fedora
Add a ramalama.spec to sit next to python-ramalama.spec while we get
this reviewed. Change various configs so they are aware of
ramalama.spec
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Add needed obsoletes/provides in base rpm to start process.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Try to fix CI problems with initial mr
The initial MR puts two spec files in the same directory which was
causing problems with the CI. This splits them off into different
directories which should allow for the tooling to work.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Finish move of Fedora rpm package to new name.
Put changes into various files needed to allow for new RPM package
`ramalama` to build in Fedora infrastructure versus python3-ramalama.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Fix problem with path names lsm5 caught
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
---------
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
Co-authored-by: Stephen Smoogen <ssmoogen@redhat.com>
2025-07-14 10:13:49 -04:00
Daniel J Walsh
eba46c8df6
Merge pull request #1691 from mbortoli/readme-improvements
...
Readme improvements: Update model's name and improve CUDA_VISIBLE_DEVICES section
2025-07-14 10:03:20 -04:00
Mario Antonio Bortoli Filho
b5826c96e9
README: fix model name and improve CUDA section
...
- Corrected the model name under the Benchmark section; previous name was not available in Ollama's registry.
- Added instructions to switch between CPU-only mode and using all available GPUs via CUDA_VISIBLE_DEVICES.
Signed-off-by: Mario Antonio Bortoli Filho <mario@bortoli.dev>
2025-07-14 09:43:16 -03:00
Daniel J Walsh
066b659f3a
Merge pull request #1689 from containers/pip-install
...
Only install if pyproject.toml exists
2025-07-14 06:07:24 -04:00
Eric Curtin
6d7effadc2
Only install if pyproject.toml exists
...
Otherwise skip
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-13 22:13:11 +01:00
Daniel J Walsh
1d2e1a1e01
Merge pull request #1688 from containers/konflux/references/main
...
chore(deps): update konflux references
2025-07-12 06:06:35 -04:00
Daniel J Walsh
a54e2b78c4
Merge pull request #1681 from ramalama-labs/bug/chat-fix
...
Bug/chat fix
2025-07-12 06:05:31 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
f4cec203ac
chore(deps): update konflux references
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-12 08:02:23 +00:00
Ian Eaves
a616005695
resolve merge conflicts
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-07-11 16:47:56 -05:00
Daniel J Walsh
c7c0f7d2e5
Merge pull request #1685 from rhatdan/convert
...
Allow `ramalama rag` to output different formats
2025-07-11 16:18:19 -04:00
Daniel J Walsh
b630fcdea2
Allow ramalama rag to output different formats
...
Add ramalama rag --format option to allow outputing
of markdown, json as well as qdrant databases.
This content can then be used as input to the client tool.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Co-authored-by: Ian Eaves <ian.k.eaves@gmail.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-11 11:04:32 -04:00
Daniel J Walsh
027f88cf31
Merge pull request #1683 from containers/konflux-integration
...
konflux: add integration tests that run in multi-arch VMs
2025-07-10 15:05:03 -04:00
Mike Bonnet
d7ed2216dd
konflux: build entrypoint images on smaller instances
...
The entrypoint image builds are very lightweight, use smaller instances
to reduce resource consumption.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-10 09:19:16 -07:00
Mike Bonnet
6d9a7eea9e
konflux: build rag images on instance types with more disk space
...
-rag builds were failing due to the 40G disk filling up. Run builds on
newly-available "d160" instance types which have 160G of disk space
available.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-10 09:19:16 -07:00
Mike Bonnet
5ebc48f453
konflux: add integration tests that run in multi-arch VMs
...
The integration tests will be triggered after all image builds associated with a single
commit are complete. Tests are currently being run on amd64 and arm64 platforms.
Remove "bats-nocontainer" from the build-time tests, since those are covered by "bats" run
in the integration tests.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-10 09:19:16 -07:00
Mike Bonnet
b6cb2fdbe2
konflux: build all ramalama layered images on arm64
...
Some bats tests need the ramalama-rag image avilable for the current arch. Build
all the ramalama layered images on arm64 as well as amd64.
Switch to building on larger VM instance types to reduce build times and improve
developer feedback and experience.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-10 06:44:58 -07:00
Mike Bonnet
f75599097e
bats: ignore RAMALAMA_IMAGE from the calling environment
...
Some tests parse the output of the ramalama cli and hard-code the location of the expected
default image. However, this output changes based on the value of the RAMALAMA_IMAGE
environment variable, and setting this variable in the calling environment can cause those
tests to fail. Unset the RAMALAMA_IMAGE environment variable in these tests to avoid false failures.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-10 06:44:58 -07:00
Daniel J Walsh
80317bffbc
Merge pull request #1684 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752069608
2025-07-10 07:45:47 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
124afc14bb
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752069608
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-10 00:02:12 +00:00
Daniel J Walsh
79b23e1237
Merge pull request #1668 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1752069608
2025-07-09 16:04:00 -04:00
Daniel J Walsh
5fd301532c
Merge pull request #1679 from containers/bugfix-for-chat
...
Bugfix for chat
2025-07-09 16:03:14 -04:00
Daniel J Walsh
64d53180fd
Merge pull request #1680 from nathan-weinberg/bump-er
...
chore: bump ramalama-stack to 0.2.5
2025-07-09 16:01:09 -04:00
Daniel J Walsh
c0278c1b8c
Merge pull request #1676 from rhatdan/selinux
...
Enable SELinux separation
2025-07-09 16:00:35 -04:00
Nathan Weinberg
e402a456cf
chore: bump ramalama-stack to 0.2.5
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-09 15:42:33 -04:00
Eric Curtin
3da38bc7b8
Bugfix for chat
...
This was recently removed:
+ if getattr(self.args, "model", False):
+ data["model"] = self.args.model
it is required
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-09 20:13:09 +01:00
Daniel J Walsh
980179d5ca
Enable SELinux separation
...
Remove some unused functions from model.py
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-09 13:11:58 -04:00
Daniel J Walsh
657bacb52e
Merge pull request #1675 from rhatdan/image
...
Hide --container option, having --container/--nocontainer is confusing
2025-07-09 09:55:44 -04:00
Daniel J Walsh
09c6ccb2f0
Hide --container option, having --container/--nocontainer is confusing
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-09 06:58:04 -04:00
Daniel J Walsh
7f09d4bf5b
Merge pull request #1643 from engelmi/enhance-ref-file
...
Enhance ref file and mount all snapshot files to container
2025-07-08 13:37:45 -04:00
Michael Engel
7a6c9977f7
Disable generate and serve OCI image test
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:36:39 +02:00
Michael Engel
def6116f15
Add deduplication check by file hash to update_snapshot
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
5e39e11678
Remove limitation of only one model file per snapshot
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
21e42fc837
Move split model logic to url model class
...
The split model feature was exclusive to URL models. Because of this - and the
improvements in mounting all model snapshot files - the logic has been removed
from the ModelFactory and put to the URL model class.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
eefafe24fd
Refactor model classes to mount all snapshot files instead of explicit ones
...
Previously, the model, mmproj and chat template files were mounted explicity if
present using many if-exists checks. Relying on the new ref file all files of that
model snapshot are either mounted or used directly with its blob path. When mounted
into a container, the files are put into MNT_DIR with the respective file names.
The split_model part has been dropped for now, but will be refactored in the next
commit.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
6cbaf692aa
Remove obsolete glob check if model exists
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
129ee175d6
Fixed using model_store instead of store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
8e98c77f54
Removed unused functio
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
e398941913
Remove unused garbage collection function
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
d95bd13ca0
Replace RefFile by RefJSONFile in model store
...
Replacing the use of RefFile with the new RefJSONFile in model store. It also adds
support for adhoc migration of old to new ref file format.
This will break ramalama as is since no specific functionality for getting the explicit
(gguf) model file path has been implemented. Will be adjusted in the next commit to
fix this.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
496439ea02
Added new ref file format
...
Added a new, simpler ref file format serialized as JSON. It also gets additional
fields such as the hash of the file that is used as the name of the blob file.
This essentially makes the snapshot directory and all symlinks obsolete, further
simplifying the storage and improving stability. It also leads to the ref file as
being the single source for all files of a model.
Further refactoring, incl. swapping and migrating from the old to new format, will
follow in subsequent commits.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Daniel J Walsh
bf0af8034a
Merge pull request #1673 from containers/mlx-fixes
...
mlx fixes
2025-07-08 11:51:26 -04:00
Daniel J Walsh
99f56a7684
Merge pull request #1669 from rhatdan/image
...
move --image & --keep-groups to run, serve, perplexity, bench commands
2025-07-08 11:49:07 -04:00
Eric Curtin
5b20aa4e2c
mlx fixes
...
mlx_lm.server is the only one in my path at least on my system.
Also, printing output like this which doesn't make sense:
Downloading huggingface://RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic/model.safetensors:latest ...
Trying to pull huggingface://RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic/model.safetensors:latest ...
Also remove recommendation to install via `brew install ramalama`, skips installing Apple specific
dependancies.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-08 15:57:25 +01:00
Eric Curtin
957dfd52e7
Merge pull request #1672 from containers/revert-1671-set-ramalama-stack-version
...
Revert "feat: allow for dynamic version installing of ramalama-stack"
2025-07-08 14:54:23 +01:00
Eric Curtin
ebb8ea93fd
Revert "feat: allow for dynamic version installing of ramalama-stack"
2025-07-08 14:53:25 +01:00
Daniel J Walsh
7dc3d9da8e
move --image & --keep-groups to run, serve, perplexity, bench commands
...
This eliminates the need for pulling images by accident when not
using containers. Since these commands are only used for container
commands, no need for them in other places.
Fixes: https://github.com/containers/ramalama/issues/1662
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-08 09:09:41 -04:00
Daniel J Walsh
72aa795b17
Merge pull request #1666 from engelmi/inspect-add-safetensor-support
...
Inspect add safetensor support
2025-07-08 08:54:15 -04:00
Eric Curtin
2fea5f86f6
Merge pull request #1671 from nathan-weinberg/set-ramalama-stack-version
...
feat: allow for dynamic version installing of ramalama-stack
2025-07-08 13:39:00 +01:00
Michael Engel
412d5616d3
Catch error on creating snapshot and log error
...
Relates to: https://github.com/containers/ramalama/issues/1663
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 12:57:12 +02:00
Michael Engel
3b880923c0
Added support for safetensors to inspect command
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 12:57:12 +02:00
Daniel J Walsh
b7c15ce86a
Merge pull request #1664 from containers/konflux/references/main
...
chore(deps): update konflux references
2025-07-08 06:48:46 -04:00
Daniel J Walsh
87287ae574
Merge pull request #1670 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1751897624
2025-07-08 06:43:49 -04:00
Nathan Weinberg
eeaab7276c
feat: allow for dynamic version installing of ramalama-stack
...
previously we were setting an explicit version of `ramalama-stack`
in the Containerfile restricting what we used at runtime
moved the install to the entrypoint script and allowed the use of
the RAMALAMA_STACK_VERSION env var to install a specific version
(default with no env var installs the latest package and pulls the
YAML files from the main branch)
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-07 21:38:30 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
8104b697dd
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1751897624
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-08 00:05:41 +00:00
renovate[bot]
eacaffe03d
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1751897624
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-07-07 19:49:58 +00:00
Daniel J Walsh
21957b22c2
Merge pull request #1661 from ramalama-labs/feat/vision
...
Adds the ability to include vision based context to chat via --rag
2025-07-07 09:36:52 -04:00
Daniel J Walsh
cd7220a3ea
Merge pull request #1667 from rhatdan/VERSION
...
Bump to v0.10.1
2025-07-07 08:19:41 -04:00
Daniel J Walsh
fe3731dffc
Bump to v0.10.1
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-07 06:52:23 -04:00
Eric Curtin
0947e11f13
Merge pull request #1665 from rhatdan/pull
...
Make sure errors and progress messages go to STDERR
2025-07-06 14:58:38 +01:00
Daniel J Walsh
ab4d0f2202
Make sure errors and progress messages go to STDERR
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-06 07:12:14 -04:00
Daniel J Walsh
c62a2a4e5b
Move download_file to http_client rather then common
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-06 06:52:51 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
ee8d7a3a04
chore(deps): update konflux references
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-05 08:02:20 +00:00
Daniel J Walsh
c9f9f691aa
Merge pull request #1642 from kush-gupt/feat/mlx
...
MLX runtime support
2025-07-04 05:53:41 -04:00
Ian Eaves
fe2d22c848
renamed tests + lint
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-07-03 17:47:02 -05:00
Ian Eaves
cba091b265
adds vision to chat
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-07-03 17:37:11 -05:00
Kush Gupta
bc92481a66
Fix API request
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-07-03 10:18:26 -04:00
Eric Curtin
149c9f101c
Merge pull request #1660 from telemaco/pre-commit-conf
...
Add .pre-commit-config.yaml
2025-07-03 14:54:26 +01:00
Eric Curtin
06488b45f1
Merge pull request #1637 from rhatdan/store
...
Always use absolute path for --store option
2025-07-03 14:33:44 +01:00
Eric Curtin
4482803eb2
Merge pull request #1657 from containers/konflux-layered-images
...
konflux: add pipelines for the layered images of ramalama, cuda, rocm, and rocm-ubi
2025-07-03 14:32:37 +01:00
Kush Gupta
277cb4f504
make sure host is not in container, dont care about llama.cpp args
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-07-03 09:27:15 -04:00
Kush Gupta
d77b7ce231
mlx runtime with client/server
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-07-03 09:27:00 -04:00
Roberto Majadas
5a51552d1f
Add pre-commit configuration
...
Signed-off-by: Roberto Majadas <rmajadas@redhat.com>
2025-07-03 14:40:49 +02:00
Daniel J Walsh
8501240d43
Merge pull request #1659 from telemaco/lint-and-format-conf-updates
...
Update lint and format tools configuration
2025-07-03 07:12:52 -04:00
Eric Curtin
c791ac1602
Merge pull request #1658 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1751445649
2025-07-03 11:45:32 +01:00
Daniel J Walsh
689955480c
Always use absolute path for --store option
...
Fixes: https://github.com/containers/ramalama/issues/1634
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-03 06:38:57 -04:00
Roberto Majadas
e5e6195c49
Update lint and format tools configuration
...
Signed-off-by: Roberto Majadas <rmajadas@redhat.com>
2025-07-03 12:32:26 +02:00
Daniel J Walsh
ae38e3f09c
Merge pull request #1632 from ramalama-labs/feat/user-prompt-configs
...
Adds a user configuration setting to disable gpu prompting
2025-07-03 06:30:52 -04:00
Daniel J Walsh
c32d67fd4e
Merge pull request #1635 from containers/list-models
...
Add command to list available models
2025-07-03 06:23:35 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
9c43c0ba71
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1751445649
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-03 00:03:59 +00:00
Ian Eaves
27fa3909a3
adds user prompt controls
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-07-02 16:58:37 -05:00
Mike Bonnet
0808cf76b9
konflux: add pipelines for the layered images of ramalama, cuda, rocm, and rocm-ubi
...
Build the -llama-server, -whisper-server, and -rag layered images, which inherit from
the existing ramalama, cuda, rocm, and rocm-ubi images.
Layered images use shared Containerfiles, and customize their builds using --build-arg.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-02 12:19:25 -07:00
Eric Curtin
3a61309e10
Add command to list available models
...
With commands such as:
ramalama chat --url https://generativelanguage.googleapis.com/v1beta/openai --ls
we can now list the various models available.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-02 16:52:28 +01:00
Eric Curtin
8dc1144cbd
Merge pull request #1641 from containers/layered-containerfiles
...
build layered images from Containerfiles
2025-07-02 10:19:32 +01:00
Mike Bonnet
46c0154d2a
build layered images from Containerfiles
...
Move the Containerfiles for the entrypoint and rag images out of container_build.sh and into their
own files. This is necessary so they can be built with Konflux.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-01 15:14:26 -07:00
Eric Curtin
e624a41063
Merge pull request #1631 from jbtrystram/fix_quadlet
...
quadlet: add missing privileged options
2025-07-01 22:38:16 +01:00
jbtrystram
412372de9c
quadlet: use shlex to join shell arguments
...
this is to avoid incorrect parsing if arguments contain spaces.
See https://github.com/containers/ramalama/pull/1631#discussion_r2175358681
Signed-off-by: jbtrystram <jbtrystram@redhat.com>
2025-07-01 22:34:06 +02:00
jbtrystram
a3a199664c
quadlet: add missing privileged options
...
The default privileged options were ommited from the generated quadlet
file. Add them using the same argument parsing as in engine.py. [1]
Also add a few base options found in model.py[2] that were missing.
Fixes https://github.com/containers/ramalama/issues/1593
[1] 8341ddcf7b/ramalama/engine.py (L71-L82)
[2] 8341ddcf7b/ramalama/model.py (L205-L223)
Signed-off-by: jbtrystram <jbtrystram@redhat.com>
2025-07-01 22:33:21 +02:00
Daniel J Walsh
58922cd285
Merge pull request #1638 from engelmi/use-config-for-pull-flag-in-accel-image
...
Use config instance for defining pull behavior in accel_image
2025-07-01 14:39:49 -04:00
Daniel J Walsh
5468b1b4c7
Merge pull request #1639 from nathan-weinberg/rlls-0.2.4
...
chore: bump ramalama-stack to 0.2.4
2025-07-01 14:33:15 -04:00
Nathan Weinberg
1dad8284b7
chore: bump ramalama-stack to 0.2.4
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-01 14:28:10 -04:00
Daniel J Walsh
fe756ccf70
Merge pull request #1640 from engelmi/split-model-store-into-files
...
Split the model store into multiple files
2025-07-01 14:24:22 -04:00
Michael Engel
d7ecda282b
Added staticmethod annotation
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-01 17:45:51 +02:00
Michael Engel
3327df7852
Split the model store into multiple files
...
The source code for the model store is getting bigger, so splitting it
into multiple source files under a directory helps keeping it easier
to read.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-01 17:41:34 +02:00
Michael Engel
4a5724e673
Use config instance for defining pull behavior in accel_image
...
By using the pull field in the config instance for the flag to
indicate pulling of the container image should be attempted in
the accel_image function, the behavior is tied to the cli options.
This also prevents a ramalama ls to seemingly block since the
image is downloaded (with no output).
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-01 15:24:54 +02:00
Daniel J Walsh
162e2e5991
Merge pull request #1614 from containers/konflux-tests
...
run tests during build pipelines
2025-07-01 06:55:33 -04:00
Daniel J Walsh
3b11fcf343
Merge pull request #1633 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1751287003
2025-07-01 06:37:26 -04:00
Eric Curtin
34eae809b6
Merge pull request #1620 from olliewalsh/store_delete_refcount
...
Fix modelstore deleting logic when multiple reference refer to the same blob/snapshot
2025-07-01 09:42:46 +01:00
red-hat-konflux-kflux-prd-rh03[bot]
1e346cc083
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1751287003
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-01 00:03:13 +00:00
Oliver Walsh
7b211d0aef
Only remove .parial blob file when the snapshot refcount is 0
...
Previously would always remove this partial blob file.
Note: this assumes the blob hash equals the snapshot hash, which
is only true for repos with a single model
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-06-30 23:30:35 +01:00
Oliver Walsh
80fd6d95fe
Handle existing but broken symlink to snapshot file
...
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-06-30 23:11:41 +01:00
Oliver Walsh
69e0929ca0
Add bats tests for pullling llama.cpp multimodal images
...
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-06-30 23:11:41 +01:00
Oliver Walsh
990a7412e8
Fix modelstore deleting logic
...
When deleting a reference, count the remaining references to the
snapshot/blobs to determine if they should be deleted.
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-06-30 23:11:41 +01:00
Mike Bonnet
8b1d2c03cd
konflux: skip checks on PR builds
...
Most of the checks don't (yet) apply to these images, and they add significant time to the builds.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-30 12:11:13 -07:00
Mike Bonnet
36e55002fe
konflux: set PipelineRun timeouts to 6 hours
...
Container builds and tests can take a long time. We'd rather them eventually complete successfully
than fail with a timeout.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-30 12:11:13 -07:00
Mike Bonnet
ee05ed0586
run tests during build pipelines
...
Use the bats container to run a set of Makefile targets to test the code
and images in parallel.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-30 12:11:13 -07:00
Stephen Smoogen
8341ddcf7b
Start process of moving python-ramalama to ramalama ( #1498 )
...
* Start adding rpm/ramalama.spec for Fedora
Add a ramalama.spec to sit next to python-ramalama.spec while we get
this reviewed. Change various configs so they are aware of
ramalama.spec
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Add needed obsoletes/provides in base rpm to start process.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Try to fix CI problems with initial mr
The initial MR puts two spec files in the same directory which was
causing problems with the CI. This splits them off into different
directories which should allow for the tooling to work.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
---------
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
Co-authored-by: Stephen Smoogen <ssmoogen@redhat.com>
2025-06-30 14:51:29 +01:00
Eric Curtin
afbb01760f
Merge pull request #1628 from rhatdan/host
...
Fix handling of --host option when running in a container
2025-06-30 13:58:46 +01:00
Daniel J Walsh
1270b7fba6
Merge pull request #1629 from rhatdan/VERSION
...
Bump to v0.10.0
2025-06-30 08:31:07 -04:00
Daniel J Walsh
8d054ff751
Bump to v0.10.0
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-30 08:29:27 -04:00
Daniel J Walsh
67b3d6ebba
Merge pull request #1627 from containers/konflux/references/main
...
chore(deps): update konflux references
2025-06-30 05:14:15 -04:00
Daniel J Walsh
bc561d2597
Merge pull request #1570 from ieaves/feat/file-upload
...
Adds the ability to pass files to `ramalama run`
2025-06-29 05:36:00 -04:00
Ian Eaves
1f03de03f8
Add file upload feature
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-28 21:18:16 -05:00
Daniel J Walsh
6b13f497fa
Fix handling of --host option when running in a container
...
When you run a Model server within a container and only wanted it bound
to a certain port, the port binding should happen to the container not
inside of the container.
Fixes: https://github.com/containers/ramalama/issues/1572
Also fix handling of -t option, should not be used with anything other
then run command, and now I am not sure of that.
The LLAMA_PROMPT_PREFIX= environment variable should not be set within
containers as an environment variable, since we are doing chat on the
outside.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-28 11:48:24 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
f5298105e3
chore(deps): update konflux references
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-06-28 08:03:12 +00:00
Daniel J Walsh
7e1d159a3b
Merge pull request #1624 from containers/gemma3n-alias
...
Add gemma aliases
2025-06-27 10:36:26 -04:00
Daniel J Walsh
ca9885ac99
Merge pull request #1623 from containers/bump-llamacpp2
...
Want to pick up support for gemma3n
2025-06-27 10:35:53 -04:00
Eric Curtin
b42eb5762d
Merge pull request #1621 from sarroutbi/202506271328-fix-unit-tests-for-machines-running-gpus
...
Fix unit tests for machines with GPUs
2025-06-27 15:32:46 +01:00
Eric Curtin
089589cdfe
Add gemma aliases
...
The ollama variants are incompatible
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-27 15:28:00 +01:00
Eric Curtin
289e682f2a
Want to pick up support for gemma3n
...
And the other latest and greatest llama.cpp features
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-27 15:06:32 +01:00
Sergio Arroutbi
8ab3ce3f56
Fix test_common to use expected image
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-27 16:03:36 +02:00
Sergio Arroutbi
146a5d011a
Fix quadlet tests to pass on a machine with GPU
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-27 13:44:21 +02:00
Daniel J Walsh
895fb0d1dd
Merge pull request #1588 from rhatdan/llama-stack
...
Fixup to work with llama-stack
2025-06-27 07:22:24 -04:00
Daniel J Walsh
e0108b9d34
Merge pull request #1616 from nathan-weinberg/rlls-0.2.3
...
chore: bump ramalama-stack to 0.2.3
2025-06-27 06:41:28 -04:00
Daniel J Walsh
1c87479aee
Fixes to work with llama-stack
...
Adapt ramalama stack and chat modules for compatibility with llama-stack by updating host binding, argument formatting, and command invocation patterns, and add robust attribute checks in the chat utility.
Bug Fixes:
Add hasattr checks around optional args (pid2kill, name) in chat kills() to prevent attribute errors
Enhancements:
Bind model server to 0.0.0.0 instead of localhost for external accessibility
Convert port, context size, and thread count arguments to strings for consistent CLI usage
Reformat container YAML to use JSON array and multiline args for llama-server and llama-stack commands
Update Containerfile CMD to JSON exec form for llama-stack entrypoint
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-27 05:57:45 -04:00
Eric Curtin
b2cd9dc36e
Merge pull request #1610 from rhatdan/url
...
Fix removing of file based URL models
2025-06-27 08:46:40 +01:00
Eric Curtin
faacef5ea5
Merge pull request #1615 from rhatdan/build
...
Free up disk space for building all images
2025-06-27 08:45:55 +01:00
Eric Curtin
a019b91b8a
Merge pull request #1619 from carlwgeorge/zsh-completions
...
Use standard zsh completion directory
2025-06-27 08:43:47 +01:00
Carl George
10cdbfb28d
Use standard zsh completion directory
...
We're currently using /usr/share/zsh/vendor-completions for zsh
completions. However, the RPM macro %{zsh_completions_dir} (which is
required by the Fedora packaging guidelines) is defined as
/usr/share/zsh/site-functions, so let's switch to that.
https://docs.fedoraproject.org/en-US/packaging-guidelines/ShellCompletions/
Signed-off-by: Carl George <carlwgeorge@gmail.com>
2025-06-27 02:07:37 -05:00
Nathan Weinberg
00a5f084b4
chore: bump ramalama-stack to 0.2.3
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-26 16:13:21 -04:00
Daniel J Walsh
93d23c93e6
Free up disk space for building all images
...
Were using Podman to build images, so don't futz with Docker.
only build base images, not as necessary to build RAG Images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-26 16:06:54 -04:00
Daniel J Walsh
8c2bc88284
Fix removing of file based URL models
...
Currently we are incorrectly reporting file models as
file://PATH as opposed to the correct file:///PATH.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-26 15:44:29 -04:00
Daniel J Walsh
370f1ccc1c
Merge pull request #1611 from ktdreyer/nopull
...
rename "nopull" boolean to "pull"
2025-06-26 14:57:23 -04:00
Daniel J Walsh
c98c3a0cb4
Merge pull request #1612 from containers/konflux-bats
...
konflux: build bats image
2025-06-26 13:27:42 -04:00
Ken Dreyer
f9e6fed54a
rename "nopull" boolean to "pull"
...
Rename "nopull" to "pull" for improved clarity and readability. This
avoids the double-negative, making the logic more straightforward to
reason about. "pull = True" now means "pull the image", "pull = False"
means "don't pull the image."
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2025-06-26 13:06:34 -04:00
Mike Bonnet
7f05324a7a
bats: only install ollama on x86_64
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-26 08:10:45 -07:00
Mike Bonnet
0f4c0fee43
konflux: bats: use shared pipelines
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-26 07:43:59 -07:00
red-hat-konflux-kflux-prd-rh03
27460c5c97
Red Hat Konflux kflux-prd-rh03 update bats
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <konflux@no-reply.konflux-ci.dev>
2025-06-26 14:06:56 +00:00
Daniel J Walsh
aa1e4f1f30
Merge pull request #1603 from slp/pin-copr-mesa
...
container-images: pin mesa version to COPR
2025-06-26 09:38:31 -04:00
Daniel J Walsh
0f90023a52
Merge pull request #1609 from rhatdan/build
...
Separate build image into its own VM
2025-06-26 09:36:59 -04:00
Daniel J Walsh
d4e76d3638
Merge pull request #1598 from containers/bats-container
...
add support for running bats in a container
2025-06-26 09:36:35 -04:00
Eric Curtin
61efb04416
Merge pull request #1605 from rhatdan/chat
...
Switchout hasattr for getattr wherever possible
2025-06-26 14:27:22 +01:00
Eric Curtin
932a1d8c08
Merge pull request #1607 from engelmi/prune-model-store-code
...
Prune model store code
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-26 09:21:03 -04:00
Daniel J Walsh
de46cd16c7
Separate build image into its own VM
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-26 09:12:27 -04:00
Eric Curtin
9140476c7d
Merge pull request #1607 from engelmi/prune-model-store-code
...
Prune model store code
2025-06-26 13:24:09 +01:00
Sergio Lopez
385a992e2b
container-images: pin mesa version to COPR
...
When building on Fedora systems make sure we install the
mesa version from the COPR, which has the patches to force
alignment to 16K (needed for GPU acceleration on macOS, but
harmless to other systems).
We also need to add "--nobest" to "dnf update" to ensure it
doesn't get frustrated by being unable to install the mesa package
from appstream.
Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-06-26 12:09:05 +02:00
Michael Engel
2f3af6afff
Use property for model store
...
By accessing the model store via property a None-check can be performed
and creating an instance on-the-fly. In addition, this removes the need
for setting the store from the factory and removes its optional trait.
The unit tests for ollama have been rewritten as well since functions
such as repo_pull or exists have been removed. It only tests the pull
function which mocks away http calls to external services.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
ef3863904f
Refactoring model base class for new model store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
8482cf3957
Remove model flag for safetensors via mscli
...
Relates to: github.com/containers/ramalama/pull/1559
Remove Model flag for safetensor files for now in order to
allow multiple safetensor files be downloaded for the
convert command.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
8f578ebf30
Prune old model store code in hf-style models
...
In addition to pruning old model store code, the usage of downloading
files using the hfcli or modelscope cli has been removed.
In the future, the download of multiple files - incl. safetensors - will
be done explicitly based on the metadata only by http requests.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
512ccbaba5
Prune old model store code in Ollama model
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
38f16c42c4
Prune old model store code in URL model
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
5f688686d8
Remove script for old to new model store migration
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
51f766d898
Remove --use-model-store feature flag
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Mike Bonnet
4f479484de
support running all Makefile targets in the bats container
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-25 21:45:37 -07:00
Mike Bonnet
a651be7832
add support for running bats in a container
...
Add a new "bats" container which is configured to run the bats tests.
The container supports running the standard bats test suite
(container-in-container) as well as the "--nocontainer" tests.
Add two new Makefile targets for running the bats container via podman.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-25 21:45:12 -07:00
Mike Bonnet
77d30733be
make use of /dev/dri optional when serving llama-stack
...
Add the --dri option to disable mounting /dev/dri into the container when running "ramalama serve --api llama-stack".
Update bats test to pass "--dri off".
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-25 12:06:30 -07:00
Mike Bonnet
681c488e28
Merge pull request #1608 from containers/konflux-rocm-cuda
...
konflux: use shared pipelines for rocm, rocm-ubi, and cuda
2025-06-25 09:51:21 -07:00
Eric Curtin
f4e929896a
Merge pull request #1606 from containers/fix-text-input
...
Allow std input
2025-06-25 17:07:49 +01:00
Mike Bonnet
7be12487c6
konflux: use shared pipelines for rocm, rocm-ubi, and cuda
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-25 08:46:11 -07:00
Daniel J Walsh
4b71dafa29
Merge pull request #1599 from containers/konflux-centralize-pipelines
...
konflux: centralize pipeline definitions
2025-06-25 10:37:26 -04:00
Mike Bonnet
ed4879d301
konflux: move Pipeline and PipelineRun definitions into subdirs of .tekton
...
This will simplify management as more components are on-boarded.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-25 07:14:03 -07:00
Eric Curtin
aab36b04d4
Allow std input
...
We used to have this feature, got dropped recently accidentally,
can do things like:
`cat text_file_with_prompt.txt | ramalama run smollm:135m`
or
`cat some_doc | ramalama run smollm:135m Explain this document:`
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-25 15:01:48 +01:00
Daniel J Walsh
2526ab6223
Merge pull request #1600 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1750786174
2025-06-25 09:04:29 -04:00
Eric Curtin
f70b13c8db
Merge pull request #1602 from rhatdan/timeout
...
Some of our tests are running for hours, need to be timed out
2025-06-25 13:35:50 +01:00
Eric Curtin
82d04a7469
Merge pull request #1601 from rhatdan/chat
...
Missing options of api_key and pid2kill are causing crashes
2025-06-25 13:34:03 +01:00
Daniel J Walsh
951246f228
Missing options of api_key and pid2kill are causing crashes
...
Also add debug information to chat.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-25 06:46:31 -04:00
Daniel J Walsh
2ba6f6f167
Some of our tests are running for hours, need to be timed out
...
None of our tests should take more then 1 hour, so time them
out and then need to figure out what is causing the issue.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-25 06:34:47 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
18527f87a6
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1750786174
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-06-25 00:02:06 +00:00
Mike Bonnet
e661d87580
konflux: centralize pipeline definitions
...
Move the pipeline definitions into their own files and references them from the PipelineRuns
that are created on pull request and push. This allows the pipelines to be used for multiple
components and dramatically reduces code duplication and maintenance burden.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-24 17:00:10 -07:00
Daniel J Walsh
dc43419f21
Merge pull request #1595 from rhatdan/fedora
...
Move RamaLama container image to default to fedora:42
2025-06-24 15:56:50 -04:00
Daniel J Walsh
189d722eb7
Move RamaLama container image to default to fedora:42
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-24 14:34:59 -04:00
Eric Curtin
788d5564d5
Merge pull request #1578 from containers/gemini
...
API key support
2025-06-24 13:07:35 +01:00
Eric Curtin
fd71bac96a
Merge pull request #1589 from rhatdan/accel
...
Don't pull image when doing ramalama --help call
2025-06-24 12:38:54 +01:00
Daniel J Walsh
1b6b415d0c
Don't pull image when doing ramalama --help call
...
Fixes: https://github.com/containers/ramalama/issues/1587
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-23 19:44:14 -04:00
Daniel J Walsh
1ee66c0964
Merge pull request #1576 from rhatdan/chat
...
Remove last libexec program
2025-06-23 13:48:45 -04:00
Daniel J Walsh
6d7bd22ee1
Remove last libexec program
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-23 11:47:01 -04:00
Daniel J Walsh
eaa0da253d
Hide --max-model-len from option list
...
This fixes make validate to not complain about --ctx-size option.
No reason to have this available in display, since this is only for
users assuming vllm options.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-23 11:44:17 -04:00
Daniel J Walsh
a00188027c
Merge pull request #1586 from rhatdan/VERSION
...
Bump to v0.9.3
2025-06-23 11:23:09 -04:00
Eric Curtin
1465086ded
API key support
...
If we pass --api-key, we can talk to OpenAI providers.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-23 14:21:20 +01:00
Daniel J Walsh
a9abe6909d
Bump to v0.9.3
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-23 07:01:30 -04:00
Daniel J Walsh
4d49658853
Merge pull request #1579 from nathan-weinberg/rlls-0.2.2
...
chore: bump ramalama-stack to 0.2.2
2025-06-23 06:52:17 -04:00
Daniel J Walsh
693827df74
Merge pull request #1580 from nathan-weinberg/fix-dash
...
fix: broken link in CI dashboard
2025-06-23 06:50:30 -04:00
Nathan Weinberg
50d1a8ccb7
fix: broken link in CI dashboard
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-22 21:54:40 -04:00
Nathan Weinberg
bfa4d32af6
chore: bump ramalama-stack to 0.2.2
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-22 21:45:15 -04:00
Daniel J Walsh
fe095f1f8d
Merge pull request #1574 from containers/specify-model
...
Make model argument mandatory
2025-06-21 06:25:58 -04:00
Eric Curtin
cb8ab961b5
Make model argument mandatory
...
To be consistent with "ramalama run" experience. Inferencing
servers that have implemented model-swapping require this. In the
case of servers like llama-server that only load one server, any
value is sufficient.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-21 09:39:38 +01:00
Eric Curtin
aa29aa6efa
Merge pull request #1571 from kush-gupt/main
...
fix: vLLM serving and model mounting
2025-06-20 15:49:44 +01:00
Kush Gupta
c4ec0a57e0
fix doc validation
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-06-20 10:11:51 -04:00
Kush Gupta
e698424f78
fix doc typo and codespell test
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-06-20 08:59:46 -04:00
Kush Gupta
d0ecd5b65a
alias max model len, improve file mounting logic
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-06-20 08:26:50 -04:00
Daniel J Walsh
3f87444b6f
Merge pull request #1101 from lsm5/tmt-gpu
...
TMT: run tests with GPUs
2025-06-20 06:40:38 -04:00
Daniel J Walsh
cdc1edc13c
Merge pull request #1566 from containers/containers-install-from-checkout
...
install ramalama into containers from the current checkout
2025-06-20 06:34:59 -04:00
Daniel J Walsh
f795b41ed5
Merge pull request #1567 from sarroutbi/202506182026-fix-accel-image-test
...
Fix test_accel unit test to fallback to latest
2025-06-20 06:34:29 -04:00
Sergio Arroutbi
307fd722e6
Fix test_accel unit test to fallback to latest
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-20 10:29:24 +02:00
Kush Gupta
55b7d568a9
Merge branch 'containers:main' into main
2025-06-19 22:04:53 -04:00
Kush Gupta
847ec6c33f
vllm mount fixes for safetensor directories ( #12 )
...
* vllm mount fixes for safetensor directories
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* Update ramalama/model.py for better file detection
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* make format
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* improve mount for files
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* fix docs for new vllm param
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* add error handling
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* fix cli param default implementation
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* adjust error message string
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* skip broken test
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
---------
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-06-19 22:04:36 -04:00
Eric Curtin
5ad0f802ba
Merge pull request #1541 from containers/save-space2
...
Trying to save space
2025-06-20 00:18:05 +01:00
Eric Curtin
6d52980aeb
Merge pull request #1569 from mtrmac/oci-docs
...
Document the image format created/consumed by the oci:// transport
2025-06-19 21:46:18 +01:00
Miloslav Trmač
c63ddbcc64
Document the image format created/consumed by the oci:// transport
...
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-06-19 21:53:57 +02:00
Lokesh Mandvekar
a53c42723a
TMT: run tests with GPUs
...
This commit adds TMT test jobs triggered via Packit that fetches an
instance with NVIDIA GPU, specified in `plans/no-rpm.fmf`, and can be
verified in the gpu_info test result.
In addition, system tests (nocontainer), validate, and unit tests are
also triggered via TMT.
Fixes : #1054
TODO:
1. Enable bats-docker tests
2. Resolve f41 validate test failures
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-06-19 15:32:31 -04:00
Eric Curtin
5f75e6f6f4
Trying to save space
...
tiny is is not so tiny, it's 600M
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-19 17:37:44 +01:00
Mike Bonnet
ae114e45af
install ramalama into containers from the current checkout
...
Copy the current checkout of the ramalama repo into the containers and use that for installation.
This removes the need for an extra checkout of the ramalama repo, and is consistent with the build
process used by container_build.sh (which used a bind-mount rather than a copy).
This keeps the version of ramalama in sync with the Containerfiles, and makes testing and CI more
useful.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-19 08:37:53 -07:00
Lokesh Mandvekar
66f7c0d110
System tests: account for rootful default store
...
For the rootful case, the default store is at /var/lib/ramalama.
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-06-19 10:17:49 -04:00
Daniel J Walsh
1af46a247b
Merge pull request #1550 from rhatdan/chat
...
Replace ramalama-client-code with ramalama chat
2025-06-19 07:59:42 -04:00
Daniel J Walsh
95a5a14ebf
Replace ramalama-client-code with ramalama chat
...
ramalama chat does not use --context or --temp, these are server
settings not client side.
Also remove ramalama client command, since this is a duplicate of
ramalama chat.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-19 07:34:36 -04:00
Daniel J Walsh
6c77edfdee
Merge pull request #1534 from containers/latest-only-makes-sense-ollama
...
:latest tag should not be assumed for non-OCI artefacts
2025-06-18 14:43:40 -04:00
Daniel J Walsh
4e4f5f329c
Merge pull request #1564 from sarroutbi/202506181805-reuse-common-command-execution
...
Reuse code for unit test execution rules
2025-06-18 14:42:18 -04:00
Sergio Arroutbi
628b723dae
Fix test_accel unit test to fallback to latest
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-18 20:26:46 +02:00
Sergio Arroutbi
ce24886c1d
Reuse code for unit test execution rules
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-18 18:06:10 +02:00
Eric Curtin
8ff0cd3287
:latest tag should not be assumed for non-OCI artefacts
...
I see people showing things like:
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/UD_Q2_K_XL/Qwen3-235B-A22B-UD-Q2_K_XL-00001-of-00002.gguf:latest 1 month ago 46.42 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/UD_Q2_K_XL/Qwen3-235B-A22B-UD-Q2_K_XL-00002-of-00002.gguf:latest 1 month ago 35.55 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00001-of-00006.gguf:latest 1 week ago 46.44 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00002-of-00006.gguf:latest 1 week ago 46.0 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00003-of-00006.gguf:latest 1 week ago 45.93 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00004-of-00006.gguf:latest 1 week ago 46.0 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00005-of-00006.gguf:latest 1 week ago 46.0 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00006-of-00006.gguf:latest 1 week ago 2.39 GB
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-18 16:53:05 +01:00
Eric Curtin
9df9532ed4
Merge pull request #1562 from sarroutbi/202506181742-add-verbose-rule-for-unit-test-execution
...
Add verbose rule for complete output on unit tests
2025-06-18 16:47:14 +01:00
Sergio Arroutbi
5218906464
Add verbose rule for complete output on unit tests
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-18 17:43:20 +02:00
Aaron Teo
91dd2df8e5
Merge pull request #1559 from engelmi/do-not-flag-safetensors-as-model
...
Remove Model flag for safetensor files for now
2025-06-18 22:10:42 +08:00
Eric Curtin
f780e41313
Merge pull request #1558 from scraly/patch-1
...
Add install command via homebrew
2025-06-18 14:47:13 +01:00
Michael Engel
ac2ae1e8e9
Remove model flag for safetensor files via hf cli
...
Fixes: https://github.com/containers/ramalama/issues/1557
Remove Model flag for safetensor files for now in order to
allow multiple safetensor files be downloaded for the
convert command.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-18 15:26:33 +02:00
Aurelie Vache
9517fbb90a
feat: add install command via homebrew
...
Signed-off-by: scraly <scraly@gmail.com>
2025-06-18 15:05:35 +02:00
Eric Curtin
67eb9420e1
Merge pull request #1556 from rhatdan/engine
...
Fix default prefix for systems with no engines
2025-06-18 10:19:24 +01:00
Daniel J Walsh
c946769700
Merge pull request #1555 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372
2025-06-18 05:09:31 -04:00
Daniel J Walsh
1a5fd28a4d
Fix default prefix for systems with no engines
...
Fixes: https://github.com/containers/ramalama/issues/1552
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-18 05:00:21 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
f2ef4d4f6a
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-06-18 00:03:23 +00:00
Eric Curtin
13b29fab14
Merge pull request #1542 from containers/konflux-ramalama
...
Red Hat Konflux kflux-prd-rh03 update ramalama
2025-06-17 22:23:29 +01:00
Mike Bonnet
df5a093531
konflux: reference the UBI image by digest
...
This will allow MintMaker to submit PRs to update the UBI reference when new versions
are released.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-17 14:09:54 -07:00
Mike Bonnet
6bf454d8ed
konflux: add builds for arm64
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-17 14:09:50 -07:00
Mike Bonnet
2a9704fb1b
konflux: set path-context to the container-images directory
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-17 14:09:41 -07:00
red-hat-konflux-kflux-prd-rh03
3dbad48272
Red Hat Konflux kflux-prd-rh03 update ramalama
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <konflux@no-reply.konflux-ci.dev>
2025-06-17 14:09:41 -07:00
Daniel J Walsh
eb45f50bda
Merge pull request #1551 from rhatdan/test
...
Create tempdir when run as non-root user
2025-06-17 17:02:46 -04:00
Daniel J Walsh
bbf24ae0e9
Create tempdir when run as non-root user
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-17 12:01:08 -04:00
Eric Curtin
13d133490e
Merge pull request #1547 from containers/add_GGML_VK_VISIBLE_DEVICES
...
Add GGML_VK_VISIBLE_DEVICES env var
2025-06-17 12:07:23 +01:00
Daniel J Walsh
aaa6f0f362
Merge pull request #1549 from containers/spaces2tabs
...
Tabs to spaces
2025-06-17 07:04:29 -04:00
Eric Curtin
5fe848eb93
Add GGML_VK_VISIBLE_DEVICES env var
...
Can be used to manually select vulkan device
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-17 11:57:16 +01:00
Eric Curtin
10350d61f8
Tabs to spaces
...
github UI showed red, changing just in case, incorrect tabs or
spaces can cause github ui to skip builds.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-17 11:55:24 +01:00
Eric Curtin
03110ac2e5
Merge pull request #1548 from rhatdan/test
...
Run bats test with TMPDIR pointing at /mnt/tmp
2025-06-17 11:54:17 +01:00
Daniel J Walsh
f8396fc6bf
Run bats test with TMPDIR pointing at /mnt/tmp
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-17 06:47:10 -04:00
Eric Curtin
3f012ba00e
Merge pull request #1502 from alaviss/push-qsrlulqsylxk
...
model: always pass in GPU offloading parameters
2025-06-17 10:20:02 +01:00
Daniel J Walsh
9e2ef6fced
Merge pull request #1544 from containers/add-dnf-update
...
Add dnf update -y to Fedora ROCm build
2025-06-17 05:00:07 -04:00
Eric Curtin
65a08929bb
Add dnf update -y to Fedora ROCm build
...
Trying to fix a compiler issue
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 21:54:30 +01:00
Daniel J Walsh
f0e799319c
Merge pull request #1539 from containers/dedepu
...
Deduplicate code
2025-06-16 14:30:38 -04:00
Daniel J Walsh
4382641624
Merge pull request #1543 from containers/whisper-downgrade
...
Downgrade whisper
2025-06-16 14:21:38 -04:00
Eric Curtin
e4eca9c059
Downgrade whisper
...
We don't need the latest released version right now
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 19:20:36 +01:00
Daniel J Walsh
e0e3ee137c
Merge pull request #1537 from rhatdan/VERSION
...
Bump to v0.9.2
2025-06-16 13:40:13 -04:00
Eric Curtin
d62f9d0284
Deduplicate code
...
So there is only one version of this function
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 18:29:37 +01:00
Eric Curtin
11186fac1d
Merge pull request #1540 from containers/update-podman
...
Upgrade podman
2025-06-16 19:28:19 +02:00
Eric Curtin
3d71a9f7c9
Upgrade podman
...
Use ubuntu plucky repo for podman
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 18:28:08 +01:00
Eric Curtin
ae54b39c31
Merge pull request #1512 from Hasnep/make-minimum-python-version-consistent
...
Make minimum version of Python consistent
2025-06-16 18:58:37 +02:00
Eric Curtin
7955e292df
Merge pull request #1538 from containers/tabs2spaces
...
Convert tabs to spaces
2025-06-16 17:08:31 +02:00
Eric Curtin
3930d68b8a
Convert tabs to spaces
...
Saw this in github ui
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 16:06:39 +01:00
Daniel J Walsh
96c28b179a
Bump to v0.9.2
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-16 09:56:40 -04:00
Daniel J Walsh
e7ab2cb96b
Merge pull request #1527 from rhatdan/image
...
honor the user specifying the image
2025-06-16 09:55:44 -04:00
Daniel J Walsh
f48293cd85
Merge pull request #1536 from nathan-weinberg/bump-rls
...
chore: bump ramalama-stack to 0.2.1
2025-06-16 09:52:45 -04:00
Nathan Weinberg
257d8597d8
chore: bump ramalama-stack to 0.2.1
...
adds RAG capabilities
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-16 09:37:38 -04:00
Daniel J Walsh
94f3a4e83a
honor the user specifying the image
...
Currently we are ignoreing the user specified image if it does not
contain a ':'
Fixes: https://github.com/containers/ramalama/issues/1525
While I was in the code base, I standardized on container-images for
Fedora to come from quay.io/fedora repo.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-16 09:37:00 -04:00
Hannes
fad170d198
Make minimum version of Python consistent
...
Signed-off-by: Hannes <h@nnes.dev>
2025-06-16 21:09:32 +08:00
Daniel J Walsh
de9c7ed89e
Merge pull request #1535 from containers/dont-always-set-up-this-symlink
...
Not sure this is supposed to be here
2025-06-16 08:11:53 -04:00
Eric Curtin
ce11e66dd4
Not sure this is supposed to be here
...
Think it's only meant for the:
container-images/scripts/build-cli.sh
version, it's breaking podman on my bootc system and replacing
/usr/bin/podman with a broken /usr/bin/podman-remote symlink.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 12:55:57 +01:00
Eric Curtin
acc426bbe1
Merge pull request #1532 from rhatdan/huggingface
...
Suggest using uv pip install to get missing module
2025-06-16 11:21:11 +02:00
Daniel J Walsh
e455d82def
Suggest using uv pip install to get missing module
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-16 04:06:23 -04:00
Eric Curtin
2fe2e517be
Merge pull request #1531 from rhatdan/chat
...
Add ramalama chat command
2025-06-15 22:47:47 +02:00
Daniel J Walsh
3cd6a59a76
Apply suggestions from code review
...
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-15 22:16:18 +02:00
Daniel J Walsh
a21fa39b45
Add ramalama chat command
...
For now we will just add the chat command, next PR will remove the
external chat command and just use this internal one.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-15 21:56:36 +02:00
Daniel J Walsh
093d5a4cf1
Merge pull request #1488 from ieaves/imp/typed-config
...
Refactor config and arg typing
2025-06-15 03:29:01 -04:00
Daniel J Walsh
c637a404f8
Merge pull request #1523 from containers/change-from
...
Change the FROM for asahi container image
2025-06-15 02:30:10 -04:00
Daniel J Walsh
913c0c2cdf
Merge pull request #1529 from containers/add-colors
...
Add colors to "ramalama serve" if we can
2025-06-15 02:24:32 -04:00
Eric Curtin
ee4ccffb29
Add colors to "ramalama serve" if we can
...
I don't notice any difference but a lot of things are LOG_INFO in
llama.cpp
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-15 00:57:41 +01:00
Eric Curtin
7a2c30415a
Merge pull request #1528 from engelmi/add-all-option-to-ls
...
Add --all option to ramalama ls
2025-06-14 18:33:31 +02:00
Michael Engel
68052a156b
Remove unneeded list and type cast
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-14 18:11:04 +02:00
Michael Engel
28b4d8a9c0
Add --all option to ramalama ls
...
Relates to: https://github.com/containers/ramalama/issues/1278
By default, ramalama ls should not display partially downloaded
AI Models. In order to enable users to view all models, the new
option --all for the ls command has been introduced.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-14 17:41:11 +02:00
Ian Eaves
cb6226534d
sourcery changes
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-13 14:56:09 -05:00
Ian Eaves
f6b33ebafd
sourcery sucks
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-13 14:22:43 -05:00
Ian Eaves
796d7b5782
sourcery changes
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-13 12:39:27 -05:00
Ian Eaves
91a12887a5
modified ollama-model_pull test
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-13 11:02:47 -05:00
Ian Eaves
eff6eab2ba
sourcery nits
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-13 10:44:31 -05:00
Eric Curtin
90f7fe6e79
Change the FROM for asahi container image
...
Explicitly add quay.io
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-13 13:57:58 +01:00
Eric Curtin
6556b04df9
Merge pull request #1522 from rhatdan/demo
...
Update to add multi-modal
2025-06-13 14:38:33 +02:00
Daniel J Walsh
9f1faba404
Update to add multi-modal
...
Remove failing on pipe errors, since something the network
can fail and break the demo, it would be better to continue
after failures.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-13 14:16:23 +02:00
Daniel J Walsh
2f92ec55c7
Merge pull request #1506 from rhatdan/tty
...
Do not run with --tty when not in interactive mode
2025-06-13 08:13:23 -04:00
Daniel J Walsh
9cc4b7f266
Merge pull request #1517 from kwaa/chore/intel_gpus
...
chore(common/intel_gpus): detect arc a770, a750
2025-06-13 04:26:43 -04:00
藍+85CD
9172e3fb15
chore(common/intel_gpus): detect arc a770, a750
...
Signed-off-by: 藍+85CD <50108258+kwaa@users.noreply.github.com>
2025-06-13 15:31:04 +08:00
Daniel J Walsh
b7555c0e81
Do not run with --tty when not in interactive mode
...
I have found that when running with nvidia the -t (--tty) option
in podman is covering up certain errors. When we are not running
ramalama interactively, we do not need this flag set, and this
would make it easier to diagnose what is going on with users
systems.
Don't add -i unless necessary
Server should not need to be run with --interactive or --tty.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-13 09:23:14 +02:00
Daniel J Walsh
7550fd37c8
Merge pull request #1505 from containers/renovate/huggingface-hub-0.x
...
fix(deps): update dependency huggingface-hub to ~=0.33.0
2025-06-13 02:37:15 -04:00
Daniel J Walsh
d583955bdd
Merge pull request #1497 from containers/change-install-script
...
This installs ramalama via uv if python3 version is too old
2025-06-13 02:35:58 -04:00
Daniel J Walsh
87e6d5ece7
Merge pull request #1510 from containers/increase-retry-attempt-v2
...
Wait for upto 16 seconds for model to load
2025-06-13 02:35:22 -04:00
Daniel J Walsh
83363e7814
Merge pull request #1513 from Hasnep/update-black-target-version
...
Update black target version
2025-06-13 02:33:11 -04:00
Daniel J Walsh
7c730e03bf
Merge pull request #1516 from containers/cosmetic
...
For `ramalama ls` shorten huggingface lines
2025-06-13 02:26:37 -04:00
Hannes
1b2867e995
Update black target version to 3.11, 3.12 and 3.13
...
Signed-off-by: Hannes <h@nnes.dev>
2025-06-13 08:04:12 +08:00
Ian Eaves
e87740c06d
sourcery nits
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 18:04:58 -05:00
Ian Eaves
40705263e1
merge
2025-06-12 17:27:21 -05:00
Ian Eaves
0be170bb56
fixing typo
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:23:43 -05:00
Ian Eaves
82d24551ca
refactored layered config to preserve previous functionality
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Ian Eaves
5b138bdba5
ollama tests, type fixes, format tests
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Ian Eaves
6f49a310be
unnecessary dep group
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Ian Eaves
0f39374f2b
type and bug fixes
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Ian Eaves
2de9b928d4
sourcery found a few things
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Ian Eaves
47b8b6055c
config rewrite + tests
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Eric Curtin
e9fac56ad5
Merge pull request #1514 from Hasnep/add-python-shebang-files-to-linting
...
Add Python shebang files to linting
2025-06-12 08:49:47 -05:00
Eric Curtin
6196c88713
For `ramalama ls` shorten huggingface lines
...
Substitute huggingface with hf and remove :latest as it doesn't
really apply. huggingface lines are particularly lengthy so it's
welcome characters saved. hf is a common acronym for huggingface
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-12 14:26:53 +01:00
Eric Curtin
12b37d60cd
Merge pull request #1511 from engelmi/ignore-rm-of-non-existing-snapshot-dir
...
Ignore errors when removing snapshot directory
2025-06-12 07:42:08 -05:00
Eric Curtin
42b6525187
This installs ramalama via uv if python3 version is too old
...
Lets say in the case of RHEL9.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-12 13:33:37 +01:00
Hannes
4712924e78
Fix unformatted Python files
...
Signed-off-by: Hannes <h@nnes.dev>
2025-06-12 20:03:14 +08:00
Hannes
752516fce7
Add Python shebang files to Makefile linting
...
Signed-off-by: Hannes <h@nnes.dev>
2025-06-12 20:03:08 +08:00
Michael Engel
830409e618
Ignore errors when removing snapshot directory on failed creation
...
Relates to: https://github.com/containers/ramalama/issues/1508
remove_snapshot should never fail, therefore adding the ignore_errors=True.
Before removing a snapshot with ramalama rm an existence check is made. If
the model does not exist, an error will be raised to preserve the previous
behavior of that command.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-12 13:31:21 +02:00
Eric Curtin
493d34bd29
Wait for upto 16 seconds for model to load
...
Trying to put this timeout to bed once and for all. There is a
chance a really large model on certain hardware could take more
than 16 seconds to load.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-12 09:58:42 +01:00
Daniel J Walsh
e5635d1d14
Merge pull request #1507 from containers/increase-retry-attempt
...
Increase retry attempts to attempt to connect to server
2025-06-12 03:58:14 -04:00
Eric Curtin
22986e0d6a
Increase retry attempts to attempt to connect to server
...
increase i to 512
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-12 05:15:41 +01:00
renovate[bot]
75436923b1
fix(deps): update dependency huggingface-hub to ~=0.33.0
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-06-11 20:26:34 +00:00
Daniel J Walsh
003612abf7
Merge pull request #1503 from nathan-weinberg/fix-container-dep
...
fix: remove unneeded dependency from Llama Stack container
2025-06-11 00:11:17 -04:00
Daniel J Walsh
d98adcbc9f
Merge pull request #1499 from containers/update-shortnames
...
This is not a multi-model model
2025-06-10 23:43:49 -04:00
Nathan Weinberg
ea9ba184ac
fix: remove unneeded dependency from Llama Stack container
...
`blobfile` dependency is already included in ramalama-stack version 0.2.0
adding it explicitly is unnecessarily
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-10 22:20:38 -04:00
Leorize
6bac6d497a
readme: apply styling suggestions
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 17:53:49 -05:00
Leorize
5a3e55eb0c
model: always pass in GPU offloading parameters
...
This does nothing on systems with no GPUs, but on Vulkan-capable
systems, this would automatically offload the model to capable
accelerators.
Take this moment to claim Vulkan support in README also.
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 17:46:14 -05:00
Eric Curtin
6959d73d30
Merge pull request #1501 from alaviss/push-tumrzqxpzvkn
...
amdkfd: add constants for heap types
2025-06-10 17:41:49 -05:00
Leorize
309766dd8c
amdkfd: add constants for heap types
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 17:22:30 -05:00
Eric Curtin
4808a49de0
Merge pull request #1500 from alaviss/push-pwxuznmnqptr
...
Only enumerate ROCm-capable AMD GPUs
2025-06-10 17:02:17 -05:00
Leorize
db4a7d24af
Apply formatting fixes
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 15:20:18 -05:00
Leorize
93e36ac24e
Extract VRAM minimum into a constant
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 15:17:37 -05:00
Leorize
ecb9fb086f
Extract amdkfd utilities to its own module
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 15:17:20 -05:00
Leorize
fab87654cb
Only enumerate ROCm-capable AMD GPUs
...
Discover AMD graphics devices using AMDKFD topology instead of
enumerating the PCIe bus. This interface exposes a lot more information
about potential devices, allowing RamaLama to filter out unsupported
devices.
Currently, devices older than GFX9 are filtered, as they are no longer
supported by ROCm.
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 14:54:48 -05:00
Eric Curtin
9bc76c2757
This is not a multi-model model
...
Although the other gemma once are. Point the user towards a single
gguf.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-10 18:43:06 +01:00
Daniel J Walsh
83a75f16f7
Merge pull request #1492 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372
2025-06-10 08:42:14 -04:00
Daniel J Walsh
8a9f6a0291
Merge pull request #1496 from containers/fix-build
...
Install uv to fix build issue
2025-06-10 08:32:17 -04:00
Eric Curtin
b21556b513
Install uv to fix build issue
...
Run the install-uv.sh script.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-10 13:14:56 +01:00
Daniel J Walsh
4be8cbc71e
Merge pull request #1495 from containers/dont-use-llvmpipe
...
There's a change that we want that avoids using software rasterizers
2025-06-10 08:08:50 -04:00
Eric Curtin
b4a3375d94
There's a change that we want that avoids using software rasterizers
...
It avoids using llvmpipe when Vulkan is built in and fallsback to
ggml-cpu.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-10 13:05:31 +01:00
Daniel J Walsh
7bdd073b59
Merge pull request #1491 from makllama/xd/fix_hf
...
Fix #1489
2025-06-10 05:25:40 -04:00
renovate[bot]
5b849722cb
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-06-10 09:22:45 +00:00
Daniel J Walsh
5925bb6908
Merge pull request #1490 from rhatdan/llama-stack
...
Make sure llama-stack URL is shown to user
2025-06-10 05:22:05 -04:00
Xiaodong Ye
ae0775afd1
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-06-10 16:45:47 +08:00
Xiaodong Ye
6f020d361c
Fix #1489
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-06-10 16:39:26 +08:00
Daniel J Walsh
764fc2d829
Make sure llama-stack URL is shown to user
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-10 09:50:04 +02:00
Daniel J Walsh
b64d82276c
Merge pull request #1471 from rhatdan/oci
...
Throw exception when using OCI without engine
2025-06-10 03:36:20 -04:00
Daniel J Walsh
041c05d2b8
Throw exception when using OCI without engine
...
Fixes: https://github.com/containers/ramalama/issues/1463
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-10 08:46:01 +02:00
Daniel J Walsh
97a14e9c2d
Merge pull request #1486 from containers/remove-duplicate-line-on-restapi
...
Only print this in the llama-stack case
2025-06-10 00:09:54 -04:00
Eric Curtin
2368da00ac
Only print this in the llama-stack case
...
In the llama.cpp case it doesn't make as much sense, llama-server
prints this string when it's ready to be served like so:
main: server is listening on http://0.0.0.0:8080 - starting the main loop
This can be printed seconds or minutes too early potentially in
the llama.cpp case.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-09 15:25:08 +01:00
Daniel J Walsh
c62acfbba6
Merge pull request #1484 from rhatdan/VERSION
...
Bump to v0.9.1
2025-06-09 08:37:35 -04:00
Daniel J Walsh
9c639fc651
Bump to v0.9.1
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-09 14:37:05 +02:00
Daniel J Walsh
bbcfb7c0f1
Fix llama-stack
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-09 14:37:05 +02:00
Daniel J Walsh
3317372625
Merge pull request #1474 from rhatdan/demos
...
Update demos to show serving models.
2025-06-09 03:35:06 -04:00
Daniel J Walsh
cd2a8c3539
Update demo scripts to show serve
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-09 09:34:36 +02:00
Daniel J Walsh
fe6d90461f
Merge pull request #1472 from rhatdan/llama-stack
...
Fix handling of generate with llama-stack
2025-06-09 03:29:53 -04:00
Daniel J Walsh
e4ea40a1b8
Merge pull request #1483 from containers/renovate/huggingface-hub-0.x
...
fix(deps): update dependency huggingface-hub to ~=0.32.4
2025-06-09 00:14:15 -04:00
renovate[bot]
9627b5617b
fix(deps): update dependency huggingface-hub to ~=0.32.4
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-06-08 20:35:25 +00:00
Eric Curtin
4a10c02716
Merge pull request #1481 from ieaves/imp/dev-dependency-groups
...
Adds dev dependency groups
2025-06-08 15:34:54 -05:00
Daniel J Walsh
4fe7ae73a1
Fix stopping of llama-stack based containers by name
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-08 11:54:24 +02:00
Daniel J Walsh
2ca6b57dc3
Fix handling of generate with llama-stack
...
llama-stack API is not working without --generate command.
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-07 10:36:46 +02:00
Ian Eaves
f65529bda7
adds dev dependency groups
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-06 18:12:33 -05:00
Nathan Weinberg
268e47ccc0
Merge pull request #1478 from nathan-weinberg/stack-bump
...
chore: bump 'ramalama-stack' version to 0.2.0
2025-06-05 16:15:03 -04:00
Nathan Weinberg
c59a507426
chore: bump 'ramalama-stack' version to 0.2.0
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-05 15:00:11 -04:00
Daniel J Walsh
fc9b33e436
Merge pull request #1477 from containers/no-warmup
...
Don't warmup by default
2025-06-05 14:46:30 -04:00
Eric Curtin
8d2041a0bb
Don't warmup by default
...
llama-server by default warms up the model with an empty run for
performance reasons. We can warm up ourselves with a real query.
Warming up was causing issues and delays start time.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-05 19:42:41 +01:00
Daniel J Walsh
a67d8c1f6a
Merge pull request #1476 from containers/env-var
...
Call set_gpu_type_env_vars rather than set_accel_env_vars
2025-06-05 14:05:08 -04:00
Eric Curtin
882011029c
Call set_gpu_type_env_vars rather than set_accel_env_vars
...
For GPU detection.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-05 16:43:48 +01:00
Daniel J Walsh
f07a062124
Merge pull request #1475 from containers/env-var
...
Do not override a small subset of env vars
2025-06-05 11:00:31 -04:00
Eric Curtin
ff446f96fb
Do not override a small subset of env vars
...
RamaLama does not try to detect GPU if the user has already set
certain env vars. Make this list smaller.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-05 14:01:45 +01:00
Daniel J Walsh
ef7bd2a004
Merge pull request #1467 from rhatdan/llama-stack
...
llama-stack container build fails with == 1.5.0
2025-06-05 01:39:04 -04:00
Daniel J Walsh
b990ef0392
Merge pull request #1469 from containers/timeout-change
...
Change timeouts
2025-06-04 20:13:44 -04:00
Eric Curtin
0bcf3b8308
Merge pull request #1468 from waltdisgrace/documentation_improvements
...
Documentation improvements
2025-06-04 11:38:55 -05:00
Eric Curtin
0455e45073
Change timeouts
...
Most we want to sleep between request attempts in 100ms, a request
every 100ms isn't that expensive.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-04 17:37:11 +01:00
Grace Chin
c4777a9ccc
Add documentation about running tests
...
Signed-off-by: Grace Chin <gchin@redhat.com>
2025-06-04 11:55:57 -04:00
Daniel J Walsh
56b62ec756
Merge pull request #1466 from makllama/xd/rename
...
Rename: RepoFile=>HFStyleRepoFile, BaseRepository=>HFStyleRepository, BaseRepoModel=>HFStyleRepoModel
2025-06-04 05:28:53 -04:00
Daniel J Walsh
8538e01667
llama-stack container build fails with == 1.5.0
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-04 10:58:41 +02:00
Xiaodong Ye
86fbd93e5f
Rename: RepoFile=>HFStyleRepoFile, BaseRepository=>HFStyleRepository, BaseRepoModel=>HFStyleRepoModel
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-06-04 09:01:20 +08:00
Daniel J Walsh
31b82b36de
Merge pull request #1465 from nathan-weinberg/stack-lock
...
fix: lock down ramalama-stack version in llama-stack Containerfile
2025-06-03 15:21:00 -04:00
Nathan Weinberg
ae17010390
fix: lock down ramalama-stack version in llama-stack Containerfile
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-03 14:16:50 -04:00
Eric Curtin
8056437669
Merge pull request #1464 from taronaeo/chore/rm-else-in-llama-whisper-build
...
chore: remove unclear else from llama and whisper build
2025-06-03 12:19:31 -05:00
Aaron Teo
bbd6afc8e9
chore: remove unclear else from llama and whisper build
...
Ref: https://github.com/containers/ramalama/pull/1459#discussion_r2124350835
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-06-04 00:43:05 +08:00
Eric Curtin
cc73b6bd1e
Merge pull request #1461 from taronaeo/doc/container-build-help
...
docs: update container_build.sh help information
2025-06-03 11:15:10 -05:00
Eric Curtin
cc2970f027
Merge pull request #1459 from taronaeo/feat/s390x-build
...
feat: s390x build commands
2025-06-03 11:13:17 -05:00
Aaron Teo
bf0bfe0761
docs: update container_build.sh help information
...
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix: remove -v from print information
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-06-03 23:49:37 +08:00
Aaron Teo
3996f1b4a4
feat: s390x build commands
...
currently it builds correctly on s390x but we want to enforce the
-DGGML_VXE=ON flag. we also want to disable whisper.cpp for now until we
can bring up support for it, otherwise it will be a product that none of us
have experience in.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix: missing s390x for ramalama
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
feat: disable whisper.cpp for s390x
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
chore: remove s390x containerfile
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-06-03 23:38:33 +08:00
Daniel J Walsh
e053285a7c
Merge pull request #1462 from rhatdan/VERSION
...
Bump to v0.9.0
2025-06-03 07:28:33 -04:00
Daniel J Walsh
50df70c48c
Bump to v0.9.0
...
Switching pyproject.toml to python 3.10 since
CANN and MUSE containerfiles only have access to those
versions of python.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-03 06:53:08 -04:00
Daniel J Walsh
6d7cfa88a4
Merge pull request #1457 from rhatdan/llama-stack
...
Add support for generating kube.yaml and quadlet/kube files for llama…
2025-06-03 06:51:51 -04:00
Eric Curtin
75b36dc3ba
Merge pull request #1458 from engelmi/snapshot-verification
...
Snapshot verification
2025-06-02 07:47:30 -05:00
Michael Engel
b84527bdd5
Replace exception with explicit is_gguf check
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-02 13:11:01 +02:00
Michael Engel
4f53c65386
Improved error handling when creating new snapshot
...
An error when creating new snapshots has only been partially handled
inside the model store and the caller side had to clean up properly.
In order to simplify this, more error handling has been added when
creating new snapshots - removing the (faulty) snapshot, logging and
passing the exception upwards so that the caller can do additional
actions. This ensures that the state remains consistent.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-02 13:11:01 +02:00
Michael Engel
228c985d2c
Moved endianness verification to dedicated step
...
Previously, the endianness check was done for each SnapshotFile and
these files might not be models, but could also be miscellaneous such
as chat templates or other meta data. By removing only the affected file
on a mismatch error the store might get into an inconsistent state since
the cleanup depends on the error handling of the caller.
Therefore, the check for endianness has been moved one layer up and only
checks the flagged model file. In case of a mismatch an implicit removal
of the whole snapshot is triggered.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-02 13:11:01 +02:00
Michael Engel
381d052d55
Extract model endianness into dedicated function
...
By moving the recently improved code to detect the endianness into
a dedicated function, its reusability is increased. Also, a specific
exception class if the model is not in the gguf format has been added.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-02 13:11:01 +02:00
Michael Engel
b6d1eb77a1
Remove unused GGUFEndian members
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-02 13:11:01 +02:00
Daniel J Walsh
b218c099e4
Add support for generating kube.yaml and quadlet/kube files for llama-stack
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-02 06:03:26 -04:00
Daniel J Walsh
c15a1e31f4
Merge pull request #1451 from rhatdan/selinux
...
Eliminate selinux-policy packages from containers
2025-06-01 06:20:27 -04:00
Eric Curtin
a1cbd017e9
Merge pull request #1456 from makllama/xd/refactoring
...
Refactoring huggingface.py and modelscope.py and extract repo_model_base.py
2025-05-31 11:39:20 -05:00
Eric Curtin
ee7cb50849
Merge pull request #1413 from rhatdan/llama-stack
...
Add support for llama-stack
2025-05-31 11:37:48 -05:00
Xiaodong Ye
816593caf6
Refactoring huggingface.py and modelscope.py and extract repo_model_base.py
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-31 21:04:16 +08:00
Daniel J Walsh
360f075fed
Eliminate selinux-policy packages from containers
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-31 05:28:57 -04:00
Eric Curtin
408ee66000
Merge pull request #1454 from taronaeo/feat/hf-byteswap-on-save
...
feat(model_store): prevent model endianness mismatch on download
2025-05-30 15:08:22 -05:00
Aaron Teo
a8dec56641
feat(model_store): prevent model endianness mismatch on download
...
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): missed some calls
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): typo `return` vs `raise`
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): missing staticmethod declarations
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
feat(model_store): prevent model endianness mismatch on download
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
feat(model_store): prevent downloading of non-native endian models
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
feat(model_store): check file for gguf and verify endianness
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
feat(model_store): add more information on why we deny endian mismatch
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(model_store): linters complaining
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(model_store): linter complaining
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(model_store): linter complaining
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-05-31 03:04:27 +08:00
Daniel J Walsh
a437651934
Add support for llama-stack
...
Add new option --api which allows users to specify the API Server
either llama-stack or none. With None, we just generate a service with
serve command. With `--api llama-stack`, RamaLama will generate an API
Server listening on port 8321 and a openai server listening on port
8080.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-30 09:16:52 -04:00
Daniel J Walsh
52204997b2
Merge pull request #1455 from almusil/logging
...
Small logging improvements
2025-05-30 08:01:24 -04:00
Daniel J Walsh
27ec0d05ef
Merge pull request #1452 from taronaeo/fix/gguf-parser-string-endian
...
fix(gguf_parser): fix memoryerror exception when loading non-native models
2025-05-30 05:50:56 -04:00
Ales Musil
7d62050941
Add more logging around HTTP requests.
...
Add more logging to indacate requests to http/https addresses in
debug. This should make it easier to find out what exactly is going
on under the hood mainly for pull command.
Signed-off-by: Ales Musil <amusil@redhat.com>
2025-05-30 08:57:50 +02:00
Ales Musil
4c905a4207
Add global logger and use it in the existing code.
...
Add global logger that can be used to print message to stderr.
Replace all perror calls in dabug cases with logger.debug calls
which reduces the extra argument required to pass as the module
will print error message based on the level.
Signed-off-by: Ales Musil <amusil@redhat.com>
2025-05-30 08:57:50 +02:00
Eric Curtin
fbca7ec238
Merge pull request #1450 from rhatdan/libexec
...
make ramalama-client-core send default model to server
2025-05-30 00:14:46 -05:00
Aaron Teo
1b32a09190
fix(gguf_parser): fix memoryerror exception when loading non-native
...
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): missed some calls
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): typo `return` vs `raise`
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): missing staticmethod declarations
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-05-30 10:05:05 +08:00
Eric Curtin
d1bfe4a18a
Merge pull request #1449 from rhatdan/vulkan1
...
Switch default ramalama image build to use VULKAN
2025-05-29 19:41:23 -05:00
Daniel J Walsh
b26a82c132
make ramalama-client-core send default model to server
...
Also move most of the helper functions into ramalamashell class
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-29 16:19:38 -04:00
Daniel J Walsh
b6d5e95e2c
Switch default ramalama image build to use VULKAN
...
podman 5.5 and Podman Desktop have been updated, this
should give us better performance then previous versions
on MAC.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-29 09:30:23 -04:00
Daniel J Walsh
8a604e3b13
Merge pull request #1430 from melodyliu1986/melodyliu1986-feature-branch
...
fix(run): Ensure 'run' subcommand works with host proxy settings.
2025-05-29 08:52:48 -04:00
Daniel J Walsh
7c2b21bb25
Merge pull request #1447 from rhatdan/choice
...
Choice could be not set and should not be used
2025-05-29 08:42:21 -04:00
Daniel J Walsh
398309a354
Choice could be not set and should not be used
...
Fixes: https://github.com/containers/ramalama/issues/1445
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-29 06:54:54 -04:00
Song Liu
206d669ce7
fix(run): Ensure 'run' subcommand works with host proxy settings.
...
When 'ramalama run' is used with '--network none', it was inheriting host proxy
environment variables. This caused the interanl client to fail when connecting to
the internal llama-server on 127.0.0.1, as it tried to route loopback traffic through
the unreachable proxy.
This change modifies engine.py to:
- Correctly set NO_PROXY/no_proxy for localhost and 127.0.0.1.
- Explicitly unset http_proxy, https_proxy, HTTP_PROXY, and HTTPS_PROXY variables
for the container when the 'run' subcommand is invoked.
This allows the internal client to connect directly to the internal server, resolving
the connection error.
Fixes : #1414
Signed-off-by: Song Liu <soliu@redhat.com>
2025-05-29 15:50:15 +08:00
Daniel J Walsh
859609e59e
Merge pull request #1444 from taronaeo/feat/s390x-build
...
fix(gguf_parser): fix big endian model parsing
2025-05-28 11:22:57 -04:00
Aaron Teo
0bf4f5daf7
refactor(gguf_parser): fix big endian model parsing
...
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): debug gguf_version
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): swap endianness for model version
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): manually set to big endian mode first
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): refactor endianness read
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): unable to load big endian models on big-endian
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): add print statements for debug
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): pin endianness for now
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): support big-endian model
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(build_llama_and_whisper): add s390x build flags
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(build_llama_and_whisper): add openblas-openmp dep
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Revert "wip(build_llama_and_whisper): add openblas-openmp dep"
This reverts commit 375a358d192789cd4651886308cc723e56baf50f.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Revert "wip(build_llama_and_whisper): add s390x build flags"
This reverts commit 00fc3ea21b64a9a39226878a0bf194c1b4bc3c41.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
chore(build_rag): add notification of rag and docling build skip
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(linter): code formatting
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(linter): separately declare variables
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(model_inspect): fix model endianness detection
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(linter): fix code styling
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(model_inspect): circular import for ggufendian
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(endian): missing import
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-05-28 22:32:09 +08:00
Daniel J Walsh
b9171dcf4f
Merge pull request #1442 from olliewalsh/quadlet_duplicate_options
...
Fix quadlet handling of duplicate options
2025-05-28 08:27:43 -04:00
Oliver Walsh
82c45f2171
Fix quadlet handling of duplicate options
...
Re-implement without relying on ConfigParser which does not support duplicate
options.
Extend unit test coverage for this and correct the existing test data.
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-28 02:16:31 +01:00
Daniel J Walsh
e26a141f33
Merge pull request #1441 from nathan-weinberg/py-version
...
fix: update references to Python 3.8 to Python 3.11
2025-05-27 13:14:09 -04:00
Nathan Weinberg
31b23a2ff7
fix: update references to Python 3.8 to Python 3.11
...
prev commit made Python 3.11 the min version for
ramalama, but not all references in the project
were updated to reflect this
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-27 11:41:08 -04:00
Daniel J Walsh
69a371c626
Merge pull request #1439 from rhatdan/VERSION
...
Bump to v0.8.5
2025-05-27 08:53:06 -04:00
Daniel J Walsh
b7d45f48aa
Merge pull request #1438 from p5/bump-llama.cpp
...
chore: bump llama.cpp to support tool streaming
2025-05-27 07:34:03 -04:00
Robert Sturla
b3adc7445b
fix(ci): remove aditional unused software during build workflow
...
Signed-off-by: Robert Sturla <robertsturla@outlook.com>
2025-05-27 10:00:05 +01:00
Robert Sturla
d23ed7d7ec
chore: bump llama.cpp to support tool streaming
...
Signed-off-by: Robert Sturla <robertsturla@outlook.com>
2025-05-26 22:02:20 +01:00
Daniel J Walsh
691c235b80
Bump to v0.8.5
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-26 06:26:40 -04:00
Daniel J Walsh
90974a49af
Merge pull request #1436 from makllama/xd/mudnn
...
Support Moore Threads GPU #3
2025-05-26 05:58:09 -04:00
Xiaodong Ye
57243bcfb0
musa: switch to mudnn images
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-26 11:23:53 +08:00
Daniel J Walsh
a1e2fad76f
Merge pull request #1435 from containers/multimodal
...
Don't use jinja in the multimodal case
2025-05-24 05:41:30 -04:00
Eric Curtin
98e15e40db
Don't use jinja in the multimodal case
...
At least with smolvlm the output becomes junk with this option on.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-23 18:19:53 +01:00
Eric Curtin
e041cc8e44
Merge pull request #1426 from afazekas/split_file
...
split/big model support for llama.cpp
2025-05-20 08:58:17 -04:00
Daniel J Walsh
64ce8a1018
Merge pull request #1425 from olliewalsh/hftokenauth
...
Add support for Hugging Face token authentication
2025-05-20 08:47:49 -04:00
Daniel J Walsh
e3dc18558a
Merge pull request #1428 from sarroutbi/202505201103-remove-unused-parameters
...
Remove unused parameters from ollama_repo_utils.py
2025-05-20 07:19:57 -04:00
Eric Curtin
157c598568
Merge pull request #1427 from afazekas/bump-bug-rocm
...
Bump llama.cpp to fix rocm bug
2025-05-20 06:13:04 -04:00
Sergio Arroutbi
6e0b3f4361
Remove unused parameters from ollama_repo_utils.py
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-20 11:03:41 +02:00
Attila Fazekas
14e443dd47
Bump llama.cpp to fix rocm bug
...
Last bump unfortunately bring a bug to rocm/hip support
bumping the version to include the fix.
[0] https://github.com/ggml-org/llama.cpp/issues/13437
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-05-20 10:29:27 +02:00
Attila Fazekas
efa62203eb
split/big model support for llama.cpp
...
Bigger than 70B models typically stored in multiple gguf files
with a special naming what the llama.cpp expects.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-05-20 10:02:36 +02:00
Oliver Walsh
14a632b081
Only remove a snapshot if we tried to create one
...
Otherwise can remove an existing snapshot due to an unrelated error
e.g HTTP 401 if token auth fails
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-19 11:53:51 +01:00
Oliver Walsh
b7b6172626
Use cached huggingface auth token if it exists
...
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-19 11:42:04 +01:00
Daniel J Walsh
b04d88e9c4
Merge pull request #1423 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1747219013
2025-05-18 08:18:10 -04:00
Daniel J Walsh
2f6b6d1f49
Merge pull request #1424 from containers/add-shortnames
...
Add smolvlm vision models
2025-05-18 08:17:14 -04:00
Eric Curtin
e494a6d924
Add smolvlm vision models
...
For multimodal usage
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-18 12:02:01 +01:00
renovate[bot]
7c452618f8
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1747219013
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-05-17 18:09:58 +00:00
Daniel J Walsh
520a8379a2
Merge pull request #1407 from makllama/xd/mthreads
...
Support Moore Threads GPU #1
2025-05-16 13:20:02 -04:00
Daniel J Walsh
9d65a2e546
Merge pull request #1422 from olliewalsh/hf_repo_norm
...
Normalize hf repo quant/tag
2025-05-16 11:29:29 -04:00
Daniel J Walsh
798db33f49
Merge pull request #1420 from rhatdan/except
...
Don't throw Exceptions, be more specific
2025-05-16 11:11:40 -04:00
Oliver Walsh
ab96b97751
Normalize hf repo quant/tag
...
The huggingface repo tag refers to the quantization and is case insensitive.
Normalize this to uppercase.
Fixes : #1421
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-16 15:28:59 +01:00
Daniel J Walsh
a27e56cb16
Don't throw Exceptions, be more specific
...
Fixes: https://github.com/containers/ramalama/issues/1419
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-16 10:28:04 -04:00
Daniel J Walsh
924f38358d
Merge pull request #1409 from engelmi/add-port-mapping-to-gen
...
Added host:container port mapping to quadlet generation
2025-05-16 09:35:55 -04:00
Daniel J Walsh
ffc9d46dda
Merge pull request #1416 from olliewalsh/multimodal
...
Multimodal/vision support
2025-05-16 09:34:57 -04:00
Eric Curtin
0dc0de1cf1
Merge pull request #1418 from containers/typo2
...
Small typo
2025-05-16 12:52:39 +01:00
Eric Curtin
98f17220f3
Small typo
...
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-16 12:52:09 +01:00
Eric Curtin
270608f1c2
Merge pull request #1415 from containers/add-more-debug
...
Add more debug for non starting servers with "ramalama run"
2025-05-16 12:47:14 +01:00
Oliver Walsh
1e34882beb
Omit unused tag when creating ModelScopeRepository instance
...
Co-authored-by: Michael Engel <mengel@redhat.com>
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-16 12:24:51 +01:00
Eric Curtin
1bd1b2ae98
Add more debug for non starting servers with "ramalama run"
...
Sometimes the server doesn't start
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-16 11:56:41 +01:00
Michael Engel
fff099f130
Validate --port option input
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-16 12:18:48 +02:00
Michael Engel
0707857dae
Added host:container port mapping to quadlet generation
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-16 12:18:48 +02:00
Oliver Walsh
e9ae198547
Multimodal/vision support
...
Add support for pulling hf repos vs individual models, replicating
the `llama.cpp -hf <model>` logic.
Add support for mmproj file in model store snapshot.
If an mmproj file is available pass it on the llama.cpp command line.
Structure classes to continue support for modelscope as ModelScopeRepository
inherits from HuggingfaceRepository.
Example usage:
$ ramalama serve huggingface://ggml-org/gemma-3-4b-it-GGUF
...
Open webui, upload a picture, ask for a description.
Fixes : #1405
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-16 10:40:18 +01:00
Xiaodong Ye
267979fa44
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 19:20:31 +08:00
Xiaodong Ye
b68c6b4c45
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 19:20:31 +08:00
Xiaodong Ye
7e4a0102af
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 19:20:31 +08:00
Xiaodong Ye
d33efcc5ec
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 19:20:31 +08:00
Xiaodong Ye
80f2393283
Support Moore Threads GPU
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 19:20:31 +08:00
Eric Curtin
04032f28c1
Merge pull request #1410 from makllama/xd/mthreads_doc
...
Support Moore Threads GPU #2
2025-05-15 09:44:51 +01:00
Xiaodong Ye
e2faafa68e
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 10:41:09 +08:00
Xiaodong Ye
ae79ab16b2
Add doc for Moore Threads GPU
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 10:26:21 +08:00
Eric Curtin
9e01abf5ef
Merge pull request #1408 from bmahabirbu/ocr-cleanup
...
fix: removed ocr print statement and updated ocr description
2025-05-14 15:41:33 +01:00
Brian
dd6e03f991
fix: removed ocr print statement and updated ocr description
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-05-14 10:04:57 -04:00
Daniel J Walsh
e8f14a3e5b
Merge pull request #1400 from bmahabirbu/ocr
...
added a docling ocr flag ( text image recognition) flag to address RAM issue
2025-05-14 05:52:35 -04:00
Brian
2b5ee4e7c0
added a docling ocr flag ( text image recognition) flag to address RAM issue
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-05-13 22:51:21 -04:00
Daniel J Walsh
daa2948ef6
Merge pull request #1399 from rhatdan/VERSION
...
Fix cuda builds installation of python3.11
2025-05-13 11:21:51 -04:00
Daniel J Walsh
5fa60ffb68
Merge pull request #1406 from sarroutbi/202505131640-include-building-containers-in-contributing-md
...
Include additional information in CONTRIBUTING.md
2025-05-13 11:20:57 -04:00
Sergio Arroutbi
2f0813c378
Include additional information in CONTRIBUTING.md
...
Include additional information such as:
- Possibility to generate containers through Makefile
- Possibility to generate coverage reports through Makefile
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-13 16:57:08 +02:00
Daniel J Walsh
091147d844
Merge pull request #1404 from sarroutbi/202505131435-include-minor-contributing-md-improvements
...
Add minor CONTRIBUTING.md enhancements
2025-05-13 09:50:22 -04:00
Daniel J Walsh
3490306484
Merge pull request #1403 from sarroutbi/202505131335-increase-cli-coverage
...
Increase cli.py coverage
2025-05-13 09:49:55 -04:00
Sergio Arroutbi
a2a040f830
Increase cli.py coverage
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-13 15:02:31 +02:00
Sergio Arroutbi
f3cd12dce8
Add minor CONTRIBUTING.md enhancements
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-13 14:36:23 +02:00
Daniel J Walsh
e67cae5b66
Fix cuda builds installation of python3.11
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-13 08:28:20 -04:00
Eric Curtin
c802a33b6b
Merge pull request #1402 from sarroutbi/202505131227-fix-pylint-issues
...
Fix issues reported by pylint for cli.py
2025-05-13 13:06:51 +01:00
Sergio Arroutbi
0030b8ae4c
Fix issues reported by pylint for cli.py
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-13 13:25:48 +02:00
Daniel J Walsh
1c945ea06d
Merge pull request #1398 from containers/less-paths-added
...
Remove all path additions to this file
2025-05-13 05:43:50 -04:00
Eric Curtin
35dc8aac2f
Remove all path additions to this file
...
This was added when we didn't have good installation techniques
for mac. We have pipx which was not intuitive and a hacked
together shell script as an alternative. Now that we have brew and
uv integrated we don't need this code.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-13 10:16:56 +01:00
Eric Curtin
935dc717b1
Merge pull request #1396 from containers/build-fix-2
...
Fix builds
2025-05-13 09:58:46 +01:00
Eric Curtin
36137ac613
Fix builds
...
Use array for list of packages. Move start of script execution
after all function definitions.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-13 09:58:19 +01:00
Eric Curtin
a4bcd52d14
Merge pull request #1395 from ieaves/imp/main-error-reporting
...
Using perror in cli.main
2025-05-12 19:54:12 +01:00
Ian Eaves
0238164464
switched cli.main away from print to perror
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-05-12 13:22:44 -05:00
Daniel J Walsh
4ad7812185
Merge pull request #1393 from containers/reword-description
...
This script is not macOS only
2025-05-12 12:27:53 -04:00
Daniel J Walsh
e7e8182fec
Merge pull request #1391 from rhatdan/VERSION
...
Bump to 0.8.3
2025-05-12 12:25:16 -04:00
Eric Curtin
f7296d0e23
This script is not macOS only
...
This script works for many platforms and generally does the right
thing.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-12 17:07:13 +01:00
Daniel J Walsh
730a09dba3
Merge pull request #1392 from containers/change-install-url
...
Shorten url in README.md
2025-05-12 12:04:39 -04:00
Daniel J Walsh
71872f8ced
Bump to v0.8.3
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-12 12:02:34 -04:00
Eric Curtin
c5ea7fc9d1
Shorten url in README.md
...
This is now installable via https://ramalama.ai/install.sh
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-12 16:57:43 +01:00
Daniel J Walsh
4f240699da
Merge pull request #1389 from containers/punctuation-consistency-2
...
More de-duplication and consistency
2025-05-12 07:48:12 -04:00
Eric Curtin
24698d1c4b
More de-duplication and consistency
...
After the modelscope changes
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-12 12:02:18 +01:00
Daniel J Walsh
beba6bb066
Merge pull request #1371 from engelmi/add-output-path-to-generate
...
Add output path to generate quadlet/kube
2025-05-12 06:21:57 -04:00
Daniel J Walsh
e06824a572
Merge pull request #1381 from makllama/xd/modelscope
...
Add support for modelscope and update doc
2025-05-12 06:17:32 -04:00
Michael Engel
1323b25a7a
Added unit and system test for generate change
...
Added unit tests for new parsing feature of --generate option as
well as for the refactored quadlet file generation. In addition,
a system test has been added to verify the output directory of
the --generate option works as expected.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-12 11:59:38 +02:00
Michael Engel
385a82ab69
Added support for expanding user directory in IniFile and PlainFile
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-12 11:59:37 +02:00
Michael Engel
0a90abe25a
Extended --generate option by output directory
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-12 11:59:37 +02:00
Michael Engel
56bea00ea7
Refactor kube generation by wrapping in file class
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-12 11:59:37 +02:00
Michael Engel
fe19a7384a
Refactor quadlet generation for use of configparser
...
Instead of writing the quadlet string manually, lets use the
configparser from the standard library. A slim wrapper class
has been added as well to simplify the usage of configparser.
In addition, the generated quadlets are not directly written to
file, but instead the inifile instances are returned. This
implies that the caller needs to do the write_to_file call and
enables writing simple unit tests for the generation.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-12 11:59:37 +02:00
Eric Curtin
107cd50e63
Merge pull request #1363 from melodyliu1986/melodyliu1986-feature-branch
...
update the shortnames path according to the shortnames.py
2025-05-12 10:53:20 +01:00
Eric Curtin
2b8cdbbe83
Merge pull request #1387 from makllama/xd/docker_build
...
Fix #1382
2025-05-12 10:44:30 +01:00
Eric Curtin
505061c0af
Merge pull request #1388 from nathan-weinberg/mac-no-bats
...
ci(fix): macOS runner didn't have bats
2025-05-12 10:43:31 +01:00
Song Liu
4beb41aca6
update the shortnames path according to the shortnames.py
...
Signed-off-by: Song Liu <soliu@redhat.com>
2025-05-12 16:01:11 +08:00
Nathan Weinberg
a80a556332
Merge pull request #1386 from containers/punctuation-consistency
...
Punctuation consistency when pulling models
2025-05-11 23:05:43 -04:00
Nathan Weinberg
2a1317936a
ci(fix): macOS runner didn't have bats
...
couldn't run e2e tests
also consolidated all package installs into one step
Signed-off-by: Nathan Weinberg <nathan2@stwmd.net>
2025-05-11 22:55:06 -04:00
Xiaodong Ye
1801378950
Fix #1382
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-12 10:32:23 +08:00
Xiaodong Ye
34af059f3d
Address issues found by CI
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-12 09:52:16 +08:00
Eric Curtin
8f112e7c0d
Punctuation consistency when pulling models
...
Around spacing
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-11 19:38:18 +01:00
Daniel J Walsh
b183d0e4f5
Merge pull request #1383 from makllama/xd/docker
...
Support older version of Docker
2025-05-11 11:46:09 -04:00
Eric Curtin
fad29e5bac
Merge pull request #1380 from antbbn/patch-1
...
Check nvidia-container-runtime executable also in engine.py
2025-05-11 15:39:57 +01:00
Eric Curtin
4505758ca2
Merge pull request #1384 from nathan-weinberg/more-ci-fixes
...
ci: additional fixes and cleanup for image build jobs
2025-05-11 13:56:49 +01:00
Xiaodong Ye
590199c9dd
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-11 14:04:03 +08:00
Xiaodong Ye
9e410944cb
Add unit tests
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-11 13:54:18 +08:00
Xiaodong Ye
1aac29e783
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-11 11:56:58 +08:00
Nathan Weinberg
f5313251ad
ci: additional fixes and cleanup for image build jobs
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-10 16:24:32 -04:00
Xiaodong Ye
d32d6ed6dd
Support older version of Docker
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-10 20:51:28 +08:00
Xiaodong Ye
9985b9ef75
Format changes for passing CI
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-10 20:48:37 +08:00
Antonio Bibiano
bff58da5d4
Update test in case nvidia-container-runtime binary is not present
...
Signed-off-by: Antonio Bibiano <antbbn@gmail.com>
2025-05-10 14:37:50 +02:00
Antonio Bibiano
5adaa9b8b8
Check nvidia-container-runtime executable also in engine.py
...
Signed-off-by: Antonio Bibiano <antbbn@gmail.com>
2025-05-10 14:14:24 +02:00
Xiaodong Ye
15da984fe0
Add support for modelscope
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-10 19:51:08 +08:00
Eric Curtin
177397c346
Merge pull request #1375 from nathan-weinberg/fix-image-jobs
...
ci: modify 'latest' job to only run on release
2025-05-10 12:31:06 +01:00
Daniel J Walsh
6392d3c7e9
Merge pull request #1378 from TristanCacqueray/vision-support
...
Update llama_cpp_sha to the latest version
2025-05-10 06:32:40 -04:00
Tristan Cacqueray
6e4c290ca2
Update llama_cpp_sha to the latest version
...
This change brings vision support to the rpc-server.
Signed-off-by: Tristan Cacqueray <tdecacqu@redhat.com>
2025-05-10 11:07:34 +02:00
Nathan Weinberg
1281879f9f
ci: modify 'latest' job to only run on release
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-09 13:10:20 -04:00
Daniel J Walsh
a9f5238082
Merge pull request #1340 from ieaves/feat/standardized-build
...
Remove hardcoded /usr/local site-packages injection to fix sys.path pollution
2025-05-09 10:55:44 -04:00
Daniel J Walsh
d381120860
Merge pull request #1373 from rhatdan/build
...
Make version optional in build
2025-05-09 10:33:29 -04:00
Daniel J Walsh
17d91fbe24
Merge pull request #1372 from nathan-weinberg/ci-tweaks
...
Various CI fixes
2025-05-09 10:15:14 -04:00
Daniel J Walsh
5260ad701a
Make version optional in build
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-09 10:14:04 -04:00
Nathan Weinberg
c8e992846f
ci(docs): add CI report matrix
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-09 09:28:16 -04:00
Nathan Weinberg
2db912c383
ci(chore): remove incorrect 'Fedora' message from install job
...
also remove some trailing whitespace
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-09 09:27:26 -04:00
Nathan Weinberg
01691fe899
ci(fix): fix regex for CI image job
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-09 09:27:26 -04:00
Nathan Weinberg
99bdf6097f
ci(fix): add 'make install-requirements' to 'latest' and 'nightly' jobs
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-09 09:27:26 -04:00
Eric Curtin
340c417820
Merge pull request #1370 from jelly/404-urls
...
Update not found urls
2025-05-09 13:17:21 +01:00
Eric Curtin
d740fc6135
Merge pull request #1367 from rhatdan/rag
...
Use python3.11 on systems with older python
2025-05-09 13:16:58 +01:00
Jelle van der Waa
3c253faed6
Update podman markdown links
...
Signed-off-by: Jelle van der Waa <jvanderwaa@redhat.com>
2025-05-09 14:16:14 +02:00
Jelle van der Waa
a08af92c55
Update llama.cpp documentation url
...
Signed-off-by: Jelle van der Waa <jvanderwaa@redhat.com>
2025-05-09 14:09:11 +02:00
Daniel J Walsh
a3beed7d14
Merge pull request #1369 from mcornea/cuda_all_devices
...
Use all GPUs in CUDA_VISIBLE_DEVICES as default
2025-05-09 06:10:56 -04:00
Marius Cornea
2a218f8bce
Use all GPUs in CUDA_VISIBLE_DEVICES as default
...
Currently the CUDA_VISIBLE_DEVICES environment variable defaults to '0'
when it's not overidden by the user. This commit updates it to include all
available GPUs detected by nvidia-smi, allowing the application to
utilize multiple GPUs by default.
Signed-off-by: Marius Cornea <mcornea@redhat.com>
2025-05-09 09:51:36 +03:00
Daniel J Walsh
317103e542
Use python3.11 on systems with older python
...
Fixes: https://github.com/containers/ramalama/issues/1362
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-08 13:27:45 -04:00
Eric Curtin
ae57590e66
Merge pull request #1366 from mikebonnet/fix-client-cmd
...
fix "ramalama client"
2025-05-08 16:55:05 +01:00
Eric Curtin
3ab554f155
Merge pull request #1359 from rhatdan/docling
...
Allow docling to handle URLs rather then handling locally
2025-05-08 16:04:46 +01:00
Mike Bonnet
2d7407cc90
fix "ramalama client"
...
get_cmd_with_wrapper() was changed in 849813f8
to accept a single string argument instead
of a list. Update cli.py to pass only the first element of the list.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-05-08 07:46:19 -07:00
Daniel J Walsh
0e616778eb
Merge pull request #1361 from mikebonnet/rag-build-tweaks
...
small improvements to the build of the ramalama-rag image
2025-05-08 08:46:27 -04:00
Daniel J Walsh
0530c1e6bf
Merge pull request #1364 from sarroutbi/202505081057-extend-tomlparser-coverity
...
Extend TOMLParser coverage to 100%
2025-05-08 08:39:29 -04:00
Daniel J Walsh
b5e6269e81
Merge pull request #1365 from sarroutbi/202505081128-groom-coverage-rules
...
Groom coverage rules, genreate xml/lcov reports
2025-05-08 08:37:01 -04:00
Sergio Arroutbi
1de0c27534
Groom coverage rules, genreate xml/lcov reports
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-08 11:36:42 +02:00
Sergio Arroutbi
675f302f1c
Extend TOMLParser coverage to 100%
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-08 11:12:49 +02:00
Daniel J Walsh
b4dc9ad977
Merge pull request #1355 from mcornea/fix_cuda_devices
...
Allow user-defined CUDA_VISIBLE_DEVICES environment variable
2025-05-07 14:18:59 -04:00
Marius Cornea
0db9aac978
Allow user-defined CUDA_VISIBLE_DEVICES environment variable
...
The check_nvidia function was previously overriding any user-defined
CUDA_VISIBLE_DEVICES environment variable with a default value of "0".
This change adds a check to only set CUDA_VISIBLE_DEVICES=0 when it's not
already present in the environment.
Signed-off-by: Marius Cornea <mcornea@redhat.com>
2025-05-07 21:08:55 +03:00
Mike Bonnet
e8415fc4da
build_rag.sh: set the pip installation prefix to /usr
...
This is consistent with how pip installs packages in the base ramalama image.
Remove some redundant package names from docling(), they're already installed in rag().
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-05-07 10:49:38 -07:00
Mike Bonnet
c4d9940e56
install git-core instead of git
...
Avoid pulling in a bunch of unnecessary perl packages.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-05-07 10:49:38 -07:00
Mike Bonnet
ed5e8a3dce
build_rag.sh: fix logic error when building from a UBI9-based image
...
"$VERSION_ID" is set to "9.5" when building from a UBI9-based image (the default). This fails
the "-ge" test. Check if "$ID" is "fedora" before assuming "$VERSION_ID" is an integer.
If python3.11 is getting installed, also install python3.11-devel explicitly.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-05-07 10:49:04 -07:00
Daniel J Walsh
8a0f0f2038
Allow docling to handle URLs rather then handling locally
...
Docling has support for pulling html pages, and we were not pulling them
correctly.
Also support --dryrun
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-07 13:04:04 -04:00
Daniel J Walsh
30c13c04d1
Merge pull request #1360 from nathan-weinberg/update-lls-container
...
chore: update curl commands in llama-stack Containerfile
2025-05-07 13:00:29 -04:00
Nathan Weinberg
071426e2e7
chore: update curl commands in llama-stack Containerfile
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-07 12:05:13 -04:00
Ian Eaves
785c66184b
updated build to remove setup.py dependency to fix cli entrypoint
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
removed uv.lock
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
reverts uv-install.sh, bin/ramalama, and flat cli hierarchy
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
packit version extraction from pyproject.toml
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
pyproject.toml references license file
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
fixed completion directory location
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
fixed format and check-format. There is no longer a root .py file to check
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
newline at end of install-uv.sh
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
remove *.py from make lint flake8 command
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
added import for ModelStoreImport to main
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
attempt to consolidate main functions
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
lint
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
Make bin/ramalama executable
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
typo
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-05-07 10:57:31 -05:00
Daniel J Walsh
7cc1052e0c
Merge pull request #1356 from sarroutbi/202505062329-add-test-tomlparser-unit-test
...
Add TOMLParser unit tests
2025-05-07 08:25:41 -04:00
Sergio Arroutbi
feb46d7c5c
Add TOMLParser unit tests
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-07 12:10:52 +02:00
Eric Curtin
a4cf5cca48
Merge pull request #1358 from sarroutbi/202505062219-install-coverity-tools-and-execute-them
...
Add coverage tools, run them via specific rules
2025-05-07 11:08:29 +01:00
Sergio Arroutbi
70806fa8ab
Add coverage tools, run them via specific rules
...
Added new rules to install/run specific coverity tools:
* install-detailed-cov-requirements: Install basic coverage tools
* install-cov-requirements: Install extended coverage tools
* cov-tests: Execute basic coverage tools
* detailed-cov-tests: Execute extended coverage tools
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-07 11:44:59 +02:00
Eric Curtin
279a5ff32c
Merge pull request #1353 from engelmi/use-model-type-instead-of-class-name
...
Use model type instead of class name
2025-05-06 21:44:56 +01:00
Michael Engel
5e5e35b4b5
Use model type instead of class name
...
Relates to: https://github.com/containers/ramalama/issues/1325
Follow-up of: https://github.com/containers/ramalama/pull/1350
Previously, the model_type member of the model store has been set to
the class name of the model, which mapped URL types like http or file
to url. This is now changed to use the model_type property of the
model class. It is, by default, still the inferred class name, except
in the URL class where it gets set to the URL scheme.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-06 15:39:50 +02:00
Daniel J Walsh
ae9c30d50c
Merge pull request #1345 from containers/change-to-cli-serve
...
Use CLI ramalama serve here
2025-05-06 08:51:20 -04:00
Eric Curtin
dfa7ec81db
Merge pull request #1349 from rhatdan/options
...
Consolidate and alphabetize runtime options
2025-05-06 13:50:41 +01:00
Eric Curtin
1496e13108
Use CLI ramalama serve here
...
It's easier to debug, we can do ps -ef, etc. Easier to code also.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-06 13:00:50 +01:00
Eric Curtin
da4f3b5489
Merge pull request #1350 from engelmi/fix-partial-model-listing
...
Fix partial model listing
2025-05-06 12:42:49 +01:00
Daniel J Walsh
3f95d053cc
Consolidate and alphabetize runtime options
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-06 07:31:06 -04:00
Daniel J Walsh
f57404ff2e
Merge pull request #1352 from sarroutbi/202505061305-minor-typo
...
Fix typo (RAMALAMA_TRANSPORTS->RAMALAMA_TRANSPORT)
2025-05-06 07:28:10 -04:00
Sergio Arroutbi
9beeb29c59
Fix typo (RAMALAMA_TRANSPORTS->RAMALAMA_TRANSPORT)
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-06 13:06:28 +02:00
Michael Engel
fa4f40f547
Map url:// prefix of model name to URL class
...
Relates to: https://github.com/containers/ramalama/issues/1325
In the list models function only the url:// prefix is present.
Passing a listed model to the factory can not map this model
input correctly to the URL model class. Therefore, this gets
extended and the unit tests updated by appropriate cases.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-06 12:31:44 +02:00
Michael Engel
53e51c72c5
Remove partial postfix from model name
...
Relates to: https://github.com/containers/ramalama/issues/1325
Instead of appending the (partial) identifier directly, the returned
ModelFile class is extended to indicate if the file is partially
downloaded or not.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-06 12:29:41 +02:00
Daniel J Walsh
3813a01e04
Merge pull request #1346 from rhatdan/VERSION
...
Bump to v0.8.2
2025-05-05 11:34:15 -04:00
Daniel J Walsh
982e70d51b
Bump to v0.8.2
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-05 10:35:02 -04:00
Daniel J Walsh
969e0f6b36
Merge pull request #1347 from containers/if-run-ramalama-exists
...
Only execute this if /run/ramalama exists
2025-05-05 10:34:52 -04:00
Eric Curtin
800224e6ee
Only execute this if /run/ramalama exists
...
Using this script to install llama.cpp and whisper.cpp bare metal
on a bootc system, the build stops executing here:
+ ln -sf /usr/bin/podman-remote /usr/bin/podman
+ python3 -m pip install /run/ramalama --prefix=/usr
ERROR: Invalid requirement: '/run/ramalama': Expected package name at the start of dependency specifier
/run/ramalama
^
Hint: It looks like a path. File '/run/ramalama' does not exist.
Error: building at STEP "RUN chmod a+rx /usr/bin/build_llama_and_whisper.sh && build_llama_and_whisper.sh "rocm"": while running runtime: exit status 1
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-05 14:40:55 +01:00
Eric Curtin
f1668aea45
Merge pull request #1344 from schuellerf/patch-1
...
Update ramalama-cuda.7.md
2025-05-05 13:32:56 +01:00
Florian Schüller
d6a21a9582
Update ramalama-cuda.7.md
...
Looks like a typo?
Signed-off-by: Florian Schüller <florian.schueller@redhat.com>
2025-05-05 11:24:03 +02:00
Eric Curtin
5670fc0d66
Merge pull request #1339 from benoitf/ignore-none
...
fix: ignore <none>:<none> images
2025-05-05 07:13:36 +01:00
Eric Curtin
c4f7aaa953
Merge pull request #1343 from xxiong2021/main
...
according to Commit 1d36b36, the files path was changed
2025-05-05 06:38:03 +01:00
Xiaoqiang Xiong
0b815654cf
according to Commit 1d36b36, the files path was changed
...
Signed-off-by: Xiaoqiang Xiong <xxiong@redhat.com>
2025-05-05 11:31:26 +08:00
Florent Benoit
d43b715d78
fix: ignore <none>:<none> images
...
related to https://github.com/containers/ramalama/issues/904
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-05-04 21:39:08 +02:00
Daniel J Walsh
34f8cf2d50
Merge pull request #1336 from rhatdan/llama-stack
...
INFERENCE_MODEL should be set by the container engine
2025-05-03 06:51:00 -04:00
Daniel J Walsh
d6b3b2da14
INFERENCE_MODEL should be set by the container engine
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-02 11:12:56 -04:00
Eric Curtin
449b7b7fbd
Merge pull request #1335 from rhatdan/llama-stack
...
llama stack run should be the CMD not run during build
2025-05-02 12:04:37 +01:00
Daniel J Walsh
e81cab92c4
llama stack run should be the CMD not run during build
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-02 06:34:54 -04:00
Daniel J Walsh
1846fb8611
Merge pull request #1328 from containers/add-comment
...
Describe what this test does
2025-05-02 06:11:10 -04:00
Daniel J Walsh
b18030fca3
Merge pull request #1334 from containers/ramalama-shell-fixes
...
RamaLamaShell fixes
2025-05-02 06:10:36 -04:00
Eric Curtin
ba485eec11
Merge pull request #1332 from containers/make-install-more-resiliant
...
Make installer more resilliant
2025-05-02 10:40:36 +01:00
Eric Curtin
995d0b1cd8
RamaLamaShell fixes
...
Was testing this, found some bugs, mainly caused by the recursive
call of cmdloop. Fixed this by using no recursion. Some
refactorings.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-02 10:39:16 +01:00
Eric Curtin
8ec74cb553
Merge pull request #1333 from bmahabirbu/mac-fix
...
Fixed mac gpu not being enabled from stale global var check
2025-05-02 09:33:46 +01:00
Eric Curtin
abb2cf47fa
Merge pull request #1329 from containers/mistral-small
...
Add shortnames for mistral-small3.1 model
2025-05-02 09:33:07 +01:00
Brian
8a02b1d0d3
Fixed mac gpu not being enabled from stale global var check
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-05-02 00:00:20 -04:00
Eric Curtin
37a56dbdbf
Make installer more resilliant
...
This checked in file is an exact copy of:
curl -LsSfO https://astral.sh/uv/0.7.2/install.sh
Checking in the 0.7.2 version, because now a user can install with
access to github.com alone. Even if astral.sh is down for whatever
reason.
We may want to update uv installer from time to time.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-01 20:08:28 +01:00
Daniel J Walsh
88a56ab35f
Merge pull request #1331 from rhatdan/llama-stack
...
Fixup use of /.venv
2025-05-01 14:47:18 -04:00
Daniel J Walsh
907cb41315
Fixup use of /.venv
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-01 14:38:14 -04:00
Daniel J Walsh
9a7444c6dc
Merge pull request #1330 from nathan-weinberg/container-fix
...
fix: additional fixes for llama-stack Containerfile
2025-05-01 14:35:47 -04:00
Nathan Weinberg
0ed1029e31
fix: additional fixes for llama-stack Containerfile
...
update locations of YAML files and fix typo with 'uv run'
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-01 14:05:48 -04:00
Eric Curtin
2c48af0175
Add shortnames for mistral-small3.1 model
...
Another Ollama model that's only compatible with Ollama's forking
of llama.cpp
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-01 17:48:46 +01:00
Eric Curtin
c7b92e1564
Describe what this test does
...
It failed for me once
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-01 16:57:32 +01:00
Daniel J Walsh
023f875427
Merge pull request #1327 from rhatdan/docs
...
Expose http line in man pages
2025-05-01 11:46:13 -04:00
Eric Curtin
39c29b2857
Merge pull request #1158 from containers/use-wrapper-everywhere
...
Turn on client/server implementation of run
2025-05-01 15:15:49 +01:00
Daniel J Walsh
e2f382ab62
Expose http line in man pages
...
The llama.cpp documentation links is lost when we convert markdown to
nroff format. This change will expose the link in man pages.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-01 09:45:12 -04:00
Eric Curtin
849813f8b7
Turn on client/server implementation of run
...
Now that we've had one release with the wrapper scripts included
in the container images it should be safe to turn this on
everywhere.
Only add libexec for commands that have wrappers
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-01 14:13:10 +01:00
Daniel J Walsh
38a57687dd
Merge pull request #1323 from rhatdan/docs
...
Switch all Ramalama to RamaLama
2025-05-01 06:15:01 -04:00
Daniel J Walsh
ee588cecee
Switch all Ramalama to RamaLama
...
Fix Ramalama names that have snuck into the repo.
Cleanup whitespace in README.md doc.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-01 06:13:42 -04:00
Daniel J Walsh
9b639c6172
Merge pull request #1320 from arburka/main
...
Update Docs page
2025-05-01 05:31:12 -04:00
Daniel J Walsh
2c4693a3bd
Merge pull request #1319 from arburka/patch-1
...
Updates to ReadMe doc
2025-05-01 05:27:12 -04:00
arburka
1de54a92e3
Content Update to ramalama docs/readme file
...
Signed-off-by: arburka <88330245+arburka@users.noreply.github.com>
2025-04-30 21:46:51 -04:00
arburka
20868bf17b
Updated Ramalama README.md for readability, clarity, and scanability improvments
...
Signed-off-by: arburka <88330245+arburka@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
2025-04-30 21:24:37 -04:00
Eric Curtin
8f6e135bb2
Merge pull request #1317 from rhatdan/llama-stack
...
Fix up several issue in llama-stack Containerfile
2025-04-30 20:16:18 +01:00
Daniel J Walsh
18333b431d
Fix up several issue in llama-stack Containerfile
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-30 15:12:42 -04:00
Daniel J Walsh
922ac6bc6b
Merge pull request #1314 from nathan-weinberg/lls-container
...
feat: update llama-stack Containerfile to use ramalama-stack
2025-04-30 15:00:58 -04:00
Daniel J Walsh
ad27acb095
Merge pull request #1312 from containers/simplify-install-script
...
Simplify installer
2025-04-30 13:57:48 -04:00
Eric Curtin
f790ae4361
Simplify installer
...
Use uv installer from uv itself
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-30 16:38:58 +01:00
Nathan Weinberg
e5dde374b7
feat: update llama-stack Containerfile to use ramalama-stack
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-30 11:30:08 -04:00
Daniel J Walsh
21257c3d79
Merge pull request #1311 from dougsland/smi
...
common: adjust nvidia-smi for check cuda version
2025-04-30 09:47:12 -04:00
Douglas Landgraf
393a67b9b0
common: adjust nvidia-smi for check cuda version
...
Be compatible with both versions of nvidia-smi
with version flag or not.
Signed-off-by: Douglas Landgraf <dlandgra@redhat.com>
2025-04-30 09:05:24 -04:00
Daniel J Walsh
db8f30ec18
Merge pull request #1292 from containers/pass-args-to-ramalama-run-core
...
Pass args to ramalama run core
2025-04-30 08:14:46 -04:00
Eric Curtin
bb259ad7af
Pass args to *core scripts
...
Ensure arguments are passed to *core scripts
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-30 11:55:44 +01:00
Daniel J Walsh
a13764c363
Merge pull request #1309 from sarroutbi/202504292004-avoid-unused-parameter
...
Avoid unused parameter
2025-04-29 15:49:55 -04:00
Daniel J Walsh
3fd4fdb4d1
Merge pull request #1310 from rhatdan/main
...
Bump to 0.8.1
2025-04-29 15:29:06 -04:00
Daniel J Walsh
e1f84cb1b9
Bump to 0.8.1
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-29 14:15:52 -04:00
Sergio Arroutbi
d1c0eda2aa
Avoid unused parameter
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-29 20:05:44 +02:00
Daniel J Walsh
b9ffbf8955
Merge pull request #1307 from engelmi/catch-possible-go-to-jinja-error
...
Catch possible error when parsing Go to Jinja template
2025-04-29 12:29:47 -04:00
Eric Curtin
c160628ef8
Merge pull request #1305 from sarroutbi/202504291518-avoid-usage-of-reserved-words
...
Avoid reserved words usage and fix format
2025-04-29 16:53:04 +01:00
Michael Engel
98c2af92f5
Catch possible error when parsing Go to Jinja template
...
Parsing a chat template in Go-syntax to a Jinja template might raise an
exception. Since this is only a nice-to-have feature and we fallback to
the chat template specified in the backend, lets silently skip it.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-29 17:52:40 +02:00
Sergio Arroutbi
4c20cdc392
Avoid reserved words usage and fix format
...
* Avoid reserved words usage such as `hash` or `all`
* Fix format
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-29 15:23:00 +02:00
Daniel J Walsh
4a8477c8e2
Merge pull request #1298 from melodyliu1986/melodyliu1986-feature-branch
...
Fix the error: ramalama login can NOT get the value of RAMALAMA_TRANSPORT
2025-04-29 06:44:33 -04:00
Daniel J Walsh
36e2055426
Merge pull request #1299 from rhatdan/pull
...
Report on the use of cached models
2025-04-29 06:43:09 -04:00
Song Liu
d226d1fb17
Fix the error: ramalama login can NOT get the value of env var RAMALAMA_TRANSPORT
...
Signed-off-by: Song Liu <soliu@redhat.com>
2025-04-29 15:49:05 +08:00
Daniel J Walsh
2fba91c28e
Report on the use of cached models
...
Don't attempt to pull a model when inspecting it
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-28 14:48:54 -04:00
Daniel J Walsh
c999b7fbe0
Merge pull request #1301 from rhatdan/VERSION
...
Fix rpm scripts to correct version
2025-04-28 14:26:23 -04:00
Daniel J Walsh
8a060e4611
Fix rpm scripts to correct version
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-28 14:25:04 -04:00
Daniel J Walsh
a963045414
Merge pull request #1284 from rhatdan/VERSION
...
Bump to v0.8.0
2025-04-28 14:20:08 -04:00
Daniel J Walsh
3050d9393f
Merge pull request #1300 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1745848351
2025-04-28 14:18:22 -04:00
renovate[bot]
0459d6b79e
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1745848351
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-04-28 16:12:17 +00:00
Daniel J Walsh
381dd738aa
Bump to v0.8.0
...
Stop using installed library version.
Fixes: https://github.com/containers/ramalama/issues/1297
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-28 07:36:34 -04:00
Daniel J Walsh
b4aaba551b
Merge pull request #1294 from rhatdan/cache
...
Make --no-cache optional for make build
2025-04-28 07:24:17 -04:00
Daniel J Walsh
3b2cae4691
Merge pull request #1295 from rhatdan/info
...
Add shortname information to ramalama info
2025-04-28 07:23:55 -04:00
Daniel J Walsh
715dffbb53
Make --no-cache optional for make build
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-28 06:59:13 -04:00
Daniel J Walsh
75600a6c36
Add shortname information to ramalama info
...
Fixes: https://github.com/containers/ramalama/issues/1263
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-27 08:00:25 -05:00
Daniel J Walsh
aefb785494
Merge pull request #1282 from rhatdan/build
...
Use current ramalama directory rather them main from repo
2025-04-26 08:23:05 -04:00
Daniel J Walsh
6c59fd7fd5
Use currenct ramalama directory rather them main from repo
...
This allows users to experiment with content and get it into
container image.
Fixes: https://github.com/containers/ramalama/issues/1274
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-26 07:04:59 -05:00
Daniel J Walsh
193ec04793
Merge pull request #1290 from containers/fix-for-finding-correct-config-file
...
Fix to check correct directory for shortnames file
2025-04-26 07:51:37 -04:00
Daniel J Walsh
cf2cb3b570
Merge pull request #1291 from containers/add-exit
...
Exit ramalama if user types exit
2025-04-26 07:49:39 -04:00
Daniel J Walsh
63eccb105a
Merge pull request #1293 from containers/run-server-ignore-ctrl-c
...
Change CLI behaviour
2025-04-26 07:46:19 -04:00
Eric Curtin
fc78d0bf18
Change CLI behaviour
...
Don't exit on Ctrl-C, cut response short or print an info message
to the user telling them how they may exit.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-26 12:23:20 +01:00
Eric Curtin
db6cdb5296
Exit ramalama if user types exit
...
To behave like bash or python3
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-25 23:19:01 +01:00
Eric Curtin
7d07701b1a
Fix to check correct directory for shortnames file
...
If installed with uv the correct directory wasn't being looked for
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-25 22:27:43 +01:00
Daniel J Walsh
c58f56e112
Merge pull request #1287 from olivergs/workaroundcuda
...
Workaround for CUDA image not pointing to libcuda.so.1 in ld.so.conf
2025-04-25 17:07:25 -04:00
Daniel J Walsh
ae1adcc12f
Merge pull request #1288 from containers/shortnames-gemma3
...
Add gemma3 shortnames
2025-04-25 17:05:03 -04:00
Oliver Gutierrez
a22e209551
Workaround for CUDA image not pointing to libcuda.so.1 in ld.so.conf
...
libcuda.so.1 is located at /usr/local/cuda-12.8/compat and that path
is not in any /etc/ld.so.conf.d/* files.
The workaround is to simply add the path and run ldconfig to make it
available.
Signed-off-by: Oliver Gutierrez <ogutsua@gmail.com>
2025-04-25 19:39:36 +01:00
Eric Curtin
35cbabdb74
Add gemma3 shortnames
...
Otherwise we will pull the incompatible gguf's from Ollama
registry.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-25 18:06:06 +01:00
Eric Curtin
7fb4710b2e
Merge pull request #1285 from sarroutbi/202504251610-fix-issues-reported-by-checkmake
...
Fix minor issues
2025-04-25 16:53:13 +01:00
Eric Curtin
0bd07f9aa6
Merge pull request #1286 from sarroutbi/202504251655-use-camel-case-for-consistency
...
Use RamaLama instead of Ramalama for consistency
2025-04-25 16:00:36 +01:00
Sergio Arroutbi
e6e8b1e881
Use RamaLama instead of Ramalama for consistency
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-25 16:56:01 +02:00
Sergio Arroutbi
837b0c99c2
Fix minor issues
...
- Include test rule in global Makefile
- Use GO variable in doc/Makefile
- Ramalama->RamaLama
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-25 16:53:03 +02:00
Eric Curtin
cde9e1ae48
Merge pull request #1283 from engelmi/sanitize-filename-on-migrate
...
Remove model tag from file name on migration
2025-04-25 14:58:55 +01:00
Michael Engel
a09e711ceb
Remove model tag from file name on migration
...
Relates to: https://github.com/containers/ramalama/issues/1278
Remove the model tag including the : symbol from the file name on
migration from the old to new store. Also, rename the sanitize_hash
to sanitize_filename function.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-25 15:24:54 +02:00
Daniel J Walsh
efa333848d
Merge pull request #1281 from containers/revert-1271-spdx
...
Revert "Replace project.license with an SPDX license expression."
2025-04-25 08:53:53 -04:00
Daniel J Walsh
10d314dada
Merge pull request #1280 from containers/cli-advancements
...
Remove hardcodeing to ecurtin $HOME
2025-04-25 08:40:45 -04:00
Eric Curtin
0423c62a29
Merge pull request #1273 from rhatdan/docs
...
Add information on configuring the libkrun machine provider
2025-04-25 13:38:27 +01:00
Eric Curtin
3b60c77e88
Merge pull request #1279 from rhatdan/draftmodel
...
Fix up description of draft model
2025-04-25 13:37:40 +01:00
Eric Curtin
4381b3a799
Revert "Replace project.license with an SPDX license expression."
...
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-25 13:35:19 +01:00
Eric Curtin
0af2c37984
Remove hardcodeing to ecurtin $HOME
...
This was left in by mistake
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-25 13:19:13 +01:00
Daniel J Walsh
40d1e17e7c
Merge pull request #1270 from jwieleRH/validate
...
Fix formatting as suggested by isort so that "make validate" passes.
2025-04-25 08:03:53 -04:00
Daniel J Walsh
702b44853f
Fix up description of draft model
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-25 07:02:16 -05:00
Eric Curtin
c5908c3cc2
Merge pull request #1255 from afazekas/draft_model
...
Initial draft model support
2025-04-25 12:54:08 +01:00
John Wiele
883534363e
Fix formatting as suggested by isort so that "make validate" passes.
...
Signed-off-by: John Wiele <jwiele@redhat.com>
2025-04-25 07:48:43 -04:00
Daniel J Walsh
c27ab276b9
Add information on configuring the libkrun machine provider
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-25 06:43:03 -05:00
Daniel J Walsh
99639c8e19
Merge pull request #1277 from sarroutbi/202504251255-enhance-pylint-mark
...
Enhance pylint mark in ramalama/cli.py
2025-04-25 07:35:22 -04:00
Sergio Arroutbi
762e20af46
Enhance pylint mark in ramalama/cli.py
...
- Use preferred f-string
- Minimum code refactoring
- Remove unnecessary comments
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-25 13:16:52 +02:00
Eric Curtin
b047195d29
Merge pull request #1271 from jwieleRH/spdx
...
Replace project.license with an SPDX license expression.
2025-04-25 11:07:28 +01:00
Eric Curtin
2f1db29e8b
Merge pull request #1272 from rhatdan/VERSION
...
Use --quiet for ramalama version
2025-04-25 11:06:48 +01:00
Eric Curtin
ee621c0e42
Merge pull request #1276 from sarroutbi/202504251018-minor-changes-to-enhance-pylint-results-for-toml-parser
...
Add minor changes to enhance pylint mark
2025-04-25 10:53:50 +01:00
Sergio Arroutbi
e45c731e80
Add minor changes to enhance pylint mark
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-25 10:18:47 +02:00
Daniel J Walsh
c327936811
Merge pull request #1266 from edmcman/cuda-12.4
...
Automatically pick cuda docker container
2025-04-24 22:58:21 -04:00
Daniel J Walsh
615ea2cd73
Use --quiet for ramalama version
...
Makes figuring out what the version of ramalama is easier.
Fixes: https://github.com/containers/ramalama/issues/1258
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-24 21:56:19 -05:00
John Wiele
def274c681
Replace project.license with an SPDX license expression.
...
`project.license` as a TOML table is deprecated. The new format for
license is a valid SPDX license expression consisting of one or more
license identifiers.
See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license .
Signed-off-by: John Wiele <jwiele@redhat.com>
2025-04-24 18:33:34 -04:00
Edward J. Schwartz
83712b7cd8
Fix cuda version logic
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-24 11:36:30 -04:00
Edward J. Schwartz
3d8804877a
Automatically pick cuda docker container
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-24 10:53:41 -04:00
Attila Fazekas
d0b983383e
Initial draft model support
...
Allows to pass draft model to serve and fetching it when needed.
'run' does not supports passing draft_model.
You should also pass draft related args tuned to your combination
and do not forget to set the sampling parameters like top_k
on the UI.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-24 14:28:38 +02:00
Eric Curtin
c43013d62d
Merge pull request #1265 from containers/bump-llamacpp
...
We need fixes around CPU support etc.
2025-04-24 13:23:11 +01:00
Eric Curtin
53ad1b58d4
We need fixes around CPU support etc.
...
And other enhancements.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-24 12:49:53 +01:00
Daniel J Walsh
e1f8fb8b6b
Merge pull request #1254 from rhatdan/pull
...
Change default testing to use --pull=missing
2025-04-23 16:20:08 -04:00
Daniel J Walsh
01b5014f7c
Change default testing to use --pull=missing
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-23 14:29:12 -04:00
Daniel J Walsh
173ebead83
Merge pull request #1189 from engelmi/set-model-store-as-default
...
Set model store as default
2025-04-23 14:04:08 -04:00
Daniel J Walsh
de8eeaee79
Merge pull request #1256 from containers/remove-more-values
...
Deleting default values for bug reports
2025-04-23 14:03:41 -04:00
Eric Curtin
441ede951e
Deleting default values for bug reports
...
You have to delete this text in every box or it gets left around,
polluting the issue.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-23 18:56:43 +01:00
Eric Curtin
fb9bc738e6
Merge pull request #1236 from rhatdan/engine
...
Move model and rag to use shared Engine implementation
2025-04-23 18:23:46 +01:00
Daniel J Walsh
9fb76b6f73
Move model and rag to use shared Engine implementation
...
Shrink the size of cly.py and model.py by moving all engine
related functions into new engine.py python module.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-23 11:08:55 -04:00
Michael Engel
d8183a85a8
Refactor convert and push cli for model store usage
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
528da77904
Fixed system tests
...
Fixed system tests which broken by switching from old to new
model store.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
754d64b594
Added ref file is None check to run and serve
...
The ref file could be not available, e.g. when running with --dryrun,
so the retrieved ref file instance is None. By checking this we
prevent ramalama from crashing.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
9a18d6f312
Pass on KeyError for first remove in OCI model
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
0a61ff7ef2
Align remove behavior of new to old store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
0e2940becd
Use new model store by default
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
1cda889ca7
Added migrate script to import models to new store
...
The migration is run on each command to import all models
from the old store to the new one. It also removes the old
directories and creating the old structure is prevented.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
413354a778
Do not pass chat template to llama.cpp
...
Relates to: https://github.com/containers/ramalama/issues/1202
Passing the chat template file to the model run or serve leads to bad
results recently. As a temporary fix the template is not passed to the
model run.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:23 +02:00
Michael Engel
f90e2087d3
Check for model to exist when ensuring chat template
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:23 +02:00
Michael Engel
fbc37b06bd
Update ref file list when files to download are not found
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:23 +02:00
Michael Engel
c8b151fadd
Added model name to base directory path in model store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:23 +02:00
Eric Curtin
f7d39c4dcf
Merge pull request #1253 from rhatdan/oci
...
Verify OCI image format in rag command
2025-04-23 13:58:07 +01:00
Daniel J Walsh
50287a6341
Merge pull request #1250 from containers/jinja
...
Allow jinja argument
2025-04-23 08:05:17 -04:00
Daniel J Walsh
00eae2130b
Merge pull request #1252 from containers/add-awk
...
Add gawk
2025-04-23 08:03:52 -04:00
Eric Curtin
8e42df4615
Add gawk
...
For awk binary
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-23 12:33:41 +01:00
Daniel J Walsh
23e039385e
Verify OCI image format in rag command
...
Fixes: https://github.com/containers/ramalama/issues/1244
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-23 07:30:04 -04:00
Daniel J Walsh
425e1a8cc2
Merge pull request #1246 from bmahabirbu/preload-docling
...
preload docling models
2025-04-23 06:49:40 -04:00
Daniel J Walsh
caa2f082cb
Merge pull request #1249 from containers/simplify-github-issues
...
Deleting default values for issues
2025-04-23 06:48:38 -04:00
Eric Curtin
f4a2dd2d86
Allow jinja argument
...
It's on everywhere anyway, this just ensures these wrapper scripts
don't crash if jinja gets passed.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-23 11:48:01 +01:00
Eric Curtin
7417a56b80
Deleting default values for issues
...
You have to delete this text in every box or it gets left around,
polluting the issue.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-23 11:30:35 +01:00
Eric Curtin
b3008dd95c
Merge pull request #1245 from nathan-weinberg/bot-docs
...
docs: update CONTRIBUTING.md section on bots
2025-04-23 10:29:18 +01:00
Eric Curtin
604697f71d
Merge pull request #1243 from rhatdan/docling
...
support AI Models environment variables doc2rag/rag_framework
2025-04-23 10:28:11 +01:00
Nathan Weinberg
329f26acb8
docs: update CONTRIBUTING.md section on bots
...
Signed-off-by: Nathan Weinberg <nathan2@stwmd.net>
2025-04-22 21:26:28 -04:00
Brian
3522a2d709
preload docling models
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-04-22 20:53:02 -04:00
Daniel J Walsh
72678b983d
support AI Models environment variables doc2rag/rag_framework
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-22 14:12:37 -04:00
Daniel J Walsh
e9a1ed392e
Merge pull request #1242 from containers/client-server-fix-containers
...
Fix for using client/server version of "ramalama run"
2025-04-22 13:38:01 -04:00
Eric Curtin
e387d3ddc2
Fix for using client/server version of "ramalama run"
...
When using containers we need this check so the code doesn't start
going into some pulling logic.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-22 17:49:53 +01:00
Daniel J Walsh
6768890a82
Merge pull request #1241 from afazekas/jinja-run
...
Always use --jinja with run too
2025-04-22 11:18:53 -04:00
Attila Fazekas
47eaa922b8
--temp 0 in test
...
temp 0 significatly reduces the sampling making unexpected output,
in many cases it makes the inference to always produce the same
output. Small modles are likely to get into a loop unless
the sampling is tuned.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-22 17:00:29 +02:00
Daniel J Walsh
b261cb57a8
Merge pull request #1235 from rhatdan/cuda
...
Allow building older versions of cuda
2025-04-22 10:49:21 -04:00
Attila Fazekas
486c27da81
Always use --jinja with run too
...
`ramalama serve` already using --jinja by default,
`ramalama run` should do it too.
fixes : #1212
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-22 15:48:46 +02:00
Eric Curtin
4e77444a46
Merge pull request #1240 from afazekas/sai-1239
...
Fix Typo in Clustering Placeholder Comment
2025-04-22 12:47:10 +01:00
Attila Fazekas
cd9b51415b
Fix Typo in Clustering Placeholder Comment
...
fixes : #1239
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-22 13:32:51 +02:00
Eric Curtin
da3dfea987
Merge pull request #1237 from melodyliu1986/melodyliu1986-feature-branch
...
Fix: huggingface logout doesn't use token
2025-04-22 12:14:50 +01:00
Eric Curtin
51040c5ca5
Merge pull request #1238 from afazekas/rpc_tmp_cleint
...
ad-hoc llama clustering option
2025-04-22 12:14:26 +01:00
Attila Fazekas
5bd53340a9
ad-hoc llama clustering option
...
RAMALAMA_LLAMACPP_RPC_NODES allow you to use
rpc nodes from other places.
Example:
Worker node:
$ podman run --replace --name llama_rpc_cuda_0 -it --gpus=all \
--runtime /usr/bin/nvidia-container-runtime --network host \
quay.io/ramalama/cuda /usr/bin/rpc-server -p 50052 -H 0.0.0.0
or
$ podman run --replace --name llama_rpc_cuda_0 -it --gpus=all \
--runtime /usr/bin/nvidia-container-runtime -p 50052:50052 \
quay.io/ramalama/cuda /usr/bin/rpc-server -p 50052 -H 0.0.0.0
Client node (rocm):
$ RAMALAMA_LLAMACPP_RPC_NODES=192.168.142.5:50052 ramalama serve qwq:32b-q8_0 --ctx 8192
output:
load_tensors: loading model tensors, this can take a while... (mmap = true)
load_tensors: offloading 64 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 65/65 layers to GPU
load_tensors: RPC[192.168.142.5:50052] model buffer size = 19271.03 MiB
load_tensors: CPU_Mapped model buffer size = 788.91 MiB
load_tensors: ROCm0 model buffer size = 13142.15 MiB
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-22 12:53:38 +02:00
Song Liu
7fe4f00589
Fix: huggingface logout doesn't use token
...
Signed-off-by: Song Liu <soliu@redhat.com>
2025-04-22 15:00:36 +08:00
Daniel J Walsh
c66e931b7a
Allow building older versions of cuda
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-21 13:51:10 -04:00
Daniel J Walsh
255d43438a
Merge pull request #1233 from rhatdan/release
...
Bump to v0.7.5
2025-04-21 10:27:23 -04:00
Daniel J Walsh
2147ca83ac
Bump to v0.7.5
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-21 06:12:47 -04:00
Daniel J Walsh
0412732674
Merge pull request #1231 from afazekas/f42
...
Switch all f41 container to f42
2025-04-21 06:12:19 -04:00
Daniel J Walsh
47cfeffcf9
Merge pull request #1230 from edmcman/tag-refactor
...
Refactor exception handling of huggingface pull operation
2025-04-21 06:10:27 -04:00
Daniel J Walsh
c9115ff6cb
Merge pull request #1232 from melodyliu1986/melodyliu1986-feature-branch
...
Fix bug in login_cli and update huggingface or hf registry behavior
2025-04-21 06:05:28 -04:00
Song Liu
1c3027d4be
Fix bug in login_cli and update huggingface or hf registry behavior
...
Signed-off-by: Song Liu <soliu@redhat.com>
2025-04-21 15:30:07 +08:00
Edward J. Schwartz
353c6360f2
Move directory check to be executed once
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-20 11:30:29 -04:00
Edward J. Schwartz
36aa6068b0
Formatting
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-20 10:24:04 -04:00
Edward J. Schwartz
258831631e
Refactor ollama repo downloading utilities
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-20 10:18:18 -04:00
Edward J. Schwartz
97bb3618ad
Refactor exception handling of huggingface pull operation
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-20 09:46:17 -04:00
Attila Fazekas
d8ff9d0496
Switch all f41 container to f42
...
Actually only 1 container left on f41,
making all to use f42.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-20 15:04:42 +02:00
Daniel J Walsh
243142c1e6
Merge pull request #1226 from afazekas/intel-gpu-rag
...
fix: intel-gpu-rag build
2025-04-20 06:26:29 -04:00
Attila Fazekas
145d0fa0f8
fix rocm-ubi container build
...
close : #1222
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-20 07:56:13 +02:00
Attila Fazekas
95455b94ec
fix: intel-gpu-rag build
...
* intle-gpu the only rag user container with f41, moving to f42
* dependencies referenced by git url, adding git package
* numpy compile requires gcc-c++, python3-devel
* f42 has python3-sentencepiece same version (no compile)
Fixes issues with several other rag containers too
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-19 21:50:58 +02:00
Eric Curtin
2d84b42ace
Merge pull request #1227 from afazekas/llama-rpc
...
Enable llama.cpp rpc feature in containers
2025-04-19 17:58:24 +01:00
Attila Fazekas
58fc144c87
Enable llama.cpp rpc feature in containers
...
Enable both server and client support for rpc.
The feature currently PoC in llama.cpp, but can work in practice.
Required for distributed inference.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-19 18:47:25 +02:00
Daniel J Walsh
b7f89a4d7b
Merge pull request #1225 from bmahabirbu/rag-opt
...
Optimized doc2rag for reduced ram and fixed batch size
2025-04-19 07:11:25 -04:00
Brian
75e8391d43
Optimized doc2rag for reduced ram and fixed batch size
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-04-18 21:07:54 -04:00
Daniel J Walsh
b683ed2e19
Merge pull request #1224 from afazekas/intel-gpu
...
fix intel-gpu container build
2025-04-18 16:37:38 -04:00
Attila Fazekas
b6388f6c84
fix intel-gpu container build
...
49656ee3e9
removed one line
but did not removed the line join, leading to build failures.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-18 22:33:25 +02:00
Eric Curtin
487b255226
Merge pull request #1223 from lsm5/packit-copr-rpm-version
...
Packit: use latest version for rpm
2025-04-18 19:07:04 +01:00
Daniel J Walsh
a5e602d535
Merge pull request #1219 from rhatdan/release
...
Tag images on push with digests, so they are permanent
2025-04-18 11:43:12 -04:00
Lokesh Mandvekar
c030c3f0bb
Packit: use latest version for rpm
...
Packit by default uses `git describe` for rpm version in copr builds.
This can often lag behind the latest release thus making it impossible
to update default distro builds with copr builds.
Ref: https://copr.fedorainfracloud.org/coprs/rhcontainerbot/podman-next/package/python-ramalama/
The latest build in there still shows version: `0.7.3` when v0.7.4 is
the latest upstream release.
This commit adds a packit action to modify the spec file which fetches
version info from setup.py.
The rpm release info is also modified such that it will update over the
latest distro package in almost all circumstances, assuming no distro
package will have a release 1001+.
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-04-18 21:05:22 +05:30
Daniel J Walsh
807bd54fc5
Tag images on push with digests, so they are permanent
...
We have accidently overwridden the images release version
if we also tag by digest, then we will not destroy the
image or manifest list. Since Podman Desktop AI Lab Recipes
relies on the image digest this makes it safer for them.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-18 10:30:37 -04:00
Eric Curtin
0dfb51b2a7
Merge pull request #1218 from rhatdan/completion
...
Improve shell completions for all arguments
2025-04-18 11:00:39 +01:00
Daniel J Walsh
8dada5a934
Improve shell completions for all arguments
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-17 19:52:03 -04:00
Daniel J Walsh
513f6eb119
Merge pull request #1217 from rhatdan/llama-stack
...
Fixes for llama-stack image to build and install
2025-04-17 12:45:51 -04:00
Daniel J Walsh
4497be31a5
Fixes for llama-stack image to build and install
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-17 12:40:25 -04:00
Daniel J Walsh
5ff8ebdcde
Merge pull request #1216 from rhatdan/openvino
...
Fix release scripts for openvino
2025-04-17 12:28:14 -04:00
Daniel J Walsh
a67538f9d9
Fix release scripts for openvino
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-17 12:26:11 -04:00
Daniel J Walsh
25ebe31ab9
Merge pull request #1213 from afazekas/llama-stack-entry
...
llama-stack relative COPY
2025-04-17 12:07:43 -04:00
Eric Curtin
54060dabaf
Merge pull request #1214 from afazekas/rocm-ubi
...
rocm-ubi repo path fix
2025-04-17 16:52:49 +01:00
Attila Fazekas
0ae62b90e4
llama-stack relative COPY
...
cdb6df6877
recently added
the entrypoint.sh however the Containerfile does not have
relative path to container-images/ as for example
intel-gpu has.
container_build.sh uses container-images as working directory.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-17 17:14:59 +02:00
Attila Fazekas
d40d221049
rocm-ubi repo path fix
...
rocm-ubi pointed to a wrong path for the repo files,
this change fixing it.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-17 17:14:10 +02:00
Daniel J Walsh
7fa760c5c6
Merge pull request #1215 from Ferenc-/fix-cuda-link
...
Fix link to ramalama-cuda
2025-04-17 11:06:58 -04:00
Eric Curtin
2e0a314746
Merge pull request #1208 from rhatdan/man
...
Ship nvidia and cann man pages
2025-04-17 14:45:06 +01:00
Daniel J Walsh
2afc6f1433
Merge pull request #1209 from kush-gupt/to-gguf
...
Add --gguf option to convert Safetensors using llama.cpp scripts and functionality
2025-04-17 09:23:38 -04:00
Daniel J Walsh
969420096f
Merge pull request #1211 from rhatdan/nvidia
...
Only use nvidia-container-runtime if it is installed
2025-04-17 08:16:46 -04:00
Ferenc Géczi
611474739f
Fix link to ramalama-cuda
...
Signed-off-by: Ferenc Géczi <ferenc.geczi@ibm.com>
2025-04-17 12:00:00 +00:00
Eric Curtin
ee6246954b
Merge pull request #1210 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1744101466
2025-04-17 10:22:24 +01:00
Daniel J Walsh
49656ee3e9
Only use nvidia-container-runtime if it is installed
...
Also we no longer need to ship openvino with intel containerfile
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-17 05:11:39 -04:00
renovate[bot]
f82358b5ab
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1744101466
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-04-17 09:09:45 +00:00
Daniel J Walsh
28cfc5b1c1
Merge pull request #1183 from bmahabirbu/openvino
...
Create openvino model server image and add it quay.io/ramalama
2025-04-17 05:09:21 -04:00
Daniel J Walsh
47e39a75e9
Merge pull request #1207 from rhatdan/debug
...
Quote strings with spaces in debug mode
2025-04-17 05:08:20 -04:00
Kush Gupta
7cd56bd8f0
formatting
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-04-17 03:09:00 -04:00
Kush Gupta
09f3b3826e
add gguf option to convert
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-04-17 02:47:09 -04:00
Brian
cb8630e9aa
add openvino model server image to quay.io/ramalama
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-04-16 18:25:31 -04:00
Daniel J Walsh
11349e36c3
Ship nvidia and cann man pages
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 17:33:36 -04:00
Daniel J Walsh
ed838da4d1
Quote strings with spaces in debug mode
...
Currently if you run in Debug mode and attempt to cut and paste the
Podman or Docker line, the PROMPT field has a space with a > in it.
When pasted this causes issues since it is not properly quoted.
With this change the the command can be successfully cut and pasted.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 17:12:59 -04:00
Daniel J Walsh
79aa781fc4
Merge pull request #1201 from rhatdan/docling
...
Fix doc2rag warning
2025-04-16 15:15:44 -04:00
Eric Curtin
e010fb12ab
Merge pull request #1203 from rhatdan/VERSION
...
Add newver.sh script
2025-04-16 16:10:13 +01:00
Daniel J Walsh
7a6cc8968e
Add newver.sh script
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 10:04:51 -04:00
Daniel J Walsh
aafcf6af6e
Fix doc2rag warning
...
/usr/bin/doc2rag:46: DeprecationWarning: Use contextualize() instead.
doc_text = chunker.serialize(chunk=chunk)
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 09:48:02 -04:00
Eric Curtin
748d399242
Merge pull request #1196 from rhatdan/pull
...
Default to --pull=newer for ramalama rag command.
2025-04-16 14:43:41 +01:00
Daniel J Walsh
64d28fbc57
Handle --pull=newer on Docker
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 09:20:18 -04:00
Daniel J Walsh
181064d59a
Default to --pull=newer for ramalama rag command.
...
Fixes: https://github.com/containers/ramalama/issues/1192
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 05:53:14 -04:00
Daniel J Walsh
7541c654e3
Merge pull request #1199 from rhatdan/llama-stack
...
Add missing entrypoint.sh
2025-04-16 05:41:06 -04:00
Daniel J Walsh
cdb6df6877
Add missing entrypoint.sh
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 05:39:50 -04:00
Daniel J Walsh
0111eaffc9
Merge pull request #1197 from rhatdan/llama-stack
...
Setup /venv for running llama-stack
2025-04-16 05:36:17 -04:00
Daniel J Walsh
d39d878d19
Setup /venv for running llama-stack
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-15 20:44:27 -04:00
Eric Curtin
763c10f79d
Merge pull request #1193 from marcnuri-forks/feat/container_image_ctx_size
...
feat: add CTX_SIZE env config to container-images llama-server.sh
2025-04-15 18:02:02 +01:00
Daniel J Walsh
dcc810c81a
Merge pull request #1160 from containers/install-script-fix
...
Also hardcode version into version.py as fallback
2025-04-15 09:25:19 -04:00
Eric Curtin
5988cdcfc2
Also hardcode version into version.py as fallback
...
This way we should always return the current version
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-15 13:07:48 +01:00
Daniel J Walsh
2356e2f4a2
Merge pull request #1195 from containers/readme-update
...
macOS tip to install homebrew
2025-04-15 07:42:27 -04:00
Eric Curtin
ed4046fb82
macOS tip to install homebrew
...
It's a requirement for macOS installs.
Co-Authored-By: Rashid Khan <rkhan@redhat.com>
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-15 12:10:29 +01:00
Eric Curtin
32feccf75e
Merge pull request #1194 from leo-pony/main
...
fix llama.cpp CANN backend x86 build failing issue
2025-04-15 11:20:41 +01:00
leo-pony
a444fff302
fix llama.cpp cann backend x86 build failing issue: update llama.cpp to the new commit that has fixed this buiild issue.
...
Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-04-15 16:48:32 +08:00
Marc Nuri
f0e2edd938
feat: CTX_SIZE env config in container-images llama-server.sh is optional
...
Signed-off-by: Marc Nuri <marc@marcnuri.com>
2025-04-15 09:08:47 +02:00
Marc Nuri
02bacb8daf
feat: add CTX_SIZE env config to container-images llama-server.sh
...
Relates to https://github.com/containers/podman-desktop-extension-ai-lab/issues/2630
Allow overriding the context size when running ramalama from a container.
2048 tokens (the default if not specified) is a small context window when running the inference server with
MCP tools or even for longer chat completion conversations.
Being able to provide a context window larger than 2048 is critical for those use cases.
Signed-off-by: Marc Nuri <marc@marcnuri.com>
2025-04-15 06:23:25 +02:00
Daniel J Walsh
6e97173af6
Merge pull request #1191 from rhatdan/VERSION
...
More fixes to get release out
2025-04-14 15:33:39 -04:00
Daniel J Walsh
0f830f2afb
More fixes to get release out
...
intel-gpu will not currently build on Fedora 42, there are issues
in the glibc library. Should try again when Fedora 42 is released
in May.
Verification of the ramalama-cli command, was broken, since ramalama
is the entrypoint.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-14 13:25:09 -04:00
Daniel J Walsh
c0aed4a616
Merge pull request #1186 from rhatdan/VERSION
...
Bump version to v0.7.4
2025-04-14 11:05:23 -04:00
Daniel J Walsh
74a8b67b0b
Bump to 0.7.4
...
Fix handling of minor_release
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-14 10:07:11 -04:00
Daniel J Walsh
07d8ba417a
Fixup build and release scripts
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-14 10:07:11 -04:00
Daniel J Walsh
da5d5a2e71
Merge pull request #1185 from containers/fix-cann-build
...
Fix cann build
2025-04-14 09:54:19 -04:00
Eric Curtin
8367579f77
Fix cann build
...
set_env.sh uses unbound variables deliberately
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-14 14:40:20 +01:00
Daniel J Walsh
82dc19438d
Merge pull request #1174 from containers/toolbox-support
...
Add check for toolbox
2025-04-14 09:26:49 -04:00
Daniel J Walsh
211a92dcad
Merge pull request #1184 from containers/disable-arm-neon
...
Disable ARM neon for now in cuda builds
2025-04-14 08:59:20 -04:00
Eric Curtin
ad2cf9f2df
Disable ARM neon for now in cuda builds
...
Otherwize we get build errors
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-14 13:26:23 +01:00
Daniel J Walsh
49233a3bc8
Merge pull request #1182 from reidliu41/update-install
...
[Misc] update install script
2025-04-14 05:38:48 -04:00
Eric Curtin
b41977c623
Add check for toolbox
...
If we are in toolbox, don't attempt to run nested containers. We
then have to rely on the user to install llama.cpp in the container
themselves. It's tempting to do an even more generic attempt to see
if we are already inside a container, so we never attempt to do
nested containers, whether toolbox, podman, docker, etc.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-13 20:29:11 +01:00
Daniel J Walsh
8955602f9a
Merge pull request #1173 from rhatdan/cuda
...
Improve performance for certain workloads
2025-04-13 14:45:06 -04:00
Eric Curtin
afb8225d80
Merge pull request #1171 from rhatdan/oci
...
Fix failover to OCI image on push
2025-04-13 16:39:47 +01:00
Daniel J Walsh
5455c5ed47
Merge pull request #1123 from edmcman/hf-tag
...
Add ability to pull via hf://user/repo:tag syntax
2025-04-13 05:57:48 -04:00
Daniel J Walsh
c53e238286
Merge pull request #1180 from engelmi/improve-model-store
...
Improve model store
2025-04-13 05:54:55 -04:00
reidliu41
c5e9fb8511
update from suggestion
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
2025-04-13 10:57:15 +08:00
reidliu41
21456ed680
[Misc] update install script
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
2025-04-13 10:49:33 +08:00
Michael Engel
6ee9a75d18
Only list OCI container with --container true
...
Related to: https://github.com/containers/ramalama/pull/1164
Copies the improvement to only list OCI containers when the
--container flag is true.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-13 00:08:07 +02:00
Michael Engel
1f5777db2d
Split model source and path in list models for --use-model-store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-13 00:08:07 +02:00
Daniel J Walsh
d07aa4f9f9
Merge pull request #1176 from nathan-weinberg/make-fix
...
fix: add 'pipx' install to 'make install-requirements'
2025-04-11 15:10:14 -04:00
Daniel J Walsh
15e2a5a8cf
Merge pull request #1177 from nathan-weinberg/issue-template
...
github: add issue templates
2025-04-11 15:08:05 -04:00
Daniel J Walsh
14960256b2
Merge pull request #1179 from nathan-weinberg/fix-contrib-2
...
docs: fix python version guidance in CONTRIBUTING.md
2025-04-11 15:06:33 -04:00
Daniel J Walsh
045831508f
Apply suggestions from code review
...
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
2025-04-11 15:06:14 -04:00
Nathan Weinberg
6d4a613e6d
docs: fix python version guidance in CONTRIBUTING.md
...
ramalama can run as low as Python 3.8
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-11 11:27:51 -04:00
Eric Curtin
e4836744b8
Merge pull request #1175 from nathan-weinberg/contrib-fix
...
docs: fix broken link in CONTRIBUTING.md
2025-04-11 15:44:44 +01:00
Nathan Weinberg
b916aa6057
docs: fix broken link in CONTRIBUTING.md
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-11 10:33:15 -04:00
Nathan Weinberg
e968ffb764
github: add issue templates
...
the CONTRIBUTING.md doc refers to several issue templates
being present in the projec but currently none exist
this commit adds templates in based on the podman project
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-11 10:28:59 -04:00
Nathan Weinberg
082ed827f1
fix: add 'pipx' install to 'make install-requirements'
...
'make install-requirements' currently assumes 'pipx'
is installed in your env, but this may not be the case
add an explict install/upgrade command via pip
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-11 10:04:33 -04:00
Daniel J Walsh
cc47ae8015
Improve performance for certain workloads
...
Fixes: https://github.com/containers/ramalama/issues/1156
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-11 07:17:22 -04:00
Daniel J Walsh
0338d26589
Merge pull request #1169 from rhatdan/llama-stack
...
Build images for llama-stack
2025-04-10 18:29:19 -04:00
Daniel J Walsh
53091b192d
Fix failover to OCI image on push
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-10 18:26:45 -04:00
Daniel J Walsh
4c43042343
Merge pull request #1166 from containers/update-llama.cpp
...
Update llama.cpp add llama 4
2025-04-10 17:55:46 -04:00
Daniel J Walsh
b996ae315d
Build images for llama-stack
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-10 17:54:02 -04:00
Eric Curtin
20be18708e
Merge pull request #1167 from rhatdan/fedora
...
Bump all images to f42
2025-04-10 18:38:40 +01:00
Daniel J Walsh
12d4a46b23
Bump all images to f42
...
Some of the images were using f41 and others f42, moving
them all to the same version. f42 is in beta now so good time
to move.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-10 13:04:37 -04:00
Eric Curtin
559efcbd76
Merge pull request #1164 from rhatdan/nocontainer
...
Do not list OCI containers when running with nocontainer
2025-04-10 17:50:01 +01:00
Eric Curtin
3e41c8749d
Update llama.cpp add llama 4
...
To pick up llama 4 support among other things.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-10 17:48:07 +01:00
Daniel J Walsh
1f93767e34
Do not list OCI containers when running with nocontainer
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-10 11:41:48 -04:00
Eric Curtin
8b9721c16d
Merge pull request #1161 from rhatdan/docling
...
Docling on certain platforms needs accellerate package
2025-04-10 15:13:10 +01:00
Daniel J Walsh
602d4b9e37
Docling on certain platforms needs accellerate package
...
Fixes: https://github.com/containers/ramalama/issues/1157
Also make sure build scripts blow up if any command fails.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-10 09:44:18 -04:00
Daniel J Walsh
cbb4ec0a4a
Merge pull request #1149 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.5-1744101466
2025-04-09 13:44:45 -04:00
Edward J. Schwartz
211beaba00
attempt to download file/repo if tag format fails
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-09 11:45:53 -04:00
Daniel J Walsh
bcfc715f77
Merge pull request #1154 from rhatdan/rag
...
Use -rag images when using --rag commands
2025-04-09 09:51:07 -04:00
Daniel J Walsh
458b44b3c7
Merge pull request #1153 from rhatdan/rocm
...
Scripts currently used for releasing images
2025-04-09 07:54:18 -04:00
Daniel J Walsh
96dbf92bd5
Use -rag images when using --rag commands
...
Fixes: https://github.com/containers/ramalama/issues/1143
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-09 07:53:49 -04:00
Daniel J Walsh
81faa3e7ba
Merge pull request #1151 from containers/client
...
feat: Add ramalama client command with basic implementation
2025-04-08 16:23:57 -04:00
Daniel J Walsh
1282987809
Scripts currently used for releasing images
...
These are the scripts I am using to push images and build multi-arch
images to the quay.io repositories.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-08 13:32:33 -04:00
Eric Curtin
a84a11d7b7
feat: Add ramalama client command with basic implementation
...
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-08 17:46:10 +01:00
Daniel J Walsh
8c8f2cfcb8
Merge pull request #1150 from ueno/wip/dueno/image-arg3
...
Give "image" config option precedence over hardware-based defaults
2025-04-08 10:35:33 -04:00
Daiki Ueno
ab7e7594fb
Give "image" config option precedence over hardware-based defaults
...
Currently, the config options are stored in a single dict, regardless
of where they are originated, e.g., environment variables, files, or
the preset default. This prevents overriding certain options, such as
"image", with a config file.
This groups config options by origins in collections.ChainMap.
Signed-off-by: Daiki Ueno <dueno@redhat.com>
2025-04-08 23:07:45 +09:00
Daniel J Walsh
4f44702271
Merge pull request #1142 from ueno/wip/dueno/image-arg2
...
Exercise image detection in tests
2025-04-08 09:53:45 -04:00
Daiki Ueno
42e7da7365
Exercise image detection in tests
...
This adds a unit test to check whether the image can be properly
overridden, with the --image command-line option or RAMALAMA_IMAGE
envvar.
Signed-off-by: Daiki Ueno <dueno@redhat.com>
2025-04-08 22:51:35 +09:00
renovate[bot]
c25e59c881
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.5-1744101466
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-04-08 13:48:30 +00:00
Eric Curtin
1196d4df82
Merge pull request #1148 from rhatdan/rocm
...
Removing git breaks rocm images
2025-04-08 14:47:55 +01:00
Daniel J Walsh
bd967d1515
Removing git breaks rocm images
...
Rocm requres a couple of -devel packages which require git.
Removing git removed these packages.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-08 09:05:23 -04:00
Daniel J Walsh
cf43e310f0
Merge pull request #1137 from rhatdan/VERSION
...
Bump to 0.7.3
2025-04-07 15:17:18 -04:00
Daniel J Walsh
928541dff9
Bump to 0.7.3
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-07 14:55:36 -04:00
Daniel J Walsh
14bef6c830
Merge pull request #1138 from containers/fix-ramalama-rag
...
Build fix container image parsing
2025-04-07 14:03:51 -04:00
Eric Curtin
02c75fb974
Build fix container image parsing
...
There's two cases we enter this function at the end of a rag command
or when manually specifing a container as inferencing engine.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-07 17:19:31 +01:00
Eric Curtin
c3a9205153
Merge pull request #1136 from rhatdan/VERSION
...
don't use version 0 for pulling images
2025-04-07 16:18:18 +01:00
Daniel J Walsh
5c01db5db5
don't use version 0 for pulling images
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-07 10:55:50 -04:00
Eric Curtin
30ccfcddca
Merge pull request #1134 from rhatdan/vulkan
...
Revert VULKAN change until podman 5.5 is released
2025-04-07 15:30:51 +01:00
Daniel J Walsh
3c407247f7
Revert VULKAN change until podman 5.5 is released
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-07 09:57:10 -04:00
Eric Curtin
a39fef55fd
Merge pull request #1124 from containers/quick-fix
...
Quick fix to installer
2025-04-07 11:09:23 +01:00
Eric Curtin
c11a84ae5f
Quick fix to installer
...
Directory was not right
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-06 15:22:06 +01:00
Edward J. Schwartz
be45d785c9
reformat
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-06 09:55:53 -04:00
Edward J. Schwartz
55b81e37b9
rename ollama's repo pull function
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-06 09:55:30 -04:00
Edward J. Schwartz
9906e7faca
gguf handled by tag now
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-06 09:44:17 -04:00
Edward J. Schwartz
19ea4095d5
Initial work on huggingface gguf tags
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-05 16:09:40 -04:00
Daniel J Walsh
62a8f9b9f2
Merge pull request #1121 from containers/nocontainerfix-and-rm-duplicate
...
Remove duplicate code
2025-04-05 09:56:26 -04:00
Eric Curtin
93b6d0a5ed
Remove duplicate code
...
Ensure container images aren't attempeted to be downloaded when
using:
--nocontainer
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-05 14:20:58 +01:00
Eric Curtin
e0fb7f0bd5
Merge pull request #1120 from nathan-weinberg/toolkit-docs
...
docs: add note about COPR repo for Fedora users
2025-04-05 01:30:19 +01:00
Nathan Weinberg
16ad55cdb4
docs: add note about COPR repo for Fedora users
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-04 16:11:48 -04:00
Eric Curtin
f0598d7b18
Merge pull request #1117 from engelmi/add-partial-files-to-be-listed
...
Add partial files to be listed for ramalama list command
2025-04-04 16:11:41 +01:00
Eric Curtin
5ebdb53d3f
Merge pull request #1118 from jguiditta/doc_fix
...
Fix get/set selbool references.
2025-04-04 16:11:23 +01:00
Jason Guiditta
ca63ba27e5
Fix get/set selbool references.
...
Documentation:
* As installed in current versions of Fedora, these commands are not
'boolean', but 'bool'.
* The set command will give an error message when the value of the
boolean is not set.
Signed-off-by: Jason Guiditta <jguiditt@redhat.com>
2025-04-04 10:50:58 -04:00
Michael Engel
6d4919b62c
Removed unused imports
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-04 15:27:23 +02:00
Michael Engel
5fbc524ec4
Fix error removing blobs
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-04 15:25:41 +02:00
Michael Engel
ce8c586f33
Added .partial files to be listed for list command
...
Fixes: https://github.com/containers/ramalama/issues/1104
The model store iterates through all files in the ref files for
the list command. If a pull has been cancelled, then these point
to non-existent files even though there are might be already
partial files. To avoid this error, the model store list will
check for the files to exist and check for partial files.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-04 15:25:41 +02:00
Daniel J Walsh
5a30bcc5f3
Merge pull request #1068 from containers/ramalama-serve-code
...
Introduce wrapper for serve and run
2025-04-03 15:10:40 -04:00
Daniel J Walsh
052b52c489
Merge pull request #1116 from containers/certificates
...
macOS python certificate issue
2025-04-03 15:09:13 -04:00
Eric Curtin
83215cfd33
Introduce wrapper for serve and run
...
We are coming to the limits of what we can do in a "podman run"
line. Create wrapper functions so we can do things like forking
processes and other similar things that you need to do inside a
container in python3. There are some features coming up where
rather than upstreaming separate solutions to all our engines
like vLLM and llama.cpp we want to solve the problem in the
python3 layer.
The "if True:"'s will remain for a while, we may need to wait for
containers to be distributed around the place before we turn things
on.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-03 18:13:41 +01:00
Eric Curtin
fbaf980ba0
macOS python certificate issue
...
I never encountered this but some have
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-03 18:09:56 +01:00
Eric Curtin
87c44edacc
Merge pull request #1113 from engelmi/remove-unused-cli-args-param
...
Removed unused cli_args param from GGUF parse function
2025-04-03 16:24:18 +01:00
Michael Engel
ba56b2fc12
Removed unused cli_args param from GGUF parse function
...
Fixes: https://github.com/containers/ramalama/issues/1103
Removed unused cli_args param from GGUFInfoParser.parse function which
caused also the call in the model store to fail since it wasn't passed in.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-03 16:44:29 +02:00
Eric Curtin
85f4a76e28
Merge pull request #1107 from rhatdan/images
...
No longer install podman remote and openvino in general images
2025-04-03 14:04:57 +01:00
Eric Curtin
b4112d417e
Merge pull request #1108 from nathan-weinberg/docs
...
docs: fix documentation README
2025-04-03 13:32:22 +01:00
Nathan Weinberg
4916009a8c
docs: fix documentation README
...
docs README had lots of broken links and phantom
make targets
this commit removes a lot of the content and fixes
some other workflow guidance
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-03 08:04:19 -04:00
Daniel J Walsh
bd8a0260fb
No longer install podman remote or openvino in general images
...
Remove other packages that are not necessary when running the
containers.
Remove leftover build_rag load command and move openvino to only the
intel container.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-03 06:43:23 -04:00
Eric Curtin
8b6964587d
Merge pull request #1099 from machadovilaca/fix-694
...
Add --webui flag to optionally disable web UI for ramalama serve
2025-04-03 11:12:51 +01:00
Eric Curtin
bf63bf24a3
Merge pull request #1110 from rhatdan/convert
...
Don't leak intermediate OCI image when converting model to OCI
2025-04-03 11:09:49 +01:00
Eric Curtin
fbf30a6919
Merge pull request #1111 from rhatdan/docs
...
Fix all links to ramalama-cuda
2025-04-03 11:09:25 +01:00
Daniel J Walsh
0369d044b0
Fix all links to ramalama-cuda
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-02 19:26:21 -04:00
Daniel J Walsh
f54df7c337
Don't leak intermediate OCI image when converting model to OCI
...
Fixes: https://github.com/containers/ramalama/issues/904
We are currently leaking a <none><none> image every time we convert an
image with ramalama convert.
There will still be a <none><none> image but this is assocated with the
Manifest list created for the OCI image.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-02 19:09:25 -04:00
Daniel J Walsh
41328d282a
Merge pull request #1109 from containers/pull-vllm
...
Point vllm ramalama at rhel registry
2025-04-02 19:08:44 -04:00
Eric Curtin
dd39c5ee3a
Point vllm ramalama at rhel registry
...
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-02 23:29:35 +01:00
Daniel J Walsh
cedf7d3b14
Merge pull request #1102 from edmcman/pull2
...
Be verbose about pulling image
2025-04-02 16:05:24 -04:00
Edward J. Schwartz
5d0be4b88d
Fix test output
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-02 15:21:33 -04:00
Daniel J Walsh
441471dfe0
Merge pull request #1106 from containers/jinja
...
Enable --jinja for all llama-server instances
2025-04-02 14:15:07 -04:00
Eric Curtin
53245c1bf6
Enable --jinja for all llama-server instances
...
--jinja has been around a few months now, enable it everywhere.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-02 17:34:31 +01:00
Eric Curtin
f86e852c60
Merge pull request #1105 from maxamillion/docs/fix_readme_link_to_cuda_manpage
...
fix broken link to cuda docs in readme
2025-04-02 17:16:52 +01:00
Adam Miller
a7093ffdf3
fix broken link to cuda docs in readme
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-04-02 10:51:18 -05:00
Edward J. Schwartz
50bbb50f27
Be verbose about pulling image
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-02 11:29:23 -04:00
João Vilaça
c0d7f0b271
Add --webui flag to optionally disable web UI for ramalama serve
...
Fixes #694
Signed-off-by: João Vilaça <machadovilaca@gmail.com>
2025-04-02 16:23:36 +01:00
Eric Curtin
62424d3345
Merge pull request #1100 from rhatdan/docs
...
Make NVIDIA configuration more present in documentation
2025-04-02 15:05:02 +01:00
Daniel J Walsh
47633fbe78
Make NVIDIA configuration more present in documentation
...
Fixes: https://github.com/containers/ramalama/issues/899
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-02 07:12:42 -04:00
Eric Curtin
5edacf1588
Merge pull request #1097 from rhatdan/ci
...
Mv ramalama-ci to ramalama-cli
2025-04-01 23:30:43 +01:00
Daniel J Walsh
275fe11a23
Mv ramalama-ci to ramalama-cli
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 18:24:32 -04:00
Daniel J Walsh
ffe9900298
Merge pull request #1094 from rhatdan/rocm
...
Complete move from fedora-rocm to rocm
2025-04-01 18:23:21 -04:00
Daniel J Walsh
cf0b842959
Merge pull request #1090 from rhatdan/rag
...
Move all RAG support to the -rag images
2025-04-01 17:37:39 -04:00
Daniel J Walsh
8732cd554e
Merge pull request #1096 from rhatdan/ci
...
Add ramalama-ci image
2025-04-01 17:37:16 -04:00
Daniel J Walsh
ef9e3fb377
Add ramalama-ci image
...
This image will just run ramalama inside of a container and
requires the user to leak the podman-socket into the container.
It will use Podman-remote for all of its actions.
Requested by the Podman Desktop team.
Fixes: https://github.com/containers/ramalama/issues/837
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 15:59:45 -04:00
Daniel J Walsh
ca23630470
Move all RAG support to the -rag images
...
Images have grown considerably with RAG support.
Do not force users who do not use rag to pay the
penalty.
Helps revert some growth complained about here:
https://github.com/containers/ramalama/issues/838
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 14:50:49 -04:00
Eric Curtin
e4b15baf0e
Merge pull request #1085 from rhatdan/url
...
Add url support to rag to pull content to the host
2025-04-01 19:41:42 +01:00
Daniel J Walsh
00cf1b91ad
Complete move from fedora-rocm to rocm
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 14:27:30 -04:00
Eric Curtin
4ed1e15378
Merge pull request #1093 from rhatdan/mac
...
Describe where default ramalama.conf file is on mac
2025-04-01 19:14:44 +01:00
Daniel J Walsh
c2575f5be0
Describe where default ramalama.conf file is on mac
...
Fixes: https://github.com/containers/ramalama/issues/858
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 14:12:24 -04:00
Eric Curtin
21e1cc93db
Merge pull request #1084 from rhatdan/fedora
...
mv rocm to rocm-ubi and rocm-fedora to rocm
2025-04-01 19:02:31 +01:00
Eric Curtin
0d9d7e19e9
Merge pull request #1091 from rhatdan/docling
...
Do not stack fault when using unsupported docling format
2025-04-01 18:46:21 +01:00
Daniel J Walsh
623cda587c
Add url support to rag to pull content to the host
...
Users should be able to list URLs and pull them to the host to
be processed by doc2rag command.
Also should force building of AI Data images to --network=none.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 10:21:07 -04:00
Daniel J Walsh
8b2bec8c15
Do not stack fault when using unsupported docling format
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 10:16:49 -04:00
Eric Curtin
ba801c2e1b
Merge pull request #1089 from engelmi/fix-format-and-lint
...
Fix formatting and lint errors
2025-04-01 13:07:46 +01:00
Michael Engel
ddca0fcde0
Fix formatting and lint errors
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-01 14:02:37 +02:00
Eric Curtin
75bc45806b
Merge pull request #1088 from engelmi/fix-file-names-for-windows
...
Remove files with colon in their name
2025-04-01 12:52:23 +01:00
Michael Engel
92c80d0134
Remove files with colon in their name
...
The verify_checksum unit tests use files with a colon in their name. This
causes issues for Windows machines since file names/paths can not contain
this symbol. Therefore, these files have been removed and the tests create
these on the fly and only when not run on Windows machines.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-01 13:48:26 +02:00
Daniel J Walsh
fcc08c7704
mv rocm to rocm-ubi and rocm-fedora to rocm
...
Since we are going to concentrate mainly on upstream,
we want to default the name quay.io/ramalama/rocm to the
rocm-fedora Containerfiles.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-31 14:23:12 -04:00
Daniel J Walsh
e09f3f1f8e
Merge pull request #1083 from rhatdan/VERSION
...
Bump to v0.7.2
2025-03-31 14:14:18 -04:00
Daniel J Walsh
8ba3ffe95f
Bump to v0.7.2
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-31 13:43:36 -04:00
Daniel J Walsh
6ec874091a
Merge pull request #1081 from rhatdan/intel
...
Fix handling of entrypoint for Intel
2025-03-31 13:12:03 -04:00
Daniel J Walsh
8d1f3f3ea0
Fix handling of entrypoint for Intel
...
Additional Fix from https://github.com/lirc572
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-31 12:14:58 -04:00
Daniel J Walsh
2e81be1301
Merge pull request #1082 from rhatdan/quadlet
...
Fix gen of name in quadlet to be on its own line.
2025-03-31 11:42:07 -04:00
Daniel J Walsh
8bf16a0525
Fix gen of name in quadlet to be on its own line.
...
Fixes: https://github.com/containers/ramalama/issues/1078
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-31 11:14:04 -04:00
Daniel J Walsh
859e1f4644
Merge pull request #1080 from containers/container-fix
...
Only install epel on rhel-based OSes
2025-03-31 10:46:37 -04:00
Eric Curtin
64ed0551fc
Merge pull request #1072 from rhatdan/pull
...
We should be pulling minor versions not latest
2025-03-31 15:15:45 +01:00
Eric Curtin
a6a3768cdf
Only install epel on rhel-based OSes
...
Was attempting to install on others
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-31 14:18:40 +01:00
Daniel J Walsh
19224552bf
Merge pull request #1070 from miabbott/cuda_privileged
...
docs: fixes to ramalama-cuda
2025-03-31 09:17:30 -04:00
Micah Abbott
35e338db8e
docs: use uppercase NVIDIA
...
Signed-off-by: Micah Abbott <miabbott@redhat.com>
2025-03-31 08:13:01 -04:00
Micah Abbott
1eca34548c
docs: add note about container_use_devices usage
...
On SELinux systems, it may be necessary to turn on the
`container_use_devices` boolean in order to run the `nvidia-smi`
command from within a container.
Signed-off-by: Micah Abbott <miabbott@redhat.com>
2025-03-31 08:11:23 -04:00
Daniel J Walsh
61c37425ae
We should be pulling minor versions not latest
...
This way users can stick with an older version of RamaLama and
not get breakage from a major upgrade. Then when their RamaLama version
gets updated, it will pull an updated image.
Also update the README.md and ramalama.1.md man page to show
the accelerated images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-31 06:37:34 -04:00
Eric Curtin
0f2de2a57f
Merge pull request #1075 from rhatdan/intel
...
Make sure build_rag.sh is in intel-gpu container image
2025-03-31 10:59:10 +01:00
Daniel J Walsh
23f9e06233
Make sure build_rag.sh is in intel-gpu container image
...
Fixes: https://github.com/containers/ramalama/issues/1074
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-30 21:03:15 -04:00
Eric Curtin
0d47d6588e
Merge pull request #1060 from rhatdan/nocontainer
...
Catch errors early about no support for --nocontainer
2025-03-30 14:27:28 +01:00
Daniel J Walsh
3a322cc7bb
Catch errors early about no support for --nocontainer
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-29 05:53:08 -04:00
Micah Abbott
e1865100dd
docs: linting on ramalama-cuda
...
Applied some fixes based on the Markdown linter in VSCode
See: https://github.com/DavidAnson/vscode-markdownlint
Signed-off-by: Micah Abbott <miabbott@redhat.com>
2025-03-28 17:37:12 -04:00
Adam Miller
d80f49c294
Merge pull request #1069 from containers/build-fix
...
args.engine can be None in this code path
2025-03-28 15:06:54 -04:00
Eric Curtin
b52339bc3f
args.engine can be None in this code path
...
The code will then crash, ensure args.engine has a True value of
some type.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-28 18:51:56 +00:00
Eric Curtin
398584d5ca
Merge pull request #1066 from rhatdan/docker
...
Docker running of containers is blowing up
2025-03-28 13:29:09 +00:00
Eric Curtin
35a4f815e7
Merge pull request #1065 from rhatdan/apple
...
Fix handling of $RAMALAMA_CONTAINER_ENGINE
2025-03-28 13:00:39 +00:00
Daniel J Walsh
9dda839dcd
Fix handling of $RAMALAMA_CONTAINER_ENGINE
...
apple_vm has a side effect of setting the podman_machine_accel global
variable which is used when running and serving models. Currently if
the user sets RAMALAMA_CONTAINER_ENGINE to podman in an alternative path
the apple_vm code is not called, so the global variable is not set.
Fixes: https://github.com/containers/ramalama/issues/1040
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-28 08:25:51 -04:00
Daniel J Walsh
ccc05a322d
Docker running of containers is blowing up
...
Seems Docker does not support
--entrypoint=[] like podman. --entrypoint "" seems to work on both
platforms.
When attempting to mimic --pull=newer on Docker we were pulling the
wrong image, we should be attempting to pull the accellerated image
not the default.
For some reason llama-run --threads X is blowing up in a docker
container with the option not being supported. This could be something
being masked inside of Docker containers that is not masked inside of podman
containers. Someone who understands what llama-run is doing with the
--threads option would need to look further into this.
This should fix the issue in CI that is blowing up Docker tests.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-28 08:25:02 -04:00
Eric Curtin
e2a1d86871
Merge pull request #1064 from rhatdan/nvidia
...
Link ramalama-nvidia.1 to ramalama-cuda.1
2025-03-28 12:20:59 +00:00
Daniel J Walsh
61a2199314
Link ramalama-nvidia.1 to ramalama-cuda.1
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-28 06:27:59 -04:00
Eric Curtin
a476ded7c4
Merge pull request #1063 from rhatdan/VERSION
...
Bump to v0.7.1
2025-03-27 21:41:03 +00:00
Daniel J Walsh
96a954b783
Bump to v0.7.1
...
RAG support is broken in current release.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 17:07:33 -04:00
Daniel J Walsh
1b13c57462
Merge pull request #1055 from rhatdan/device
...
Add support for /dev/accel being leaked into containers
2025-03-27 17:00:14 -04:00
Daniel J Walsh
ee286e37e7
Merge pull request #1061 from rhatdan/rag
...
Don't display server port when using run --rag
2025-03-27 16:59:11 -04:00
Daniel J Walsh
7597f9c8c4
Don't display server port when using run --rag
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 15:59:14 -04:00
Daniel J Walsh
04cb59012b
Merge pull request #1059 from containers/rag-chunk-fix
...
fixed chunk error
2025-03-27 15:21:23 -04:00
Daniel J Walsh
240c9f653c
Add /dev/accel if it exists to containers
...
Certain AI Accellerators are stored in /dev/accel rather then /dev/dri.
Ramalama should support these as well.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 14:37:37 -04:00
Daniel J Walsh
f1c2a2fb37
Default devices should be added even if user specified devices
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 14:37:10 -04:00
Brian Mahabir
ca7b70104c
fixed chunk error
...
Signed-off-by: Brian Mahabir <bmahabir@bu.edu>
2025-03-27 14:20:27 -04:00
Daniel J Walsh
125bc3918e
Merge pull request #1056 from containers/default-threads
...
Hardcode threads to 2 in this test
2025-03-27 14:20:17 -04:00
Eric Curtin
8cde572b01
Merge pull request #1022 from containers/combine-vulkan-kompute-cpu
...
Combine Vulkan, Kompute and CPU inferencing into one image
2025-03-27 16:04:23 +00:00
Eric Curtin
73c54bf34c
Combine Vulkan, Kompute and CPU inferencing into one image
...
Less images to maintain, Vulkan is more mature and more widely
used than Kompute.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-27 15:17:44 +00:00
Eric Curtin
3463411463
Hardcode threads to 2 in this test
...
To help stabilize the build
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-27 13:09:34 +00:00
Eric Curtin
c09713f61a
Merge pull request #1049 from rhatdan/build
...
fix ramalama rag build code
2025-03-27 12:43:54 +00:00
Eric Curtin
61c5e648a7
Merge pull request #1046 from rhatdan/nvidia
...
Never use entrypoint
2025-03-27 11:30:27 +00:00
Eric Curtin
82eb9580a3
Merge pull request #1053 from benoitf/RAMALAMA-988
...
feat: add --jinja to the list of arguments if MODEL_JINJA env var is true
2025-03-27 11:28:52 +00:00
Florent Benoit
49054b7778
feat: add --jinja to the list of arguments if MODEL_JINJA is true
...
fixes https://github.com/containers/ramalama/issues/988
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-03-27 12:07:30 +01:00
Daniel J Walsh
74a19e757f
fix ramalama rag build code
...
Also on --dryrun, do not pull images when running on a docker platform
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 04:52:24 -04:00
Daniel J Walsh
51a2bd4320
Never use entrypoints
...
Turn off all use of entrypoints when running and serving containers.
Entrypoints have the chance of screwing up the way containers run, and
if a user provides their own image with an entrypoint this could become
tough to diagnose errors.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 04:46:06 -04:00
Eric Curtin
6dc9453c42
Merge pull request #1050 from rhatdan/intel
...
Attempt to install openvino using pip
2025-03-27 07:38:45 +00:00
Eric Curtin
3e06caddfa
Merge pull request #982 from containers/default-threads
...
Default the number of threads to (nproc)/(2)
2025-03-27 00:04:54 +00:00
Eric Curtin
e4e0e10dea
Default the number of threads to (nproc)/(2)
...
The llama.cpp default for threads is hardcoded to 4. This changes
that harcoding so instead we use the (number of cpu cores)/(2).
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-26 22:01:32 +00:00
Daniel J Walsh
7d058686d4
Merge pull request #1044 from containers/minor-fix
...
Remove unused variable
2025-03-26 17:25:40 -04:00
Daniel J Walsh
3bca1f5d89
Attempt to install openvino using pip
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-26 16:23:55 -04:00
Eric Curtin
d1acb41c86
Merge pull request #1047 from edmcman/pull
...
Print status message when emulating --pull=newer for docker
2025-03-26 19:43:49 +00:00
Edward J. Schwartz
c2cb25267f
Print status message when emulating --pull=newer for docker
...
Close #1043
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-03-26 15:22:20 -04:00
Eric Curtin
7e87e19991
Remove unused variable
...
Closes issue opened by sourcery.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-26 19:07:30 +00:00
Eric Curtin
fc50bcaf10
Merge pull request #1045 from rhatdan/intel
...
Add openvino to all images
2025-03-26 19:05:13 +00:00
Daniel J Walsh
2622a914e9
Add openvino to all images
...
Podman desktop has asked us to add openvino support to our containers,
this is first step, next we need to pull non-gguf images and start
actually allowing users to specify openvino as a service.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-26 13:51:50 -04:00
Daniel J Walsh
c776b759b8
Merge pull request #1041 from containers/explain-option-better
...
Explain dryrun option better in container_build.sh
2025-03-26 10:13:31 -04:00
Adam Miller
18547bbca3
Merge pull request #1042 from rhatdan/VERSION
...
Bump to v0.7.0
2025-03-26 09:09:41 -04:00
Eric Curtin
83d9ac3966
Explain dryrun option better in container_build.sh
...
It was given some generic explanation
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-26 12:47:59 +00:00
Daniel J Walsh
5ef94aa479
Bump to v0.7.0
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-26 08:43:11 -04:00
Daniel J Walsh
065b6eda72
Merge pull request #1036 from rhatdan/images
...
More updates for builds
2025-03-25 20:37:48 -04:00
Eric Curtin
6f71835edc
Merge pull request #1038 from marceloleitner/py3.9
...
Fix errors on python3.9
2025-03-26 00:36:01 +00:00
Daniel J Walsh
4333e1311e
Merge pull request #1039 from containers/update-llama.c
...
Typo in the webui
2025-03-25 20:35:33 -04:00
Eric Curtin
1e98c381a6
Typo in the webui
...
Somebody noticed a typo in the built-in webui in llama.cpp, it was
fixed in upstream llama.cpp. This just ensures we get the fix
downstream!
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-26 00:06:28 +00:00
Marcelo Ricardo Leitner
3e28287fa6
Fix errors on python3.9
...
Fixes: https://github.com/containers/ramalama/issues/1037
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
2025-03-25 20:58:27 -03:00
Daniel J Walsh
ff0e5223d0
More updates for builds
...
Fix doc2rag to handle load properly
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-25 16:30:19 -04:00
Daniel J Walsh
bcf5c9576b
Merge pull request #1035 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1742918310
2025-03-25 15:14:03 -04:00
renovate[bot]
4328dabb7c
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1742918310
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-03-25 18:11:07 +00:00
Eric Curtin
61e164c6a7
Merge pull request #1033 from containers/rag-eof-fix
...
Added terminal name fixed eof bug and added another model to rag_framework load
2025-03-25 15:12:45 +00:00
Brian Mahabir
8a70dfbb18
Added terminal name fixed eof bug added model to load
...
Signed-off-by: Brian Mahabir <bmahabir@bu.edu>
2025-03-25 11:07:23 -04:00
Daniel J Walsh
c0af65f421
Merge pull request #1032 from containers/minor-fix
...
Minor bugfix remove self. from self.prompt
2025-03-25 10:32:05 -04:00
Eric Curtin
c99f5e3188
Minor bugfix remove self. from self.prompt
...
Not needed in non-constructor scope
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-25 14:16:35 +00:00
Eric Curtin
992d07eb95
Use stdlib from cmd in stdlib
...
Handles a lot of cases by default, helps handle Ctrl-C, Ctrl-D,
adds ability to cycle through prompts via up and down keyboard
arrow.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-25 13:55:24 +00:00
Eric Curtin
7d6baa87f0
Merge pull request #1031 from rhatdan/images
...
More fixes to build scripts
2025-03-25 12:52:10 +00:00
Daniel J Walsh
420c39f7e8
More fixes to build scripts
...
Adding back rag_framework load
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-25 08:46:02 -04:00
Eric Curtin
7a19e002dc
Merge pull request #1029 from containers/dr-rag
...
Updated rag to have much better querys at the cost of slight delay
2025-03-25 11:31:11 +00:00
Brian
baa7c16489
Updated rag to have much better querys at the cost of slightly more delay
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-03-24 23:42:11 -04:00
Eric Curtin
254f11ad00
Merge pull request #1028 from rhatdan/images
...
More fixes to build scripts
2025-03-24 21:32:37 +00:00
Daniel J Walsh
809a914fb4
More fixes to build scripts
...
Adding back rag_framework load
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-24 16:53:30 -04:00
Eric Curtin
de6a149d30
Merge pull request #1026 from containers/rag-run-fix
...
added hacky method to use 'run' instead of 'serve' for rag
2025-03-24 17:53:21 +00:00
Daniel J Walsh
e9b5ad9265
Merge pull request #1027 from rhatdan/images
...
Run build_rag.sh as root
2025-03-24 13:49:01 -04:00
Daniel J Walsh
2e1aebfa87
Merge pull request #1024 from rhatdan/rocm
...
Change default ROCM image to rocm-fedora
2025-03-24 13:48:23 -04:00
Daniel J Walsh
d216212207
Run build_rag.sh as root
...
This fixes the build on intel-gpu container.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-24 13:36:55 -04:00
Daniel J Walsh
87cbd9f302
Merge pull request #1023 from rhatdan/images
...
Fix up building of images
2025-03-24 13:36:42 -04:00
Brian Mahabir
8dd0f1d804
added hacky method to use 'run' instead of 'serve' for rag
...
Signed-off-by: Brian Mahabir <bmahabir@bu.edu>
2025-03-24 13:35:28 -04:00
Daniel J Walsh
b276491ff4
Fix up building of images
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-24 12:07:56 -04:00
Daniel J Walsh
7da14df5db
Change default ROCM image to rocm-fedora
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-24 12:04:39 -04:00
Daniel J Walsh
76041eb154
Merge pull request #1021 from containers/optionally-turn-off-color
...
Add feature to turn off colored text
2025-03-24 06:46:59 -04:00
Eric Curtin
0d21651784
Add feature to turn off colored text
...
Requested by user. Also add check to see if terminal is color
capable.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-23 14:19:00 +00:00
Daniel J Walsh
02b7c109cd
Merge pull request #1017 from containers/reset-colors
...
Color each word individually
2025-03-22 13:17:26 -04:00
Daniel J Walsh
ab9898dccb
Merge pull request #1019 from containers/install
...
Make install script more aesthetically pleasing
2025-03-22 13:16:41 -04:00
Eric Curtin
dbe7775513
Make install script more aesthetically pleasing
...
Print RamaLama and Llama based loading bar.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-21 23:05:43 +00:00
Eric Curtin
106073b265
Merge pull request #1009 from bachp/modelname-in-api
...
Show model name in API instead of model file path
2025-03-21 21:33:12 +00:00
Pascal Bach
a9ccfb64d0
chore: extend flake to allow nix develop
...
This adds all dependencies needed to run
make bats inside the flake
Signed-off-by: Pascal Bach <pascal.bach@nextrem.ch>
2025-03-21 21:53:51 +01:00
Pascal Bach
422cd02173
feat: show model name in API instead of model file path
...
The model file path is always /mnt/models/model.file which makes it hard
to distingish model in the API.
By using the llama-cpp alias flag the server will serve the model name
Signed-off-by: Pascal Bach <pascal.bach@nextrem.ch>
2025-03-21 21:53:51 +01:00
Pascal Bach
85fc3000be
refactor: use long argument names for llama-server
...
Signed-off-by: Pascal Bach <pascal.bach@nextrem.ch>
2025-03-21 21:53:51 +01:00
Eric Curtin
44a16a7d65
Color each word individually
...
We don't want the color yellow to leak into the terminal if the
process dies suddenly.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-21 19:47:23 +00:00
Daniel J Walsh
c97cb2dde0
Merge pull request #1016 from containers/bugfix-1
...
Rag condition should be and instead of or
2025-03-21 14:01:15 -04:00
Eric Curtin
a18bcab73e
Rag condition should be and instead of or
...
We want both of these things to be true to execute rag
functionality.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-21 17:39:38 +00:00
Eric Curtin
4523c6e9e6
Merge pull request #1010 from containers/rag
...
Adds Rag chatbot to ramalama serve and preloads models for doc2rag and rag_framework
2025-03-21 16:45:48 +00:00
Brian Mahabir
8a5d94a072
Updated rag
...
Signed-off-by: Brian Mahabir <bmahabir@bu.edu>
2025-03-21 12:23:32 -04:00
Eric Curtin
a3ef16ec83
Merge pull request #1015 from rhatdan/rag1
...
Fix ramalama serve --rag ABC --generate kube
2025-03-21 16:23:13 +00:00
Daniel J Walsh
8fb794bee7
Fix ramalama serve --rag ABC --generate kube
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-21 11:59:44 -04:00
Daniel J Walsh
68f41c22f4
Merge pull request #1014 from containers/conversation-history
...
Keep conversation history
2025-03-21 11:40:25 -04:00
Eric Curtin
439a7413ec
Merge pull request #1012 from rhatdan/rag1
...
Generate quadlets with rag databases
2025-03-21 15:30:58 +00:00
Eric Curtin
5878094c31
Keep conversation history
...
Don't treat every prompt like a separate prompt, keep the
conversation history.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-21 15:17:56 +00:00
Daniel J Walsh
223a42ac54
Generate quadlets with rag databases
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-21 08:23:39 -04:00
Daniel J Walsh
1421497a84
Merge pull request #1011 from cgruver/cleanup
...
update docs for Intel GPU support. Clean up code comments
2025-03-21 08:12:01 -04:00
Eric Curtin
e61d80f9bc
Merge pull request #1013 from containers/enhance-client
...
Improve UX for ramalama-client
2025-03-21 11:58:23 +00:00
Eric Curtin
69e3ed22b3
Improve UX for ramalama-client
...
I couldn't figure out why things weren't printing word by word
yesterday as I was going for an evening walk it dawned on me, we
were not flushing the buffers. Also adds color to the response
like llama-run.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-21 11:38:49 +00:00
Charro Gruver
be7c8c6e13
update docs for Intel GPU support. Clean up code comments
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-21 11:34:38 +00:00
Eric Curtin
ec7cfa4620
Merge pull request #1008 from containers/vllm-rocm
...
Use this container if we detect ROCm accelerator
2025-03-21 11:17:53 +00:00
Eric Curtin
15b8cce09c
Merge pull request #1007 from rhatdan/rag1
...
Fix errors on python3.9
2025-03-20 20:27:57 +00:00
Eric Curtin
82a62951db
Use this container if we detect ROCm accelerator
...
Should help get ROCm + vLLM working
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-20 20:27:09 +00:00
Daniel J Walsh
d0e6e6781e
Fix errors on python3.9
...
Fixes: https://github.com/containers/ramalama/issues/1004
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-20 16:05:34 -04:00
Daniel J Walsh
f42c734928
Merge pull request #1005 from containers/appstudio-ramalama
...
Red Hat Konflux update ramalama
2025-03-20 15:45:33 -04:00
red-hat-konflux
026176d12a
Red Hat Konflux update ramalama
...
Signed-off-by: red-hat-konflux <konflux@no-reply.konflux-ci.dev>
2025-03-20 19:08:19 +00:00
Brian M
9d2f70f34f
Merge pull request #1003 from rhatdan/rag1
...
Don't use relative paths for destination
2025-03-20 15:06:24 -04:00
Daniel J Walsh
5213e858e3
Don't use relative paths for destination
...
This can cause the file to not be installed in a subdir of the
destination (/docs) within the container.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-20 14:49:49 -04:00
Daniel J Walsh
e9a253f933
Merge pull request #1001 from containers/verbose
...
Turn on verbose logging in llama-server if --debug is on
2025-03-20 14:21:51 -04:00
Daniel J Walsh
bd5cbecffd
Merge pull request #998 from rhatdan/rag
...
Fix errors found in RamaLama RAG
2025-03-20 14:20:59 -04:00
Daniel J Walsh
f7dc4d7ca4
Merge pull request #997 from containers/ramalama-client
...
Add ramalama client
2025-03-20 14:05:52 -04:00
Daniel J Walsh
94246b6977
Fix errors found in RamaLama RAG
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-20 14:03:51 -04:00
Eric Curtin
9e02876bef
Turn on verbose logging in llama-server if --debug is on
...
Can see more verbose request/response info, etc.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-20 17:32:50 +00:00
Eric Curtin
7784b4ef30
Merge pull request #996 from cgruver/intel-gpus
...
Add the ability to identify a wider set of Intel GPUs that have enough Execution Units to produce decent results
2025-03-20 17:31:31 +00:00
Eric Curtin
6416c9e3e9
Add ramalama client
...
Once we achieve feature parity with llama-run, we will more
tightly integrate this into RamaLama.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-20 17:30:38 +00:00
Eric Curtin
3237ef4af5
Merge pull request #1002 from kush-gupt/main
...
FIX: Ollama install with brew for CI
2025-03-20 17:16:39 +00:00
Kush Gupta
7ca228bb7c
update brew before starting ollama
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-03-20 13:04:01 -04:00
Charro Gruver
309a14f6e6
Make Linter Happy :-)
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-20 14:50:36 +00:00
Charro Gruver
4a3cd65180
Add the ability to identify a wider set of Intel GPUs that have enough Execution Units to produce decent results
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-20 14:42:18 +00:00
Eric Curtin
1e90133123
Merge pull request #995 from benoitf/fix-condition
...
chore: use the reverse condition for models
2025-03-20 13:32:47 +00:00
Florent Benoit
546ad5d0b8
chore: use the reverse condition for models
...
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-03-20 14:16:26 +01:00
Eric Curtin
08bfd60eb8
Merge pull request #979 from rhatdan/rag
...
Add docling support version 2
2025-03-20 11:43:06 +00:00
Eric Curtin
6f502a87d9
Merge pull request #994 from leo-pony/main
...
[CANN]Fix the bug that openEuler repo does not have ffmpeg-free package, instand of using ffmpeg for openEuler
2025-03-20 03:10:57 +00:00
leo-pony
912ac01af1
[CANN]Fix the bug that openEuler repo does not have ffmpeg-free package. Instand of using ffmpeg for openEuler, which also has LGPL license.
...
Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-03-20 11:01:13 +08:00
Daniel J Walsh
1a17a4497f
Add ramalama serve --rag and ramalama run -rag
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-19 14:47:46 -04:00
Daniel J Walsh
27ca51d87a
Add docling support version 2
...
Remove pragmatic, and move to using local implementation
until llama-stack version is ready.
python3 container-images/scripts/doc2rag.py --help
usage: docling [-h] target source [source ...]
process source files into RAG vector database
positional arguments:
target
source
options:
-h, --help show this help message and exit
ramalama rag should be using accelerated images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-19 14:47:46 -04:00
Eric Curtin
f73f37a3b8
Merge pull request #992 from benoitf/RAMALAMA-991
...
fix: use expected condition
2025-03-19 14:06:29 +00:00
Florent Benoit
b46ac7e24a
fix: use expected condition
...
fixes https://github.com/containers/ramalama/issues/991
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-03-19 14:50:13 +01:00
Eric Curtin
83285cbad5
Merge pull request #989 from rhatdan/build
...
Fix container_build.sh to build all images
2025-03-19 10:07:28 +00:00
Daniel J Walsh
18d90bb1ed
Fix container_build.sh to build all images
...
Fixes: https://github.com/containers/ramalama/issues/987
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-19 05:04:38 -04:00
Daniel J Walsh
45dfb82926
Merge pull request #985 from containers/ffmpeg-free
...
whisper.cpp requires ffmpeg
2025-03-18 20:32:37 -04:00
Eric Curtin
f8cefcdd87
whisper.cpp requires ffmpeg
...
Installing ffmpeg-free from Fedora/EPEL, ffmpeg-free includes only
the FOSS/patent free bits of ffmpeg.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-18 19:18:57 +00:00
Daniel J Walsh
11a23b9279
Merge pull request #986 from rhatdan/intel
...
Improve intel-gpu to work with whisper-server and llama-server
2025-03-18 14:48:09 -04:00
Daniel J Walsh
b0c8c84a04
Improve intel-gpu to work with whisper-server and llama-server
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-18 11:17:05 -04:00
Daniel J Walsh
787f3558b0
Merge pull request #984 from rhatdan/whisper
...
Default whisper-server.sh, llama-server.sh to /mnt/models/model.file
2025-03-18 10:32:26 -04:00
Daniel J Walsh
e0ba69e89c
Default whisper-server.sh, llama-server.sh to /mnt/models/model.file
...
Fixes: https://github.com/containers/ramalama/issues/980
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-18 09:15:06 -04:00
Daniel J Walsh
e18d78045d
Merge pull request #978 from rhatdan/VERSION
...
Bump to v0.6.4
2025-03-17 14:20:54 -04:00
Daniel J Walsh
1bcbfd5ab4
Bump to v0.6.4
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-17 11:55:02 -04:00
Daniel J Walsh
40fe75230b
Merge pull request #975 from rhatdan/main
...
Fix handling of whisper-server and llama-server entrypoints
2025-03-17 11:54:48 -04:00
Daniel J Walsh
064b28d10f
FIx handling of whisper-server and llama-server entrypoints
...
Entrypoint tests are blowing up so remove for now.
Fixes: https://github.com/containers/ramalama/issues/977
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-17 11:07:09 -04:00
Daniel J Walsh
e5249621fb
Merge pull request #949 from edmcman/main
...
Add --runtime-arg option for run and serve
2025-03-17 10:08:46 -04:00
Eric Curtin
9a91ce3594
Merge pull request #976 from cgruver/intel-gpg-fail
...
GPG Check is failing on the Intel Repo
2025-03-17 14:03:11 +00:00
Edward J. Schwartz
65bd965359
Add --runtime-args option
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-03-17 09:52:31 -04:00
Charro Gruver
ff270aeee9
GPG Check is failing on the Intel Repo
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-17 13:43:32 +00:00
Eric Curtin
ae2c250ed0
Merge pull request #974 from rhatdan/main
...
Asashi build is failing because of no python3-devel package
2025-03-17 13:03:05 +00:00
Daniel J Walsh
1a0492dbb9
Asashi build is failing because of no python3-devel package
...
Also remove devel packages when completing install.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-17 09:01:15 -04:00
Daniel J Walsh
af6b551f6b
Merge pull request #966 from antheas/threads
...
feat(cpu): add --threads option to specify number of cpu threads
2025-03-17 06:53:22 -04:00
Daniel J Walsh
ded1c436f5
Merge pull request #971 from containers/nvidia-fix
...
Only set this environment variable if we can resolve CDI
2025-03-17 06:49:41 -04:00
Daniel J Walsh
b67d6d43e3
Merge pull request #973 from containers/update-llama
...
Update llama.cpp for some Gemma features
2025-03-17 06:48:47 -04:00
Eric Curtin
e7083607d9
Update llama.cpp for some Gemma features
...
We want to get some new Gemma features added to llama.cpp .
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-16 22:50:30 +00:00
Eric Curtin
6019bda457
Only set this environment variable if we can resolve CDI
...
We don't want to use Nvidia/CUDA just because nvidia-smi is present
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-16 18:59:19 +00:00
Eric Curtin
3c41b2ff98
Merge pull request #968 from rhatdan/rag1
...
Add software to support using rag in RamaLama
2025-03-15 15:34:43 +00:00
Brian
57f4a6097b
Add software to support using rag in RamaLama
...
This PR just installs the python requirements needed to play with the
rag_framework.py file.
I have not added the docling support yet, since that would swell the
size of the images. Will add that in a separate PR.
Also remove pragmatic and begin conversion to new rag tooling.
Signed-off-by: Brian <bmahabir@bu.edu>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-15 07:28:47 -04:00
Antheas Kapenekakis
7b0789187a
feat(cpu): add --threads option to specify number of cpu threads
...
Signed-off-by: Antheas Kapenekakis <git@antheas.dev>
2025-03-15 12:17:04 +01:00
Eric Curtin
c8ea9fe3ba
Merge pull request #965 from rhatdan/whisper
...
Fix ENTRYPOINTS of whisper-server and llama-server
2025-03-14 21:47:59 +00:00
Eric Curtin
349ad48008
Merge pull request #967 from containers/llama-cpp-threads
...
Update llama.cpp to contain threads features
2025-03-14 18:22:32 +00:00
Eric Curtin
745b960e77
Update llama.cpp to contain threads features
...
So we can specify CPU threads for llama-run.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-14 16:42:24 +00:00
Daniel J Walsh
bff0b2de0b
Fix ENTRYPOINTS of whisper-server and llama-server
...
Fixes:https://github.com/containers/ramalama/issues/964
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-14 10:08:03 -04:00
Daniel J Walsh
3513d47150
Merge pull request #960 from containers/renovate/docker.io-nvidia-cuda-12.x
...
Update docker.io/nvidia/cuda Docker tag to v12.8.1
2025-03-14 09:35:38 -04:00
Eric Curtin
e91b21bfba
Merge pull request #963 from andreadecorte/fix_readme
...
Fix port rendering in README
2025-03-14 13:17:06 +00:00
Andrea Decorte
901a7b2bf2
Fix port rendering in README
...
Port was not rendering in README.md, add a space around it as a workaround.
Signed-off-by: Andrea Decorte <adecorte@redhat.com>
2025-03-14 11:30:01 +01:00
Eric Curtin
cd2150abb1
Merge pull request #962 from leo-pony/main
...
[NPU][Fix] only specify device num, but without ascend-docker-runtime installed, running ramalama/cann container image will failing
2025-03-14 10:27:55 +00:00
leo-pony
d0f02648fa
1. Keep the environment variable of visible Ascend device in ramalama consistent with ascend-docker-runtime.
...
2. Temporarily remove the default value of using device 0 when no ascend device is specified. The reason is that currently, if you only specify device 0 without using ascend-docker-runtime, it cannot be offloaded to NPU normally.
Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-03-14 17:33:46 +08:00
renovate[bot]
ddec113669
Update docker.io/nvidia/cuda Docker tag to v12.8.1
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-03-13 17:00:21 +00:00
Daniel J Walsh
8f8f96fa65
Merge pull request #954 from containers/enhance-cuda-check
...
There must be at least one CDI device present to use CUDA
2025-03-13 13:00:11 -04:00
Daniel J Walsh
c4d6772b31
Merge pull request #959 from containers/validate-python3
...
python3 validator
2025-03-13 12:59:51 -04:00
Eric Curtin
68764aa088
There must be at least one CDI device present to use CUDA
...
Otherwise we get failures like:
Error: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all
Co-authored-by: Brian <bmahabir@bu.edu>
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-13 15:16:00 +00:00
Eric Curtin
f6eaeb6b49
python3 validator
...
We are encountering issues where newer python3 features are
breaking systems with older versions of python3, such as macOS,
this should ensure we validate this in CI.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-13 15:13:51 +00:00
Eric Curtin
143f4b0f41
Merge pull request #953 from rhatdan/nvidia
...
Add specified nvidia-oci runtime
2025-03-13 12:58:15 +00:00
Eric Curtin
cd8f5f90ae
Merge pull request #956 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1741850090
2025-03-13 12:54:01 +00:00
Daniel J Walsh
066dc1cbbf
Merge pull request #952 from engelmi/add-chat-template-support-to-serve
...
Added --chat-template-file support to ramalama serve
2025-03-13 06:42:01 -04:00
renovate[bot]
c28635f2f0
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1741850090
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-03-13 09:43:30 +00:00
Daniel J Walsh
940953aaaf
Add specified nvidia-oci runtime
...
Nvidia recommends using their nvidia-container-runtime when running
containers with GPU support, so ramalama should use this feature as
well.
Allow users to override the oci-runtime when appropriate.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-12 14:52:28 -04:00
Michael Engel
cb88e40e24
Added --chat-template-file support to ramalama serve
...
Relates to: https://github.com/containers/ramalama/issues/890
Relates to: https://github.com/containers/ramalama/issues/947
If a chat template file can be extracted from the gguf model or if specified by
the model repo, it will now be used in the ramalama serve command and mounted
into the container. It has been included in the generation of the quadlet and
kube files as well.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-12 15:40:25 +01:00
Daniel J Walsh
fd06246fbf
Merge pull request #946 from rhatdan/docker
...
Lets run container in all tests, to make sure it does not explode.
2025-03-12 10:23:17 -04:00
Daniel J Walsh
23de968693
Lets run container in all tests, to make sure it does not explode.
...
Also switch to using smollm
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-12 09:56:34 -04:00
Daniel J Walsh
6a76efc4ca
Merge pull request #951 from containers/fix-install
...
Fix install.sh for OSTree system
2025-03-12 09:47:04 -04:00
Eric Curtin
741ecf2718
Fix install.sh for OSTree system
...
Don't run dnf install on OSTree system.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-12 12:14:04 +00:00
Daniel J Walsh
b9505b8699
Merge pull request #939 from s3rj1k/CNAI
...
Handle CNAI annotation deprecation
2025-03-12 07:00:24 -04:00
Daniel J Walsh
ee062e5ee7
Merge pull request #950 from leo-pony/main
...
Add Linux x86-64 support for Ascend NPU accelerator in llama.cpp backend
2025-03-12 06:57:22 -04:00
Daniel J Walsh
0985819215
Merge pull request #915 from containers/more-scaffoling
...
Implement RamaLama shell
2025-03-12 06:52:03 -04:00
leo-pony
ff187ab029
Add Linux x86-64 support for Ascend NPU accelerator in llama.cpp backend
...
Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-03-12 15:38:37 +08:00
Eric Curtin
ea7ec50eef
Merge pull request #943 from containers/consolidate-gpu-detection
...
Consolidate gpu detection
2025-03-11 21:05:24 +00:00
Eric Curtin
8d3a44adac
Consolidate gpu detection
...
This makes sure the gpu detection techniques are the same
throughout the project. We do not display detailed accelerator info,
leave that to tools like "fastfetch" it is hard to maintain, there
are no standards.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-11 18:27:33 +00:00
Daniel J Walsh
a28d902a9a
Merge pull request #917 from engelmi/add-chat-template-support
...
Add chat template support
2025-03-11 14:16:26 -04:00
s3rj1k
f716fcc2e9
Update `opencontainers` spec link
...
Signed-off-by: s3rj1k <evasive.gyron@gmail.com>
2025-03-11 18:44:58 +01:00
Eric Curtin
742b6d85ba
Merge pull request #942 from containers/macos-detect
...
macOS detection fix
2025-03-11 17:37:52 +00:00
Eric Curtin
2e1bb04b6d
macOS detection fix
...
Handling of global variables not correct.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-11 17:10:38 +00:00
Eric Curtin
58998f20b9
Merge pull request #941 from rhatdan/docker
...
Fix docker handling of GPUs.
2025-03-11 15:23:44 +00:00
Michael Engel
b751eef975
Added snapshot file validation
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-11 15:41:55 +01:00
Michael Engel
f55475e36d
Added converting go templates to jinja templates
...
Usually, the chat templates for gguf models are written as jinja templates.
Ollama, however, uses Go Templates specific to ollama. In order to use the
proper templates for models pulled from ollama, the chat templates are
converted to jinja ones and passed to llama-run.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-11 15:41:55 +01:00
Michael Engel
a4c401f303
Use chat template file on model run
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-11 15:41:55 +01:00
Michael Engel
5c911fda79
Encode model and chat template information in RefFile
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-11 15:41:55 +01:00
Michael Engel
a756441f33
Extract chat template from GGUF file
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-11 15:41:55 +01:00
Daniel J Walsh
b4ff470268
Fix docker handling of GPUs.
...
Fixes: https://github.com/containers/ramalama/issues/940
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-11 10:37:17 -04:00
Eric Curtin
742cccce34
Merge pull request #938 from rhatdan/oci
...
Add note about updating nvidia.yaml file
2025-03-11 14:20:09 +00:00
s3rj1k
b66b48de0c
Drop unsupported CNAI annotations
...
Signed-off-by: s3rj1k <evasive.gyron@gmail.com>
2025-03-11 14:35:19 +01:00
s3rj1k
439a95743f
Use `opencontainers` annotations where it makes sense
...
Signed-off-by: s3rj1k <evasive.gyron@gmail.com>
2025-03-11 14:34:52 +01:00
Daniel J Walsh
0698dd8882
Add note about updating nvidia.yaml file
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-11 09:07:24 -04:00
Daniel J Walsh
c9f9266dd7
Merge pull request #935 from containers/raspberripi
...
Bugfixes noticed while installing on Raspberry Pi
2025-03-11 08:54:35 -04:00
Eric Curtin
87cfc4bd18
Bugfixes noticed while installing on Raspberry Pi
...
Just some nits
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-11 12:26:24 +00:00
Daniel J Walsh
94d206c845
Merge pull request #933 from containers/macos-fix
...
Make compatible with the macOS system python3
2025-03-10 15:58:04 -04:00
Eric Curtin
e644b63945
Merge pull request #932 from rhatdan/oci
...
Print error when converting from an OCI Image
2025-03-10 19:46:56 +00:00
Eric Curtin
5543d71cac
Make compatible with the macOS system python3
...
populate variables first. Reproducible on systems with brew macOS
also by switching shebang to #!/usr/bin/python3
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-10 19:39:24 +00:00
Daniel J Walsh
1a547a0258
Print error when converting from an OCI Image
...
Fixes: https://github.com/containers/ramalama/issues/929
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-10 14:22:02 -04:00
Daniel J Walsh
72c5fafb12
Merge pull request #931 from rhatdan/VERSION
...
Bump to v0.6.3
2025-03-10 13:48:17 -04:00
Daniel J Walsh
583f9a9cac
Bump to v0.6.3
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-10 13:18:06 -04:00
Daniel J Walsh
0ff95703d1
Remove print statement on ports
...
Fixes: https://github.com/containers/ramalama/issues/930
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-10 13:17:44 -04:00
Eric Curtin
7cd661b0ec
Merge pull request #928 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1741600006
2025-03-10 15:33:25 +00:00
Daniel J Walsh
199221373d
Merge pull request #926 from benoitf/RAMALAMA-925
...
fix: CHAT_FORMAT variable should be expanded
2025-03-10 11:23:21 -04:00
Florent Benoit
14a80ab1ea
fix: propagate correct command line arguments
...
fixes https://github.com/containers/ramalama/issues/925
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-03-10 16:21:41 +01:00
renovate[bot]
5b8ba9651a
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1741600006
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-03-10 15:04:26 +00:00
Eric Curtin
eefeaff7d5
Merge pull request #921 from rhatdan/config
...
Allow user to specify the images to use per hardware
2025-03-10 15:03:57 +00:00
Daniel J Walsh
0847a76c2d
Allow user to specify the images to use per hardware
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-10 10:03:30 -04:00
Eric Curtin
5394518bbb
Merge pull request #922 from rhatdan/env
...
Add passing of environment variables to ramalama commands
2025-03-10 10:48:01 +00:00
Daniel J Walsh
e18280767c
Add passing of environment variables to ramalama commands
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-09 14:32:55 -10:00
Eric Curtin
247b995bb1
Merge pull request #898 from andreadecorte/797
...
Try to choose a free port on serve if default one is not available
2025-03-09 12:43:52 +00:00
Andrea Decorte
78239f2f7f
Try to choose a free port on serve if default one is not available
...
This change tries first to find if the default port 8080 is available.
If not, it tries to find an available free port in the range 8081-8090 in random order.
An error if no free port is found.
In case of success, the chosen port is printed out for the user.
This does not apply if the user chooses a port different from 8080.
Note that this check could be still not be enough if the chosen port is taken
by another process after our check, in that case we will still fail at a later phase as today.
Includes unit testing.
Closes #797
Signed-off-by: Andrea Decorte <adecorte@redhat.com>
2025-03-08 23:39:11 +01:00
Eric Curtin
ebf056cc4a
Merge pull request #920 from cgruver/readme
...
Add Intel ARC 155H to list of supported hardware
2025-03-08 18:25:51 +00:00
Charro Gruver
36c7662956
Add Intel ARC 155H to list of supported hardware
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-08 17:17:15 +00:00
Daniel J Walsh
a668c76e9a
Merge pull request #919 from cgruver/env-vars
...
Modify GPU detection to match against env var value instead of prefix
2025-03-08 11:25:09 -05:00
Charro Gruver
1c57208df0
replace meaningless env var with HIP_VISIBLE_DEVICES in test script
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-08 12:15:22 +00:00
Charro Gruver
7542de5ca2
add HSA_OVERRIDE_GFX_VERSION to env check
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-07 20:21:48 +00:00
Charro Gruver
5859f83f8f
fix tests
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-07 20:02:02 +00:00
Charro Gruver
406ad34a33
fix formatting to satisy lint again... :-)
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-07 19:41:41 +00:00
Charro Gruver
ad39af9aed
fix formatting to satisy lint
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-07 19:39:57 +00:00
Charro Gruver
37ec32594a
Modify GPU detection to match against env var value instead of prefix
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-07 19:29:51 +00:00
Eric Curtin
bf53e71106
Merge pull request #916 from containers/validate
...
Extend make validate check to do more
2025-03-07 12:47:35 +00:00
Eric Curtin
2769347597
Extend make validate check to do more
...
It also does check-format now.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-07 12:24:38 +00:00
Daniel J Walsh
adc53fea4e
Merge pull request #911 from leo-pony/main
...
Add support for llama.cpp engine to use ascend NPU device
2025-03-06 12:11:30 -05:00
leo-pony
93c023d4a7
Add code for ascend npu supporting for llama.cpp engine
...
Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-03-06 17:25:46 +08:00
Eric Curtin
31930cd08b
Merge pull request #905 from engelmi/add-new-model-store
...
Add new model store
2025-03-05 13:08:28 +00:00
Eric Curtin
9ce1984102
Implement RamaLama shell
...
This will eventually replace things like linenoise.cpp, llama-run,
etc.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-05 13:00:52 +00:00
Michael Engel
62a7765ad3
Added OCI support in list models
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:11 +01:00
Michael Engel
6a842d419a
Added URL and local file integration with model store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:11 +01:00
Michael Engel
7c07c1d792
Added huggingface integration with model store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:11 +01:00
Michael Engel
62b320660d
Raise exception instead of sys.exit on download failure
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:11 +01:00
Michael Engel
885e8deda2
Enabled the use of dash instead of colon in filenames and directories
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:11 +01:00
Michael Engel
4dd6b66723
Added model store and ollama integration
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:08 +01:00
Michael Engel
927f10b0e5
Added --use-model-store CLI option
...
Added new --use-model-store CLI option with False as default. Also, updated
the ModelFactory to use that flag and set the store member of models. This
will be used in subsequent commits.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 10:01:39 +01:00
Eric Curtin
2af9f0059f
Merge pull request #596 from maxamillion/fedora-rocm
...
Add ramalama image built on Fedora using Fedora's rocm packages
2025-03-04 23:53:16 +00:00
Adam Miller
a79f912ea7
ignore rocm-fedora images from github image build ci workflow
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-03-04 17:05:32 -06:00
Daniel J Walsh
2a10dedc87
Merge pull request #901 from kush-gupt/main
...
Detect & get info on hugging face repos, fix sizing of symlinked directories
2025-03-04 11:32:59 -05:00
Eric Curtin
a15928bb74
Merge pull request #909 from containers/ramalama-serve-core
...
Add new ramalama-*-core executables
2025-03-04 16:30:31 +00:00
Daniel J Walsh
2dedd92c5e
Merge pull request #913 from containers/re-introduce-emoji-prompts
...
Reintroduce emoji prompts
2025-03-04 11:04:29 -05:00
Adam Miller
5a253fcb20
remove no longer used ARG
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-03-04 08:07:47 -06:00
Eric Curtin
4b1b4f4bc4
Add new ramalama-*-core executables
...
ramalama-serve-core is intended to act as a proxy and implement
multiple-models. ramalama-client-core in intended to act as a OpenAI
client. ramalama-run-core is intended to act as ramalama-serve-core +
ramalama-client-core, both processes will die on completion of
ramalama-run-core.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-04 14:00:50 +00:00
Eric Curtin
e35920d4d5
Reintroduce emoji prompts
...
We fixed most of the bugs around UTF-8, I hope! UTF-8 is not
straightforward in C/C++.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-04 11:45:24 +00:00
Adam Miller
b28b3b84f2
consolidate back to a single image
...
The Fedora 42 ROCm stack is a little over 3.5G smaller then the
Fedora 41 ROCm stack in packaging.
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-03-03 16:06:39 -06:00
Eric Curtin
f8f020abe5
Merge pull request #910 from containers/kompute2vulkan
...
Build a non-kompute Vulkan container image
2025-03-03 20:52:46 +00:00
Adam Miller
f9b43a176a
rebase on Fedora 42
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-03-03 14:27:35 -06:00
Eric Curtin
43b9ab5d5f
Build a non-kompute Vulkan container image
...
There's no image to pull to play around with this backend right
now.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-03 19:15:28 +00:00
Daniel J Walsh
a7d9d82600
Merge pull request #908 from containers/utf-8
...
Update llama.cpp
2025-03-03 12:26:55 -05:00
Brian M
c6e36e9c41
Merge pull request #907 from containers/rm-env-var
...
Use python variable instead of environment variable
2025-03-03 10:00:09 -05:00
Eric Curtin
126cf8744d
Update llama.cpp
...
This version of llama.cpp and linenoise.cpp has UTF-8 support
properly implemented.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-03 14:57:14 +00:00
Eric Curtin
d07e8d096d
Use python variable instead of environment variable
...
environment variables are more global than we need.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-03 12:43:42 +00:00
Eric Curtin
3b0f28569b
Merge pull request #902 from containers/macos-lib-fix
...
Added support for mac cpu and clear warning message
2025-03-03 11:24:31 +00:00
Eric Curtin
2ead1b121e
Merge pull request #903 from alaviss/patch-1
...
readme: fix artifactory link
2025-03-03 11:09:01 +00:00
alaviss
d9567fa71d
readme: fix artifactory link
...
The previous link was to someone with the handle `artifactory`, not JFrog Artifactory.
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-03-03 03:11:15 -06:00
Brian Mahabir
fb775252f4
Added support for macos cpu for apple sillicon and clear warning message
...
Signed-off-by: Brian Mahabir <56164556+bmahabirbu@users.noreply.github.com>
2025-03-03 00:12:04 -05:00
Kush Gupta
9ceb962263
Merge pull request #9 from kush-gupt/hf-repo-detection
...
detect hf repos, fix get_size for directories
2025-03-02 20:58:17 -05:00
Kush Gupta
992a3cd865
safeguard access to siblings in repo_info
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-03-02 20:30:02 -05:00
Kush Gupta
14093209ce
detect hf repos, fix get_size for directories
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-03-02 20:19:24 -05:00
Eric Curtin
1e61963d48
Merge pull request #897 from benoitf/fix-iso8601
...
fix: handling of date with python 3.8/3.9/3.10
2025-02-28 16:26:12 +00:00
Florent Benoit
bdb7ae58a3
fix: handling of date with python 3.9/3.10
...
use a function working on 3.9+
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-28 17:06:08 +01:00
Eric Curtin
dc25be9c2a
Merge pull request #894 from rhatdan/readme
...
Update the README.md to point people at ramalama.ai web site
2025-02-27 15:11:47 +00:00
Daniel J Walsh
5588c2e562
Update the README.md to point people at ramalama.ai web site
...
We need to start updating the web site with blog pointers and release
announcements.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-27 04:49:06 -10:00
Daniel J Walsh
70cfebfb56
Merge pull request #888 from containers/fix-bench
...
benchmark failing because of lack of flag
2025-02-26 22:38:05 -05:00
Daniel J Walsh
ff258ae51f
Merge pull request #891 from containers/demo
...
Switch from tiny to smollm:135m
2025-02-26 22:37:26 -05:00
Eric Curtin
c2f81c54cd
benchmark failing because of lack of flag
...
Specifically priviledged because it's not present in the args
object.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-26 21:05:06 +00:00
Eric Curtin
b027740e42
Switch from tiny to smollm:135m
...
This is probably a consequence of my slow network, but I switched
to smollm:135m, it's easier for demos. tiny was taking too long
to download.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-26 19:43:34 +00:00
Daniel J Walsh
4acfc6b662
Merge pull request #889 from engelmi/inject-config-to-cli-functions
...
Inject config to cli functions
2025-02-26 12:25:43 -05:00
Michael Engel
3622340635
Apply formatting and linting to unit tests
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-26 17:42:30 +01:00
Michael Engel
5957153637
Added unit tests for config functions
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-26 17:42:30 +01:00
Michael Engel
181d4959dd
Provide module for configuration
...
In cli.py we already load and merge configuration from various sources
and set defaults in the load_and_merge_config(). However, we still define
defaults when getting config values in various places.
In order to streamline this, the merged config is being provided by a
dedicated config.py module. Also, access to values is changed from .get
to access by index since a missing key is a bug and should throw an error.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-26 17:42:30 +01:00
Daniel J Walsh
56e3a71a58
Merge pull request #884 from containers/temp-rm-emoji-usage
...
Remove emoji usage until linenoise.cpp and llama-run are compatible
2025-02-25 20:57:10 -05:00
Adam Miller
a21e1e53b1
suppress shellcheck issue with source=/etc/os-release
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-02-25 19:03:08 -06:00
Adam Miller
7e0b5d5095
collapse to a single containerfile, refactor build scripts to accomodate
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-02-25 18:48:37 -06:00
Adam Miller
0b02681cf2
move source to main() and clean up conditional
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-02-25 18:41:14 -06:00
Adam Miller
61a3f0ac2b
update to handle vulkan/blas packages for fedora
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-02-25 18:41:14 -06:00
Adam Miller
549d2eaa4f
Fedora rocm images
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-02-25 18:41:14 -06:00
Eric Curtin
8eb9cf2930
Remove emoji usage until linenoise.cpp and llama-run are compatible
...
Less eyecandy, but at least this works, backspaces for example
were broken. Also split function into multiple functions, it was
getting meaty.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-25 15:37:21 +00:00
Eric Curtin
16d95effbf
Merge pull request #882 from engelmi/move-model-input-prune-to-factory
...
Moved pruning protocol from model to factory
2025-02-25 15:10:55 +00:00
Eric Curtin
65984d1ddb
Merge pull request #881 from kush-gupt/main
...
Add Ollama to CI and system tests for its caching
2025-02-25 15:04:54 +00:00
Michael Engel
fc75d9f593
Moved pruning protocol from model to factory
...
By moving the pruning of the protocol from the model input to
the model_factory and encapsulating it in a dedicated function,
unit tests can be written more easily.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-25 15:48:31 +01:00
Eric Curtin
3721dd9d0c
Merge pull request #879 from containers/dnf
...
The package available via dnf is in a good place
2025-02-25 14:43:44 +00:00
Kush Gupta
7bdc739d32
replace logname (which needs tty) with whoami
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-25 09:42:38 -05:00
Kush Gupta
ddca454c3f
replace hardcoded runner user with whatever user is running the script
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-25 09:39:52 -05:00
Kush Gupta
d863cc998e
script os agnostic ollama install
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-25 09:22:14 -05:00
Kush Gupta
33cf7f6965
Merge branch 'containers:main' into main
2025-02-25 09:06:54 -05:00
Eric Curtin
14a876d544
The package available via dnf is in a good place
...
Defaulting to that on platforms that have dnf, if it fails for
whatever reason, fall back to this script.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-25 13:33:58 +00:00
Kush Gupta
163212b0a1
Simplify check for manifest path
...
Co-authored-by: Michael Engel <mengel@redhat.com>
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-25 07:43:54 -05:00
Eric Curtin
4b14026a26
Merge pull request #880 from containers/vllm
...
Use vllm-openai upstream image
2025-02-25 10:22:57 +00:00
Eric Curtin
3c3b26295c
Merge pull request #878 from containers/check-for-utf8
...
Check if terminal is compatible with emojis before using them
2025-02-25 09:58:20 +00:00
Kush Gupta
f489022d81
Merge pull request #7 from kush-gupt/ollama-cache-tests
...
This adds improvements to the logic for detecting existing Ollama caches and adds a system test to verify the cache functionality. The CI environment is updated to include Ollama.
2025-02-24 20:22:34 -05:00
Kush Gupta
01fb831f27
fix linting
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-24 20:04:55 -05:00
Kush Gupta
033522e70a
remove os.getlogin for TTY issues, install ollama in CI and add ollama cache tests
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-24 19:59:47 -05:00
Eric Curtin
f237011618
Use vllm-openai upstream image
...
The one we are currently using is old and doesn't have .gguf
compatibility.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-25 00:37:54 +00:00
Eric Curtin
94c5e8034f
Check if terminal is compatible with emojis before using them
...
Just in case it doesn't.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-24 22:54:49 +00:00
Eric Curtin
00839ee10f
Merge pull request #875 from rhatdan/version
...
Bump to 0.6.2
2025-02-24 14:50:41 +00:00
Daniel J Walsh
b24e933f8e
Bump to 0.6.2
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-24 09:30:02 -05:00
Daniel J Walsh
967c521595
Merge pull request #873 from benoitf/RAMALAMA-871
...
fix: use iso8601 for JSON modified field
2025-02-24 07:26:08 -05:00
Eric Curtin
cb4ea96b17
Merge pull request #856 from rhatdan/kube
...
Fix up handling of image selection on generate
2025-02-24 12:20:43 +00:00
Florent Benoit
3082ad9cdf
fix: use iso8601 for JSON modified field
...
ensure all dates are using iso8601 format (and in JSON output)
and then use the humanize field for the CLI output
fixes https://github.com/containers/ramalama/issues/871
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-24 12:49:13 +01:00
Daniel J Walsh
ffc8eba1da
Fix up handling of image selection on generate
...
Also fall back to trying OCI images on ramalama run and serve.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-24 06:41:54 -05:00
Daniel J Walsh
7d304a4d51
Merge pull request #872 from benoitf/RAMALAMA-783
...
feat: display emoji of the engine for the run in the prompt
2025-02-24 06:13:49 -05:00
Daniel J Walsh
e9c47dccad
Merge pull request #874 from engelmi/added-model-factory
...
Added model factory
2025-02-24 06:12:11 -05:00
Michael Engel
996e6f551c
Created abstract base class for models
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-24 10:11:12 +01:00
Michael Engel
149086e043
Added unit tests for new model factory
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-24 10:11:12 +01:00
Michael Engel
ca499a9bba
Moved creating model based on cli input to factory
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-24 10:11:12 +01:00
Florent Benoit
4058f81590
feat: display emoji of the engine for the run in the prompt
...
fixes https://github.com/containers/ramalama/issues/783
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-23 22:01:05 +01:00
Daniel J Walsh
2912913036
Merge pull request #870 from benoitf/RAMALAMA-869
...
chore: do not format size for --json export in list command
2025-02-23 15:36:24 -05:00
Florent Benoit
c70c9e245e
chore: do not format size for --json export in list command
...
fixes https://github.com/containers/ramalama/issues/869
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-23 21:20:53 +01:00
Daniel J Walsh
f7a3e635d7
Merge pull request #831 from containers/ci-fixes
...
Make CI build all images
2025-02-23 05:43:35 -05:00
Eric Curtin
bfe91e3c2d
Make CI build all images
...
To ensure they all continue to build and remain of reasonable size.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-22 13:48:50 +00:00
Eric Curtin
c5054f1d29
Merge pull request #864 from rhatdan/cuda
...
Revert back to 12.6 version of cuda
2025-02-20 16:36:56 +00:00
Daniel J Walsh
bc5f35a6a0
Revert back to 12.6 version of cuda
...
This is breaking workloads on Fedora 41 at this time.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-20 11:32:30 -05:00
Eric Curtin
6c756d91d8
Merge pull request #862 from containers/typo
...
Change rune to run
2025-02-20 15:59:08 +00:00
Eric Curtin
07d0b6d909
Merge pull request #863 from containers/fix-macos-podman-acceleration
...
Fix macOS GPU acceleration via podman
2025-02-20 15:50:06 +00:00
Eric Curtin
a705da6c8b
Fix macOS GPU acceleration via podman
...
We should always have acceleration on when running on macOS. There
is a possible case where one may not want to use acceleration on
macOS. If someone is hell-bent on using podman without krunkit on
macOS. But for now, just turn it on regardless.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-20 15:35:34 +00:00
Eric Curtin
eb0b2381d3
Change rune to run
...
Spotted this typo during demo
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-20 14:37:31 +00:00
Eric Curtin
13a1ed8058
Merge pull request #861 from rhatdan/docs
...
Define Environment variables to use
2025-02-20 11:29:05 +00:00
Daniel J Walsh
48d9d8765a
Define Environment variables to use
...
Fixes: https://github.com/containers/ramalama/issues/860
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-19 16:59:31 -05:00
Daniel J Walsh
367d658246
Merge pull request #859 from benoitf/DESKTOP-836
...
chore: add alias from llama-2 to llama2
2025-02-19 16:31:01 -05:00
Florent Benoit
09d1717270
chore: add alias from llama-2 to llama2
...
fixes https://github.com/containers/ramalama/issues/836
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-19 21:58:20 +01:00
Daniel J Walsh
51c85a50a5
Merge pull request #855 from rhatdan/demos
...
Add demos script to show the power of RamaLama
2025-02-19 14:46:12 -05:00
Daniel J Walsh
bdad361ea7
Add demos script to show the power of RamaLama
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-19 13:13:14 -05:00
Eric Curtin
28498f82a4
Merge pull request #840 from containers/network-tests
...
Some tests around --network, --net options
2025-02-19 14:06:29 +00:00
Eric Curtin
63999b323c
Some tests around --network, --net options
...
For run and serve commands.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-19 13:33:40 +00:00
Eric Curtin
f97761b1cc
Merge pull request #854 from containers/add-renovate-json
...
Introduce basic renovate.json file
2025-02-19 10:51:41 +00:00
Giulia Naponiello
1f9bdafdb5
Introduce basic renovate.json file
...
A newly introduced renovate.json configures renovate to automate
these updates that used to be performed manually:
- https://github.com/containers/ramalama/pull/746
- https://github.com/containers/ramalama/pull/816
Signed-off-by: Giulia Naponiello <gnaponie@redhat.com>
2025-02-19 10:01:00 +01:00