Daniel J Walsh
3f8e31a073
Merge pull request #1714 from slp/install-virglrenderer
...
container-images: add virglrenderer to vulkan
2025-07-19 06:35:54 -04:00
Daniel J Walsh
08722738cf
Merge pull request #1718 from containers/konflux/references/main
...
Update Konflux references
2025-07-19 06:34:54 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
ab7adbb430
Update Konflux references
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-19 08:03:10 +00:00
Mike Bonnet
72504179fc
Merge pull request #1716 from containers/fix-sentencepiece-build
...
build_rag.sh: install cmake
2025-07-18 10:54:06 -07:00
Mike Bonnet
dcfeee8538
build_rag.sh: install cmake
...
cmake is required to build sentencepiece.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-18 09:17:33 -07:00
Daniel J Walsh
1d903e746c
Merge pull request #1677 from containers/vllm-cpu
...
Add vllm to cpu inferencing Containerfile
2025-07-18 06:05:26 -04:00
Daniel J Walsh
13a22f6671
Merge pull request #1708 from containers/konflux-more-images
...
konflux: add pipelines for asahi, cann, intel-gpu, llama-stack, musa, openvino, and ramalama-cli
2025-07-18 06:04:12 -04:00
Daniel J Walsh
1d6aa51cd7
Merge pull request #1712 from tonyjames/main
...
Add support for Intel Iris Xe Graphics (46AA, 46A6, 46A8)
2025-07-18 06:03:34 -04:00
Tony James
50d01f177b
Add support for Intel Iris Xe Graphics (46AA, 46A6, 46A8)
...
Signed-off-by: Tony James <3128081+tonyjames@users.noreply.github.com>
2025-07-17 18:58:07 -04:00
Eric Curtin
234134b5cc
Add vllm to cpu inferencing Containerfile
...
To be built upon "ramalama" image
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-17 21:09:20 +01:00
Daniel J Walsh
64ca9cfb4a
Merge pull request #1709 from containers/fix-cuda-gpu
...
fix GPU selection and pytorch URL when building rag images
2025-07-17 11:31:41 -04:00
Eric Curtin
e3dda75ec6
Merge pull request #1707 from rhatdan/install
...
README: remove duplicate statements
2025-07-17 15:57:12 +01:00
Daniel J Walsh
075df4bb87
Merge pull request #1617 from jwieleRH/check_nvidia
...
Improve NVIDIA GPU detection.
2025-07-17 06:29:40 -04:00
Daniel J Walsh
5b46b23f2e
README: remove duplicate statements
...
Simplify ramalama's top-level description. Remove the duplicate
statements.
Also make sure all references to PyPI are spelled this way.
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-07-17 06:26:55 -04:00
Daniel J Walsh
1fe1b20c8c
Merge pull request #1711 from carlwgeorge/include-config-in-wheel
...
Included ramalama.conf in wheel
2025-07-17 06:21:47 -04:00
Mike Bonnet
f5512c8f65
build_rag.sh: install sentencepiece via pip
...
python3-sentencepiece was pulling in an older version of protobuf.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 23:27:54 -07:00
Mike Bonnet
7132d5a7f8
build_rag.sh: disable pip cache
...
pip's caching behavior was causing errors when downloading huge (4.5G) torch wheels during
the rocm-ubi-rag build.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 23:27:54 -07:00
Mike Bonnet
2d3f8dfe28
fix GPU selection and pytorch URL when building rag images
...
A previous commit changed the second argument to add_rag() from the image name to the
full repo path. Update the case statement accordingly, so the "GPU" variable is set correctly.
The "cuda" directory is no longer available on download.pytorch.org. When building for cuda,
pull wheels from the "cu128" directory, which contains binaries built for CUDA 12.8.
When building rocm* images, download binaries from the "rocm6.3" directory, which are built
for ROCm 6.3.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 23:27:54 -07:00
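A minimal sketch of the selection logic described in commit 2d3f8dfe28 above, written in Python rather than the shell of build_rag.sh. The function names, the "cpu" fallback, and the exact base URL are assumptions for illustration; only the "cu128" and "rocm6.3" directory names come from the commit message.

```python
PYTORCH_WHL = "https://download.pytorch.org/whl"  # assumed base URL


def gpu_for(repo_path: str) -> str:
    """Classify an image by the last component of its full repo path."""
    name = repo_path.rsplit("/", 1)[-1]
    if name == "cuda":
        return "cuda"
    if name.startswith("rocm"):  # covers rocm and rocm-ubi
        return "rocm"
    return "cpu"  # assumed fallback, not stated in the commit


def torch_index_url(gpu: str) -> str:
    """Pick the wheel directory that matches the GPU family."""
    subdir = {"cuda": "cu128", "rocm": "rocm6.3"}.get(gpu, "cpu")
    return f"{PYTORCH_WHL}/{subdir}"
```

Matching on the last path component is what keeps the lookup working once the caller passes a full repo path instead of a bare image name.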
Carl George
1d8a2e5b6c
Included ramalama.conf in wheel
...
Currently other data files such as shortnames.conf, man pages, and shell
completions are included in the Python wheel. Including ramalama.conf
as well means we can avoid several calls to make in the RPM spec file,
instead relying on the wheel mechanisms to put these files in place. As
long as `make docs` is run before the wheel generation, all the
necessary files are included.
Signed-off-by: Carl George <carlwgeorge@gmail.com>
2025-07-17 01:20:28 -05:00
Eric Curtin
42ac787686
Merge pull request #1710 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752625787
2025-07-17 01:15:43 +01:00
red-hat-konflux-kflux-prd-rh03[bot]
18c560fff6
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752625787
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-17 00:03:57 +00:00
John Wiele
ce35ccb4c3
Remove pyYAML as a dependency.
...
Extract information directly from the CDI YAML file by making some
simplifying assumptions instead of doing a complete YAML parse.
Default to all devices known to nvidia-smi.
Fix the signature of check_nvidia().
Remove some debug logging.
Signed-off-by: John Wiele <jwiele@redhat.com>
2025-07-16 16:11:39 -04:00
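One way to read device names out of a CDI file without a YAML parser, as commit ce35ccb4c3 above describes, is a line-oriented scan. This is only a sketch under a simplifying assumption (every device appears as a "- name: ..." list item and nothing else in the file uses that key); the helper name and the sample document are hypothetical, not the project's code.

```python
import re

# Matches list items such as:  - name: "0"   or   - name: all
_NAME_RE = re.compile(r"^\s*-\s+name:\s*(\S+)\s*$")

SAMPLE = """\
cdiVersion: 0.5.0
kind: nvidia.com/gpu
devices:
- name: "0"
  containerEdits:
    deviceNodes:
    - path: /dev/nvidia0
- name: all
"""


def cdi_device_names(text: str) -> list[str]:
    """Collect device names by scanning lines instead of parsing YAML."""
    names = []
    for line in text.splitlines():
        match = _NAME_RE.match(line)
        if match:
            # Drop optional surrounding quotes around the value.
            names.append(match.group(1).strip("\"'"))
    return names
```

The trade-off is robustness: a scan like this ignores YAML features such as trailing comments or flow-style lists, which is the price of dropping the parser dependency.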
John Wiele
b97177b408
Apply suggestions from code review
...
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Signed-off-by: John Wiele <jwiele@redhat.com>
2025-07-16 16:08:32 -04:00
John Wiele
14c4aaca39
Improve NVIDIA GPU detection.
...
Allow GPUs to be specified by UUID as well as index since the index is
not guaranteed to persist across reboots.
Crosscheck requested GPUs with nvidia-smi and CDI configuration. If
any requested GPUs lack corresponding CDI configuration, print a
message with a pointer to documentation.
If the only GPU specified in the CDI configuration is "all", as
appears to be the case on WSL2, use "all" as the default.
Add an optional encoding argument to run_cmd() to facilitate checking
the output of the command.
Add pyYAML as a dependency for parsing the CDI configuration.
Signed-off-by: John Wiele <jwiele@redhat.com>
2025-07-16 16:08:32 -04:00
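The cross-check in commit 14c4aaca39 above can be pictured roughly as follows. The sketch assumes the GPU list comes from `nvidia-smi --query-gpu=index,uuid --format=csv,noheader` and that CDI device names are already available as a set; both helper names are hypothetical, and the special "all" device is deliberately not handled here.

```python
def parse_smi(output: str) -> dict[str, str]:
    """Map GPU index -> UUID from 'index, uuid' CSV lines."""
    gpus = {}
    for line in output.splitlines():
        if line.strip():
            index, uuid = (field.strip() for field in line.split(",", 1))
            gpus[index] = uuid
    return gpus


def missing_cdi(requested: list[str], gpus: dict[str, str], cdi_names: set[str]) -> list[str]:
    """Return requested GPUs (by index or UUID) without a matching CDI device."""
    by_uuid = {uuid: index for index, uuid in gpus.items()}
    missing = []
    for gpu in requested:
        if gpu in gpus:
            index, uuid = gpu, gpus[gpu]
        elif gpu in by_uuid:
            index, uuid = by_uuid[gpu], gpu
        else:
            missing.append(gpu)  # not known to nvidia-smi at all
            continue
        if index not in cdi_names and uuid not in cdi_names:
            missing.append(gpu)
    return missing
```

A non-empty result is the point at which a caller would print the message pointing at the CDI documentation.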
Mike Bonnet
bf4fd56106
konflux: add pipelines for asahi, cann, intel-gpu, llama-stack, musa, openvino, and ramalama-cli
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 12:23:12 -07:00
Mike Bonnet
1373a8e7ba
konflux: don't trigger pipelines on PR transition to "Ready for Review"
...
By default, Konflux triggers new pipelines when a PR moves from Draft to
"Ready for Review". Because the commit SHA hasn't changed, no new builds
are performed. However, a new integration test is also triggered, and because
no builds were performed it is unable to find the URL and digest of the images,
causing the integration test to fail. Updating the "on-cel-expression" to exclude
the transition to "Ready for Review" avoids the unnecessary pipelines and the
false integration test failures.
Update the whitespace of the "on-cel-expression" in the push pipelines for consistency.
No functional change.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 12:11:54 -07:00
Sergio Lopez
74584d0b5e
container-images: add virglrenderer to vulkan
...
When running in a krun-isolated container, we need
"/usr/libexec/virgl_render_server" to be present in the container
image to launch it before entering the microVM.
Install the virglrenderer package in addition to mesa-vulkan-drivers.
Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-07-16 18:53:11 +02:00
Daniel J Walsh
4dea2ee02f
Merge pull request #1687 from containers/konflux-cuda-arm64
...
konflux: build cuda on arm64, and simplify testing
2025-07-16 12:01:45 -04:00
Mike Bonnet
069e98c095
fix unit tests to be independent of environment
...
Setting RAMALAMA_IMAGE would cause some unit tests to fail. Make those
tests independent of the calling environment.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 06:44:00 -07:00
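A common way to make a unit test independent of the calling environment, as commit 069e98c095 above sets out to do, is to patch os.environ for the duration of the test. The function under test and the image name below are hypothetical stand-ins, not the project's code.

```python
import os
from unittest import mock

DEFAULT_IMAGE = "registry.example/default:latest"  # hypothetical default


def default_image() -> str:
    """Stand-in for code under test: the env var overrides the default."""
    return os.environ.get("RAMALAMA_IMAGE", DEFAULT_IMAGE)


def test_default_image_ignores_caller_env():
    # clear=True gives the test an empty environment inside the block;
    # the caller's variables are restored when the block exits.
    with mock.patch.dict(os.environ, {}, clear=True):
        assert default_image() == DEFAULT_IMAGE
```

With this pattern the test passes whether or not RAMALAMA_IMAGE is exported in the shell that launches the test run.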
Mike Bonnet
f57b8eb284
konflux: copy source into the bats image
...
Including the source in the bats image ensures that we're always testing with the same
version of the code that was used to build the images. It also eliminates the need for
repeated checkouts of the repo and simplifies testing, avoiding additional volumes and
artifact references.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 06:44:00 -07:00
Mike Bonnet
299d3b9b75
konflux: build cuda and layered images on arm64
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-16 06:44:00 -07:00
Stephen Smoogen
683b8fb8a0
Minor fixes to rpm builds by packit and spec file. (#1704)
...
* This removes epel9 from packit rules as epel9 does not currently
build without many additional packages added to the distro.
* This fixes a breakage in epel10 by adding mailcap as a buildrequires.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
Co-authored-by: Stephen Smoogen <ssmoogen@redhat.com>
2025-07-16 09:37:00 -04:00
Mike Bonnet
64e22ee0aa
Merge pull request #1700 from containers/test-optimization-and-fixup
...
reduce unnecessary image pulls during testing, and re-enable a couple tests
2025-07-15 11:34:59 -07:00
Mike Bonnet
651fc503bd
implement "ps --noheading" for docker using --format
...
"docker ps" does not support the "--noheading" option. Use the --format
option to emulate the behavior.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-15 10:32:53 -07:00
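The emulation described in commit 651fc503bd above might look something like this. The helper name and the chosen columns are assumptions; the sketch relies only on the fact that a docker `--format` template without the `table` prefix prints rows and no header line.

```python
def ps_command(engine: str, noheading: bool = False) -> list[str]:
    """Build a `ps` command line for the given container engine."""
    cmd = [engine, "ps"]
    if not noheading:
        return cmd
    if engine == "podman":
        # podman understands --noheading directly.
        cmd.append("--noheading")
    else:
        # docker has no --noheading; a plain Go template emits rows only.
        cmd += ["--format", "{{.ID}} {{.Image}} {{.Names}}"]
    return cmd
```

For example, `ps_command("docker", noheading=True)` yields a command whose output can be parsed line by line without first skipping a header.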
Daniel J Walsh
384cad7161
Merge pull request #1696 from containers/renovate/quay.io-konflux-ci-build-trusted-artifacts-latest
...
chore(deps): update quay.io/konflux-ci/build-trusted-artifacts:latest docker digest to f7d0c51
2025-07-15 13:17:33 -04:00
Daniel J Walsh
3dec0d7487
Merge pull request #1699 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752587049
2025-07-15 13:16:27 -04:00
Daniel J Walsh
d7763ad1c5
Merge pull request #1698 from containers/mistral
...
Mistral should point to lmstudio gguf
2025-07-15 13:15:11 -04:00
Mike Bonnet
b550cc97d2
bats: re-enable a couple tests, and minor cleanup
...
Fix the "serve and stop" test by passing the correct (possibly random) port to "ramalama chat".
Fix the definition of "ramalama_runtime".
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-15 09:03:49 -07:00
Mike Bonnet
927d2f992a
bats: allow the container to use the overlay driver when possible
...
Remove the STORAGE_DRIVER env var from the container so it doesn't force use
of the vfs driver in all cases.
Mount /dev/fuse into the container when running locally.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-15 09:03:49 -07:00
Mike Bonnet
f176bb3926
add a dryrun field to Config, and set it early
...
accel_image() is called to set option defaults, before options are even parsed.
This can cause images to be pulled even if they will not actually be used, slowing
down testing and making the cli less responsive. Set the "dryrun" option before
the first call to accel_image() to avoid unnecessary image pulls.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-15 09:03:49 -07:00
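The ordering problem in commit f176bb3926 above can be sketched as follows: the dry-run flag is read from the raw arguments before any defaults are computed, so the default-image lookup can skip the pull. The Config class, image name, and PULLED list are simplified stand-ins, not the project's actual code.

```python
from dataclasses import dataclass


@dataclass
class Config:
    dryrun: bool = False


PULLED: list[str] = []  # records stand-in "pulls" so the effect is observable


def accel_image(config: Config) -> str:
    image = "registry.example/accel:latest"  # hypothetical image name
    if not config.dryrun:
        PULLED.append(image)  # stand-in for a real image pull
    return image


def load_config(argv: list[str]) -> Config:
    # Peek at the raw arguments: the full parser cannot be built yet,
    # because its option defaults come from accel_image().
    return Config(dryrun="--dryrun" in argv)
```

Calling `accel_image(load_config(["run", "--dryrun", "model"]))` returns the image name without recording a pull.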
renovate[bot]
f38c736d23
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752587049
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-07-15 15:47:38 +00:00
Eric Curtin
fa2f485175
Mistral should point to lmstudio gguf
...
I don't know who MaziyarPanahi is, but I know who lmstudio are
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-15 15:04:54 +01:00
Mike Bonnet
f8c41b38c1
avoid unnecessary image pulls
...
Don't pull images in _get_rag() and _get_source_model() if pull == "never"
or if running with "--dryrun".
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-14 14:42:21 -07:00
renovate[bot]
b7323f7972
chore(deps): update quay.io/konflux-ci/build-trusted-artifacts:latest docker digest to f7d0c51
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-07-14 20:13:34 +00:00
Daniel J Walsh
53e38dea8f
Merge pull request #1694 from rhatdan/VERSION
...
Bump to 0.11.0
2025-07-14 10:59:06 -04:00
Daniel J Walsh
bf68cfddd3
Bump to 0.11.0
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-14 10:32:32 -04:00
Stephen Smoogen
8ab242f820
Move rpms (#1693)
...
* Start adding rpm/ramalama.spec for Fedora
Add a ramalama.spec to sit next to python-ramalama.spec while we get
this reviewed. Change various configs so they are aware of
ramalama.spec
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Add needed obsoletes/provides in base rpm to start process.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Try to fix CI problems with initial mr
The initial MR puts two spec files in the same directory which was
causing problems with the CI. This splits them off into different
directories which should allow for the tooling to work.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Finish move of Fedora rpm package to new name.
Put changes into various files needed to allow for new RPM package
`ramalama` to build in Fedora infrastructure versus python3-ramalama.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Fix problem with path names lsm5 caught
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
---------
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
Co-authored-by: Stephen Smoogen <ssmoogen@redhat.com>
2025-07-14 10:13:49 -04:00
Daniel J Walsh
eba46c8df6
Merge pull request #1691 from mbortoli/readme-improvements
...
Readme improvements: Update model's name and improve CUDA_VISIBLE_DEVICES section
2025-07-14 10:03:20 -04:00
Mario Antonio Bortoli Filho
b5826c96e9
README: fix model name and improve CUDA section
...
- Corrected the model name under the Benchmark section; previous name was not available in Ollama's registry.
- Added instructions to switch between CPU-only mode and using all available GPUs via CUDA_VISIBLE_DEVICES.
Signed-off-by: Mario Antonio Bortoli Filho <mario@bortoli.dev>
2025-07-14 09:43:16 -03:00
Daniel J Walsh
066b659f3a
Merge pull request #1689 from containers/pip-install
...
Only install if pyproject.toml exists
2025-07-14 06:07:24 -04:00
Eric Curtin
6d7effadc2
Only install if pyproject.toml exists
...
Otherwise skip
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-13 22:13:11 +01:00
Daniel J Walsh
1d2e1a1e01
Merge pull request #1688 from containers/konflux/references/main
...
chore(deps): update konflux references
2025-07-12 06:06:35 -04:00
Daniel J Walsh
a54e2b78c4
Merge pull request #1681 from ramalama-labs/bug/chat-fix
...
Bug/chat fix
2025-07-12 06:05:31 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
f4cec203ac
chore(deps): update konflux references
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-12 08:02:23 +00:00
Ian Eaves
a616005695
resolve merge conflicts
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-07-11 16:47:56 -05:00
Daniel J Walsh
c7c0f7d2e5
Merge pull request #1685 from rhatdan/convert
...
Allow `ramalama rag` to output different formats
2025-07-11 16:18:19 -04:00
Daniel J Walsh
b630fcdea2
Allow ramalama rag to output different formats
...
Add ramalama rag --format option to allow outputting
of markdown, json as well as qdrant databases.
This content can then be used as input to the client tool.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Co-authored-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-07-11 11:04:32 -04:00
Daniel J Walsh
027f88cf31
Merge pull request #1683 from containers/konflux-integration
...
konflux: add integration tests that run in multi-arch VMs
2025-07-10 15:05:03 -04:00
Mike Bonnet
d7ed2216dd
konflux: build entrypoint images on smaller instances
...
The entrypoint image builds are very lightweight, use smaller instances
to reduce resource consumption.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-10 09:19:16 -07:00
Mike Bonnet
6d9a7eea9e
konflux: build rag images on instance types with more disk space
...
-rag builds were failing due to the 40G disk filling up. Run builds on
newly-available "d160" instance types which have 160G of disk space
available.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-10 09:19:16 -07:00
Mike Bonnet
5ebc48f453
konflux: add integration tests that run in multi-arch VMs
...
The integration tests will be triggered after all image builds associated with a single
commit are complete. Tests are currently being run on amd64 and arm64 platforms.
Remove "bats-nocontainer" from the build-time tests, since those are covered by "bats" run
in the integration tests.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-10 09:19:16 -07:00
Mike Bonnet
b6cb2fdbe2
konflux: build all ramalama layered images on arm64
...
Some bats tests need the ramalama-rag image available for the current arch. Build
all the ramalama layered images on arm64 as well as amd64.
Switch to building on larger VM instance types to reduce build times and improve
developer feedback and experience.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-10 06:44:58 -07:00
Mike Bonnet
f75599097e
bats: ignore RAMALAMA_IMAGE from the calling environment
...
Some tests parse the output of the ramalama cli and hard-code the location of the expected
default image. However, this output changes based on the value of the RAMALAMA_IMAGE
environment variable, and setting this variable in the calling environment can cause those
tests to fail. Unset the RAMALAMA_IMAGE environment variable in these tests to avoid false failures.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-10 06:44:58 -07:00
Daniel J Walsh
80317bffbc
Merge pull request #1684 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752069608
2025-07-10 07:45:47 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
124afc14bb
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1752069608
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-10 00:02:12 +00:00
Daniel J Walsh
79b23e1237
Merge pull request #1668 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1752069608
2025-07-09 16:04:00 -04:00
Daniel J Walsh
5fd301532c
Merge pull request #1679 from containers/bugfix-for-chat
...
Bugfix for chat
2025-07-09 16:03:14 -04:00
Daniel J Walsh
64d53180fd
Merge pull request #1680 from nathan-weinberg/bump-er
...
chore: bump ramalama-stack to 0.2.5
2025-07-09 16:01:09 -04:00
Daniel J Walsh
c0278c1b8c
Merge pull request #1676 from rhatdan/selinux
...
Enable SELinux separation
2025-07-09 16:00:35 -04:00
Nathan Weinberg
e402a456cf
chore: bump ramalama-stack to 0.2.5
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-09 15:42:33 -04:00
Eric Curtin
3da38bc7b8
Bugfix for chat
...
This was recently removed:
+ if getattr(self.args, "model", False):
+ data["model"] = self.args.model
it is required
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-09 20:13:09 +01:00
Daniel J Walsh
980179d5ca
Enable SELinux separation
...
Remove some unused functions from model.py
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-09 13:11:58 -04:00
Daniel J Walsh
657bacb52e
Merge pull request #1675 from rhatdan/image
...
Hide --container option, having --container/--nocontainer is confusing
2025-07-09 09:55:44 -04:00
Daniel J Walsh
09c6ccb2f0
Hide --container option, having --container/--nocontainer is confusing
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-09 06:58:04 -04:00
Daniel J Walsh
7f09d4bf5b
Merge pull request #1643 from engelmi/enhance-ref-file
...
Enhance ref file and mount all snapshot files to container
2025-07-08 13:37:45 -04:00
Michael Engel
7a6c9977f7
Disable generate and serve OCI image test
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:36:39 +02:00
Michael Engel
def6116f15
Add deduplication check by file hash to update_snapshot
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
5e39e11678
Remove limitation of only one model file per snapshot
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
21e42fc837
Move split model logic to url model class
...
The split model feature was exclusive to URL models. Because of this - and the
improvements in mounting all model snapshot files - the logic has been removed
from the ModelFactory and moved into the URL model class.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
eefafe24fd
Refactor model classes to mount all snapshot files instead of explicit ones
...
Previously, the model, mmproj and chat template files were mounted explicitly if
present using many if-exists checks. Relying on the new ref file, all files of that
model snapshot are either mounted or used directly via their blob paths. When mounted
into a container, the files are put into MNT_DIR with the respective file names.
The split_model part has been dropped for now, but will be refactored in the next
commit.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
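For commit eefafe24fd above, mounting every file of a snapshot under its own name could be sketched like this. The MNT_DIR value, the blob directory layout, and the helper name are assumptions; the `--mount=type=bind,...` syntax follows podman's documented form.

```python
MNT_DIR = "/mnt/models"  # assumed mount point inside the container


def mount_args(files: dict[str, str], blobs_dir: str) -> list[str]:
    """Turn {file name: blob name} from a ref file into bind-mount options."""
    args = []
    for name, blob in sorted(files.items()):
        src = f"{blobs_dir}/{blob}"
        args.append(f"--mount=type=bind,src={src},destination={MNT_DIR}/{name},ro")
    return args
```

Because the loop is driven by whatever the ref file lists, no per-file-type existence checks are needed.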
Michael Engel
6cbaf692aa
Remove obsolete glob check if model exists
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
129ee175d6
Fixed using model_store instead of store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
8e98c77f54
Removed unused function
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
e398941913
Remove unused garbage collection function
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
d95bd13ca0
Replace RefFile by RefJSONFile in model store
...
Replacing the use of RefFile with the new RefJSONFile in model store. It also adds
support for adhoc migration of old to new ref file format.
This will break ramalama as is since no specific functionality for getting the explicit
(gguf) model file path has been implemented. Will be adjusted in the next commit to
fix this.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
Michael Engel
496439ea02
Added new ref file format
...
Added a new, simpler ref file format serialized as JSON. It also gets additional
fields such as the hash of the file that is used as the name of the blob file.
This essentially makes the snapshot directory and all symlinks obsolete, further
simplifying the storage and improving stability. It also makes the ref file
the single source for all files of a model.
Further refactoring, incl. swapping and migrating from the old to new format, will
follow in subsequent commits.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 18:33:42 +02:00
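To make the idea in commit 496439ea02 above concrete, here is a hedged sketch of a JSON ref file in which each entry carries the content hash that doubles as the blob file name. The class names, field names, and the "sha256-" prefix are illustrative assumptions and do not reproduce the real RefJSONFile.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RefEntry:
    name: str  # file name the runtime expects, e.g. "model.gguf"
    hash: str  # content hash; also used as the blob file name
    kind: str  # assumed labels such as "model" or "mmproj"


@dataclass
class RefSketch:
    files: list[RefEntry] = field(default_factory=list)

    def add(self, name: str, data: bytes, kind: str) -> RefEntry:
        digest = "sha256-" + hashlib.sha256(data).hexdigest()
        entry = RefEntry(name, digest, kind)
        self.files.append(entry)
        return entry

    def dumps(self) -> str:
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def loads(cls, text: str) -> "RefSketch":
        raw = json.loads(text)
        return cls(files=[RefEntry(**item) for item in raw["files"]])
```

Since every file is addressed by hash through one document, there is no need for a separate snapshot directory of symlinks to tie names to blobs.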
Daniel J Walsh
bf0af8034a
Merge pull request #1673 from containers/mlx-fixes
...
mlx fixes
2025-07-08 11:51:26 -04:00
Daniel J Walsh
99f56a7684
Merge pull request #1669 from rhatdan/image
...
move --image & --keep-groups to run, serve, perplexity, bench commands
2025-07-08 11:49:07 -04:00
Eric Curtin
5b20aa4e2c
mlx fixes
...
mlx_lm.server is the only one in my path at least on my system.
Also, printing output like this which doesn't make sense:
Downloading huggingface://RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic/model.safetensors:latest ...
Trying to pull huggingface://RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic/model.safetensors:latest ...
Also remove the recommendation to install via `brew install ramalama`, which skips installing
Apple-specific dependencies.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-08 15:57:25 +01:00
Eric Curtin
957dfd52e7
Merge pull request #1672 from containers/revert-1671-set-ramalama-stack-version
...
Revert "feat: allow for dynamic version installing of ramalama-stack"
2025-07-08 14:54:23 +01:00
Eric Curtin
ebb8ea93fd
Revert "feat: allow for dynamic version installing of ramalama-stack"
2025-07-08 14:53:25 +01:00
Daniel J Walsh
7dc3d9da8e
move --image & --keep-groups to run, serve, perplexity, bench commands
...
This eliminates the need for pulling images by accident when not
using containers. Since these commands are only used for container
commands, no need for them in other places.
Fixes: https://github.com/containers/ramalama/issues/1662
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-08 09:09:41 -04:00
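Scoping options to the container-related subcommands, as commit 7dc3d9da8e above does, can be sketched with argparse subparsers. The program name and the reduced command list are assumptions; the real CLI defines many more commands and options.

```python
import argparse

CONTAINER_COMMANDS = ("run", "serve", "perplexity", "bench")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ramalama-sketch")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in CONTAINER_COMMANDS + ("list",):
        cmd = sub.add_parser(name)
        if name in CONTAINER_COMMANDS:
            # Only commands that may start a container accept these options.
            cmd.add_argument("--image")
            cmd.add_argument("--keep-groups", action="store_true")
    return parser
```

With this layout, `list` never carries an `image` attribute, so nothing downstream can be tempted to resolve or pull an image for it.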
Daniel J Walsh
72aa795b17
Merge pull request #1666 from engelmi/inspect-add-safetensor-support
...
Inspect add safetensor support
2025-07-08 08:54:15 -04:00
Eric Curtin
2fea5f86f6
Merge pull request #1671 from nathan-weinberg/set-ramalama-stack-version
...
feat: allow for dynamic version installing of ramalama-stack
2025-07-08 13:39:00 +01:00
Michael Engel
412d5616d3
Catch error on creating snapshot and log error
...
Relates to: https://github.com/containers/ramalama/issues/1663
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 12:57:12 +02:00
Michael Engel
3b880923c0
Added support for safetensors to inspect command
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-08 12:57:12 +02:00
Daniel J Walsh
b7c15ce86a
Merge pull request #1664 from containers/konflux/references/main
...
chore(deps): update konflux references
2025-07-08 06:48:46 -04:00
Daniel J Walsh
87287ae574
Merge pull request #1670 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1751897624
2025-07-08 06:43:49 -04:00
Nathan Weinberg
eeaab7276c
feat: allow for dynamic version installing of ramalama-stack
...
previously we were setting an explicit version of `ramalama-stack`
in the Containerfile restricting what we used at runtime
moved the install to the entrypoint script and allowed the use of
the RAMALAMA_STACK_VERSION env var to install a specific version
(default with no env var installs the latest package and pulls the
YAML files from the main branch)
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-07 21:38:30 -04:00
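The version selection described in commit eeaab7276c above amounts to turning an optional environment variable into a pip requirement. The sketch below is in Python for illustration (the commit itself moved this logic into an entrypoint script), and the helper name is hypothetical.

```python
import os
from typing import Mapping, Optional


def stack_requirement(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the pip requirement for ramalama-stack, pinned when requested."""
    env = os.environ if env is None else env
    version = env.get("RAMALAMA_STACK_VERSION", "").strip()
    # No version set: leave the requirement unpinned so pip takes the latest.
    return f"ramalama-stack=={version}" if version else "ramalama-stack"
```

For instance, an environment containing RAMALAMA_STACK_VERSION=0.2.4 produces the pinned requirement "ramalama-stack==0.2.4".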
red-hat-konflux-kflux-prd-rh03[bot]
8104b697dd
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1751897624
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-08 00:05:41 +00:00
renovate[bot]
eacaffe03d
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1751897624
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-07-07 19:49:58 +00:00
Daniel J Walsh
21957b22c2
Merge pull request #1661 from ramalama-labs/feat/vision
...
Adds the ability to include vision based context to chat via --rag
2025-07-07 09:36:52 -04:00
Daniel J Walsh
cd7220a3ea
Merge pull request #1667 from rhatdan/VERSION
...
Bump to v0.10.1
2025-07-07 08:19:41 -04:00
Daniel J Walsh
fe3731dffc
Bump to v0.10.1
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-07 06:52:23 -04:00
Eric Curtin
0947e11f13
Merge pull request #1665 from rhatdan/pull
...
Make sure errors and progress messages go to STDERR
2025-07-06 14:58:38 +01:00
Daniel J Walsh
ab4d0f2202
Make sure errors and progress messages go to STDERR
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-06 07:12:14 -04:00
Daniel J Walsh
c62a2a4e5b
Move download_file to http_client rather than common
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-06 06:52:51 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
ee8d7a3a04
chore(deps): update konflux references
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-05 08:02:20 +00:00
Daniel J Walsh
c9f9f691aa
Merge pull request #1642 from kush-gupt/feat/mlx
...
MLX runtime support
2025-07-04 05:53:41 -04:00
Ian Eaves
fe2d22c848
renamed tests + lint
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-07-03 17:47:02 -05:00
Ian Eaves
cba091b265
adds vision to chat
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-07-03 17:37:11 -05:00
Kush Gupta
bc92481a66
Fix API request
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-07-03 10:18:26 -04:00
Eric Curtin
149c9f101c
Merge pull request #1660 from telemaco/pre-commit-conf
...
Add .pre-commit-config.yaml
2025-07-03 14:54:26 +01:00
Eric Curtin
06488b45f1
Merge pull request #1637 from rhatdan/store
...
Always use absolute path for --store option
2025-07-03 14:33:44 +01:00
Eric Curtin
4482803eb2
Merge pull request #1657 from containers/konflux-layered-images
...
konflux: add pipelines for the layered images of ramalama, cuda, rocm, and rocm-ubi
2025-07-03 14:32:37 +01:00
Kush Gupta
277cb4f504
make sure host is not in container, don't care about llama.cpp args
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-07-03 09:27:15 -04:00
Kush Gupta
d77b7ce231
mlx runtime with client/server
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-07-03 09:27:00 -04:00
Roberto Majadas
5a51552d1f
Add pre-commit configuration
...
Signed-off-by: Roberto Majadas <rmajadas@redhat.com>
2025-07-03 14:40:49 +02:00
Daniel J Walsh
8501240d43
Merge pull request #1659 from telemaco/lint-and-format-conf-updates
...
Update lint and format tools configuration
2025-07-03 07:12:52 -04:00
Eric Curtin
c791ac1602
Merge pull request #1658 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1751445649
2025-07-03 11:45:32 +01:00
Daniel J Walsh
689955480c
Always use absolute path for --store option
...
Fixes: https://github.com/containers/ramalama/issues/1634
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-07-03 06:38:57 -04:00
Roberto Majadas
e5e6195c49
Update lint and format tools configuration
...
Signed-off-by: Roberto Majadas <rmajadas@redhat.com>
2025-07-03 12:32:26 +02:00
Daniel J Walsh
ae38e3f09c
Merge pull request #1632 from ramalama-labs/feat/user-prompt-configs
...
Adds a user configuration setting to disable gpu prompting
2025-07-03 06:30:52 -04:00
Daniel J Walsh
c32d67fd4e
Merge pull request #1635 from containers/list-models
...
Add command to list available models
2025-07-03 06:23:35 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
9c43c0ba71
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1751445649
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-03 00:03:59 +00:00
Ian Eaves
27fa3909a3
adds user prompt controls
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-07-02 16:58:37 -05:00
Mike Bonnet
0808cf76b9
konflux: add pipelines for the layered images of ramalama, cuda, rocm, and rocm-ubi
...
Build the -llama-server, -whisper-server, and -rag layered images, which inherit from
the existing ramalama, cuda, rocm, and rocm-ubi images.
Layered images use shared Containerfiles, and customize their builds using --build-arg.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-02 12:19:25 -07:00
Eric Curtin
3a61309e10
Add command to list available models
...
With commands such as:
ramalama chat --url https://generativelanguage.googleapis.com/v1beta/openai --ls
we can now list the various models available.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-07-02 16:52:28 +01:00
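Listing models from an OpenAI-compatible endpoint, as in commit 3a61309e10 above, typically means a GET on the `/models` path and reading the `id` of each item under `data`. The following is a sketch under that assumption; the function names are hypothetical and this is not the project's implementation.

```python
import json
import urllib.request
from typing import Optional


def model_ids(payload: dict) -> list[str]:
    """Extract model identifiers from a /models response body."""
    return [item["id"] for item in payload.get("data", [])]


def list_models(base_url: str, api_key: Optional[str] = None) -> list[str]:
    """Fetch <base_url>/models and return the model identifiers."""
    request = urllib.request.Request(base_url.rstrip("/") + "/models")
    if api_key:
        request.add_header("Authorization", f"Bearer {api_key}")
    with urllib.request.urlopen(request) as response:
        return model_ids(json.load(response))
```

Keeping the parsing in `model_ids` separate from the network call makes the response handling easy to test without a live server.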
Eric Curtin
8dc1144cbd
Merge pull request #1641 from containers/layered-containerfiles
...
build layered images from Containerfiles
2025-07-02 10:19:32 +01:00
Mike Bonnet
46c0154d2a
build layered images from Containerfiles
...
Move the Containerfiles for the entrypoint and rag images out of container_build.sh and into their
own files. This is necessary so they can be built with Konflux.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-07-01 15:14:26 -07:00
Eric Curtin
e624a41063
Merge pull request #1631 from jbtrystram/fix_quadlet
...
quadlet: add missing privileged options
2025-07-01 22:38:16 +01:00
jbtrystram
412372de9c
quadlet: use shlex to join shell arguments
...
this is to avoid incorrect parsing if arguments contain spaces.
See https://github.com/containers/ramalama/pull/1631#discussion_r2175358681
Signed-off-by: jbtrystram <jbtrystram@redhat.com>
2025-07-01 22:34:06 +02:00
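A minimal sketch of why joining with `shlex` matters. This is not the project's quadlet code; the argument list below is hypothetical. It shows that a plain space-join loses the boundary of an argument containing a space, while `shlex.join` quotes it so it survives re-parsing.

```python
import shlex

# Hypothetical argument list; the fourth element contains a space.
args = ["podman", "run", "--env", "GREETING=hello world", "image"]

naive = " ".join(args)      # the embedded space becomes an argument boundary
quoted = shlex.join(args)   # elements that need quoting are quoted

# Re-parse both strings the way a shell-style parser would.
naive_roundtrip = shlex.split(naive)
quoted_roundtrip = shlex.split(quoted)

print(naive_roundtrip)
print(quoted_roundtrip)
```

Only the `shlex.join` form round-trips back to the original list.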
jbtrystram
a3a199664c
quadlet: add missing privileged options
...
The default privileged options were omitted from the generated quadlet
file. Add them using the same argument parsing as in engine.py. [1]
Also add a few base options found in model.py[2] that were missing.
Fixes https://github.com/containers/ramalama/issues/1593
[1] 8341ddcf7b/ramalama/engine.py (L71-L82)
[2] 8341ddcf7b/ramalama/model.py (L205-L223)
Signed-off-by: jbtrystram <jbtrystram@redhat.com>
2025-07-01 22:33:21 +02:00
Daniel J Walsh
58922cd285
Merge pull request #1638 from engelmi/use-config-for-pull-flag-in-accel-image
...
Use config instance for defining pull behavior in accel_image
2025-07-01 14:39:49 -04:00
Daniel J Walsh
5468b1b4c7
Merge pull request #1639 from nathan-weinberg/rlls-0.2.4
...
chore: bump ramalama-stack to 0.2.4
2025-07-01 14:33:15 -04:00
Nathan Weinberg
1dad8284b7
chore: bump ramalama-stack to 0.2.4
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-07-01 14:28:10 -04:00
Daniel J Walsh
fe756ccf70
Merge pull request #1640 from engelmi/split-model-store-into-files
...
Split the model store into multiple files
2025-07-01 14:24:22 -04:00
Michael Engel
d7ecda282b
Added staticmethod annotation
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-01 17:45:51 +02:00
Michael Engel
3327df7852
Split the model store into multiple files
...
The source code for the model store is getting bigger, so splitting it
into multiple source files under a directory helps keep it easy
to read.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-01 17:41:34 +02:00
Michael Engel
4a5724e673
Use config instance for defining pull behavior in accel_image
...
By using the pull field in the config instance for the flag to
indicate pulling of the container image should be attempted in
the accel_image function, the behavior is tied to the cli options.
This also prevents ramalama ls from seemingly blocking while the
image is downloaded (with no output).
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-07-01 15:24:54 +02:00
Daniel J Walsh
162e2e5991
Merge pull request #1614 from containers/konflux-tests
...
run tests during build pipelines
2025-07-01 06:55:33 -04:00
Daniel J Walsh
3b11fcf343
Merge pull request #1633 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1751287003
2025-07-01 06:37:26 -04:00
Eric Curtin
34eae809b6
Merge pull request #1620 from olliewalsh/store_delete_refcount
...
Fix modelstore deleting logic when multiple references refer to the same blob/snapshot
2025-07-01 09:42:46 +01:00
red-hat-konflux-kflux-prd-rh03[bot]
1e346cc083
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1751287003
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-07-01 00:03:13 +00:00
Oliver Walsh
7b211d0aef
Only remove .partial blob file when the snapshot refcount is 0
...
Previously this partial blob file was always removed.
Note: this assumes the blob hash equals the snapshot hash, which
is only true for repos with a single model
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-06-30 23:30:35 +01:00
Oliver Walsh
80fd6d95fe
Handle existing but broken symlink to snapshot file
...
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-06-30 23:11:41 +01:00
Oliver Walsh
69e0929ca0
Add bats tests for pulling llama.cpp multimodal images
...
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-06-30 23:11:41 +01:00
Oliver Walsh
990a7412e8
Fix modelstore deleting logic
...
When deleting a reference, count the remaining references to the
snapshot/blobs to determine if they should be deleted.
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-06-30 23:11:41 +01:00
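The reference-counting idea described in this commit can be sketched roughly as follows. This is an illustration only: the `refs` mapping and the `remove_reference` helper are hypothetical and do not reflect the real model store layout.

```python
from collections import Counter


def remove_reference(refs: dict[str, str], name: str) -> list[str]:
    """Drop `name` from `refs` and return the snapshot hashes that are
    no longer referenced by anything and may therefore be deleted."""
    snapshot = refs.pop(name)
    remaining = Counter(refs.values())  # references still pointing at each snapshot
    return [snapshot] if remaining[snapshot] == 0 else []


# Hypothetical store: two references share one snapshot.
refs = {
    "model:latest": "sha256-aaa",
    "model:q4": "sha256-aaa",
    "other:latest": "sha256-bbb",
}

first = remove_reference(refs, "model:latest")   # snapshot still in use
second = remove_reference(refs, "model:q4")      # last reference removed
```

The snapshot is only reported as deletable once the last reference to it is gone.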
Mike Bonnet
8b1d2c03cd
konflux: skip checks on PR builds
...
Most of the checks don't (yet) apply to these images, and they add significant time to the builds.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-30 12:11:13 -07:00
Mike Bonnet
36e55002fe
konflux: set PipelineRun timeouts to 6 hours
...
Container builds and tests can take a long time. We'd rather they eventually complete successfully
than fail with a timeout.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-30 12:11:13 -07:00
Mike Bonnet
ee05ed0586
run tests during build pipelines
...
Use the bats container to run a set of Makefile targets to test the code
and images in parallel.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-30 12:11:13 -07:00
Stephen Smoogen
8341ddcf7b
Start process of moving python-ramalama to ramalama ( #1498 )
...
* Start adding rpm/ramalama.spec for Fedora
Add a ramalama.spec to sit next to python-ramalama.spec while we get
this reviewed. Change various configs so they are aware of
ramalama.spec
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Add needed obsoletes/provides in base rpm to start process.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
* Try to fix CI problems with initial mr
The initial MR puts two spec files in the same directory which was
causing problems with the CI. This splits them off into different
directories which should allow for the tooling to work.
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
---------
Signed-off-by: Stephen Smoogen <ssmoogen@redhat.com>
Co-authored-by: Stephen Smoogen <ssmoogen@redhat.com>
2025-06-30 14:51:29 +01:00
Eric Curtin
afbb01760f
Merge pull request #1628 from rhatdan/host
...
Fix handling of --host option when running in a container
2025-06-30 13:58:46 +01:00
Daniel J Walsh
1270b7fba6
Merge pull request #1629 from rhatdan/VERSION
...
Bump to v0.10.0
2025-06-30 08:31:07 -04:00
Daniel J Walsh
8d054ff751
Bump to v0.10.0
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-30 08:29:27 -04:00
Daniel J Walsh
67b3d6ebba
Merge pull request #1627 from containers/konflux/references/main
...
chore(deps): update konflux references
2025-06-30 05:14:15 -04:00
Daniel J Walsh
bc561d2597
Merge pull request #1570 from ieaves/feat/file-upload
...
Adds the ability to pass files to `ramalama run`
2025-06-29 05:36:00 -04:00
Ian Eaves
1f03de03f8
Add file upload feature
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-28 21:18:16 -05:00
Daniel J Walsh
6b13f497fa
Fix handling of --host option when running in a container
...
When you run a model server within a container and only want it bound
to a certain port, the port binding should happen on the container, not
inside of the container.
Fixes: https://github.com/containers/ramalama/issues/1572
Also fix handling of the -t option, which should not be used with anything
other than the run command, and now I am not sure of that.
The LLAMA_PROMPT_PREFIX= environment variable should not be set within
containers as an environment variable, since we are doing chat on the
outside.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-28 11:48:24 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
f5298105e3
chore(deps): update konflux references
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-06-28 08:03:12 +00:00
Daniel J Walsh
7e1d159a3b
Merge pull request #1624 from containers/gemma3n-alias
...
Add gemma aliases
2025-06-27 10:36:26 -04:00
Daniel J Walsh
ca9885ac99
Merge pull request #1623 from containers/bump-llamacpp2
...
Want to pick up support for gemma3n
2025-06-27 10:35:53 -04:00
Eric Curtin
b42eb5762d
Merge pull request #1621 from sarroutbi/202506271328-fix-unit-tests-for-machines-running-gpus
...
Fix unit tests for machines with GPUs
2025-06-27 15:32:46 +01:00
Eric Curtin
089589cdfe
Add gemma aliases
...
The ollama variants are incompatible
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-27 15:28:00 +01:00
Eric Curtin
289e682f2a
Want to pick up support for gemma3n
...
And the other latest and greatest llama.cpp features
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-27 15:06:32 +01:00
Sergio Arroutbi
8ab3ce3f56
Fix test_common to use expected image
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-27 16:03:36 +02:00
Sergio Arroutbi
146a5d011a
Fix quadlet tests to pass on a machine with GPU
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-27 13:44:21 +02:00
Daniel J Walsh
895fb0d1dd
Merge pull request #1588 from rhatdan/llama-stack
...
Fixup to work with llama-stack
2025-06-27 07:22:24 -04:00
Daniel J Walsh
e0108b9d34
Merge pull request #1616 from nathan-weinberg/rlls-0.2.3
...
chore: bump ramalama-stack to 0.2.3
2025-06-27 06:41:28 -04:00
Daniel J Walsh
1c87479aee
Fixes to work with llama-stack
...
Adapt ramalama stack and chat modules for compatibility with llama-stack by updating host binding, argument formatting, and command invocation patterns, and add robust attribute checks in the chat utility.
Bug Fixes:
* Add hasattr checks around optional args (pid2kill, name) in chat kills() to prevent attribute errors
Enhancements:
* Bind model server to 0.0.0.0 instead of localhost for external accessibility
* Convert port, context size, and thread count arguments to strings for consistent CLI usage
* Reformat container YAML to use JSON array and multiline args for llama-server and llama-stack commands
* Update Containerfile CMD to JSON exec form for llama-stack entrypoint
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-27 05:57:45 -04:00
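As a rough illustration of guarding optional arguments such as `pid2kill`, the sketch below uses `getattr` with a default on an `argparse.Namespace`. The helper name `pid_to_kill` is hypothetical and is not taken from the chat module.

```python
from argparse import Namespace


def pid_to_kill(args: Namespace):
    """Return args.pid2kill, or None when the option was never defined."""
    # getattr with a default avoids AttributeError when a subcommand
    # did not register the option at all.
    return getattr(args, "pid2kill", None)


without_option = pid_to_kill(Namespace())
with_option = pid_to_kill(Namespace(pid2kill=1234))
```

A single `getattr(obj, name, default)` call covers the same case as a `hasattr` check followed by attribute access.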
Eric Curtin
b2cd9dc36e
Merge pull request #1610 from rhatdan/url
...
Fix removing of file based URL models
2025-06-27 08:46:40 +01:00
Eric Curtin
faacef5ea5
Merge pull request #1615 from rhatdan/build
...
Free up disk space for building all images
2025-06-27 08:45:55 +01:00
Eric Curtin
a019b91b8a
Merge pull request #1619 from carlwgeorge/zsh-completions
...
Use standard zsh completion directory
2025-06-27 08:43:47 +01:00
Carl George
10cdbfb28d
Use standard zsh completion directory
...
We're currently using /usr/share/zsh/vendor-completions for zsh
completions. However, the RPM macro %{zsh_completions_dir} (which is
required by the Fedora packaging guidelines) is defined as
/usr/share/zsh/site-functions, so let's switch to that.
https://docs.fedoraproject.org/en-US/packaging-guidelines/ShellCompletions/
Signed-off-by: Carl George <carlwgeorge@gmail.com>
2025-06-27 02:07:37 -05:00
Nathan Weinberg
00a5f084b4
chore: bump ramalama-stack to 0.2.3
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-26 16:13:21 -04:00
Daniel J Walsh
93d23c93e6
Free up disk space for building all images
...
We're using Podman to build images, so don't futz with Docker.
Only build base images; it is not as necessary to build RAG images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-26 16:06:54 -04:00
Daniel J Walsh
8c2bc88284
Fix removing of file based URL models
...
Currently we are incorrectly reporting file models as
file://PATH as opposed to the correct file:///PATH.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-26 15:44:29 -04:00
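A short sketch of why the number of slashes matters, using only the standard library and a hypothetical path. With two slashes, the first path component is parsed as the URL's host rather than as part of the path.

```python
from urllib.parse import quote, urlparse

path = "/srv/models/example.gguf"  # hypothetical absolute path

correct = "file://" + quote(path)        # empty host, then the absolute path
incorrect = "file://" + path.lstrip("/")  # leading slash lost

parsed_ok = urlparse(correct)
parsed_bad = urlparse(incorrect)

print(correct, "->", repr(parsed_ok.netloc), parsed_ok.path)
print(incorrect, "->", repr(parsed_bad.netloc), parsed_bad.path)
```

In the two-slash form, `srv` ends up in `netloc` and the parsed path no longer matches the file on disk.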
Daniel J Walsh
370f1ccc1c
Merge pull request #1611 from ktdreyer/nopull
...
rename "nopull" boolean to "pull"
2025-06-26 14:57:23 -04:00
Daniel J Walsh
c98c3a0cb4
Merge pull request #1612 from containers/konflux-bats
...
konflux: build bats image
2025-06-26 13:27:42 -04:00
Ken Dreyer
f9e6fed54a
rename "nopull" boolean to "pull"
...
Rename "nopull" to "pull" for improved clarity and readability. This
avoids the double-negative, making the logic more straightforward to
reason about. "pull = True" now means "pull the image", "pull = False"
means "don't pull the image."
Signed-off-by: Ken Dreyer <kdreyer@redhat.com>
2025-06-26 13:06:34 -04:00
Mike Bonnet
7f05324a7a
bats: only install ollama on x86_64
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-26 08:10:45 -07:00
Mike Bonnet
0f4c0fee43
konflux: bats: use shared pipelines
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-26 07:43:59 -07:00
red-hat-konflux-kflux-prd-rh03
27460c5c97
Red Hat Konflux kflux-prd-rh03 update bats
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <konflux@no-reply.konflux-ci.dev>
2025-06-26 14:06:56 +00:00
Daniel J Walsh
aa1e4f1f30
Merge pull request #1603 from slp/pin-copr-mesa
...
container-images: pin mesa version to COPR
2025-06-26 09:38:31 -04:00
Daniel J Walsh
0f90023a52
Merge pull request #1609 from rhatdan/build
...
Separate build image into its own VM
2025-06-26 09:36:59 -04:00
Daniel J Walsh
d4e76d3638
Merge pull request #1598 from containers/bats-container
...
add support for running bats in a container
2025-06-26 09:36:35 -04:00
Eric Curtin
61efb04416
Merge pull request #1605 from rhatdan/chat
...
Switch out hasattr for getattr wherever possible
2025-06-26 14:27:22 +01:00
Eric Curtin
932a1d8c08
Merge pull request #1607 from engelmi/prune-model-store-code
...
Prune model store code
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-26 09:21:03 -04:00
Daniel J Walsh
de46cd16c7
Separate build image into its own VM
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-26 09:12:27 -04:00
Eric Curtin
9140476c7d
Merge pull request #1607 from engelmi/prune-model-store-code
...
Prune model store code
2025-06-26 13:24:09 +01:00
Sergio Lopez
385a992e2b
container-images: pin mesa version to COPR
...
When building on Fedora systems make sure we install the
mesa version from the COPR, which has the patches to force
alignment to 16K (needed for GPU acceleration on macOS, but
harmless to other systems).
We also need to add "--nobest" to "dnf update" to ensure it
doesn't get frustrated by being unable to install the mesa package
from appstream.
Signed-off-by: Sergio Lopez <slp@redhat.com>
2025-06-26 12:09:05 +02:00
Michael Engel
2f3af6afff
Use property for model store
...
By accessing the model store via a property, a None-check can be performed
and an instance created on the fly. In addition, this removes the need
for setting the store from the factory and removes its optional trait.
The unit tests for ollama have been rewritten as well since functions
such as repo_pull or exists have been removed. It only tests the pull
function which mocks away http calls to external services.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
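The lazily created property described in this commit might look roughly like the sketch below. The class and attribute names are hypothetical stand-ins, not the project's actual classes.

```python
class ModelStore:
    """Hypothetical stand-in for the real model store class."""

    def __init__(self, base_path: str):
        self.base_path = base_path


class Model:
    def __init__(self, base_path: str):
        self._base_path = base_path
        self._store = None  # created on first access

    @property
    def store(self) -> ModelStore:
        # None-check, then create the instance on the fly.
        if self._store is None:
            self._store = ModelStore(self._base_path)
        return self._store


model = Model("/tmp/example-store")
store = model.store
```

Callers always get a usable store object, so the attribute no longer needs to be optional or set from a factory.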
Michael Engel
ef3863904f
Refactoring model base class for new model store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
8482cf3957
Remove model flag for safetensors via mscli
...
Relates to: github.com/containers/ramalama/pull/1559
Remove Model flag for safetensor files for now in order to
allow multiple safetensor files to be downloaded for the
convert command.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
8f578ebf30
Prune old model store code in hf-style models
...
In addition to pruning old model store code, the usage of downloading
files using the hfcli or modelscope cli has been removed.
In the future, the download of multiple files - incl. safetensors - will
be done explicitly based on the metadata only by http requests.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
512ccbaba5
Prune old model store code in Ollama model
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
38f16c42c4
Prune old model store code in URL model
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
5f688686d8
Remove script for old to new model store migration
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Michael Engel
51f766d898
Remove --use-model-store feature flag
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-26 09:29:03 +02:00
Mike Bonnet
4f479484de
support running all Makefile targets in the bats container
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-25 21:45:37 -07:00
Mike Bonnet
a651be7832
add support for running bats in a container
...
Add a new "bats" container which is configured to run the bats tests.
The container supports running the standard bats test suite
(container-in-container) as well as the "--nocontainer" tests.
Add two new Makefile targets for running the bats container via podman.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-25 21:45:12 -07:00
Mike Bonnet
77d30733be
make use of /dev/dri optional when serving llama-stack
...
Add the --dri option to disable mounting /dev/dri into the container when running "ramalama serve --api llama-stack".
Update bats test to pass "--dri off".
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-25 12:06:30 -07:00
Mike Bonnet
681c488e28
Merge pull request #1608 from containers/konflux-rocm-cuda
...
konflux: use shared pipelines for rocm, rocm-ubi, and cuda
2025-06-25 09:51:21 -07:00
Eric Curtin
f4e929896a
Merge pull request #1606 from containers/fix-text-input
...
Allow std input
2025-06-25 17:07:49 +01:00
Mike Bonnet
7be12487c6
konflux: use shared pipelines for rocm, rocm-ubi, and cuda
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-25 08:46:11 -07:00
Daniel J Walsh
4b71dafa29
Merge pull request #1599 from containers/konflux-centralize-pipelines
...
konflux: centralize pipeline definitions
2025-06-25 10:37:26 -04:00
Mike Bonnet
ed4879d301
konflux: move Pipeline and PipelineRun definitions into subdirs of .tekton
...
This will simplify management as more components are on-boarded.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-25 07:14:03 -07:00
Eric Curtin
aab36b04d4
Allow std input
...
We used to have this feature, but it was accidentally dropped recently.
It allows things like:
`cat text_file_with_prompt.txt | ramalama run smollm:135m`
or
`cat some_doc | ramalama run smollm:135m Explain this document:`
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-25 15:01:48 +01:00
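One way to combine piped input with command-line words is sketched below. It is an assumption-laden illustration: the `build_prompt` helper and the ordering (command-line words first, piped text after) are hypothetical and may differ from what `ramalama run` actually does.

```python
import io
import sys


def build_prompt(cli_words, stdin=None):
    """Join command-line words with any text piped on standard input."""
    stdin = sys.stdin if stdin is None else stdin
    parts = [" ".join(cli_words)] if cli_words else []
    # Only read stdin when it is not an interactive terminal,
    # otherwise the call would block waiting for the user.
    if not stdin.isatty():
        piped = stdin.read().strip()
        if piped:
            parts.append(piped)
    return "\n\n".join(parts)


# Simulate `cat some_doc | ... Explain this document:` with an in-memory stream.
prompt = build_prompt(["Explain", "this", "document:"], io.StringIO("some text\n"))
```

Passing the stream as a parameter keeps the helper testable without a real pipe.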
Daniel J Walsh
2526ab6223
Merge pull request #1600 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1750786174
2025-06-25 09:04:29 -04:00
Eric Curtin
f70b13c8db
Merge pull request #1602 from rhatdan/timeout
...
Some of our tests are running for hours, need to be timed out
2025-06-25 13:35:50 +01:00
Eric Curtin
82d04a7469
Merge pull request #1601 from rhatdan/chat
...
Missing options of api_key and pid2kill are causing crashes
2025-06-25 13:34:03 +01:00
Daniel J Walsh
951246f228
Missing options of api_key and pid2kill are causing crashes
...
Also add debug information to chat.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-25 06:46:31 -04:00
Daniel J Walsh
2ba6f6f167
Some of our tests are running for hours, need to be timed out
...
None of our tests should take more than 1 hour, so time them
out; we then need to figure out what is causing the issue.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-25 06:34:47 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
18527f87a6
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1750786174
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-06-25 00:02:06 +00:00
Mike Bonnet
e661d87580
konflux: centralize pipeline definitions
...
Move the pipeline definitions into their own files and reference them from the PipelineRuns
that are created on pull request and push. This allows the pipelines to be used for multiple
components and dramatically reduces code duplication and maintenance burden.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-24 17:00:10 -07:00
Daniel J Walsh
dc43419f21
Merge pull request #1595 from rhatdan/fedora
...
Move RamaLama container image to default to fedora:42
2025-06-24 15:56:50 -04:00
Daniel J Walsh
189d722eb7
Move RamaLama container image to default to fedora:42
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-24 14:34:59 -04:00
Eric Curtin
788d5564d5
Merge pull request #1578 from containers/gemini
...
API key support
2025-06-24 13:07:35 +01:00
Eric Curtin
fd71bac96a
Merge pull request #1589 from rhatdan/accel
...
Don't pull image when doing ramalama --help call
2025-06-24 12:38:54 +01:00
Daniel J Walsh
1b6b415d0c
Don't pull image when doing ramalama --help call
...
Fixes: https://github.com/containers/ramalama/issues/1587
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-23 19:44:14 -04:00
Daniel J Walsh
1ee66c0964
Merge pull request #1576 from rhatdan/chat
...
Remove last libexec program
2025-06-23 13:48:45 -04:00
Daniel J Walsh
6d7bd22ee1
Remove last libexec program
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-23 11:47:01 -04:00
Daniel J Walsh
eaa0da253d
Hide --max-model-len from option list
...
This fixes make validate to not complain about --ctx-size option.
No reason to have this available in display, since this is only for
users assuming vllm options.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-23 11:44:17 -04:00
Daniel J Walsh
a00188027c
Merge pull request #1586 from rhatdan/VERSION
...
Bump to v0.9.3
2025-06-23 11:23:09 -04:00
Eric Curtin
1465086ded
API key support
...
If we pass --api-key, we can talk to OpenAI providers.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-23 14:21:20 +01:00
Daniel J Walsh
a9abe6909d
Bump to v0.9.3
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-23 07:01:30 -04:00
Daniel J Walsh
4d49658853
Merge pull request #1579 from nathan-weinberg/rlls-0.2.2
...
chore: bump ramalama-stack to 0.2.2
2025-06-23 06:52:17 -04:00
Daniel J Walsh
693827df74
Merge pull request #1580 from nathan-weinberg/fix-dash
...
fix: broken link in CI dashboard
2025-06-23 06:50:30 -04:00
Nathan Weinberg
50d1a8ccb7
fix: broken link in CI dashboard
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-22 21:54:40 -04:00
Nathan Weinberg
bfa4d32af6
chore: bump ramalama-stack to 0.2.2
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-22 21:45:15 -04:00
Daniel J Walsh
fe095f1f8d
Merge pull request #1574 from containers/specify-model
...
Make model argument mandatory
2025-06-21 06:25:58 -04:00
Eric Curtin
cb8ab961b5
Make model argument mandatory
...
To be consistent with "ramalama run" experience. Inferencing
servers that have implemented model-swapping require this. In the
case of servers like llama-server that only load one model, any
value is sufficient.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-21 09:39:38 +01:00
Eric Curtin
aa29aa6efa
Merge pull request #1571 from kush-gupt/main
...
fix: vLLM serving and model mounting
2025-06-20 15:49:44 +01:00
Kush Gupta
c4ec0a57e0
fix doc validation
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-06-20 10:11:51 -04:00
Kush Gupta
e698424f78
fix doc typo and codespell test
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-06-20 08:59:46 -04:00
Kush Gupta
d0ecd5b65a
alias max model len, improve file mounting logic
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-06-20 08:26:50 -04:00
Daniel J Walsh
3f87444b6f
Merge pull request #1101 from lsm5/tmt-gpu
...
TMT: run tests with GPUs
2025-06-20 06:40:38 -04:00
Daniel J Walsh
cdc1edc13c
Merge pull request #1566 from containers/containers-install-from-checkout
...
install ramalama into containers from the current checkout
2025-06-20 06:34:59 -04:00
Daniel J Walsh
f795b41ed5
Merge pull request #1567 from sarroutbi/202506182026-fix-accel-image-test
...
Fix test_accel unit test to fallback to latest
2025-06-20 06:34:29 -04:00
Sergio Arroutbi
307fd722e6
Fix test_accel unit test to fallback to latest
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-20 10:29:24 +02:00
Kush Gupta
55b7d568a9
Merge branch 'containers:main' into main
2025-06-19 22:04:53 -04:00
Kush Gupta
847ec6c33f
vllm mount fixes for safetensor directories ( #12 )
...
* vllm mount fixes for safetensor directories
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* Update ramalama/model.py for better file detection
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* make format
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* improve mount for files
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* fix docs for new vllm param
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* add error handling
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* fix cli param default implementation
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* adjust error message string
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
* skip broken test
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
---------
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-06-19 22:04:36 -04:00
Eric Curtin
5ad0f802ba
Merge pull request #1541 from containers/save-space2
...
Trying to save space
2025-06-20 00:18:05 +01:00
Eric Curtin
6d52980aeb
Merge pull request #1569 from mtrmac/oci-docs
...
Document the image format created/consumed by the oci:// transport
2025-06-19 21:46:18 +01:00
Miloslav Trmač
c63ddbcc64
Document the image format created/consumed by the oci:// transport
...
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-06-19 21:53:57 +02:00
Lokesh Mandvekar
a53c42723a
TMT: run tests with GPUs
...
This commit adds TMT test jobs triggered via Packit that fetch an
instance with an NVIDIA GPU, specified in `plans/no-rpm.fmf`, and can be
verified in the gpu_info test result.
In addition, system tests (nocontainer), validate, and unit tests are
also triggered via TMT.
Fixes : #1054
TODO:
1. Enable bats-docker tests
2. Resolve f41 validate test failures
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-06-19 15:32:31 -04:00
Eric Curtin
5f75e6f6f4
Trying to save space
...
tiny is not so tiny, it's 600M
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-19 17:37:44 +01:00
Mike Bonnet
ae114e45af
install ramalama into containers from the current checkout
...
Copy the current checkout of the ramalama repo into the containers and use that for installation.
This removes the need for an extra checkout of the ramalama repo, and is consistent with the build
process used by container_build.sh (which used a bind-mount rather than a copy).
This keeps the version of ramalama in sync with the Containerfiles, and makes testing and CI more
useful.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-19 08:37:53 -07:00
Lokesh Mandvekar
66f7c0d110
System tests: account for rootful default store
...
For the rootful case, the default store is at /var/lib/ramalama.
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-06-19 10:17:49 -04:00
Daniel J Walsh
1af46a247b
Merge pull request #1550 from rhatdan/chat
...
Replace ramalama-client-code with ramalama chat
2025-06-19 07:59:42 -04:00
Daniel J Walsh
95a5a14ebf
Replace ramalama-client-code with ramalama chat
...
ramalama chat does not use --context or --temp, these are server
settings not client side.
Also remove ramalama client command, since this is a duplicate of
ramalama chat.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-19 07:34:36 -04:00
Daniel J Walsh
6c77edfdee
Merge pull request #1534 from containers/latest-only-makes-sense-ollama
...
:latest tag should not be assumed for non-OCI artefacts
2025-06-18 14:43:40 -04:00
Daniel J Walsh
4e4f5f329c
Merge pull request #1564 from sarroutbi/202506181805-reuse-common-command-execution
...
Reuse code for unit test execution rules
2025-06-18 14:42:18 -04:00
Sergio Arroutbi
628b723dae
Fix test_accel unit test to fallback to latest
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-18 20:26:46 +02:00
Sergio Arroutbi
ce24886c1d
Reuse code for unit test execution rules
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-18 18:06:10 +02:00
Eric Curtin
8ff0cd3287
:latest tag should not be assumed for non-OCI artefacts
...
I see people showing things like:
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/UD_Q2_K_XL/Qwen3-235B-A22B-UD-Q2_K_XL-00001-of-00002.gguf:latest 1 month ago 46.42 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/UD_Q2_K_XL/Qwen3-235B-A22B-UD-Q2_K_XL-00002-of-00002.gguf:latest 1 month ago 35.55 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00001-of-00006.gguf:latest 1 week ago 46.44 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00002-of-00006.gguf:latest 1 week ago 46.0 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00003-of-00006.gguf:latest 1 week ago 45.93 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00004-of-00006.gguf:latest 1 week ago 46.0 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00005-of-00006.gguf:latest 1 week ago 46.0 GB
file://srv/llm/modles/unsloth/Qwen3-235B-A22B-GGUF/Q8_0/Qwen3-235B-A22B-Q8_0-00006-of-00006.gguf:latest 1 week ago 2.39 GB
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-18 16:53:05 +01:00
Eric Curtin
9df9532ed4
Merge pull request #1562 from sarroutbi/202506181742-add-verbose-rule-for-unit-test-execution
...
Add verbose rule for complete output on unit tests
2025-06-18 16:47:14 +01:00
Sergio Arroutbi
5218906464
Add verbose rule for complete output on unit tests
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-06-18 17:43:20 +02:00
Aaron Teo
91dd2df8e5
Merge pull request #1559 from engelmi/do-not-flag-safetensors-as-model
...
Remove Model flag for safetensor files for now
2025-06-18 22:10:42 +08:00
Eric Curtin
f780e41313
Merge pull request #1558 from scraly/patch-1
...
Add install command via homebrew
2025-06-18 14:47:13 +01:00
Michael Engel
ac2ae1e8e9
Remove model flag for safetensor files via hf cli
...
Fixes: https://github.com/containers/ramalama/issues/1557
Remove Model flag for safetensor files for now in order to
allow multiple safetensor files to be downloaded for the
convert command.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-18 15:26:33 +02:00
Aurelie Vache
9517fbb90a
feat: add install command via homebrew
...
Signed-off-by: scraly <scraly@gmail.com>
2025-06-18 15:05:35 +02:00
Eric Curtin
67eb9420e1
Merge pull request #1556 from rhatdan/engine
...
Fix default prefix for systems with no engines
2025-06-18 10:19:24 +01:00
Daniel J Walsh
c946769700
Merge pull request #1555 from containers/konflux/mintmaker/main/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372
2025-06-18 05:09:31 -04:00
Daniel J Walsh
1a5fd28a4d
Fix default prefix for systems with no engines
...
Fixes: https://github.com/containers/ramalama/issues/1552
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-18 05:00:21 -04:00
red-hat-konflux-kflux-prd-rh03[bot]
f2ef4d4f6a
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <206760901+red-hat-konflux-kflux-prd-rh03[bot]@users.noreply.github.com>
2025-06-18 00:03:23 +00:00
Eric Curtin
13b29fab14
Merge pull request #1542 from containers/konflux-ramalama
...
Red Hat Konflux kflux-prd-rh03 update ramalama
2025-06-17 22:23:29 +01:00
Mike Bonnet
df5a093531
konflux: reference the UBI image by digest
...
This will allow MintMaker to submit PRs to update the UBI reference when new versions
are released.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-17 14:09:54 -07:00
Mike Bonnet
6bf454d8ed
konflux: add builds for arm64
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-17 14:09:50 -07:00
Mike Bonnet
2a9704fb1b
konflux: set path-context to the container-images directory
...
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-06-17 14:09:41 -07:00
red-hat-konflux-kflux-prd-rh03
3dbad48272
Red Hat Konflux kflux-prd-rh03 update ramalama
...
Signed-off-by: red-hat-konflux-kflux-prd-rh03 <konflux@no-reply.konflux-ci.dev>
2025-06-17 14:09:41 -07:00
Daniel J Walsh
eb45f50bda
Merge pull request #1551 from rhatdan/test
...
Create tempdir when run as non-root user
2025-06-17 17:02:46 -04:00
Daniel J Walsh
bbf24ae0e9
Create tempdir when run as non-root user
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-17 12:01:08 -04:00
Eric Curtin
13d133490e
Merge pull request #1547 from containers/add_GGML_VK_VISIBLE_DEVICES
...
Add GGML_VK_VISIBLE_DEVICES env var
2025-06-17 12:07:23 +01:00
Daniel J Walsh
aaa6f0f362
Merge pull request #1549 from containers/spaces2tabs
...
Tabs to spaces
2025-06-17 07:04:29 -04:00
Eric Curtin
5fe848eb93
Add GGML_VK_VISIBLE_DEVICES env var
...
Can be used to manually select vulkan device
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-17 11:57:16 +01:00
Eric Curtin
10350d61f8
Tabs to spaces
...
The GitHub UI showed red, so changing just in case; incorrect tabs or
spaces can cause the GitHub UI to skip builds.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-17 11:55:24 +01:00
Eric Curtin
03110ac2e5
Merge pull request #1548 from rhatdan/test
...
Run bats test with TMPDIR pointing at /mnt/tmp
2025-06-17 11:54:17 +01:00
Daniel J Walsh
f8396fc6bf
Run bats test with TMPDIR pointing at /mnt/tmp
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-17 06:47:10 -04:00
Eric Curtin
3f012ba00e
Merge pull request #1502 from alaviss/push-qsrlulqsylxk
...
model: always pass in GPU offloading parameters
2025-06-17 10:20:02 +01:00
Daniel J Walsh
9e2ef6fced
Merge pull request #1544 from containers/add-dnf-update
...
Add dnf update -y to Fedora ROCm build
2025-06-17 05:00:07 -04:00
Eric Curtin
65a08929bb
Add dnf update -y to Fedora ROCm build
...
Trying to fix a compiler issue
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 21:54:30 +01:00
Daniel J Walsh
f0e799319c
Merge pull request #1539 from containers/dedepu
...
Deduplicate code
2025-06-16 14:30:38 -04:00
Daniel J Walsh
4382641624
Merge pull request #1543 from containers/whisper-downgrade
...
Downgrade whisper
2025-06-16 14:21:38 -04:00
Eric Curtin
e4eca9c059
Downgrade whisper
...
We don't need the latest released version right now
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 19:20:36 +01:00
Daniel J Walsh
e0e3ee137c
Merge pull request #1537 from rhatdan/VERSION
...
Bump to v0.9.2
2025-06-16 13:40:13 -04:00
Eric Curtin
d62f9d0284
Deduplicate code
...
So there is only one version of this function
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 18:29:37 +01:00
Eric Curtin
11186fac1d
Merge pull request #1540 from containers/update-podman
...
Upgrade podman
2025-06-16 19:28:19 +02:00
Eric Curtin
3d71a9f7c9
Upgrade podman
...
Use ubuntu plucky repo for podman
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 18:28:08 +01:00
Eric Curtin
ae54b39c31
Merge pull request #1512 from Hasnep/make-minimum-python-version-consistent
...
Make minimum version of Python consistent
2025-06-16 18:58:37 +02:00
Eric Curtin
7955e292df
Merge pull request #1538 from containers/tabs2spaces
...
Convert tabs to spaces
2025-06-16 17:08:31 +02:00
Eric Curtin
3930d68b8a
Convert tabs to spaces
...
Saw this in github ui
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 16:06:39 +01:00
Daniel J Walsh
96c28b179a
Bump to v0.9.2
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-16 09:56:40 -04:00
Daniel J Walsh
e7ab2cb96b
Merge pull request #1527 from rhatdan/image
...
honor the user specifying the image
2025-06-16 09:55:44 -04:00
Daniel J Walsh
f48293cd85
Merge pull request #1536 from nathan-weinberg/bump-rls
...
chore: bump ramalama-stack to 0.2.1
2025-06-16 09:52:45 -04:00
Nathan Weinberg
257d8597d8
chore: bump ramalama-stack to 0.2.1
...
adds RAG capabilities
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-16 09:37:38 -04:00
Daniel J Walsh
94f3a4e83a
honor the user specifying the image
...
Currently we are ignoring the user specified image if it does not
contain a ':'
Fixes: https://github.com/containers/ramalama/issues/1525
While I was in the code base, I standardized on container-images for
Fedora to come from quay.io/fedora repo.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-16 09:37:00 -04:00
Hannes
fad170d198
Make minimum version of Python consistent
...
Signed-off-by: Hannes <h@nnes.dev>
2025-06-16 21:09:32 +08:00
Daniel J Walsh
de9c7ed89e
Merge pull request #1535 from containers/dont-always-set-up-this-symlink
...
Not sure this is supposed to be here
2025-06-16 08:11:53 -04:00
Eric Curtin
ce11e66dd4
Not sure this is supposed to be here
...
Think it's only meant for the:
container-images/scripts/build-cli.sh
version, it's breaking podman on my bootc system and replacing
/usr/bin/podman with a broken /usr/bin/podman-remote symlink.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-16 12:55:57 +01:00
Eric Curtin
acc426bbe1
Merge pull request #1532 from rhatdan/huggingface
...
Suggest using uv pip install to get missing module
2025-06-16 11:21:11 +02:00
Daniel J Walsh
e455d82def
Suggest using uv pip install to get missing module
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-16 04:06:23 -04:00
Eric Curtin
2fe2e517be
Merge pull request #1531 from rhatdan/chat
...
Add ramalama chat command
2025-06-15 22:47:47 +02:00
Daniel J Walsh
3cd6a59a76
Apply suggestions from code review
...
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-15 22:16:18 +02:00
Daniel J Walsh
a21fa39b45
Add ramalama chat command
...
For now we will just add the chat command; the next PR will remove the
external chat command and just use this internal one.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-15 21:56:36 +02:00
Daniel J Walsh
093d5a4cf1
Merge pull request #1488 from ieaves/imp/typed-config
...
Refactor config and arg typing
2025-06-15 03:29:01 -04:00
Daniel J Walsh
c637a404f8
Merge pull request #1523 from containers/change-from
...
Change the FROM for asahi container image
2025-06-15 02:30:10 -04:00
Daniel J Walsh
913c0c2cdf
Merge pull request #1529 from containers/add-colors
...
Add colors to "ramalama serve" if we can
2025-06-15 02:24:32 -04:00
Eric Curtin
ee4ccffb29
Add colors to "ramalama serve" if we can
...
I don't notice any difference but a lot of things are LOG_INFO in
llama.cpp
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-15 00:57:41 +01:00
Eric Curtin
7a2c30415a
Merge pull request #1528 from engelmi/add-all-option-to-ls
...
Add --all option to ramalama ls
2025-06-14 18:33:31 +02:00
Michael Engel
68052a156b
Remove unneeded list and type cast
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-14 18:11:04 +02:00
Michael Engel
28b4d8a9c0
Add --all option to ramalama ls
...
Relates to: https://github.com/containers/ramalama/issues/1278
By default, ramalama ls should not display partially downloaded
AI Models. In order to enable users to view all models, the new
option --all for the ls command has been introduced.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-14 17:41:11 +02:00
Ian Eaves
cb6226534d
sourcery changes
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-13 14:56:09 -05:00
Ian Eaves
f6b33ebafd
sourcery sucks
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-13 14:22:43 -05:00
Ian Eaves
796d7b5782
sourcery changes
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-13 12:39:27 -05:00
Ian Eaves
91a12887a5
modified ollama-model_pull test
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-13 11:02:47 -05:00
Ian Eaves
eff6eab2ba
sourcery nits
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-13 10:44:31 -05:00
Eric Curtin
90f7fe6e79
Change the FROM for asahi container image
...
Explicitly add quay.io
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-13 13:57:58 +01:00
Eric Curtin
6556b04df9
Merge pull request #1522 from rhatdan/demo
...
Update to add multi-modal
2025-06-13 14:38:33 +02:00
Daniel J Walsh
9f1faba404
Update to add multi-modal
...
Remove failing on pipe errors, since sometimes the network
can fail and break the demo; it would be better to continue
after failures.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-13 14:16:23 +02:00
Daniel J Walsh
2f92ec55c7
Merge pull request #1506 from rhatdan/tty
...
Do not run with --tty when not in interactive mode
2025-06-13 08:13:23 -04:00
Daniel J Walsh
9cc4b7f266
Merge pull request #1517 from kwaa/chore/intel_gpus
...
chore(common/intel_gpus): detect arc a770, a750
2025-06-13 04:26:43 -04:00
藍+85CD
9172e3fb15
chore(common/intel_gpus): detect arc a770, a750
...
Signed-off-by: 藍+85CD <50108258+kwaa@users.noreply.github.com>
2025-06-13 15:31:04 +08:00
Daniel J Walsh
b7555c0e81
Do not run with --tty when not in interactive mode
...
I have found that when running with nvidia the -t (--tty) option
in podman is covering up certain errors. When we are not running
ramalama interactively, we do not need this flag set, and this
would make it easier to diagnose what is going on with users'
systems.
Don't add -i unless necessary
Server should not need to be run with --interactive or --tty.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-13 09:23:14 +02:00
Daniel J Walsh
7550fd37c8
Merge pull request #1505 from containers/renovate/huggingface-hub-0.x
...
fix(deps): update dependency huggingface-hub to ~=0.33.0
2025-06-13 02:37:15 -04:00
Daniel J Walsh
d583955bdd
Merge pull request #1497 from containers/change-install-script
...
This installs ramalama via uv if python3 version is too old
2025-06-13 02:35:58 -04:00
Daniel J Walsh
87e6d5ece7
Merge pull request #1510 from containers/increase-retry-attempt-v2
...
Wait for up to 16 seconds for model to load
2025-06-13 02:35:22 -04:00
Daniel J Walsh
83363e7814
Merge pull request #1513 from Hasnep/update-black-target-version
...
Update black target version
2025-06-13 02:33:11 -04:00
Daniel J Walsh
7c730e03bf
Merge pull request #1516 from containers/cosmetic
...
For `ramalama ls` shorten huggingface lines
2025-06-13 02:26:37 -04:00
Hannes
1b2867e995
Update black target version to 3.11, 3.12 and 3.13
...
Signed-off-by: Hannes <h@nnes.dev>
2025-06-13 08:04:12 +08:00
Ian Eaves
e87740c06d
sourcery nits
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 18:04:58 -05:00
Ian Eaves
40705263e1
merge
2025-06-12 17:27:21 -05:00
Ian Eaves
0be170bb56
fixing typo
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:23:43 -05:00
Ian Eaves
82d24551ca
refactored layered config to preserve previous functionality
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Ian Eaves
5b138bdba5
ollama tests, type fixes, format tests
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Ian Eaves
6f49a310be
unnecessary dep group
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Ian Eaves
0f39374f2b
type and bug fixes
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Ian Eaves
2de9b928d4
sourcery found a few things
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Ian Eaves
47b8b6055c
config rewrite + tests
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-12 17:19:35 -05:00
Eric Curtin
e9fac56ad5
Merge pull request #1514 from Hasnep/add-python-shebang-files-to-linting
...
Add Python shebang files to linting
2025-06-12 08:49:47 -05:00
Eric Curtin
6196c88713
For `ramalama ls` shorten huggingface lines
...
Substitute huggingface with hf and remove :latest as it doesn't
really apply. huggingface lines are particularly lengthy, so the
saved characters are welcome. hf is a common abbreviation for huggingface.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-12 14:26:53 +01:00
Eric Curtin
12b37d60cd
Merge pull request #1511 from engelmi/ignore-rm-of-non-existing-snapshot-dir
...
Ignore errors when removing snapshot directory
2025-06-12 07:42:08 -05:00
Eric Curtin
42b6525187
This installs ramalama via uv if python3 version is too old
...
For example, in the case of RHEL9.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-12 13:33:37 +01:00
Hannes
4712924e78
Fix unformatted Python files
...
Signed-off-by: Hannes <h@nnes.dev>
2025-06-12 20:03:14 +08:00
Hannes
752516fce7
Add Python shebang files to Makefile linting
...
Signed-off-by: Hannes <h@nnes.dev>
2025-06-12 20:03:08 +08:00
Michael Engel
830409e618
Ignore errors when removing snapshot directory on failed creation
...
Relates to: https://github.com/containers/ramalama/issues/1508
remove_snapshot should never fail, therefore adding the ignore_errors=True.
Before removing a snapshot with ramalama rm an existence check is made. If
the model does not exist, an error will be raised to preserve the previous
behavior of that command.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-12 13:31:21 +02:00
Eric Curtin
493d34bd29
Wait for up to 16 seconds for model to load
...
Trying to put this timeout to bed once and for all. There is a
chance a really large model on certain hardware could take more
than 16 seconds to load.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-12 09:58:42 +01:00
Daniel J Walsh
e5635d1d14
Merge pull request #1507 from containers/increase-retry-attempt
...
Increase retry attempts to attempt to connect to server
2025-06-12 03:58:14 -04:00
Eric Curtin
22986e0d6a
Increase retry attempts to attempt to connect to server
...
increase i to 512
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-12 05:15:41 +01:00
renovate[bot]
75436923b1
fix(deps): update dependency huggingface-hub to ~=0.33.0
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-06-11 20:26:34 +00:00
Daniel J Walsh
003612abf7
Merge pull request #1503 from nathan-weinberg/fix-container-dep
...
fix: remove unneeded dependency from Llama Stack container
2025-06-11 00:11:17 -04:00
Daniel J Walsh
d98adcbc9f
Merge pull request #1499 from containers/update-shortnames
...
This is not a multi-modal model
2025-06-10 23:43:49 -04:00
Nathan Weinberg
ea9ba184ac
fix: remove unneeded dependency from Llama Stack container
...
`blobfile` dependency is already included in ramalama-stack version 0.2.0;
adding it explicitly is unnecessary
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-10 22:20:38 -04:00
Leorize
6bac6d497a
readme: apply styling suggestions
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 17:53:49 -05:00
Leorize
5a3e55eb0c
model: always pass in GPU offloading parameters
...
This does nothing on systems with no GPUs, but on Vulkan-capable
systems, this would automatically offload the model to capable
accelerators.
Take this moment to claim Vulkan support in README also.
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 17:46:14 -05:00
Eric Curtin
6959d73d30
Merge pull request #1501 from alaviss/push-tumrzqxpzvkn
...
amdkfd: add constants for heap types
2025-06-10 17:41:49 -05:00
Leorize
309766dd8c
amdkfd: add constants for heap types
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 17:22:30 -05:00
Eric Curtin
4808a49de0
Merge pull request #1500 from alaviss/push-pwxuznmnqptr
...
Only enumerate ROCm-capable AMD GPUs
2025-06-10 17:02:17 -05:00
Leorize
db4a7d24af
Apply formatting fixes
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 15:20:18 -05:00
Leorize
93e36ac24e
Extract VRAM minimum into a constant
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 15:17:37 -05:00
Leorize
ecb9fb086f
Extract amdkfd utilities to its own module
...
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 15:17:20 -05:00
Leorize
fab87654cb
Only enumerate ROCm-capable AMD GPUs
...
Discover AMD graphics devices using AMDKFD topology instead of
enumerating the PCIe bus. This interface exposes a lot more information
about potential devices, allowing RamaLama to filter out unsupported
devices.
Currently, devices older than GFX9 are filtered, as they are no longer
supported by ROCm.
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-06-10 14:54:48 -05:00
Eric Curtin
9bc76c2757
This is not a multi-modal model
...
Although the other gemma ones are. Point the user towards a single
gguf.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-10 18:43:06 +01:00
Daniel J Walsh
83a75f16f7
Merge pull request #1492 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372
2025-06-10 08:42:14 -04:00
Daniel J Walsh
8a9f6a0291
Merge pull request #1496 from containers/fix-build
...
Install uv to fix build issue
2025-06-10 08:32:17 -04:00
Eric Curtin
b21556b513
Install uv to fix build issue
...
Run the install-uv.sh script.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-10 13:14:56 +01:00
Daniel J Walsh
4be8cbc71e
Merge pull request #1495 from containers/dont-use-llvmpipe
...
There's a change that we want that avoids using software rasterizers
2025-06-10 08:08:50 -04:00
Eric Curtin
b4a3375d94
There's a change that we want that avoids using software rasterizers
...
It avoids using llvmpipe when Vulkan is built in and falls back to
ggml-cpu.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-10 13:05:31 +01:00
Daniel J Walsh
7bdd073b59
Merge pull request #1491 from makllama/xd/fix_hf
...
Fix #1489
2025-06-10 05:25:40 -04:00
renovate[bot]
5b849722cb
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.6-1749542372
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-06-10 09:22:45 +00:00
Daniel J Walsh
5925bb6908
Merge pull request #1490 from rhatdan/llama-stack
...
Make sure llama-stack URL is shown to user
2025-06-10 05:22:05 -04:00
Xiaodong Ye
ae0775afd1
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-06-10 16:45:47 +08:00
Xiaodong Ye
6f020d361c
Fix #1489
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-06-10 16:39:26 +08:00
Daniel J Walsh
764fc2d829
Make sure llama-stack URL is shown to user
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-10 09:50:04 +02:00
Daniel J Walsh
b64d82276c
Merge pull request #1471 from rhatdan/oci
...
Throw exception when using OCI without engine
2025-06-10 03:36:20 -04:00
Daniel J Walsh
041c05d2b8
Throw exception when using OCI without engine
...
Fixes: https://github.com/containers/ramalama/issues/1463
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-10 08:46:01 +02:00
Daniel J Walsh
97a14e9c2d
Merge pull request #1486 from containers/remove-duplicate-line-on-restapi
...
Only print this in the llama-stack case
2025-06-10 00:09:54 -04:00
Eric Curtin
2368da00ac
Only print this in the llama-stack case
...
In the llama.cpp case it doesn't make as much sense, llama-server
prints this string when it's ready to be served like so:
main: server is listening on http://0.0.0.0:8080 - starting the main loop
This can be printed seconds or minutes too early potentially in
the llama.cpp case.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-09 15:25:08 +01:00
Daniel J Walsh
c62acfbba6
Merge pull request #1484 from rhatdan/VERSION
...
Bump to v0.9.1
2025-06-09 08:37:35 -04:00
Daniel J Walsh
9c639fc651
Bump to v0.9.1
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-09 14:37:05 +02:00
Daniel J Walsh
bbcfb7c0f1
Fix llama-stack
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-09 14:37:05 +02:00
Daniel J Walsh
3317372625
Merge pull request #1474 from rhatdan/demos
...
Update demos to show serving models.
2025-06-09 03:35:06 -04:00
Daniel J Walsh
cd2a8c3539
Update demo scripts to show serve
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-09 09:34:36 +02:00
Daniel J Walsh
fe6d90461f
Merge pull request #1472 from rhatdan/llama-stack
...
Fix handling of generate with llama-stack
2025-06-09 03:29:53 -04:00
Daniel J Walsh
e4ea40a1b8
Merge pull request #1483 from containers/renovate/huggingface-hub-0.x
...
fix(deps): update dependency huggingface-hub to ~=0.32.4
2025-06-09 00:14:15 -04:00
renovate[bot]
9627b5617b
fix(deps): update dependency huggingface-hub to ~=0.32.4
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-06-08 20:35:25 +00:00
Eric Curtin
4a10c02716
Merge pull request #1481 from ieaves/imp/dev-dependency-groups
...
Adds dev dependency groups
2025-06-08 15:34:54 -05:00
Daniel J Walsh
4fe7ae73a1
Fix stopping of llama-stack based containers by name
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-08 11:54:24 +02:00
Daniel J Walsh
2ca6b57dc3
Fix handling of generate with llama-stack
...
llama-stack API is not working without --generate command.
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-07 10:36:46 +02:00
Ian Eaves
f65529bda7
adds dev dependency groups
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-06-06 18:12:33 -05:00
Nathan Weinberg
268e47ccc0
Merge pull request #1478 from nathan-weinberg/stack-bump
...
chore: bump 'ramalama-stack' version to 0.2.0
2025-06-05 16:15:03 -04:00
Nathan Weinberg
c59a507426
chore: bump 'ramalama-stack' version to 0.2.0
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-05 15:00:11 -04:00
Daniel J Walsh
fc9b33e436
Merge pull request #1477 from containers/no-warmup
...
Don't warmup by default
2025-06-05 14:46:30 -04:00
Eric Curtin
8d2041a0bb
Don't warmup by default
...
llama-server by default warms up the model with an empty run for
performance reasons. We can warm up ourselves with a real query.
Warming up was causing issues and delaying start time.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-05 19:42:41 +01:00
Daniel J Walsh
a67d8c1f6a
Merge pull request #1476 from containers/env-var
...
Call set_gpu_type_env_vars rather than set_accel_env_vars
2025-06-05 14:05:08 -04:00
Eric Curtin
882011029c
Call set_gpu_type_env_vars rather than set_accel_env_vars
...
For GPU detection.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-05 16:43:48 +01:00
Daniel J Walsh
f07a062124
Merge pull request #1475 from containers/env-var
...
Do not override a small subset of env vars
2025-06-05 11:00:31 -04:00
Eric Curtin
ff446f96fb
Do not override a small subset of env vars
...
RamaLama does not try to detect GPU if the user has already set
certain env vars. Make this list smaller.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-05 14:01:45 +01:00
Daniel J Walsh
ef7bd2a004
Merge pull request #1467 from rhatdan/llama-stack
...
llama-stack container build fails with == 1.5.0
2025-06-05 01:39:04 -04:00
Daniel J Walsh
b990ef0392
Merge pull request #1469 from containers/timeout-change
...
Change timeouts
2025-06-04 20:13:44 -04:00
Eric Curtin
0bcf3b8308
Merge pull request #1468 from waltdisgrace/documentation_improvements
...
Documentation improvements
2025-06-04 11:38:55 -05:00
Eric Curtin
0455e45073
Change timeouts
...
The most we want to sleep between request attempts is 100ms; a request
every 100ms isn't that expensive.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-06-04 17:37:11 +01:00
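The polling policy described in the "Change timeouts" commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function name `wait_for_server`, the `probe` callback, the 10ms starting delay, the doubling step, and the overall deadline are all assumptions made for the example. Only the 100ms cap on the sleep between attempts comes from the commit message.

```python
import time


def wait_for_server(probe, total_timeout=16.0, max_sleep=0.1,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll probe() until it returns True or total_timeout seconds pass.

    probe is a hypothetical callable that returns True once the server
    answers. The sleep between attempts grows but never exceeds max_sleep.
    """
    deadline = clock() + total_timeout
    delay = 0.01  # assumed starting delay, not taken from the commit
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(delay)
        # Double the delay, but cap it at max_sleep (100ms by default).
        delay = min(delay * 2, max_sleep)
```

Capping the sleep keeps the client responsive once the server comes up, at the cost of one cheap request roughly every 100ms while waiting.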
Grace Chin
c4777a9ccc
Add documentation about running tests
...
Signed-off-by: Grace Chin <gchin@redhat.com>
2025-06-04 11:55:57 -04:00
Daniel J Walsh
56b62ec756
Merge pull request #1466 from makllama/xd/rename
...
Rename: RepoFile=>HFStyleRepoFile, BaseRepository=>HFStyleRepository, BaseRepoModel=>HFStyleRepoModel
2025-06-04 05:28:53 -04:00
Daniel J Walsh
8538e01667
llama-stack container build fails with == 1.5.0
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-04 10:58:41 +02:00
Xiaodong Ye
86fbd93e5f
Rename: RepoFile=>HFStyleRepoFile, BaseRepository=>HFStyleRepository, BaseRepoModel=>HFStyleRepoModel
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-06-04 09:01:20 +08:00
Daniel J Walsh
31b82b36de
Merge pull request #1465 from nathan-weinberg/stack-lock
...
fix: lock down ramalama-stack version in llama-stack Containerfile
2025-06-03 15:21:00 -04:00
Nathan Weinberg
ae17010390
fix: lock down ramalama-stack version in llama-stack Containerfile
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-06-03 14:16:50 -04:00
Eric Curtin
8056437669
Merge pull request #1464 from taronaeo/chore/rm-else-in-llama-whisper-build
...
chore: remove unclear else from llama and whisper build
2025-06-03 12:19:31 -05:00
Aaron Teo
bbd6afc8e9
chore: remove unclear else from llama and whisper build
...
Ref: https://github.com/containers/ramalama/pull/1459#discussion_r2124350835
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-06-04 00:43:05 +08:00
Eric Curtin
cc73b6bd1e
Merge pull request #1461 from taronaeo/doc/container-build-help
...
docs: update container_build.sh help information
2025-06-03 11:15:10 -05:00
Eric Curtin
cc2970f027
Merge pull request #1459 from taronaeo/feat/s390x-build
...
feat: s390x build commands
2025-06-03 11:13:17 -05:00
Aaron Teo
bf0bfe0761
docs: update container_build.sh help information
...
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix: remove -v from print information
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-06-03 23:49:37 +08:00
Aaron Teo
3996f1b4a4
feat: s390x build commands
...
currently it builds correctly on s390x but we want to enforce the
-DGGML_VXE=ON flag. we also want to disable whisper.cpp for now until we
can bring up support for it, otherwise it will be a product that none of us
have experience in.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix: missing s390x for ramalama
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
feat: disable whisper.cpp for s390x
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
chore: remove s390x containerfile
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-06-03 23:38:33 +08:00
Daniel J Walsh
e053285a7c
Merge pull request #1462 from rhatdan/VERSION
...
Bump to v0.9.0
2025-06-03 07:28:33 -04:00
Daniel J Walsh
50df70c48c
Bump to v0.9.0
...
Switching pyproject.toml to python 3.10 since
CANN and MUSA containerfiles only have access to those
versions of python.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-03 06:53:08 -04:00
Daniel J Walsh
6d7cfa88a4
Merge pull request #1457 from rhatdan/llama-stack
...
Add support for generating kube.yaml and quadlet/kube files for llama…
2025-06-03 06:51:51 -04:00
Eric Curtin
75b36dc3ba
Merge pull request #1458 from engelmi/snapshot-verification
...
Snapshot verification
2025-06-02 07:47:30 -05:00
Michael Engel
b84527bdd5
Replace exception with explicit is_gguf check
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-02 13:11:01 +02:00
Michael Engel
4f53c65386
Improved error handling when creating new snapshot
...
An error when creating new snapshots has only been partially handled
inside the model store and the caller side had to clean up properly.
In order to simplify this, more error handling has been added when
creating new snapshots - removing the (faulty) snapshot, logging and
passing the exception upwards so that the caller can do additional
actions. This ensures that the state remains consistent.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-02 13:11:01 +02:00
Michael Engel
228c985d2c
Moved endianness verification to dedicated step
...
Previously, the endianness check was done for each SnapshotFile and
these files might not be models, but could also be miscellaneous such
as chat templates or other meta data. By removing only the affected file
on a mismatch error the store might get into an inconsistent state since
the cleanup depends on the error handling of the caller.
Therefore, the check for endianness has been moved one layer up and only
checks the flagged model file. In case of a mismatch an implicit removal
of the whole snapshot is triggered.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-02 13:11:01 +02:00
Michael Engel
381d052d55
Extract model endianness into dedicated function
...
By moving the recently improved code to detect the endianness into
a dedicated function, its reusability is increased. Also, a specific
exception class for models that are not in the gguf format has been added.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-02 13:11:01 +02:00
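As a rough illustration of the kind of check these endianness commits describe, the sketch below inspects the first eight bytes of a GGUF file: the `GGUF` magic followed by a 32-bit version field. The names `gguf_endianness`, `is_native_endian`, and `NotGGUFError` are hypothetical and do not come from the RamaLama code base. The heuristic (a small version number read in the wrong byte order leaves the low 16 bits zero) is an assumption borrowed from how GGUF readers commonly detect byte order.

```python
import struct
import sys

GGUF_MAGIC = b"GGUF"


class NotGGUFError(ValueError):
    """Raised when the header does not start with the GGUF magic (hypothetical name)."""


def gguf_endianness(header: bytes) -> str:
    """Return "little" or "big" for a GGUF header of at least 8 bytes."""
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise NotGGUFError("not a GGUF file")
    # Read the version as little-endian. GGUF versions are small integers,
    # so if the low 16 bits are zero the field was written big-endian.
    (version_le,) = struct.unpack("<I", header[4:8])
    return "big" if version_le & 0xFFFF == 0 else "little"


def is_native_endian(header: bytes) -> bool:
    """True when the model's byte order matches the host's."""
    return gguf_endianness(header) == sys.byteorder
```

A caller could read the first eight bytes of the flagged model file, call `is_native_endian`, and drop the whole snapshot on a mismatch, which is the behavior the commit messages describe.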
Michael Engel
b6d1eb77a1
Remove unused GGUFEndian members
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-06-02 13:11:01 +02:00
Daniel J Walsh
b218c099e4
Add support for generating kube.yaml and quadlet/kube files for llama-stack
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-06-02 06:03:26 -04:00
Daniel J Walsh
c15a1e31f4
Merge pull request #1451 from rhatdan/selinux
...
Eliminate selinux-policy packages from containers
2025-06-01 06:20:27 -04:00
Eric Curtin
a1cbd017e9
Merge pull request #1456 from makllama/xd/refactoring
...
Refactoring huggingface.py and modelscope.py and extract repo_model_base.py
2025-05-31 11:39:20 -05:00
Eric Curtin
ee7cb50849
Merge pull request #1413 from rhatdan/llama-stack
...
Add support for llama-stack
2025-05-31 11:37:48 -05:00
Xiaodong Ye
816593caf6
Refactoring huggingface.py and modelscope.py and extract repo_model_base.py
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-31 21:04:16 +08:00
Daniel J Walsh
360f075fed
Eliminate selinux-policy packages from containers
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-31 05:28:57 -04:00
Eric Curtin
408ee66000
Merge pull request #1454 from taronaeo/feat/hf-byteswap-on-save
...
feat(model_store): prevent model endianness mismatch on download
2025-05-30 15:08:22 -05:00
Aaron Teo
a8dec56641
feat(model_store): prevent model endianness mismatch on download
...
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): missed some calls
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): typo `return` vs `raise`
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): missing staticmethod declarations
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
feat(model_store): prevent model endianness mismatch on download
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
feat(model_store): prevent downloading of non-native endian models
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
feat(model_store): check file for gguf and verify endianness
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
feat(model_store): add more information on why we deny endian mismatch
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(model_store): linters complaining
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(model_store): linter complaining
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(model_store): linter complaining
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-05-31 03:04:27 +08:00
Daniel J Walsh
a437651934
Add support for llama-stack
...
Add new option --api which allows users to specify the API Server,
either llama-stack or none. With none, we just generate a service with
the serve command. With `--api llama-stack`, RamaLama will generate an API
Server listening on port 8321 and an OpenAI server listening on port
8080.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-30 09:16:52 -04:00
Daniel J Walsh
52204997b2
Merge pull request #1455 from almusil/logging
...
Small logging improvements
2025-05-30 08:01:24 -04:00
Daniel J Walsh
27ec0d05ef
Merge pull request #1452 from taronaeo/fix/gguf-parser-string-endian
...
fix(gguf_parser): fix memoryerror exception when loading non-native models
2025-05-30 05:50:56 -04:00
Ales Musil
7d62050941
Add more logging around HTTP requests.
...
Add more logging to indicate requests to http/https addresses in
debug. This should make it easier to find out what exactly is going
on under the hood, mainly for the pull command.
Signed-off-by: Ales Musil <amusil@redhat.com>
2025-05-30 08:57:50 +02:00
Ales Musil
4c905a4207
Add global logger and use it in the existing code.
...
Add global logger that can be used to print messages to stderr.
Replace all perror calls in debug cases with logger.debug calls,
which reduces the extra argument required to pass as the module
will print error message based on the level.
Signed-off-by: Ales Musil <amusil@redhat.com>
2025-05-30 08:57:50 +02:00
Eric Curtin
fbca7ec238
Merge pull request #1450 from rhatdan/libexec
...
make ramalama-client-core send default model to server
2025-05-30 00:14:46 -05:00
Aaron Teo
1b32a09190
fix(gguf_parser): fix memoryerror exception when loading non-native
...
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): missed some calls
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): typo `return` vs `raise`
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): missing staticmethod declarations
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-05-30 10:05:05 +08:00
Eric Curtin
d1bfe4a18a
Merge pull request #1449 from rhatdan/vulkan1
...
Switch default ramalama image build to use VULKAN
2025-05-29 19:41:23 -05:00
Daniel J Walsh
b26a82c132
make ramalama-client-core send default model to server
...
Also move most of the helper functions into ramalamashell class
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-29 16:19:38 -04:00
Daniel J Walsh
b6d5e95e2c
Switch default ramalama image build to use VULKAN
...
Podman 5.5 and Podman Desktop have been updated; this
should give us better performance than previous versions
on Mac.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-29 09:30:23 -04:00
Daniel J Walsh
8a604e3b13
Merge pull request #1430 from melodyliu1986/melodyliu1986-feature-branch
...
fix(run): Ensure 'run' subcommand works with host proxy settings.
2025-05-29 08:52:48 -04:00
Daniel J Walsh
7c2b21bb25
Merge pull request #1447 from rhatdan/choice
...
Choice could be not set and should not be used
2025-05-29 08:42:21 -04:00
Daniel J Walsh
398309a354
Choice could be not set and should not be used
...
Fixes: https://github.com/containers/ramalama/issues/1445
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-29 06:54:54 -04:00
Song Liu
206d669ce7
fix(run): Ensure 'run' subcommand works with host proxy settings.
...
When 'ramalama run' is used with '--network none', it was inheriting host proxy
environment variables. This caused the internal client to fail when connecting to
the internal llama-server on 127.0.0.1, as it tried to route loopback traffic through
the unreachable proxy.
This change modifies engine.py to:
- Correctly set NO_PROXY/no_proxy for localhost and 127.0.0.1.
- Explicitly unset http_proxy, https_proxy, HTTP_PROXY, and HTTPS_PROXY variables
for the container when the 'run' subcommand is invoked.
This allows the internal client to connect directly to the internal server, resolving
the connection error.
Fixes : #1414
Signed-off-by: Song Liu <soliu@redhat.com>
2025-05-29 15:50:15 +08:00
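The proxy handling described in the commit above might look roughly like the following sketch. The helper name `container_proxy_env` is hypothetical and is not the function in engine.py; it only illustrates dropping the proxy variables and making sure loopback addresses are listed in NO_PROXY/no_proxy.

```python
PROXY_VARS = ("http_proxy", "https_proxy", "HTTP_PROXY", "HTTPS_PROXY")
LOOPBACK_HOSTS = ("localhost", "127.0.0.1")


def container_proxy_env(host_env: dict) -> dict:
    """Build the environment passed to the container for 'run'."""
    # Drop the proxy variables so loopback traffic is never routed
    # through a proxy that the container cannot reach.
    env = {k: v for k, v in host_env.items() if k not in PROXY_VARS}
    # Make sure both spellings of NO_PROXY cover the loopback addresses,
    # keeping any entries the user already configured.
    for name in ("NO_PROXY", "no_proxy"):
        entries = [e for e in env.get(name, "").split(",") if e]
        for host in LOOPBACK_HOSTS:
            if host not in entries:
                entries.append(host)
        env[name] = ",".join(entries)
    return env
```

Keeping existing NO_PROXY entries, rather than overwriting them, is a design assumption of this sketch.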
Daniel J Walsh
859609e59e
Merge pull request #1444 from taronaeo/feat/s390x-build
...
fix(gguf_parser): fix big endian model parsing
2025-05-28 11:22:57 -04:00
Aaron Teo
0bf4f5daf7
refactor(gguf_parser): fix big endian model parsing
...
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): debug gguf_version
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): swap endianness for model version
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): manually set to big endian mode first
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): refactor endianness read
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): unable to load big endian models on big-endian
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): add print statements for debug
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(gguf_parser): pin endianness for now
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(gguf_parser): support big-endian model
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(build_llama_and_whisper): add s390x build flags
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
wip(build_llama_and_whisper): add openblas-openmp dep
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Revert "wip(build_llama_and_whisper): add openblas-openmp dep"
This reverts commit 375a358d192789cd4651886308cc723e56baf50f.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
Revert "wip(build_llama_and_whisper): add s390x build flags"
This reverts commit 00fc3ea21b64a9a39226878a0bf194c1b4bc3c41.
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
chore(build_rag): add notification of rag and docling build skip
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(linter): code formatting
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(linter): separately declare variables
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(model_inspect): fix model endianness detection
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(linter): fix code styling
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(model_inspect): circular import for ggufendian
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
fix(endian): missing import
Signed-off-by: Aaron Teo <aaron.teo1@ibm.com>
2025-05-28 22:32:09 +08:00
Daniel J Walsh
b9171dcf4f
Merge pull request #1442 from olliewalsh/quadlet_duplicate_options
...
Fix quadlet handling of duplicate options
2025-05-28 08:27:43 -04:00
Oliver Walsh
82c45f2171
Fix quadlet handling of duplicate options
...
Re-implement without relying on ConfigParser which does not support duplicate
options.
Extend unit test coverage for this and correct the existing test data.
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-28 02:16:31 +01:00
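To illustrate why the commit above moves away from ConfigParser: a unit file may need the same key several times in one section, which a mapping of key to value cannot hold. The sketch below stores each section as an ordered list of pairs instead. The class `UnitFile` and its methods are hypothetical and are not the RamaLama implementation.

```python
class UnitFile:
    """Minimal INI-style writer that allows repeated option names."""

    def __init__(self):
        # section name -> list of (key, value) pairs, in insertion order
        self.sections = {}

    def add(self, section: str, key: str, value: str) -> None:
        self.sections.setdefault(section, []).append((key, value))

    def render(self) -> str:
        lines = []
        for section, options in self.sections.items():
            lines.append(f"[{section}]")
            lines.extend(f"{key}={value}" for key, value in options)
            lines.append("")  # blank line after each section
        return "\n".join(lines)
```

With this structure, two `Mount` entries added to the same `Container` section both appear in the rendered output.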
Daniel J Walsh
e26a141f33
Merge pull request #1441 from nathan-weinberg/py-version
...
fix: update references to Python 3.8 to Python 3.11
2025-05-27 13:14:09 -04:00
Nathan Weinberg
31b23a2ff7
fix: update references to Python 3.8 to Python 3.11
...
prev commit made Python 3.11 the min version for
ramalama, but not all references in the project
were updated to reflect this
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-27 11:41:08 -04:00
Daniel J Walsh
69a371c626
Merge pull request #1439 from rhatdan/VERSION
...
Bump to v0.8.5
2025-05-27 08:53:06 -04:00
Daniel J Walsh
b7d45f48aa
Merge pull request #1438 from p5/bump-llama.cpp
...
chore: bump llama.cpp to support tool streaming
2025-05-27 07:34:03 -04:00
Robert Sturla
b3adc7445b
fix(ci): remove additional unused software during build workflow
...
Signed-off-by: Robert Sturla <robertsturla@outlook.com>
2025-05-27 10:00:05 +01:00
Robert Sturla
d23ed7d7ec
chore: bump llama.cpp to support tool streaming
...
Signed-off-by: Robert Sturla <robertsturla@outlook.com>
2025-05-26 22:02:20 +01:00
Daniel J Walsh
691c235b80
Bump to v0.8.5
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-26 06:26:40 -04:00
Daniel J Walsh
90974a49af
Merge pull request #1436 from makllama/xd/mudnn
...
Support Moore Threads GPU #3
2025-05-26 05:58:09 -04:00
Xiaodong Ye
57243bcfb0
musa: switch to mudnn images
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-26 11:23:53 +08:00
Daniel J Walsh
a1e2fad76f
Merge pull request #1435 from containers/multimodal
...
Don't use jinja in the multimodal case
2025-05-24 05:41:30 -04:00
Eric Curtin
98e15e40db
Don't use jinja in the multimodal case
...
At least with smolvlm the output becomes junk with this option on.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-23 18:19:53 +01:00
Eric Curtin
e041cc8e44
Merge pull request #1426 from afazekas/split_file
...
split/big model support for llama.cpp
2025-05-20 08:58:17 -04:00
Daniel J Walsh
64ce8a1018
Merge pull request #1425 from olliewalsh/hftokenauth
...
Add support for Hugging Face token authentication
2025-05-20 08:47:49 -04:00
Daniel J Walsh
e3dc18558a
Merge pull request #1428 from sarroutbi/202505201103-remove-unused-parameters
...
Remove unused parameters from ollama_repo_utils.py
2025-05-20 07:19:57 -04:00
Eric Curtin
157c598568
Merge pull request #1427 from afazekas/bump-bug-rocm
...
Bump llama.cpp to fix rocm bug
2025-05-20 06:13:04 -04:00
Sergio Arroutbi
6e0b3f4361
Remove unused parameters from ollama_repo_utils.py
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-20 11:03:41 +02:00
Attila Fazekas
14e443dd47
Bump llama.cpp to fix rocm bug
...
The last bump unfortunately brought a bug to rocm/hip support [0];
bump the version to include the fix.
[0] https://github.com/ggml-org/llama.cpp/issues/13437
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-05-20 10:29:27 +02:00
Attila Fazekas
efa62203eb
split/big model support for llama.cpp
...
Models bigger than 70B are typically stored in multiple gguf files
with the special naming that llama.cpp expects.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-05-20 10:02:36 +02:00
Oliver Walsh
14a632b081
Only remove a snapshot if we tried to create one
...
Otherwise we can remove an existing snapshot due to an unrelated error,
e.g. HTTP 401 if token auth fails
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-19 11:53:51 +01:00
Oliver Walsh
b7b6172626
Use cached huggingface auth token if it exists
...
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-19 11:42:04 +01:00
Daniel J Walsh
b04d88e9c4
Merge pull request #1423 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1747219013
2025-05-18 08:18:10 -04:00
Daniel J Walsh
2f6b6d1f49
Merge pull request #1424 from containers/add-shortnames
...
Add smolvlm vision models
2025-05-18 08:17:14 -04:00
Eric Curtin
e494a6d924
Add smolvlm vision models
...
For multimodal usage
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-18 12:02:01 +01:00
renovate[bot]
7c452618f8
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.6-1747219013
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-05-17 18:09:58 +00:00
Daniel J Walsh
520a8379a2
Merge pull request #1407 from makllama/xd/mthreads
...
Support Moore Threads GPU #1
2025-05-16 13:20:02 -04:00
Daniel J Walsh
9d65a2e546
Merge pull request #1422 from olliewalsh/hf_repo_norm
...
Normalize hf repo quant/tag
2025-05-16 11:29:29 -04:00
Daniel J Walsh
798db33f49
Merge pull request #1420 from rhatdan/except
...
Don't throw Exceptions, be more specific
2025-05-16 11:11:40 -04:00
Oliver Walsh
ab96b97751
Normalize hf repo quant/tag
...
The huggingface repo tag refers to the quantization and is case insensitive.
Normalize this to uppercase.
Fixes : #1421
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-16 15:28:59 +01:00
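The normalization described in the commit above could be sketched as follows. The function name and the `repo:tag` reference shape are assumptions made for illustration; only the rule "uppercase the tag" comes from the commit message.

```python
def normalize_hf_ref(ref: str) -> str:
    """Uppercase the quantization tag of an 'org/repo:tag' style reference."""
    # Split off an optional scheme such as "hf://" so that its colon
    # is not mistaken for the tag separator.
    scheme, sep, rest = ref.partition("://")
    if not sep:
        scheme, rest = "", ref
    repo, colon, tag = rest.partition(":")
    # The repository name is left untouched; only the tag is normalized.
    return f"{scheme}{sep}{repo}{colon}{tag.upper()}"
```

References without a tag pass through unchanged, so the function is safe to apply to every Hugging Face reference.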
Daniel J Walsh
a27e56cb16
Don't throw Exceptions, be more specific
...
Fixes: https://github.com/containers/ramalama/issues/1419
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-16 10:28:04 -04:00
Daniel J Walsh
924f38358d
Merge pull request #1409 from engelmi/add-port-mapping-to-gen
...
Added host:container port mapping to quadlet generation
2025-05-16 09:35:55 -04:00
Daniel J Walsh
ffc9d46dda
Merge pull request #1416 from olliewalsh/multimodal
...
Multimodal/vision support
2025-05-16 09:34:57 -04:00
Eric Curtin
0dc0de1cf1
Merge pull request #1418 from containers/typo2
...
Small typo
2025-05-16 12:52:39 +01:00
Eric Curtin
98f17220f3
Small typo
...
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-16 12:52:09 +01:00
Eric Curtin
270608f1c2
Merge pull request #1415 from containers/add-more-debug
...
Add more debug for non starting servers with "ramalama run"
2025-05-16 12:47:14 +01:00
Oliver Walsh
1e34882beb
Omit unused tag when creating ModelScopeRepository instance
...
Co-authored-by: Michael Engel <mengel@redhat.com>
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-16 12:24:51 +01:00
Eric Curtin
1bd1b2ae98
Add more debug for non starting servers with "ramalama run"
...
Sometimes the server doesn't start
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-16 11:56:41 +01:00
Michael Engel
fff099f130
Validate --port option input
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-16 12:18:48 +02:00
Michael Engel
0707857dae
Added host:container port mapping to quadlet generation
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-16 12:18:48 +02:00
Oliver Walsh
e9ae198547
Multimodal/vision support
...
Add support for pulling hf repos vs individual models, replicating
the `llama.cpp -hf <model>` logic.
Add support for mmproj file in model store snapshot.
If an mmproj file is available pass it on the llama.cpp command line.
Structure classes to continue support for modelscope as ModelScopeRepository
inherits from HuggingfaceRepository.
Example usage:
$ ramalama serve huggingface://ggml-org/gemma-3-4b-it-GGUF
...
Open webui, upload a picture, ask for a description.
Fixes : #1405
Signed-off-by: Oliver Walsh <owalsh@redhat.com>
2025-05-16 10:40:18 +01:00
Xiaodong Ye
267979fa44
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 19:20:31 +08:00
Xiaodong Ye
b68c6b4c45
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 19:20:31 +08:00
Xiaodong Ye
7e4a0102af
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 19:20:31 +08:00
Xiaodong Ye
d33efcc5ec
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 19:20:31 +08:00
Xiaodong Ye
80f2393283
Support Moore Threads GPU
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 19:20:31 +08:00
Eric Curtin
04032f28c1
Merge pull request #1410 from makllama/xd/mthreads_doc
...
Support Moore Threads GPU #2
2025-05-15 09:44:51 +01:00
Xiaodong Ye
e2faafa68e
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 10:41:09 +08:00
Xiaodong Ye
ae79ab16b2
Add doc for Moore Threads GPU
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-15 10:26:21 +08:00
Eric Curtin
9e01abf5ef
Merge pull request #1408 from bmahabirbu/ocr-cleanup
...
fix: removed ocr print statement and updated ocr description
2025-05-14 15:41:33 +01:00
Brian
dd6e03f991
fix: removed ocr print statement and updated ocr description
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-05-14 10:04:57 -04:00
Daniel J Walsh
e8f14a3e5b
Merge pull request #1400 from bmahabirbu/ocr
...
added a docling ocr (text image recognition) flag to address RAM issue
2025-05-14 05:52:35 -04:00
Brian
2b5ee4e7c0
added a docling ocr (text image recognition) flag to address RAM issue
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-05-13 22:51:21 -04:00
Daniel J Walsh
daa2948ef6
Merge pull request #1399 from rhatdan/VERSION
...
Fix cuda builds installation of python3.11
2025-05-13 11:21:51 -04:00
Daniel J Walsh
5fa60ffb68
Merge pull request #1406 from sarroutbi/202505131640-include-building-containers-in-contributing-md
...
Include additional information in CONTRIBUTING.md
2025-05-13 11:20:57 -04:00
Sergio Arroutbi
2f0813c378
Include additional information in CONTRIBUTING.md
...
Include additional information such as:
- Possibility to generate containers through Makefile
- Possibility to generate coverage reports through Makefile
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-13 16:57:08 +02:00
Daniel J Walsh
091147d844
Merge pull request #1404 from sarroutbi/202505131435-include-minor-contributing-md-improvements
...
Add minor CONTRIBUTING.md enhancements
2025-05-13 09:50:22 -04:00
Daniel J Walsh
3490306484
Merge pull request #1403 from sarroutbi/202505131335-increase-cli-coverage
...
Increase cli.py coverage
2025-05-13 09:49:55 -04:00
Sergio Arroutbi
a2a040f830
Increase cli.py coverage
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-13 15:02:31 +02:00
Sergio Arroutbi
f3cd12dce8
Add minor CONTRIBUTING.md enhancements
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-13 14:36:23 +02:00
Daniel J Walsh
e67cae5b66
Fix cuda builds installation of python3.11
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-13 08:28:20 -04:00
Eric Curtin
c802a33b6b
Merge pull request #1402 from sarroutbi/202505131227-fix-pylint-issues
...
Fix issues reported by pylint for cli.py
2025-05-13 13:06:51 +01:00
Sergio Arroutbi
0030b8ae4c
Fix issues reported by pylint for cli.py
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-13 13:25:48 +02:00
Daniel J Walsh
1c945ea06d
Merge pull request #1398 from containers/less-paths-added
...
Remove all path additions to this file
2025-05-13 05:43:50 -04:00
Eric Curtin
35dc8aac2f
Remove all path additions to this file
...
This was added when we didn't have good installation techniques
for mac. We have pipx which was not intuitive and a hacked
together shell script as an alternative. Now that we have brew and
uv integrated we don't need this code.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-13 10:16:56 +01:00
Eric Curtin
935dc717b1
Merge pull request #1396 from containers/build-fix-2
...
Fix builds
2025-05-13 09:58:46 +01:00
Eric Curtin
36137ac613
Fix builds
...
Use array for list of packages. Move start of script execution
after all function definitions.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-13 09:58:19 +01:00
Eric Curtin
a4bcd52d14
Merge pull request #1395 from ieaves/imp/main-error-reporting
...
Using perror in cli.main
2025-05-12 19:54:12 +01:00
Ian Eaves
0238164464
switched cli.main away from print to perror
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-05-12 13:22:44 -05:00
Daniel J Walsh
4ad7812185
Merge pull request #1393 from containers/reword-description
...
This script is not macOS only
2025-05-12 12:27:53 -04:00
Daniel J Walsh
e7e8182fec
Merge pull request #1391 from rhatdan/VERSION
...
Bump to 0.8.3
2025-05-12 12:25:16 -04:00
Eric Curtin
f7296d0e23
This script is not macOS only
...
This script works for many platforms and generally does the right
thing.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-12 17:07:13 +01:00
Daniel J Walsh
730a09dba3
Merge pull request #1392 from containers/change-install-url
...
Shorten url in README.md
2025-05-12 12:04:39 -04:00
Daniel J Walsh
71872f8ced
Bump to v0.8.3
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-12 12:02:34 -04:00
Eric Curtin
c5ea7fc9d1
Shorten url in README.md
...
This is now installable via https://ramalama.ai/install.sh
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-12 16:57:43 +01:00
Daniel J Walsh
4f240699da
Merge pull request #1389 from containers/punctuation-consistency-2
...
More de-duplication and consistency
2025-05-12 07:48:12 -04:00
Eric Curtin
24698d1c4b
More de-duplication and consistency
...
After the modelscope changes
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-12 12:02:18 +01:00
Daniel J Walsh
beba6bb066
Merge pull request #1371 from engelmi/add-output-path-to-generate
...
Add output path to generate quadlet/kube
2025-05-12 06:21:57 -04:00
Daniel J Walsh
e06824a572
Merge pull request #1381 from makllama/xd/modelscope
...
Add support for modelscope and update doc
2025-05-12 06:17:32 -04:00
Michael Engel
1323b25a7a
Added unit and system test for generate change
...
Added unit tests for new parsing feature of --generate option as
well as for the refactored quadlet file generation. In addition,
a system test has been added to verify the output directory of
the --generate option works as expected.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-12 11:59:38 +02:00
Michael Engel
385a82ab69
Added support for expanding user directory in IniFile and PlainFile
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-12 11:59:37 +02:00
Michael Engel
0a90abe25a
Extended --generate option by output directory
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-12 11:59:37 +02:00
Michael Engel
56bea00ea7
Refactor kube generation by wrapping in file class
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-12 11:59:37 +02:00
Michael Engel
fe19a7384a
Refactor quadlet generation for use of configparser
...
Instead of writing the quadlet string manually, let's use the
configparser from the standard library. A slim wrapper class
has been added as well to simplify the usage of configparser.
In addition, the generated quadlets are not directly written to
file, but instead the inifile instances are returned. This
implies that the caller needs to do the write_to_file call and
enables writing simple unit tests for the generation.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-12 11:59:37 +02:00
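The approach in the commit above, returning a config object and leaving the write to the caller, can be sketched with the standard library alone. The function names, section contents, and image name below are hypothetical and do not reproduce the wrapper class from the commit.

```python
import configparser
import io


def generate_quadlet(name: str, image: str) -> configparser.ConfigParser:
    """Build the quadlet content in memory without touching the disk."""
    cfg = configparser.ConfigParser()
    # Keep option names case-sensitive; the default lowercases them.
    cfg.optionxform = str
    cfg["Unit"] = {"Description": f"Service for {name}"}
    cfg["Container"] = {"Image": image}
    return cfg


def render(cfg: configparser.ConfigParser) -> str:
    """Serialize the config the same way it would be written to a file."""
    buf = io.StringIO()
    cfg.write(buf, space_around_delimiters=False)
    return buf.getvalue()
```

Because nothing is written until the caller decides to, a unit test can assert on the returned object or on the rendered string directly.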
Eric Curtin
107cd50e63
Merge pull request #1363 from melodyliu1986/melodyliu1986-feature-branch
...
update the shortnames path according to the shortnames.py
2025-05-12 10:53:20 +01:00
Eric Curtin
2b8cdbbe83
Merge pull request #1387 from makllama/xd/docker_build
...
Fix #1382
2025-05-12 10:44:30 +01:00
Eric Curtin
505061c0af
Merge pull request #1388 from nathan-weinberg/mac-no-bats
...
ci(fix): macOS runner didn't have bats
2025-05-12 10:43:31 +01:00
Song Liu
4beb41aca6
update the shortnames path according to the shortnames.py
...
Signed-off-by: Song Liu <soliu@redhat.com>
2025-05-12 16:01:11 +08:00
Nathan Weinberg
a80a556332
Merge pull request #1386 from containers/punctuation-consistency
...
Punctuation consistency when pulling models
2025-05-11 23:05:43 -04:00
Nathan Weinberg
2a1317936a
ci(fix): macOS runner didn't have bats
...
couldn't run e2e tests
also consolidated all package installs into one step
Signed-off-by: Nathan Weinberg <nathan2@stwmd.net>
2025-05-11 22:55:06 -04:00
Xiaodong Ye
1801378950
Fix #1382
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-12 10:32:23 +08:00
Xiaodong Ye
34af059f3d
Address issues found by CI
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-12 09:52:16 +08:00
Eric Curtin
8f112e7c0d
Punctuation consistency when pulling models
...
Around spacing
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-11 19:38:18 +01:00
Daniel J Walsh
b183d0e4f5
Merge pull request #1383 from makllama/xd/docker
...
Support older version of Docker
2025-05-11 11:46:09 -04:00
Eric Curtin
fad29e5bac
Merge pull request #1380 from antbbn/patch-1
...
Check nvidia-container-runtime executable also in engine.py
2025-05-11 15:39:57 +01:00
Eric Curtin
4505758ca2
Merge pull request #1384 from nathan-weinberg/more-ci-fixes
...
ci: additional fixes and cleanup for image build jobs
2025-05-11 13:56:49 +01:00
Xiaodong Ye
590199c9dd
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-11 14:04:03 +08:00
Xiaodong Ye
9e410944cb
Add unit tests
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-11 13:54:18 +08:00
Xiaodong Ye
1aac29e783
Address review comments
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-11 11:56:58 +08:00
Nathan Weinberg
f5313251ad
ci: additional fixes and cleanup for image build jobs
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-10 16:24:32 -04:00
Xiaodong Ye
d32d6ed6dd
Support older version of Docker
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-10 20:51:28 +08:00
Xiaodong Ye
9985b9ef75
Format changes for passing CI
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-10 20:48:37 +08:00
Antonio Bibiano
bff58da5d4
Update test in case nvidia-container-runtime binary is not present
...
Signed-off-by: Antonio Bibiano <antbbn@gmail.com>
2025-05-10 14:37:50 +02:00
Antonio Bibiano
5adaa9b8b8
Check nvidia-container-runtime executable also in engine.py
...
Signed-off-by: Antonio Bibiano <antbbn@gmail.com>
2025-05-10 14:14:24 +02:00
Xiaodong Ye
15da984fe0
Add support for modelscope
...
Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>
2025-05-10 19:51:08 +08:00
Eric Curtin
177397c346
Merge pull request #1375 from nathan-weinberg/fix-image-jobs
...
ci: modify 'latest' job to only run on release
2025-05-10 12:31:06 +01:00
Daniel J Walsh
6392d3c7e9
Merge pull request #1378 from TristanCacqueray/vision-support
...
Update llama_cpp_sha to the latest version
2025-05-10 06:32:40 -04:00
Tristan Cacqueray
6e4c290ca2
Update llama_cpp_sha to the latest version
...
This change brings vision support to the rpc-server.
Signed-off-by: Tristan Cacqueray <tdecacqu@redhat.com>
2025-05-10 11:07:34 +02:00
Nathan Weinberg
1281879f9f
ci: modify 'latest' job to only run on release
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-09 13:10:20 -04:00
Daniel J Walsh
a9f5238082
Merge pull request #1340 from ieaves/feat/standardized-build
...
Remove hardcoded /usr/local site-packages injection to fix sys.path pollution
2025-05-09 10:55:44 -04:00
Daniel J Walsh
d381120860
Merge pull request #1373 from rhatdan/build
...
Make version optional in build
2025-05-09 10:33:29 -04:00
Daniel J Walsh
17d91fbe24
Merge pull request #1372 from nathan-weinberg/ci-tweaks
...
Various CI fixes
2025-05-09 10:15:14 -04:00
Daniel J Walsh
5260ad701a
Make version optional in build
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-09 10:14:04 -04:00
Nathan Weinberg
c8e992846f
ci(docs): add CI report matrix
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-09 09:28:16 -04:00
Nathan Weinberg
2db912c383
ci(chore): remove incorrect 'Fedora' message from install job
...
also remove some trailing whitespace
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-09 09:27:26 -04:00
Nathan Weinberg
01691fe899
ci(fix): fix regex for CI image job
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-09 09:27:26 -04:00
Nathan Weinberg
99bdf6097f
ci(fix): add 'make install-requirements' to 'latest' and 'nightly' jobs
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-09 09:27:26 -04:00
Eric Curtin
340c417820
Merge pull request #1370 from jelly/404-urls
...
Update not found urls
2025-05-09 13:17:21 +01:00
Eric Curtin
d740fc6135
Merge pull request #1367 from rhatdan/rag
...
Use python3.11 on systems with older python
2025-05-09 13:16:58 +01:00
Jelle van der Waa
3c253faed6
Update podman markdown links
...
Signed-off-by: Jelle van der Waa <jvanderwaa@redhat.com>
2025-05-09 14:16:14 +02:00
Jelle van der Waa
a08af92c55
Update llama.cpp documentation url
...
Signed-off-by: Jelle van der Waa <jvanderwaa@redhat.com>
2025-05-09 14:09:11 +02:00
Daniel J Walsh
a3beed7d14
Merge pull request #1369 from mcornea/cuda_all_devices
...
Use all GPUs in CUDA_VISIBLE_DEVICES as default
2025-05-09 06:10:56 -04:00
Marius Cornea
2a218f8bce
Use all GPUs in CUDA_VISIBLE_DEVICES as default
...
Currently the CUDA_VISIBLE_DEVICES environment variable defaults to '0'
when it's not overridden by the user. This commit updates it to include all
available GPUs detected by nvidia-smi, allowing the application to
utilize multiple GPUs by default.
Signed-off-by: Marius Cornea <mcornea@redhat.com>
2025-05-09 09:51:36 +03:00
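The default described in the commit above, combined with the earlier rule that a user-defined value takes precedence, can be sketched like this. The function is hypothetical and takes the detected GPU indices as an argument rather than calling nvidia-smi, so that it stays self-contained.

```python
def cuda_visible_devices(env: dict, detected_gpu_indices: list) -> str:
    """Pick the CUDA_VISIBLE_DEVICES value to use for the container."""
    # A value set by the user always takes precedence.
    if "CUDA_VISIBLE_DEVICES" in env:
        return env["CUDA_VISIBLE_DEVICES"]
    # Otherwise expose every detected GPU, e.g. "0,1,2,3".
    return ",".join(str(index) for index in detected_gpu_indices)
```

For a host with four GPUs and no override, this yields `0,1,2,3` instead of the previous default of `0`.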
Daniel J Walsh
317103e542
Use python3.11 on systems with older python
...
Fixes: https://github.com/containers/ramalama/issues/1362
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-08 13:27:45 -04:00
Eric Curtin
ae57590e66
Merge pull request #1366 from mikebonnet/fix-client-cmd
...
fix "ramalama client"
2025-05-08 16:55:05 +01:00
Eric Curtin
3ab554f155
Merge pull request #1359 from rhatdan/docling
...
Allow docling to handle URLs rather than handling locally
2025-05-08 16:04:46 +01:00
Mike Bonnet
2d7407cc90
fix "ramalama client"
...
get_cmd_with_wrapper() was changed in 849813f8
to accept a single string argument instead
of a list. Update cli.py to pass only the first element of the list.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-05-08 07:46:19 -07:00
Daniel J Walsh
0e616778eb
Merge pull request #1361 from mikebonnet/rag-build-tweaks
...
small improvements to the build of the ramalama-rag image
2025-05-08 08:46:27 -04:00
Daniel J Walsh
0530c1e6bf
Merge pull request #1364 from sarroutbi/202505081057-extend-tomlparser-coverity
...
Extend TOMLParser coverage to 100%
2025-05-08 08:39:29 -04:00
Daniel J Walsh
b5e6269e81
Merge pull request #1365 from sarroutbi/202505081128-groom-coverage-rules
...
Groom coverage rules, generate xml/lcov reports
2025-05-08 08:37:01 -04:00
Sergio Arroutbi
1de0c27534
Groom coverage rules, generate xml/lcov reports
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-08 11:36:42 +02:00
Sergio Arroutbi
675f302f1c
Extend TOMLParser coverage to 100%
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-08 11:12:49 +02:00
Daniel J Walsh
b4dc9ad977
Merge pull request #1355 from mcornea/fix_cuda_devices
...
Allow user-defined CUDA_VISIBLE_DEVICES environment variable
2025-05-07 14:18:59 -04:00
Marius Cornea
0db9aac978
Allow user-defined CUDA_VISIBLE_DEVICES environment variable
...
The check_nvidia function was previously overriding any user-defined
CUDA_VISIBLE_DEVICES environment variable with a default value of "0".
This change adds a check to only set CUDA_VISIBLE_DEVICES=0 when it's not
already present in the environment.
Signed-off-by: Marius Cornea <mcornea@redhat.com>
2025-05-07 21:08:55 +03:00
Mike Bonnet
e8415fc4da
build_rag.sh: set the pip installation prefix to /usr
...
This is consistent with how pip installs packages in the base ramalama image.
Remove some redundant package names from docling(), they're already installed in rag().
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-05-07 10:49:38 -07:00
Mike Bonnet
c4d9940e56
install git-core instead of git
...
Avoid pulling in a bunch of unnecessary perl packages.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-05-07 10:49:38 -07:00
Mike Bonnet
ed5e8a3dce
build_rag.sh: fix logic error when building from a UBI9-based image
...
"$VERSION_ID" is set to "9.5" when building from a UBI9-based image (the default). This fails
the "-ge" test. Check if "$ID" is "fedora" before assuming "$VERSION_ID" is an integer.
If python3.11 is getting installed, also install python3.11-devel explicitly.
Signed-off-by: Mike Bonnet <mikeb@redhat.com>
2025-05-07 10:49:04 -07:00
Daniel J Walsh
8a0f0f2038
Allow docling to handle URLs rather than handling locally
...
Docling has support for pulling html pages, and we were not pulling them
correctly.
Also support --dryrun
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-07 13:04:04 -04:00
Daniel J Walsh
30c13c04d1
Merge pull request #1360 from nathan-weinberg/update-lls-container
...
chore: update curl commands in llama-stack Containerfile
2025-05-07 13:00:29 -04:00
Nathan Weinberg
071426e2e7
chore: update curl commands in llama-stack Containerfile
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-07 12:05:13 -04:00
Ian Eaves
785c66184b
updated build to remove setup.py dependency to fix cli entrypoint
...
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
removed uv.lock
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
reverts uv-install.sh, bin/ramalama, and flat cli hierarchy
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
packit version extraction from pyproject.toml
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
pyproject.toml references license file
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
fixed completion directory location
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
fixed format and check-format. There is no longer a root .py file to check
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
newline at end of install-uv.sh
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
remove *.py from make lint flake8 command
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
added import for ModelStoreImport to main
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
attempt to consolidate main functions
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
lint
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
Make bin/ramalama executable
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
typo
Signed-off-by: Ian Eaves <ian.k.eaves@gmail.com>
2025-05-07 10:57:31 -05:00
Daniel J Walsh
7cc1052e0c
Merge pull request #1356 from sarroutbi/202505062329-add-test-tomlparser-unit-test
...
Add TOMLParser unit tests
2025-05-07 08:25:41 -04:00
Sergio Arroutbi
feb46d7c5c
Add TOMLParser unit tests
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-07 12:10:52 +02:00
Eric Curtin
a4cf5cca48
Merge pull request #1358 from sarroutbi/202505062219-install-coverity-tools-and-execute-them
...
Add coverage tools, run them via specific rules
2025-05-07 11:08:29 +01:00
Sergio Arroutbi
70806fa8ab
Add coverage tools, run them via specific rules
...
Added new rules to install/run specific coverage tools:
* install-detailed-cov-requirements: Install basic coverage tools
* install-cov-requirements: Install extended coverage tools
* cov-tests: Execute basic coverage tools
* detailed-cov-tests: Execute extended coverage tools
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-07 11:44:59 +02:00
Eric Curtin
279a5ff32c
Merge pull request #1353 from engelmi/use-model-type-instead-of-class-name
...
Use model type instead of class name
2025-05-06 21:44:56 +01:00
Michael Engel
5e5e35b4b5
Use model type instead of class name
...
Relates to: https://github.com/containers/ramalama/issues/1325
Follow-up of: https://github.com/containers/ramalama/pull/1350
Previously, the model_type member of the model store was set to
the class name of the model, which mapped URL types like http or file
to url. This is now changed to use the model_type property of the
model class. It is, by default, still the inferred class name, except
in the URL class where it gets set to the URL scheme.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-06 15:39:50 +02:00
Daniel J Walsh
ae9c30d50c
Merge pull request #1345 from containers/change-to-cli-serve
...
Use CLI ramalama serve here
2025-05-06 08:51:20 -04:00
Eric Curtin
dfa7ec81db
Merge pull request #1349 from rhatdan/options
...
Consolidate and alphabetize runtime options
2025-05-06 13:50:41 +01:00
Eric Curtin
1496e13108
Use CLI ramalama serve here
...
It's easier to debug, we can do ps -ef, etc. Easier to code also.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-06 13:00:50 +01:00
Eric Curtin
da4f3b5489
Merge pull request #1350 from engelmi/fix-partial-model-listing
...
Fix partial model listing
2025-05-06 12:42:49 +01:00
Daniel J Walsh
3f95d053cc
Consolidate and alphabetize runtime options
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-06 07:31:06 -04:00
Daniel J Walsh
f57404ff2e
Merge pull request #1352 from sarroutbi/202505061305-minor-typo
...
Fix typo (RAMALAMA_TRANSPORTS->RAMALAMA_TRANSPORT)
2025-05-06 07:28:10 -04:00
Sergio Arroutbi
9beeb29c59
Fix typo (RAMALAMA_TRANSPORTS->RAMALAMA_TRANSPORT)
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-05-06 13:06:28 +02:00
Michael Engel
fa4f40f547
Map url:// prefix of model name to URL class
...
Relates to: https://github.com/containers/ramalama/issues/1325
In the list models function only the url:// prefix is present.
When a listed model is passed to the factory, this model input
cannot be mapped correctly to the URL model class. Therefore, the
mapping is extended and the unit tests are updated with appropriate cases.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-06 12:31:44 +02:00
Michael Engel
53e51c72c5
Remove partial postfix from model name
...
Relates to: https://github.com/containers/ramalama/issues/1325
Instead of appending the (partial) identifier directly, the returned
ModelFile class is extended to indicate if the file is partially
downloaded or not.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-05-06 12:29:41 +02:00
Daniel J Walsh
3813a01e04
Merge pull request #1346 from rhatdan/VERSION
...
Bump to v0.8.2
2025-05-05 11:34:15 -04:00
Daniel J Walsh
982e70d51b
Bump to v0.8.2
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-05 10:35:02 -04:00
Daniel J Walsh
969e0f6b36
Merge pull request #1347 from containers/if-run-ramalama-exists
...
Only execute this if /run/ramalama exists
2025-05-05 10:34:52 -04:00
Eric Curtin
800224e6ee
Only execute this if /run/ramalama exists
...
Using this script to install llama.cpp and whisper.cpp bare metal
on a bootc system, the build stops executing here:
+ ln -sf /usr/bin/podman-remote /usr/bin/podman
+ python3 -m pip install /run/ramalama --prefix=/usr
ERROR: Invalid requirement: '/run/ramalama': Expected package name at the start of dependency specifier
/run/ramalama
^
Hint: It looks like a path. File '/run/ramalama' does not exist.
Error: building at STEP "RUN chmod a+rx /usr/bin/build_llama_and_whisper.sh && build_llama_and_whisper.sh "rocm"": while running runtime: exit status 1
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-05 14:40:55 +01:00
Eric Curtin
f1668aea45
Merge pull request #1344 from schuellerf/patch-1
...
Update ramalama-cuda.7.md
2025-05-05 13:32:56 +01:00
Florian Schüller
d6a21a9582
Update ramalama-cuda.7.md
...
Looks like a typo?
Signed-off-by: Florian Schüller <florian.schueller@redhat.com>
2025-05-05 11:24:03 +02:00
Eric Curtin
5670fc0d66
Merge pull request #1339 from benoitf/ignore-none
...
fix: ignore <none>:<none> images
2025-05-05 07:13:36 +01:00
Eric Curtin
c4f7aaa953
Merge pull request #1343 from xxiong2021/main
...
according to Commit 1d36b36, the files path was changed
2025-05-05 06:38:03 +01:00
Xiaoqiang Xiong
0b815654cf
according to Commit 1d36b36, the files path was changed
...
Signed-off-by: Xiaoqiang Xiong <xxiong@redhat.com>
2025-05-05 11:31:26 +08:00
Florent Benoit
d43b715d78
fix: ignore <none>:<none> images
...
related to https://github.com/containers/ramalama/issues/904
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-05-04 21:39:08 +02:00
Daniel J Walsh
34f8cf2d50
Merge pull request #1336 from rhatdan/llama-stack
...
INFERENCE_MODEL should be set by the container engine
2025-05-03 06:51:00 -04:00
Daniel J Walsh
d6b3b2da14
INFERENCE_MODEL should be set by the container engine
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-02 11:12:56 -04:00
Eric Curtin
449b7b7fbd
Merge pull request #1335 from rhatdan/llama-stack
...
llama stack run should be the CMD not run during build
2025-05-02 12:04:37 +01:00
Daniel J Walsh
e81cab92c4
llama stack run should be the CMD not run during build
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-02 06:34:54 -04:00
Daniel J Walsh
1846fb8611
Merge pull request #1328 from containers/add-comment
...
Describe what this test does
2025-05-02 06:11:10 -04:00
Daniel J Walsh
b18030fca3
Merge pull request #1334 from containers/ramalama-shell-fixes
...
RamaLamaShell fixes
2025-05-02 06:10:36 -04:00
Eric Curtin
ba485eec11
Merge pull request #1332 from containers/make-install-more-resiliant
...
Make installer more resilient
2025-05-02 10:40:36 +01:00
Eric Curtin
995d0b1cd8
RamaLamaShell fixes
...
Was testing this, found some bugs, mainly caused by the recursive
call of cmdloop. Fixed this by using no recursion. Some
refactorings.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-02 10:39:16 +01:00
Eric Curtin
8ec74cb553
Merge pull request #1333 from bmahabirbu/mac-fix
...
Fixed mac gpu not being enabled from stale global var check
2025-05-02 09:33:46 +01:00
Eric Curtin
abb2cf47fa
Merge pull request #1329 from containers/mistral-small
...
Add shortnames for mistral-small3.1 model
2025-05-02 09:33:07 +01:00
Brian
8a02b1d0d3
Fixed mac gpu not being enabled from stale global var check
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-05-02 00:00:20 -04:00
Eric Curtin
37a56dbdbf
Make installer more resilient
...
This checked in file is an exact copy of:
curl -LsSfO https://astral.sh/uv/0.7.2/install.sh
Checking in the 0.7.2 version, because now a user can install with
access to github.com alone. Even if astral.sh is down for whatever
reason.
We may want to update uv installer from time to time.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-01 20:08:28 +01:00
Daniel J Walsh
88a56ab35f
Merge pull request #1331 from rhatdan/llama-stack
...
Fixup use of /.venv
2025-05-01 14:47:18 -04:00
Daniel J Walsh
907cb41315
Fixup use of /.venv
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-01 14:38:14 -04:00
Daniel J Walsh
9a7444c6dc
Merge pull request #1330 from nathan-weinberg/container-fix
...
fix: additional fixes for llama-stack Containerfile
2025-05-01 14:35:47 -04:00
Nathan Weinberg
0ed1029e31
fix: additional fixes for llama-stack Containerfile
...
update locations of YAML files and fix typo with 'uv run'
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-05-01 14:05:48 -04:00
Eric Curtin
2c48af0175
Add shortnames for mistral-small3.1 model
...
Another Ollama model that's only compatible with Ollama's fork
of llama.cpp
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-01 17:48:46 +01:00
Eric Curtin
c7b92e1564
Describe what this test does
...
It failed for me once
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-01 16:57:32 +01:00
Daniel J Walsh
023f875427
Merge pull request #1327 from rhatdan/docs
...
Expose http line in man pages
2025-05-01 11:46:13 -04:00
Eric Curtin
39c29b2857
Merge pull request #1158 from containers/use-wrapper-everywhere
...
Turn on client/server implementation of run
2025-05-01 15:15:49 +01:00
Daniel J Walsh
e2f382ab62
Expose http line in man pages
...
The llama.cpp documentation link is lost when we convert markdown to
nroff format. This change will expose the link in man pages.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-01 09:45:12 -04:00
Eric Curtin
849813f8b7
Turn on client/server implementation of run
...
Now that we've had one release with the wrapper scripts included
in the container images it should be safe to turn this on
everywhere.
Only add libexec for commands that have wrappers
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-05-01 14:13:10 +01:00
Daniel J Walsh
38a57687dd
Merge pull request #1323 from rhatdan/docs
...
Switch all Ramalama to RamaLama
2025-05-01 06:15:01 -04:00
Daniel J Walsh
ee588cecee
Switch all Ramalama to RamaLama
...
Fix Ramalama names that have snuck into the repo.
Cleanup whitespace in README.md doc.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-05-01 06:13:42 -04:00
Daniel J Walsh
9b639c6172
Merge pull request #1320 from arburka/main
...
Update Docs page
2025-05-01 05:31:12 -04:00
Daniel J Walsh
2c4693a3bd
Merge pull request #1319 from arburka/patch-1
...
Updates to ReadMe doc
2025-05-01 05:27:12 -04:00
arburka
1de54a92e3
Content Update to ramalama docs/readme file
...
Signed-off-by: arburka <88330245+arburka@users.noreply.github.com>
2025-04-30 21:46:51 -04:00
arburka
20868bf17b
Updated Ramalama README.md for readability, clarity, and scannability improvements
...
Signed-off-by: arburka <88330245+arburka@users.noreply.github.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
2025-04-30 21:24:37 -04:00
Eric Curtin
8f6e135bb2
Merge pull request #1317 from rhatdan/llama-stack
...
Fix up several issues in llama-stack Containerfile
2025-04-30 20:16:18 +01:00
Daniel J Walsh
18333b431d
Fix up several issues in llama-stack Containerfile
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-30 15:12:42 -04:00
Daniel J Walsh
922ac6bc6b
Merge pull request #1314 from nathan-weinberg/lls-container
...
feat: update llama-stack Containerfile to use ramalama-stack
2025-04-30 15:00:58 -04:00
Daniel J Walsh
ad27acb095
Merge pull request #1312 from containers/simplify-install-script
...
Simplify installer
2025-04-30 13:57:48 -04:00
Eric Curtin
f790ae4361
Simplify installer
...
Use uv installer from uv itself
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-30 16:38:58 +01:00
Nathan Weinberg
e5dde374b7
feat: update llama-stack Containerfile to use ramalama-stack
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-30 11:30:08 -04:00
Daniel J Walsh
21257c3d79
Merge pull request #1311 from dougsland/smi
...
common: adjust nvidia-smi for check cuda version
2025-04-30 09:47:12 -04:00
Douglas Landgraf
393a67b9b0
common: adjust nvidia-smi for check cuda version
...
Be compatible with both versions of nvidia-smi,
with or without the version flag.
Signed-off-by: Douglas Landgraf <dlandgra@redhat.com>
2025-04-30 09:05:24 -04:00
Daniel J Walsh
db8f30ec18
Merge pull request #1292 from containers/pass-args-to-ramalama-run-core
...
Pass args to ramalama run core
2025-04-30 08:14:46 -04:00
Eric Curtin
bb259ad7af
Pass args to *core scripts
...
Ensure arguments are passed to *core scripts
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-30 11:55:44 +01:00
Daniel J Walsh
a13764c363
Merge pull request #1309 from sarroutbi/202504292004-avoid-unused-parameter
...
Avoid unused parameter
2025-04-29 15:49:55 -04:00
Daniel J Walsh
3fd4fdb4d1
Merge pull request #1310 from rhatdan/main
...
Bump to 0.8.1
2025-04-29 15:29:06 -04:00
Daniel J Walsh
e1f84cb1b9
Bump to 0.8.1
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-29 14:15:52 -04:00
Sergio Arroutbi
d1c0eda2aa
Avoid unused parameter
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-29 20:05:44 +02:00
Daniel J Walsh
b9ffbf8955
Merge pull request #1307 from engelmi/catch-possible-go-to-jinja-error
...
Catch possible error when parsing Go to Jinja template
2025-04-29 12:29:47 -04:00
Eric Curtin
c160628ef8
Merge pull request #1305 from sarroutbi/202504291518-avoid-usage-of-reserved-words
...
Avoid reserved words usage and fix format
2025-04-29 16:53:04 +01:00
Michael Engel
98c2af92f5
Catch possible error when parsing Go to Jinja template
...
Parsing a chat template in Go syntax to a Jinja template might raise an
exception. Since this is only a nice-to-have feature and we fall back to
the chat template specified in the backend, let's silently skip it.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-29 17:52:40 +02:00
Sergio Arroutbi
4c20cdc392
Avoid reserved words usage and fix format
...
* Avoid reserved words usage such as `hash` or `all`
* Fix format
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-29 15:23:00 +02:00
Daniel J Walsh
4a8477c8e2
Merge pull request #1298 from melodyliu1986/melodyliu1986-feature-branch
...
Fix the error: ramalama login can NOT get the value of RAMALAMA_TRANSPORT
2025-04-29 06:44:33 -04:00
Daniel J Walsh
36e2055426
Merge pull request #1299 from rhatdan/pull
...
Report on the use of cached models
2025-04-29 06:43:09 -04:00
Song Liu
d226d1fb17
Fix the error: ramalama login can NOT get the value of env var RAMALAMA_TRANSPORT
...
Signed-off-by: Song Liu <soliu@redhat.com>
2025-04-29 15:49:05 +08:00
Daniel J Walsh
2fba91c28e
Report on the use of cached models
...
Don't attempt to pull a model when inspecting it
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-28 14:48:54 -04:00
Daniel J Walsh
c999b7fbe0
Merge pull request #1301 from rhatdan/VERSION
...
Fix rpm scripts to correct version
2025-04-28 14:26:23 -04:00
Daniel J Walsh
8a060e4611
Fix rpm scripts to correct version
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-28 14:25:04 -04:00
Daniel J Walsh
a963045414
Merge pull request #1284 from rhatdan/VERSION
...
Bump to v0.8.0
2025-04-28 14:20:08 -04:00
Daniel J Walsh
3050d9393f
Merge pull request #1300 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1745848351
2025-04-28 14:18:22 -04:00
renovate[bot]
0459d6b79e
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1745848351
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-04-28 16:12:17 +00:00
Daniel J Walsh
381dd738aa
Bump to v0.8.0
...
Stop using installed library version.
Fixes: https://github.com/containers/ramalama/issues/1297
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-28 07:36:34 -04:00
Daniel J Walsh
b4aaba551b
Merge pull request #1294 from rhatdan/cache
...
Make --no-cache optional for make build
2025-04-28 07:24:17 -04:00
Daniel J Walsh
3b2cae4691
Merge pull request #1295 from rhatdan/info
...
Add shortname information to ramalama info
2025-04-28 07:23:55 -04:00
Daniel J Walsh
715dffbb53
Make --no-cache optional for make build
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-28 06:59:13 -04:00
Daniel J Walsh
75600a6c36
Add shortname information to ramalama info
...
Fixes: https://github.com/containers/ramalama/issues/1263
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-27 08:00:25 -05:00
Daniel J Walsh
aefb785494
Merge pull request #1282 from rhatdan/build
...
Use current ramalama directory rather than main from repo
2025-04-26 08:23:05 -04:00
Daniel J Walsh
6c59fd7fd5
Use current ramalama directory rather than main from repo
...
This allows users to experiment with content and get it into
container image.
Fixes: https://github.com/containers/ramalama/issues/1274
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-26 07:04:59 -05:00
Daniel J Walsh
193ec04793
Merge pull request #1290 from containers/fix-for-finding-correct-config-file
...
Fix to check correct directory for shortnames file
2025-04-26 07:51:37 -04:00
Daniel J Walsh
cf2cb3b570
Merge pull request #1291 from containers/add-exit
...
Exit ramalama if user types exit
2025-04-26 07:49:39 -04:00
Daniel J Walsh
63eccb105a
Merge pull request #1293 from containers/run-server-ignore-ctrl-c
...
Change CLI behaviour
2025-04-26 07:46:19 -04:00
Eric Curtin
fc78d0bf18
Change CLI behaviour
...
Don't exit on Ctrl-C, cut response short or print an info message
to the user telling them how they may exit.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-26 12:23:20 +01:00
Eric Curtin
db6cdb5296
Exit ramalama if user types exit
...
To behave like bash or python3
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-25 23:19:01 +01:00
Eric Curtin
7d07701b1a
Fix to check correct directory for shortnames file
...
If installed with uv, the correct directory wasn't being checked
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-25 22:27:43 +01:00
Daniel J Walsh
c58f56e112
Merge pull request #1287 from olivergs/workaroundcuda
...
Workaround for CUDA image not pointing to libcuda.so.1 in ld.so.conf
2025-04-25 17:07:25 -04:00
Daniel J Walsh
ae1adcc12f
Merge pull request #1288 from containers/shortnames-gemma3
...
Add gemma3 shortnames
2025-04-25 17:05:03 -04:00
Oliver Gutierrez
a22e209551
Workaround for CUDA image not pointing to libcuda.so.1 in ld.so.conf
...
libcuda.so.1 is located at /usr/local/cuda-12.8/compat and that path
is not in any /etc/ld.so.conf.d/* files.
The workaround is to simply add the path and run ldconfig to make it
available.
Signed-off-by: Oliver Gutierrez <ogutsua@gmail.com>
2025-04-25 19:39:36 +01:00
Eric Curtin
35cbabdb74
Add gemma3 shortnames
...
Otherwise we will pull the incompatible gguf's from Ollama
registry.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-25 18:06:06 +01:00
Eric Curtin
7fb4710b2e
Merge pull request #1285 from sarroutbi/202504251610-fix-issues-reported-by-checkmake
...
Fix minor issues
2025-04-25 16:53:13 +01:00
Eric Curtin
0bd07f9aa6
Merge pull request #1286 from sarroutbi/202504251655-use-camel-case-for-consistency
...
Use RamaLama instead of Ramalama for consistency
2025-04-25 16:00:36 +01:00
Sergio Arroutbi
e6e8b1e881
Use RamaLama instead of Ramalama for consistency
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-25 16:56:01 +02:00
Sergio Arroutbi
837b0c99c2
Fix minor issues
...
- Include test rule in global Makefile
- Use GO variable in doc/Makefile
- Ramalama->RamaLama
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-25 16:53:03 +02:00
Eric Curtin
cde9e1ae48
Merge pull request #1283 from engelmi/sanitize-filename-on-migrate
...
Remove model tag from file name on migration
2025-04-25 14:58:55 +01:00
Michael Engel
a09e711ceb
Remove model tag from file name on migration
...
Relates to: https://github.com/containers/ramalama/issues/1278
Remove the model tag including the : symbol from the file name on
migration from the old to new store. Also, rename the sanitize_hash
to sanitize_filename function.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-25 15:24:54 +02:00
Daniel J Walsh
efa333848d
Merge pull request #1281 from containers/revert-1271-spdx
...
Revert "Replace project.license with an SPDX license expression."
2025-04-25 08:53:53 -04:00
Daniel J Walsh
10d314dada
Merge pull request #1280 from containers/cli-advancements
...
Remove hardcoding to ecurtin $HOME
2025-04-25 08:40:45 -04:00
Eric Curtin
0423c62a29
Merge pull request #1273 from rhatdan/docs
...
Add information on configuring the libkrun machine provider
2025-04-25 13:38:27 +01:00
Eric Curtin
3b60c77e88
Merge pull request #1279 from rhatdan/draftmodel
...
Fix up description of draft model
2025-04-25 13:37:40 +01:00
Eric Curtin
4381b3a799
Revert "Replace project.license with an SPDX license expression."
...
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-25 13:35:19 +01:00
Eric Curtin
0af2c37984
Remove hardcoding to ecurtin $HOME
...
This was left in by mistake
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-25 13:19:13 +01:00
Daniel J Walsh
40d1e17e7c
Merge pull request #1270 from jwieleRH/validate
...
Fix formatting as suggested by isort so that "make validate" passes.
2025-04-25 08:03:53 -04:00
Daniel J Walsh
702b44853f
Fix up description of draft model
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-25 07:02:16 -05:00
Eric Curtin
c5908c3cc2
Merge pull request #1255 from afazekas/draft_model
...
Initial draft model support
2025-04-25 12:54:08 +01:00
John Wiele
883534363e
Fix formatting as suggested by isort so that "make validate" passes.
...
Signed-off-by: John Wiele <jwiele@redhat.com>
2025-04-25 07:48:43 -04:00
Daniel J Walsh
c27ab276b9
Add information on configuring the libkrun machine provider
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-25 06:43:03 -05:00
Daniel J Walsh
99639c8e19
Merge pull request #1277 from sarroutbi/202504251255-enhance-pylint-mark
...
Enhance pylint mark in ramalama/cli.py
2025-04-25 07:35:22 -04:00
Sergio Arroutbi
762e20af46
Enhance pylint mark in ramalama/cli.py
...
- Use preferred f-string
- Minimum code refactoring
- Remove unnecessary comments
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-25 13:16:52 +02:00
Eric Curtin
b047195d29
Merge pull request #1271 from jwieleRH/spdx
...
Replace project.license with an SPDX license expression.
2025-04-25 11:07:28 +01:00
Eric Curtin
2f1db29e8b
Merge pull request #1272 from rhatdan/VERSION
...
Use --quiet for ramalama version
2025-04-25 11:06:48 +01:00
Eric Curtin
ee621c0e42
Merge pull request #1276 from sarroutbi/202504251018-minor-changes-to-enhance-pylint-results-for-toml-parser
...
Add minor changes to enhance pylint mark
2025-04-25 10:53:50 +01:00
Sergio Arroutbi
e45c731e80
Add minor changes to enhance pylint mark
...
Signed-off-by: Sergio Arroutbi <sarroutb@redhat.com>
2025-04-25 10:18:47 +02:00
Daniel J Walsh
c327936811
Merge pull request #1266 from edmcman/cuda-12.4
...
Automatically pick cuda docker container
2025-04-24 22:58:21 -04:00
Daniel J Walsh
615ea2cd73
Use --quiet for ramalama version
...
Makes figuring out what the version of ramalama is easier.
Fixes: https://github.com/containers/ramalama/issues/1258
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-24 21:56:19 -05:00
John Wiele
def274c681
Replace project.license with an SPDX license expression.
...
`project.license` as a TOML table is deprecated. The new format for
license is a valid SPDX license expression consisting of one or more
license identifiers.
See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license .
Signed-off-by: John Wiele <jwiele@redhat.com>
2025-04-24 18:33:34 -04:00
Edward J. Schwartz
83712b7cd8
Fix cuda version logic
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-24 11:36:30 -04:00
Edward J. Schwartz
3d8804877a
Automatically pick cuda docker container
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-24 10:53:41 -04:00
Attila Fazekas
d0b983383e
Initial draft model support
...
Allows passing a draft model to serve and fetches it when needed.
'run' does not support passing draft_model.
You should also pass draft-related args tuned to your combination
and do not forget to set the sampling parameters like top_k
in the UI.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-24 14:28:38 +02:00
Eric Curtin
c43013d62d
Merge pull request #1265 from containers/bump-llamacpp
...
We need fixes around CPU support etc.
2025-04-24 13:23:11 +01:00
Eric Curtin
53ad1b58d4
We need fixes around CPU support etc.
...
And other enhancements.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-24 12:49:53 +01:00
Daniel J Walsh
e1f8fb8b6b
Merge pull request #1254 from rhatdan/pull
...
Change default testing to use --pull=missing
2025-04-23 16:20:08 -04:00
Daniel J Walsh
01b5014f7c
Change default testing to use --pull=missing
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-23 14:29:12 -04:00
Daniel J Walsh
173ebead83
Merge pull request #1189 from engelmi/set-model-store-as-default
...
Set model store as default
2025-04-23 14:04:08 -04:00
Daniel J Walsh
de8eeaee79
Merge pull request #1256 from containers/remove-more-values
...
Deleting default values for bug reports
2025-04-23 14:03:41 -04:00
Eric Curtin
441ede951e
Deleting default values for bug reports
...
You have to delete this text in every box or it gets left around,
polluting the issue.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-23 18:56:43 +01:00
Eric Curtin
fb9bc738e6
Merge pull request #1236 from rhatdan/engine
...
Move model and rag to use shared Engine implementation
2025-04-23 18:23:46 +01:00
Daniel J Walsh
9fb76b6f73
Move model and rag to use shared Engine implementation
...
Shrink the size of cli.py and model.py by moving all engine-related
functions into the new engine.py Python module.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-23 11:08:55 -04:00
Michael Engel
d8183a85a8
Refactor convert and push cli for model store usage
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
528da77904
Fixed system tests
...
Fixed system tests which were broken by switching from the old to the new
model store.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
754d64b594
Added ref file is None check to run and serve
...
The ref file might not be available, e.g. when running with --dryrun,
so the retrieved ref file instance is None. By checking this we
prevent ramalama from crashing.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
9a18d6f312
Pass on KeyError for first remove in OCI model
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
0a61ff7ef2
Align remove behavior of new to old store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
0e2940becd
Use new model store by default
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
1cda889ca7
Added migrate script to import models to new store
...
The migration is run on each command to import all models
from the old store to the new one. It also removes the old
directories and prevents the old structure from being created.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:24 +02:00
Michael Engel
413354a778
Do not pass chat template to llama.cpp
...
Relates to: https://github.com/containers/ramalama/issues/1202
Passing the chat template file to the model run or serve has recently
led to bad results. As a temporary fix the template is not passed to the
model run.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:23 +02:00
Michael Engel
f90e2087d3
Check for model to exist when ensuring chat template
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:23 +02:00
Michael Engel
fbc37b06bd
Update ref file list when files to download are not found
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:23 +02:00
Michael Engel
c8b151fadd
Added model name to base directory path in model store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-23 17:07:23 +02:00
Eric Curtin
f7d39c4dcf
Merge pull request #1253 from rhatdan/oci
...
Verify OCI image format in rag command
2025-04-23 13:58:07 +01:00
Daniel J Walsh
50287a6341
Merge pull request #1250 from containers/jinja
...
Allow jinja argument
2025-04-23 08:05:17 -04:00
Daniel J Walsh
00eae2130b
Merge pull request #1252 from containers/add-awk
...
Add gawk
2025-04-23 08:03:52 -04:00
Eric Curtin
8e42df4615
Add gawk
...
For awk binary
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-23 12:33:41 +01:00
Daniel J Walsh
23e039385e
Verify OCI image format in rag command
...
Fixes: https://github.com/containers/ramalama/issues/1244
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-23 07:30:04 -04:00
Daniel J Walsh
425e1a8cc2
Merge pull request #1246 from bmahabirbu/preload-docling
...
preload docling models
2025-04-23 06:49:40 -04:00
Daniel J Walsh
caa2f082cb
Merge pull request #1249 from containers/simplify-github-issues
...
Deleting default values for issues
2025-04-23 06:48:38 -04:00
Eric Curtin
f4a2dd2d86
Allow jinja argument
...
It's on everywhere anyway, this just ensures these wrapper scripts
don't crash if jinja gets passed.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-23 11:48:01 +01:00
Eric Curtin
7417a56b80
Deleting default values for issues
...
You have to delete this text in every box or it gets left around,
polluting the issue.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-23 11:30:35 +01:00
Eric Curtin
b3008dd95c
Merge pull request #1245 from nathan-weinberg/bot-docs
...
docs: update CONTRIBUTING.md section on bots
2025-04-23 10:29:18 +01:00
Eric Curtin
604697f71d
Merge pull request #1243 from rhatdan/docling
...
support AI Models environment variables doc2rag/rag_framework
2025-04-23 10:28:11 +01:00
Nathan Weinberg
329f26acb8
docs: update CONTRIBUTING.md section on bots
...
Signed-off-by: Nathan Weinberg <nathan2@stwmd.net>
2025-04-22 21:26:28 -04:00
Brian
3522a2d709
preload docling models
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-04-22 20:53:02 -04:00
Daniel J Walsh
72678b983d
support AI Models environment variables doc2rag/rag_framework
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-22 14:12:37 -04:00
Daniel J Walsh
e9a1ed392e
Merge pull request #1242 from containers/client-server-fix-containers
...
Fix for using client/server version of "ramalama run"
2025-04-22 13:38:01 -04:00
Eric Curtin
e387d3ddc2
Fix for using client/server version of "ramalama run"
...
When using containers we need this check so the code doesn't start
going into some pulling logic.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-22 17:49:53 +01:00
Daniel J Walsh
6768890a82
Merge pull request #1241 from afazekas/jinja-run
...
Always use --jinja with run too
2025-04-22 11:18:53 -04:00
Attila Fazekas
47eaa922b8
--temp 0 in test
...
temp 0 significantly reduces sampling that produces unexpected output;
in many cases it makes the inference always produce the same
output. Small models are likely to get into a loop unless
the sampling is tuned.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-22 17:00:29 +02:00
Daniel J Walsh
b261cb57a8
Merge pull request #1235 from rhatdan/cuda
...
Allow building older versions of cuda
2025-04-22 10:49:21 -04:00
Attila Fazekas
486c27da81
Always use --jinja with run too
...
`ramalama serve` already uses --jinja by default,
`ramalama run` should do it too.
fixes : #1212
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-22 15:48:46 +02:00
Eric Curtin
4e77444a46
Merge pull request #1240 from afazekas/sai-1239
...
Fix Typo in Clustering Placeholder Comment
2025-04-22 12:47:10 +01:00
Attila Fazekas
cd9b51415b
Fix Typo in Clustering Placeholder Comment
...
fixes : #1239
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-22 13:32:51 +02:00
Eric Curtin
da3dfea987
Merge pull request #1237 from melodyliu1986/melodyliu1986-feature-branch
...
Fix: huggingface logout doesn't use token
2025-04-22 12:14:50 +01:00
Eric Curtin
51040c5ca5
Merge pull request #1238 from afazekas/rpc_tmp_cleint
...
ad-hoc llama clustering option
2025-04-22 12:14:26 +01:00
Attila Fazekas
5bd53340a9
ad-hoc llama clustering option
...
RAMALAMA_LLAMACPP_RPC_NODES allows you to use
RPC nodes from other places.
Example:
Worker node:
$ podman run --replace --name llama_rpc_cuda_0 -it --gpus=all \
--runtime /usr/bin/nvidia-container-runtime --network host \
quay.io/ramalama/cuda /usr/bin/rpc-server -p 50052 -H 0.0.0.0
or
$ podman run --replace --name llama_rpc_cuda_0 -it --gpus=all \
--runtime /usr/bin/nvidia-container-runtime -p 50052:50052 \
quay.io/ramalama/cuda /usr/bin/rpc-server -p 50052 -H 0.0.0.0
Client node (rocm):
$ RAMALAMA_LLAMACPP_RPC_NODES=192.168.142.5:50052 ramalama serve qwq:32b-q8_0 --ctx 8192
output:
load_tensors: loading model tensors, this can take a while... (mmap = true)
load_tensors: offloading 64 repeating layers to GPU
load_tensors: offloading output layer to GPU
load_tensors: offloaded 65/65 layers to GPU
load_tensors: RPC[192.168.142.5:50052] model buffer size = 19271.03 MiB
load_tensors: CPU_Mapped model buffer size = 788.91 MiB
load_tensors: ROCm0 model buffer size = 13142.15 MiB
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-22 12:53:38 +02:00
Song Liu
7fe4f00589
Fix: huggingface logout doesn't use token
...
Signed-off-by: Song Liu <soliu@redhat.com>
2025-04-22 15:00:36 +08:00
Daniel J Walsh
c66e931b7a
Allow building older versions of cuda
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-21 13:51:10 -04:00
Daniel J Walsh
255d43438a
Merge pull request #1233 from rhatdan/release
...
Bump to v0.7.5
2025-04-21 10:27:23 -04:00
Daniel J Walsh
2147ca83ac
Bump to v0.7.5
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-21 06:12:47 -04:00
Daniel J Walsh
0412732674
Merge pull request #1231 from afazekas/f42
...
Switch all f41 container to f42
2025-04-21 06:12:19 -04:00
Daniel J Walsh
47cfeffcf9
Merge pull request #1230 from edmcman/tag-refactor
...
Refactor exception handling of huggingface pull operation
2025-04-21 06:10:27 -04:00
Daniel J Walsh
c9115ff6cb
Merge pull request #1232 from melodyliu1986/melodyliu1986-feature-branch
...
Fix bug in login_cli and update huggingface or hf registry behavior
2025-04-21 06:05:28 -04:00
Song Liu
1c3027d4be
Fix bug in login_cli and update huggingface or hf registry behavior
...
Signed-off-by: Song Liu <soliu@redhat.com>
2025-04-21 15:30:07 +08:00
Edward J. Schwartz
353c6360f2
Move directory check to be executed once
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-20 11:30:29 -04:00
Edward J. Schwartz
36aa6068b0
Formatting
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-20 10:24:04 -04:00
Edward J. Schwartz
258831631e
Refactor ollama repo downloading utilities
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-20 10:18:18 -04:00
Edward J. Schwartz
97bb3618ad
Refactor exception handling of huggingface pull operation
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-20 09:46:17 -04:00
Attila Fazekas
d8ff9d0496
Switch all f41 container to f42
...
Only one container was actually left on f41;
this makes all of them use f42.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-20 15:04:42 +02:00
Daniel J Walsh
243142c1e6
Merge pull request #1226 from afazekas/intel-gpu-rag
...
fix: intel-gpu-rag build
2025-04-20 06:26:29 -04:00
Attila Fazekas
145d0fa0f8
fix rocm-ubi container build
...
close : #1222
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-20 07:56:13 +02:00
Attila Fazekas
95455b94ec
fix: intel-gpu-rag build
...
* intel-gpu is the only rag user container still on f41; moving it to f42
* dependencies are referenced by git URL; adding the git package
* numpy compile requires gcc-c++, python3-devel
* f42 ships the same version of python3-sentencepiece (no compile needed)
Fixes issues with several other rag containers too
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-19 21:50:58 +02:00
Eric Curtin
2d84b42ace
Merge pull request #1227 from afazekas/llama-rpc
...
Enable llama.cpp rpc feature in containers
2025-04-19 17:58:24 +01:00
Attila Fazekas
58fc144c87
Enable llama.cpp rpc feature in containers
...
Enable both server and client support for rpc.
The feature is currently a PoC in llama.cpp, but it can work in practice.
Required for distributed inference.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-19 18:47:25 +02:00
Daniel J Walsh
b7f89a4d7b
Merge pull request #1225 from bmahabirbu/rag-opt
...
Optimized doc2rag for reduced ram and fixed batch size
2025-04-19 07:11:25 -04:00
Brian
75e8391d43
Optimized doc2rag for reduced ram and fixed batch size
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-04-18 21:07:54 -04:00
Daniel J Walsh
b683ed2e19
Merge pull request #1224 from afazekas/intel-gpu
...
fix intel-gpu container build
2025-04-18 16:37:38 -04:00
Attila Fazekas
b6388f6c84
fix intel-gpu container build
...
49656ee3e9 removed one line
but did not remove the line join, leading to build failures.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-18 22:33:25 +02:00
Eric Curtin
487b255226
Merge pull request #1223 from lsm5/packit-copr-rpm-version
...
Packit: use latest version for rpm
2025-04-18 19:07:04 +01:00
Daniel J Walsh
a5e602d535
Merge pull request #1219 from rhatdan/release
...
Tag images on push with digests, so they are permanent
2025-04-18 11:43:12 -04:00
Lokesh Mandvekar
c030c3f0bb
Packit: use latest version for rpm
...
Packit by default uses `git describe` for rpm version in copr builds.
This can often lag behind the latest release thus making it impossible
to update default distro builds with copr builds.
Ref: https://copr.fedorainfracloud.org/coprs/rhcontainerbot/podman-next/package/python-ramalama/
The latest build in there still shows version: `0.7.3` when v0.7.4 is
the latest upstream release.
This commit adds a packit action to modify the spec file which fetches
version info from setup.py.
The rpm release info is also modified such that it will update over the
latest distro package in almost all circumstances, assuming no distro
package will have a release 1001+.
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-04-18 21:05:22 +05:30
Daniel J Walsh
807bd54fc5
Tag images on push with digests, so they are permanent
...
We have accidentally overwritten the image's release version.
If we also tag by digest, then we will not destroy the
image or manifest list. Since Podman Desktop AI Lab Recipes
relies on the image digest, this makes it safer for them.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-18 10:30:37 -04:00
Eric Curtin
0dfb51b2a7
Merge pull request #1218 from rhatdan/completion
...
Improve shell completions for all arguments
2025-04-18 11:00:39 +01:00
Daniel J Walsh
8dada5a934
Improve shell completions for all arguments
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-17 19:52:03 -04:00
Daniel J Walsh
513f6eb119
Merge pull request #1217 from rhatdan/llama-stack
...
Fixes for llama-stack image to build and install
2025-04-17 12:45:51 -04:00
Daniel J Walsh
4497be31a5
Fixes for llama-stack image to build and install
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-17 12:40:25 -04:00
Daniel J Walsh
5ff8ebdcde
Merge pull request #1216 from rhatdan/openvino
...
Fix release scripts for openvino
2025-04-17 12:28:14 -04:00
Daniel J Walsh
a67538f9d9
Fix release scripts for openvino
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-17 12:26:11 -04:00
Daniel J Walsh
25ebe31ab9
Merge pull request #1213 from afazekas/llama-stack-entry
...
llama-stack relative COPY
2025-04-17 12:07:43 -04:00
Eric Curtin
54060dabaf
Merge pull request #1214 from afazekas/rocm-ubi
...
rocm-ubi repo path fix
2025-04-17 16:52:49 +01:00
Attila Fazekas
0ae62b90e4
llama-stack relative COPY
...
cdb6df6877 recently added the entrypoint.sh; however, the
Containerfile does not use a path relative to container-images/
as, for example, intel-gpu does.
container_build.sh uses container-images as the working directory.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-17 17:14:59 +02:00
Attila Fazekas
d40d221049
rocm-ubi repo path fix
...
rocm-ubi pointed to the wrong path for the repo files;
this change fixes it.
Signed-off-by: Attila Fazekas <afazekas@redhat.com>
2025-04-17 17:14:10 +02:00
Daniel J Walsh
7fa760c5c6
Merge pull request #1215 from Ferenc-/fix-cuda-link
...
Fix link to ramalama-cuda
2025-04-17 11:06:58 -04:00
Eric Curtin
2e0a314746
Merge pull request #1208 from rhatdan/man
...
Ship nvidia and cann man pages
2025-04-17 14:45:06 +01:00
Daniel J Walsh
2afc6f1433
Merge pull request #1209 from kush-gupt/to-gguf
...
Add --gguf option to convert Safetensors using llama.cpp scripts and functionality
2025-04-17 09:23:38 -04:00
Daniel J Walsh
969420096f
Merge pull request #1211 from rhatdan/nvidia
...
Only use nvidia-container-runtime if it is installed
2025-04-17 08:16:46 -04:00
Ferenc Géczi
611474739f
Fix link to ramalama-cuda
...
Signed-off-by: Ferenc Géczi <ferenc.geczi@ibm.com>
2025-04-17 12:00:00 +00:00
Eric Curtin
ee6246954b
Merge pull request #1210 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1744101466
2025-04-17 10:22:24 +01:00
Daniel J Walsh
49656ee3e9
Only use nvidia-container-runtime if it is installed
...
Also we no longer need to ship openvino with intel containerfile
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-17 05:11:39 -04:00
renovate[bot]
f82358b5ab
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1744101466
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-04-17 09:09:45 +00:00
Daniel J Walsh
28cfc5b1c1
Merge pull request #1183 from bmahabirbu/openvino
...
Create openvino model server image and add it to quay.io/ramalama
2025-04-17 05:09:21 -04:00
Daniel J Walsh
47e39a75e9
Merge pull request #1207 from rhatdan/debug
...
Quote strings with spaces in debug mode
2025-04-17 05:08:20 -04:00
Kush Gupta
7cd56bd8f0
formatting
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-04-17 03:09:00 -04:00
Kush Gupta
09f3b3826e
add gguf option to convert
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-04-17 02:47:09 -04:00
Brian
cb8630e9aa
add openvino model server image to quay.io/ramalama
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-04-16 18:25:31 -04:00
Daniel J Walsh
11349e36c3
Ship nvidia and cann man pages
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 17:33:36 -04:00
Daniel J Walsh
ed838da4d1
Quote strings with spaces in debug mode
...
Currently if you run in Debug mode and attempt to cut and paste the
Podman or Docker line, the PROMPT field has a space with a > in it.
When pasted this causes issues since it is not properly quoted.
With this change the command can be successfully cut and pasted.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 17:12:59 -04:00
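The quoting problem described in the commit above can be illustrated with a minimal Python sketch. It is not the project's actual code: the `format_command` helper, the argument list, and the `example-image` name are hypothetical. It only shows that `shlex.join` from the standard library (Python 3.8+) quotes arguments containing spaces or `>` so the printed line can be pasted back into a shell.

```python
import shlex


def format_command(args):
    """Return a shell-safe, copy-pasteable rendering of an argument list.

    Hypothetical helper for illustration only.
    """
    return shlex.join(args)


# Hypothetical debug-mode command line; the PROMPT value contains a space
# and a '>' character, which a shell would otherwise treat as a redirect.
cmd = ["podman", "run", "--rm", "--env", "PROMPT=user > ", "example-image"]

printed = format_command(cmd)
print(printed)

# Round trip: splitting the printed line gives back the original arguments.
print(shlex.split(printed) == cmd)
```

Joining the arguments with a plain `" ".join(cmd)` would print the prompt unquoted, which is the failure mode the commit message describes.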
Daniel J Walsh
79aa781fc4
Merge pull request #1201 from rhatdan/docling
...
Fix doc2rag warning
2025-04-16 15:15:44 -04:00
Eric Curtin
e010fb12ab
Merge pull request #1203 from rhatdan/VERSION
...
Add newver.sh script
2025-04-16 16:10:13 +01:00
Daniel J Walsh
7a6cc8968e
Add newver.sh script
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 10:04:51 -04:00
Daniel J Walsh
aafcf6af6e
Fix doc2rag warning
...
/usr/bin/doc2rag:46: DeprecationWarning: Use contextualize() instead.
doc_text = chunker.serialize(chunk=chunk)
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 09:48:02 -04:00
Eric Curtin
748d399242
Merge pull request #1196 from rhatdan/pull
...
Default to --pull=newer for ramalama rag command.
2025-04-16 14:43:41 +01:00
Daniel J Walsh
64d28fbc57
Handle --pull=newer on Docker
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 09:20:18 -04:00
Daniel J Walsh
181064d59a
Default to --pull=newer for ramalama rag command.
...
Fixes: https://github.com/containers/ramalama/issues/1192
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 05:53:14 -04:00
Daniel J Walsh
7541c654e3
Merge pull request #1199 from rhatdan/llama-stack
...
Add missing entrypoint.sh
2025-04-16 05:41:06 -04:00
Daniel J Walsh
cdb6df6877
Add missing entrypoint.sh
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-16 05:39:50 -04:00
Daniel J Walsh
0111eaffc9
Merge pull request #1197 from rhatdan/llama-stack
...
Setup /venv for running llama-stack
2025-04-16 05:36:17 -04:00
Daniel J Walsh
d39d878d19
Setup /venv for running llama-stack
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-15 20:44:27 -04:00
Eric Curtin
763c10f79d
Merge pull request #1193 from marcnuri-forks/feat/container_image_ctx_size
...
feat: add CTX_SIZE env config to container-images llama-server.sh
2025-04-15 18:02:02 +01:00
Daniel J Walsh
dcc810c81a
Merge pull request #1160 from containers/install-script-fix
...
Also hardcode version into version.py as fallback
2025-04-15 09:25:19 -04:00
Eric Curtin
5988cdcfc2
Also hardcode version into version.py as fallback
...
This way we should always return the current version
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-15 13:07:48 +01:00
Daniel J Walsh
2356e2f4a2
Merge pull request #1195 from containers/readme-update
...
macOS tip to install homebrew
2025-04-15 07:42:27 -04:00
Eric Curtin
ed4046fb82
macOS tip to install homebrew
...
It's a requirement for macOS installs.
Co-Authored-By: Rashid Khan <rkhan@redhat.com>
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-15 12:10:29 +01:00
Eric Curtin
32feccf75e
Merge pull request #1194 from leo-pony/main
...
fix llama.cpp CANN backend x86 build failing issue
2025-04-15 11:20:41 +01:00
leo-pony
a444fff302
fix llama.cpp cann backend x86 build failing issue: update llama.cpp to the new commit that has fixed this build issue.
...
Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-04-15 16:48:32 +08:00
Marc Nuri
f0e2edd938
feat: CTX_SIZE env config in container-images llama-server.sh is optional
...
Signed-off-by: Marc Nuri <marc@marcnuri.com>
2025-04-15 09:08:47 +02:00
Marc Nuri
02bacb8daf
feat: add CTX_SIZE env config to container-images llama-server.sh
...
Relates to https://github.com/containers/podman-desktop-extension-ai-lab/issues/2630
Allow overriding the context size when running ramalama from a container.
2048 tokens (the default if not specified) is a small context window when running the inference server with
MCP tools or even for longer chat completion conversations.
Being able to provide a context window larger than 2048 is critical for those use cases.
Signed-off-by: Marc Nuri <marc@marcnuri.com>
2025-04-15 06:23:25 +02:00
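The optional override described in the two CTX_SIZE commits above can be sketched as follows. The real change lives in a shell script (llama-server.sh); this Python version is only an illustration, and the `ctx_size_args` helper is hypothetical. It assumes the value is forwarded through llama-server's `--ctx-size` option and that nothing is passed when the variable is unset, so the server keeps its own default.

```python
import os


def ctx_size_args(environ=None):
    """Build the extra llama-server arguments for an optional CTX_SIZE.

    Hypothetical helper: returns an empty list when CTX_SIZE is unset or
    empty, so the server falls back to its built-in default.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("CTX_SIZE", "").strip()
    if not value:
        return []
    # int() raises ValueError on non-numeric input instead of silently
    # passing a bad value through to the server.
    return ["--ctx-size", str(int(value))]


print(ctx_size_args({}))                    # variable not set
print(ctx_size_args({"CTX_SIZE": "8192"}))  # larger context window requested
```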
Daniel J Walsh
6e97173af6
Merge pull request #1191 from rhatdan/VERSION
...
More fixes to get release out
2025-04-14 15:33:39 -04:00
Daniel J Walsh
0f830f2afb
More fixes to get release out
...
intel-gpu will not currently build on Fedora 42; there are issues
in the glibc library. We should try again when Fedora 42 is released
in May.
Verification of the ramalama-cli command was broken, since ramalama
is the entrypoint.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-14 13:25:09 -04:00
Daniel J Walsh
c0aed4a616
Merge pull request #1186 from rhatdan/VERSION
...
Bump version to v0.7.4
2025-04-14 11:05:23 -04:00
Daniel J Walsh
74a8b67b0b
Bump to 0.7.4
...
Fix handling of minor_release
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-14 10:07:11 -04:00
Daniel J Walsh
07d8ba417a
Fixup build and release scripts
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-14 10:07:11 -04:00
Daniel J Walsh
da5d5a2e71
Merge pull request #1185 from containers/fix-cann-build
...
Fix cann build
2025-04-14 09:54:19 -04:00
Eric Curtin
8367579f77
Fix cann build
...
set_env.sh uses unbound variables deliberately
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-14 14:40:20 +01:00
Daniel J Walsh
82dc19438d
Merge pull request #1174 from containers/toolbox-support
...
Add check for toolbox
2025-04-14 09:26:49 -04:00
Daniel J Walsh
211a92dcad
Merge pull request #1184 from containers/disable-arm-neon
...
Disable ARM neon for now in cuda builds
2025-04-14 08:59:20 -04:00
Eric Curtin
ad2cf9f2df
Disable ARM neon for now in cuda builds
...
Otherwise we get build errors
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-14 13:26:23 +01:00
Daniel J Walsh
49233a3bc8
Merge pull request #1182 from reidliu41/update-install
...
[Misc] update install script
2025-04-14 05:38:48 -04:00
Eric Curtin
b41977c623
Add check for toolbox
...
If we are in toolbox, don't attempt to run nested containers. We
then have to rely on the user to install llama.cpp in the container
themselves. It's tempting to do an even more generic attempt to see
if we are already inside a container, so we never attempt to do
nested containers, whether toolbox, podman, docker, etc.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-13 20:29:11 +01:00
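The toolbox check described in the commit above can be sketched in Python. This is an assumption-laden illustration, not the project's code: it assumes Toolbox marks its containers with a `/run/.toolboxenv` file, and both helper names are hypothetical. The `root` parameter exists only so the logic can be exercised against a temporary directory.

```python
import os
import tempfile

# Assumption: Toolbox creates this marker file inside its containers.
TOOLBOX_MARKER = os.path.join("run", ".toolboxenv")


def in_toolbox(root="/"):
    """Return True if the Toolbox marker file exists under `root`."""
    return os.path.isfile(os.path.join(root, TOOLBOX_MARKER))


def should_use_container(requested, root="/"):
    """Hypothetical helper: never start nested containers inside Toolbox."""
    return requested and not in_toolbox(root)


# Demonstration against a throwaway directory tree instead of the real "/".
with tempfile.TemporaryDirectory() as fake_root:
    print(should_use_container(True, fake_root))  # marker absent
    os.makedirs(os.path.join(fake_root, "run"))
    open(os.path.join(fake_root, TOOLBOX_MARKER), "w").close()
    print(should_use_container(True, fake_root))  # marker present
```

The more generic check the commit message mentions (any container, not only toolbox) would need additional markers and is deliberately left out of this sketch.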
Daniel J Walsh
8955602f9a
Merge pull request #1173 from rhatdan/cuda
...
Improve performance for certain workloads
2025-04-13 14:45:06 -04:00
Eric Curtin
afb8225d80
Merge pull request #1171 from rhatdan/oci
...
Fix failover to OCI image on push
2025-04-13 16:39:47 +01:00
Daniel J Walsh
5455c5ed47
Merge pull request #1123 from edmcman/hf-tag
...
Add ability to pull via hf://user/repo:tag syntax
2025-04-13 05:57:48 -04:00
Daniel J Walsh
c53e238286
Merge pull request #1180 from engelmi/improve-model-store
...
Improve model store
2025-04-13 05:54:55 -04:00
reidliu41
c5e9fb8511
update from suggestion
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
2025-04-13 10:57:15 +08:00
reidliu41
21456ed680
[Misc] update install script
...
Signed-off-by: reidliu41 <reid201711@gmail.com>
2025-04-13 10:49:33 +08:00
Michael Engel
6ee9a75d18
Only list OCI container with --container true
...
Related to: https://github.com/containers/ramalama/pull/1164
Copies the improvement to only list OCI containers when the
--container flag is true.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-13 00:08:07 +02:00
Michael Engel
1f5777db2d
Split model source and path in list models for --use-model-store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-13 00:08:07 +02:00
Daniel J Walsh
d07aa4f9f9
Merge pull request #1176 from nathan-weinberg/make-fix
...
fix: add 'pipx' install to 'make install-requirements'
2025-04-11 15:10:14 -04:00
Daniel J Walsh
15e2a5a8cf
Merge pull request #1177 from nathan-weinberg/issue-template
...
github: add issue templates
2025-04-11 15:08:05 -04:00
Daniel J Walsh
14960256b2
Merge pull request #1179 from nathan-weinberg/fix-contrib-2
...
docs: fix python version guidance in CONTRIBUTING.md
2025-04-11 15:06:33 -04:00
Daniel J Walsh
045831508f
Apply suggestions from code review
...
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
2025-04-11 15:06:14 -04:00
Nathan Weinberg
6d4a613e6d
docs: fix python version guidance in CONTRIBUTING.md
...
ramalama can run as low as Python 3.8
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-11 11:27:51 -04:00
Eric Curtin
e4836744b8
Merge pull request #1175 from nathan-weinberg/contrib-fix
...
docs: fix broken link in CONTRIBUTING.md
2025-04-11 15:44:44 +01:00
Nathan Weinberg
b916aa6057
docs: fix broken link in CONTRIBUTING.md
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-11 10:33:15 -04:00
Nathan Weinberg
e968ffb764
github: add issue templates
...
the CONTRIBUTING.md doc refers to several issue templates
being present in the project, but currently none exist
this commit adds templates based on the podman project
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-11 10:28:59 -04:00
Nathan Weinberg
082ed827f1
fix: add 'pipx' install to 'make install-requirements'
...
'make install-requirements' currently assumes 'pipx'
is installed in your env, but this may not be the case
add an explicit install/upgrade command via pip
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-11 10:04:33 -04:00
Daniel J Walsh
cc47ae8015
Improve performance for certain workloads
...
Fixes: https://github.com/containers/ramalama/issues/1156
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-11 07:17:22 -04:00
Daniel J Walsh
0338d26589
Merge pull request #1169 from rhatdan/llama-stack
...
Build images for llama-stack
2025-04-10 18:29:19 -04:00
Daniel J Walsh
53091b192d
Fix failover to OCI image on push
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-10 18:26:45 -04:00
Daniel J Walsh
4c43042343
Merge pull request #1166 from containers/update-llama.cpp
...
Update llama.cpp add llama 4
2025-04-10 17:55:46 -04:00
Daniel J Walsh
b996ae315d
Build images for llama-stack
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-10 17:54:02 -04:00
Eric Curtin
20be18708e
Merge pull request #1167 from rhatdan/fedora
...
Bump all images to f42
2025-04-10 18:38:40 +01:00
Daniel J Walsh
12d4a46b23
Bump all images to f42
...
Some of the images were using f41 and others f42, moving
them all to the same version. f42 is in beta now so good time
to move.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-10 13:04:37 -04:00
Eric Curtin
559efcbd76
Merge pull request #1164 from rhatdan/nocontainer
...
Do not list OCI containers when running with nocontainer
2025-04-10 17:50:01 +01:00
Eric Curtin
3e41c8749d
Update llama.cpp add llama 4
...
To pick up llama 4 support among other things.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-10 17:48:07 +01:00
Daniel J Walsh
1f93767e34
Do not list OCI containers when running with nocontainer
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-10 11:41:48 -04:00
Eric Curtin
8b9721c16d
Merge pull request #1161 from rhatdan/docling
...
Docling on certain platforms needs accelerate package
2025-04-10 15:13:10 +01:00
Daniel J Walsh
602d4b9e37
Docling on certain platforms needs accelerate package
...
Fixes: https://github.com/containers/ramalama/issues/1157
Also make sure build scripts blow up if any command fails.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-10 09:44:18 -04:00
Daniel J Walsh
cbb4ec0a4a
Merge pull request #1149 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.5-1744101466
2025-04-09 13:44:45 -04:00
Edward J. Schwartz
211beaba00
attempt to download file/repo if tag format fails
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-09 11:45:53 -04:00
Daniel J Walsh
bcfc715f77
Merge pull request #1154 from rhatdan/rag
...
Use -rag images when using --rag commands
2025-04-09 09:51:07 -04:00
Daniel J Walsh
458b44b3c7
Merge pull request #1153 from rhatdan/rocm
...
Scripts currently used for releasing images
2025-04-09 07:54:18 -04:00
Daniel J Walsh
96dbf92bd5
Use -rag images when using --rag commands
...
Fixes: https://github.com/containers/ramalama/issues/1143
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-09 07:53:49 -04:00
Daniel J Walsh
81faa3e7ba
Merge pull request #1151 from containers/client
...
feat: Add ramalama client command with basic implementation
2025-04-08 16:23:57 -04:00
Daniel J Walsh
1282987809
Scripts currently used for releasing images
...
These are the scripts I am using to push images and build multi-arch
images to the quay.io repositories.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-08 13:32:33 -04:00
Eric Curtin
a84a11d7b7
feat: Add ramalama client command with basic implementation
...
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-08 17:46:10 +01:00
Daniel J Walsh
8c8f2cfcb8
Merge pull request #1150 from ueno/wip/dueno/image-arg3
...
Give "image" config option precedence over hardware-based defaults
2025-04-08 10:35:33 -04:00
Daiki Ueno
ab7e7594fb
Give "image" config option precedence over hardware-based defaults
...
Currently, the config options are stored in a single dict, regardless
of where they originate, e.g., environment variables, files, or
the preset defaults. This prevents overriding certain options, such as
"image", with a config file.
This groups config options by origin in a collections.ChainMap.
Signed-off-by: Daiki Ueno <dueno@redhat.com>
2025-04-08 23:07:45 +09:00
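The grouping by origin described in the commit above can be sketched with `collections.ChainMap`, which looks keys up in each mapping in order and returns the first hit. The layer names, keys, and values below are hypothetical and do not reflect the project's real configuration; only the lookup behaviour is the point.

```python
from collections import ChainMap

# Hypothetical layers, listed from highest to lowest precedence.
env_layer = {}  # nothing set through environment variables in this example
file_layer = {"image": "quay.io/example/custom:latest"}  # from a config file
default_layer = {
    "image": "quay.io/ramalama/cuda",  # stand-in for a hardware-based default
    "ctx_size": 2048,
}

config = ChainMap(env_layer, file_layer, default_layer)

print(config["image"])     # found in file_layer before default_layer
print(config["ctx_size"])  # only present in default_layer

# Setting a value in a higher-precedence layer shadows the lower ones
# without modifying them.
env_layer["image"] = "quay.io/example/from-env:latest"
print(config["image"])
print(default_layer["image"])
```

With a single flat dict, whichever source was written last would win; keeping one mapping per origin makes the precedence explicit in the order of the `ChainMap` arguments.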
Daniel J Walsh
4f44702271
Merge pull request #1142 from ueno/wip/dueno/image-arg2
...
Exercise image detection in tests
2025-04-08 09:53:45 -04:00
Daiki Ueno
42e7da7365
Exercise image detection in tests
...
This adds a unit test to check whether the image can be properly
overridden, with the --image command-line option or RAMALAMA_IMAGE
envvar.
Signed-off-by: Daiki Ueno <dueno@redhat.com>
2025-04-08 22:51:35 +09:00
renovate[bot]
c25e59c881
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.5-1744101466
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-04-08 13:48:30 +00:00
Eric Curtin
1196d4df82
Merge pull request #1148 from rhatdan/rocm
...
Removing git breaks rocm images
2025-04-08 14:47:55 +01:00
Daniel J Walsh
bd967d1515
Removing git breaks rocm images
...
ROCm requires a couple of -devel packages which require git.
Removing git removed these packages.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-08 09:05:23 -04:00
Daniel J Walsh
cf43e310f0
Merge pull request #1137 from rhatdan/VERSION
...
Bump to 0.7.3
2025-04-07 15:17:18 -04:00
Daniel J Walsh
928541dff9
Bump to 0.7.3
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-07 14:55:36 -04:00
Daniel J Walsh
14bef6c830
Merge pull request #1138 from containers/fix-ramalama-rag
...
Build fix container image parsing
2025-04-07 14:03:51 -04:00
Eric Curtin
02c75fb974
Build fix container image parsing
...
There are two cases in which we enter this function: at the end of a
rag command, or when manually specifying a container as the inferencing engine.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-07 17:19:31 +01:00
Eric Curtin
c3a9205153
Merge pull request #1136 from rhatdan/VERSION
...
don't use version 0 for pulling images
2025-04-07 16:18:18 +01:00
Daniel J Walsh
5c01db5db5
don't use version 0 for pulling images
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-07 10:55:50 -04:00
Eric Curtin
30ccfcddca
Merge pull request #1134 from rhatdan/vulkan
...
Revert VULKAN change until podman 5.5 is released
2025-04-07 15:30:51 +01:00
Daniel J Walsh
3c407247f7
Revert VULKAN change until podman 5.5 is released
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-07 09:57:10 -04:00
Eric Curtin
a39fef55fd
Merge pull request #1124 from containers/quick-fix
...
Quick fix to installer
2025-04-07 11:09:23 +01:00
Eric Curtin
c11a84ae5f
Quick fix to installer
...
Directory was not right
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-06 15:22:06 +01:00
Edward J. Schwartz
be45d785c9
reformat
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-06 09:55:53 -04:00
Edward J. Schwartz
55b81e37b9
rename ollama's repo pull function
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-06 09:55:30 -04:00
Edward J. Schwartz
9906e7faca
gguf handled by tag now
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-06 09:44:17 -04:00
Edward J. Schwartz
19ea4095d5
Initial work on huggingface gguf tags
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-05 16:09:40 -04:00
Daniel J Walsh
62a8f9b9f2
Merge pull request #1121 from containers/nocontainerfix-and-rm-duplicate
...
Remove duplicate code
2025-04-05 09:56:26 -04:00
Eric Curtin
93b6d0a5ed
Remove duplicate code
...
Ensure no attempt is made to download container images when
using:
--nocontainer
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-05 14:20:58 +01:00
Eric Curtin
e0fb7f0bd5
Merge pull request #1120 from nathan-weinberg/toolkit-docs
...
docs: add note about COPR repo for Fedora users
2025-04-05 01:30:19 +01:00
Nathan Weinberg
16ad55cdb4
docs: add note about COPR repo for Fedora users
...
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-04 16:11:48 -04:00
Eric Curtin
f0598d7b18
Merge pull request #1117 from engelmi/add-partial-files-to-be-listed
...
Add partial files to be listed for ramalama list command
2025-04-04 16:11:41 +01:00
Eric Curtin
5ebdb53d3f
Merge pull request #1118 from jguiditta/doc_fix
...
Fix get/set selbool references.
2025-04-04 16:11:23 +01:00
Jason Guiditta
ca63ba27e5
Fix get/set selbool references.
...
Documentation:
* As installed in current versions of Fedora, these commands are not
'boolean', but 'bool'.
* The set command will give an error message when the value of the
boolean is not set.
Signed-off-by: Jason Guiditta <jguiditt@redhat.com>
2025-04-04 10:50:58 -04:00
Michael Engel
6d4919b62c
Removed unused imports
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-04 15:27:23 +02:00
Michael Engel
5fbc524ec4
Fix error removing blobs
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-04 15:25:41 +02:00
Michael Engel
ce8c586f33
Added .partial files to be listed for list command
...
Fixes: https://github.com/containers/ramalama/issues/1104
The model store iterates through all files in the ref files for
the list command. If a pull has been cancelled, then these point
to non-existent files even though there might already be
partial files. To avoid this error, the model store list will
check that the files exist and also check for partial files.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-04 15:25:41 +02:00
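The existence check described in the commit above can be sketched as follows. This is a hypothetical illustration, not the model store's real code: the `resolve_listed_file` helper and the file names are made up, and it assumes an interrupted download is stored next to the final path with a `.partial` suffix.

```python
import tempfile
from pathlib import Path


def resolve_listed_file(blob_path):
    """Return (path, is_partial) for a referenced file, or (None, False).

    Hypothetical helper: prefer the complete file, fall back to a
    '<name>.partial' sibling left behind by a cancelled pull.
    """
    if blob_path.exists():
        return blob_path, False
    partial = blob_path.with_name(blob_path.name + ".partial")
    if partial.exists():
        return partial, True
    return None, False


# Demonstration in a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    blob = Path(tmp) / "sha256-example"
    print(resolve_listed_file(blob))        # neither file exists yet
    Path(str(blob) + ".partial").touch()    # simulate a cancelled pull
    print(resolve_listed_file(blob)[1])     # partial file is reported
    blob.touch()                            # simulate a completed pull
    print(resolve_listed_file(blob)[1])     # complete file takes precedence
```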
Daniel J Walsh
5a30bcc5f3
Merge pull request #1068 from containers/ramalama-serve-code
...
Introduce wrapper for serve and run
2025-04-03 15:10:40 -04:00
Daniel J Walsh
052b52c489
Merge pull request #1116 from containers/certificates
...
macOS python certificate issue
2025-04-03 15:09:13 -04:00
Eric Curtin
83215cfd33
Introduce wrapper for serve and run
...
We are coming to the limits of what we can do in a "podman run"
line. Create wrapper functions so we can do things like forking
processes and other similar things that you need to do inside a
container in python3. There are some features coming up where
rather than upstreaming separate solutions to all our engines
like vLLM and llama.cpp we want to solve the problem in the
python3 layer.
The "if True:"'s will remain for a while, we may need to wait for
containers to be distributed around the place before we turn things
on.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-03 18:13:41 +01:00
Eric Curtin
fbaf980ba0
macOS python certificate issue
...
I never encountered this but some have
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-03 18:09:56 +01:00
Eric Curtin
87c44edacc
Merge pull request #1113 from engelmi/remove-unused-cli-args-param
...
Removed unused cli_args param from GGUF parse function
2025-04-03 16:24:18 +01:00
Michael Engel
ba56b2fc12
Removed unused cli_args param from GGUF parse function
...
Fixes: https://github.com/containers/ramalama/issues/1103
Removed unused cli_args param from GGUFInfoParser.parse function, which
also caused the call in the model store to fail since it wasn't passed in.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-03 16:44:29 +02:00
Eric Curtin
85f4a76e28
Merge pull request #1107 from rhatdan/images
...
No longer install podman remote and openvino in general images
2025-04-03 14:04:57 +01:00
Eric Curtin
b4112d417e
Merge pull request #1108 from nathan-weinberg/docs
...
docs: fix documentation README
2025-04-03 13:32:22 +01:00
Nathan Weinberg
4916009a8c
docs: fix documentation README
...
docs README had lots of broken links and phantom
make targets
this commit removes a lot of the content and fixes
some other workflow guidance
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
2025-04-03 08:04:19 -04:00
Daniel J Walsh
bd8a0260fb
No longer install podman remote or openvino in general images
...
Remove other packages that are not necessary when running the
containers.
Remove leftover build_rag load command and move openvino to only the
intel container.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-03 06:43:23 -04:00
Eric Curtin
8b6964587d
Merge pull request #1099 from machadovilaca/fix-694
...
Add --webui flag to optionally disable web UI for ramalama serve
2025-04-03 11:12:51 +01:00
Eric Curtin
bf63bf24a3
Merge pull request #1110 from rhatdan/convert
...
Don't leak intermediate OCI image when converting model to OCI
2025-04-03 11:09:49 +01:00
Eric Curtin
fbf30a6919
Merge pull request #1111 from rhatdan/docs
...
Fix all links to ramalama-cuda
2025-04-03 11:09:25 +01:00
Daniel J Walsh
0369d044b0
Fix all links to ramalama-cuda
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-02 19:26:21 -04:00
Daniel J Walsh
f54df7c337
Don't leak intermediate OCI image when converting model to OCI
...
Fixes: https://github.com/containers/ramalama/issues/904
We are currently leaking a <none><none> image every time we convert an
image with ramalama convert.
There will still be a <none><none> image but this is associated with the
Manifest list created for the OCI image.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-02 19:09:25 -04:00
Daniel J Walsh
41328d282a
Merge pull request #1109 from containers/pull-vllm
...
Point vllm ramalama at rhel registry
2025-04-02 19:08:44 -04:00
Eric Curtin
dd39c5ee3a
Point vllm ramalama at rhel registry
...
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-02 23:29:35 +01:00
Daniel J Walsh
cedf7d3b14
Merge pull request #1102 from edmcman/pull2
...
Be verbose about pulling image
2025-04-02 16:05:24 -04:00
Edward J. Schwartz
5d0be4b88d
Fix test output
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-02 15:21:33 -04:00
Daniel J Walsh
441471dfe0
Merge pull request #1106 from containers/jinja
...
Enable --jinja for all llama-server instances
2025-04-02 14:15:07 -04:00
Eric Curtin
53245c1bf6
Enable --jinja for all llama-server instances
...
--jinja has been around a few months now, enable it everywhere.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-04-02 17:34:31 +01:00
Eric Curtin
f86e852c60
Merge pull request #1105 from maxamillion/docs/fix_readme_link_to_cuda_manpage
...
fix broken link to cuda docs in readme
2025-04-02 17:16:52 +01:00
Adam Miller
a7093ffdf3
fix broken link to cuda docs in readme
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-04-02 10:51:18 -05:00
Edward J. Schwartz
50bbb50f27
Be verbose about pulling image
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-04-02 11:29:23 -04:00
João Vilaça
c0d7f0b271
Add --webui flag to optionally disable web UI for ramalama serve
...
Fixes #694
Signed-off-by: João Vilaça <machadovilaca@gmail.com>
2025-04-02 16:23:36 +01:00
Eric Curtin
62424d3345
Merge pull request #1100 from rhatdan/docs
...
Make NVIDIA configuration more present in documentation
2025-04-02 15:05:02 +01:00
Daniel J Walsh
47633fbe78
Make NVIDIA configuration more present in documentation
...
Fixes: https://github.com/containers/ramalama/issues/899
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-02 07:12:42 -04:00
Eric Curtin
5edacf1588
Merge pull request #1097 from rhatdan/ci
...
Mv ramalama-ci to ramalama-cli
2025-04-01 23:30:43 +01:00
Daniel J Walsh
275fe11a23
Mv ramalama-ci to ramalama-cli
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 18:24:32 -04:00
Daniel J Walsh
ffe9900298
Merge pull request #1094 from rhatdan/rocm
...
Complete move from fedora-rocm to rocm
2025-04-01 18:23:21 -04:00
Daniel J Walsh
cf0b842959
Merge pull request #1090 from rhatdan/rag
...
Move all RAG support to the -rag images
2025-04-01 17:37:39 -04:00
Daniel J Walsh
8732cd554e
Merge pull request #1096 from rhatdan/ci
...
Add ramalama-ci image
2025-04-01 17:37:16 -04:00
Daniel J Walsh
ef9e3fb377
Add ramalama-ci image
...
This image will just run ramalama inside of a container and
requires the user to leak the podman-socket into the container.
It will use Podman-remote for all of its actions.
Requested by the Podman Desktop team.
Fixes: https://github.com/containers/ramalama/issues/837
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 15:59:45 -04:00
Daniel J Walsh
ca23630470
Move all RAG support to the -rag images
...
Images have grown considerably with RAG support.
Do not force users who do not use rag to pay the
penalty.
Helps revert some growth complained about here:
https://github.com/containers/ramalama/issues/838
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 14:50:49 -04:00
Eric Curtin
e4b15baf0e
Merge pull request #1085 from rhatdan/url
...
Add url support to rag to pull content to the host
2025-04-01 19:41:42 +01:00
Daniel J Walsh
00cf1b91ad
Complete move from fedora-rocm to rocm
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 14:27:30 -04:00
Eric Curtin
4ed1e15378
Merge pull request #1093 from rhatdan/mac
...
Describe where default ramalama.conf file is on mac
2025-04-01 19:14:44 +01:00
Daniel J Walsh
c2575f5be0
Describe where default ramalama.conf file is on mac
...
Fixes: https://github.com/containers/ramalama/issues/858
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 14:12:24 -04:00
Eric Curtin
21e1cc93db
Merge pull request #1084 from rhatdan/fedora
...
mv rocm to rocm-ubi and rocm-fedora to rocm
2025-04-01 19:02:31 +01:00
Eric Curtin
0d9d7e19e9
Merge pull request #1091 from rhatdan/docling
...
Do not stack fault when using unsupported docling format
2025-04-01 18:46:21 +01:00
Daniel J Walsh
623cda587c
Add url support to rag to pull content to the host
...
Users should be able to list URLs and pull them to the host to
be processed by doc2rag command.
Also should force building of AI Data images to --network=none.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 10:21:07 -04:00
Daniel J Walsh
8b2bec8c15
Do not stack fault when using unsupported docling format
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-04-01 10:16:49 -04:00
Eric Curtin
ba801c2e1b
Merge pull request #1089 from engelmi/fix-format-and-lint
...
Fix formatting and lint errors
2025-04-01 13:07:46 +01:00
Michael Engel
ddca0fcde0
Fix formatting and lint errors
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-01 14:02:37 +02:00
Eric Curtin
75bc45806b
Merge pull request #1088 from engelmi/fix-file-names-for-windows
...
Remove files with colon in their name
2025-04-01 12:52:23 +01:00
Michael Engel
92c80d0134
Remove files with colon in their name
...
The verify_checksum unit tests use files with a colon in their name. This
causes issues for Windows machines since file names/paths can not contain
this symbol. Therefore, these files have been removed and the tests create
these on the fly and only when not run on Windows machines.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-04-01 13:48:26 +02:00
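The approach described in the commit above (create colon-named files on the fly, and only off Windows) might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's test code: the helper name `create_colon_file` and the file name `sha256:example` are hypothetical.

```python
import os
import sys
import tempfile


def create_colon_file(directory, name="sha256:example"):
    """Create a file whose name contains a colon; return None on Windows.

    Hypothetical helper: the name and default file name are assumptions.
    """
    if sys.platform == "win32":
        # Windows file names cannot contain ':', so skip creation there.
        return None
    path = os.path.join(directory, name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("test data\n")
    return path


if __name__ == "__main__":
    # The file lives in a temporary directory and is removed afterwards.
    with tempfile.TemporaryDirectory() as tmp:
        print(create_colon_file(tmp))
```

Generating the file at test time keeps colon-named paths out of the repository, so a checkout on Windows no longer fails.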
Daniel J Walsh
fcc08c7704
mv rocm to rocm-ubi and rocm-fedora to rocm
...
Since we are going to concentrate mainly on upstream,
we want to default the name quay.io/ramalama/rocm to the
rocm-fedora Containerfiles.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-31 14:23:12 -04:00
Daniel J Walsh
e09f3f1f8e
Merge pull request #1083 from rhatdan/VERSION
...
Bump to v0.7.2
2025-03-31 14:14:18 -04:00
Daniel J Walsh
8ba3ffe95f
Bump to v0.7.2
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-31 13:43:36 -04:00
Daniel J Walsh
6ec874091a
Merge pull request #1081 from rhatdan/intel
...
Fix handling of entrypoint for Intel
2025-03-31 13:12:03 -04:00
Daniel J Walsh
8d1f3f3ea0
Fix handling of entrypoint for Intel
...
Additional Fix from https://github.com/lirc572
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-31 12:14:58 -04:00
Daniel J Walsh
2e81be1301
Merge pull request #1082 from rhatdan/quadlet
...
Fix gen of name in quadlet to be on its own line.
2025-03-31 11:42:07 -04:00
Daniel J Walsh
8bf16a0525
Fix gen of name in quadlet to be on its own line.
...
Fixes: https://github.com/containers/ramalama/issues/1078
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-31 11:14:04 -04:00
Daniel J Walsh
859e1f4644
Merge pull request #1080 from containers/container-fix
...
Only install epel on rhel-based OSes
2025-03-31 10:46:37 -04:00
Eric Curtin
64ed0551fc
Merge pull request #1072 from rhatdan/pull
...
We should be pulling minor versions not latest
2025-03-31 15:15:45 +01:00
Eric Curtin
a6a3768cdf
Only install epel on rhel-based OSes
...
Was attempting to install on others
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-31 14:18:40 +01:00
Daniel J Walsh
19224552bf
Merge pull request #1070 from miabbott/cuda_privileged
...
docs: fixes to ramalama-cuda
2025-03-31 09:17:30 -04:00
Micah Abbott
35e338db8e
docs: use uppercase NVIDIA
...
Signed-off-by: Micah Abbott <miabbott@redhat.com>
2025-03-31 08:13:01 -04:00
Micah Abbott
1eca34548c
docs: add note about container_use_devices usage
...
On SELinux systems, it may be necessary to turn on the
`container_use_devices` boolean in order to run the `nvidia-smi`
command from within a container.
Signed-off-by: Micah Abbott <miabbott@redhat.com>
2025-03-31 08:11:23 -04:00
Daniel J Walsh
61c37425ae
We should be pulling minor versions not latest
...
This way users can stick with an older version of RamaLama and
not get breakage from a major upgrade. Then when their RamaLama version
gets updated, it will pull an updated image.
Also update the README.md and ramalama.1.md man page to show
the accelerated images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-31 06:37:34 -04:00
Eric Curtin
0f2de2a57f
Merge pull request #1075 from rhatdan/intel
...
Make sure build_rag.sh is in intel-gpu container image
2025-03-31 10:59:10 +01:00
Daniel J Walsh
23f9e06233
Make sure build_rag.sh is in intel-gpu container image
...
Fixes: https://github.com/containers/ramalama/issues/1074
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-30 21:03:15 -04:00
Eric Curtin
0d47d6588e
Merge pull request #1060 from rhatdan/nocontainer
...
Catch errors early about no support for --nocontainer
2025-03-30 14:27:28 +01:00
Daniel J Walsh
3a322cc7bb
Catch errors early about no support for --nocontainer
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-29 05:53:08 -04:00
Micah Abbott
e1865100dd
docs: linting on ramalama-cuda
...
Applied some fixes based on the Markdown linter in VSCode
See: https://github.com/DavidAnson/vscode-markdownlint
Signed-off-by: Micah Abbott <miabbott@redhat.com>
2025-03-28 17:37:12 -04:00
Adam Miller
d80f49c294
Merge pull request #1069 from containers/build-fix
...
args.engine can be None in this code path
2025-03-28 15:06:54 -04:00
Eric Curtin
b52339bc3f
args.engine can be None in this code path
...
The code will then crash; ensure args.engine has a truthy value of
some type.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-28 18:51:56 +00:00
Eric Curtin
398584d5ca
Merge pull request #1066 from rhatdan/docker
...
Docker running of containers is blowing up
2025-03-28 13:29:09 +00:00
Eric Curtin
35a4f815e7
Merge pull request #1065 from rhatdan/apple
...
Fix handling of $RAMALAMA_CONTAINER_ENGINE
2025-03-28 13:00:39 +00:00
Daniel J Walsh
9dda839dcd
Fix handling of $RAMALAMA_CONTAINER_ENGINE
...
apple_vm has a side effect of setting the podman_machine_accel global
variable which is used when running and serving models. Currently if
the user sets RAMALAMA_CONTAINER_ENGINE to podman in an alternative path
the apple_vm code is not called, so the global variable is not set.
Fixes: https://github.com/containers/ramalama/issues/1040
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-28 08:25:51 -04:00
Daniel J Walsh
ccc05a322d
Docker running of containers is blowing up
...
Seems Docker does not support
--entrypoint=[] like podman. --entrypoint "" seems to work on both
platforms.
When attempting to mimic --pull=newer on Docker we were pulling the
wrong image; we should be attempting to pull the accelerated image
not the default.
For some reason llama-run --threads X is blowing up in a docker
container with the option not being supported. This could be something
being masked inside of Docker containers that is not masked inside of podman
containers. Someone who understands what llama-run is doing with the
--threads option would need to look further into this.
This should fix the issue in CI that is blowing up Docker tests.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-28 08:25:02 -04:00
Eric Curtin
e2a1d86871
Merge pull request #1064 from rhatdan/nvidia
...
Link ramalama-nvidia.1 to ramalama-cuda.1
2025-03-28 12:20:59 +00:00
Daniel J Walsh
61a2199314
Link ramalama-nvidia.1 to ramalama-cuda.1
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-28 06:27:59 -04:00
Eric Curtin
a476ded7c4
Merge pull request #1063 from rhatdan/VERSION
...
Bump to v0.7.1
2025-03-27 21:41:03 +00:00
Daniel J Walsh
96a954b783
Bump to v0.7.1
...
RAG support is broken in current release.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 17:07:33 -04:00
Daniel J Walsh
1b13c57462
Merge pull request #1055 from rhatdan/device
...
Add support for /dev/accel being leaked into containers
2025-03-27 17:00:14 -04:00
Daniel J Walsh
ee286e37e7
Merge pull request #1061 from rhatdan/rag
...
Don't display server port when using run --rag
2025-03-27 16:59:11 -04:00
Daniel J Walsh
7597f9c8c4
Don't display server port when using run --rag
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 15:59:14 -04:00
Daniel J Walsh
04cb59012b
Merge pull request #1059 from containers/rag-chunk-fix
...
fixed chunk error
2025-03-27 15:21:23 -04:00
Daniel J Walsh
240c9f653c
Add /dev/accel if it exists to containers
...
Certain AI Accelerators are stored in /dev/accel rather than /dev/dri.
Ramalama should support these as well.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 14:37:37 -04:00
Daniel J Walsh
f1c2a2fb37
Default devices should be added even if user specified devices
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 14:37:10 -04:00
Brian Mahabir
ca7b70104c
fixed chunk error
...
Signed-off-by: Brian Mahabir <bmahabir@bu.edu>
2025-03-27 14:20:27 -04:00
Daniel J Walsh
125bc3918e
Merge pull request #1056 from containers/default-threads
...
Hardcode threads to 2 in this test
2025-03-27 14:20:17 -04:00
Eric Curtin
8cde572b01
Merge pull request #1022 from containers/combine-vulkan-kompute-cpu
...
Combine Vulkan, Kompute and CPU inferencing into one image
2025-03-27 16:04:23 +00:00
Eric Curtin
73c54bf34c
Combine Vulkan, Kompute and CPU inferencing into one image
...
Fewer images to maintain; Vulkan is more mature and more widely
used than Kompute.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-27 15:17:44 +00:00
Eric Curtin
3463411463
Hardcode threads to 2 in this test
...
To help stabilize the build
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-27 13:09:34 +00:00
Eric Curtin
c09713f61a
Merge pull request #1049 from rhatdan/build
...
fix ramalama rag build code
2025-03-27 12:43:54 +00:00
Eric Curtin
61c5e648a7
Merge pull request #1046 from rhatdan/nvidia
...
Never use entrypoint
2025-03-27 11:30:27 +00:00
Eric Curtin
82eb9580a3
Merge pull request #1053 from benoitf/RAMALAMA-988
...
feat: add --jinja to the list of arguments if MODEL_JINJA env var is true
2025-03-27 11:28:52 +00:00
Florent Benoit
49054b7778
feat: add --jinja to the list of arguments if MODEL_JINJA is true
...
fixes https://github.com/containers/ramalama/issues/988
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-03-27 12:07:30 +01:00
Daniel J Walsh
74a19e757f
fix ramalama rag build code
...
Also on --dryrun, do not pull images when running on a docker platform
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 04:52:24 -04:00
Daniel J Walsh
51a2bd4320
Never use entrypoints
...
Turn off all use of entrypoints when running and serving containers.
Entrypoints have the chance of screwing up the way containers run, and
if a user provides their own image with an entrypoint this could cause
errors that are tough to diagnose.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-27 04:46:06 -04:00
Eric Curtin
6dc9453c42
Merge pull request #1050 from rhatdan/intel
...
Attempt to install openvino using pip
2025-03-27 07:38:45 +00:00
Eric Curtin
3e06caddfa
Merge pull request #982 from containers/default-threads
...
Default the number of threads to (nproc)/(2)
2025-03-27 00:04:54 +00:00
Eric Curtin
e4e0e10dea
Default the number of threads to (nproc)/(2)
...
The llama.cpp default for threads is hardcoded to 4. This changes
that hardcoding so instead we use the (number of cpu cores)/(2).
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-26 22:01:32 +00:00
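The default described in the commit above can be sketched as follows. This is a minimal illustration, not the RamaLama implementation: the function name `default_threads` is hypothetical, and the floor of one thread is an assumption added for single-core hosts.

```python
import os


def default_threads():
    """Return half the CPU cores reported by the OS, but never less than 1.

    Hypothetical helper; the lower bound of 1 is an assumption.
    """
    cores = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, cores // 2)


if __name__ == "__main__":
    print(default_threads())
```

The result would then be passed as the thread count in place of the fixed llama.cpp default of 4 that the commit message mentions.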
Daniel J Walsh
7d058686d4
Merge pull request #1044 from containers/minor-fix
...
Remove unused variable
2025-03-26 17:25:40 -04:00
Daniel J Walsh
3bca1f5d89
Attempt to install openvino using pip
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-26 16:23:55 -04:00
Eric Curtin
d1acb41c86
Merge pull request #1047 from edmcman/pull
...
Print status message when emulating --pull=newer for docker
2025-03-26 19:43:49 +00:00
Edward J. Schwartz
c2cb25267f
Print status message when emulating --pull=newer for docker
...
Close #1043
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-03-26 15:22:20 -04:00
Eric Curtin
7e87e19991
Remove unused variable
...
Closes issue opened by sourcery.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-26 19:07:30 +00:00
Eric Curtin
fc50bcaf10
Merge pull request #1045 from rhatdan/intel
...
Add openvino to all images
2025-03-26 19:05:13 +00:00
Daniel J Walsh
2622a914e9
Add openvino to all images
...
Podman Desktop has asked us to add openvino support to our containers.
This is the first step; next we need to pull non-gguf images and start
actually allowing users to specify openvino as a service.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-26 13:51:50 -04:00
Daniel J Walsh
c776b759b8
Merge pull request #1041 from containers/explain-option-better
...
Explain dryrun option better in container_build.sh
2025-03-26 10:13:31 -04:00
Adam Miller
18547bbca3
Merge pull request #1042 from rhatdan/VERSION
...
Bump to v0.7.0
2025-03-26 09:09:41 -04:00
Eric Curtin
83d9ac3966
Explain dryrun option better in container_build.sh
...
It was given some generic explanation
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-26 12:47:59 +00:00
Daniel J Walsh
5ef94aa479
Bump to v0.7.0
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-26 08:43:11 -04:00
Daniel J Walsh
065b6eda72
Merge pull request #1036 from rhatdan/images
...
More updates for builds
2025-03-25 20:37:48 -04:00
Eric Curtin
6f71835edc
Merge pull request #1038 from marceloleitner/py3.9
...
Fix errors on python3.9
2025-03-26 00:36:01 +00:00
Daniel J Walsh
4333e1311e
Merge pull request #1039 from containers/update-llama.c
...
Typo in the webui
2025-03-25 20:35:33 -04:00
Eric Curtin
1e98c381a6
Typo in the webui
...
Somebody noticed a typo in the built-in webui in llama.cpp, it was
fixed in upstream llama.cpp. This just ensures we get the fix
downstream!
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-26 00:06:28 +00:00
Marcelo Ricardo Leitner
3e28287fa6
Fix errors on python3.9
...
Fixes: https://github.com/containers/ramalama/issues/1037
Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
2025-03-25 20:58:27 -03:00
Daniel J Walsh
ff0e5223d0
More updates for builds
...
Fix doc2rag to handle load properly
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-25 16:30:19 -04:00
Daniel J Walsh
bcf5c9576b
Merge pull request #1035 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1742918310
2025-03-25 15:14:03 -04:00
renovate[bot]
4328dabb7c
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1742918310
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-03-25 18:11:07 +00:00
Eric Curtin
61e164c6a7
Merge pull request #1033 from containers/rag-eof-fix
...
Added terminal name fixed eof bug and added another model to rag_framework load
2025-03-25 15:12:45 +00:00
Brian Mahabir
8a70dfbb18
Added terminal name fixed eof bug added model to load
...
Signed-off-by: Brian Mahabir <bmahabir@bu.edu>
2025-03-25 11:07:23 -04:00
Daniel J Walsh
c0af65f421
Merge pull request #1032 from containers/minor-fix
...
Minor bugfix remove self. from self.prompt
2025-03-25 10:32:05 -04:00
Eric Curtin
c99f5e3188
Minor bugfix remove self. from self.prompt
...
Not needed in non-constructor scope
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-25 14:16:35 +00:00
Eric Curtin
992d07eb95
Use cmd from stdlib
...
Handles a lot of cases by default, helps handle Ctrl-C, Ctrl-D,
adds ability to cycle through prompts via up and down keyboard
arrow.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-25 13:55:24 +00:00
Eric Curtin
7d6baa87f0
Merge pull request #1031 from rhatdan/images
...
More fixes to build scripts
2025-03-25 12:52:10 +00:00
Daniel J Walsh
420c39f7e8
More fixes to build scripts
...
Adding back rag_framework load
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-25 08:46:02 -04:00
Eric Curtin
7a19e002dc
Merge pull request #1029 from containers/dr-rag
...
Updated rag to have much better queries at the cost of slight delay
2025-03-25 11:31:11 +00:00
Brian
baa7c16489
Updated rag to have much better queries at the cost of slightly more delay
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-03-24 23:42:11 -04:00
Eric Curtin
254f11ad00
Merge pull request #1028 from rhatdan/images
...
More fixes to build scripts
2025-03-24 21:32:37 +00:00
Daniel J Walsh
809a914fb4
More fixes to build scripts
...
Adding back rag_framework load
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-24 16:53:30 -04:00
Eric Curtin
de6a149d30
Merge pull request #1026 from containers/rag-run-fix
...
added hacky method to use 'run' instead of 'serve' for rag
2025-03-24 17:53:21 +00:00
Daniel J Walsh
e9b5ad9265
Merge pull request #1027 from rhatdan/images
...
Run build_rag.sh as root
2025-03-24 13:49:01 -04:00
Daniel J Walsh
2e1aebfa87
Merge pull request #1024 from rhatdan/rocm
...
Change default ROCM image to rocm-fedora
2025-03-24 13:48:23 -04:00
Daniel J Walsh
d216212207
Run build_rag.sh as root
...
This fixes the build on intel-gpu container.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-24 13:36:55 -04:00
Daniel J Walsh
87cbd9f302
Merge pull request #1023 from rhatdan/images
...
Fix up building of images
2025-03-24 13:36:42 -04:00
Brian Mahabir
8dd0f1d804
added hacky method to use 'run' instead of 'serve' for rag
...
Signed-off-by: Brian Mahabir <bmahabir@bu.edu>
2025-03-24 13:35:28 -04:00
Daniel J Walsh
b276491ff4
Fix up building of images
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-24 12:07:56 -04:00
Daniel J Walsh
7da14df5db
Change default ROCM image to rocm-fedora
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-24 12:04:39 -04:00
Daniel J Walsh
76041eb154
Merge pull request #1021 from containers/optionally-turn-off-color
...
Add feature to turn off colored text
2025-03-24 06:46:59 -04:00
Eric Curtin
0d21651784
Add feature to turn off colored text
...
Requested by user. Also add check to see if terminal is color
capable.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-23 14:19:00 +00:00
Daniel J Walsh
02b7c109cd
Merge pull request #1017 from containers/reset-colors
...
Color each word individually
2025-03-22 13:17:26 -04:00
Daniel J Walsh
ab9898dccb
Merge pull request #1019 from containers/install
...
Make install script more aesthetically pleasing
2025-03-22 13:16:41 -04:00
Eric Curtin
dbe7775513
Make install script more aesthetically pleasing
...
Print RamaLama and Llama based loading bar.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-21 23:05:43 +00:00
Eric Curtin
106073b265
Merge pull request #1009 from bachp/modelname-in-api
...
Show model name in API instead of model file path
2025-03-21 21:33:12 +00:00
Pascal Bach
a9ccfb64d0
chore: extend flake to allow nix develop
...
This adds all dependencies needed to run
make bats inside the flake
Signed-off-by: Pascal Bach <pascal.bach@nextrem.ch>
2025-03-21 21:53:51 +01:00
Pascal Bach
422cd02173
feat: show model name in API instead of model file path
...
The model file path is always /mnt/models/model.file which makes it hard
to distinguish models in the API.
By using the llama-cpp alias flag the server will serve the model name
Signed-off-by: Pascal Bach <pascal.bach@nextrem.ch>
2025-03-21 21:53:51 +01:00
Pascal Bach
85fc3000be
refactor: use long argument names for llama-server
...
Signed-off-by: Pascal Bach <pascal.bach@nextrem.ch>
2025-03-21 21:53:51 +01:00
Eric Curtin
44a16a7d65
Color each word individually
...
We don't want the color yellow to leak into the terminal if the
process dies suddenly.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-21 19:47:23 +00:00
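A sketch of the idea in the commit above: wrap every word in its own color and reset sequence, so that an abrupt exit cannot leave the terminal stuck in yellow. The names `colorize_word` and `print_words` are hypothetical; the escape codes are the standard ANSI SGR sequences for yellow foreground and reset. Flushing after each word is included here as an assumption, so that output appears word by word.

```python
import sys

YELLOW = "\033[33m"  # ANSI SGR: yellow foreground
RESET = "\033[0m"    # ANSI SGR: reset all attributes


def colorize_word(word, enabled=True):
    """Wrap a single word in yellow and reset (hypothetical helper)."""
    return f"{YELLOW}{word}{RESET}" if enabled else word


def print_words(words, enabled=True, stream=sys.stdout):
    """Write words one at a time, each self-contained in its color codes."""
    for word in words:
        stream.write(colorize_word(word, enabled) + " ")
        stream.flush()  # flush so each word shows up immediately
    stream.write("\n")


if __name__ == "__main__":
    print_words(["Hello", "from", "the", "model"])
```

Because each word carries its own reset, the terminal is back in its default state after every write.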
Daniel J Walsh
c97cb2dde0
Merge pull request #1016 from containers/bugfix-1
...
Rag condition should be and instead of or
2025-03-21 14:01:15 -04:00
Eric Curtin
a18bcab73e
Rag condition should be and instead of or
...
We want both of these things to be true to execute rag
functionality.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-21 17:39:38 +00:00
Eric Curtin
4523c6e9e6
Merge pull request #1010 from containers/rag
...
Adds Rag chatbot to ramalama serve and preloads models for doc2rag and rag_framework
2025-03-21 16:45:48 +00:00
Brian Mahabir
8a5d94a072
Updated rag
...
Signed-off-by: Brian Mahabir <bmahabir@bu.edu>
2025-03-21 12:23:32 -04:00
Eric Curtin
a3ef16ec83
Merge pull request #1015 from rhatdan/rag1
...
Fix ramalama serve --rag ABC --generate kube
2025-03-21 16:23:13 +00:00
Daniel J Walsh
8fb794bee7
Fix ramalama serve --rag ABC --generate kube
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-21 11:59:44 -04:00
Daniel J Walsh
68f41c22f4
Merge pull request #1014 from containers/conversation-history
...
Keep conversation history
2025-03-21 11:40:25 -04:00
Eric Curtin
439a7413ec
Merge pull request #1012 from rhatdan/rag1
...
Generate quadlets with rag databases
2025-03-21 15:30:58 +00:00
Eric Curtin
5878094c31
Keep conversation history
...
Don't treat every prompt like a separate prompt, keep the
conversation history.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-21 15:17:56 +00:00
Daniel J Walsh
223a42ac54
Generate quadlets with rag databases
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-21 08:23:39 -04:00
Daniel J Walsh
1421497a84
Merge pull request #1011 from cgruver/cleanup
...
update docs for Intel GPU support. Clean up code comments
2025-03-21 08:12:01 -04:00
Eric Curtin
e61d80f9bc
Merge pull request #1013 from containers/enhance-client
...
Improve UX for ramalama-client
2025-03-21 11:58:23 +00:00
Eric Curtin
69e3ed22b3
Improve UX for ramalama-client
...
I couldn't figure out why things weren't printing word by word
yesterday. As I was going for an evening walk, it dawned on me: we
were not flushing the buffers. Also adds color to the response
like llama-run.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-21 11:38:49 +00:00
Charro Gruver
be7c8c6e13
update docs for Intel GPU support. Clean up code comments
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-21 11:34:38 +00:00
Eric Curtin
ec7cfa4620
Merge pull request #1008 from containers/vllm-rocm
...
Use this container if we detect ROCm accelerator
2025-03-21 11:17:53 +00:00
Eric Curtin
15b8cce09c
Merge pull request #1007 from rhatdan/rag1
...
Fix errors on python3.9
2025-03-20 20:27:57 +00:00
Eric Curtin
82a62951db
Use this container if we detect ROCm accelerator
...
Should help get ROCm + vLLM working
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-20 20:27:09 +00:00
Daniel J Walsh
d0e6e6781e
Fix errors on python3.9
...
Fixes: https://github.com/containers/ramalama/issues/1004
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-20 16:05:34 -04:00
Daniel J Walsh
f42c734928
Merge pull request #1005 from containers/appstudio-ramalama
...
Red Hat Konflux update ramalama
2025-03-20 15:45:33 -04:00
red-hat-konflux
026176d12a
Red Hat Konflux update ramalama
...
Signed-off-by: red-hat-konflux <konflux@no-reply.konflux-ci.dev>
2025-03-20 19:08:19 +00:00
Brian M
9d2f70f34f
Merge pull request #1003 from rhatdan/rag1
...
Don't use relative paths for destination
2025-03-20 15:06:24 -04:00
Daniel J Walsh
5213e858e3
Don't use relative paths for destination
...
This can cause the file to not be installed in a subdir of the
destination (/docs) within the container.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-20 14:49:49 -04:00
Daniel J Walsh
e9a253f933
Merge pull request #1001 from containers/verbose
...
Turn on verbose logging in llama-server if --debug is on
2025-03-20 14:21:51 -04:00
Daniel J Walsh
bd5cbecffd
Merge pull request #998 from rhatdan/rag
...
Fix errors found in RamaLama RAG
2025-03-20 14:20:59 -04:00
Daniel J Walsh
f7dc4d7ca4
Merge pull request #997 from containers/ramalama-client
...
Add ramalama client
2025-03-20 14:05:52 -04:00
Daniel J Walsh
94246b6977
Fix errors found in RamaLama RAG
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-20 14:03:51 -04:00
Eric Curtin
9e02876bef
Turn on verbose logging in llama-server if --debug is on
...
Can see more verbose request/response info, etc.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-20 17:32:50 +00:00
Eric Curtin
7784b4ef30
Merge pull request #996 from cgruver/intel-gpus
...
Add the ability to identify a wider set of Intel GPUs that have enough Execution Units to produce decent results
2025-03-20 17:31:31 +00:00
Eric Curtin
6416c9e3e9
Add ramalama client
...
Once we achieve feature parity with llama-run, we will more
tightly integrate this into RamaLama.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-20 17:30:38 +00:00
Eric Curtin
3237ef4af5
Merge pull request #1002 from kush-gupt/main
...
FIX: Ollama install with brew for CI
2025-03-20 17:16:39 +00:00
Kush Gupta
7ca228bb7c
update brew before starting ollama
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-03-20 13:04:01 -04:00
Charro Gruver
309a14f6e6
Make Linter Happy :-)
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-20 14:50:36 +00:00
Charro Gruver
4a3cd65180
Add the ability to identify a wider set of Intel GPUs that have enough Execution Units to produce decent results
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-20 14:42:18 +00:00
Eric Curtin
1e90133123
Merge pull request #995 from benoitf/fix-condition
...
chore: use the reverse condition for models
2025-03-20 13:32:47 +00:00
Florent Benoit
546ad5d0b8
chore: use the reverse condition for models
...
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-03-20 14:16:26 +01:00
Eric Curtin
08bfd60eb8
Merge pull request #979 from rhatdan/rag
...
Add docling support version 2
2025-03-20 11:43:06 +00:00
Eric Curtin
6f502a87d9
Merge pull request #994 from leo-pony/main
...
[CANN] Fix the bug that openEuler repo does not have ffmpeg-free package; use ffmpeg instead for openEuler
2025-03-20 03:10:57 +00:00
leo-pony
912ac01af1
[CANN] Fix the bug that openEuler repo does not have ffmpeg-free package. Use ffmpeg instead for openEuler, which also has an LGPL license.
...
Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-03-20 11:01:13 +08:00
Daniel J Walsh
1a17a4497f
Add ramalama serve --rag and ramalama run --rag
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-19 14:47:46 -04:00
Daniel J Walsh
27ca51d87a
Add docling support version 2
...
Remove pragmatic, and move to using local implementation
until llama-stack version is ready.
python3 container-images/scripts/doc2rag.py --help
usage: docling [-h] target source [source ...]
process source files into RAG vector database
positional arguments:
target
source
options:
-h, --help show this help message and exit
ramalama rag should be using accelerated images.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-19 14:47:46 -04:00
Eric Curtin
f73f37a3b8
Merge pull request #992 from benoitf/RAMALAMA-991
...
fix: use expected condition
2025-03-19 14:06:29 +00:00
Florent Benoit
b46ac7e24a
fix: use expected condition
...
fixes https://github.com/containers/ramalama/issues/991
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-03-19 14:50:13 +01:00
Eric Curtin
83285cbad5
Merge pull request #989 from rhatdan/build
...
Fix container_build.sh to build all images
2025-03-19 10:07:28 +00:00
Daniel J Walsh
18d90bb1ed
Fix container_build.sh to build all images
...
Fixes: https://github.com/containers/ramalama/issues/987
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-19 05:04:38 -04:00
Daniel J Walsh
45dfb82926
Merge pull request #985 from containers/ffmpeg-free
...
whisper.cpp requires ffmpeg
2025-03-18 20:32:37 -04:00
Eric Curtin
f8cefcdd87
whisper.cpp requires ffmpeg
...
Installing ffmpeg-free from Fedora/EPEL, ffmpeg-free includes only
the FOSS/patent free bits of ffmpeg.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-18 19:18:57 +00:00
Daniel J Walsh
11a23b9279
Merge pull request #986 from rhatdan/intel
...
Improve intel-gpu to work with whisper-server and llama-server
2025-03-18 14:48:09 -04:00
Daniel J Walsh
b0c8c84a04
Improve intel-gpu to work with whisper-server and llama-server
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-18 11:17:05 -04:00
Daniel J Walsh
787f3558b0
Merge pull request #984 from rhatdan/whisper
...
Default whisper-server.sh, llama-server.sh to /mnt/models/model.file
2025-03-18 10:32:26 -04:00
Daniel J Walsh
e0ba69e89c
Default whisper-server.sh, llama-server.sh to /mnt/models/model.file
...
Fixes: https://github.com/containers/ramalama/issues/980
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-18 09:15:06 -04:00
Daniel J Walsh
e18d78045d
Merge pull request #978 from rhatdan/VERSION
...
Bump to v0.6.4
2025-03-17 14:20:54 -04:00
Daniel J Walsh
1bcbfd5ab4
Bump to v0.6.4
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-17 11:55:02 -04:00
Daniel J Walsh
40fe75230b
Merge pull request #975 from rhatdan/main
...
Fix handling of whisper-server and llama-server entrypoints
2025-03-17 11:54:48 -04:00
Daniel J Walsh
064b28d10f
Fix handling of whisper-server and llama-server entrypoints
...
Entrypoint tests are blowing up so remove for now.
Fixes: https://github.com/containers/ramalama/issues/977
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-17 11:07:09 -04:00
Daniel J Walsh
e5249621fb
Merge pull request #949 from edmcman/main
...
Add --runtime-arg option for run and serve
2025-03-17 10:08:46 -04:00
Eric Curtin
9a91ce3594
Merge pull request #976 from cgruver/intel-gpg-fail
...
GPG Check is failing on the Intel Repo
2025-03-17 14:03:11 +00:00
Edward J. Schwartz
65bd965359
Add --runtime-args option
...
Signed-off-by: Edward J. Schwartz <edmcman@cmu.edu>
2025-03-17 09:52:31 -04:00
Charro Gruver
ff270aeee9
GPG Check is failing on the Intel Repo
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-17 13:43:32 +00:00
Eric Curtin
ae2c250ed0
Merge pull request #974 from rhatdan/main
...
Asahi build is failing because of no python3-devel package
2025-03-17 13:03:05 +00:00
Daniel J Walsh
1a0492dbb9
Asahi build is failing because of no python3-devel package
...
Also remove devel packages when completing install.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-17 09:01:15 -04:00
Daniel J Walsh
af6b551f6b
Merge pull request #966 from antheas/threads
...
feat(cpu): add --threads option to specify number of cpu threads
2025-03-17 06:53:22 -04:00
Daniel J Walsh
ded1c436f5
Merge pull request #971 from containers/nvidia-fix
...
Only set this environment variable if we can resolve CDI
2025-03-17 06:49:41 -04:00
Daniel J Walsh
b67d6d43e3
Merge pull request #973 from containers/update-llama
...
Update llama.cpp for some Gemma features
2025-03-17 06:48:47 -04:00
Eric Curtin
e7083607d9
Update llama.cpp for some Gemma features
...
We want to get some new Gemma features added to llama.cpp.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-16 22:50:30 +00:00
Eric Curtin
6019bda457
Only set this environment variable if we can resolve CDI
...
We don't want to use Nvidia/CUDA just because nvidia-smi is present
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-16 18:59:19 +00:00
Eric Curtin
3c41b2ff98
Merge pull request #968 from rhatdan/rag1
...
Add software to support using rag in RamaLama
2025-03-15 15:34:43 +00:00
Brian
57f4a6097b
Add software to support using rag in RamaLama
...
This PR just installs the python requirements needed to play with the
rag_framework.py file.
I have not added the docling support yet, since that would swell the
size of the images. Will add that in a separate PR.
Also remove pragmatic and begin conversion to new rag tooling.
Signed-off-by: Brian <bmahabir@bu.edu>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-15 07:28:47 -04:00
Antheas Kapenekakis
7b0789187a
feat(cpu): add --threads option to specify number of cpu threads
...
Signed-off-by: Antheas Kapenekakis <git@antheas.dev>
2025-03-15 12:17:04 +01:00
Eric Curtin
c8ea9fe3ba
Merge pull request #965 from rhatdan/whisper
...
Fix ENTRYPOINTS of whisper-server and llama-server
2025-03-14 21:47:59 +00:00
Eric Curtin
349ad48008
Merge pull request #967 from containers/llama-cpp-threads
...
Update llama.cpp to contain threads features
2025-03-14 18:22:32 +00:00
Eric Curtin
745b960e77
Update llama.cpp to contain threads features
...
So we can specify CPU threads for llama-run.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-14 16:42:24 +00:00
Daniel J Walsh
bff0b2de0b
Fix ENTRYPOINTS of whisper-server and llama-server
...
Fixes: https://github.com/containers/ramalama/issues/964
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-14 10:08:03 -04:00
Daniel J Walsh
3513d47150
Merge pull request #960 from containers/renovate/docker.io-nvidia-cuda-12.x
...
Update docker.io/nvidia/cuda Docker tag to v12.8.1
2025-03-14 09:35:38 -04:00
Eric Curtin
e91b21bfba
Merge pull request #963 from andreadecorte/fix_readme
...
Fix port rendering in README
2025-03-14 13:17:06 +00:00
Andrea Decorte
901a7b2bf2
Fix port rendering in README
...
Port was not rendering in README.md, add a space around it as a workaround.
Signed-off-by: Andrea Decorte <adecorte@redhat.com>
2025-03-14 11:30:01 +01:00
Eric Curtin
cd2150abb1
Merge pull request #962 from leo-pony/main
...
[NPU][Fix] Running the ramalama/cann container image fails when only a device number is specified and ascend-docker-runtime is not installed
2025-03-14 10:27:55 +00:00
leo-pony
d0f02648fa
1. Keep the environment variable of visible Ascend device in ramalama consistent with ascend-docker-runtime.
...
2. Temporarily remove the default value of using device 0 when no ascend device is specified. The reason is that currently, if you only specify device 0 without using ascend-docker-runtime, it cannot be offloaded to NPU normally.
Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-03-14 17:33:46 +08:00
renovate[bot]
ddec113669
Update docker.io/nvidia/cuda Docker tag to v12.8.1
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-03-13 17:00:21 +00:00
Daniel J Walsh
8f8f96fa65
Merge pull request #954 from containers/enhance-cuda-check
...
There must be at least one CDI device present to use CUDA
2025-03-13 13:00:11 -04:00
Daniel J Walsh
c4d6772b31
Merge pull request #959 from containers/validate-python3
...
python3 validator
2025-03-13 12:59:51 -04:00
Eric Curtin
68764aa088
There must be at least one CDI device present to use CUDA
...
Otherwise we get failures like:
Error: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all
Co-authored-by: Brian <bmahabir@bu.edu>
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-13 15:16:00 +00:00
Eric Curtin
f6eaeb6b49
python3 validator
...
We are encountering issues where newer python3 features are
breaking systems with older versions of python3, such as macOS.
This validator should ensure we catch this in CI.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-13 15:13:51 +00:00
Eric Curtin
143f4b0f41
Merge pull request #953 from rhatdan/nvidia
...
Add specified nvidia-oci runtime
2025-03-13 12:58:15 +00:00
Eric Curtin
cd8f5f90ae
Merge pull request #956 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1741850090
2025-03-13 12:54:01 +00:00
Daniel J Walsh
066dc1cbbf
Merge pull request #952 from engelmi/add-chat-template-support-to-serve
...
Added --chat-template-file support to ramalama serve
2025-03-13 06:42:01 -04:00
renovate[bot]
c28635f2f0
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1741850090
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-03-13 09:43:30 +00:00
Daniel J Walsh
940953aaaf
Add specified nvidia-oci runtime
...
Nvidia recommends using their nvidia-container-runtime when running
containers with GPU support, so ramalama should use this feature as
well.
Allow users to override the oci-runtime when appropriate.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-12 14:52:28 -04:00
Michael Engel
cb88e40e24
Added --chat-template-file support to ramalama serve
...
Relates to: https://github.com/containers/ramalama/issues/890
Relates to: https://github.com/containers/ramalama/issues/947
If a chat template file can be extracted from the gguf model or if specified by
the model repo, it will now be used in the ramalama serve command and mounted
into the container. It has been included in the generation of the quadlet and
kube files as well.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-12 15:40:25 +01:00
Daniel J Walsh
fd06246fbf
Merge pull request #946 from rhatdan/docker
...
Let's run the container in all tests, to make sure it does not explode.
2025-03-12 10:23:17 -04:00
Daniel J Walsh
23de968693
Let's run the container in all tests, to make sure it does not explode.
...
Also switch to using smollm
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-12 09:56:34 -04:00
Daniel J Walsh
6a76efc4ca
Merge pull request #951 from containers/fix-install
...
Fix install.sh for OSTree system
2025-03-12 09:47:04 -04:00
Eric Curtin
741ecf2718
Fix install.sh for OSTree system
...
Don't run dnf install on OSTree system.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-12 12:14:04 +00:00
Daniel J Walsh
b9505b8699
Merge pull request #939 from s3rj1k/CNAI
...
Handle CNAI annotation deprecation
2025-03-12 07:00:24 -04:00
Daniel J Walsh
ee062e5ee7
Merge pull request #950 from leo-pony/main
...
Add Linux x86-64 support for Ascend NPU accelerator in llama.cpp backend
2025-03-12 06:57:22 -04:00
Daniel J Walsh
0985819215
Merge pull request #915 from containers/more-scaffoling
...
Implement RamaLama shell
2025-03-12 06:52:03 -04:00
leo-pony
ff187ab029
Add Linux x86-64 support for Ascend NPU accelerator in llama.cpp backend
...
Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-03-12 15:38:37 +08:00
Eric Curtin
ea7ec50eef
Merge pull request #943 from containers/consolidate-gpu-detection
...
Consolidate gpu detection
2025-03-11 21:05:24 +00:00
Eric Curtin
8d3a44adac
Consolidate gpu detection
...
This makes sure the GPU detection techniques are the same
throughout the project. We do not display detailed accelerator info;
leave that to tools like "fastfetch". It is hard to maintain and there
are no standards.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-11 18:27:33 +00:00
Daniel J Walsh
a28d902a9a
Merge pull request #917 from engelmi/add-chat-template-support
...
Add chat template support
2025-03-11 14:16:26 -04:00
s3rj1k
f716fcc2e9
Update `opencontainers` spec link
...
Signed-off-by: s3rj1k <evasive.gyron@gmail.com>
2025-03-11 18:44:58 +01:00
Eric Curtin
742b6d85ba
Merge pull request #942 from containers/macos-detect
...
macOS detection fix
2025-03-11 17:37:52 +00:00
Eric Curtin
2e1bb04b6d
macOS detection fix
...
Handling of global variables was not correct.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-11 17:10:38 +00:00
Eric Curtin
58998f20b9
Merge pull request #941 from rhatdan/docker
...
Fix docker handling of GPUs.
2025-03-11 15:23:44 +00:00
Michael Engel
b751eef975
Added snapshot file validation
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-11 15:41:55 +01:00
Michael Engel
f55475e36d
Added converting go templates to jinja templates
...
Usually, the chat templates for gguf models are written as jinja templates.
Ollama, however, uses Go Templates specific to ollama. In order to use the
proper templates for models pulled from ollama, the chat templates are
converted to jinja ones and passed to llama-run.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-11 15:41:55 +01:00
Michael Engel
a4c401f303
Use chat template file on model run
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-11 15:41:55 +01:00
Michael Engel
5c911fda79
Encode model and chat template information in RefFile
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-11 15:41:55 +01:00
Michael Engel
a756441f33
Extract chat template from GGUF file
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-11 15:41:55 +01:00
Daniel J Walsh
b4ff470268
Fix docker handling of GPUs.
...
Fixes: https://github.com/containers/ramalama/issues/940
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-11 10:37:17 -04:00
Eric Curtin
742cccce34
Merge pull request #938 from rhatdan/oci
...
Add note about updating nvidia.yaml file
2025-03-11 14:20:09 +00:00
s3rj1k
b66b48de0c
Drop unsupported CNAI annotations
...
Signed-off-by: s3rj1k <evasive.gyron@gmail.com>
2025-03-11 14:35:19 +01:00
s3rj1k
439a95743f
Use `opencontainers` annotations where it makes sense
...
Signed-off-by: s3rj1k <evasive.gyron@gmail.com>
2025-03-11 14:34:52 +01:00
Daniel J Walsh
0698dd8882
Add note about updating nvidia.yaml file
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-11 09:07:24 -04:00
Daniel J Walsh
c9f9266dd7
Merge pull request #935 from containers/raspberripi
...
Bugfixes noticed while installing on Raspberry Pi
2025-03-11 08:54:35 -04:00
Eric Curtin
87cfc4bd18
Bugfixes noticed while installing on Raspberry Pi
...
Just some nits
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-11 12:26:24 +00:00
Daniel J Walsh
94d206c845
Merge pull request #933 from containers/macos-fix
...
Make compatible with the macOS system python3
2025-03-10 15:58:04 -04:00
Eric Curtin
e644b63945
Merge pull request #932 from rhatdan/oci
...
Print error when converting from an OCI Image
2025-03-10 19:46:56 +00:00
Eric Curtin
5543d71cac
Make compatible with the macOS system python3
...
Populate variables first. Also reproducible on macOS systems with brew
by switching the shebang to #!/usr/bin/python3
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-10 19:39:24 +00:00
Daniel J Walsh
1a547a0258
Print error when converting from an OCI Image
...
Fixes: https://github.com/containers/ramalama/issues/929
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-10 14:22:02 -04:00
Daniel J Walsh
72c5fafb12
Merge pull request #931 from rhatdan/VERSION
...
Bump to v0.6.3
2025-03-10 13:48:17 -04:00
Daniel J Walsh
583f9a9cac
Bump to v0.6.3
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-10 13:18:06 -04:00
Daniel J Walsh
0ff95703d1
Remove print statement on ports
...
Fixes: https://github.com/containers/ramalama/issues/930
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-10 13:17:44 -04:00
Eric Curtin
7cd661b0ec
Merge pull request #928 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1741600006
2025-03-10 15:33:25 +00:00
Daniel J Walsh
199221373d
Merge pull request #926 from benoitf/RAMALAMA-925
...
fix: CHAT_FORMAT variable should be expanded
2025-03-10 11:23:21 -04:00
Florent Benoit
14a80ab1ea
fix: propagate correct command line arguments
...
fixes https://github.com/containers/ramalama/issues/925
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-03-10 16:21:41 +01:00
renovate[bot]
5b8ba9651a
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1741600006
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-03-10 15:04:26 +00:00
Eric Curtin
eefeaff7d5
Merge pull request #921 from rhatdan/config
...
Allow user to specify the images to use per hardware
2025-03-10 15:03:57 +00:00
Daniel J Walsh
0847a76c2d
Allow user to specify the images to use per hardware
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-10 10:03:30 -04:00
Eric Curtin
5394518bbb
Merge pull request #922 from rhatdan/env
...
Add passing of environment variables to ramalama commands
2025-03-10 10:48:01 +00:00
Daniel J Walsh
e18280767c
Add passing of environment variables to ramalama commands
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-03-09 14:32:55 -10:00
Eric Curtin
247b995bb1
Merge pull request #898 from andreadecorte/797
...
Try to choose a free port on serve if default one is not available
2025-03-09 12:43:52 +00:00
Andrea Decorte
78239f2f7f
Try to choose a free port on serve if default one is not available
...
This change tries first to find if the default port 8080 is available.
If not, it tries to find an available free port in the range 8081-8090 in random order.
An error is raised if no free port is found.
In case of success, the chosen port is printed out for the user.
This does not apply if the user chooses a port different from 8080.
Note that this check could still not be enough if the chosen port is taken
by another process after our check; in that case we will still fail at a later phase, as today.
Includes unit testing.
Closes #797
Signed-off-by: Andrea Decorte <adecorte@redhat.com>
2025-03-08 23:39:11 +01:00
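The port selection described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names choose_port, is_port_free, DEFAULT_PORT and FALLBACK_PORTS are hypothetical, and probing by binding a TCP socket on 127.0.0.1 is an assumption about how "free" is checked.

```python
import random
import socket

DEFAULT_PORT = 8080                 # default named in the commit message
FALLBACK_PORTS = range(8081, 8091)  # 8081-8090 inclusive


def is_port_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to the port right now (assumed probe)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def choose_port(requested=DEFAULT_PORT, is_free=is_port_free):
    """Pick a port following the rules in the commit message (hypothetical helper)."""
    # A user-chosen port other than the default is returned without probing.
    if requested != DEFAULT_PORT:
        return requested
    if is_free(DEFAULT_PORT):
        return DEFAULT_PORT
    # Try the fallback range in random order.
    candidates = list(FALLBACK_PORTS)
    random.shuffle(candidates)
    for port in candidates:
        if is_free(port):
            return port
    raise OSError("no free port found in range 8081-8090")
```

As the commit message notes, a port that is free at probe time can still be taken by another process before the server binds it, so a later failure remains possible.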
Eric Curtin
ebf056cc4a
Merge pull request #920 from cgruver/readme
...
Add Intel ARC 155H to list of supported hardware
2025-03-08 18:25:51 +00:00
Charro Gruver
36c7662956
Add Intel ARC 155H to list of supported hardware
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-08 17:17:15 +00:00
Daniel J Walsh
a668c76e9a
Merge pull request #919 from cgruver/env-vars
...
Modify GPU detection to match against env var value instead of prefix
2025-03-08 11:25:09 -05:00
Charro Gruver
1c57208df0
replace meaningless env var with HIP_VISIBLE_DEVICES in test script
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-08 12:15:22 +00:00
Charro Gruver
7542de5ca2
add HSA_OVERRIDE_GFX_VERSION to env check
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-07 20:21:48 +00:00
Charro Gruver
5859f83f8f
fix tests
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-07 20:02:02 +00:00
Charro Gruver
406ad34a33
fix formatting to satisfy lint again... :-)
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-07 19:41:41 +00:00
Charro Gruver
ad39af9aed
fix formatting to satisfy lint
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-07 19:39:57 +00:00
Charro Gruver
37ec32594a
Modify GPU detection to match against env var value instead of prefix
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-03-07 19:29:51 +00:00
Eric Curtin
bf53e71106
Merge pull request #916 from containers/validate
...
Extend make validate check to do more
2025-03-07 12:47:35 +00:00
Eric Curtin
2769347597
Extend make validate check to do more
...
It also does check-format now.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-07 12:24:38 +00:00
Daniel J Walsh
adc53fea4e
Merge pull request #911 from leo-pony/main
...
Add support for llama.cpp engine to use ascend NPU device
2025-03-06 12:11:30 -05:00
leo-pony
93c023d4a7
Add code for Ascend NPU support in the llama.cpp engine
...
Signed-off-by: leo-pony <nengjunma@outlook.com>
2025-03-06 17:25:46 +08:00
Eric Curtin
31930cd08b
Merge pull request #905 from engelmi/add-new-model-store
...
Add new model store
2025-03-05 13:08:28 +00:00
Eric Curtin
9ce1984102
Implement RamaLama shell
...
This will eventually replace things like linenoise.cpp, llama-run,
etc.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-05 13:00:52 +00:00
Michael Engel
62a7765ad3
Added OCI support in list models
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:11 +01:00
Michael Engel
6a842d419a
Added URL and local file integration with model store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:11 +01:00
Michael Engel
7c07c1d792
Added huggingface integration with model store
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:11 +01:00
Michael Engel
62b320660d
Raise exception instead of sys.exit on download failure
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:11 +01:00
Michael Engel
885e8deda2
Enabled the use of dash instead of colon in filenames and directories
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:11 +01:00
Michael Engel
4dd6b66723
Added model store and ollama integration
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 11:26:08 +01:00
Michael Engel
927f10b0e5
Added --use-model-store CLI option
...
Added new --use-model-store CLI option with False as default. Also, updated
the ModelFactory to use that flag and set the store member of models. This
will be used in subsequent commits.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-03-05 10:01:39 +01:00
Eric Curtin
2af9f0059f
Merge pull request #596 from maxamillion/fedora-rocm
...
Add ramalama image built on Fedora using Fedora's rocm packages
2025-03-04 23:53:16 +00:00
Adam Miller
a79f912ea7
ignore rocm-fedora images from github image build ci workflow
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-03-04 17:05:32 -06:00
Daniel J Walsh
2a10dedc87
Merge pull request #901 from kush-gupt/main
...
Detect & get info on hugging face repos, fix sizing of symlinked directories
2025-03-04 11:32:59 -05:00
Eric Curtin
a15928bb74
Merge pull request #909 from containers/ramalama-serve-core
...
Add new ramalama-*-core executables
2025-03-04 16:30:31 +00:00
Daniel J Walsh
2dedd92c5e
Merge pull request #913 from containers/re-introduce-emoji-prompts
...
Reintroduce emoji prompts
2025-03-04 11:04:29 -05:00
Adam Miller
5a253fcb20
remove no longer used ARG
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-03-04 08:07:47 -06:00
Eric Curtin
4b1b4f4bc4
Add new ramalama-*-core executables
...
ramalama-serve-core is intended to act as a proxy and implement
multiple-models. ramalama-client-core is intended to act as an OpenAI
client. ramalama-run-core is intended to act as ramalama-serve-core +
ramalama-client-core; both processes will die on completion of
ramalama-run-core.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-04 14:00:50 +00:00
Eric Curtin
e35920d4d5
Reintroduce emoji prompts
...
We fixed most of the bugs around UTF-8, I hope! UTF-8 is not
straightforward in C/C++.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-04 11:45:24 +00:00
Adam Miller
b28b3b84f2
consolidate back to a single image
...
The Fedora 42 ROCm stack is a little over 3.5G smaller than the
Fedora 41 ROCm stack in packaging.
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-03-03 16:06:39 -06:00
Eric Curtin
f8f020abe5
Merge pull request #910 from containers/kompute2vulkan
...
Build a non-kompute Vulkan container image
2025-03-03 20:52:46 +00:00
Adam Miller
f9b43a176a
rebase on Fedora 42
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-03-03 14:27:35 -06:00
Eric Curtin
43b9ab5d5f
Build a non-kompute Vulkan container image
...
There's no image to pull to play around with this backend right
now.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-03 19:15:28 +00:00
Daniel J Walsh
a7d9d82600
Merge pull request #908 from containers/utf-8
...
Update llama.cpp
2025-03-03 12:26:55 -05:00
Brian M
c6e36e9c41
Merge pull request #907 from containers/rm-env-var
...
Use python variable instead of environment variable
2025-03-03 10:00:09 -05:00
Eric Curtin
126cf8744d
Update llama.cpp
...
This version of llama.cpp and linenoise.cpp has UTF-8 support
properly implemented.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-03 14:57:14 +00:00
Eric Curtin
d07e8d096d
Use python variable instead of environment variable
...
environment variables are more global than we need.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-03-03 12:43:42 +00:00
Eric Curtin
3b0f28569b
Merge pull request #902 from containers/macos-lib-fix
...
Added support for mac cpu and clear warning message
2025-03-03 11:24:31 +00:00
Eric Curtin
2ead1b121e
Merge pull request #903 from alaviss/patch-1
...
readme: fix artifactory link
2025-03-03 11:09:01 +00:00
alaviss
d9567fa71d
readme: fix artifactory link
...
The previous link was to someone with the handle `artifactory`, not JFrog Artifactory.
Signed-off-by: Leorize <leorize+oss@disroot.org>
2025-03-03 03:11:15 -06:00
Brian Mahabir
fb775252f4
Added support for macOS CPU for Apple silicon and clear warning message
...
Signed-off-by: Brian Mahabir <56164556+bmahabirbu@users.noreply.github.com>
2025-03-03 00:12:04 -05:00
Kush Gupta
9ceb962263
Merge pull request #9 from kush-gupt/hf-repo-detection
...
detect hf repos, fix get_size for directories
2025-03-02 20:58:17 -05:00
Kush Gupta
992a3cd865
safeguard access to siblings in repo_info
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-03-02 20:30:02 -05:00
Kush Gupta
14093209ce
detect hf repos, fix get_size for directories
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-03-02 20:19:24 -05:00
Eric Curtin
1e61963d48
Merge pull request #897 from benoitf/fix-iso8601
...
fix: handling of date with python 3.8/3.9/3.10
2025-02-28 16:26:12 +00:00
Florent Benoit
bdb7ae58a3
fix: handling of date with python 3.9/3.10
...
use a function working on 3.9+
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-28 17:06:08 +01:00
Eric Curtin
dc25be9c2a
Merge pull request #894 from rhatdan/readme
...
Update the README.md to point people at ramalama.ai web site
2025-02-27 15:11:47 +00:00
Daniel J Walsh
5588c2e562
Update the README.md to point people at ramalama.ai web site
...
We need to start updating the web site with blog pointers and release
announcements.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-27 04:49:06 -10:00
Daniel J Walsh
70cfebfb56
Merge pull request #888 from containers/fix-bench
...
benchmark failing because of lack of flag
2025-02-26 22:38:05 -05:00
Daniel J Walsh
ff258ae51f
Merge pull request #891 from containers/demo
...
Switch from tiny to smollm:135m
2025-02-26 22:37:26 -05:00
Eric Curtin
c2f81c54cd
benchmark failing because of lack of flag
...
Specifically privileged, because it's not present in the args
object.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-26 21:05:06 +00:00
Eric Curtin
b027740e42
Switch from tiny to smollm:135m
...
This is probably a consequence of my slow network, but I switched
to smollm:135m, it's easier for demos. tiny was taking too long
to download.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-26 19:43:34 +00:00
Daniel J Walsh
4acfc6b662
Merge pull request #889 from engelmi/inject-config-to-cli-functions
...
Inject config to cli functions
2025-02-26 12:25:43 -05:00
Michael Engel
3622340635
Apply formatting and linting to unit tests
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-26 17:42:30 +01:00
Michael Engel
5957153637
Added unit tests for config functions
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-26 17:42:30 +01:00
Michael Engel
181d4959dd
Provide module for configuration
...
In cli.py we already load and merge configuration from various sources
and set defaults in the load_and_merge_config(). However, we still define
defaults when getting config values in various places.
In order to streamline this, the merged config is being provided by a
dedicated config.py module. Also, access to values is changed from .get
to access by index since a missing key is a bug and should throw an error.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-26 17:42:30 +01:00
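The difference between .get with a local default and access by index, as described in the commit above, can be illustrated with a small sketch. Only the name load_and_merge_config comes from the commit message; its signature here, the DEFAULTS keys and the CONFIG name are assumptions made for illustration.

```python
# Hypothetical defaults; the real project's keys and values may differ.
DEFAULTS = {"engine": "podman", "port": "8080"}


def load_and_merge_config(*sources):
    """Merge config sources over the defaults; later sources win (assumed signature)."""
    merged = dict(DEFAULTS)
    for source in sources:
        merged.update(source)
    return merged


# One merged config, built once and shared by every caller.
CONFIG = load_and_merge_config({"port": "9090"})

port = CONFIG["port"]      # overridden by the source
engine = CONFIG["engine"]  # filled in from DEFAULTS

# CONFIG.get("missing", "fallback") would silently hide a missing default,
# whereas CONFIG["missing"] raises KeyError and exposes the bug.
```

Keeping the defaults in one place means a missing key points at a gap in the defaults rather than being papered over at each call site.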
Daniel J Walsh
56e3a71a58
Merge pull request #884 from containers/temp-rm-emoji-usage
...
Remove emoji usage until linenoise.cpp and llama-run are compatible
2025-02-25 20:57:10 -05:00
Adam Miller
a21e1e53b1
suppress shellcheck issue with source=/etc/os-release
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-02-25 19:03:08 -06:00
Adam Miller
7e0b5d5095
collapse to a single containerfile, refactor build scripts to accommodate
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-02-25 18:48:37 -06:00
Adam Miller
0b02681cf2
move source to main() and clean up conditional
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-02-25 18:41:14 -06:00
Adam Miller
61a3f0ac2b
update to handle vulkan/blas packages for fedora
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-02-25 18:41:14 -06:00
Adam Miller
549d2eaa4f
Fedora rocm images
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-02-25 18:41:14 -06:00
Eric Curtin
8eb9cf2930
Remove emoji usage until linenoise.cpp and llama-run are compatible
...
Less eye candy, but at least this works; backspaces, for example,
were broken. Also split function into multiple functions, it was
getting meaty.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-25 15:37:21 +00:00
Eric Curtin
16d95effbf
Merge pull request #882 from engelmi/move-model-input-prune-to-factory
...
Moved pruning protocol from model to factory
2025-02-25 15:10:55 +00:00
Eric Curtin
65984d1ddb
Merge pull request #881 from kush-gupt/main
...
Add Ollama to CI and system tests for its caching
2025-02-25 15:04:54 +00:00
Michael Engel
fc75d9f593
Moved pruning protocol from model to factory
...
By moving the pruning of the protocol from the model input to
the model_factory and encapsulating it in a dedicated function,
unit tests can be written more easily.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-25 15:48:31 +01:00
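A dedicated pruning function of the kind the commit above describes might look like the sketch below. The function name prune_protocol and the prefix list are hypothetical; the commit message does not say which protocols are handled or how.

```python
# Hypothetical list of transport prefixes; the project's real list may differ.
KNOWN_PROTOCOLS = ("ollama://", "huggingface://", "hf://", "oci://")


def prune_protocol(model):
    """Strip a known protocol prefix from the model input, if one is present."""
    for prefix in KNOWN_PROTOCOLS:
        if model.startswith(prefix):
            return model[len(prefix):]
    # No known prefix: return the input unchanged.
    return model
```

Because the function is pure (string in, string out), it can be unit tested without constructing a model object, which is the benefit the commit message points to.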
Eric Curtin
3721dd9d0c
Merge pull request #879 from containers/dnf
...
The package available via dnf is in a good place
2025-02-25 14:43:44 +00:00
Kush Gupta
7bdc739d32
replace logname (which needs tty) with whoami
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-25 09:42:38 -05:00
Kush Gupta
ddca454c3f
replace hardcoded runner user with whatever user is running the script
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-25 09:39:52 -05:00
Kush Gupta
d863cc998e
script os agnostic ollama install
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-25 09:22:14 -05:00
Kush Gupta
33cf7f6965
Merge branch 'containers:main' into main
2025-02-25 09:06:54 -05:00
Eric Curtin
14a876d544
The package available via dnf is in a good place
...
Default to that on platforms that have dnf; if it fails for
whatever reason, fall back to this script.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-25 13:33:58 +00:00
Kush Gupta
163212b0a1
Simplify check for manifest path
...
Co-authored-by: Michael Engel <mengel@redhat.com>
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-25 07:43:54 -05:00
Eric Curtin
4b14026a26
Merge pull request #880 from containers/vllm
...
Use vllm-openai upstream image
2025-02-25 10:22:57 +00:00
Eric Curtin
3c3b26295c
Merge pull request #878 from containers/check-for-utf8
...
Check if terminal is compatible with emojis before using them
2025-02-25 09:58:20 +00:00
Kush Gupta
f489022d81
Merge pull request #7 from kush-gupt/ollama-cache-tests
...
This adds improvements to the logic for detecting existing Ollama caches and adds a system test to verify the cache functionality. The CI environment is updated to include Ollama.
2025-02-24 20:22:34 -05:00
Kush Gupta
01fb831f27
fix linting
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-24 20:04:55 -05:00
Kush Gupta
033522e70a
remove os.getlogin for TTY issues, install ollama in CI and add ollama cache tests
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-24 19:59:47 -05:00
Eric Curtin
f237011618
Use vllm-openai upstream image
...
The one we are currently using is old and doesn't have .gguf
compatibility.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-25 00:37:54 +00:00
Eric Curtin
94c5e8034f
Check if terminal is compatible with emojis before using them
...
Just in case it doesn't.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-24 22:54:49 +00:00
Eric Curtin
00839ee10f
Merge pull request #875 from rhatdan/version
...
Bump to 0.6.2
2025-02-24 14:50:41 +00:00
Daniel J Walsh
b24e933f8e
Bump to 0.6.2
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-24 09:30:02 -05:00
Daniel J Walsh
967c521595
Merge pull request #873 from benoitf/RAMALAMA-871
...
fix: use iso8601 for JSON modified field
2025-02-24 07:26:08 -05:00
Eric Curtin
cb4ea96b17
Merge pull request #856 from rhatdan/kube
...
Fix up handling of image selection on generate
2025-02-24 12:20:43 +00:00
Florent Benoit
3082ad9cdf
fix: use iso8601 for JSON modified field
...
ensure all dates are using iso8601 format (and in JSON output)
and then use the humanize field for the CLI output
fixes https://github.com/containers/ramalama/issues/871
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-24 12:49:13 +01:00
Daniel J Walsh
ffc8eba1da
Fix up handling of image selection on generate
...
Also fall back to trying OCI images on ramalama run and serve.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-24 06:41:54 -05:00
Daniel J Walsh
7d304a4d51
Merge pull request #872 from benoitf/RAMALAMA-783
...
feat: display emoji of the engine for the run in the prompt
2025-02-24 06:13:49 -05:00
Daniel J Walsh
e9c47dccad
Merge pull request #874 from engelmi/added-model-factory
...
Added model factory
2025-02-24 06:12:11 -05:00
Michael Engel
996e6f551c
Created abstract base class for models
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-24 10:11:12 +01:00
Michael Engel
149086e043
Added unit tests for new model factory
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-24 10:11:12 +01:00
Michael Engel
ca499a9bba
Moved creating model based on cli input to factory
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-24 10:11:12 +01:00
Florent Benoit
4058f81590
feat: display emoji of the engine for the run in the prompt
...
fixes https://github.com/containers/ramalama/issues/783
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-23 22:01:05 +01:00
Daniel J Walsh
2912913036
Merge pull request #870 from benoitf/RAMALAMA-869
...
chore: do not format size for --json export in list command
2025-02-23 15:36:24 -05:00
Florent Benoit
c70c9e245e
chore: do not format size for --json export in list command
...
fixes https://github.com/containers/ramalama/issues/869
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-23 21:20:53 +01:00
Daniel J Walsh
f7a3e635d7
Merge pull request #831 from containers/ci-fixes
...
Make CI build all images
2025-02-23 05:43:35 -05:00
Eric Curtin
bfe91e3c2d
Make CI build all images
...
To ensure they all continue to build and remain of reasonable size.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-22 13:48:50 +00:00
Eric Curtin
c5054f1d29
Merge pull request #864 from rhatdan/cuda
...
Revert back to 12.6 version of cuda
2025-02-20 16:36:56 +00:00
Daniel J Walsh
bc5f35a6a0
Revert back to 12.6 version of cuda
...
This is breaking workloads on Fedora 41 at this time.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-20 11:32:30 -05:00
Eric Curtin
6c756d91d8
Merge pull request #862 from containers/typo
...
Change rune to run
2025-02-20 15:59:08 +00:00
Eric Curtin
07d0b6d909
Merge pull request #863 from containers/fix-macos-podman-acceleration
...
Fix macOS GPU acceleration via podman
2025-02-20 15:50:06 +00:00
Eric Curtin
a705da6c8b
Fix macOS GPU acceleration via podman
...
We should always have acceleration on when running on macOS. There
is a possible case where one may not want to use acceleration on
macOS: someone who is hell-bent on using podman without krunkit.
But for now, just turn it on regardless.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-20 15:35:34 +00:00
Eric Curtin
eb0b2381d3
Change rune to run
...
Spotted this typo during demo
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-20 14:37:31 +00:00
Eric Curtin
13a1ed8058
Merge pull request #861 from rhatdan/docs
...
Define Environment variables to use
2025-02-20 11:29:05 +00:00
Daniel J Walsh
48d9d8765a
Define Environment variables to use
...
Fixes: https://github.com/containers/ramalama/issues/860
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-19 16:59:31 -05:00
Daniel J Walsh
367d658246
Merge pull request #859 from benoitf/DESKTOP-836
...
chore: add alias from llama-2 to llama2
2025-02-19 16:31:01 -05:00
Florent Benoit
09d1717270
chore: add alias from llama-2 to llama2
...
fixes https://github.com/containers/ramalama/issues/836
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-19 21:58:20 +01:00
Daniel J Walsh
51c85a50a5
Merge pull request #855 from rhatdan/demos
...
Add demos script to show the power of RamaLama
2025-02-19 14:46:12 -05:00
Daniel J Walsh
bdad361ea7
Add demos script to show the power of RamaLama
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-19 13:13:14 -05:00
Eric Curtin
28498f82a4
Merge pull request #840 from containers/network-tests
...
Some tests around --network, --net options
2025-02-19 14:06:29 +00:00
Eric Curtin
63999b323c
Some tests around --network, --net options
...
For run and serve commands.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-19 13:33:40 +00:00
Eric Curtin
f97761b1cc
Merge pull request #854 from containers/add-renovate-json
...
Introduce basic renovate.json file
2025-02-19 10:51:41 +00:00
Giulia Naponiello
1f9bdafdb5
Introduce basic renovate.json file
...
A newly introduced renovate.json configures renovate to automate
these updates that used to be performed manually:
- https://github.com/containers/ramalama/pull/746
- https://github.com/containers/ramalama/pull/816
Signed-off-by: Giulia Naponiello <gnaponie@redhat.com>
2025-02-19 10:01:00 +01:00
Daniel J Walsh
042f035e46
Merge pull request #851 from rhatdan/version
...
Bump to 0.6.1
2025-02-18 09:48:11 -05:00
Daniel J Walsh
ce24014c29
Merge pull request #849 from stefwalter/install-fedora
...
Include instructions for installing on Fedora 42+
2025-02-18 09:42:17 -05:00
Stef Walter
88e084ca44
Include instructions for installing on Fedora 42+
...
This is the easiest and most secure way to install Ramalama.
Signed-off-by: Stef Walter <swalter@redhat.com>
2025-02-18 09:38:53 -05:00
Daniel J Walsh
99ecb47b26
Bump to 0.6.1
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-18 09:33:33 -05:00
Daniel J Walsh
478cab5155
Merge pull request #843 from rhatdan/pull
...
Allowing modification of pull policy
2025-02-18 09:32:34 -05:00
Daniel J Walsh
e754f1d8fa
Merge pull request #850 from containers/rm-license-header
...
Remove LICENSE header from gpu_detector.py
2025-02-18 09:25:04 -05:00
Daniel J Walsh
678c15690a
Allowing modification of pull policy
...
Sometimes users want to test with locally modified images, rather than
pulling from the registry.
If you make your own quay.io/ramalama/ramalama:latest image, then
Podman/Docker currently pull the image from the registry.
If you change to --pull=never or --pull=missing, then this will not
happen.
You might also want to set this in ramalama.conf.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-18 09:11:29 -05:00
Eric Curtin
9b652e9cff
Remove LICENSE header from gpu_detector.py
...
It doesn't make sense to have a license header just for this file.
Remove it; it is easier to maintain a single license file.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-18 13:55:26 +00:00
Daniel J Walsh
99cfd65870
Merge pull request #848 from rhatdan/nvidia
...
Fix ramalama info to display NVIDIA and amd GPU information
2025-02-18 08:44:31 -05:00
Daniel J Walsh
78c6824e86
Fix ramalama info to display NVIDIA and amd GPU information
...
Fixes: https://github.com/containers/ramalama/issues/822
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-18 08:18:15 -05:00
Daniel J Walsh
5452a65b0f
Merge pull request #847 from containers/dryrun
...
Just one add_argument call for --dryrun/--dry-run
2025-02-18 06:53:28 -05:00
Eric Curtin
dbf745dac9
Merge pull request #846 from kush-gupt/main
...
Add system tests to pull from the Hugging Face cache
2025-02-18 10:48:07 +00:00
Eric Curtin
4cab7b9719
Just one add_argument call for --dryrun/--dry-run
...
Just alias a single function call, rather than calling the function
twice
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-18 10:46:10 +00:00
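The aliasing described in this commit can be sketched with the standard library `argparse`, where a single `add_argument` call accepts several option strings. This is a minimal illustration rather than RamaLama's actual parser; `build_parser`, the `prog` name and the help text are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical helper; only the two option spellings come from the commit.
    parser = argparse.ArgumentParser(prog="example")
    # One call registers both spellings, and they share a single destination.
    parser.add_argument(
        "--dryrun",
        "--dry-run",
        dest="dryrun",
        action="store_true",
        help="show the command that would run without executing it",
    )
    return parser


args_a = build_parser().parse_args(["--dryrun"])
args_b = build_parser().parse_args(["--dry-run"])
```

Both spellings set the same `dryrun` attribute, so the rest of the program only needs to check one name.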
Eric Curtin
f7c230277d
Merge pull request #685 from mkesper/main
...
Unify CLI options (verbosity, version)
2025-02-18 10:21:23 +00:00
Michael Kesper
2eab097982
Clarify second dryrun parser argument
...
See #684
Signed-off-by: Michael Kesper <mkesper@web.de>
2025-02-18 00:33:23 +01:00
Michael Kesper
ffbf6766ea
Add --quiet option for pulls
...
Related: #684
Signed-off-by: Michael Kesper <mkesper@web.de>
2025-02-18 00:33:23 +01:00
Michael Kesper
536efce9ea
Remove --version flag
...
`version` is already a subcommand.
Signed-off-by: Michael Kesper <mkesper@web.de>
2025-02-18 00:33:23 +01:00
Michael Kesper
565f9cad05
Move --quiet flag to common options
...
Add --quiet as mutually exclusive with --debug.
Maybe a --verbose switch could also be added.
Related: #684
Signed-off-by: Michael Kesper <mkesper@web.de>
2025-02-18 00:33:23 +01:00
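A minimal sketch of making `--quiet` and `--debug` mutually exclusive with the standard library `argparse`. The helper name and `prog` value are hypothetical, and the real option definitions may differ.

```python
import argparse


def build_common_parser() -> argparse.ArgumentParser:
    # Hypothetical helper illustrating the idea only.
    parser = argparse.ArgumentParser(prog="example")
    # argparse rejects command lines that pass more than one option of a group.
    verbosity = parser.add_mutually_exclusive_group()
    verbosity.add_argument("--debug", action="store_true", help="verbose output")
    verbosity.add_argument("--quiet", action="store_true", help="reduced output")
    return parser


quiet_args = build_common_parser().parse_args(["--quiet"])
```

Passing both options makes `parse_args` print a usage error and exit with status 2.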
Kush Gupta
d3dee296cf
Merge pull request #5 from kush-gupt/hf-cache-tests
...
Add test function to check hf-cli and tests for pulling from hf cache
2025-02-17 15:35:12 -05:00
Kush Gupta
3c97b007ab
cleanup model from hf cache
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-17 15:08:20 -05:00
Kush Gupta
80246ba52f
add function to check hf-cli and tests for pulling from hf cache
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-17 14:47:11 -05:00
Daniel J Walsh
6f7384d466
Merge pull request #841 from benoitf/fix-links
...
chore: fix links of llama.cpp repository
2025-02-17 08:46:01 -05:00
Florent Benoit
c9e2b20663
chore: fix links of llama.cpp repository
...
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-17 14:13:00 +01:00
Daniel J Walsh
d8ee205eb0
Merge pull request #821 from rhatdan/privs
...
Fix handling of --privileged flag
2025-02-17 08:11:33 -05:00
Eric Curtin
2dce610a36
Merge pull request #835 from rhatdan/cleanup
...
Fix up man page help verification
2025-02-17 12:56:50 +00:00
Daniel J Walsh
c0f3df3839
Fix handling of --privileged flag
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-17 07:27:34 -05:00
Daniel J Walsh
68974a60d8
Fix up man page help verification
...
Also don't codespell *.7 and *.patch files
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-17 07:19:24 -05:00
Daniel J Walsh
d371e0fb30
Merge pull request #830 from containers/network-fix
...
Make serve by default expose network
2025-02-17 06:35:02 -05:00
Daniel J Walsh
4f4f0e668e
Merge pull request #833 from kush-gupt/main
...
HuggingFace Cache Implementation
2025-02-17 06:05:42 -05:00
Daniel J Walsh
c6a91af8b0
Merge pull request #819 from rhatdan/entrypoint
...
Add entrypoint container images
2025-02-17 06:03:58 -05:00
Daniel J Walsh
53a599496c
Merge pull request #834 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1739751568
2025-02-17 05:41:49 -05:00
renovate[bot]
6f72e47a2b
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1739751568
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-02-17 03:09:27 +00:00
Kush Gupta
1670d7c1fa
fixing formatting
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-16 20:11:24 -05:00
Kush Gupta
9986e48d60
fixes and refactor suggested by sourcery-ai
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-16 20:09:26 -05:00
Kush Gupta
d4f10a9b55
Merge pull request #3 from kush-gupt/hf-cache
...
HF cache implementation
2025-02-16 19:47:14 -05:00
Kush Gupta
9dd9d74844
revert hf cli pull commands
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-16 19:32:12 -05:00
Kush Gupta
9f04a36547
format fix
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-16 19:27:52 -05:00
Kush Gupta
1e34d34a85
Merge branch 'containers:main' into hf-cache
2025-02-16 19:14:44 -05:00
Kush Gupta
5f5bb39d9d
hf cache implementation
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-16 19:14:01 -05:00
Eric Curtin
b62a62b914
Merge pull request #832 from containers/apt-get2apt
...
Switch apt-get to apt
2025-02-16 22:00:12 +00:00
Eric Curtin
678ae4b79a
Switch apt-get to apt
...
There was a request to switch to the newer apt.
Co-authored-by: Chris Paquin <cpaquin@redhat.com>
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-16 17:25:07 +00:00
Eric Curtin
fe00f19a72
Make serve by default expose network
...
But the default for the rest should be to encapsulate the network
in the container. Also deduplicate code.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-16 16:21:06 +00:00
Eric Curtin
001d7abe8b
Merge pull request #827 from bupd/patch-1
...
README: Fix typo
2025-02-15 23:04:29 +00:00
Prasanth Baskar
862def1ed8
Fix typo in Readme
...
Signed-off-by: bupd <bupdprasanth@gmail.com>
2025-02-16 04:05:43 +05:30
Daniel J Walsh
b72b002f0d
Merge pull request #824 from engelmi/pin-dev-dependencies-to-major-version
...
Pin dev dependencies to major version and improve formatting + linting
2025-02-15 14:43:29 -05:00
Daniel J Walsh
4aa9007e61
Merge pull request #826 from kush-gupt/main
...
README: fix inspect command description
2025-02-15 14:39:00 -05:00
Kush Gupta
cc22556d98
README: fix inspect command description
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-15 14:11:26 -05:00
Michael Engel
4c5d0370a5
Applied formatting changes from black and isort
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-15 15:01:14 +01:00
Michael Engel
c21764e818
Introduced isort in the format make targets
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-15 15:00:56 +01:00
Michael Engel
73a74bea19
Refined formatting and linting
...
Split make target lint into lint, format and check-format
and updated the CI steps accordingly. Also moved configuration
of black to pyproject.toml and flake8 to .flake8 file.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-15 15:00:56 +01:00
Michael Engel
b31482cc8d
Pin versions of the development tools to major version
...
By pinning the version of the development tools, the risk of
accidental upgrades and the breaking changes they bring is mitigated.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-15 14:56:13 +01:00
Daniel J Walsh
b8e779c5aa
Merge pull request #820 from cgruver/arg-work
...
Align runtime arguments with run, serve, bench, and perplexity
2025-02-14 15:48:40 -05:00
Charro Gruver
25d07ae557
correct missing args in perplexity doc.
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-14 14:52:20 -05:00
Charro Gruver
eb124986d4
correct missing args in perplexity doc. Align add_argument style for consistency.
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-14 14:52:12 -05:00
Charro Gruver
a44616ae89
Rebase - Align runtime arguments with run, serve, bench, and perplexity
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-14 14:49:03 -05:00
Daniel J Walsh
4d209b92f4
Add entrypoint container images
...
Install podman-remote and ramalama so we can use ramalama
from within a container.
$ podman run --env XDG_RUNTIME_DIR -v $HOME:$HOME -v /run/user:/run/user --userns=keep-id -ti --privileged quay.io/ramalama/ramalama ramalama run granite
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-14 14:33:32 -05:00
Daniel J Walsh
dfa16ea26e
Merge pull request #817 from containers/bugfix-container_manager
...
Encountered a bug where this function was returning -1
2025-02-14 14:28:08 -05:00
Michael Engel
b1dd195b50
Merge pull request #818 from engelmi/remove-error-wrapping-in-urlopen
...
Removed error wrapping in urlopen
2025-02-14 16:16:19 +01:00
Michael Engel
d6f2703920
Removed error wrapping in urlopen
...
By wrapping the error raised by urlopen() in an IOError, the
caller cannot accurately handle that error anymore, as it is missing
the HTTP response code for HTTP errors, for example. Instead, we do
not handle any error at all in the HTTPClient for urlopen and let
the error bubble up. This way a caller such as download_file()
can handle errors properly, for example by skipping retries for a 404.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-14 15:42:03 +01:00
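The benefit of letting the original exception propagate can be sketched as follows. This is an illustration under assumptions, not the project's `download_file()`: the function name, the injectable `opener` parameter, and the retry count are hypothetical.

```python
import time
import urllib.error
import urllib.request


def download_with_retries(url, opener=urllib.request.urlopen, retries=3, delay=0.0):
    """Fetch url, retrying transient failures but giving up at once on 404."""
    last_error = None
    for _ in range(max(1, retries)):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # The status code is available because nothing re-wrapped the error.
            if err.code == 404:
                raise
            last_error = err
        except urllib.error.URLError as err:
            # Connection-level failures carry no status code; retry them.
            last_error = err
        time.sleep(delay)
    raise last_error
```

`HTTPError` is a subclass of `URLError`, so it has to be caught first for the 404 check to run.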
Eric Curtin
587f2ff52f
Encountered a bug where this function was returning -1
...
Now the function will never return -1
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-14 14:16:37 +00:00
Daniel J Walsh
2f67b2e655
Merge pull request #816 from containers/open-gpu-targets
...
Upgrade from 6.3.1 to 6.3.2
2025-02-14 08:39:35 -05:00
Eric Curtin
27991047f5
Upgrade from 6.3.1 to 6.3.2
...
For ROCm and AMD drivers
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-14 10:59:28 +00:00
Daniel J Walsh
75435cb559
Merge pull request #814 from benoitf/RMLM-799
...
chore: replace RAMALAMA label by ai.ramalama
2025-02-13 16:21:33 -05:00
Daniel J Walsh
2b2eb4ebe4
Merge pull request #815 from benoitf/RMLM-813
...
chore: rewrite readarray function to make it portable
2025-02-13 16:11:59 -05:00
Florent Benoit
d56813751b
chore: replace RAMALAMA label by ai.ramalama
...
fixes https://github.com/containers/ramalama/issues/799
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-13 21:55:01 +01:00
Florent Benoit
4492eae79b
chore: rewrite readarray function to make it portable
...
fixes https://github.com/containers/ramalama/issues/813
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-13 21:54:02 +01:00
Daniel J Walsh
33fef31238
Merge pull request #809 from cgruver/arg-parse
...
Add run and serve arguments for --device and --privileged
2025-02-13 15:38:50 -05:00
Daniel J Walsh
dcabb945da
Merge pull request #810 from benoitf/RMLM-798
...
feat: add ramalama labels about the execution on top of container
2025-02-13 15:38:03 -05:00
Florent Benoit
94166f382d
feat: add ramalama labels about the execution on top of container
...
fixes https://github.com/containers/ramalama/issues/798
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-13 21:04:03 +01:00
Charro Gruver
07bc34132b
Changes to satisfy code review
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-13 19:54:15 +00:00
Daniel J Walsh
08c5e25d8d
Merge pull request #802 from containers/ngl-999
...
If ngl is not specified
2025-02-13 14:51:07 -05:00
Daniel J Walsh
0c5015ae62
Merge pull request #803 from rhatdan/recipes
...
Prepare containers to run with ai-lab-recipes
2025-02-13 14:49:25 -05:00
Daniel J Walsh
bf0d770f52
Prepare containers to run with ai-lab-recipes
...
Add two new scripts llama-server.sh and whisper-server.sh which
can handle environment variables from the ai-lab-recipes.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-13 14:29:44 -05:00
Charro Gruver
483f229677
Add --privileged and --device args to the appropriate docs
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-13 19:03:58 +00:00
Charro Gruver
b396ef84a1
Add run and serve arguments for --device and --privileged
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-13 18:45:03 +00:00
Eric Curtin
173cae3adf
Merge pull request #808 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1739449058
2025-02-13 17:07:37 +00:00
renovate[bot]
f72771f08c
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1739449058
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-02-13 16:33:10 +00:00
Adam Miller
b8ad2c950f
Merge pull request #806 from containers/open-gpu-targets
...
Add some more gfx values to the default list
2025-02-13 11:32:42 -05:00
Eric Curtin
25fe18469e
Add some more gfx values to the default list
...
To attempt to expand GPU support.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-13 14:55:49 +00:00
Daniel J Walsh
21436433ba
Merge pull request #800 from containers/network-mode2network
...
Change --network-mode to --network
2025-02-13 08:26:35 -05:00
Eric Curtin
7a0218a552
If ngl is not specified
...
ngl should default to 999 in all these cases.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-13 11:57:47 +00:00
Eric Curtin
da896823e3
Change --network-mode to --network
...
So users can easily understand what podman/docker CLI this maps
to.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-13 11:13:54 +00:00
Eric Curtin
8dd7ec404f
Merge pull request #795 from rhatdan/keepalive
...
Attempt to use build_llama_and_whisper.sh
2025-02-12 18:18:08 +00:00
Daniel J Walsh
7f1e680b09
Attempt to use build_llama_and_whisper.sh
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-12 13:03:23 -05:00
Eric Curtin
7b80bbc02f
Merge pull request #501 from rhatdan/rag
...
Add ramalama rag command
2025-02-12 17:27:48 +00:00
Daniel J Walsh
4e4e6708bf
Add ramalama rag command
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-12 11:43:09 -05:00
Daniel J Walsh
ea8d3ddbc1
Merge pull request #794 from containers/skip-dnf-if-not-available
...
Only run dnf commands on platforms that have dnf
2025-02-12 09:03:58 -05:00
Eric Curtin
d1c5bb0316
Only run dnf commands on platforms that have dnf
...
Other refactorings, fixes
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-12 12:55:55 +00:00
Daniel J Walsh
d22af111cf
Merge pull request #793 from containers/optimize-rename-variables
...
_engine is set to None or has a value
2025-02-12 07:48:09 -05:00
Daniel J Walsh
5bf0e81abb
Merge pull request #792 from rhatdan/keepalive
...
Install llama.cpp for mac and nocontainer tests
2025-02-12 07:32:02 -05:00
Eric Curtin
8186aed238
_engine is set to None or has a value
...
Simplification, fewer states of this variable
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-12 12:19:50 +00:00
Daniel J Walsh
dabe4aebd4
Install llama.cpp for mac and nocontainer tests
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-12 07:08:18 -05:00
Eric Curtin
c02cc6f5a7
Merge pull request #790 from rhatdan/optimize
...
Stash output from container_manager
2025-02-12 11:08:04 +00:00
Eric Curtin
96cd5a3325
Merge pull request #789 from rhatdan/keepalive
...
Add ramalama run --keepalive option
2025-02-12 10:02:23 +00:00
Daniel J Walsh
33da7f9c6d
Add ramalama run --keepalive option
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-11 21:16:51 -05:00
Adam Miller
d9e66301c1
Merge pull request #784 from kush-gupt/main
...
Check if file exists before sorting them into a list
2025-02-11 18:27:22 -05:00
Daniel J Walsh
75e5ce95e4
Stash output from container_manager
...
Fixes: https://github.com/containers/ramalama/issues/788
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-11 15:38:50 -05:00
Kush Gupta
db5ec02d06
attempt to remove broken symlinks when found during _list_models
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-11 14:49:51 -05:00
Kush Gupta
b4a7ae20cd
Merge branch 'containers:main' into main
2025-02-11 14:43:44 -05:00
Daniel J Walsh
d13d02bb30
Merge pull request #785 from rhatdan/init
...
Fix exiting on llama-serve when user hits ^c
2025-02-11 11:55:20 -05:00
Eric Curtin
7a1cf12920
Merge pull request #787 from rhatdan/readme
...
Add Security information to README.md
2025-02-11 16:48:16 +00:00
Daniel J Walsh
fa973e62fc
Add Security information to README.md
...
Also add information about ramalama.conf to the ramalama.1 man page.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-11 11:28:07 -05:00
Daniel J Walsh
9b723355ca
Merge pull request #786 from cgruver/device-env-var
...
Change RAMALAMA_GPU_DEVICE to RAMALAMA_DEVICE for AI accelerator device override
2025-02-11 10:55:30 -05:00
cgruver
51a0eb7eda
Change RAMALAMA_GPU_DEVICE to RAMALAMA_DEVICE for AI accelerator device override
...
Signed-off-by: cgruver <cgruver@clg.lab>
2025-02-11 10:36:36 -05:00
Daniel J Walsh
c71a1487f6
Fix exiting on llama-serve when user hits ^c
...
Fixes: https://github.com/containers/ramalama/issues/753
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-11 10:13:25 -05:00
Kush Gupta
4764f55839
Merge branch 'containers:main' into main
2025-02-11 09:24:17 -05:00
Kush Gupta
036ade3dba
Add conditional to check if path exists before sorting it when doing ramalama ls
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-11 09:23:38 -05:00
Eric Curtin
4a97011ca4
Merge pull request #773 from cgruver/gpu-device
...
Add env var RAMALAMA_GPU_DEVICE to allow for explicit declaration of the GPU device to use
2025-02-11 13:24:25 +00:00
Eric Curtin
6f0783c235
Merge pull request #782 from kush-gupt/main
...
Reuse Ollama cached image when available
2025-02-11 12:02:52 +00:00
Kush Gupta
1428cd8093
add conditional to check if the local cache layer exists
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-10 21:06:41 -05:00
Kush Gupta
0971ea8c86
Fix missing comma on default ollama cache list
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-10 17:27:38 -05:00
Kush Gupta
4589c3518c
Merge pull request #2 from kush-gupt/ollama-cache
...
Reuse of a local ollama cache
2025-02-10 17:17:35 -05:00
Kush Gupta
cebe5cb8a6
fix merge issues
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-10 17:05:21 -05:00
Kush Gupta
09a76f0dde
working ollama cache implementation
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-10 16:55:06 -05:00
Daniel J Walsh
56ecf60441
Merge pull request #781 from benoitf/readme-logo
...
chore: use absolute link for the RamaLama logo
2025-02-10 13:39:02 -05:00
Florent BENOIT
5168aba23d
chore: use absolute link for the RamaLama logo
...
allow proper rendering when README is rendered outside of GitHub
example: https://pypi.org/project/ramalama/
Signed-off-by: Florent Benoit <fbenoit@redhat.com>
2025-02-10 19:34:37 +01:00
Daniel J Walsh
14c9d3519c
Merge pull request #779 from rhatdan/VERSION
...
Bump to v0.6.0
2025-02-10 13:22:29 -05:00
Daniel J Walsh
ba2d283195
Merge pull request #780 from rhatdan/cleanup
...
Cleanup READMEs and man pages.
2025-02-10 13:22:13 -05:00
Daniel J Walsh
8fff12a2c0
Cleanup READMEs and man pages.
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-10 12:20:02 -05:00
Daniel J Walsh
8bf723590c
Bump to v0.6.0
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-10 12:09:29 -05:00
Daniel J Walsh
0d841ec2cd
Merge pull request #776 from engelmi/add-model-info-cli
...
Add model inspect cli
2025-02-10 12:07:01 -05:00
Daniel J Walsh
c5a15b0344
Merge pull request #778 from rhatdan/contrib
...
Use containers CODE-OF-CONDUCT.md
2025-02-10 12:06:39 -05:00
Daniel J Walsh
a28d9f4908
Merge pull request #648 from containers/ollama-com-proto
...
Parse https://ollama.com/library/ syntax
2025-02-10 12:06:21 -05:00
Daniel J Walsh
1ef19dbf73
Use containers CODE-OF-CONDUCT.md
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-10 11:41:15 -05:00
Michael Engel
7f8a046cd4
Added system tests for new inspect command
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-10 17:37:44 +01:00
Michael Engel
36bcc94ec3
Added CI step to check installed python files
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-10 17:37:44 +01:00
Michael Engel
183597cdcc
Fix: Set directory and filename in Model base class
...
The directory and filename of a model are determined by the
respective model implementation, e.g. Ollama or Huggingface.
If, however, these two fields are not defined in the model
base class, then accessing them for a specific model instance
might fail since they do not exist.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-10 17:37:44 +01:00
Michael Engel
496692cd18
Added inspect CLI command to display model info
...
AI Models are shipped with a lot of (meta) information such as
the used architecture, the chat template it requires and so on.
In order to make these available to the user, the new CLI command
inspect, with support for the --all and --json options, has been
implemented.
At the moment the GGUF file format, which includes the model as
well as the (meta) information in one file, is fully supported.
Other formats, where the model and information are stored in different
files, are not (yet) supported and only display basic information
such as the model name, path and registry.
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-02-10 17:37:44 +01:00
Daniel J Walsh
ffa5ae9b73
Merge pull request #777 from rhatdan/contrib
...
Add community documents
2025-02-10 11:26:28 -05:00
Eric Curtin
60c8d1b175
Parse https://ollama.com/library/ syntax
...
People search for ollama models using the web UI. This change
allows one to copy the URL from the browser and use it directly
with ramalama run.
Also pull the smaller "smollm" models to accelerate the builds.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-10 15:53:48 +00:00
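The URL handling described here could be sketched as a simple prefix rewrite. This is not the code from the commit: `normalize_model_ref` is a hypothetical name, and mapping onto an `ollama://` style reference is an assumption about the target form.

```python
OLLAMA_WEB_PREFIX = "https://ollama.com/library/"


def normalize_model_ref(ref: str) -> str:
    """Turn a browser URL into a transport-style reference (assumed form)."""
    if ref.startswith(OLLAMA_WEB_PREFIX):
        # Keep whatever follows the library path, e.g. "smollm".
        return "ollama://" + ref[len(OLLAMA_WEB_PREFIX):]
    # Anything else is passed through untouched.
    return ref


example = normalize_model_ref("https://ollama.com/library/smollm")
```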
Cara Delia
9edd4c005d
Add community documents
...
Update CONTRIBUTING.md
Create CODE-OF-CONDUCT.md
Create SECURITY.md
Replaces: https://github.com/containers/ramalama/pull/737
Originally written by caradelia (Cara Delia)
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-10 10:29:05 -05:00
Daniel J Walsh
159e0bf524
Merge pull request #775 from eltociear/patch-1
...
docs: update ramalama.1.md
2025-02-10 10:11:56 -05:00
Eric Curtin
acdee5bfc8
docs: update ramalama.1.md
...
Merge pull request #771 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.5-1738814488
Signed-off-by: Ikko Ashimine <ashimine_ikko_bp@tenso.com>
2025-02-11 00:10:03 +09:00
Daniel J Walsh
c86dbaae85
Merge pull request #768 from containers/progress-bar-2
...
There would be one case where this wouldn't work
2025-02-10 10:01:27 -05:00
Eric Curtin
ad4300f546
There would be one case where this wouldn't work
...
When accumulated_size has just been refreshed to zero. Also check that
size is greater than zero to avoid division by zero.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-10 14:04:29 +00:00
Charro Gruver
4d991a96a5
simplify env detection logic based on feedback from sourcery.ai
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-09 13:55:02 +00:00
Charro Gruver
7dcafac866
Add env var RAMALAMA_GPU_DEVICE to allow for explicit declaration of the GPU device to use
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-09 13:47:13 +00:00
Eric Curtin
8df50705a5
Merge pull request #771 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.5-1738814488
2025-02-09 13:30:48 +00:00
renovate[bot]
12faac1bbf
chore(deps): update registry.access.redhat.com/ubi9/ubi docker tag to v9.5-1738814488
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-02-09 12:28:05 +00:00
Eric Curtin
b31c162c38
Merge pull request #766 from hanthor/main
...
typo: Add quotes to intel-gpu argument in build llama and whisper script
2025-02-09 12:27:40 +00:00
James Reilly
162ad80476
typo: Add quotes to intel-gpu argument
...
Signed-off-by: James Reilly <jreilly1821@gmail.com>
2025-02-08 20:42:10 +05:30
Daniel J Walsh
738eda277d
Merge pull request #767 from containers/progress-bar
...
Progress bar fixes
2025-02-08 05:47:27 -05:00
Eric Curtin
d52ac5f407
Progress bar fixes
...
Clear the line on completion of the download, and don't add
accumulated_size twice.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-08 10:12:34 +00:00
Daniel J Walsh
b539762a64
Merge pull request #765 from rhatdan/security
...
Drop all capabilities and run with no-new-privileges
2025-02-07 21:42:34 -05:00
Eric Curtin
d0f912f65f
Merge pull request #749 from cgruver/detect-gpu
...
Detect Intel ARC GPU in Meteor Lake chipset
2025-02-07 21:35:02 +00:00
Daniel J Walsh
7e1b334297
Merge pull request #764 from containers/bug-fix-progress
...
update_progress only takes one parameter
2025-02-07 16:13:13 -05:00
Daniel J Walsh
aa09f6dd09
Merge pull request #763 from containers/krunkit-detection
...
Check if krunkit process is running with --all-providers
2025-02-07 16:12:58 -05:00
Daniel J Walsh
a0170183bd
Drop all capabilities and run with no-new-privileges
...
llama-run attempts to write to $HOME. Since we are running as
root inside of the container, the process tries to write to /root,
which has permissions 500, which means even root processes cannot
write there without CAP_DAC_OVERRIDE. Since we don't want to
give that capability, we modify HOME to be /tmp/.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-07 16:10:43 -05:00
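As a rough illustration of the hardening described in the entry above, the sketch below assembles a `podman run` argument list that drops all capabilities, sets no-new-privileges, and points HOME at /tmp. The function name `hardened_run_args` and the example image and command are hypothetical; this is not RamaLama's actual code, only a minimal sketch using standard podman flags.

```python
def hardened_run_args(image, command):
    """Build a `podman run` argument list with a reduced privilege set.

    Hypothetical helper for illustration; not taken from the RamaLama source.
    """
    return [
        "podman", "run", "--rm",
        "--cap-drop=all",                    # drop every Linux capability
        "--security-opt=no-new-privileges",  # forbid gaining privileges later
        "--env", "HOME=/tmp",                # /root (mode 500) is unwritable without CAP_DAC_OVERRIDE
        image,
        *command,
    ]


args = hardened_run_args("quay.io/ramalama/ramalama", ["llama-run", "/path/to/model"])
print(" ".join(args))
```

Setting HOME to a writable location avoids granting CAP_DAC_OVERRIDE just so the process can create files under its home directory.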
Eric Curtin
b3a1c52621
Check if krunkit process is running with --all-providers
...
Required for newer podman machine.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-07 21:00:48 +00:00
Eric Curtin
da453cfb50
update_progress only takes one parameter
...
Passed it two by mistake
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-07 20:55:06 +00:00
Charro Gruver
df6e7677fe
update check for Intel iGPU to use glob
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-07 20:38:31 +00:00
Daniel J Walsh
d6595388aa
Merge pull request #761 from cgruver/doc-fix
...
Remove reference to non-existent docs in CONTRIBUTING.md
2025-02-07 15:03:21 -05:00
Charro Gruver
053e437886
Remove reference to non-existent docs in CONTRIBUTING.md
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-07 19:38:56 +00:00
Daniel J Walsh
9d02f7d7b3
Merge pull request #717 from containers/update-progress-once-per-100
...
Update progress bar only once every 100ms
2025-02-07 09:50:51 -05:00
Eric Curtin
79c6d19091
Update progress bar only once every 100ms
...
To help with CPU usage
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-07 12:27:56 +00:00
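The throttling idea in the entry above (redraw the progress bar at most once every 100ms to reduce CPU usage) could be sketched as follows. The class `ThrottledProgress`, its parameters, and the output format are assumptions made for illustration, not the project's implementation.

```python
import time


class ThrottledProgress:
    """Redraw a progress line at most once per `min_interval` seconds (hypothetical sketch)."""

    def __init__(self, min_interval=0.1, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the behaviour can be tested
        self.last_draw = None
        self.draws = 0

    def update(self, done, total):
        """Return True if the line was redrawn, False if the call was skipped."""
        now = self.clock()
        finished = done >= total
        too_soon = self.last_draw is not None and now - self.last_draw < self.min_interval
        if too_soon and not finished:
            return False            # skip the redraw; the next call will catch up
        self.last_draw = now
        self.draws += 1
        print(f"\r{done}/{total} bytes", end="\n" if finished else "", flush=True)
        return True
```

The final update is always drawn so the bar does not stop short of completion even when it arrives inside the 100ms window.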
Daniel J Walsh
b5a11dc310
Merge pull request #748 from bmbouter/patch-1
...
Update README.md
2025-02-07 05:22:09 -05:00
bmbouter
20fb74ccc1
Update README.md
...
Pulp also has support for this (being a full container registry).
Signed-off-by: Brian Bouterse <bmbouter@gmail.com>
2025-02-06 14:21:06 -05:00
Charro Gruver
c9cc91a956
Detect Intel ARC GPU in Meteor Lake chipset
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-06 17:31:43 +00:00
Eric Curtin
1ab1908919
Merge pull request #746 from containers/rhoai
...
Update vLLM containers
2025-02-06 13:29:06 +00:00
Eric Curtin
839902410b
Update vLLM containers
...
From rhoai-2.17 -> rhoai-2.18
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-06 12:51:34 +00:00
Eric Curtin
3477a13ed1
Merge pull request #744 from rhatdan/pragmatic
...
Allow users to build RAG versus Docling images
2025-02-06 08:24:42 +00:00
Daniel J Walsh
5dfc878c20
Allow users to build RAG versus Docling images
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-05 16:17:31 -05:00
Daniel J Walsh
032efaeed9
Merge pull request #597 from rhatdan/pragmatic
...
Begin process of packaging PRAGmatic
2025-02-05 16:07:26 -05:00
Daniel J Walsh
c166c0a007
Begin process of packaging PRAGmatic
...
Building Pragmatic into a container image is fairly easy.
podman build --build-arg IMAGE=quay.io/ramalama/rocm -t quay.io/ramalama/rocm-rag container-images/pragmatic
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-05 14:07:56 -05:00
Daniel J Walsh
965bdf2540
Merge pull request #741 from containers/dryrun-fix
...
On macOS this was returning an incorrect path
2025-02-05 11:53:04 -05:00
Eric Curtin
abc065d070
On macOS this was returning an incorrect path
...
$ ramalama --dryrun run granite-code
llama-run -c 2048 --temp 0.8 /path/to/model
At least try and show the correct path.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-05 14:59:46 +00:00
Daniel J Walsh
a08b166909
Merge pull request #740 from containers/renovate/actions-checkout-4.x
...
[skip-ci] Update actions/checkout action to v4
2025-02-05 08:06:15 -05:00
renovate[bot]
2bde3845c9
[skip-ci] Update actions/checkout action to v4
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-02-05 12:47:23 +00:00
Daniel J Walsh
5d1a71e009
Merge pull request #738 from dougsland/freshinstalltest
...
github actions: ramalama install
2025-02-05 07:46:58 -05:00
Daniel J Walsh
efa0f9b13f
Merge pull request #707 from containers/ngl-default
...
Make the default of ngl be -1
2025-02-05 07:46:34 -05:00
Eric Curtin
3332c623b2
Make the default of ngl be -1
...
This means automatically assign a value, which may be 999 or 0
depending on hardware.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-05 12:24:40 +00:00
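A minimal sketch of the "-1 means pick automatically" behaviour described in the entry above. The function name `resolve_ngl` and the `gpu_available` flag are hypothetical, and treating every negative value as "auto" is an assumption; the commit only mentions -1.

```python
def resolve_ngl(ngl, gpu_available):
    """Resolve the number of GPU layers to offload.

    A negative value (the -1 default) means "choose automatically":
    999 (offload everything) when a GPU is usable, otherwise 0.
    Hypothetical helper; not taken from the RamaLama source.
    """
    if ngl >= 0:
        return ngl  # an explicit user choice always wins
    return 999 if gpu_available else 0


print(resolve_ngl(-1, gpu_available=True))
print(resolve_ngl(-1, gpu_available=False))
print(resolve_ngl(20, gpu_available=True))
```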
Douglas Schilling Landgraf
9f346b680a
github actions: ramalama install
...
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
2025-02-05 06:41:19 -05:00
Daniel J Walsh
03cbf9b5b7
Merge pull request #739 from containers/install-fixes
...
There's a comma in the list of files in install.sh
2025-02-05 05:14:11 -05:00
Eric Curtin
d792b458e2
There's a comma in the list of files in install.sh
...
It shouldn't be there. Also remove the -qq, it makes it feel
like the script isn't making progress as podman takes a while to
install.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-04 22:24:58 +00:00
Daniel J Walsh
10321b0ee0
Merge pull request #736 from cgruver/podman-farm-script
...
modify container_build.sh to add capability to use podman farm for multi-arch images
2025-02-04 16:48:52 -05:00
Charro Gruver
8e7dc3150c
update logic to exclude certain images from multi-arch builds
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-04 21:18:43 +00:00
Charro Gruver
71f26f4fc2
remove multi-arch-targets.list. Use the presence of .multi-arch in the image folder to trigger podman farm build
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-04 20:30:24 +00:00
Charro Gruver
cf8030faaa
add .multi-arch files to trigger podman farm build
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-04 20:27:41 +00:00
Charro Gruver
8a187c65d2
fix syntax error
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-04 20:25:09 +00:00
Charro Gruver
cdb7130f7f
simplify script for multi-arch support
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-04 20:19:32 +00:00
Daniel J Walsh
d5ada4d97d
Merge pull request #729 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1738643550
2025-02-04 14:48:27 -05:00
Charro Gruver
7ba4c953d1
Fix syntax error in Makefile
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-04 19:47:47 +00:00
Charro Gruver
e7495266ec
Modify Makefile to support multi-arch builds
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-04 19:44:27 +00:00
Charro Gruver
2193333029
modify container_build.sh to add capability to use podman farm for multi-arch images
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-04 19:36:30 +00:00
Daniel J Walsh
72131263b3
Merge pull request #735 from cgruver/podman-farm
...
Add docs for using podman farm to build multi-arch images
2025-02-04 14:28:45 -05:00
Charro Gruver
aea3e17684
incorporate Sourcery-AI suggestions
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-04 19:01:35 +00:00
Charro Gruver
1a5795a945
Add docs for using podman farm to build multi-arch images
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-04 18:38:36 +00:00
Eric Curtin
3fdcae16cf
Merge pull request #722 from jcajka/main
...
ramalama container: Make it possible to build basic container on all RHEL architectures
2025-02-04 17:48:22 +00:00
Daniel J Walsh
5e32fe9cd0
Merge pull request #733 from rhatdan/image
...
Honor RAMALAMA_IMAGE if set
2025-02-04 09:09:10 -05:00
Jakub Čajka
42f6cd9cf6
ramalama container: Make it possible to build basic container on all
...
RHEL architectures
Signed-off-by: Jakub Čajka <jcajka@redhat.com>
2025-02-04 14:49:52 +01:00
Eric Curtin
5b48748a10
Merge pull request #734 from rhatdan/network
...
Add --network-mode option
2025-02-04 13:48:29 +00:00
renovate[bot]
cbfc7251b2
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1738643550
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-02-04 13:37:40 +00:00
Daniel J Walsh
11fe7875fc
Honor RAMALAMA_IMAGE if set
...
Currently on my cuda laptop, if I set RAMALAMA_IMAGE to something,
ramalama ignores it and forces cuda image.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-04 08:21:43 -05:00
Joshua Stone
760b6fd341
Add --network-mode option
...
Signed-off-by: Joshua Stone <jostone@redhat.com>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-04 08:16:11 -05:00
Daniel J Walsh
341f962f8c
Merge pull request #724 from cgruver/add-intel-gpu
...
Add logic to build intel-gpu image to build_llama_and_whisper.sh
2025-02-04 08:06:50 -05:00
Daniel J Walsh
07aba4b509
Merge pull request #728 from lsm5/packit-epel
...
Packit: downstream jobs for EPEL 9,10
2025-02-04 08:02:23 -05:00
Daniel J Walsh
319f3860e2
Merge pull request #730 from containers/asahi-check
...
Check for apple,arm-platform in /proc
2025-02-04 08:01:00 -05:00
Eric Curtin
c9c4a605e6
Check for apple,arm-platform in /proc
...
In /proc/device-tree/compatible specifically, thanks Asahi Lina!
It's more portable across Fedora Asahi Remix and Ubuntu Asahi
Remix.
Also added env var to container image.
Co-authored-by: Asahi Lina <lina@asahilina.net>
Co-authored-by: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-04 12:36:28 +00:00
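The check described in the entry above can be sketched like this. The device-tree `compatible` property is a sequence of NUL-terminated strings, so the sketch splits on NUL bytes and looks for an exact match. The function name `is_apple_arm_platform` and the path parameter are hypothetical and exist only to make the example testable.

```python
from pathlib import Path


def is_apple_arm_platform(compatible_path="/proc/device-tree/compatible"):
    """Return True if the device tree lists "apple,arm-platform" (hypothetical sketch)."""
    try:
        raw = Path(compatible_path).read_bytes()
    except OSError:
        return False  # no device tree (e.g. x86) or the file is unreadable
    # The property holds several NUL-terminated strings back to back.
    return b"apple,arm-platform" in raw.split(b"\0")


print(is_apple_arm_platform())
```

Matching whole entries rather than a substring avoids false positives from longer compatible strings that merely contain the same text.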
Daniel J Walsh
0388554d50
Merge pull request #731 from containers/revert-674-network-enable-option
...
Revert "Add --network-mode option"
2025-02-04 07:16:00 -05:00
Eric Curtin
77d5ecb0a7
Revert "Add --network-mode option"
...
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-04 11:29:38 +00:00
Lokesh Mandvekar
3c187849f4
Packit: downstream jobs for EPEL 9,10
...
This commit enables downstream jobs for EPEL 9 and 10.
Since we're only updating epel branches downstream and not running tests on EPEL
targets, we can keep them as part of the `&fedora_targets` YAML anchor and not
include them separately.
Signed-off-by: Lokesh Mandvekar <lsm5@fedoraproject.org>
2025-02-04 15:24:48 +05:30
Charro Gruver
12d779dc57
Add logic to build intel-gpu image to build_llama_and_whisper.sh
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-02-03 23:24:01 +00:00
Daniel J Walsh
3a08829a18
Merge pull request #723 from kush-gupt/readme
...
README: add convert to commands list
2025-02-03 13:20:02 -05:00
Kush Gupta
06c05b3cb2
add convert to available commands
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-02-03 13:14:33 -05:00
Douglas Landgraf
dd93fd0eb4
Merge pull request #674 from rhjostone/network-enable-option
...
Add --network-mode option
2025-02-03 10:45:22 -05:00
Joshua Stone
dc408f8f75
Add --network-mode option
...
Signed-off-by: Joshua Stone <jostone@redhat.com>
2025-02-03 10:16:27 -05:00
Eric Curtin
9ab4d915b5
Merge pull request #719 from rhatdan/huggingface
...
Report error when huggingface-cli is not available
2025-02-03 12:43:30 +00:00
Daniel J Walsh
c9880bb4d8
Report error when huggingface-cli is not available
...
Fixes: https://github.com/containers/ramalama/issues/688
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-03 07:34:45 -05:00
Eric Curtin
82d34c0f23
Merge pull request #711 from containers/more-maintainers
...
Adding slp, engelmi, also
2025-02-03 09:47:28 +00:00
Eric Curtin
d14839ebb5
Merge pull request #715 from dougsland/makelint
...
Makelint
2025-02-03 02:59:30 +01:00
Eric Curtin
2e064ceb7e
Merge pull request #716 from containers/macos-fix
...
Fix macOS emoji compatibility with Alacritty
2025-02-03 02:58:41 +01:00
Eric Curtin
bd568fc791
Fix macOS emoji compatibility with Alacritty
...
The main fix here is to check LC_ALL also. It's required for this
platform combination. The rest is code simplifications and
refactorings. Fix to have consistent spacing (there was a mix of
double and single spacings in output based on log-level).
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-03 01:24:59 +00:00
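One way to picture the locale check mentioned in the entry above is the sketch below, which consults LC_ALL in addition to LANG before deciding whether emoji output is safe. The function name `locale_supports_utf8`, the exact variables checked, and the matching rule are assumptions, not the project's code.

```python
import os


def locale_supports_utf8(env=None):
    """Guess whether the terminal locale is UTF-8 (hypothetical sketch).

    Checks LC_ALL as well as LANG, since some terminal and platform
    combinations set only one of them.
    """
    env = os.environ if env is None else env
    for var in ("LC_ALL", "LANG"):
        value = env.get(var, "").lower()
        if "utf-8" in value or "utf8" in value:
            return True
    return False


print(locale_supports_utf8({"LC_ALL": "en_US.UTF-8"}))
```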
Eric Curtin
6706b2682a
Merge pull request #713 from dougsland/commonimprovements
...
common: general improvements
2025-02-03 01:42:59 +01:00
Douglas Schilling Landgraf
0454b7b675
CI/CD: create a job for make lint
...
Make lint is a good one to have.
Resolves: https://github.com/containers/ramalama/issues/714
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
2025-02-02 19:26:44 -05:00
Douglas Schilling Landgraf
15c857d15f
CI/CD: create a job for make lint
...
Make lint is a good one to have.
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
2025-02-02 19:23:41 -05:00
Douglas Schilling Landgraf
b0678aa656
common: general improvements
...
- if timeout happens we try 5 times before
sending Timeout error to users.
- improve user experience when timeout occurs
- Added console source tree for handling messages
Resolves: https://github.com/containers/ramalama/issues/693
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
2025-02-02 19:16:15 -05:00
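The retry behaviour described in the entry above (try 5 times before surfacing a timeout to the user) might look roughly like the following. The helper `call_with_retries` and its `delay` parameter are hypothetical; the commit does not say whether there is a pause between attempts.

```python
import time


def call_with_retries(func, attempts=5, delay=0.0):
    """Call `func`, retrying on TimeoutError up to `attempts` times (hypothetical sketch)."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except TimeoutError as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # optional pause before the next try
    # Every attempt timed out: report one clear error to the caller.
    raise TimeoutError(f"giving up after {attempts} attempts") from last_error
```

Chaining the final exception to the last underlying one keeps the original cause available for debugging while showing the user a single message.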
Eric Curtin
8b717f0a85
Adding slp, engelmi, also
...
Adding Sergio and Michael
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-02 22:51:31 +00:00
Daniel J Walsh
900eca6ecb
Merge pull request #706 from containers/codeowners
...
Use CODEOWNERS file for autoassign
2025-02-02 16:41:10 -05:00
Daniel J Walsh
c5c4418c0a
Merge pull request #710 from containers/rm-display-driver-info
...
We are displaying display driver info, scope creep
2025-02-02 16:36:46 -05:00
Eric Curtin
d590f7b09c
We are displaying display driver info, scope creep
...
We have to be mindful of maintenance of the codebase. Display
drivers have nothing to do with AI acceleration. Don't show
display info such as "Color LCD" :
{
    "Engine": {
        "Name": null
    },
    "GPUs": {
        "Detected GPUs": [
            {
                "Cores": "18",
                "GPU": "Apple M3 Pro",
                "Metal": "Metal 3",
                "Vendor": "Apple (0x106b)"
            },
            {
                "GPU": "Color LCD"
            }
        ],
        "INFO": "No errors"
    },
    "Image": "quay.io/ramalama/ramalama",
    "Runtime": "llama.cpp",
    "Store": "/Users/ecurtin/.local/share/ramalama",
    "UseContainer": false,
    "Version": "0.0.19"
}
Not sure about the "macOS detection covers AMD GPUs" code. We could
have external GPUs potentially on macOS but even in that case the
code seems illogical.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-02 17:51:40 +00:00
Douglas Schilling Landgraf
9a655f231a
Use CODEOWNERS file for autoassign
...
Let's make users and developers be aware of PRs/issues etc.
Resolves: https://github.com/containers/ramalama/issues/668
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
2025-02-02 07:38:19 -05:00
Daniel J Walsh
3032571edf
Merge pull request #659 from khumarahn/1
...
add --ngl to specify the number of gpu layers, and --keep-groups so podman has access to gpu
2025-02-01 23:28:46 -05:00
Daniel J Walsh
83579bbf1b
Merge pull request #704 from graystevens/patch-1
...
Update install.sh to include "gpu_detector.py"
2025-02-01 17:09:53 -05:00
Daniel J Walsh
ff819b7a2c
Merge pull request #702 from containers/renovate/registry.access.redhat.com-ubi9-ubi-9.x
...
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1736404036
2025-02-01 16:48:46 -05:00
Graham Stevens
1c4112123b
Update install.sh
...
Signed-off-by: Graham Stevens <331262+graystevens@users.noreply.github.com>
2025-02-01 21:48:40 +00:00
Eric Curtin
672c61d6dc
Merge pull request #703 from containers/tmp2TMP
...
This should be a global variable
2025-02-01 17:44:50 +01:00
Eric Curtin
bf652941c2
This should be a global variable
...
Not a local one
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-02-01 16:44:02 +00:00
renovate[bot]
bb5691f831
Update registry.access.redhat.com/ubi9/ubi Docker tag to v9.5-1736404036
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-02-01 15:42:11 +00:00
Eric Curtin
8611799ea2
Merge pull request #687 from volker48/fix-macos
...
fix error on macOS for M1 pro
2025-02-01 16:41:56 +01:00
Marcus McCurdy
7bdfb1f7ca
fix error with docker on macOS for M1 pro
...
Signed-off-by: Marcus McCurdy <mmccurdy@obsidiansecurity.com>
2025-02-01 10:00:08 -05:00
Alexey Korepanov
c9c777d959
add --keep-groups and --ngl options
...
Signed-off-by: Alexey Korepanov <kaikaikai@yandex.ru>
2025-02-01 12:47:09 +00:00
Daniel J Walsh
c2c955bb69
Merge pull request #701 from rhatdan/version
...
Bump to v0.5.5
2025-02-01 07:40:46 -05:00
Daniel J Walsh
97ceee9c82
Bump to v0.5.5
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-02-01 07:15:03 -05:00
Daniel J Walsh
4b3429002b
Merge pull request #670 from dougsland/gpudetector
...
Add ramalama gpu_detector
2025-02-01 07:03:37 -05:00
Daniel J Walsh
38974ffb4a
Merge pull request #690 from containers/install-local
...
Introduce a mode so one can install from git
2025-02-01 06:53:38 -05:00
Eric Curtin
f2ee46282d
Merge pull request #692 from maxamillion/rev_llama_cpp_aa6fb13
...
bump llama.cpp to latest release hash aa6fb13
2025-02-01 11:10:06 +01:00
Adam Miller
b855b0681d
bump llama.cpp to latest release hash aa6fb13
...
Signed-off-by: Adam Miller <admiller@redhat.com>
2025-01-31 16:02:43 -06:00
Eric Curtin
96dc553f8a
Introduce a mode so one can install from git
...
If you have git clone'd the project, you now have the option of
doing:
./install.sh -l
And it will install the version from the git repo.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-31 16:17:37 +00:00
Douglas Schilling Landgraf
8d27050da1
Add ramalama gpu_detector into info options
...
GPUDetector can be used in any part of the code.
However, it is also useful for users to run from the CLI:
ramalama info
Signed-off-by: Douglas Schilling Landgraf <dougsland@redhat.com>
Signed-off-by: Brian <bmahabir@bu.edu>
2025-01-31 10:28:22 -05:00
Eric Curtin
5b1f1a1022
Merge pull request #680 from kush-gupt/main
...
Pull the source model if it isn't already in local storage for the convert and push functions
2025-01-31 12:00:54 +01:00
Eric Curtin
961b6918a1
Merge pull request #683 from containers/revert-625-use-jinja-running-llama-run
...
Revert "Added --jinja to llama-run command"
2025-01-31 11:56:13 +01:00
Eric Curtin
45ba7e7e97
Revert "Added --jinja to llama-run command"
...
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-31 10:29:16 +00:00
Kush Gupta
574ac2a4e3
accidentally overwrote install.sh
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-01-30 20:13:14 -05:00
Kush Gupta
2619a16e41
Change serve error output to better reflect the error
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-01-30 20:11:00 -05:00
Kush Gupta
7dc07fa805
Update pull test expected output
...
Gives the user a clearer message on HTTP 404 Not Found Errors when trying and failing to pull from Ollama
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-01-30 20:11:00 -05:00
Kush Gupta
4f56f45c9c
Update 055-convert.bats
...
Update new error message for not finding an Ollama model in the convert test
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-01-30 20:11:00 -05:00
Kush Gupta
79701438d4
Update error handling in Ollama's pull function
...
Give the user a more descriptive error if getting a HTTP Error 404: Not Found
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-01-30 20:11:00 -05:00
Kush Gupta
480fd7a81c
Update _get_source to pull the source if it doesn't exist already
...
Added a condition to check if the source model already exists, otherwise pull it down for the convert or push (convert and push were the only cli functions that used _get_source as far as I could tell)
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-01-30 20:11:00 -05:00
Daniel J Walsh
d06e798a65
Merge pull request #673 from jistr/chore/gitignore-man-7
...
Add generated man pages for section 7 into gitignore
2025-01-30 14:16:31 -05:00
Daniel J Walsh
adfbcb86b1
Merge pull request #676 from kush-gupt/main
...
remove ro as an option when mounting images
2025-01-30 14:10:57 -05:00
Kush Gupta
0e44fbf47d
remove ro as an option when mounting images
...
Signed-off-by: Kush Gupta <kushalgupta@gmail.com>
2025-01-30 12:13:47 -05:00
Eric Curtin
4b252b228f
Merge pull request #672 from jistr/bug/local-install-config-not-found
...
Look for configs also in /usr/local/share/ramalama
2025-01-30 17:42:35 +01:00
Jiri Stransky
768786126f
Add generated man pages for section 7 into gitignore
...
Sections 1 and 5 are already ignored, but there is also
`ramalama-cuda.7.md` file now which gets rendered into
`ramalama-cuda.7`, so the latter should be in gitignore.
Signed-off-by: Jiri Stransky <jistr@redhat.com>
2025-01-30 16:50:18 +01:00
Jiri Stransky
949b52a12d
Look for configs also in /usr/local/share/ramalama
...
The /usr/local/... path should be considered a valid location [1].
It is important for `pip install ramalama` to work correctly, because
that command, when run as root (e.g. in a toolbox or toolbox-like
container), installs the default config files into the
`/usr/local/share/ramalama` directory.
[1] https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch04s09.html
Resolves: https://github.com/containers/ramalama/issues/671
Signed-off-by: Jiri Stransky <jistr@redhat.com>
2025-01-30 16:41:35 +01:00
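The lookup described in the entry above, where `/usr/local/share/ramalama` is considered alongside the system location, could be sketched as below. The directory list, the file name `ramalama.conf`, and the function name `find_default_config` are assumptions for illustration and may not match the real search order.

```python
import os

# Hypothetical search order: distribution path first, then the /usr/local
# path that `pip install` uses when run as root.
CONFIG_DIRS = ["/usr/share/ramalama", "/usr/local/share/ramalama"]


def find_default_config(filename="ramalama.conf", dirs=CONFIG_DIRS):
    """Return the first existing config file path, or None if none is found."""
    for directory in dirs:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


print(find_default_config())
```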
Eric Curtin
adc3bbefec
Merge pull request #657 from cgruver/compact-build
...
Update intel-gpu Containerfile to reduce the size of the builder image
2025-01-30 08:02:51 +01:00
Charro Gruver
d654122a2a
Update intel-gpu Containerfile to reduce the size of the builder image
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-01-29 20:11:09 +00:00
Eric Curtin
4851d600d7
Merge pull request #645 from pbabinca/huggingface-cli
...
Guide users to install huggingface-cli to login to huggingface
2025-01-29 16:01:02 +01:00
Pavol Babincak
67c7ca4eee
Guide users to install huggingface-cli to login to huggingface
...
Relates-to: #643
Signed-off-by: Pavol Babincak <pbabinca@redhat.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-29 14:14:52 +01:00
Daniel J Walsh
7260e699b5
Merge pull request #644 from cgruver/clg-test
...
Add container image to support Intel ARC GPU
2025-01-29 07:55:45 -05:00
Daniel J Walsh
ecc686b490
Merge pull request #647 from jhjaggars/throw-exception-http-init
...
throwing an exception when there is a failure in http_client.init
2025-01-29 07:27:46 -05:00
Jesse Jaggars
0cbf72bae0
raising exceptions rather than returning numbers
...
Signed-off-by: Jesse Jaggars <jhjaggars@gmail.com>
2025-01-28 18:15:51 -05:00
Daniel J Walsh
59d2fcd1af
Merge pull request #637 from containers/perplexity
...
Add perplexity subcommand to RamaLama CLI
2025-01-28 16:46:41 -05:00
Daniel J Walsh
7bc916465d
Merge pull request #641 from rhatdan/version
...
Bump to v0.5.4
2025-01-28 16:33:50 -05:00
Jesse Jaggars
391b176c26
throwing an exception when there is a failure in http_client.init
...
Signed-off-by: Jesse Jaggars <jjaggars@redhat.com>
2025-01-28 13:32:08 -05:00
Charro Gruver
1e762c8207
Add container image to support Intel ARC GPU
...
Signed-off-by: Charro Gruver <cgruver@redhat.com>
2025-01-28 14:56:10 +00:00
Eric Curtin
2494118163
Add perplexity subcommand to RamaLama CLI
...
- Added a new subcommand `perplexity` to the RamaLama CLI in `cli.py`.
- Implemented the `perplexity` method in the `Model` class in `model.py`.
- Updated the documentation in `ramalama.1.md` to include the new `perplexity` command.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-28 13:20:26 +00:00
Daniel J Walsh
ae2d5e7488
Bump to v0.5.4
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-28 07:26:34 -05:00
Daniel J Walsh
3e90238d9c
Merge pull request #634 from jobcespedes/feat/amd_gpu_targets_arg
...
feat: add argument to define amd gpu targets
2025-01-28 06:48:46 -05:00
Eric Curtin
8aaab4fe5c
Merge pull request #633 from containers/renovate/docker.io-nvidia-cuda-12.x
...
Update docker.io/nvidia/cuda Docker tag to v12.8.0
2025-01-28 10:44:00 +01:00
Job Céspedes Ortiz
a8f1fb1941
feat: add argument to define amd gpu targets
...
This commit adds a container build arg to set custom AMD GPU targets.
It also keeps the default amd gpu target values.
Signed-off-by: Job Céspedes Ortiz <jobcespedes@gmail.com>
2025-01-28 03:13:06 -06:00
Eric Curtin
c4f4707836
Merge pull request #635 from kubealex/macos-setup-quickstart
...
Point macOS users to script install
2025-01-28 09:44:12 +01:00
Eric Curtin
a3911d030c
Update README.md
...
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
2025-01-28 08:43:03 +00:00
Alessandro Rossi
2ee756fce1
Update README.md
...
Signed-off-by: Alessandro Rossi <al.rossi87@gmail.com>
2025-01-28 08:56:38 +01:00
renovate[bot]
227c906e64
Update docker.io/nvidia/cuda Docker tag to v12.8.0
...
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2025-01-27 22:03:42 +00:00
Eric Curtin
0c447b1444
Merge pull request #632 from containers/rocm-gfx-fix
...
fixed rocm detection by adding gfx targets in containerfile
2025-01-27 23:03:14 +01:00
Daniel J Walsh
8789e2b214
Merge pull request #631 from rhatdan/shortnames
...
Add shortname for deepseek
2025-01-27 10:59:58 -05:00
Brian
2103ab64d0
fixed rocm detection by adding gfx targets
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-01-27 10:56:25 -05:00
Daniel J Walsh
974a37fe00
Add shortname for deepseek
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-27 10:02:26 -05:00
Eric Curtin
42b88e3cfd
Merge pull request #630 from containers/deepseek-support
...
Update llama.cpp version
2025-01-27 15:27:59 +01:00
Daniel J Walsh
b08b6ffca5
Merge pull request #625 from engelmi/use-jinja-running-llama-run
...
Added --jinja to llama-run command
2025-01-27 06:50:48 -05:00
Eric Curtin
6d2d751948
Update llama.cpp version
...
To add recent support for deepseek, etc.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-27 11:30:00 +00:00
Daniel J Walsh
eb29f61a56
Merge pull request #628 from containers/mac-cpu-fix
...
added mac cpu only support
2025-01-26 19:24:38 -05:00
Brian
ae8de1c4f2
added mac cpu only support
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-01-26 16:33:07 -05:00
Eric Curtin
3cbaa5deea
Merge pull request #627 from kubealex/patch-1
...
Fix list in README - Credits section
2025-01-26 16:54:44 +01:00
Alessandro Rossi
274040245a
Update README.md
...
Fix listing of project in "Credits" section
Signed-off-by: Alessandro Rossi <al.rossi87@gmail.com>
2025-01-26 16:40:52 +01:00
Michael Engel
beb541619d
Added --jinja to llama-run command
...
Signed-off-by: Michael Engel <mengel@redhat.com>
2025-01-25 00:59:56 +01:00
Daniel J Walsh
484457b2a5
Merge pull request #622 from containers/install-ostree
...
Avoid dnf install on OSTree system
2025-01-24 08:27:09 -05:00
Daniel J Walsh
b5137ce043
Merge pull request #624 from containers/less-verbose-output
...
Less verbose output
2025-01-24 08:26:15 -05:00
Brian M
bc09fc2dd0
Merge pull request #623 from rhatdan/man
...
Add man page for cuda support
2025-01-23 23:28:24 -05:00
Eric Curtin
451f90fc3c
Make output less verbose
...
Now it looks like this:
$ ./install.sh
Downloaded ramalama
Downloaded cli.py
Downloaded huggingface.py
Downloaded model.py
Downloaded ollama.py
Downloaded common.py
Downloaded __init__.py
Downloaded quadlet.py
Downloaded kube.py
Downloaded oci.py
Downloaded version.py
Downloaded shortnames.py
Downloaded toml_parser.py
Downloaded file.py
Downloaded http_client.py
Downloaded url.py
Downloaded annotations.py
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-23 22:40:09 +00:00
Eric Curtin
7d8cc5a620
Avoid dnf install on OSTree system
...
It's not mutable
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-23 21:53:08 +00:00
Daniel J Walsh
df3cb91112
Add man page for cuda support
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-23 14:13:28 -05:00
Daniel J Walsh
0befa5cb66
Merge pull request #620 from containers/ramalama-bench
...
Introduce ramalama bench
2025-01-23 12:52:56 -05:00
Eric Curtin
d31b8bf302
Introduce ramalama bench
...
Allows benchmarking of models, GPU stacks, etc.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-23 15:19:27 +00:00
Eric Curtin
af4fce6d13
Merge pull request #621 from containers/attempt-podman-install
...
Attempt to install podman
2025-01-23 15:12:53 +00:00
Eric Curtin
bced453968
Attempt to install podman
...
Try and install podman via dnf or apt-get
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-23 13:58:19 +00:00
Daniel J Walsh
1cbb5b0dfe
Merge pull request #614 from rhatdan/version
...
Bump to v0.5.3
2025-01-23 07:38:19 -05:00
Eric Curtin
7e1687d08c
Merge pull request #615 from rhatdan/cleanup
...
A couple of cleanups in build_llama_and_whisper.sh
2025-01-22 21:58:14 +00:00
Daniel J Walsh
1bbbe1a498
A couple of cleanups in build_llama_and_whisper.sh
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-22 14:52:17 -05:00
Eric Curtin
f9dae2c3e0
Merge pull request #613 from containers/fix-rocm
...
We need the rocm libraries in here
2025-01-22 18:15:43 +00:00
Daniel J Walsh
cf3402596b
Bump to v0.5.3
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-22 12:08:21 -05:00
Daniel J Walsh
2a4d4d0905
Merge pull request #612 from containers/hf-co
...
Treat hf.co/ prefix the same as hf://
2025-01-22 12:03:16 -05:00
Eric Curtin
20394d4e65
Treat hf.co/ prefix the same as hf://
...
ollama uses hf.co/ to specify huggingface prefix, like RamaLama
uses hf://
Treat them similarly.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-22 16:31:58 +00:00
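The prefix handling described in the entry above, where `hf.co/` is treated the same as `hf://`, can be sketched as follows. The function name `strip_hf_prefix` and the example model reference are hypothetical; only the two prefixes come from the commit message.

```python
HF_PREFIXES = ("hf://", "hf.co/")


def strip_hf_prefix(model):
    """Return the Hugging Face repo path if `model` uses a known prefix, else None.

    Hypothetical helper; not taken from the RamaLama source.
    """
    for prefix in HF_PREFIXES:
        if model.startswith(prefix):
            return model[len(prefix):]
    return None


print(strip_hf_prefix("hf.co/example-org/example-model"))
print(strip_hf_prefix("hf://example-org/example-model"))
```

Normalising both spellings to the same repo path up front means the rest of the pull logic only has to handle one form.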
Eric Curtin
1fc87045ac
We need the rocm libraries in here
...
We didn't before, now we do, or else we crash.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-22 16:29:31 +00:00
Eric Curtin
ccdc0584d8
Merge pull request #610 from rhatdan/vllm
...
Start making vllm work with RamaLama
2025-01-21 19:40:05 +00:00
Eric Curtin
1b01a50c38
Merge pull request #609 from rhatdan/nvidia
...
Had to make this change for my laptop to support nvidia
2025-01-21 19:38:54 +00:00
Daniel J Walsh
45a1163a16
Start making vllm work with RamaLama
...
vLLM container we are using has vllm as an entrypoint for the container.
Currently we are running llama-run in these containers.
I can not test because vllm still blows up with lack of support for GGUF.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-21 14:25:51 -05:00
Daniel J Walsh
9f58703716
Had to make this change for my laptop to support nvidia
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-21 14:22:47 -05:00
Daniel J Walsh
33ecf1ad84
Merge pull request #606 from containers/remove-unnecessary
...
Remove these lines they are unused
2025-01-21 12:59:56 -05:00
Eric Curtin
d7c55611b0
Remove these lines they are unused
...
We now store the sha in the shell script
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-21 15:57:47 +00:00
Daniel J Walsh
73d1020831
Merge pull request #603 from containers/prompt-history
...
Update to version that has command history
2025-01-21 09:15:25 -05:00
Daniel J Walsh
c2e37eeeb2
Merge pull request #604 from containers/cleaner-error
...
Cleaner output if a machine executes this command
2025-01-21 09:14:56 -05:00
Daniel J Walsh
fc0428b06a
Merge pull request #605 from containers/fix-rocm-build
...
ROCm build broken
2025-01-21 09:13:45 -05:00
Eric Curtin
9780e6e5a2
Update to version that has command history
...
We added a feature to llama-run to traverse prompt history via
up/down arrows. Kind of like a shell.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-21 13:22:23 +00:00
Eric Curtin
15a3648de2
ROCm build broken
...
The build flags changed in upstream llama.cpp so we were no longer
building with ROCm acceleration.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-21 12:31:21 +00:00
Eric Curtin
7da69e0b70
Cleaner output if a machine executes this command
...
Without podman-machine running, ramalama ps will return:
Error: no container manager (Podman, Docker) found
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-21 09:48:55 +00:00
Eric Curtin
e08af09c68
Merge pull request #602 from containers/rocm-hotfix
...
code crashes for rocm added proper type cast for env var
2025-01-21 09:26:18 +00:00
Brian
964a8ea439
code crashes for rocm added proper type cast for env var
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-01-20 17:58:06 -05:00
Daniel J Walsh
25168aa086
Merge pull request #600 from containers/README-md-updates
...
Various README.md updates
2025-01-20 07:53:10 -05:00
Eric Curtin
13d13408ae
Various README.md updates
...
Want to show our new progress bar. Update the diagram. Use newer
model granite3-moe as an example.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-19 18:23:45 +00:00
Eric Curtin
b2bd725be4
Merge pull request #593 from rhatdan/makefile
...
Add model transport info to ramalama run/serve manpage
2025-01-16 21:37:03 +00:00
Eric Curtin
62e2693edb
Merge pull request #595 from pepijndevos/patch-1
...
Build with curl support
2025-01-16 19:29:43 +00:00
Daniel J Walsh
4f513b6a58
Add model transport info to ramalama run/serve manpage
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-16 13:27:15 -05:00
Pepijn de Vos
92994abf68
Build with curl support
...
This allows the standalone container to download models
Signed-off-by: Pepijn de Vos <pepijndevos@gmail.com>
2025-01-16 18:38:00 +01:00
Eric Curtin
f215f33474
Merge pull request #591 from rhatdan/oci
...
Remove omlmd from OCI calls
2025-01-15 21:02:15 +00:00
Daniel J Walsh
6f1c21c3c4
Remove omlmd from OCI calls
...
Also Simplify Spec File
Fedora 39 is no longer supported, so remove checks in spec file.
Move Podman to Recommends; you can run RamaLama with no container engine or with Docker.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-15 14:46:51 -05:00
Eric Curtin
50b1fa2f27
Merge pull request #582 from bmahabirbu/vllm
...
Added vllm cuda support
2025-01-15 14:42:18 +00:00
Brian
1801da9afc
Added vllm cuda support
...
Signed-off-by: Brian <bmahabir@bu.edu>
2025-01-15 08:53:45 -05:00
Eric Curtin
87fc5d74b0
Merge pull request #590 from rhatdan/makefile
...
container_build.sh works on Mac
2025-01-15 00:31:35 +00:00
Daniel J Walsh
9d9d07451d
container_build.sh works on Mac
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-14 17:12:31 -05:00
Daniel J Walsh
69a94155c9
Merge pull request #588 from containers/rm-pipx-req
...
We no longer have Python dependencies
2025-01-14 14:55:56 -05:00
Eric Curtin
b9f4cbd8c1
We no longer have Python dependencies
...
No python3 dependencies outside of the standard Python libraries.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-14 19:39:20 +00:00
Daniel J Walsh
006ceec1c4
Merge pull request #585 from rhatdan/version
...
Bump to v0.5.2
2025-01-14 10:14:40 -05:00
Daniel J Walsh
435bb8a787
Bump to v0.5.2
...
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-14 09:04:26 -05:00
Daniel J Walsh
6e9a2e4e5d
Merge pull request #584 from containers/shortnames
...
granite-code models in Ollama are malformed
2025-01-13 14:48:25 -05:00
Eric Curtin
da7eb54046
granite-code models in Ollama are malformed
...
To the extent that they work in Ollama but not in vanilla llama.cpp.
This is sort of a workaround: pulling the HF versions.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-13 17:27:50 +00:00
Eric Curtin
cc1db3c5ed
Merge pull request #583 from rhatdan/pull
...
Fix ramalama run on docker to work correctly
2025-01-13 17:08:05 +00:00
Daniel J Walsh
bdf24daa2d
Fix ramalama run on docker to work correctly
...
Also always pull newer images when doing run and serve.
Fixes: https://github.com/containers/ramalama/issues/542
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2025-01-13 11:42:06 -05:00
Daniel J Walsh
7c6e643b9a
Merge pull request #576 from containers/simplfy-check
...
Simplify this comparison
2025-01-13 08:42:22 -05:00
Daniel J Walsh
a62b6e442b
Merge pull request #580 from containers/update-llama.cpp
...
Update llama.cpp to include minor llama-run
2025-01-13 08:41:23 -05:00
Eric Curtin
7b5e3ce4b8
Merge pull request #581 from jim3692/add-flake
...
Add flake
2025-01-12 22:46:53 +00:00
jim3692
f715a32c90
Enable Flake for macOS
...
Signed-off-by: jim3692 <dim@knp.one>
2025-01-13 00:28:39 +02:00
jim3692
dd593a344a
Implement Nix Flake
...
Signed-off-by: jim3692 <dim@knp.one>
2025-01-13 00:15:34 +02:00
Eric Curtin
0a6d936f70
Update llama.cpp to include minor llama-run
...
Around resetting color output
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-12 19:54:26 +00:00
Eric Curtin
cdd1369c40
Merge pull request #579 from swarajpande5/fix-capitalize
...
Capitalize constants in python files (CONSTANT_CASE)
2025-01-12 12:48:37 +00:00
swarajpande5
4b6fb7bf8c
Capitalize constants in python files
...
Signed-off-by: swarajpande5 <swarajpande5@gmail.com>
2025-01-12 13:00:57 +05:30
Eric Curtin
55da685e65
Simplify this comparison
...
It's just a list of ors now
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-11 19:41:03 +00:00
Eric Curtin
21b2023c09
Merge pull request #573 from containers/arm
...
On ARM by default turn on GPU acceleration
2025-01-11 14:38:18 +00:00
Eric Curtin
bc518e9aae
On ARM by default turn on GPU acceleration
...
On x86 we don't want to do this without a GPU because x86
integrated graphics have very limited access to VRAM and it's
normally not worth it. But ARM SoCs share memory between CPU and
GPU, meaning it's worth it generally. And we care mostly about
Apple Silicon where we want this on in podman machine.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-11 14:23:14 +00:00
Eric Curtin
ac9aa5abed
Merge pull request #574 from containers/dead_code
...
This is all dead code which isn't called
2025-01-10 19:39:51 +00:00
Eric Curtin
019e256591
This is all dead code which isn't called
...
We should find a static analyzer for this
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2025-01-10 18:28:40 +00:00