* This recipe uses langchain.js and langgraph.js to create an AI application that does function calling
Signed-off-by: Lucas Holmquist <lholmqui@redhat.com>
https://github.com/containers/ai-lab-recipes/pull/806 updated the
version of chromadb used with the rag recipe when run with podman
ai lab.
Update the versions of Langchain and Chromadb clients to be compatible
Signed-off-by: Michael Dawson <mdawson@devrus.com>
pin the chromadb version when using quadlet and bootc to the
same one used when run with podman ai lab. Chromadb seems to
break compatibility regularly and the client must be compatible
with the chromadb version used.
Signed-off-by: Michael Dawson <mdawson@devrus.com>
We need to share container image storage between rootless users, so that
we don't need `sudo` and we don't duplicate the `instructlab` image.
This change follows the Red Hat solution to
[create additional image store for rootless
users](https://access.redhat.com/solutions/6206192).
The `/usr/lib/containers/storage` folder can be read by anyone and new
users will inherit a default configuration via `/etc/skel` that
configures the additional storage.
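As a hedged sketch, the shared-store setup might look like this (the paths follow the description above; the exact files laid down in the image may differ):
```bash
# Create the shared, world-readable image store and seed the default
# per-user configuration through /etc/skel.
mkdir -p /usr/lib/containers/storage
mkdir -p /etc/skel/.config/containers
cat > /etc/skel/.config/containers/storage.conf <<'EOF'
[storage]
driver = "overlay"

[storage.options]
additionalimagestores = [ "/usr/lib/containers/storage" ]
EOF
```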
The `ilab` wrapper is also modified to remove the impersonation code and
not use `sudo` anymore.
Follow-up on #766
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
We need to share container image storage between rootless users, so that
we don't need `sudo` and we don't duplicate the `instructlab` image.
This change follows the Red Hat solution to
[create additional image store for rootless users](https://access.redhat.com/solutions/6206192).
The `/usr/lib/containers/storage` folder can be read by anyone and new
users will inherit a default configuration via `/etc/skel` that
configures the additional storage.
The `ilab` wrapper is also modified to remove the impersonation code and
not use `sudo` anymore.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Adds different steps for building required libraries, packages and dependencies for Intel Habanalabs
Signed-off-by: Enrique Belarte Luque <ebelarte@redhat.com>
Add SSL_CERT_FILE and SSL_CERT_DIR to the preserved environment variables and ensure they are passed to Podman. This change ensures that SSL certificates are correctly handled within the container environment.
Signed-off-by: Tyler Lisowski <lisowski@us.ibm.com>
When working with AI/ML recipes, it is frequent to pull versioned
software and data from Git repositories. This change adds the `git`
and `git-lfs` packages.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Added workarounds for libdnf, the hl-smi binary and the ilab wrapper.
Also added a duplicated directory for common files to work with Konflux CI.
Signed-off-by: Enrique Belarte Luque <ebelarte@redhat.com>
This change updates the version of AMD ROCm to 6.2 in the amd-bootc
image for training. With this new version, the `rocm-smi` package is
replaced by the `amd-smi` package.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
The multi-stage build has too many stages. During the installation of
the `amdgpu-dkms` package, the modules are built and installed in
`/lib/modules/${KERNEL_VERSION}`. If the installation of the package is
done in the `driver-toolkit` image, the extra dependencies are very
limited. This change removes the `source` stage and installs the
`amdgpu-dkms` package on top of `driver-toolkit`.
The `amdgpu-dkms` package installs the modules in
`/lib/modules/${KERNEL_VERSION}/extra` and these are the only modules in
that folder. The `amdgpu-dkms-firmware` package is installed as a
dependency of `amdgpu-dkms` and it installs the firmware files in
`/lib/firmware/updates/amdgpu`. So, this change removes the in-tree
`amdgpu` modules and firmware, then copies the ones generated by DKMS in
the `builder` stage.
The change also moves the repository definitions to the `repos.d` folder
and adds the AMD public key to verify the signatures of the AMD RPMs.
The users call a wrapper script called `ilab` to hide the `instructlab`
container image and the command line options. This change copies the
file from `nvidia-bootc` and adjusts the logic. The main change is that
`/dev/kfd` and `/dev/dri` devices are passed to the container, instead
of `nvidia.com/gpu=all`. The `ilab` wrapper is copied in the `amd-bootc`
image.
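As a hedged illustration, the device handling boils down to something like this (the image name and trailing `ilab` invocation are illustrative, not the exact wrapper code):
```bash
# The nvidia-bootc wrapper exposes GPUs via CDI:
#   podman run ... --device nvidia.com/gpu=all ...
# The amd-bootc wrapper passes the KFD and DRI device nodes instead:
podman run --rm -it \
    --device /dev/kfd \
    --device /dev/dri \
    "${IMAGE_NAME}" ilab "$@"
```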
The Makefile is also modified to reflect these changes.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
The comment is no longer relevant since we changed the way we pass
environment variables to the container.
Signed-off-by: Omer Tuchfeld <omer@tuchfeld.dev>
The use of a uid map leads to a new layer with all files chowned.
This takes several seconds due to the size of the instructlab
container (26GB). Normally this would be a one time cost where
the idmap layer is cached and reused across container creations;
however, since the container is stored on a read-only additional
image store, no caching is performed.
Address the problem by creating a derived empty container in
mutable container storage. This allows the 1k idmap layer to be
created in the same area, yet reuses the layers in the additional
image store.
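A minimal sketch of the idea, assuming `IMAGE_NAME` points at the image in the additional store (the derived tag is illustrative):
```bash
# Build an empty derived image in the user's writable container storage.
# The idmapped layer is then created and cached next to this tiny image,
# while the large instructlab layers stay in the read-only additional store.
printf 'FROM %s\n' "${IMAGE_NAME}" | podman build --tag "${IMAGE_NAME}-local" -f - .
```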
Signed-off-by: Jason T. Greene <jason.greene@redhat.com>
The `/dev/nvswitchctl` device is created by the NVIDIA Fabric Manager
service, so it cannot be a condition for the `nvidia-fabricmanager`
service.
Looking at the NVIDIA driver startup script for Kubernetes, the actual
check is the presence of `/proc/driver/nvidia-nvswitch/devices` and the
fact that it's not empty [1].
This change modifies the condition to
`ConditionDirectoryNotEmpty=/proc/driver/nvidia-nvswitch/devices`, which
verifies that a certain path exists and is a non-empty directory.
[1] https://gitlab.com/nvidia/container-images/driver/-/blob/main/rhel9/nvidia-driver?ref_type=heads#L262-269
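A hedged sketch of the change, assuming the shipped unit previously used a `ConditionPathExists` check (the unit file path is illustrative):
```bash
sed -i 's|ConditionPathExists=/dev/nvswitchctl|ConditionDirectoryNotEmpty=/proc/driver/nvidia-nvswitch/devices|' \
    /usr/lib/systemd/system/nvidia-fabricmanager.service
```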
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
The `nvidia-driver` package provides the firmware files for the given
driver version. This change removes the copy of the firmware from the
builder step and installs the `nvidia-driver` package instead. This also
allows better traceability of the files in the final image.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Intel has released the version `1.17.0-495` of their Gaudi drivers. They
are available explicitly for RHEL 9.4 with a new `9.4` folder in the RPM
repository. This change updates the arguments to use the new version
from the new repository folder.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
When building the `driver-toolkit` image, it is cumbersome to find the kernel
version that matches the future `nvidia-bootc` and `intel-bootc` images.
However, the kernel version is stored as a label on the `rhel-bootc`
images, which are exposed as the `FROM` variable in the Makefile.
This change collects the kernel version using `skopeo inspect` and `jq`.
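As a hedged sketch, assuming the kernel version is exposed through the base image's `ostree.linux` label (the label name may differ), the lookup could look like:
```bash
FROM=quay.io/centos-bootc/centos-bootc:stream9
KERNEL_VERSION=$(skopeo inspect "docker://${FROM}" | jq -r '.Labels["ostree.linux"]')
echo "${KERNEL_VERSION}"   # e.g. 5.14.0-427.28.1.el9_4.x86_64
```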
The `DRIVER_TOOLKIT_BASE_IMAGE` variable is introduced in the Makefile
to dissociate it from the `FROM` variable that is used as the `nvidia-bootc`
and `intel-bootc` base image.
The user can now specify something like:
```shell
make nvidia-bootc \
FROM=quay.io/centos-bootc/centos-bootc:stream9 \
DRIVER_TOOLKIT_BASE_IMAGE=quay.io/centos/centos:stream9
```
Also, the `VERSION` variable in `/etc/os-release` is the full version, so
this change modifies the command to retrieve the `OS_VERSION_MAJOR`
value.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
During the build of the out-of-tree drivers, the base image will always
have the `kernel-core` package installed. And the `Makefile` doesn't
pass the `KERNEL_VERSION` argument to the build command. So, it's
simpler to rely on the `kernel-core` package info.
The commands to get the `KREL` and `KDIST` were not working with RHEL
9.4 kernel. The new set of commands has been tested with `ubi9/ubi:9.4`
and `centos/centos:stream9` based driver toolkit image and they return
the correct value. For example, the values returned for the following
kernels are:
* `5.14.0-427.28.1.el9_4` (`ubi9/ubi:9.4`):
  * `KVER`: `5.14.0`
  * `KREL`: `427.28.1`
  * `KDIST`: `.el9_4`
* `5.14.0-427.el9` (`centos/centos:stream9`):
  * `KVER`: `5.14.0`
  * `KREL`: `427`
  * `KDIST`: `.el9`
The `OS_VERSION_MAJOR` argument is also not passed by the `Makefile`,
but we can get it from the `/etc/os-release` file. I'm switching to
grep+sed, because I don't want to load all the other variables.
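A minimal sketch of the idea (not the exact Containerfile commands), deriving the values from the installed `kernel-core` package and `/etc/os-release`:
```bash
KERNEL_VERSION=$(rpm -q --qf '%{VERSION}-%{RELEASE}' kernel-core)  # e.g. 5.14.0-427.28.1.el9_4
KVER=$(rpm -q --qf '%{VERSION}' kernel-core)                       # 5.14.0
RELEASE=$(rpm -q --qf '%{RELEASE}' kernel-core)                    # 427.28.1.el9_4
KDIST=".${RELEASE##*.}"                                            # .el9_4
KREL="${RELEASE%.*}"                                               # 427.28.1
OS_VERSION_MAJOR=$(grep '^VERSION_ID=' /etc/os-release | sed -E 's/VERSION_ID="?([0-9]+).*/\1/')
```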
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
In the `nvidia-bootc` Containerfile, the condition on the existence of
`/dev/nvswitchctl` in the `nvidia-fabricmanager` unit file is not
persisted, because we don't use the `-i` option of `sed`, so the final
image still always tries to load the service. This change adds the `-i`
option to fix this.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
# Background
df8885777d
# Issue
The current error handling for multiple subuid ranges is broken due to
surprising behavior of `wc -l` which always returns `1` even when the
input is empty.
# Solution
More carefully count the number of lines in the
`CURRENT_USER_SUBUID_RANGE` variable
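A short illustration of the pitfall and a safer count (the variable name is the one used by the script):
```bash
printf '' | wc -l                            # prints 0
echo "${CURRENT_USER_SUBUID_RANGE}" | wc -l  # prints 1 even when the variable is empty,
                                             # because echo always appends a newline
# Safer: treat an empty variable as zero ranges
if [ -z "${CURRENT_USER_SUBUID_RANGE}" ]; then
    RANGE_COUNT=0
else
    RANGE_COUNT=$(printf '%s\n' "${CURRENT_USER_SUBUID_RANGE}" | wc -l)
fi
```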
# Additional changes
50fb00f26f had a small merge error; this
commit fixes that.
Signed-off-by: Omer Tuchfeld <omer@tuchfeld.dev>
We have a file that's always a duplicate of another file. Until we can
get rid of this requirement, a pre-commit hook to take care of it would
be nice.
Signed-off-by: Omer Tuchfeld <omer@tuchfeld.dev>
vLLM fails with empty env values. Adjust the env-passing model to
only set a value if it is defined.
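A hedged sketch of the pattern (the variable list and array name are illustrative):
```bash
ENV_OPTS=()
for var in VLLM_LOGGING_LEVEL NCCL_DEBUG; do
    # Only forward a variable when it is set and non-empty,
    # so vLLM never sees an empty value.
    value="${!var:-}"
    if [ -n "${value}" ]; then
        ENV_OPTS+=("--env" "${var}=${value}")
    fi
done
```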
Signed-off-by: Jason T. Greene <jason.greene@redhat.com>
# Background
See df8885777d
# Issue
This introduced a regression [1] where it's no longer possible to run the script
as root, as the subuid map ends up being empty and this causes an error:
```
Error: invalid empty host id at UID map: [1 1]
```
# Solution
Avoid UID mapping if we're already running as root.
# Motivation
We want to also be able to run the script as root, for example as part
of a systemd service.
[1] RHELAI-798
Signed-off-by: Omer Tuchfeld <omer@tuchfeld.dev>
The default base image for the Driver Toolkit image is `centos:stream9`.
The original work for Driver Toolkit is in OpenShift and the base image
is `ubi9/ubi`. In both cases, the images don't have the `kernel`
package installed.
This change adds a test on the `KERNEL_VERSION` argument and exits if
it's not provided at build time. This also ensures that only the
relevant kernel is present when using `centos:stream9` or `ubi9/ubi`
as the base image. And this realigns a bit with the original Driver
Toolkit.
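A minimal sketch of such a build-time guard (the error message and follow-up install are illustrative):
```bash
if [ -z "${KERNEL_VERSION:-}" ]; then
    echo "Error: the KERNEL_VERSION build argument must be provided" >&2
    exit 1
fi
# Illustrative follow-up: install only the kernel packages matching that version
dnf -y install "kernel-core-${KERNEL_VERSION}" "kernel-devel-${KERNEL_VERSION}"
```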
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
- Set all indenting to 4 spaces (no tabs)
- Use POSIX style function definition in oneliner functions
- Remove unneeded exports on env variables
Signed-off-by: Javi Polo <jpolo@redhat.com>
Include ILAB_GLOBAL_CONFIG, VLLM_LOGGING_LEVEL, and NCCL_DEBUG as environment variables when starting the ilab container. Also add a shared memory size of 10G to enable vLLM execution. Resolves: https://github.com/containers/ai-lab-recipes/issues/721
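A hedged sketch of the resulting `podman run` flags (the image name and trailing `ilab` invocation are illustrative):
```bash
podman run --rm -it \
    --shm-size 10G \
    --env ILAB_GLOBAL_CONFIG \
    --env VLLM_LOGGING_LEVEL \
    --env NCCL_DEBUG \
    "${IMAGE_NAME}" ilab "$@"
```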
Signed-off-by: Tyler Lisowski <lisowski@us.ibm.com>
# Background
The ilab command is wrapped by an `ilab` script which launches ilab
inside a podman container.
# Issue
Since the ilab container image is pulled during the bootc image build
process using the root user, the image is not accessible to non-root
users.
# Solution
We run the container as sudo in order to be able to access the root
container storage. But for security reasons we map root UID 0 inside the
container to the current user's UID (and all the other subuids to the
user's /etc/subuid range) so that we're effectively running the
container as the current user.
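A hedged sketch of such an invocation (variable names are illustrative; the real script derives the range from `/etc/subuid`):
```bash
# Map container root to the invoking user and the remaining container UIDs
# to the user's subordinate UID range, while still using root's image storage.
sudo podman run --rm -it \
    --uidmap "0:${CURRENT_USER_UID}:1" \
    --uidmap "1:${SUBUID_START}:${SUBUID_LENGTH}" \
    --env "HOME=${HOME}" \
    "${IMAGE_NAME}" ilab "$@"
```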
# Additional changes
Changed `"--env" "HOME"` to `"--env" "HOME=$HOME"` to pass the HOME
environment variable from the current shell and not from the sudo
environment.
# Future work
In the future, we will run podman as the current user, once we figure out a
reasonable way for the current user to access the root user's container
storage.
Signed-off-by: Omer Tuchfeld <omer@tuchfeld.dev>
# Background
We have an ilab wrapper script that users will use to launch the ilab
container.
Users may want to mount additional volumes into the container, as they
could possibly have e.g. large models stored in some external storage.
# Problem
Users cannot simply edit the script to add the mounts to the podman
command as it is read-only.
# Solution
Add support for an environment variable that users can set to specify
additional mounts to be added to the podman command. This will allow
users to specify additional mounts without having to modify the script.
# Implementation
The script will now check for the `ILAB_ADDITIONAL_MOUNTS` environment
variable. If it is set, the script will parse the variable as evaluated
bash code to get the mounts. The mounts will then be added to the podman
command.
Example `ILAB_ADDITIONAL_MOUNTS` usage:
```bash
ILAB_ADDITIONAL_MOUNTS="/host/path:/container/path /host/path2:/container/path2"
```
If your path contains spaces, you can use quotes:
```bash
ILAB_ADDITIONAL_MOUNTS="/host/path:/container/path '/host/path with spaces':/container/path"
```
The latter works because the script uses `eval` to parse the mounts.
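A minimal sketch of how the parsing might look inside the wrapper (array names are illustrative):
```bash
ADDITIONAL_MOUNT_OPTS=()
if [ -n "${ILAB_ADDITIONAL_MOUNTS:-}" ]; then
    # eval lets users quote host paths that contain spaces
    eval "additional_mounts=(${ILAB_ADDITIONAL_MOUNTS})"
    for mount in "${additional_mounts[@]}"; do
        ADDITIONAL_MOUNT_OPTS+=("-v" "${mount}")
    done
fi
```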
Signed-off-by: Omer Tuchfeld <omer@tuchfeld.dev>
The wrapper had a mixed use of tabs/spaces, making it annoying to edit.
Formatted with shfmt to switch to spaces.
Signed-off-by: Omer Tuchfeld <omer@tuchfeld.dev>
If the wrapper script is killed, the container will be left running.
Instead of just running the command, use `exec` to replace the
wrapper script with the command, so that the command will receive
the same signals as the wrapper script and the container will be
terminated as expected.
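In other words, the last line of the wrapper becomes something like this (options and image name are illustrative):
```bash
# Replace the wrapper process so podman receives SIGINT/SIGTERM directly
exec podman run --rm -it "${PODMAN_OPTS[@]}" "${IMAGE_NAME}" ilab "$@"
```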
Signed-off-by: Omer Tuchfeld <omer@tuchfeld.dev>
The upgrade informer will run every couple of hours and will be triggered by a
systemd timer.
In order to start it on boot and run it once, both the service and the timer are enabled.
The auto-upgrade service is disabled in order to avoid unexpected reboots.
The service will run "bootc upgrade --check" and, in case a new version exists,
it will create a motd file with the upgrade info.
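A hedged sketch of what the timer-triggered informer might do (the output check and motd path are assumptions, not the actual implementation):
```bash
#!/bin/bash
# Check for a newer bootc image and leave a note for the next login if one exists.
if bootc upgrade --check | grep -q 'Update available'; then
    echo "A new RHEL AI version is available. Run 'bootc upgrade' to apply it." \
        > /etc/motd.d/upgrade-available
fi
```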
Signed-off-by: Igal Tsoiref <itsoiref@redhat.com>
Signed-off-by: Javi Polo <jpolo@redhat.com>
While skopeo may be part of the base image, there is no
guarantee, and as long as ilab requires it, we should
make sure it is installed.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Background
RHEL AI ships with a script in `/usr/bin` called `ilab` which
makes running `ilab` commands feel native even though they're actually
running in a podman container
Issues
The abstraction becomes leaky once you start dealing with paths.
The user thinks in terms of local paths, but they are actually paths inside the pod,
and if the user performs any action with a path that's not mounted inside the pod,
files persisted to that path will not persist across ilab wrapper invocations.
Examples:
1. ilab config init outputs:
Generating `/root/.config/instructlab/config.yaml`...
Initialization completed successfully, you're ready to start using `ilab`. Enjoy!
But:
ls /root/.config/instructlab/config.yaml
ls: cannot access '/root/.config/instructlab/config.yaml': Permission denied
2. User provided paths e.g.:
ilab config init --model-path...
ilab model download --model-dir=...
The path may not be mounted to the host, so the data is written to the overlay fs and gone when the container dies
Solution
Mount the user's HOME directory and set HOME inside the container.
This seems to resolve the above issues as long as the user-provided paths
are nested under the user's HOME directory
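A minimal sketch of the fix (the image name and `ilab` invocation are illustrative):
```bash
# Mount the caller's home directory and keep HOME identical inside the container,
# so paths printed by ilab exist on the host and survive across invocations.
podman run --rm -it \
    -v "${HOME}:${HOME}" \
    --env "HOME=${HOME}" \
    "${IMAGE_NAME}" ilab "$@"
```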
Signed-off-by: Eran Cohen <eranco@redhat.com>
Ticket [RHELAI-442](https://issues.redhat.com/browse/RHELAI-442)
# Background
RHEL AI ships with a script in `/usr/local/bin` called `ilab` which
makes running `ilab` commands feel native even though they're actually
running in a podman container
# Issues
* The script is outdated / used several different container images for
different purposes, while it should be just using the single instructlab
image
* The volume mounts were incorrect, as instructlab now uses XDG paths
* Unnecessary directory creation for `HF_CACHE`
* Unnecessary GPU count logic
* Script has unnecessary fiddling of `ilab` parameters, essentially creating a
UX that deviates from the natural `ilab` CLI
# Solutions
* Changed script to use the single container image `IMAGE_NAME` (this
was already the case mostly, except for old references to `LVLM_NAME`
and `TRAIN_NAME` which no longer get replaced, leading to a broken `PODMAN_COMMAND_SERVE`).
Also adjusted the entrypoint to use the `ilab` executable in the pyenv
* Will now mount the host's `~/.config` and `~/.local` into the
container's corresponding directories, for `instructlab` to use
and for its config / data to persist across invocations
* Will now mount `~/.cache` into the container's corresponding `.cache`
directory, so that the information stored in the default `HF_CACHE` is
also persisted across invocations
* Removed unnecessary GPU count logic
* Removed all parameter parsing / fiddling
# Other changes
Added secret/fake "shell" `ilab` subcommand which opens a shell in the
wrapper's container, useful for troubleshooting issues with the wrapper
itself
Signed-off-by: Omer Tuchfeld <omer@tuchfeld.dev>
it matches much better.
Changing the way we set the image_version_id label; in order for it to work in
Konflux we should use LABEL in the Containerfile.
Signed-off-by: Igal Tsoiref <itsoiref@redhat.com>
Of note, there was already a use of a "VENDOR" word to describe the
accelerator or provider (amd, intel, nvidia, etc.). I renamed that
in order to make room for this new use of VENDOR.
Signed-off-by: Ralph Bean <rbean@redhat.com>
Set the GitHub hash as the image version by default.
Add RHEL_AI_VERSION into /etc/os-release in order to use it in
Insights.
Signed-off-by: Igal Tsoiref <itsoiref@redhat.com>
The `nvidia-persistenced` and `nvidia-fabricmanager` services should be
started on machines with NVIDIA devices. Fabric Manager is only needed
on machines with an NVLink switch, so we patch it to start only if
/dev/nvswitchctl is present.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Upstream, this image can be pulled unauthenticated, but in other
environments a user might want to include an image that exists in some
repository that requires authentication to pull.
The person building the image needs to provide
`--secret=id=instructlab-nvidia-pull/.dockerconfigjson,src=instructlab-nvidia-pull/.dockerconfigjson`
when building the image in order to make the secret available.
Signed-off-by: Ralph Bean <rbean@redhat.com>
For the InstructLab image, we use NVIDIA driver version `550.90.07` with
CUDA `12.4.1`, so this change updates the versions in the bootc image to
align the stack.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
- Fix model download container and targets
- Add prometheus model for eval
- Improve caching in instructlab container
- Add additional "models" targets for all permutations
- Introduce build chaining so that you can build everything in one step
- Small update to conform to $(MAKE) convention for submakes
Signed-off-by: Jason T. Greene <jason.greene@redhat.com>
The top level vendor targets (amd, intel, nvidia) fail with
"podman" build \
\
--file /root/ai-lab-recipes/training/model/../build/Containerfile.models \
--security-opt label=disable \
--tag "quay.io/ai-lab/-bootc-models:latest" \
-v /root/ai-lab-recipes/training/model/../build:/run/.input:ro
Error: tag quay.io/ai-lab/-bootc-models:latest: invalid reference format
make[1]: *** [Makefile:41: bootc-models] Error 125
make[1]: Leaving directory '/root/ai-lab-recipes/training/model'
make: *** [Makefile:70: bootc-models] Error 2
because VENDOR is not defined when the bootc-models target is called.
Modify the makefile to set VENDOR for each target.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
- Properly separate and order podman and bootc-image-builder arguments
- Move all the `selinux.tmp` workaround to the same layer, so bootc
install won't complain about missing files
Signed-off-by: Javi Polo <jpolo@redhat.com>
Any Gaudi update must be synchronized with all stakeholders. For now,
all packages, from kernel OOT drivers through firmware and SynapseAI to the
PyTorch stack, must have the same version. `habana-torch-plugin` version
`1.16.0.526` does not work with Kernel drivers `1.16.1-7`.
Signed-off-by: Christian Heimes <cheimes@redhat.com>
The NVIDIA bootc container is using multi-stage to avoid shipping build
dependencies in the final image, making it also smaller. This change
implements the same build strategy for the Intel bootc image.
The builder image is the same as for NVIDIA bootc. It is currently named
after NVIDIA, but should be renamed in a follow-up change. The benefit
is that a single builder image is maintained for all bootc images that
require out-of-tree drivers.
The number of build arguments is also reduced, since most of the
information is already present in the builder image. There is only one
kernel package per builder image and one image per architecture, so we
can retrieve the `KERNEL_VERSION` and `TARGET_ARCH` variables by
querying the RPM database. The OS information is retrieved by sourcing
the `/etc/os-release` file.
The extraction of the RPMs doesn't require storing the files, as
`rpm2cpio` supports streaming the file over HTTP(S). The number of
commands is smaller, and the downloads already happened for each build anyway,
since the download was not in a separate `RUN` statement.
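A hedged sketch of the streaming extraction (the URL and package name are purely illustrative):
```bash
# Per the above, rpm2cpio can read the RPM over HTTP(S), so the payload is
# extracted without ever keeping the .rpm file in a layer.
rpm2cpio "https://example.com/repo/habanalabs-firmware-1.17.0-495.el9_4.x86_64.rpm" | cpio -idm
```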
It is not necessary to copy the source of the drivers in `/usr/src`, since
we don't need to keep it in the final image. The Makefiles accept a
`KVERSION` variable to specify the version of the kernel and resolve its
path. The other benefit is to build as non-root.
The `.ko` files can then be copied to the final image with `COPY
--from=builder`. The change also ensures that the firmware files are
copied to the final image.
This change also adds support for `EXTRA_RPM_PACKAGES`.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
- Fix wrong script install (container lab used over wrapper [won't run on its own])
+ Restores elements that were unintentionally removed
- Fix quay tags
- Introduce "$ARCH-bootc-models" images in addition to bootc that include models
Signed-off-by: Jason T. Greene <jason.greene@redhat.com>
growfs is created by the Makefile and CI does not use it. Also, if I'm not mistaken, growfs is only used for disk image creation.
By changing this, the growfs file will only be created when the Makefile is running, so CI pipelines can build the Containerfile and growfs can still be used when needed.
Signed-off-by: Enrique Belarte Luque <ebelarte@redhat.com>
Konflux CI fails when building using bootc images as base throwing this error:
`Error: Cannot create repo temporary directory "/var/cache/dnf/baseos-044cae74d71fe9ea/libdnf.1jsyRp": Permission denied`
This temporary workaround is needed for build pipeline to work on Konflux CI until libdnf fix is merged to RHEL.
References:
https://issues.redhat.com/browse/RHEL-39796
https://github.com/rpm-software-management/libdnf/pull/1665
This should be removed once the permanent fix is merged.
Signed-off-by: Enrique Belarte Luque <ebelarte@redhat.com>
Many commands that are run for SDG and training can take a lot of time,
so there is a risk to have a network disconnection during the task. With
`tmux`, users have the ability to detach the jobs from their SSH session
and let the tasks run for a very long time.
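For example, a long job can be detached and resumed later:
```bash
# Start a named session for a long-running SDG or training job,
# then detach with Ctrl-b d and log out safely.
tmux new-session -s training
# Later, possibly from a new SSH connection, reattach to the job:
tmux attach -t training
```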
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
The `lspci` command is frequently used to inspect the hardware on a
server. Adding it to the OS image would help users to troubleshoot
deployment and configuration issues.
Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Some users want to use buildah instead of podman to build
their container images.
Buildah does not support --squash-all, but after examining the podman
code, --squash-all ends up just being the equivalent of "--squash --layers=false"
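So a hedged equivalent would be (image name and tag are illustrative):
```bash
# podman
podman build --squash-all -t quay.io/example/nvidia-bootc:latest .
# buildah, using the equivalence described above
buildah build --squash --layers=false -t quay.io/example/nvidia-bootc:latest .
```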
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Fixed two instructions in the README.
1) The instruction to make the model pointed to detr-resnet-50 rather than the detr-resnet-101 that the instructions use.
2) The client container start had a /detection in the model address where it should not have.
Added signoff
Signed-off-by: Graeme Colman <gcolman@redhat.com>
And hence the Mixtral download fails
Downloading model failed with the following Hugging Face Hub error: 401 Client Error. (Request ID: Root=1-6637576e-28a8c5cb049f1dbb35d46d83;86121860-3ce0-419b-aed0-4fc79c440da7)
Cannot access gated repo for url https://huggingface.co/api/models/mistralai/Mixtral-8x7B-Instruct-v0.1/tree/main?recursive=True&expand=False.
Access to model mistralai/Mixtral-8x7B-Instruct-v0.1 is restricted. You must be authenticated to access it.
Signed-off-by: Rom Freiman <rfreiman@gmail.com>
## Testing frameworks
Our testing frameworks are a bit different from our standard workflows. In terms of compute, some of these jobs run either on AWS machines provisioned via terraform using secrets in the GitHub repository, or on customized GitHub-hosted action runners, as well as on the standard ubuntu-24.04 GitHub runners for jobs not requiring additional resources.
These workflows start by checking out the [terraform-test-environment-module](https://github.com/containers/terraform-test-environment-module) repo, as well as the code in `containers/ai-lab-recipes` at the `main` branch. Then they provision the terraform instance, install the correct ansible playbook requirements, and run a corresponding playbook. Additional actions may also be taken depending on the testing framework in question.
You can run the conversion image directly with podman in the terminal. You just need to provide it with the Hugging Face model name you want to download, the quantization level you want to use, and whether or not you want to keep the raw files after conversion. `HF_TOKEN` is optional; it is only required for private models.
st.session_state["Question"]="What is the Higgs Boson?"
if"Answers"notinst.session_state:
st.session_state["Answers"]={}
st.session_state["Answers"]["Right_Answer_1"]="The Higgs boson, sometimes called the Higgs particle, is an elementary particle in the Standard Model of particle physics produced by the quantum excitation of the Higgs field, one of the fields in particle physics theory"
st.session_state["Answers"]["Wrong_Answer_1"]="Alan Turing was the first person to conduct substantial research in the field that he called machine intelligence."
The llamacpp_python model server images are based on the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) project that provides python bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp). This provides us with a python based and OpenAI API compatible model server that can run LLMs of various sizes locally across Linux, Windows or Mac.
This model server requires models to be converted from their original format, typically a set of `*.bin` or `*.safetensor` files into a single GGUF formatted file. Many models are available in GGUF format already on [huggingface.co](https://huggingface.co). You can also use the [model converter utility](../../convert_models/) available in this repo to convert models yourself.
## Image Options
We currently provide 3 options for the llamacpp_python model server:
* [Base](#base)
* [Cuda](#cuda)
* [Vulkan (experimental)](#vulkan-experimental)
### Base
The [base image](../llamacpp_python/base/Containerfile) is the standard image that works for both arm64 and amd64 environments. However, it does not include any hardware acceleration and will run with CPU only. If you use the base image, make sure that your container runtime has sufficient resources to run the desired model(s).
The [Cuda image](../llamacpp_python/cuda/Containerfile) includes all the extra drivers necessary to run our model server with Nvidia GPUs. This will significantly speed up the model's response time over CPU-only deployments.
To run the Cuda image with GPU acceleration, you need to install the correct [Cuda drivers](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#driver-installation) for your system along with the [Nvidia Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#). Please use the links provided to find installation instructions for your system.
Once those are installed you can use the container toolkit CLI to discover your Nvidia device(s).
Finally, you will also need to add `--device nvidia.com/gpu=all` to your `podman run` command.
### Vulkan (experimental)
The [Vulkan](https://docs.vulkan.org/guide/latest/what_is_vulkan.html) image ([amd64](../llamacpp_python/vulkan/amd64/Containerfile)/[arm64](../llamacpp_python/vulkan/arm64/Containerfile)) is experimental, but can be used for gaining partial GPU access on an M-series Mac, significantly speeding up model response time over a CPU only deployment. This image requires that your podman machine provider is "applehv" and that you use krunkit instead of vfkit. Since these tools are not currently supported by podman desktop this image will remain "experimental".
There are many models to choose from these days, most of which can be found on [huggingface.co](https://huggingface.co). In order to use a model with the llamacpp_python model server, it must be in GGUF format. You can either download pre-converted GGUF models directly or convert them yourself with the [model converter utility](../../convert_models/) available in this repo.
A well-performing Apache-2.0 licensed model that we recommend using if you are just getting started is
`granite-7b-lab`. You can use the link below to quickly download a quantized (smaller) GGUF version of this model for use with the llamacpp_python model server.
Place all models in the [models](../../models/) directory.
You can use this snippet below to download the default model:
```bash
make download-model-granite
```
Or you can use the generic `download-models` target from the `/models` directory to download any model file from huggingface:
make MODEL_NAME=<model_name> MODEL_URL=<model_url> -f Makefile download-model
To deploy the LLM server you must specify a volume mount `-v` where your models are stored on the host machine and the `MODEL_PATH` for your model of choice. The model_server is most easily deployed by calling the make command: `make -f Makefile run`. Of course, as with all our make calls, you can pass any number of the following variables: `REGISTRY`, `IMAGE_NAME`, `MODEL_NAME`, `MODEL_PATH`, and `PORT`.
To enable dynamic loading and unloading of different models present on your machine, you can start the model service with a `CONFIG_PATH` instead of a `MODEL_PATH`.
Here is an example `models_config.json` with two model options.
Now run the container with the specified config file.
As stated above, by default the model service will use [`facebook/detr-resnet-101`](https://huggingface.co/facebook/detr-resnet-101). However you can use other compatible models. Simply pass the new `MODEL_NAME` and `MODEL_PATH` to the make command. Make sure the model is downloaded and exists in the [models directory](../../../models/):
```bash
# from path model_servers/object_detection_python from repo containers/ai-lab-recipes
make MODEL_NAME=facebook/detr-resnet-50 MODEL_PATH=/models/facebook/detr-resnet-101 run
```
## Build the AI Application
The following Podman command can be used to run your AI Application:
```bash
podman run -p 8501:8501 -e MODEL_ENDPOINT=http://10.88.0.1:8000 object_detection_client
```
This recipe demonstrates the ReAct (Reasoning and Acting) framework in action through a music exploration application. ReAct enables AI to think step-by-step about tasks, take appropriate actions, and provide reasoned responses. The application shows how ReAct can be used to create an intelligent music discovery assistant that combines reasoning with Spotify API interactions.
The application utilizes [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) for the Model Service and integrates with Spotify's API for music data. The recipe uses [Langchain](https://python.langchain.com/docs/get_started/introduction) for the ReAct implementation and [Streamlit](https://streamlit.io/) for the UI layer.
## Spotify API Access
To use this application, you'll need Spotify API credentials:
- Create a Spotify Developer account
- Create an application in the Spotify Developer Dashboard
- Get your Client ID and Client Secret
These can be provided through environment variables or the application's UI.
## Try the ReAct Agent Application
The [Podman Desktop](https://podman-desktop.io) [AI Lab Extension](https://github.com/containers/podman-desktop-extension-ai-lab) includes this recipe among others. To try it out, open `Recipes Catalog` -> `ReAct Agent` and follow the instructions to start the application.
# Build the Application
The rest of this document will explain how to build and run the application from the terminal, and will
go into greater detail on how each container in the Pod above is built, run, and
what purpose it serves in the overall application. All the recipes use a central [Makefile](../../common/Makefile.common) that includes variables populated with default values to simplify getting started. Please review the [Makefile docs](../../common/README.md), to learn about further customizing your application.
This application requires a model, a model service and an AI inferencing application.
* [Quickstart](#quickstart)
* [Download a model](#download-a-model)
* [Build the Model Service](#build-the-model-service)
* [Deploy the Model Service](#deploy-the-model-service)
* [Build the AI Application](#build-the-ai-application)
* [Deploy the AI Application](#deploy-the-ai-application)
* [Interact with the AI Application](#interact-with-the-ai-application)
* [Embed the AI Application in a Bootable Container Image](#embed-the-ai-application-in-a-bootable-container-image)
## Quickstart
To run the application with pre-built images from `quay.io/ai-lab`, use `make quadlet`. This command
builds the application's metadata and generates Kubernetes YAML at `./build/chatbot.yaml` to spin up a Pod that can then be launched locally.
Try it with:
```
make quadlet
podman kube play build/chatbot.yaml
```
This will take a few minutes if the model and model-server container images need to be downloaded.
The Pod is named `chatbot`, so you may use [Podman](https://podman.io) to manage the Pod and its containers:
```
podman pod list
podman ps
```
Once the Pod and its containers are running, the application can be accessed at `http://localhost:8501`. However, if you started the app via the podman desktop UI, a random port will be assigned instead of `8501`. Please use the AI App Details `Open AI App` button to access it instead.
Please refer to the section below for more details about [interacting with the chatbot application](#interact-with-the-ai-application).
To stop and remove the Pod, run:
```
podman pod stop chatbot
podman pod rm chatbot
```
## Download a model
If you are just getting started, we recommend using [granite-7b-lab](https://huggingface.co/instructlab/granite-7b-lab). This is a
well-performing mid-sized model with an Apache-2.0 license. In order to use it with our Model Service we need it converted
and quantized into the [GGUF format](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md). There are a number of
ways to get a GGUF version of granite-7b-lab, but the simplest is to download a pre-converted one from
The Model Service can be built from make commands from the [llamacpp_python directory](../../../model_servers/llamacpp_python/).
```bash
# from path model_servers/llamacpp_python from repo containers/ai-lab-recipes
make build
```
Check out the [Makefile](../../../model_servers/llamacpp_python/Makefile) to get more details on different options for how to build.
## Deploy the Model Service
The local Model Service relies on a volume mount to the localhost to access the model files. It also employs environment variables to dictate the model used and where it's served. You can start your local Model Service using the following `make` command from `model_servers/llamacpp_python` set with reasonable defaults:
```bash
# from path model_servers/llamacpp_python from repo containers/ai-lab-recipes
make run
```
## Build the AI Application
The AI Application can be built from the make command:
```bash
# Run this from the current directory (path recipes/natural_language_processing/chatbot from repo containers/ai-lab-recipes)
make build
```
## Deploy the AI Application
Make sure the Model Service is up and running before starting this container image. When starting the AI Application container image we need to direct it to the correct `MODEL_ENDPOINT`. This could be any appropriately hosted Model Service (running locally or in the cloud) using an OpenAI compatible API. In our case the Model Service is running inside the Podman machine so we need to provide it with the appropriate address `10.88.0.1`. To deploy the AI application use the following:
```bash
# Run this from the current directory (path recipes/natural_language_processing/chatbot from repo containers/ai-lab-recipes)
make run
```
## Interact with the AI Application
Everything should now be up and running with the chat application available at [`http://localhost:8501`](http://localhost:8501). By using this recipe and getting this starting point established, users should now have an easier time customizing and building their own LLM-enabled chatbot applications.
## Embed the AI Application in a Bootable Container Image
To build a bootable container image that includes this sample chatbot workload as a service that starts when a system is booted, run: `make -f Makefile bootc`. You can optionally override the default image / tag you want to give the make command by specifying it as follows: `make -f Makefile BOOTC_IMAGE=<your_bootc_image> bootc`.
Substituting the bootc/Containerfile FROM command is simple using the Makefile FROM option.
```bash
make FROM=registry.redhat.io/rhel9/rhel-bootc:9.4 bootc
```
Selecting the ARCH for the bootc/Containerfile is simple using the Makefile ARCH= variable.
```
make ARCH=x86_64 bootc
```
The magic happens when you have a bootc enabled system running. If you do, and you'd like to update the operating system to the OS you just built
with the chatbot application, it's as simple as ssh-ing into the bootc system and running:
```bash
bootc switch quay.io/ai-lab/chatbot-bootc:latest
```
Upon a reboot, you'll see that the chatbot service is running on the system. Check on the service with:
```bash
ssh user@bootc-system-ip
sudo systemctl status chatbot
```
### What are bootable containers?
What's a [bootable OCI container](https://containers.github.io/bootc/) and what's it got to do with AI?
That's a good question! We think it's a good idea to embed AI workloads (or any workload!) into bootable images at _build time_ rather than
at _runtime_. This extends the benefits, such as portability and predictability, that containerizing applications provides to the operating system.
Bootable OCI images bake exactly what you need to run your workloads into the operating system at build time by using your favorite containerization
tools. Might I suggest [podman](https://podman.io/)?
Once installed, a bootc enabled system can be updated by providing an updated bootable OCI image from any OCI
image registry with a single `bootc` command. This works especially well for fleets of devices that have fixed workloads - think
factories or appliances. Who doesn't want to add a little AI to their appliance, am I right?
Bootable images lend toward immutable operating systems, and the more immutable an operating system is, the less that can go wrong at runtime!
#### Creating bootable disk images
You can convert a bootc image to a bootable disk image using the `bootc-image-builder` container image.
This container image allows you to build and deploy [multiple disk image types](../../common/README_bootc_image_builder.md) from bootc container images.
Default image types can be set via the DISK_TYPE Makefile variable.
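A hedged example invocation (image names, mounts, and disk type are illustrative; check the bootc-image-builder documentation for the exact flags):
```bash
sudo podman run --rm -it --privileged \
    -v ./output:/output \
    -v /var/lib/containers/storage:/var/lib/containers/storage \
    quay.io/centos-bootc/bootc-image-builder:latest \
    --type qcow2 \
    quay.io/ai-lab/chatbot-bootc:latest
```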
Write-Error "Checksum validation is not supported for maven-mvnd. `nPlease disable validation by removing 'distributionSha256Sum' from your maven-wrapper.properties."
if ((Get-FileHash "$TMP_DOWNLOAD_DIR/$distributionUrlName" -Algorithm SHA256).Hash.ToLower() -ne $distributionSha256Sum) {
  Write-Error "Error: Failed to validate Maven distribution SHA-256, your Maven distribution might be compromised. If you updated your Maven version, you need to update the specified distributionSha256Sum property."