Compare commits


69 Commits
v0.9.0 ... main

Author SHA1 Message Date
Markus Lehtonen 968389540d
Merge pull request #548 from containers/dependabot/go_modules/golang.org/x/oauth2-0.27.0
build(deps): bump golang.org/x/oauth2 from 0.21.0 to 0.27.0
2025-07-21 09:55:12 +03:00
dependabot[bot] ce5911c701
build(deps): bump golang.org/x/oauth2 from 0.21.0 to 0.27.0
Bumps [golang.org/x/oauth2](https://github.com/golang/oauth2) from 0.21.0 to 0.27.0.
- [Commits](https://github.com/golang/oauth2/compare/v0.21.0...v0.27.0)

---
updated-dependencies:
- dependency-name: golang.org/x/oauth2
  dependency-version: 0.27.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-18 18:13:48 +00:00
Feruzjon Muyassarov 70946857b7
Merge pull request #545 from klihub/devel/pick-resources-by-hints
topology-aware: try picking resources by hints first
2025-07-02 12:32:37 +03:00
Krisztian Litkey 7ea0bb55b4
docs: document hint-based picking of resources.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-07-02 10:53:32 +03:00
Krisztian Litkey e86db46a09
topology-aware: add pick-resources-by-hints annotation.
Only try to pick resources by hints if a container is annotated
for it using 'pick-resources-by-hints.resource-policy.nri.io'.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-07-02 09:56:46 +03:00
Krisztian Litkey 10be15819c
topology-aware: pick memory by podresapi hint(s) first.
If a container has pod resource API hints, try allocating and
pinning memory from the nodes the hints indicate locality to.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-07-02 09:56:46 +03:00
Krisztian Litkey 2153e80189
topology-aware: take exclusive CPUs by podresapi hint(s) first.
If a container asks for at least as many exclusive CPUs as it
has pod resource API hints, try allocating CPUs by hints first.

With multiple devices allocated to a single container, this can
help in cases where the collective locality of devices forces
allocation high in the pool tree, where we should prefer CPUs
with locality to one of the devices and avoid other CPUs.

For instance, with devices having locality to NUMA nodes #0 and
#3 ('half' of sockets #0 and #1) and a request for 2 CPUs, we
end up in the root pool. But there we should prefer allocating
CPUs with locality to NUMA node #0 or #3 and avoid any CPU with
locality to node #1 or #2.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-07-02 09:56:46 +03:00
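The hint-based CPU preference described above can be illustrated with a small, self-contained Go sketch. The topology, the `cpuNode` map, and the `pickByHints` function are all hypothetical stand-ins for illustration, not the plugin's actual code:

```go
package main

import "fmt"

// cpuNode maps each CPU to the NUMA node it is local to
// (hypothetical topology: 4 nodes, 8 CPUs, 2 CPUs per node).
var cpuNode = map[int]int{0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}

// pickByHints returns up to count CPUs with locality to the hinted
// NUMA nodes, skipping CPUs local to non-hinted nodes.
func pickByHints(hintNodes map[int]bool, count int) []int {
	picked := []int{}
	for cpu := 0; cpu < len(cpuNode) && len(picked) < count; cpu++ {
		if hintNodes[cpuNode[cpu]] {
			picked = append(picked, cpu)
		}
	}
	return picked
}

func main() {
	// Devices with locality to NUMA nodes #0 and #3, request for 2 CPUs:
	// only CPUs local to node #0 or #3 are candidates.
	fmt.Println(pickByHints(map[int]bool{0: true, 3: true}, 2))
}
```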
Krisztian Litkey 020899473f
topology-aware: give higher priority to topology hints.
Give topology hints higher priority than before, putting them
right below annotated affinities. This gives hints priority
over memory pinning tightness, which is preferable when a
container allocates multiple devices with different memory
locality.

Hints now take precedence over annotated memory type. This
might be a bit questionable, since hints are implied while
memory type annotations are explicit, but we can probably live
with it for the time being. Hints can be selectively disabled
per pod or container to restore the earlier behavior.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-07-02 09:56:45 +03:00
Krisztian Litkey 72fcbcf7e7
podresapi: allow checking hint type.
Allow checking if a topology hint is based on pod resource API.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-07-02 09:56:45 +03:00
Krisztian Litkey 061808f02c
helm: set defaults for agent config in values.
Set defaults for agent config in values. This makes it easy to
enable the pod resource API by passing this to helm install:

   --set config.agent.podResourceAPI=true

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-07-02 09:56:45 +03:00
Feruzjon Muyassarov ffbbf5c950
Merge pull request #544 from klihub/devel/use-tagged-goresctrl
go.{mod,sum}: use new goresctrl tag v0.9.0.
2025-06-30 23:46:07 +03:00
Krisztian Litkey af3807a78d
go.{mod,sum}: point goresctrl to new v0.9.0 tag.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-30 22:00:53 +03:00
Feruzjon Muyassarov 07d7788c38
Merge pull request #543 from klihub/devel/expose-rdt-metrics
resmgr: expose RDT metrics.
2025-06-30 14:37:24 +03:00
Krisztian Litkey 8d6d3280fc
resmgr: expose goresctrl RDT metrics.
Expose the RDT metrics provided by `goresctrl`, making them
available as `rdt` within the `policy` metrics group.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-28 16:00:48 +03:00
Krisztian Litkey 5dac03f57a
rdt: update for adjusted logging.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-28 16:00:48 +03:00
Krisztian Litkey 63906272d2
log: implement slog.Handler for log bridging.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-28 16:00:48 +03:00
Krisztian Litkey 425f13a47f
go.{mod,sum}: bump goresctrl to latest main/HEAD.
Bump goresctrl to latest HEAD to get updated/improved RDT
metrics support.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-28 15:49:27 +03:00
Markus Lehtonen be6d152e36
Merge pull request #542 from klihub/docs/typo-fixes
docs: fix a few typos.
2025-06-23 17:07:02 +03:00
Krisztian Litkey d72a295797
docs: fix a few rushed-in typos.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-19 23:08:09 +03:00
Krisztian Litkey ce84ee1f09 docs: describe common RDT functionality.
Add a chapter about common resource management policy features.
Add an initial description of RDT/cache control there.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-19 23:00:08 +03:00
Krisztian Litkey e35d4c5dcf helm: allow enabling RDT control via charts.
Added an RDT control config block to the balloons and topology-
aware policy configuration. The control defaults to off. It can
be turned on by passing '--set config.control.rdt.enable=true'
to helm during installation. This will also add the necessary
extra caps (CAP_SYS_ADMIN and CAP_DAC_OVERRIDE) for resctrlfs
access.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-19 23:00:08 +03:00
Krisztian Litkey 97ad2433f8 resmgr: apply RDT configuration using goresctrl.
Apply RDT configuration if the corresponding control is enabled
and we have class/partitioning configuration data.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-19 23:00:08 +03:00
Krisztian Litkey 691902a324 config,crds: add RDT class configuration.
Add RDT class and partition configuration. If given and the
corresponding control is enabled, the configuration is used
to set up RDT CLOSes using goresctrl.

The configuration is almost identical to the goresctrl RDT
configuration, but does not support the shorthand notation
for unified caches which would be problematic for a custom
resource data type.

IOW, one cannot simply say

```
...
  l[23]Allocation:
    all: 75%
...
```

but instead needs to explicitly write this as

```
...
  l[23]Allocation:
    all:
      unified: 75%
...
```

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-19 23:00:08 +03:00
Antti Kervinen 0880f51027 e2e: test composite balloons
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-17 11:04:06 +03:00
Antti Kervinen 3eca11c958 doc: document composite balloons
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-17 11:04:06 +03:00
Antti Kervinen 7ce5682767 balloons: implement composite balloons
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-17 11:04:06 +03:00
Antti Kervinen 612de8cc70 balloons: add "components" to the config for composite balloons
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-17 11:04:06 +03:00
Antti Kervinen 70b0e49873 doc: how to allow using all CPUs and memories with a balloon
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-17 11:03:44 +03:00
Krisztian Litkey 4e519039df resmgr: update container-exported resource data.
Update resource data exported to containers when a container's
allocation changes.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-17 10:59:22 +03:00
Antti Kervinen b459f69d9f e2e: test the memory-policy plugin
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-17 08:41:53 +03:00
Antti Kervinen 9c3d761ce7 e2e: helm-launch supports launching the memory-policy plugin
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-17 08:41:53 +03:00
Antti Kervinen 67d8c0166d e2e: ansible deployment file for the memory-policy plugin
This is the first Ansible deployment file for an NRI
non-resource-policy plugin. It carries some technical debt:
duplicate tasks shared with the NRI resource policy plugin
deployment files should be refactored. This version is
deliberately kept separate from them to minimize conflicts.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-17 08:41:53 +03:00
Antti Kervinen c117cc9349 doc: an example of preventing creation of a scheduled container
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-16 17:22:35 +03:00
Antti Kervinen 1480cfb63e helm: readme for nri-memory-policy
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-16 17:15:04 +03:00
Antti Kervinen 09e6a60f25 helm: add Chart for memory-policy deployment
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-16 17:15:04 +03:00
Antti Kervinen 31f521d84d boilerplates: fix typos in copyrights
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-16 17:14:18 +03:00
Antti Kervinen 0b6250c2f5 memory-policy: Dockerfile and a sample config for the plugin image
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-16 17:14:18 +03:00
Antti Kervinen e00a80cc70 memory-policy: NRI plugin for setting memory policy
The nri-memory-policy plugin sets the memory policy when an annotated
container is being created. The policy can be made effective in two
alternative ways: first, using a MemoryPolicy NRI adjustment in the
OCI spec, and second, by injecting mpolset into the container and
prefixing the container command with it.

Memory policies are annotated into pods either as a memory policy
class name or as a custom annotated policy. Memory policy classes are
defined in the plugin configuration.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-16 17:14:18 +03:00
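Resolving an annotated memory policy class against the plugin configuration, as described above, can be sketched roughly as follows. The type, the class names, and the `resolve` helper are all hypothetical, chosen only to illustrate the lookup:

```go
package main

import "fmt"

// MemoryPolicy is a hypothetical representation of a memory policy:
// a mode plus the NUMA nodes it applies to.
type MemoryPolicy struct {
	Mode  string
	Nodes string
}

// classes stands in for memory policy classes defined in the plugin
// configuration (names are illustrative, not the plugin's defaults).
var classes = map[string]MemoryPolicy{
	"interleave-all": {Mode: "MPOL_INTERLEAVE", Nodes: "all"},
}

// resolve maps a pod annotation value naming a class to the policy
// defined for that class in the configuration.
func resolve(annotation string) (MemoryPolicy, bool) {
	p, ok := classes[annotation]
	return p, ok
}

func main() {
	if p, ok := resolve("interleave-all"); ok {
		fmt.Printf("%s on nodes %s\n", p.Mode, p.Nodes)
	}
}
```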
Feruzjon Muyassarov 8363703c35
Merge pull request #538 from klihub/fixes/docs/golang-version
docs: fix golang version.
2025-06-16 10:00:18 +03:00
Krisztian Litkey 1c8082e462
docs: fix golang version.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-14 17:14:54 +03:00
Krisztian Litkey 5e2aeb3522 sysfs: make CPU frequency Min/Max members exported.
Make the CPUFreq Min/Max members exported. Fold baseFreq into
the same struct.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-11 14:30:51 +03:00
Feruzjon Muyassarov dec6e0c51e go.{mod,sum} update and tidy deps
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@est.tech>
2025-06-11 14:29:10 +03:00
Feruzjon Muyassarov f9e9524199 Drop tools.go in favor of native tool directive support in go 1.24
github: strip away v prefix during bundle submission

Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@est.tech>
2025-06-11 14:29:10 +03:00
Feruzjon Muyassarov 809bf76e6e
Merge pull request #529 from klihub/devel/build-and-publish-test-images-and-charts
.github: allow building and publishing per-topic test images and Helm charts.
2025-06-11 10:52:24 +03:00
Feruzjon Muyassarov 33a492b898
Merge pull request #528 from klihub/devel/bump-golang-version
golang: bump go version to 1.24[.3].
2025-06-10 15:47:47 +03:00
Markus Lehtonen 2cce78ddb5 docs/balloons: document missing instrumentation options
The instrumentation.metrics.polled was left out as I was not sure if
this has any practical use at the moment.

Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2025-06-10 15:47:12 +03:00
Markus Lehtonen aab98d43e7 docs/balloons: document logger config options
Signed-off-by: Markus Lehtonen <markus.lehtonen@intel.com>
2025-06-10 15:46:36 +03:00
Markus Lehtonen 5f48d2895f
Merge pull request #532 from klihub/fixes/container-exported-memset-format
topology-aware: fix format of container-exported memsets.
2025-06-10 15:14:13 +03:00
Krisztian Litkey 3d616d2886 sysfs: fix CPU.GetCaches() to not return empty slice.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-10 15:09:58 +03:00
Krisztian Litkey 8272330be1
topology-aware: fix format of container-exported memsets.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-10 14:51:56 +03:00
Antti Kervinen f4eb3f5a5c doc: fix balloons documentation preserve option example
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-09 22:12:57 +03:00
Krisztian Litkey b5e73533ca
.github: allow building and publishing test images and charts.
Update github workflows to allow publishing images and Helm charts
for test builds. These builds are triggered by pushing to a branch
matching test/build/$TOPIC. Images and charts will be named after
$TOPIC taken from the test branch name.

For instance, pushing a v0.9 development tree to a branch named
'test/build/dra-driver' in the 'klihub' fork of the repo builds
and publishes (among others) this image and chart to the OCI repo
ghcr.io/klihub/nri-plugins:

  nri-plugins/nri-resource-policy-topology-aware v0.9-dra-driver-unstable
  helm-charts/nri-resource-policy-topology-aware v0.9-dra-driver-unstable

The built chart references the image by default. IOW, once such a
build is ready, it can be installed with a single command like this:

helm install --devel -n kube-system test \
  oci://ghcr.io/klihub/nri-plugins/helm-charts/nri-resource-policy-topology-aware \
  --version v0.9-dra-driver-unstable \
  --set image.pullPolicy=Always

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-09 20:18:01 +03:00
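The branch-to-tag derivation described above can be sketched in Go. This mirrors the shell `case` logic of the workflows (a sketch, not the workflow code itself); `tagForRef` and its signature are illustrative:

```go
package main

import (
	"fmt"
	"strings"
)

// tagForRef derives an image/chart tag from a git ref: test/build/$TOPIC
// branches yield "<majmin>-<topic>-unstable", with any '/' in the topic
// replaced by '-'. A sketch of the workflow's shell logic, not its code.
func tagForRef(ref, majmin string) string {
	switch {
	case strings.HasPrefix(ref, "refs/heads/test/build/"):
		variant := strings.TrimPrefix(ref, "refs/heads/test/build/")
		variant = strings.ReplaceAll(variant, "/", "-")
		return majmin + "-" + variant + "-unstable"
	case strings.HasPrefix(ref, "refs/heads/release-"):
		return "v" + strings.TrimPrefix(ref, "refs/heads/release-") + "-unstable"
	case ref == "refs/heads/main":
		return "unstable"
	}
	return ""
}

func main() {
	fmt.Println(tagForRef("refs/heads/test/build/dra-driver", "v0.9"))
}
```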
Krisztian Litkey 01df6b616c
.github: point unstable charts to same OCI repo.
Point built unstable charts to images in the same OCI repo.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-09 20:16:58 +03:00
Antti Kervinen 57a1d88d4a mpolset: get/set memory policy and exec a command
mpolset is a standalone executable that can be compiled into a static
binary. In this form it can be injected into containers, where the
original command of a container can be prefixed with mpolset and
arguments that define the default memory policy for the original
command in the container.

If no command is given, mpolset prints the default memory
policy. This offers an easy way to test how kernels handle missing
nodes and various flag combinations, for instance.

For example: set, execute and get memory policy:

./mpolset -nodes 0-2 -mode MPOL_INTERLEAVE -- ./mpolset

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-09 19:04:54 +03:00
Antti Kervinen 81122e30b4 mempolicy: go interface for set_mempolicy and get_mempolicy syscalls
Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-09 19:03:29 +03:00
Krisztian Litkey bcc229856a
golang: bump go version to 1.24[.3].
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-06-09 14:23:46 +03:00
Antti Kervinen a2f4601206 e2e: fix CNI directory in recent containerd versions
Due to a containerd configuration change in recent versions
(switching from bin_dir to bin_dirs), our configuration modification
for Fedora stopped working. This patch makes the CNI location fix
more robust, so it works with both old and new containerd versions.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-06-02 15:23:05 +03:00
Antti Kervinen e0cf7ef316 e2e: drop old shims from custom-build containerd
Recent versions of containerd do not build dropped shims. It is
realistic to assume that we are building and testing with the most
recent containerd versions rather than old ones, so let's support
that case in our default e2e workflow, too.

This change affects only testing with containerd_src set to a local
containerd build.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-05-26 15:20:15 +03:00
Antti Kervinen 021cd91304 sysfs: add a helper for gathering various IDs related to CPUs
This helper gathers package, die, node, etc. IDs from a set of
CPUs. It takes care of checking the validity/existence of each CPU
in the set.

Signed-off-by: Antti Kervinen <antti.kervinen@intel.com>
2025-05-26 14:54:44 +03:00
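The shape of such a helper can be sketched as below. The `cpuInfo` data, the `pick` selector, and the function signature are hypothetical stand-ins, not the actual sysfs helper:

```go
package main

import (
	"fmt"
	"sort"
)

// cpuInfo stands in for per-CPU sysfs data (hypothetical values,
// not read from /sys here).
type cpuInfo struct{ pkg, die, node int }

var cpus = map[int]cpuInfo{
	0: {0, 0, 0}, 1: {0, 0, 0}, 2: {0, 1, 1}, 3: {1, 0, 2},
}

// idsForCPUs gathers the distinct IDs of one kind (selected by pick)
// for a set of CPUs, skipping CPUs that do not exist.
func idsForCPUs(cpuIDs []int, pick func(cpuInfo) int) []int {
	seen := map[int]bool{}
	ids := []int{}
	for _, id := range cpuIDs {
		info, ok := cpus[id] // validity check: skip unknown CPUs
		if !ok {
			continue
		}
		if !seen[pick(info)] {
			seen[pick(info)] = true
			ids = append(ids, pick(info))
		}
	}
	sort.Ints(ids)
	return ids
}

func main() {
	// Node IDs for CPUs 0, 2, 3; CPU 99 does not exist and is skipped.
	fmt.Println(idsForCPUs([]int{0, 2, 3, 99}, func(c cpuInfo) int { return c.node }))
}
```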
Feruzjon Muyassarov c8e943a528
Merge pull request #512 from klihub/fixes/olm-bundle-submissions
.github: only trigger OLM submission on release.
2025-05-13 12:32:45 +05:00
Krisztian Litkey 551e2cc9b8
.github: only trigger OLM submission on release.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-05-13 09:18:38 +03:00
Feruzjon Muyassarov 53a3f245b5 github: remove v prefix from version tag during bundle submission
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@est.tech>
2025-04-14 15:08:28 +03:00
Krisztian Litkey fba3364f4e resmgr: purge cached pod resource list upon pod stop/removal.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-04-07 09:34:11 +03:00
Krisztian Litkey 465eda6108 podresapi: add support for purging cached pod resource lists.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-04-07 09:34:11 +03:00
Feruzjon Muyassarov b027f002b6 github: explicitly ensure contents-only copying
Signed-off-by: Feruzjon Muyassarov <feruzjon.muyassarov@est.tech>
2025-04-03 15:13:31 +03:00
Feruzjon Muyassarov 3968d095d1
Merge pull request #506 from klihub/fixes/helm-workflow
.github: fix cut-and-paste typo breaking helm fix.
2025-04-01 14:58:32 +03:00
Krisztian Litkey 9d32586ab8
.github: fix cut-and-paste typo breaking helm fix.
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-04-01 14:55:41 +03:00
Feruzjon Muyassarov 048000c491
Merge pull request #505 from klihub/fixes/helm-sign-one-chart-at-a-time
.github: sign one Helm chart at a time.
2025-04-01 14:49:15 +03:00
Krisztian Litkey 2389a502c4
.github: sign one Helm chart at a time.
Work around Helm flakiness when packaging/signing multiple
charts with a passphrase provided on stdin/pipe by signing
one chart at a time.

Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
2025-04-01 14:43:30 +03:00
82 changed files with 5457 additions and 293 deletions


@@ -7,6 +7,7 @@ on:
branches:
- main
- release-*
- test/build/*
env:
CHARTS_DIR: deployment/helm/
@@ -55,14 +56,17 @@ jobs:
run: |
find "$CHARTS_DIR" -name values.yaml | xargs -I '{}' \
sed -e s"/pullPolicy:.*/pullPolicy: IfNotPresent/" -i '{}'
echo ${{ secrets.BOT_PASSPHRASE }} | helm package \
--sign \
--key ${{ steps.import-gpg.outputs.email }} \
--keyring ~/.gnupg/secring.gpg \
--version "$GITHUB_REF_NAME" \
--app-version "$GITHUB_REF_NAME" \
"$CHARTS_DIR"/* \
--passphrase-file "-"
for chart in "$CHARTS_DIR"/*; do
echo ${{ secrets.BOT_PASSPHRASE }} | \
helm package \
--sign \
--key ${{ steps.import-gpg.outputs.email }} \
--keyring ~/.gnupg/secring.gpg \
--version "$GITHUB_REF_NAME" \
--app-version "$GITHUB_REF_NAME" \
--passphrase-file "-" \
$chart;
done
find . -name '*.tgz' -print | while read SRC_FILE; do
DEST_FILE=$(echo $SRC_FILE | sed 's/v/helm-chart-v/g')
mv $SRC_FILE $DEST_FILE
@@ -103,15 +107,32 @@ jobs:
# - image version: 'unstable'.
majmin="$(git describe --tags | sed -E 's/(v[0-9]*\.[0-9]*).*$/\1/')"
CHART_VERSION="${majmin}-unstable"
if [ $GITHUB_REF_NAME = "main" ]; then
APP_VERSION=unstable
else
APP_VERSION="${majmin}-unstable"
fi
variant=""
case $GITHUB_REF_NAME in
main)
APP_VERSION=unstable
;;
release-*)
APP_VERSION="${majmin}-unstable"
;;
test/build/*)
variant="${GITHUB_REF_NAME#test/build/}"
variant="${variant//\//-}"
tag="${majmin}-${variant}-unstable"
APP_VERSION="${majmin}-${variant}-unstable"
CHART_VERSION="${majmin}-${variant}-unstable"
;;
esac
echo "- Using APP_VERSION=$APP_VERSION"
echo " CHART_VERSION=$CHART_VERSION"
# Package charts
find "$CHARTS_DIR" -name values.yaml | xargs -I '{}' \
sed -e s"/pullPolicy:.*/pullPolicy: Always/" -i '{}'
helm package --version "$CHART_VERSION" --app-version $APP_VERSION "$CHARTS_DIR"/*
orig="ghcr.io/containers/nri-plugins"
repo="${{ env.REGISTRY }}/${{ env.REGISTRY_PATH }}"
find "$CHARTS_DIR" -name values.yaml | xargs -I '{}' \
sed -e s"| name: $orig/| name: $repo/|g" -i '{}'
helm package --version "$CHART_VERSION" --app-version "$APP_VERSION" "$CHARTS_DIR"/*
find "$CHARTS_DIR" -name values.yaml | xargs -I '{}' \
git checkout '{}'
mkdir ../$UNSTABLE_CHARTS


@@ -5,6 +5,7 @@ on:
branches:
- main
- release-*
- test/build/*
tags:
- v[0-9]+.[0-9]+.[0-9]+
@@ -47,6 +48,7 @@ jobs:
echo " - image: $img"
echo " - digest: $sha"
echo " - digging out tag from git ref $GITREF..."
variant=""
case $GITREF in
refs/tags/v*)
tag="${GITREF#refs/tags/}"
@@ -57,6 +59,12 @@
refs/heads/release-*)
tag="v${GITREF#refs/heads/release-}-unstable"
;;
refs/heads/test/build/*)
variant="${GITREF#refs/heads/test/build/}"
variant="${variant//\//-}"
majmin="$(git describe --tags | sed -E 's/(v[0-9]*\.[0-9]*).*$/\1/')"
tag="${majmin}-${variant}-unstable"
;;
*)
echo "error: can't determine tag."
exit 1


@@ -4,10 +4,6 @@ on:
release:
types:
- published
issues:
types:
- opened
- reopened
env:
HUB_REPO: k8s-operatorhub/community-operators
@@ -27,9 +23,6 @@ jobs:
BOT_GPG_PASSPHRASE: ${{ secrets.BOT_GPG_PASSPHRASE }}
steps:
- name: Debug dump workflow
uses: raven-actions/debug@v1
- name: Determine target repository and tag
id: check
run: |
@@ -41,17 +34,6 @@
exit 0
fi
# If trigger was a matching issue, file PR against the filer's fork which we
# implicitly assume to exist.
title="${{ github.event.issue.title }}"
if [[ "$title" =~ ^'OLM test submit v' ]]; then
USER_REPO="${{ github.event.issue.user.login }}/community-operators"
echo "repo=$USER_REPO" >> $GITHUB_OUTPUT
echo "fork=$BOT_REPO" >> $GITHUB_OUTPUT
echo "tag=${title#OLM test submit }" >> $GITHUB_OUTPUT
exit 0
fi
# Otherwise skip.
echo "skip=true" >> $GITHUB_OUTPUT
@@ -106,11 +88,14 @@ jobs:
with:
fetch-depth: 0
- name: Build the bundle
- name: Set version environment variable
run: |
version="${{ env.TAG }}"
version="${version#v}"
pushd deployment/operator && VERSION=${version} make bundle && popd
echo "version=$version" >> $GITHUB_ENV
- name: Build the bundle
run: pushd deployment/operator && VERSION=${version} make bundle && popd
- name: Checkout upstream community-operators repo
uses: actions/checkout@v4
@@ -132,8 +117,8 @@
- name: Copy the bundle to the community-operators repo
run: |
mkdir -p community-operators/operators/nri-plugins-operator/${{ env.TAG }}
cp -r deployment/operator/bundle/ community-operators/operators/nri-plugins-operator/${{ env.TAG }}
mkdir -p community-operators/operators/nri-plugins-operator/${version}
cp -r deployment/operator/bundle/* community-operators/operators/nri-plugins-operator/${version}
- name: Create Pull Request
uses: peter-evans/create-pull-request@v6
@@ -153,10 +138,3 @@ jobs:
body: |
Added OLM bundle for [nri-plugins operator ${{ env.TAG }}](https://github.com/containers/nri-plugins/releases/tag/${{ env.TAG }})
> Auto-generated by `Github Actions Bot`
- name: Close triggering issue on success
if: ${{ github.event_name == 'issues' }}
env:
GH_TOKEN: ${{ github.token }}
run: |
gh issue close --comment "Test PR filed successfully." ${{ github.event.issue.number }}


@@ -37,7 +37,7 @@ GO_LINT := golint -set_exit_status
GO_FMT := gofmt
GO_VET := $(GO_CMD) vet
GO_DEPS := $(GO_CMD) list -f '{{ join .Deps "\n" }}'
GO_VERSION ?= 1.23.4
GO_VERSION ?= 1.24.3
GO_MODULES := $(shell $(GO_CMD) list ./...)
GO_SUBPKGS := $(shell find ./pkg -name go.mod | sed 's:/go.mod::g' | grep -v testdata | \
@@ -77,12 +77,14 @@ PLUGINS ?= \
nri-resource-policy-topology-aware \
nri-resource-policy-balloons \
nri-resource-policy-template \
nri-memory-policy \
nri-memory-qos \
nri-memtierd \
nri-sgx-epc
BINARIES ?= \
config-manager
config-manager \
mpolset
OTHER_IMAGE_TARGETS ?= \
nri-plugins-operator-image \


@@ -1,5 +1,5 @@
# Build Stage
ARG GO_VERSION=1.23
ARG GO_VERSION=1.24
FROM golang:${GO_VERSION}-bullseye AS build
WORKDIR /go/builder

cmd/mpolset/Dockerfile Normal file

@@ -0,0 +1,27 @@
ARG GO_VERSION=1.23
FROM golang:${GO_VERSION}-bullseye AS builder
ARG IMAGE_VERSION
ARG BUILD_VERSION
ARG BUILD_BUILDID
ARG DEBUG=0
ARG NORACE=0
WORKDIR /go/builder
# Fetch go dependencies in a separate layer for caching
COPY go.mod go.sum ./
COPY pkg/topology/ pkg/topology/
RUN go mod download
# Build mpolset
COPY . .
RUN make BINARIES=mpolset IMAGE_VERSION=${IMAGE_VERSION} BUILD_VERSION=${BUILD_VERSION} BUILD_BUILDID=${BUILD_BUILDID} build-binaries-static
FROM gcr.io/distroless/static
COPY --from=builder /go/builder/build/bin/mpolset /bin/mpolset
ENTRYPOINT ["/bin/mpolset"]

cmd/mpolset/mpolset.go Normal file

@@ -0,0 +1,164 @@
// Copyright The NRI Plugins Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// mpolset is an executable that sets the memory policy for a process
// and then executes the specified command.
package main
import (
"flag"
"fmt"
"os"
"os/exec"
"strconv"
"strings"
"syscall"
"github.com/containers/nri-plugins/pkg/mempolicy"
"github.com/containers/nri-plugins/pkg/utils/cpuset"
"github.com/sirupsen/logrus"
)
type logrusFormatter struct{}
func (f *logrusFormatter) Format(entry *logrus.Entry) ([]byte, error) {
return fmt.Appendf(nil, "mpolset: %s %s\n", entry.Level, entry.Message), nil
}
var (
log *logrus.Logger
)
func modeToString(mode uint) string {
// Convert mode to string representation
flagsStr := ""
for name, value := range mempolicy.Flags {
if mode&value != 0 {
flagsStr += "|"
flagsStr += name
mode &= ^value
}
}
modeStr := mempolicy.ModeNames[mode]
if modeStr == "" {
modeStr = fmt.Sprintf("unknown mode %d", mode)
}
return modeStr + flagsStr
}
func main() {
var err error
log = logrus.StandardLogger()
log.SetFormatter(&logrusFormatter{})
modeFlag := flag.String("mode", "", "Memory policy mode. Valid values are mode numbers and names, e.g. 3 or MPOL_INTERLEAVE. List available modes with -mode help")
flagsFlag := flag.String("flags", "", "Comma-separated list of memory policy flags, e.g. MPOL_F_STATIC_NODES. List available flags with -flags help")
nodesFlag := flag.String("nodes", "", "Comma-separated list of nodes, e.g. 0,1-3")
ignoreErrorsFlag := flag.Bool("ignore-errors", false, "Ignore errors when setting memory policy")
verboseFlag := flag.Bool("v", false, "Enable verbose logging")
veryVerboseFlag := flag.Bool("vv", false, "Enable very verbose logging")
flag.Parse()
log.SetLevel(logrus.InfoLevel)
if *verboseFlag {
log.SetLevel(logrus.DebugLevel)
}
if *veryVerboseFlag {
log.SetLevel(logrus.TraceLevel)
}
execCmd := flag.Args()
mode := uint(0)
switch {
case *modeFlag == "help":
fmt.Printf("Valid memory policy modes:\n")
for mode := range len(mempolicy.ModeNames) {
fmt.Printf(" %s (%d)\n", mempolicy.ModeNames[uint(mode)], mode)
}
os.Exit(0)
case *modeFlag != "" && (*modeFlag)[0] >= '0' && (*modeFlag)[0] <= '9':
imode, err := strconv.Atoi(*modeFlag)
if err != nil {
log.Fatalf("invalid -mode: %v", err)
}
mode = uint(imode)
case *modeFlag != "":
ok := false
mode, ok = mempolicy.Modes[*modeFlag]
if !ok {
log.Fatalf("invalid -mode: %v", *modeFlag)
}
case len(execCmd) > 0:
log.Fatalf("missing -mode")
}
nodes := []int{}
if *nodesFlag != "" {
nodeMask, err := cpuset.Parse(*nodesFlag)
if err != nil {
log.Fatalf("invalid -nodes: %v", err)
}
nodes = nodeMask.List()
}
if *flagsFlag != "" {
if strings.Contains(*flagsFlag, "help") {
fmt.Printf("Valid memory policy flags:\n")
for flag := range mempolicy.Flags {
fmt.Printf(" %s\n", flag)
}
os.Exit(0)
}
flags := strings.Split(*flagsFlag, ",")
for _, flag := range flags {
flagBit, ok := mempolicy.Flags[flag]
if !ok {
log.Fatalf("invalid -flags: %v", flag)
}
mode |= flagBit
}
}
if len(execCmd) == 0 {
mode, nodes, err := mempolicy.GetMempolicy()
if err != nil {
log.Fatalf("GetMempolicy failed: %v", err)
}
modeStr := modeToString(mode)
fmt.Printf("Current memory policy: %s (%d), nodes: %s\n", modeStr, mode, cpuset.New(nodes...).String())
os.Exit(0)
}
log.Debugf("setting memory policy: %s (%d), nodes: %v\n", modeToString(mode), mode, cpuset.New(nodes...).String())
if err := mempolicy.SetMempolicy(mode, nodes); err != nil {
log.Errorf("SetMempolicy failed: %v", err)
if ignoreErrorsFlag == nil || !*ignoreErrorsFlag {
os.Exit(1)
}
}
log.Debugf("executing: %v\n", execCmd)
executable, err := exec.LookPath(execCmd[0])
if err != nil {
log.Fatalf("Looking for executable %q failed: %v", execCmd[0], err)
}
log.Tracef("- executable: %q\n", execCmd[0])
log.Tracef("- environment: %v\n", os.Environ())
err = syscall.Exec(executable, execCmd, os.Environ())
if err != nil {
log.Fatalf("Executing %q failed: %v", executable, err)
}
}


@@ -1,4 +1,4 @@
ARG GO_VERSION=1.23
ARG GO_VERSION=1.24
FROM golang:${GO_VERSION}-bullseye AS builder


@@ -19,6 +19,7 @@ import (
"math"
"path/filepath"
"strconv"
"strings"
cfgapi "github.com/containers/nri-plugins/pkg/apis/config/v1alpha1/resmgr/policy/balloons"
"github.com/containers/nri-plugins/pkg/cpuallocator"
@@ -116,6 +117,7 @@ type Balloon struct {
LoadedVirtDevs map[string]struct{}
cpuTreeAlloc *cpuTreeAllocator
memTypeMask libmem.TypeMask
components []*Balloon
}
// loadClassVirtDev is a virtual device under load due to a load class.
@@ -166,8 +168,24 @@ func (bln Balloon) AvailMilliCpus() int {
}
func (bln Balloon) MaxAvailMilliCpus(freeCpus cpuset.CPUSet) int {
if bln.Def.MaxCpus == NoLimit {
return (bln.Cpus.Size() + freeCpus.Size()) * 1000
availableFreeCpus := freeCpus.Size()
if len(bln.components) > 0 {
// MaxCpus of component balloons can limit the size of
// the composite balloon.
compMinAvailmCPUs := -1
for _, comp := range bln.components {
mcpus := comp.MaxAvailMilliCpus(freeCpus)
if mcpus < compMinAvailmCPUs || compMinAvailmCPUs == -1 {
compMinAvailmCPUs = mcpus
}
}
// Assume this composite balloon allocates equal
// number of CPUs for every component balloon.
sumMinAvailCPUs := (compMinAvailmCPUs * len(bln.components)) / 1000
availableFreeCpus = min(sumMinAvailCPUs, availableFreeCpus)
}
if bln.Def.MaxCpus == NoLimit || bln.Def.MaxCpus > availableFreeCpus {
return (bln.Cpus.Size() + availableFreeCpus) * 1000
}
return bln.Def.MaxCpus * 1000
}
@@ -430,6 +448,23 @@ func (p *balloons) GetTopologyZones() []*policy.TopologyZone {
Value: fmt.Sprintf("%dm", bln.AvailMilliCpus()-blnReqmCpu),
},
}
if len(bln.components) > 0 {
var compCpusString func(*Balloon) string
compCpusString = func(b *Balloon) string {
if len(b.components) == 0 {
return fmt.Sprintf("{%s}", b.Cpus)
}
res := []string{}
for _, comp := range b.components {
res = append(res, compCpusString(comp))
}
return fmt.Sprintf("{%s}", strings.Join(res, ", "))
}
attributes = append(attributes, &policyapi.ZoneAttribute{
Name: policyapi.ComponentCPUsAttribute,
Value: compCpusString(bln),
})
}
zone.Attributes = append(zone.Attributes, attributes...)
zones = append(zones, zone)
@@ -741,10 +776,24 @@ func (p *balloons) useCpuClass(bln *Balloon) error {
// - User-defined CPU AllocatorPriority: bln.Def.AllocatorPriority.
// - All existing balloon instances: p.balloons.
// - CPU configurations by user: bln.Def.CpuClass (for bln in p.balloons)
if len(bln.components) > 0 {
// If this is a composite balloon, CPU class is
// defined in the component balloons.
log.Debugf("apply CPU class %q on CPUs %s of composite balloon %q",
bln.Def.CpuClass, bln.Cpus, bln.PrettyName())
for _, compBln := range bln.components {
if err := p.useCpuClass(compBln); err != nil {
log.Warnf("failed to apply CPU class %q on CPUs %s of %q in composite balloon %q: %v",
compBln.Def.CpuClass, compBln.Cpus, compBln.PrettyName(), bln.PrettyName(), err)
}
}
return nil
}
if err := cpucontrol.Assign(p.cch, bln.Def.CpuClass, bln.Cpus.UnsortedList()...); err != nil {
log.Warnf("failed to apply class %q on CPUs %q: %v", bln.Def.CpuClass, bln.Cpus, err)
} else {
log.Debugf("apply CPU class %q on CPUs %q of %q", bln.Def.CpuClass, bln.Cpus, bln.PrettyName())
}
return nil
}
@ -756,7 +805,11 @@ func (p *balloons) forgetCpuClass(bln *Balloon) {
if err := cpucontrol.Assign(p.cch, p.bpoptions.IdleCpuClass, bln.Cpus.UnsortedList()...); err != nil {
log.Warnf("failed to forget class %q of cpus %q: %v", bln.Def.CpuClass, bln.Cpus, err)
} else {
if len(bln.components) > 0 {
log.Debugf("forget classes of composite balloon %q cpus %q", bln.Def.Name, bln.Cpus)
} else {
log.Debugf("forget class %q of cpus %q", bln.Def.CpuClass, bln.Cpus)
}
}
}
@ -821,6 +874,63 @@ func (p *balloons) virtDevsChangeDuringCpuAllocation(loadClassNames []string) bo
return false
}
func (p *balloons) newCompositeBalloon(blnDef *BalloonDef, confCpus bool, freeInstance int) (*Balloon, error) {
componentBlns := make([]*Balloon, 0, len(blnDef.Components))
deleteComponentBlns := func() {
for _, compBln := range componentBlns {
log.Debugf("removing component balloon %s of composite balloon %s",
compBln.PrettyName(), blnDef.Name)
p.deleteBalloon(compBln)
}
}
for _, comp := range blnDef.Components {
// Create a balloon for each component.
compDef := p.balloonDefByName(comp.DefName)
if compDef == nil {
deleteComponentBlns()
return nil, balloonsError("unknown balloon definition %q in composite balloon %q",
comp.DefName, blnDef.Name)
}
compBln, err := p.newBalloon(compDef, confCpus)
if err != nil || compBln == nil {
deleteComponentBlns()
return nil, balloonsError("failed to create component balloon %q for composite balloon %q: %v",
comp.DefName, blnDef.Name, err)
}
componentBlns = append(componentBlns, compBln)
log.Debugf("created component balloon %s of composite balloon %s",
compBln.PrettyName(), blnDef.Name)
}
memTypeMask, _ := memTypeMaskFromStringList(blnDef.MemoryTypes)
bln := &Balloon{
Def: blnDef,
Instance: freeInstance,
Groups: make(map[string]int),
PodIDs: make(map[string][]string),
Cpus: cpuset.New(),
SharedIdleCpus: cpuset.New(),
LoadedVirtDevs: make(map[string]struct{}),
cpuTreeAlloc: nil, // Allocator is not used for composite balloons.
memTypeMask: memTypeMask,
components: componentBlns,
}
log.Debugf("created composite balloon %s with %d components, resizing it to %d mCPU",
bln.PrettyName(), len(bln.components), blnDef.MinCpus*1000)
if err := p.resizeBalloon(bln, blnDef.MinCpus*1000); err != nil {
deleteComponentBlns()
return nil, err
}
if confCpus {
if err := p.useCpuClass(bln); err != nil {
deleteComponentBlns()
log.Errorf("failed to apply CPU configuration to new composite balloon %s[%d] (cpus: %s): %v",
blnDef.Name, bln.Instance, bln.Cpus, err)
return nil, err
}
}
return bln, nil
}
func (p *balloons) newBalloon(blnDef *BalloonDef, confCpus bool) (*Balloon, error) {
var cpus cpuset.CPUSet
var err error
@ -843,6 +953,9 @@ func (p *balloons) newBalloon(blnDef *BalloonDef, confCpus bool) (*Balloon, erro
break
}
}
if len(blnDef.Components) > 0 {
return p.newCompositeBalloon(blnDef, confCpus, freeInstance)
}
// Configure cpuTreeAllocator for this balloon. The reserved
// balloon always prefers to be close to the virtual device
// that is close to ReservedResources CPUs. All other balloon
@ -1289,6 +1402,7 @@ func (p *balloons) applyBalloonDef(balloons *[]*Balloon, blnDef *BalloonDef, fre
func (p *balloons) validateConfig(bpoptions *BalloonsOptions) error {
seenNames := map[string]struct{}{}
undefinedLoadClasses := map[string]struct{}{}
compositeBlnDefs := map[string]*BalloonDef{}
for _, blnDef := range bpoptions.BalloonDefs {
if blnDef.Name == "" {
return balloonsError("missing or empty name in a balloon type")
@ -1321,6 +1435,32 @@ func (p *balloons) validateConfig(bpoptions *BalloonsOptions) error {
if blnDef.PreferIsolCpus && blnDef.ShareIdleCpusInSame != "" {
log.Warn("WARNING: using PreferIsolCpus with ShareIdleCpusInSame is highly discouraged")
}
if len(blnDef.Components) > 0 {
compositeBlnDefs[blnDef.Name] = blnDef
if blnDef.CpuClass != "" {
return balloonsError("composite balloon %q cannot have a CpuClass", blnDef.Name)
}
forbiddenCpuAllocationOptions := []string{}
if blnDef.PreferSpreadOnPhysicalCores != nil {
forbiddenCpuAllocationOptions = append(forbiddenCpuAllocationOptions, "PreferSpreadOnPhysicalCores")
}
if blnDef.AllocatorTopologyBalancing != nil {
forbiddenCpuAllocationOptions = append(forbiddenCpuAllocationOptions, "AllocatorTopologyBalancing")
}
if len(blnDef.PreferCloseToDevices) > 0 {
forbiddenCpuAllocationOptions = append(forbiddenCpuAllocationOptions, "PreferCloseToDevices")
}
if blnDef.PreferIsolCpus {
forbiddenCpuAllocationOptions = append(forbiddenCpuAllocationOptions, "PreferIsolCpus")
}
if blnDef.PreferCoreType != "" {
forbiddenCpuAllocationOptions = append(forbiddenCpuAllocationOptions, "PreferCoreType")
}
if len(forbiddenCpuAllocationOptions) > 0 {
return balloonsError("CPU allocation options not allowed in composite balloons, but %q has: %s",
blnDef.Name, strings.Join(forbiddenCpuAllocationOptions, ", "))
}
}
for _, load := range blnDef.Loads {
undefinedLoadClasses[load] = struct{}{}
}
@ -1337,6 +1477,41 @@ func (p *balloons) validateConfig(bpoptions *BalloonsOptions) error {
if len(undefinedLoadClasses) > 0 {
return balloonsError("loads defined in balloonTypes but missing from loadClasses: %v", undefinedLoadClasses)
}
var circularCheck func(name string, seen map[string]int) error
circularCheck = func(name string, seen map[string]int) error {
if seen[name] > 0 {
return balloonsError("circular composition detected in composite balloon %q", name)
}
seen[name] += 1
if compBlnDef, ok := compositeBlnDefs[name]; ok {
for _, comp := range compBlnDef.Components {
if err := circularCheck(comp.DefName, seen); err != nil {
return err
}
}
}
seen[name] -= 1
return nil
}
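The `circularCheck` closure above is a counting depth-first search: a name seen again while still on the current path means the composition references itself. A standalone sketch of the same idea, with illustrative names and a plain `map[string][]string` in place of the balloon definitions:

```go
package main

import "fmt"

// hasCycle reports whether following component references from name
// revisits a type still on the current DFS path, using the same
// increment/decrement bookkeeping as the validation code above.
func hasCycle(name string, components map[string][]string, seen map[string]int) bool {
	if seen[name] > 0 {
		return true // name is already on the current path
	}
	seen[name]++
	for _, comp := range components[name] {
		if hasCycle(comp, components, seen) {
			return true
		}
	}
	seen[name]-- // leaving name's subtree; diamonds are fine
	return false
}

func main() {
	comps := map[string][]string{
		"a": {"b", "c"},
		"b": {"c"},
	}
	fmt.Println(hasCycle("a", comps, map[string]int{})) // false: a diamond, not a cycle

	comps["c"] = []string{"a"} // introduce a -> c -> a
	fmt.Println(hasCycle("a", comps, map[string]int{})) // true
}
```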
for compBlnName, compBlnDef := range compositeBlnDefs {
for compIdx, comp := range compBlnDef.Components {
if comp.DefName == "" {
return balloonsError("missing or empty component balloonType name in composite balloon %q component %d",
compBlnName, compIdx+1)
}
// Make sure every component balloon type is
// defined in BalloonDefs.
if _, ok := seenNames[comp.DefName]; !ok {
return balloonsError("balloon type %q in composite balloon %q is not defined in balloonTypes",
comp.DefName, compBlnName)
}
}
// Check for circular compositions.
seen := map[string]int{}
if err := circularCheck(compBlnName, seen); err != nil {
return err
}
}
return nil
}
@ -1635,8 +1810,38 @@ func (p *balloons) closestMems(cpus cpuset.CPUSet) idset.IDSet {
return idset.NewIDSet(p.memAllocator.CPUSetAffinity(cpus).Slice()...)
}
// resizeCompositeBalloon changes the CPUs allocated for all sub-components
func (p *balloons) resizeCompositeBalloon(bln *Balloon, newMilliCpus int) error {
origFreeCpus := p.freeCpus.Clone()
origCompBlnsCpus := []cpuset.CPUSet{}
newMilliCpusPerComponent := newMilliCpus / len(bln.components)
blnCpus := cpuset.New()
for _, compBln := range bln.components {
origCompBlnsCpus = append(origCompBlnsCpus, compBln.Cpus.Clone())
if err := p.resizeBalloon(compBln, newMilliCpusPerComponent); err != nil {
p.freeCpus = origFreeCpus
for i, origCompBlnCpus := range origCompBlnsCpus {
bln.components[i].Cpus = origCompBlnCpus
}
return balloonsError("resize composite balloon %s: %w", bln.PrettyName(), err)
}
blnCpus = blnCpus.Union(compBln.Cpus)
}
p.forgetCpuClass(bln) // reset CPU classes in balloon's old CPUs
bln.Cpus = blnCpus
log.Debugf("- resize composite balloon successful: %s, freecpus: %#s", bln, p.freeCpus)
p.updatePinning(bln)
if err := p.useCpuClass(bln); err != nil { // set CPU classes in balloon's new CPUs
log.Warnf("failed to apply CPU class to balloon %s: %v", bln.PrettyName(), err)
}
return nil
}
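`resizeCompositeBalloon` above divides the requested milli-CPUs equally between components using integer division, so up to `len(components)-1` milli-CPUs of the request are silently dropped by truncation (each component's own resize then rounds its share up to whole CPUs). A small sketch of just the split, with a hypothetical helper name:

```go
package main

import "fmt"

// splitMilliCpus shows how a composite resize request is divided: an
// equal integer share per component, truncating the remainder.
func splitMilliCpus(newMilliCpus, components int) []int {
	per := newMilliCpus / components
	shares := make([]int, components)
	for i := range shares {
		shares[i] = per
	}
	return shares
}

func main() {
	fmt.Println(splitMilliCpus(2500, 2)) // [1250 1250]
	fmt.Println(splitMilliCpus(1000, 3)) // [333 333 333], 1 mCPU lost to truncation
}
```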
// resizeBalloon changes the CPUs allocated for a balloon, if allowed.
func (p *balloons) resizeBalloon(bln *Balloon, newMilliCpus int) error {
if len(bln.components) > 0 {
return p.resizeCompositeBalloon(bln, newMilliCpus)
}
oldCpuCount := bln.Cpus.Size()
newCpuCount := (newMilliCpus + 999) / 1000
if bln.Def.MaxCpus > NoLimit && newCpuCount > bln.Def.MaxCpus {

View File

@ -0,0 +1,27 @@
ARG GO_VERSION=1.24
FROM golang:${GO_VERSION}-bullseye AS builder
ARG IMAGE_VERSION
ARG BUILD_VERSION
ARG BUILD_BUILDID
WORKDIR /go/builder
# Fetch go dependencies in a separate layer for caching
COPY go.mod go.sum ./
COPY pkg/topology/ pkg/topology/
RUN go mod download
# Build nri-resmgr
COPY . .
RUN make clean
RUN make IMAGE_VERSION=${IMAGE_VERSION} BUILD_VERSION=${BUILD_VERSION} BUILD_BUILDID=${BUILD_BUILDID} PLUGINS=nri-memory-policy BINARIES=mpolset build-plugins-static build-binaries-static
FROM gcr.io/distroless/static
COPY --from=builder /go/builder/build/bin/nri-memory-policy /bin/nri-memory-policy
COPY --from=builder /go/builder/build/bin/mpolset /bin/mpolset
COPY --from=builder /go/builder/sample-configs/nri-memory-policy.yaml /etc/nri/memory-policy/config.yaml
ENTRYPOINT ["/bin/nri-memory-policy", "-idx", "92", "-config", "/etc/nri/memory-policy/config.yaml"]

View File

@ -0,0 +1,595 @@
// Copyright 2025 Intel Corporation. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package main
import (
"context"
"flag"
"fmt"
"os"
"path/filepath"
"strconv"
"strings"
"sigs.k8s.io/yaml"
"github.com/sirupsen/logrus"
"github.com/containerd/nri/pkg/api"
"github.com/containerd/nri/pkg/stub"
libmem "github.com/containers/nri-plugins/pkg/resmgr/lib/memory"
system "github.com/containers/nri-plugins/pkg/sysfs"
"github.com/containers/nri-plugins/pkg/utils/cpuset"
idset "github.com/intel/goresctrl/pkg/utils"
)
type plugin struct {
stub stub.Stub
config *Config
}
type Config struct {
InjectMpolset bool `json:"injectMpolset,omitempty"`
Classes []*MemoryPolicyClass `json:"classes,omitempty"`
}
type MemoryPolicyClass struct {
Name string `json:"name"`
PolicySpec *MemoryPolicySpec `json:"policy"`
}
type MemoryPolicySpec struct {
Mode string `json:"mode"`
Nodes string `json:"nodes"`
Flags []string `json:"flags,omitempty"`
}
type LinuxMemoryPolicy struct {
Mode string
Nodes string
Flags []string
}
const (
annotationSuffix = ".memory-policy.nri.io"
mpolsetInjectDir = "/mnt/nri-memory-policy-mpolset"
)
var (
sys system.System
log *logrus.Logger
verbose bool
veryVerbose bool
// When NRI API supports memory policies, we will switch to using
// api.MpolMode and api.MpolFlag instead of these maps.
MpolMode_value = map[string]int32{
"MPOL_DEFAULT": 0,
"MPOL_PREFERRED": 1,
"MPOL_BIND": 2,
"MPOL_INTERLEAVE": 3,
"MPOL_LOCAL": 4,
"MPOL_PREFERRED_MANY": 5,
"MPOL_WEIGHTED_INTERLEAVE": 6,
}
MpolFlag_value = map[string]int32{
"MPOL_F_STATIC_NODES": 0,
"MPOL_F_RELATIVE_NODES": 1,
"MPOL_F_NUMA_BALANCING": 2,
}
)
func (mpol *MemoryPolicySpec) String() string {
if mpol == nil {
return "nil"
}
modeFlags := strings.Join(append([]string{mpol.Mode}, mpol.Flags...), "|")
return fmt.Sprintf("%s:%s", modeFlags, mpol.Nodes)
}
// onClose handles losing connection to container runtime.
func (p *plugin) onClose() {
log.Infof("Connection to the runtime lost, exiting...")
os.Exit(0)
}
// Configure handles connecting to container runtime's NRI server.
func (p *plugin) Configure(ctx context.Context, config, runtime, version string) (stub.EventMask, error) {
log.Infof("Connected to %s %s...", runtime, version)
if config != "" {
log.Debugf("loading configuration from NRI server")
if err := p.setConfig([]byte(config)); err != nil {
return 0, err
}
return 0, nil
}
// If we are to use mpolset injection, prepare /mnt/nri-memory-policy-mpolset
// to contain mpolset so that it can be injected into containers
if p.config != nil && p.config.InjectMpolset {
if err := prepareMpolset(); err != nil {
log.Errorf("failed to prepare mpolset: %v", err)
return 0, fmt.Errorf("configuration option injectMpolset preparation failed: %v", err)
}
}
return 0, nil
}
// prepareMpolset prepares mpolset for injection into containers.
func prepareMpolset() error {
// copy mpolset to /mnt/nri-memory-policy-mpolset
if err := os.MkdirAll(mpolsetInjectDir, 0755); err != nil {
log.Debugf("failed to create %q: %v", mpolsetInjectDir, err)
}
// mpolset is expected to be located in the same directory as this plugin
mpolsetTarget := filepath.Join(mpolsetInjectDir, "mpolset")
// read the directory of this plugin and replace plugin's name (for example nri-memory-policy) with mpolset
// to get the path to mpolset
pluginPath, err := os.Executable()
if err != nil {
log.Debugf("failed to get plugin path: %v", err)
}
pluginDir := filepath.Dir(pluginPath)
mpolsetSource := filepath.Join(pluginDir, "mpolset")
// check that mpolset exists
if _, err := os.Stat(mpolsetSource); os.IsNotExist(err) {
log.Errorf("mpolset not found in %q: %v", pluginDir, err)
return fmt.Errorf("configuration injectMpolset requires mpolset, but it was not found in %q: %v", pluginDir, err)
}
// copy mpolset to /mnt/nri-memory-policy-mpolset which is located on the host
mpolsetData, err := os.ReadFile(mpolsetSource)
if err != nil {
return fmt.Errorf("failed to read mpolset contents from %q: %v", mpolsetSource, err)
}
if err := os.WriteFile(mpolsetTarget, mpolsetData, 0755); err != nil {
return fmt.Errorf("failed to %q mpolset: %v", mpolsetTarget, err)
}
return nil
}
// setConfig applies new plugin configuration.
func (p *plugin) setConfig(config []byte) error {
cfg := &Config{}
if err := yaml.Unmarshal(config, cfg); err != nil {
return fmt.Errorf("failed to unmarshal configuration: %w", err)
}
for _, class := range cfg.Classes {
if class.Name == "" {
return fmt.Errorf("name missing in class definition")
}
if class.PolicySpec == nil {
return fmt.Errorf("class %q has no policy", class.Name)
}
if class.PolicySpec.Mode == "" {
return fmt.Errorf("class %q has no mode", class.Name)
}
}
p.config = cfg
log.Debugf("plugin configuration: %+v", p.config)
return nil
}
// pprintCtr() returns unique human readable container name.
func pprintCtr(pod *api.PodSandbox, ctr *api.Container) string {
return fmt.Sprintf("%s/%s:%s", pod.GetNamespace(), pod.GetName(), ctr.GetName())
}
// effectiveAnnotations returns map of annotation key prefixes and
// values that are effective for a container. It checks for
// container-specific annotations first, and if not found, it
// returns pod-level annotations. "policy" and "class" annotations
// are mutually exclusive.
//
// Example annotations:
//
// class.memory-policy.nri.io: default-class-for-containers-in-pod
//
// class.memory-policy.nri.io/container.my-special-container: special-class
//
// policy.memory-policy.nri.io/container.my-special-container2: |+
//
// mode: MPOL_INTERLEAVE
// nodes: max-dist:19
// flags: [MPOL_F_STATIC_NODES]
func effectiveAnnotations(pod *api.PodSandbox, ctr *api.Container) map[string]string {
effAnn := map[string]string{}
for key, value := range pod.GetAnnotations() {
annPrefix, hasSuffix := strings.CutSuffix(key, annotationSuffix+"/container."+ctr.Name)
if hasSuffix {
// Override possibly already found pod-level annotation.
log.Tracef("- found container-specific annotation %q", key)
if annPrefix == "class" || annPrefix == "policy" {
delete(effAnn, "class")
delete(effAnn, "policy")
}
effAnn[annPrefix] = value
continue
}
annPrefix, hasSuffix = strings.CutSuffix(key, annotationSuffix)
if hasSuffix {
if annPrefix == "class" || annPrefix == "policy" {
_, hasClass := effAnn["class"]
_, hasPolicy := effAnn["policy"]
if hasClass || hasPolicy {
log.Tracef("- ignoring pod-level annotation %q due to a container-specific annotation", key)
continue
}
}
log.Tracef("- found pod-level annotation %q", key)
effAnn[annPrefix] = value
continue
}
}
return effAnn
}
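The precedence rule implemented above — a container-specific `<prefix>.memory-policy.nri.io/container.<name>` annotation overrides the pod-level `<prefix>.memory-policy.nri.io` one — can be sketched in a trimmed-down form. This is an illustrative simplification, not the plugin's code: it omits the mutual exclusion between the `class` and `policy` prefixes that the real function also enforces:

```go
package main

import (
	"fmt"
	"strings"
)

const annotationSuffix = ".memory-policy.nri.io"

// effectivePolicyAnnotations collects container-specific annotations
// first, then fills in pod-level ones only for prefixes not already set.
func effectivePolicyAnnotations(ann map[string]string, ctrName string) map[string]string {
	eff := map[string]string{}
	// Container-specific keys win, so collect them first.
	for key, value := range ann {
		if prefix, ok := strings.CutSuffix(key, annotationSuffix+"/container."+ctrName); ok {
			eff[prefix] = value
		}
	}
	// Pod-level keys apply only if no container-specific key set the prefix.
	for key, value := range ann {
		if prefix, ok := strings.CutSuffix(key, annotationSuffix); ok {
			if _, exists := eff[prefix]; !exists {
				eff[prefix] = value
			}
		}
	}
	return eff
}

func main() {
	ann := map[string]string{
		"class.memory-policy.nri.io":                   "pod-default",
		"class.memory-policy.nri.io/container.special": "special-class",
	}
	fmt.Println(effectivePolicyAnnotations(ann, "special")["class"]) // special-class
	fmt.Println(effectivePolicyAnnotations(ann, "other")["class"])   // pod-default
}
```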
// takePolicyAnnotation() takes the policy annotation from the
// annotations map. It returns the policy and removes the
// annotation from the map.
func takePolicyAnnotation(ann map[string]string) (*MemoryPolicySpec, error) {
if value, ok := ann["policy"]; ok {
delete(ann, "policy")
if value == "" {
return nil, nil
}
policy := &MemoryPolicySpec{}
if err := yaml.Unmarshal([]byte(value), policy); err != nil {
return nil, fmt.Errorf("failed to unmarshal policy: %w", err)
}
return policy, nil
}
return nil, nil
}
// takeClassAnnotation() takes the class annotation from the
// annotations map. It returns the class and removes the
// annotation from the map.
func (p *plugin) takeClassAnnotation(ann map[string]string) (*MemoryPolicyClass, error) {
if value, ok := ann["class"]; ok {
delete(ann, "class")
if value == "" {
return nil, nil
}
for _, class := range p.config.Classes {
if class.Name == value {
return class, nil
}
}
return nil, fmt.Errorf("class %q not found in configuration", value)
}
return nil, nil
}
// getPolicySpec() returns the memory policy for a container.
func (p *plugin) getPolicySpec(pod *api.PodSandbox, ctr *api.Container) (*MemoryPolicySpec, error) {
effAnn := effectiveAnnotations(pod, ctr)
policySpec, err := takePolicyAnnotation(effAnn)
if err != nil {
return nil, fmt.Errorf("invalid 'policy' annotation: %w", err)
}
if policySpec != nil {
log.Tracef("- effective policy annotation: %+v", policySpec)
return policySpec, nil
}
class, err := p.takeClassAnnotation(effAnn)
if err != nil {
return nil, fmt.Errorf("invalid 'class' annotation: %w", err)
}
if class != nil {
log.Tracef("- effective class annotation: %+v", class)
if class.PolicySpec == nil {
return nil, fmt.Errorf("class %q has no policy", class.Name)
}
return class.PolicySpec, nil
}
// Check for unknown annotations.
for ann := range effAnn {
return nil, fmt.Errorf("unknown annotation %s%s", ann, annotationSuffix)
}
log.Tracef("- no memory policy found in annotations")
return nil, nil
}
// ToLinuxMemoryPolicy() converts the memory policy specification into
// valid mode, nodes and flags. It is responsible for:
// - validating the mode and flags. Passing invalid mode or flags into
// injected command would be dangerous.
// - calculating exact node numbers based on node specification,
// container's cpuset and allowed memory nodes.
func (policySpec *MemoryPolicySpec) ToLinuxMemoryPolicy(ctr *api.Container) (*LinuxMemoryPolicy, error) {
var err error
var nodeMask libmem.NodeMask
if policySpec == nil {
return nil, nil
}
// Validate mode.
_, ok := MpolMode_value[policySpec.Mode]
if !ok {
return nil, fmt.Errorf("invalid memory policy mode %q", policySpec.Mode)
}
// Resolve nodes based on the policy specification.
ctrCpuset := sys.OnlineCPUs()
if ctrCpus := ctr.GetLinux().GetResources().GetCpu().GetCpus(); ctrCpus != "" {
ctrCpuset, err = cpuset.Parse(ctrCpus)
if err != nil {
return nil, fmt.Errorf("failed to parse allowed CPUs %q: %v", ctrCpus, err)
}
}
allowedMemsMask := libmem.NewNodeMask(sys.NodeIDs()...)
ctrMems := ctr.GetLinux().GetResources().GetCpu().GetMems()
if ctrMems != "" {
if parsedMask, err := libmem.ParseNodeMask(ctrMems); err == nil {
allowedMemsMask = parsedMask
} else {
return nil, fmt.Errorf("failed to parse allowed mems %q: %v", ctrMems, err)
}
}
log.Tracef("- allowed mems: %s, cpus %s", ctrMems, ctrCpuset)
switch {
// "all" includes all nodes into the mask.
case policySpec.Nodes == "all":
nodeMask = libmem.NewNodeMask(sys.NodeIDs()...)
log.Tracef("- nodes %q (all)", nodeMask.MemsetString())
// "allowed-mems" includes only allowed memory nodes into the mask.
case policySpec.Nodes == "allowed-mems":
nodeMask = allowedMemsMask
log.Tracef("- nodes: %q (allowed-mems)", nodeMask.MemsetString())
// "cpu-packages" includes all nodes that are in the same package
// as the CPUs in the container's cpuset.
case policySpec.Nodes == "cpu-packages":
pkgs := sys.IDSetForCPUs(ctrCpuset, func(cpu system.CPU) idset.ID {
return cpu.PackageID()
})
nodeMask = libmem.NewNodeMask()
for _, nodeId := range sys.NodeIDs() {
nodePkgId := sys.Node(nodeId).PackageID()
if pkgs.Has(nodePkgId) {
nodeMask = nodeMask.Set(nodeId)
}
}
log.Tracef("- nodes: %q (cpu-packages %q)", nodeMask.MemsetString(), pkgs)
// "cpu-nodes" includes all nodes in the cpuset of the container.
case policySpec.Nodes == "cpu-nodes":
nodeIds := sys.IDSetForCPUs(ctrCpuset, func(cpu system.CPU) idset.ID {
return cpu.NodeID()
})
nodeMask = libmem.NewNodeMask(nodeIds.Members()...)
log.Tracef("- nodes: %q (cpu-nodes)", nodeMask.MemsetString())
// "max-dist:<int>" includes all nodes that are within the
// specified distance from the CPUs in the container's cpuset.
case strings.HasPrefix(policySpec.Nodes, "max-dist:"):
maxDist := policySpec.Nodes[len("max-dist:"):]
maxDistInt, err := strconv.Atoi(maxDist)
if err != nil {
return nil, fmt.Errorf("failed to parse max-dist %q: %v", maxDist, err)
}
nodeMask = libmem.NewNodeMask()
fromNodes := sys.IDSetForCPUs(ctrCpuset, func(cpu system.CPU) idset.ID {
return cpu.NodeID()
})
for _, fromNode := range fromNodes.Members() {
for _, toNode := range sys.NodeIDs() {
if sys.NodeDistance(fromNode, toNode) <= maxDistInt {
nodeMask = nodeMask.Set(toNode)
}
}
}
log.Tracef("- nodes %q (max-dist %d from CPU nodes %q)", nodeMask.MemsetString(), maxDistInt, fromNodes)
// <int>[-<int>][, ...] includes the set of nodes.
case policySpec.Nodes[0] >= '0' && policySpec.Nodes[0] <= '9':
nodeMask, err = libmem.ParseNodeMask(policySpec.Nodes)
if err != nil {
return nil, fmt.Errorf("failed to parse nodes %q: %v", policySpec.Nodes, err)
}
log.Tracef("- nodes %q (hardcoded)", nodeMask.MemsetString())
default:
return nil, fmt.Errorf("invalid nodes: %q", policySpec.Nodes)
}
nodes := nodeMask.MemsetString()
if (nodeMask & allowedMemsMask) != nodeMask {
log.Debugf("some memory policy nodes (%s) are not allowed (%s)", nodes, allowedMemsMask.MemsetString())
}
// Copy and validate flags.
flags := []string{}
for _, flag := range policySpec.Flags {
if _, ok := MpolFlag_value[flag]; !ok {
return nil, fmt.Errorf("invalid memory policy flag %q", flag)
}
flags = append(flags, flag)
}
linuxMemoryPolicy := &LinuxMemoryPolicy{
Mode: policySpec.Mode,
Nodes: nodes,
Flags: flags,
}
return linuxMemoryPolicy, nil
}
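Of the node specifications resolved above, `max-dist:<N>` is the least obvious: it selects every NUMA node whose distance from any node hosting the container's CPUs is at most N. A self-contained sketch of that selection, with a made-up distance matrix standing in for the sysfs topology:

```go
package main

import "fmt"

// nodesWithinDist includes every node whose distance from at least one
// of the container's CPU nodes is at most maxDist, mirroring the
// "max-dist:<N>" case above. dist is an illustrative NxN matrix.
func nodesWithinDist(fromNodes []int, dist [][]int, maxDist int) []int {
	selected := []int{}
	for toNode := range dist {
		for _, fromNode := range fromNodes {
			if dist[fromNode][toNode] <= maxDist {
				selected = append(selected, toNode)
				break
			}
		}
	}
	return selected
}

func main() {
	// Distances as in a typical 2-socket system: 10 local, 21 remote.
	dist := [][]int{
		{10, 21},
		{21, 10},
	}
	fmt.Println(nodesWithinDist([]int{0}, dist, 19)) // [0]: remote node is too far
	fmt.Println(nodesWithinDist([]int{0}, dist, 21)) // [0 1]: remote node included
}
```

This is why `max-dist:19` (as in the annotation example earlier in the file) keeps only nodes local to the container's socket on such a system.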
// ToMemoryPolicyAdjustment() returns a ContainerAdjustment with
// corresponding LinuxMemoryPolicy adjustment.
func (policy *LinuxMemoryPolicy) ToMemoryPolicyAdjustment() (*api.ContainerAdjustment, error) {
if policy == nil {
return nil, nil
}
return nil, fmt.Errorf("memory policy adjustment is not implemented yet")
// // Uncomment this to use memory policy in NRI API
// ca := &api.ContainerAdjustment{}
// mode, ok := api.MpolMode_value[policy.Mode]
// if !ok {
// return nil, fmt.Errorf("invalid memory policy mode %q", policy.Mode)
// }
//
// flags := []api.MpolFlag{}
// for _, flag := range policy.Flags {
// if flagValue, ok := api.MpolFlag_value[flag]; ok {
// flags = append(flags, api.MpolFlag(flagValue))
// } else {
// return nil, fmt.Errorf("invalid memory policy flag %q", flag)
// }
// }
// ca.SetLinuxMemoryPolicy(api.MpolMode(mode), policy.Nodes, flags...)
// return ca, nil
}
// ToCommandInjectionAdjustment() converts the memory policy into a
// command injection adjustment that mounts the mpolset binary, too.
func (policy *LinuxMemoryPolicy) ToCommandInjectionAdjustment(ctr *api.Container) (*api.ContainerAdjustment, error) {
ca := &api.ContainerAdjustment{}
ca.AddMount(&api.Mount{
Source: mpolsetInjectDir,
Destination: mpolsetInjectDir,
Type: "bind",
Options: []string{"bind", "ro", "rslave"},
})
mpolsetArgs := []string{
filepath.Join(mpolsetInjectDir, "mpolset"),
"--mode", policy.Mode,
"--nodes", policy.Nodes,
}
if len(policy.Flags) > 0 {
mpolsetArgs = append(mpolsetArgs, "--flags", strings.Join(policy.Flags, ","))
}
if veryVerbose {
mpolsetArgs = append(mpolsetArgs, "-vv")
}
mpolsetArgs = append(mpolsetArgs, "--")
ca.SetArgs(append(mpolsetArgs, ctr.GetArgs()...))
return ca, nil
}
// CreateContainer modifies container when it is being created.
func (p *plugin) CreateContainer(ctx context.Context, pod *api.PodSandbox, ctr *api.Container) (*api.ContainerAdjustment, []*api.ContainerUpdate, error) {
var ca *api.ContainerAdjustment
var err error
ppName := pprintCtr(pod, ctr)
log.Tracef("CreateContainer %s", ppName)
policySpec, err := p.getPolicySpec(pod, ctr)
if err != nil {
log.Errorf("CreateContainer %s: failed to get policy: %v", ppName, err)
return nil, nil, err
}
if policySpec == nil || policySpec.Mode == "" {
log.Tracef("- no memory policy")
return nil, nil, nil
}
policy, err := policySpec.ToLinuxMemoryPolicy(ctr)
if err != nil || policy == nil {
log.Errorf("CreateContainer %s: failed to convert policy to LinuxMemoryPolicy: %v", ppName, err)
return nil, nil, err
}
if p.config.InjectMpolset {
if ca, err = policy.ToCommandInjectionAdjustment(ctr); err != nil {
log.Errorf("CreateContainer %s: failed to convert adjustment into mpolset command: %v", ppName, err)
return nil, nil, err
}
} else {
if ca, err = policy.ToMemoryPolicyAdjustment(); err != nil {
log.Errorf("CreateContainer %s: failed to convert policy to ContainerAdjustment: %v", ppName, err)
return nil, nil, err
}
}
log.Debugf("CreateContainer %s: memory policy %s (resolved nodes: %s)", ppName, policySpec, policy.Nodes)
log.Tracef("- adjustment: %+v", ca)
return ca, nil, nil
}
func main() {
var (
pluginName string
pluginIdx string
configFile string
err error
)
log = logrus.StandardLogger()
log.SetFormatter(&logrus.TextFormatter{
PadLevelText: true,
})
flag.StringVar(&pluginName, "name", "", "plugin name to register to NRI")
flag.StringVar(&pluginIdx, "idx", "", "plugin index to register to NRI")
flag.StringVar(&configFile, "config", "", "configuration file name")
flag.BoolVar(&verbose, "v", false, "verbose output")
flag.BoolVar(&veryVerbose, "vv", false, "very verbose output and run mpolset -vv injected in containers")
flag.Parse()
if verbose {
log.SetLevel(logrus.DebugLevel)
}
if veryVerbose {
log.SetLevel(logrus.TraceLevel)
}
p := &plugin{}
if configFile != "" {
log.Debugf("read configuration from %q", configFile)
config, err := os.ReadFile(configFile)
if err != nil {
log.Fatalf("error reading configuration file %q: %s", configFile, err)
}
if err = p.setConfig(config); err != nil {
log.Fatalf("error applying configuration from file %q: %s", configFile, err)
}
}
sys, err = system.DiscoverSystem(system.DiscoverCPUTopology)
if err != nil {
log.Fatalf("failed to discover CPU topology: %v", err)
}
opts := []stub.Option{
stub.WithOnClose(p.onClose),
}
if pluginName != "" {
opts = append(opts, stub.WithPluginName(pluginName))
}
if pluginIdx != "" {
opts = append(opts, stub.WithPluginIdx(pluginIdx))
}
if p.stub, err = stub.New(p, opts...); err != nil {
log.Fatalf("failed to create plugin stub: %v", err)
}
if err = p.stub.Run(context.Background()); err != nil {
log.Errorf("plugin exited (%v)", err)
os.Exit(1)
}
}

View File

@ -1,4 +1,4 @@
ARG GO_VERSION=1.24
FROM golang:${GO_VERSION}-bullseye AS builder

View File

@ -1,4 +1,4 @@
// Copyright 2023 Intel Corporation. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.

View File

@ -1,4 +1,4 @@
ARG GO_VERSION=1.24
FROM golang:${GO_VERSION}-bullseye AS builder

View File

@ -1,4 +1,4 @@
ARG GO_VERSION=1.24
FROM golang:${GO_VERSION}-bullseye AS builder

View File

@ -1,4 +1,4 @@
ARG GO_VERSION=1.24
FROM golang:${GO_VERSION}-bullseye AS builder

View File

@ -1,4 +1,4 @@
ARG GO_VERSION=1.24
FROM golang:${GO_VERSION}-bullseye AS builder

View File

@ -14,7 +14,12 @@
package topologyaware
import (
"fmt"
"github.com/containers/nri-plugins/pkg/agent/podresapi"
libmem "github.com/containers/nri-plugins/pkg/resmgr/lib/memory"
)
func (p *policy) getMemOffer(pool Node, req Request) (*libmem.Offer, error) {
var (
@ -45,6 +50,61 @@ func (p *policy) getMemOffer(pool Node, req Request) (*libmem.Offer, error) {
return o, err
}
func (p *policy) getMemOfferByHints(pool Node, req Request) (*libmem.Offer, error) {
ctr := req.GetContainer()
memType := req.MemoryType()
if memType == memoryPreserve {
return nil, fmt.Errorf("%s by hints: memoryPreserve requested", pool.Name())
}
zone := libmem.NodeMask(0)
from := libmem.NewNodeMask(pool.GetMemset(memType).Members()...)
mtyp := libmem.TypeMask(memType)
for provider, hint := range ctr.GetTopologyHints() {
if !podresapi.IsPodResourceHint(provider) {
continue
}
m, err := libmem.ParseNodeMask(hint.NUMAs)
if err != nil {
return nil, err
}
if !from.And(m).Contains(m.Slice()...) {
return nil, fmt.Errorf("%s by hints: %s of wrong type (%s)", pool.Name(), m, mtyp)
}
zone = zone.Set(m.Slice()...)
}
if zone == libmem.NodeMask(0) {
return nil, fmt.Errorf("%s by hints: no pod resource API hints", pool.Name())
}
if zoneType := p.memAllocator.ZoneType(zone); zoneType != mtyp {
return nil, fmt.Errorf("%s by hints: no type %s", pool.Name(), mtyp.Clear(zoneType.Slice()...))
}
o, err := p.memAllocator.GetOffer(
libmem.ContainerWithTypes(
ctr.GetID(),
ctr.PrettyName(),
string(ctr.GetQOSClass()),
req.MemAmountToAllocate(),
zone,
mtyp,
),
)
if err != nil {
return nil, err
}
return o, nil
}
func (p *policy) restoreMemOffer(g Grant) (*libmem.Offer, error) {
var (
ctr = g.GetContainer()

View File

@ -261,6 +261,9 @@ func (fake *mockSystem) CoreKindCPUs(sysfs.CoreKind) cpuset.CPUSet {
func (fake *mockSystem) CoreKinds() []sysfs.CoreKind {
return nil
}
func (fake *mockSystem) IDSetForCPUs(cpus cpuset.CPUSet, f func(cpu system.CPU) idset.ID) idset.IDSet {
panic("unimplemented")
}
func (fake *mockSystem) AllThreadsForCPUs(cpuset.CPUSet) cpuset.CPUSet {
return cpuset.New()
}

View File

@ -39,6 +39,8 @@ const (
keyCpuPriorityPreference = "prefer-cpu-priority"
// annotation key for hiding hyperthreads from allocated CPU sets
keyHideHyperthreads = "hide-hyperthreads"
// annotation key for picking individual resources by topology hints
keyPickResourcesByHints = "pick-resources-by-hints"
// effective annotation key for isolated CPU preference
preferIsolatedCPUsKey = "prefer-isolated-cpus" + "." + kubernetes.ResmgrKeyNamespace
@ -54,6 +56,8 @@ const (
preferCpuPriorityKey = keyCpuPriorityPreference + "." + kubernetes.ResmgrKeyNamespace
// effective annotation key for hiding hyperthreads
hideHyperthreadsKey = keyHideHyperthreads + "." + kubernetes.ResmgrKeyNamespace
// effective annotation key for picking resources by topology hints
pickResourcesByHints = keyPickResourcesByHints + "." + kubernetes.ResmgrKeyNamespace
)
type prefKind int
@ -449,6 +453,25 @@ func memoryAllocationPreference(pod cache.Pod, c cache.Container) (int64, int64,
return req, lim, mtype
}
func pickByHintsPreference(pod cache.Pod, container cache.Container) bool {
value, ok := pod.GetEffectiveAnnotation(pickResourcesByHints, container.GetName())
if !ok {
return false
}
pick, err := strconv.ParseBool(value)
if err != nil {
log.Error("failed to parse pick resources by hints preference %s = '%s': %v",
pickResourcesByHints, value, err)
return false
}
log.Debug("%s: effective pick resources by hints preference %v",
container.PrettyName(), pick)
return pick
}
// String stringifies a cpuClass.
func (t cpuClass) String() string {
if cpuClassName, ok := cpuClassNames[t]; ok {

View File

@ -670,10 +670,11 @@ func (p *policy) compareScores(request Request, pools []Node, scores map[int]Sco
//
// - insufficient isolated, reserved or shared capacity loses
// - if we have affinity, the higher affinity score wins
// - if only one node matches the memory type request, it wins
// - if we have topology hints
// * better hint score wins
// * for a tie, prefer the lower node then the smaller id
// - if we have a better matching or tighter fitting memory offer, it wins
// - if only one node matches the memory type request, it wins
// - for low-prio and high-prio CPU preference, if only one node has such CPUs, it wins
// - if a node is lower in the tree it wins
// - for reserved allocations
@ -722,6 +723,55 @@ func (p *policy) compareScores(request Request, pools []Node, scores map[int]Sco
log.Debug(" - affinity is a TIE")
// better topology hint score wins
hScores1 := score1.HintScores()
if len(hScores1) > 0 {
hScores2 := score2.HintScores()
hs1, nz1 := combineHintScores(hScores1)
hs2, nz2 := combineHintScores(hScores2)
if hs1 > hs2 {
log.Debug(" => %s WINS on hints", node1.Name())
return true
}
if hs2 > hs1 {
log.Debug(" => %s WINS on hints", node2.Name())
return false
}
log.Debug(" - hints are a TIE")
if hs1 == 0 {
if nz1 > nz2 {
log.Debug(" => %s WINS on non-zero hints", node1.Name())
return true
}
if nz2 > nz1 {
log.Debug(" => %s WINS on non-zero hints", node2.Name())
return false
}
log.Debug(" - non-zero hints are a TIE")
}
// for a tie, prefer lower nodes and smaller ids
if hs1 == hs2 && nz1 == nz2 && (hs1 != 0 || nz1 != 0) {
if depth1 > depth2 {
log.Debug(" => %s WINS as it is lower", node1.Name())
return true
}
if depth1 < depth2 {
log.Debug(" => %s WINS as it is lower", node2.Name())
return false
}
log.Debug(" => %s WINS based on equal hint scores, lower id",
map[bool]string{true: node1.Name(), false: node2.Name()}[id1 < id2])
return id1 < id2
}
}
// better matching or tighter memory offer wins
switch {
case o1 != nil && o2 == nil:
@ -789,55 +839,6 @@ func (p *policy) compareScores(request Request, pools []Node, scores map[int]Sco
log.Debug(" - memory type is a TIE")
}
// better topology hint score wins
hScores1 := score1.HintScores()
if len(hScores1) > 0 {
hScores2 := score2.HintScores()
hs1, nz1 := combineHintScores(hScores1)
hs2, nz2 := combineHintScores(hScores2)
if hs1 > hs2 {
log.Debug(" => %s WINS on hints", node1.Name())
return true
}
if hs2 > hs1 {
log.Debug(" => %s WINS on hints", node2.Name())
return false
}
log.Debug(" - hints are a TIE")
if hs1 == 0 {
if nz1 > nz2 {
log.Debug(" => %s WINS on non-zero hints", node1.Name())
return true
}
if nz2 > nz1 {
log.Debug(" => %s WINS on non-zero hints", node2.Name())
return false
}
log.Debug(" - non-zero hints are a TIE")
}
// for a tie, prefer lower nodes and smaller ids
if hs1 == hs2 && nz1 == nz2 && (hs1 != 0 || nz1 != 0) {
if depth1 > depth2 {
log.Debug(" => %s WINS as it is lower", node1.Name())
return true
}
if depth1 < depth2 {
log.Debug(" => %s WINS as it is lower", node2.Name())
return false
}
log.Debug(" => %s WINS based on equal hint scores, lower id",
map[bool]string{true: node1.Name(), false: node2.Name()}[id1 < id2])
return id1 < id2
}
}
// for low-prio and high-prio CPU preference, the only fulfilling node wins
log.Debug(" - preferred CPU priority is %s", request.CPUPrio())
switch request.CPUPrio() {

View File

@ -19,7 +19,9 @@ import (
"strconv"
"time"
"github.com/containers/nri-plugins/pkg/agent/podresapi"
"github.com/containers/nri-plugins/pkg/sysfs"
"github.com/containers/nri-plugins/pkg/topology"
"github.com/containers/nri-plugins/pkg/utils/cpuset"
"github.com/containers/nri-plugins/pkg/cpuallocator"
@ -113,6 +115,8 @@ type Request interface {
MemAmountToAllocate() int64
// MemoryLimit returns the memory limit for the request.
MemoryLimit() int64
// PickByHints returns if picking resources by hints is preferred for this request.
PickByHints() bool
// ColdStart returns the cold start timeout.
ColdStart() time.Duration
}
@ -216,15 +220,16 @@ var _ Supply = &supply{}
// request implements our Request interface.
type request struct {
container cache.Container // container for this request
full int // number of full CPUs requested
fraction int // amount of fractional CPU requested
isolate bool // prefer isolated exclusive CPUs
cpuType cpuClass // preferred CPU type (normal, reserved)
prio cpuPrio // CPU priority preference, ignored for fraction requests
memReq int64
memLim int64
memType memoryType // requested types of memory
container cache.Container // container for this request
full int // number of full CPUs requested
fraction int // amount of fractional CPU requested
isolate bool // prefer isolated exclusive CPUs
cpuType cpuClass // preferred CPU type (normal, reserved)
prio cpuPrio // CPU priority preference, ignored for fraction requests
memReq int64 // memory request
memLim int64 // memory limit
memType memoryType // requested types of memory
pickByHints bool // preference to pick resources by hints
// coldStart tells the timeout (in milliseconds) how long to wait until
// a DRAM memory controller should be added to a container asking for a
@ -379,8 +384,11 @@ func (cs *supply) Allocate(r Request, o *libmem.Offer) (Grant, map[string]libmem
// AllocateCPU allocates CPU for a grant from the supply.
func (cs *supply) AllocateCPU(r Request) (Grant, error) {
var exclusive cpuset.CPUSet
var err error
var (
exclusive cpuset.CPUSet
err error
ok bool
)
cr := r.(*request)
@ -407,7 +415,14 @@ func (cs *supply) AllocateCPU(r Request) (Grant, error) {
// allocate isolated exclusive CPUs or slice them off the sharable set
switch {
case full > 0 && cs.isolated.Size() >= full && cr.isolate:
exclusive, err = cs.takeCPUs(&cs.isolated, nil, full, cr.CPUPrio())
if cr.PickByHints() {
exclusive, ok = cs.takeCPUsByHints(&cs.isolated, cr)
if !ok {
exclusive, err = cs.takeCPUs(&cs.isolated, nil, full, cr.CPUPrio())
}
} else {
exclusive, err = cs.takeCPUs(&cs.isolated, nil, full, cr.CPUPrio())
}
if err != nil {
return nil, policyError("internal error: "+
"%s: can't take %d exclusive isolated CPUs from %s: %v",
@ -415,7 +430,14 @@ func (cs *supply) AllocateCPU(r Request) (Grant, error) {
}
case full > 0 && cs.AllocatableSharedCPU() > 1000*full:
exclusive, err = cs.takeCPUs(&cs.sharable, nil, full, cr.CPUPrio())
if cr.PickByHints() {
exclusive, ok = cs.takeCPUsByHints(&cs.sharable, cr)
if !ok {
exclusive, err = cs.takeCPUs(&cs.sharable, nil, full, cr.CPUPrio())
}
} else {
exclusive, err = cs.takeCPUs(&cs.sharable, nil, full, cr.CPUPrio())
}
if err != nil {
return nil, policyError("internal error: "+
"%s: can't take %d exclusive CPUs from %s: %v",
@ -524,6 +546,52 @@ func (cs *supply) takeCPUs(from, to *cpuset.CPUSet, cnt int, prio cpuPrio) (cpus
return cset, err
}
// takeCPUsByHints tries to allocate isolated or exclusive CPUs by topology hints.
func (cs *supply) takeCPUsByHints(from *cpuset.CPUSet, cr *request) (cpuset.CPUSet, bool) {
hints := []*topology.Hint{}
for provider, hint := range cr.GetContainer().GetTopologyHints() {
if podresapi.IsPodResourceHint(provider) {
hints = append(hints, &hint)
}
}
if len(hints) == 0 || len(hints) > cr.full {
return cpuset.New(), false
}
total := cr.full
perHint := 1
if len(hints) < total && total%len(hints) == 0 {
perHint = total / len(hints)
}
free := (*from).Clone()
cpus := cpuset.New()
for _, h := range hints {
cset := free.Intersection(cpuset.MustParse(h.CPUs))
cs, err := cs.takeCPUs(&cset, nil, perHint, cr.CPUPrio())
if err != nil {
log.Errorf("failed to allocate CPUs by topology hints: %v", err)
return cpuset.New(), false
}
cpus = cpus.Union(cs)
free = free.Difference(cs)
total -= perHint
}
if total > 0 {
cs, err := cs.takeCPUs(&free, nil, total, cr.CPUPrio())
if err != nil {
log.Errorf("failed to allocate CPUs by topology hints: %v", err)
return cpuset.New(), false
}
cpus = cpus.Union(cs)
free = free.Difference(cs)
}
*from = free
return cpus, true
}
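
The per-hint distribution arithmetic in takeCPUsByHints can be sketched as a standalone helper (an illustrative function, assuming the same rules as the code above: it is not part of the plugin's API): every hint gets one exclusive CPU by default, the request is split evenly across hints when it is an exact multiple of the hint count, and any remainder comes from the free set.

```go
package main

import "fmt"

// perHintCount mirrors the distribution logic of takeCPUsByHints:
// returns how many CPUs to take per hint and how many are left to
// take from the remaining free set. A zero perHint means hint-based
// picking is skipped (no hints, or more hints than full CPUs).
func perHintCount(total, hints int) (perHint, remainder int) {
	if hints == 0 || hints > total {
		return 0, total
	}
	perHint = 1
	if hints < total && total%hints == 0 {
		perHint = total / hints
	}
	return perHint, total - perHint*hints
}

func main() {
	fmt.Println(perHintCount(4, 2)) // 2 CPUs per hint, no remainder
	fmt.Println(perHintCount(5, 2)) // 1 per hint, 3 from the free set
}
```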
// DumpCapacity returns a printable representation of the supply's resource capacity.
func (cs *supply) DumpCapacity() string {
cpu, mem, sep := "", "", ""
@ -680,16 +748,17 @@ func newRequest(container cache.Container, types libmem.TypeMask) Request {
}
return &request{
container: container,
full: full,
fraction: fraction,
isolate: isolate,
cpuType: cpuType,
memReq: req,
memLim: lim,
memType: mtype,
coldStart: coldStart,
prio: prio,
container: container,
full: full,
fraction: fraction,
isolate: isolate,
cpuType: cpuType,
memReq: req,
memLim: lim,
memType: mtype,
coldStart: coldStart,
prio: prio,
pickByHints: pickByHintsPreference(pod, container),
}
}
@ -767,6 +836,10 @@ func (cr *request) MemoryType() memoryType {
return cr.memType
}
func (cr *request) PickByHints() bool {
return cr.pickByHints
}
// ColdStart returns the cold start timeout (in milliseconds).
func (cr *request) ColdStart() time.Duration {
return cr.coldStart
@ -850,7 +923,21 @@ func (cs *supply) GetScore(req Request) Score {
node = cs.node.Policy().root
}
o, err := node.Policy().getMemOffer(node, cr)
var (
o *libmem.Offer
err error
)
if cr.PickByHints() {
o, err = node.Policy().getMemOfferByHints(node, cr)
if err != nil {
log.Errorf("failed to get offer by hints: %v", err)
o, err = node.Policy().getMemOffer(node, cr)
}
} else {
o, err = node.Policy().getMemOffer(node, cr)
}
if err != nil {
log.Error("failed to get offer for request %s: %v", req, err)
} else {

View File

@ -410,15 +410,15 @@ func (p *policy) ExportResourceData(c cache.Container) map[string]string {
dram := mems.And(p.memAllocator.Masks().NodesByTypes(libmem.TypeMaskDRAM))
pmem := mems.And(p.memAllocator.Masks().NodesByTypes(libmem.TypeMaskPMEM))
hbm := mems.And(p.memAllocator.Masks().NodesByTypes(libmem.TypeMaskHBM))
data["ALL_MEMS"] = mems.String()
data["ALL_MEMS"] = mems.MemsetString()
if dram.Size() > 0 {
data["DRAM_MEMS"] = dram.String()
data["DRAM_MEMS"] = dram.MemsetString()
}
if pmem.Size() > 0 {
data["PMEM_MEMS"] = pmem.String()
data["PMEM_MEMS"] = pmem.MemsetString()
}
if hbm.Size() > 0 {
data["HBM_MEMS"] = hbm.String()
data["HBM_MEMS"] = hbm.MemsetString()
}
return data

View File

@ -94,6 +94,28 @@ spec:
AllocatorTopologyBalancing is the balloon type specific
variant of the policy level parameter with the same name.
type: boolean
components:
description: |-
Components is a list of component properties. Every
component has a balloonType property according to which
CPUs are allocated for that component. Specifying the
Components list makes this a composite balloon type whose
instances use all CPUs of its component instances, and no
other CPUs.
items:
description: BalloonDefComponent contains a balloon component
definition.
properties:
balloonType:
description: |-
BalloonType is the name of the balloon type of this
component. It must match the name of a balloon type
defined in the balloonTypes of the policy.
type: string
required:
- balloonType
type: object
type: array
cpuClass:
description: |-
CpuClass controls how CPUs of a balloon are (re)configured
@ -372,6 +394,308 @@ spec:
When enabled, policy implementations can adjust cache allocation
and memory bandwidth by assigning containers to RDT classes.
type: boolean
force:
description: Force indicates if the configuration should be
forced to goresctrl.
type: boolean
options:
description: Options contains the common goresctrl/rdt settings.
properties:
l2:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
l3:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
mb:
description: MbOptions contains the common settings for
memory bandwidth allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
type: object
partitions:
additionalProperties:
description: PartitionConfig provides configuration for
a single cache partition.
properties:
classes:
additionalProperties:
description: ClassConfig provides configuration for
a single named cache CLOS/class.
properties:
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
type: object
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
description: Partitions configure cache partitions.
type: object
usePodQoSAsDefaultClass:
description: |-
usePodQoSAsDefaultClass controls whether a container's Pod QoS

View File

@ -123,6 +123,308 @@ spec:
When enabled, policy implementations can adjust cache allocation
and memory bandwidth by assigning containers to RDT classes.
type: boolean
force:
description: Force indicates if the configuration should be
forced to goresctrl.
type: boolean
options:
description: Options contains the common goresctrl/rdt settings.
properties:
l2:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
l3:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
mb:
description: MbOptions contains the common settings for
memory bandwidth allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
type: object
partitions:
additionalProperties:
description: PartitionConfig provides configuration for
a single cache partition.
properties:
classes:
additionalProperties:
description: ClassConfig provides configuration for
a single named cache CLOS/class.
properties:
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
type: object
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
description: Partitions configure cache partitions.
type: object
usePodQoSAsDefaultClass:
description: |-
usePodQoSAsDefaultClass controls whether a container's Pod QoS


@@ -137,6 +137,308 @@ spec:
When enabled, policy implementations can adjust cache allocation
and memory bandwidth by assigning containers to RDT classes.
type: boolean
force:
description: Force indicates if the configuration should be
forced to goresctrl.
type: boolean
options:
description: Options contains common goresctrl/rdt settings.
properties:
l2:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
l3:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
mb:
description: MbOptions contains the common settings for
memory bandwidth allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
type: object
partitions:
additionalProperties:
description: PartitionConfig provides configuration for
a single cache partition.
properties:
classes:
additionalProperties:
description: ClassConfig provides configuration for
a single named cache CLOS/class.
properties:
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
type: object
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
description: Partitions configure cache partitions.
type: object
usePodQoSAsDefaultClass:
description: |-
usePodQoSAsDefaultClass controls whether a container's Pod QoS


@@ -94,6 +94,28 @@ spec:
AllocatorTopologyBalancing is the balloon type specific
parameter of the policy level parameter with the same name.
type: boolean
components:
description: |-
Components is a list of component properties. Every
component has a balloonType property according to which
CPUs are allocated for that component. Specifying the
Components list makes this a composite balloon type whose
instances use all CPUs of their component instances, and no
other CPUs.
items:
description: BalloonDefComponent contains a balloon component
definition.
properties:
balloonType:
description: |-
BalloonType is the name of the balloon type of this
component. It must match the name of a balloon type
defined in the balloonTypes of the policy.
type: string
required:
- balloonType
type: object
type: array
cpuClass:
description: |-
CpuClass controls how CPUs of a balloon are (re)configured
@@ -372,6 +394,308 @@ spec:
When enabled, policy implementations can adjust cache allocation
and memory bandwidth by assigning containers to RDT classes.
type: boolean
force:
description: Force indicates if the configuration should be
forced to goresctrl.
type: boolean
options:
description: Options contains common goresctrl/rdt settings.
properties:
l2:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
l3:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
mb:
description: MbOptions contains the common settings for
memory bandwidth allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
type: object
partitions:
additionalProperties:
description: PartitionConfig provides configuration for
a single cache partition.
properties:
classes:
additionalProperties:
description: ClassConfig provides configuration for
a single named cache CLOS/class.
properties:
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
type: object
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
description: Partitions configure cache partitions.
type: object
usePodQoSAsDefaultClass:
description: |-
usePodQoSAsDefaultClass controls whether a container's Pod QoS


@@ -103,7 +103,13 @@ spec:
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
drop:
- ALL
{{- if .Values.config.control.rdt.enable }}
add:
- SYS_ADMIN
- DAC_OVERRIDE
{{- end }}
resources:
requests:
cpu: {{ .Values.resources.cpu }}


@@ -78,6 +78,9 @@
"type": "integer",
"minimum": 0,
"maximum": 99
},
"enableRDT": {
"type": "boolean"
}
}
},


@@ -12,6 +12,10 @@ config:
reservedResources:
cpu: 1000m
agent:
nodeResourceTopology: true
podResourceAPI: false
allocatorTopologyBalancing: true
balloonTypes:
@@ -38,8 +42,56 @@ config:
reservedPoolNamespaces:
- kube-system
control:
rdt:
enable: false
usePodQoSAsDefaultClass: true
options:
l2:
optional: true
l3:
optional: true
mb:
optional: true
partitions:
fullCache:
l2Allocation:
all:
unified: 100%
l3Allocation:
all:
unified: 100%
classes:
BestEffort:
l2Allocation:
all:
unified: 33%
l3Allocation:
all:
unified: 33%
Burstable:
l2Allocation:
all:
unified: 66%
l3Allocation:
all:
unified: 66%
Guaranteed:
l2Allocation:
all:
unified: 100%
l3Allocation:
all:
unified: 100%
log:
source: true
klog:
skip_headers: true
debug:
- none
instrumentation:
httpEndpoint: ":8891"
prometheusExport: false


@@ -0,0 +1,11 @@
apiVersion: v2
appVersion: unstable
description: |
The memory-policy NRI plugin configures default Linux memory policy
for containers at creation time.
name: nri-memory-policy
sources:
- https://github.com/containers/nri-plugins
home: https://github.com/containers/nri-plugins
type: application
version: v0.0.0


@@ -0,0 +1,80 @@
# Memory Policy Plugin
This chart deploys the memory-policy Node Resource Interface (NRI)
plugin. The memory-policy NRI plugin configures default Linux memory
policy for containers at creation time.
## Prerequisites
- Kubernetes 1.24+
- Helm 3.0.0+
- Container runtime:
- containerd:
- built with NRI > v0.9.0, needs [command line adjustments](https://github.com/containerd/nri/commit/eba3d98ffa7db804e67fd79dd791f95b163ed960).
- CRI-O
- built with NRI > v0.9.0, needs [command line adjustments](https://github.com/containerd/nri/commit/eba3d98ffa7db804e67fd79dd791f95b163ed960).
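For containerd, enabling NRI means making sure the NRI plugin is not disabled in the runtime configuration. A hypothetical `/etc/containerd/config.toml` fragment for containerd 1.7 is sketched below; the timeout keys are assumptions matching the chart's `nri.runtime.config` options, so check them against your containerd version before relying on them:

```toml
# Sketch of an NRI section in /etc/containerd/config.toml (containerd 1.7).
# The `nri.runtime.patchConfig` init container can write this for you.
[plugins."io.containerd.nri.v1.nri"]
  # NRI is disabled by default in containerd 1.7; turn it on here.
  disable = false
  # Assumed timeout keys, mirroring pluginRegistrationTimeout and
  # pluginRequestTimeout in the chart values.
  plugin_registration_timeout = "5s"
  plugin_request_timeout = "2s"
```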
## Installing the Chart
Path to the chart: `nri-memory-policy`.
```sh
helm repo add nri-plugins https://containers.github.io/nri-plugins
helm install my-memory-policy nri-plugins/nri-memory-policy --namespace kube-system
```
The command above deploys the memory-policy plugin on the Kubernetes cluster
in the `kube-system` namespace with the default configuration. To customize
the parameters described in the [Configuration options](#configuration-options)
section below, you have two options: use the `--set` flag, or create a custom
values.yaml file and provide it with the `-f` flag. For example:
```sh
# Install the memory-policy plugin with custom values specified in a custom values.yaml file
cat <<EOF > myPath/values.yaml
nri:
runtime:
patchConfig: true
plugin:
index: 92
tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
EOF
helm install my-memory-policy nri-plugins/nri-memory-policy --namespace kube-system -f myPath/values.yaml
```
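The same override can be expressed with `--set` flags instead of a values file; a minimal sketch using the `nri.runtime.patchConfig` and `nri.plugin.index` values shown above:

```sh
# Equivalent install using --set instead of a custom values.yaml
helm install my-memory-policy nri-plugins/nri-memory-policy \
  --namespace kube-system \
  --set nri.runtime.patchConfig=true \
  --set nri.plugin.index=92
```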
## Uninstalling the Chart
To uninstall the memory-policy plugin, run the following command:
```sh
helm delete my-memory-policy --namespace kube-system
```
## Configuration options
The table below presents an overview of the parameters available for users to
customize with their own values, along with the default values.
| Name | Default | Description |
| ------------------------ | ----------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------- |
| `image.name` | [ghcr.io/containers/nri-plugins/nri-memory-policy](https://ghcr.io/containers/nri-plugins/nri-memory-policy) | container image name |
| `image.tag` | unstable | container image tag |
| `image.pullPolicy` | Always | image pull policy |
| `resources.cpu` | 10m | cpu resources for the Pod |
| `resources.memory` | 100Mi | memory quota for the Pod |
| `nri.runtime.config.pluginRegistrationTimeout` | "" | set NRI plugin registration timeout in NRI config of containerd or CRI-O |
| `nri.runtime.config.pluginRequestTimeout` | "" | set NRI plugin request timeout in NRI config of containerd or CRI-O |
| `nri.runtime.patchConfig` | false | patch NRI configuration in containerd or CRI-O |
| `nri.plugin.index` | 92 | NRI plugin index, larger than in NRI resource plugins |
| `initImage.name` | [ghcr.io/containers/nri-plugins/config-manager](https://ghcr.io/containers/nri-plugins/config-manager) | init container image name |
| `initImage.tag` | unstable | init container image tag |
| `initImage.pullPolicy` | Always | init container image pull policy |
| `tolerations` | [] | specify taint toleration key, operator and effect |
| `affinity` | [] | specify node affinity |
| `nodeSelector` | [] | specify node selector labels |
| `podPriorityClassNodeCritical` | true | enable [marking Pod as node critical](https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical) |
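After installation you can check that the plugin's DaemonSet pods are running; the selector label below comes from the chart's `_helpers.tpl`:

```sh
# List the memory-policy plugin pods deployed by the chart
kubectl get pods -n kube-system -l app.kubernetes.io/name=nri-memory-policy
```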


@@ -0,0 +1,16 @@
{{/*
Common labels
*/}}
{{- define "nri-plugin.labels" -}}
helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{ include "nri-plugin.selectorLabels" . }}
{{- end -}}
{{/*
Selector labels
*/}}
{{- define "nri-plugin.selectorLabels" -}}
app.kubernetes.io/name: nri-memory-policy
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end -}}


@@ -0,0 +1,10 @@
apiVersion: v1
kind: ConfigMap
metadata:
name: nri-memory-policy-config.default
namespace: {{ .Release.Namespace }}
labels:
{{- include "nri-plugin.labels" . | nindent 4 }}
data:
config.yaml: |
{{- toYaml .Values.config | nindent 4 }}


@@ -0,0 +1,102 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
labels:
{{- include "nri-plugin.labels" . | nindent 4 }}
name: nri-memory-policy
namespace: {{ .Release.Namespace }}
spec:
selector:
matchLabels:
{{- include "nri-plugin.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "nri-plugin.labels" . | nindent 8 }}
spec:
{{- with .Values.tolerations }}
tolerations:
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.affinity }}
affinity:
{{- toYaml . | nindent 8 }}
{{- end }}
nodeSelector:
kubernetes.io/os: "linux"
{{- with .Values.nodeSelector }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- if .Values.nri.runtime.patchConfig }}
initContainers:
- name: patch-runtime
{{- if (not (or (eq .Values.nri.runtime.config nil) (eq .Values.nri.runtime.config.pluginRegistrationTimeout ""))) }}
args:
- -nri-plugin-registration-timeout
- {{ .Values.nri.runtime.config.pluginRegistrationTimeout }}
- -nri-plugin-request-timeout
- {{ .Values.nri.runtime.config.pluginRequestTimeout }}
{{- end }}
image: {{ .Values.initContainerImage.name }}:{{ .Values.initContainerImage.tag | default .Chart.AppVersion }}
imagePullPolicy: {{ .Values.initContainerImage.pullPolicy }}
volumeMounts:
- name: containerd-config
mountPath: /etc/containerd
- name: crio-config
mountPath: /etc/crio/crio.conf.d
- name: dbus-socket
mountPath: /var/run/dbus/system_bus_socket
securityContext:
privileged: true
{{- end }}
containers:
- name: nri-memory-policy
command:
- nri-memory-policy
- --idx
- "{{ .Values.nri.plugin.index | int | printf "%02d" }}"
- --config
- /etc/nri/memory-policy/config.yaml
- -v
image: {{ .Values.image.name }}:{{ .Values.image.tag | default .Chart.AppVersion }}
imagePullPolicy: {{ .Values.image.pullPolicy }}
resources:
requests:
cpu: {{ .Values.resources.cpu }}
memory: {{ .Values.resources.memory }}
volumeMounts:
- name: memory-policy-config-vol
mountPath: /etc/nri/memory-policy
- name: nrisockets
mountPath: /var/run/nri
- name: mpolset-vol
mountPath: /mnt/nri-memory-policy-mpolset
{{- if .Values.podPriorityClassNodeCritical }}
priorityClassName: system-node-critical
{{- end }}
volumes:
- name: memory-policy-config-vol
configMap:
name: nri-memory-policy-config.default
- name: nrisockets
hostPath:
path: /var/run/nri
type: DirectoryOrCreate
- name: mpolset-vol
hostPath:
path: /mnt/nri-memory-policy-mpolset
type: DirectoryOrCreate
{{- if .Values.nri.runtime.patchConfig }}
- name: containerd-config
hostPath:
path: /etc/containerd/
type: DirectoryOrCreate
- name: crio-config
hostPath:
path: /etc/crio/crio.conf.d/
type: DirectoryOrCreate
- name: dbus-socket
hostPath:
path: /var/run/dbus/system_bus_socket
type: Socket
{{- end }}


@@ -0,0 +1,117 @@
{
"$schema": "http://json-schema.org/schema#",
"required": [
"image",
"resources"
],
"properties": {
"image": {
"type": "object",
"required": [
"name",
"pullPolicy"
],
"properties": {
"name": {
"type": "string"
},
"tag": {
"type": "string"
},
"pullPolicy": {
"type": "string",
"enum": ["Never", "Always", "IfNotPresent"]
}
}
},
"initContainerImage": {
"type": "object",
"required": [
"name",
"pullPolicy"
],
"properties": {
"name": {
"type": "string"
},
"tag": {
"type": "string"
},
"pullPolicy": {
"type": "string",
"enum": ["Never", "Always", "IfNotPresent"]
}
}
},
"resources": {
"type": "object",
"required": [
"cpu",
"memory"
],
"properties": {
"cpu": {
"type": "string"
},
"memory": {
"type": "string"
}
}
},
"nri": {
"type": "object",
"required": [
"plugin",
"runtime"
],
"properties": {
"plugin": {
"type": "object",
"required": [
"index"
],
"properties": {
"index": {
"type": "integer",
"minimum": 0,
"maximum": 99
}
}
},
"runtime": {
"type": "object",
"required": [
"patchConfig"
],
"properties": {
"patchConfig": {
"type": "boolean"
},
"config": {
"type": "object",
"required": [
"pluginRegistrationTimeout",
"pluginRequestTimeout"
],
"properties": {
"pluginRegistrationTimeout": {
"type": "string",
"$comment": "allowed range is 5-30s",
"pattern": "^(([5-9])|([1-2][0-9])|(30))s$"
},
"pluginRequestTimeout": {
"type": "string",
"$comment": "allowed range is 2-30s",
"pattern": "^(([2-9])|([1-2][0-9])|(30))s$"
}
}
}
}
}
}
},
"podPriorityClassNodeCritical": {
"type": "boolean"
}
}
}


@@ -0,0 +1,89 @@
# Default values for memory-policy.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
---
image:
name: ghcr.io/containers/nri-plugins/nri-memory-policy
# tag, if defined, overrides the image tag; otherwise Chart.AppVersion will be used
#tag: unstable
pullPolicy: Always
config:
injectMpolset: true
classes:
- name: interleave-all
policy:
mode: MPOL_INTERLEAVE
nodes: allowed-mems
- name: interleave-cpu-packages
policy:
mode: MPOL_INTERLEAVE
nodes: cpu-packages
- name: interleave-cpu-nodes
policy:
mode: MPOL_INTERLEAVE
nodes: cpu-nodes
- name: interleave-within-socket
policy:
mode: MPOL_INTERLEAVE
nodes: max-dist:19
resources:
cpu: 10m
memory: 100Mi
nri:
plugin:
# Plugin index should be large enough to let resource policy plugins run first.
# CPU and memory affinity set by a resource policy affect which nodes the memory
# policy can or should use.
index: 95
runtime:
patchConfig: false
# config:
# pluginRegistrationTimeout: 5s
# pluginRequestTimeout: 2s
initContainerImage:
name: ghcr.io/containers/nri-plugins/nri-config-manager
# If not defined, Chart.AppVersion will be used
#tag: unstable
pullPolicy: Always
tolerations: []
#
# Example:
#
# tolerations:
# - key: "node-role.kubernetes.io/control-plane"
# operator: "Exists"
# effect: "NoSchedule"
affinity: []
#
# Example:
#
# affinity:
# nodeAffinity:
# requiredDuringSchedulingIgnoredDuringExecution:
# nodeSelectorTerms:
# - matchExpressions:
# - key: topology.kubernetes.io/disk
# operator: In
# values:
# - ssd
nodeSelector: []
#
# Example:
#
# nodeSelector:
# kubernetes.io/disk: "ssd"
# NRI plugins should be considered as part of the container runtime.
# By default we make them part of the system-node-critical priority
# class. This should mitigate the potential risk of a plugin getting
# evicted under heavy system load. It should also ensure that during
# autoscaling enough new nodes are brought up to leave room for the
# plugin on each new node.
podPriorityClassNodeCritical: true


@@ -123,6 +123,308 @@ spec:
When enabled, policy implementations can adjust cache allocation
and memory bandwidth by assigning containers to RDT classes.
type: boolean
force:
description: Force indicates if the configuration should be
forced to goresctrl.
type: boolean
options:
description: Options contains the common goresctrl/rdt settings.
properties:
l2:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
l3:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
mb:
description: MbOptions contains the common settings for
memory bandwidth allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
type: object
partitions:
additionalProperties:
description: PartitionConfig provides configuration for
a single cache partition.
properties:
classes:
additionalProperties:
description: ClassConfig provides configuration for
a single named cache CLOS/class.
properties:
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
type: object
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
description: Partitions configure cache partitions.
type: object
usePodQoSAsDefaultClass:
description: |-
usePodQoSAsDefaultClass controls whether a container's Pod QoS


@@ -96,7 +96,13 @@ spec:
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
drop:
- ALL
{{- if .Values.config.control.rdt.enable }}
add:
- SYS_ADMIN
- DAC_OVERRIDE
{{- end }}
resources:
requests:
cpu: {{ .Values.resources.cpu }}


@@ -11,6 +11,50 @@ image:
config:
reservedResources:
cpu: 750m
agent:
nodeResourceTopology: true
podResourceAPI: false
control:
rdt:
enable: false
usePodQoSAsDefaultClass: true
options:
l2:
optional: true
l3:
optional: true
mb:
optional: true
partitions:
fullCache:
l2Allocation:
all:
unified: 100%
l3Allocation:
all:
unified: 100%
classes:
BestEffort:
l2Allocation:
all:
unified: 33%
l3Allocation:
all:
unified: 33%
Burstable:
l2Allocation:
all:
unified: 66%
l3Allocation:
all:
unified: 66%
Guaranteed:
l2Allocation:
all:
unified: 100%
l3Allocation:
all:
unified: 100%
log:
source: true
klog:


@@ -137,6 +137,308 @@ spec:
When enabled, policy implementations can adjust cache allocation
and memory bandwidth by assigning containers to RDT classes.
type: boolean
force:
description: Force indicates if the configuration should be
forced to goresctrl.
type: boolean
options:
description: Options contains the common goresctrl/rdt settings.
properties:
l2:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
l3:
description: CatOptions contains the common settings for
cache allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
mb:
description: MbOptions contains the common settings for
memory bandwidth allocation.
properties:
optional:
type: boolean
required:
- optional
type: object
type: object
partitions:
additionalProperties:
description: PartitionConfig provides configuration for
a single cache partition.
properties:
classes:
additionalProperties:
description: ClassConfig provides configuration for
a single named cache CLOS/class.
properties:
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache
allocation configuration for one cache id.\nCode
and Data represent an optional configuration
for separate code and data\npaths and only
have effect when RDT CDP (Code and Data Prioritization)
is\nenabled in the system. Code and Data go
in tandem so that both or neither\nmust be
specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub):
Ideally we'd have a validation rule ensuring
that either\n\tunified or code+data are set
here. I tried that using a CEL-expression\n\tbut
couldn't avoid hitting the complexity estimation
limit (even with\n\textra MaxProperties limits
thrown in). Maybe we'll be able to do that\n\teventually
with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache
allocation configuration for one partition or
class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
type: object
l2Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
l3Allocation:
additionalProperties:
description: "CacheIdCatConfig is the cache allocation
configuration for one cache id.\nCode and Data represent
an optional configuration for separate code and
data\npaths and only have effect when RDT CDP (Code
and Data Prioritization) is\nenabled in the system.
Code and Data go in tandem so that both or neither\nmust
be specified - only specifying the other is considered
a configuration\nerror.\n\n\tTODO(klihub): Ideally
we'd have a validation rule ensuring that either\n\tunified
or code+data are set here. I tried that using a
CEL-expression\n\tbut couldn't avoid hitting the
complexity estimation limit (even with\n\textra
MaxProperties limits thrown in). Maybe we'll be
able to do that\n\teventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212"
properties:
code:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
data:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
unified:
description: |-
CacheProportion specifies a share of the available cache lines.
Supported formats:
- percentage, e.g. `50%`
- percentage range, e.g. `50-60%`
- bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
- hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type: string
type: object
description: CatConfig contains the L2 or L3 cache allocation
configuration for one partition or class.
type: object
mbAllocation:
additionalProperties:
description: |-
CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
It's an array of at most two values, specifying separate values to be used
for percentage based and MBps based memory bandwidth allocation. For
example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
allocation is used by the Linux kernel, or 1000 MBps in case MBps based
allocation is in use.
items:
description: |-
MbProportion specifies a share of available memory bandwidth. It's an
integer value followed by a unit. Two units are supported:
- percentage, e.g. `80%`
- MBps, e.g. `1000MBps`
type: string
type: array
description: MbaConfig contains the memory bandwidth
configuration for one partition or class.
type: object
type: object
description: Partitions configure cache partitions.
type: object
usePodQoSAsDefaultClass:
description: |-
usePodQoSAsDefaultClass controls whether a container's Pod QoS


@@ -103,7 +103,13 @@ spec:
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop: ["ALL"]
drop:
- ALL
{{- if .Values.config.control.rdt.enable }}
add:
- SYS_ADMIN
- DAC_OVERRIDE
{{- end }}
resources:
requests:
cpu: {{ .Values.resources.cpu }}


@@ -78,6 +78,9 @@
"type": "integer",
"minimum": 0,
"maximum": 99
},
"enableRDT": {
"type": "boolean"
}
}
},


@@ -11,10 +11,56 @@ image:
config:
reservedResources:
cpu: 750m
agent:
nodeResourceTopology: true
podResourceAPI: false
control:
rdt:
enable: false
usePodQoSAsDefaultClass: true
options:
l2:
optional: true
l3:
optional: true
mb:
optional: true
partitions:
fullCache:
l2Allocation:
all:
unified: 100%
l3Allocation:
all:
unified: 100%
classes:
BestEffort:
l2Allocation:
all:
unified: 33%
l3Allocation:
all:
unified: 33%
Burstable:
l2Allocation:
all:
unified: 66%
l3Allocation:
all:
unified: 66%
Guaranteed:
l2Allocation:
all:
unified: 100%
l3Allocation:
all:
unified: 100%
log:
source: true
klog:
skip_headers: true
debug:
- none
instrumentation:
httpEndpoint: ":8891"
prometheusExport: false


@@ -1,6 +1,6 @@
FROM sphinxdoc/sphinx:7.4.7
ARG GO_VERSION=1.23
ARG GO_VERSION=1.24.3
RUN apt-get update && apt-get install -y wget git


@@ -95,7 +95,7 @@ Balloons policy parameters:
result, the policy will not modify CPU or memory pinning of
matching containers.
```
ignore:
preserve:
matchExpressions:
- key: name
operator: In
@@ -275,6 +275,13 @@ Balloons policy parameters:
should get their CPUs from separate cache blocks for best
performance. Every listed class must be specified in
`loadClasses`.
- `components` is a list of components of a balloon. If a balloon
consists of components, its CPUs are allocated by allocating CPUs
for each component balloon separately, and then adding them up.
See [combining balloons](#combining-balloons) for more details and
an example. Properties of components in the list are:
- `balloonType` specifies the name of the balloon type according
to which CPUs are allocated to this component.
- `loadClasses`: lists properties of loads that containers in balloons
generate to some parts of the system. When the policy allocates CPUs
for load generating balloon instances, it selects CPUs so that it
@@ -318,6 +325,13 @@ Balloons policy parameters:
and assigned containers are readable through `/metrics` from the
httpEndpoint.
- `reportPeriod`: `/metrics` aggregation interval for polled metrics.
- `metrics`: defines which metrics to collect.
- `enabled`: a list of glob patterns that match metrics to collect.
Example: `["policy"]`
- `samplingRatePerMillion`: the number of samples to collect per million spans.
Example: `100000`
- `tracingCollector`: defines the external endpoint for tracing data collection.
Example: `otlp-http://localhost:4318`.
- `agent`: controls communicating with the Kubernetes node agent and
the API server.
- `nodeResourceTopology`: if `true`, expose balloons as node
@@ -325,6 +339,11 @@ Balloons policy parameters:
resources. Moreover, showing containers assigned to balloons and
their CPU/memory affinities can be enabled with
`showContainersInNrt`. The default is `false`.
- `log`: contains the logging configuration for the policy.
- `debug`: an array of components to enable debug logging for.
Example: `["policy"]`.
- `source`: set to `true` to prefix messages with the name of the logger
source.
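As a sketch, the logging options above correspond to a configuration fragment like the following (the `policy` component name is illustrative):

```yaml
log:
  source: true     # prefix messages with the name of the logger source
  debug:
    - policy       # enable debug logging for the policy component
```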
### Example
@@ -522,6 +541,128 @@ metadata:
memory-type.resource-policy.nri.io/container.LLM: HBM,DRAM
```
## Combining Balloons
Sometimes a container needs a set of CPUs in which some CPUs have
different properties, or are selected by different criteria, than the
other CPUs.
Such a container needs to be assigned to a composite balloon. A
composite balloon consists of component balloons: specifying
`components` in a balloon type makes it a composite balloon type, and
composite balloons get their CPUs by combining the CPUs of their
components.
Each component specifies its balloon type, which must be defined in
the `balloonTypes` list. CPUs are allocated to a component based
solely on its own balloon type configuration. As CPUs are not
allocated directly to composite balloons, CPU allocation parameters
are not allowed in composite balloon types.
When the policy creates a new composite balloon, it also creates
hidden instances of the balloon's components. Resizing the composite
balloon due to changes in its containers resizes these hidden
instances, too.
Example: allocate CPUs for distributed AI inference containers so
that, depending on the balloon type, a container will get:
- an equal number of CPUs from all 4 NUMA nodes in the system
- an equal number of CPUs from both NUMA nodes on CPU package 0
- an equal number of CPUs from both NUMA nodes on CPU package 1.
The following balloon type configuration implements this. Containers
can be assigned to balloons `balance-all-nodes`, `balance-pkg0-nodes`,
and `balance-pkg1-nodes`, respectively.
```yaml
balloonTypes:
- name: balance-all-nodes
components:
- balloonType: balance-pkg0-nodes
- balloonType: balance-pkg1-nodes
- name: balance-pkg0-nodes
components:
- balloonType: node0
- balloonType: node1
- name: balance-pkg1-nodes
components:
- balloonType: node2
- balloonType: node3
- name: node0
preferCloseToDevices:
- /sys/devices/system/node/node0
- name: node1
preferCloseToDevices:
- /sys/devices/system/node/node1
- name: node2
preferCloseToDevices:
- /sys/devices/system/node/node2
- name: node3
preferCloseToDevices:
- /sys/devices/system/node/node3
```
## Prevent Creating a Container
Sometimes unwanted or unknown containers must not be created on a
node, not even when such a pod is accidentally scheduled on the node.
This can be prevented by a policy with a balloon type that cannot run
any container that matches the type. One way to define such a balloon
type is to specify that instances of the type never have enough CPUs
to run any containers, that is, by setting `maxCPUs` and `minCPUs`
to -1.
```yaml
apiVersion: config.nri/v1alpha1
kind: BalloonsPolicy
metadata:
name: default
namespace: kube-system
spec:
balloonTypes:
- name: unknown-containers
maxCPUs: -1
minCPUs: -1
matchExpressions:
- key: name
operator: NotIn
values:
- containerA
- containerB
...
```
## Reset CPU and memory pinning
CPU and memory pinning of all containers can be forcibly reset with
the following policy. The policy assigns containers from all
namespaces to the same "reserved" balloon instance, and allows them to
use all other CPUs in the system, too.
```yaml
apiVersion: config.nri/v1alpha1
kind: BalloonsPolicy
metadata:
name: default
namespace: kube-system
spec:
balloonTypes:
- name: reserved
namespaces:
- "*"
shareIdleCPUsInSame: system
reservedResources:
cpu: 1
pinCPU: true
pinMemory: true
```
## Metrics and Debugging
In order to enable more verbose logging and metrics exporting from the


@@ -0,0 +1,151 @@
# Common Functionality
## Overview
The generic resource management infrastructure shared by all resource policy
plugin implementations provides some common functionality. This functionality is
available in all policies, unless stated otherwise in the policy-specific
documentation.
## Cache Allocation
Plugins can be configured to exercise class-based control over the L2 and L3 cache
allocated to containers' processes. In practice, containers are assigned to classes,
and each class has a corresponding cache allocation configuration. This configuration
is applied to all containers assigned to the class and subsequently to all processes
started in those containers.
To enable cache control, use the `control.rdt.enable` option, which defaults to `false`.
Plugins can be configured to assign containers by default to a cache class named after
the Pod QoS class of the container: one of `BestEffort`, `Burstable`, or `Guaranteed`.
The configuration setting controlling this behavior is `control.rdt.usePodQoSAsDefaultClass`
and it defaults to `false`.
Additionally, containers can be explicitly annotated to be assigned to a class,
using the `rdtclass.resource-policy.nri.io` annotation key. For instance:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
  annotations:
    rdtclass.resource-policy.nri.io/pod: poddefaultclass
    rdtclass.resource-policy.nri.io/container.special-container: specialclass
  ...
```
This will assign the container named `special-container` within the pod to
the `specialclass` RDT class and any other container within the pod to the
`poddefaultclass` RDT class. Effectively these containers' processes will
be assigned to the RDT CLOSes corresponding to those classes.
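On a node with the resctrl filesystem mounted, this assignment can be inspected
directly: each class directory under `/sys/fs/resctrl` has a `tasks` file
listing the PIDs assigned to that CLOS. A hedged sketch, reusing the
`specialclass` name from the example above:

```shell
# List the PIDs assigned to the 'specialclass' CLOS, if present.
if [ -r /sys/fs/resctrl/specialclass/tasks ]; then
  cat /sys/fs/resctrl/specialclass/tasks
else
  echo "resctrl not mounted or class 'specialclass' not present"
fi
```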
### Cache Class/Partitioning Configuration
RDT configuration is supplied as part of the `control.rdt` configuration block.
Here is a sample snippet, as a Helm chart value, which assigns 33%, 66% and 100%
of cache lines to `BestEffort`, `Burstable` and `Guaranteed` Pod QoS class
containers, respectively:
```yaml
config:
  control:
    rdt:
      enable: true
      usePodQoSAsDefaultClass: true
      options:
        l2:
          optional: true
        l3:
          optional: true
        mb:
          optional: true
      partitions:
        fullCache:
          l2Allocation:
            all:
              unified: 100%
          l3Allocation:
            all:
              unified: 100%
          classes:
            BestEffort:
              l2Allocation:
                all:
                  unified: 33%
              l3Allocation:
                all:
                  unified: 33%
            Burstable:
              l2Allocation:
                all:
                  unified: 66%
              l3Allocation:
                all:
                  unified: 66%
            Guaranteed:
              l2Allocation:
                all:
                  unified: 100%
              l3Allocation:
                all:
                  unified: 100%
```
The actual library used to implement cache control is [goresctrl](https://github.com/intel/goresctrl).
Please refer to its [documentation](https://github.com/intel/goresctrl/blob/main/doc/rdt.md) for
a more detailed description of configuration semantics.
#### A Warning About Configuration Syntax Differences
Note that the configuration syntax used for cache partitioning and classes is
slightly different for [goresctrl](https://github.com/intel/goresctrl/blob/main/doc/rdt.md)
and the NRI Reference Plugins. When using goresctrl directly, you can use a
shorthand notation like this
```yaml
...
classes:
  fullCache:
    l2Allocation:
      all: 100%
    l3Allocation:
      all: 100%
...
```
to actually mean
```yaml
...
classes:
  fullCache:
    l2Allocation:
      all:
        unified: 100%
    l3Allocation:
      all:
        unified: 100%
...
```
This is not possible with the NRI Reference Plugins configuration CR, where you
must always use the full syntax.
### Cache Allocation Prerequisites
Note that for cache allocation control to work, you must have
- a hardware platform which supports cache allocation
- resctrlfs pseudofilesystem enabled in your kernel, and loaded if it is a module
- the resctrlfs filesystem mounted (possibly with extra options for your platform)
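These prerequisites can be checked from a shell on the node. A rough sketch
(the CPU flag and mount-point names are the conventional Linux ones, not
plugin-specific):

```shell
# 1. Hardware support: cache allocation shows up as CPU flags such as
#    cat_l3/cat_l2 (and mba for memory bandwidth allocation).
grep -m1 -oE 'cat_l3|cat_l2|mba' /proc/cpuinfo \
  || echo "no cache allocation CPU flags found"

# 2. Kernel support: the resctrl filesystem type must be known to the kernel.
grep -q resctrl /proc/filesystems \
  && echo "resctrl supported by kernel" \
  || echo "resctrl not supported by this kernel"

# 3. Mounted: check /proc/mounts; if missing, mount it (requires root).
grep -q ' /sys/fs/resctrl ' /proc/mounts \
  || echo "not mounted; try: mount -t resctrl resctrl /sys/fs/resctrl"
```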
## Cache Usage Monitoring
TBD
## Memory Bandwidth Allocation
TBD
## Memory Bandwidth Monitoring
TBD


@ -1,6 +1,6 @@
# Policies
Currently there are two resource policies:
Currently there are two real resource policies implemented:
The Topology Aware resource policy provides a nearly zero configuration
resource policy that allocates resources evenly in order to avoid the "noisy
@ -9,6 +9,15 @@ neighbor" problem.
The Balloons resource policy allows the user to allocate workloads to resources
in a more user-controlled way.
Additionally there is a wire-frame Template resource policy implementation
without any real resource assignment logic. It can be used as a template to
implement new policies from scratch.
Also, there is some common functionality offered by the shared generic resource
management code used in these policies. This functionality is available in all
policies.
```{toctree}
---
maxdepth: 1
@ -16,4 +25,5 @@ maxdepth: 1
topology-aware.md
balloons.md
template.md
common-functionality.md
```


@ -540,6 +540,68 @@ locality to a NUMA node is advertised by the API. Annotated allow and deny
lists can be used to selectively disable or enable per-resource hints, using
`podresapi:$RESOURCE_NAME` as the path for the resource.
### Picking CPU And Memory By Topology Hints
Normally topology hints are only used to pick the assigned pool for a workload.
Once a pool is selected, the available resources within the pool are considered
equally good for satisfying the topology hints. When the policy allocates
exclusive CPUs and picks pinned memory for the workload, only other criteria
and attributes are considered when picking the individual resources.
When multiple devices are allocated to a single container, this default
assumption of all resources within the pool being topologically equal may not
hold. If a container is allocated misaligned devices, that is, devices with
different memory or CPU locality, it is possible that only some of the CPUs
and memory in the selected pool satisfy the device hints and therefore have
the desired locality.
For instance, consider a two-socket system where socket #0 has NUMA nodes #0
and #1, and socket #1 has NUMA nodes #2 and #3. If a container is allocated two
devices, one with locality to node #0 and another with locality to node #3, the
only pool fulfilling the topology hints of both devices is the root node.
However, half of the resources in that pool are optimal for only one of the
devices, and the other half are optimal for neither.
A container can be annotated to prefer hint-based selection and pinning of CPU
and memory resources using the `pick-resources-by-hints.resource-policy.nri.io`
annotation. For example:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: data-pump
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-net1
    prefer-isolated-cpus.resource-policy.nri.io/container.ctr0: "true"
    pick-resources-by-hints.resource-policy.nri.io/container.ctr0: "true"
spec:
  containers:
  - name: ctr0
    image: dpdk-pump
    imagePullPolicy: Always
    resources:
      requests:
        cpu: 2
        memory: 100M
        vendor.com/sriov_netdevice_A: '1'
        vendor.com/sriov_netdevice_B: '1'
      limits:
        vendor.com/sriov_netdevice_A: '1'
        vendor.com/sriov_netdevice_B: '1'
        cpu: 2
        memory: 100M
```
With this annotation in place, the policy tries to pick one exclusive isolated
CPU with locality to one device and another with locality to the other. It also
tries to pick and pin memory aligned with these devices. If this succeeds for
all devices, the effective resources for the container are the union of the
individually picked resources. If picking resources by hints fails for any of
the devices, the policy falls back to picking resources from the pool without
considering device hints.
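Whether hint-based picking took effect can be seen from the container's
resulting cpuset. A hedged sketch assuming cgroup v2; the cgroup path below is
hypothetical and must be resolved for the actual container:

```shell
# Hypothetical cgroup path; resolve the real one for your container,
# e.g. under /sys/fs/cgroup/kubepods.slice on a kubeadm-style node.
CGROUP=/sys/fs/cgroup/kubepods.slice/example.scope

# The effective cpuset files show the CPUs and memory nodes the
# container is actually pinned to.
for f in cpuset.cpus.effective cpuset.mems.effective; do
  if [ -r "$CGROUP/$f" ]; then
    echo "$f: $(cat "$CGROUP/$f")"
  else
    echo "$f: not readable at $CGROUP"
  fi
done
```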
## Container Affinity and Anti-Affinity
### Introduction

go.mod

@ -1,39 +1,38 @@
module github.com/containers/nri-plugins
go 1.23.4
go 1.24.0
require (
github.com/askervin/gofmbt v0.0.0-20250119175120-506d925f666f
github.com/containerd/nri v0.6.0
github.com/containerd/nri v0.9.1-0.20250530003506-6120e633d4ad
github.com/containerd/otelttrpc v0.0.0-20240305015340-ea5083fda723
github.com/containerd/ttrpc v1.2.3-0.20231030150553-baadfd8e7956
github.com/containerd/ttrpc v1.2.7
github.com/containers/nri-plugins/pkg/topology v0.0.0
github.com/coreos/go-systemd/v22 v22.5.0
github.com/fsnotify/fsnotify v1.6.0
github.com/intel/goresctrl v0.8.0
github.com/intel/goresctrl v0.9.0
github.com/k8stopologyawareschedwg/noderesourcetopology-api v0.1.2
github.com/onsi/ginkgo/v2 v2.19.0
github.com/onsi/gomega v1.33.1
github.com/onsi/ginkgo/v2 v2.21.0
github.com/onsi/gomega v1.35.1
github.com/pelletier/go-toml/v2 v2.1.0
github.com/prometheus/client_golang v1.19.1
github.com/prometheus/client_model v0.6.1
github.com/sirupsen/logrus v1.9.3
github.com/stretchr/testify v1.9.0
github.com/stretchr/testify v1.10.0
go.opentelemetry.io/otel v1.19.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.19.0
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.19.0
go.opentelemetry.io/otel/sdk v1.19.0
go.opentelemetry.io/otel/trace v1.19.0
golang.org/x/sys v0.30.0
golang.org/x/time v0.3.0
golang.org/x/sys v0.31.0
golang.org/x/time v0.9.0
google.golang.org/grpc v1.65.0
k8s.io/api v0.31.2
k8s.io/apimachinery v0.31.2
k8s.io/apimachinery v0.33.1
k8s.io/client-go v0.31.2
k8s.io/code-generator v0.31.2
k8s.io/klog/v2 v2.130.1
k8s.io/kubelet v0.31.2
k8s.io/utils v0.0.0-20240711033017-18e509b52bc8
k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738
sigs.k8s.io/controller-runtime v0.16.2
sigs.k8s.io/yaml v1.4.0
)
@ -42,27 +41,28 @@ require (
github.com/beorn7/perks v1.0.1 // indirect
github.com/cenkalti/backoff/v4 v4.2.1 // indirect
github.com/cespare/xxhash/v2 v2.3.0 // indirect
github.com/containerd/log v0.1.0 // indirect
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
github.com/emicklei/go-restful/v3 v3.11.0 // indirect
github.com/fxamacker/cbor/v2 v2.7.0 // indirect
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/go-openapi/jsonpointer v0.19.6 // indirect
github.com/go-openapi/jsonpointer v0.21.0 // indirect
github.com/go-openapi/jsonreference v0.20.2 // indirect
github.com/go-openapi/swag v0.22.4 // indirect
github.com/go-openapi/swag v0.23.0 // indirect
github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
github.com/godbus/dbus/v5 v5.0.4 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang/protobuf v1.5.4 // indirect
github.com/google/gnostic-models v0.6.8 // indirect
github.com/google/go-cmp v0.6.0 // indirect
github.com/google/gofuzz v1.2.0 // indirect
github.com/google/pprof v0.0.0-20240525223248-4bfdf5a9a2af // indirect
github.com/google/gnostic-models v0.6.9 // indirect
github.com/google/go-cmp v0.7.0 // indirect
github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db // indirect
github.com/google/uuid v1.6.0 // indirect
github.com/grpc-ecosystem/grpc-gateway/v2 v2.18.0 // indirect
github.com/imdario/mergo v0.3.6 // indirect
github.com/josharian/intern v1.0.0 // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/knqyf263/go-plugin v0.8.1-0.20240827022226-114c6257e441 // indirect
github.com/mailru/easyjson v0.7.7 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
@ -73,32 +73,35 @@ require (
github.com/prometheus/common v0.55.0 // indirect
github.com/prometheus/procfs v0.15.1 // indirect
github.com/spf13/pflag v1.0.5 // indirect
github.com/tetratelabs/wazero v1.9.0 // indirect
github.com/x448/float16 v0.8.4 // indirect
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.19.0 // indirect
go.opentelemetry.io/otel/metric v1.19.0 // indirect
go.opentelemetry.io/proto/otlp v1.0.0 // indirect
golang.org/x/mod v0.17.0 // indirect
golang.org/x/net v0.36.0 // indirect
golang.org/x/oauth2 v0.21.0 // indirect
golang.org/x/sync v0.11.0 // indirect
golang.org/x/term v0.29.0 // indirect
golang.org/x/text v0.22.0 // indirect
golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d // indirect
golang.org/x/mod v0.21.0 // indirect
golang.org/x/net v0.38.0 // indirect
golang.org/x/oauth2 v0.27.0 // indirect
golang.org/x/sync v0.12.0 // indirect
golang.org/x/term v0.30.0 // indirect
golang.org/x/text v0.23.0 // indirect
golang.org/x/tools v0.26.0 // indirect
google.golang.org/genproto/googleapis/api v0.0.0-20240528184218-531527333157 // indirect
google.golang.org/genproto/googleapis/rpc v0.0.0-20240701130421-f6361c86f094 // indirect
google.golang.org/protobuf v1.34.2 // indirect
google.golang.org/protobuf v1.36.5 // indirect
gopkg.in/evanphx/json-patch.v4 v4.12.0 // indirect
gopkg.in/inf.v0 v0.9.1 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
k8s.io/cri-api v0.31.2 // indirect
k8s.io/gengo/v2 v2.0.0-20240228010128-51d4e06bde70 // indirect
k8s.io/kube-openapi v0.0.0-20240228011516-70dd3763d340 // indirect
sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd // indirect
sigs.k8s.io/structured-merge-diff/v4 v4.4.1 // indirect
k8s.io/code-generator v0.33.1 // indirect
k8s.io/gengo/v2 v2.0.0-20250207200755-1244d31929d7 // indirect
k8s.io/kube-openapi v0.0.0-20250318190949-c8a335a9a2ff // indirect
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3 // indirect
sigs.k8s.io/randfill v1.0.0 // indirect
sigs.k8s.io/structured-merge-diff/v4 v4.6.0 // indirect
)
replace (
github.com/containers/nri-plugins/pkg/topology v0.0.0 => ./pkg/topology
github.com/opencontainers/runtime-tools => github.com/opencontainers/runtime-tools v0.0.0-20221026201742-946c877fa809
)
tool k8s.io/code-generator

go.sum

@ -641,13 +641,15 @@ github.com/cncf/xds/go v0.0.0-20211011173535-cb28da3451f1/go.mod h1:eXthEFrGJvWH
github.com/cncf/xds/go v0.0.0-20220314180256-7f1daf1720fc/go.mod h1:eXthEFrGJvWHgFFCl3hGmgk+/aYT6PnTQLykKQRLhEs=
github.com/cncf/xds/go v0.0.0-20230105202645-06c439db220b/go.mod h1:eXthEFrGJvWHgFFCl3hGmgk+/aYT6PnTQLykKQRLhEs=
github.com/cncf/xds/go v0.0.0-20230607035331-e9ce68804cb4/go.mod h1:eXthEFrGJvWHgFFCl3hGmgk+/aYT6PnTQLykKQRLhEs=
github.com/containerd/nri v0.6.0 h1:hdztxwL0gCS1CrCa9bvD1SoJiFN4jBuRQhplCvCPMj8=
github.com/containerd/nri v0.6.0/go.mod h1:F7OZfO4QTPqw5r87aq+syZJwiVvRYLIlHZiZDBV1W3A=
github.com/containerd/log v0.1.0 h1:TCJt7ioM2cr/tfR8GPbGf9/VRAX8D2B4PjzCpfX540I=
github.com/containerd/log v0.1.0/go.mod h1:VRRf09a7mHDIRezVKTRCrOq78v577GXq3bSa3EhrzVo=
github.com/containerd/nri v0.9.1-0.20250530003506-6120e633d4ad h1:FiRhXzn9B9ToI30hX/2i4dRUKUoVELTc7PlEqvFLwGE=
github.com/containerd/nri v0.9.1-0.20250530003506-6120e633d4ad/go.mod h1:zA1mhuTD3Frj9fyyIp+1+H2AXS/IueLvxRpkAzmP6RQ=
github.com/containerd/otelttrpc v0.0.0-20240305015340-ea5083fda723 h1:swk9KxrmARZjSMrHc1Lzb39XhcDwAhYpqkBhinCFLCQ=
github.com/containerd/otelttrpc v0.0.0-20240305015340-ea5083fda723/go.mod h1:ZKzztepTSz/LKtbUSzfBNVwgqBEPABVZV9PQF/l53+Q=
github.com/containerd/ttrpc v1.2.2/go.mod h1:sIT6l32Ph/H9cvnJsfXM5drIVzTr5A2flTf1G5tYZak=
github.com/containerd/ttrpc v1.2.3-0.20231030150553-baadfd8e7956 h1:BQwXCrKPRdDQvTYfiDatp36FIH/EF7JTBOZU+EPIKWY=
github.com/containerd/ttrpc v1.2.3-0.20231030150553-baadfd8e7956/go.mod h1:ieWsXucbb8Mj9PH0rXCw1i8IunRbbAiDkpXkbfflWBM=
github.com/containerd/ttrpc v1.2.7 h1:qIrroQvuOL9HQ1X6KHe2ohc7p+HP/0VE6XPU7elJRqQ=
github.com/containerd/ttrpc v1.2.7/go.mod h1:YCXHsb32f+Sq5/72xHubdiJRQY9inL4a4ZQrAbN1q9o=
github.com/coreos/go-systemd/v22 v22.5.0 h1:RrqgGjYQKalulkV8NGVIfkXQf6YYmOyiJKk8iXXhfZs=
github.com/coreos/go-systemd/v22 v22.5.0/go.mod h1:Y58oyj3AT4RCenI/lSvhwexgC+NSVTIJ3seZv2GcEnc=
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
@ -697,13 +699,14 @@ github.com/go-logr/logr v1.4.2 h1:6pFjapn8bFcIbiKo3XT4j/BhANplGihG6tvd+8rYgrY=
github.com/go-logr/logr v1.4.2/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY=
github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=
github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE=
github.com/go-openapi/jsonpointer v0.19.6 h1:eCs3fxoIi3Wh6vtgmLTOjdhSpiqphQ+DaPn38N2ZdrE=
github.com/go-openapi/jsonpointer v0.19.6/go.mod h1:osyAmYz/mB/C3I+WsTTSgw1ONzaLJoLCyoi6/zppojs=
github.com/go-openapi/jsonpointer v0.21.0 h1:YgdVicSA9vH5RiHs9TZW5oyafXZFc6+2Vc1rr/O9oNQ=
github.com/go-openapi/jsonpointer v0.21.0/go.mod h1:IUyH9l/+uyhIYQ/PXVA41Rexl+kOkAPDdXEYns6fzUY=
github.com/go-openapi/jsonreference v0.20.2 h1:3sVjiK66+uXK/6oQ8xgcRKcFgQ5KXa2KvnJRumpMGbE=
github.com/go-openapi/jsonreference v0.20.2/go.mod h1:Bl1zwGIM8/wsvqjsOQLJ/SH+En5Ap4rVB5KVcIDZG2k=
github.com/go-openapi/swag v0.22.3/go.mod h1:UzaqsxGiab7freDnrUUra0MwWfN/q7tE4j+VcZ0yl14=
github.com/go-openapi/swag v0.22.4 h1:QLMzNJnMGPRNDCbySlcj1x01tzU8/9LTTL9hZZZogBU=
github.com/go-openapi/swag v0.22.4/go.mod h1:UzaqsxGiab7freDnrUUra0MwWfN/q7tE4j+VcZ0yl14=
github.com/go-openapi/swag v0.23.0 h1:vsEVJDUo2hPJ2tu0/Xc+4noaxyEffXNIs3cOULZ+GrE=
github.com/go-openapi/swag v0.23.0/go.mod h1:esZ8ITTYEsH1V2trKHjAN8Ai7xHb8RV+YSZ577vPjgQ=
github.com/go-pdf/fpdf v0.5.0/go.mod h1:HzcnA+A23uwogo0tp9yU+l3V+KXhiESpt1PMayhOh5M=
github.com/go-pdf/fpdf v0.6.0/go.mod h1:HzcnA+A23uwogo0tp9yU+l3V+KXhiESpt1PMayhOh5M=
github.com/go-task/slim-sprig/v3 v3.0.0 h1:sUs3vkvUymDpBKi3qH1YSqBQk9+9D/8M2mN1vB6EwHI=
@ -756,8 +759,8 @@ github.com/golang/snappy v0.0.4/go.mod h1:/XxbfmMg8lxefKM7IXC3fBNl/7bRcc72aCRzEW
github.com/google/btree v0.0.0-20180813153112-4030bb1f1f0c/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
github.com/google/btree v1.0.0/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
github.com/google/flatbuffers v2.0.8+incompatible/go.mod h1:1AeVuKshWv4vARoZatz6mlQ0JxURH0Kv5+zNeJKJCa8=
github.com/google/gnostic-models v0.6.8 h1:yo/ABAfM5IMRsS1VnXjTBvUb61tFIHozhlYvRgGre9I=
github.com/google/gnostic-models v0.6.8/go.mod h1:5n7qKqH0f5wFt+aWF8CW6pZLLNOfYuF5OpfBSENuI8U=
github.com/google/gnostic-models v0.6.9 h1:MU/8wDLif2qCXZmzncUQ/BOfxWfthHi63KqpoNbWqVw=
github.com/google/gnostic-models v0.6.9/go.mod h1:CiWsm0s6BSQd1hRn8/QmxqB6BesYcbSZxsz9b0KuDBw=
github.com/google/go-cmp v0.2.0/go.mod h1:oXzfMopK8JAjlY9xF4vHSVASa0yLyX7SntLO5aqRK0M=
github.com/google/go-cmp v0.3.0/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
github.com/google/go-cmp v0.3.1/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
@ -773,8 +776,8 @@ github.com/google/go-cmp v0.5.6/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/
github.com/google/go-cmp v0.5.7/go.mod h1:n+brtR0CgQNWTVd5ZUFpTBC8YFBDLK/h/bpaJ8/DtOE=
github.com/google/go-cmp v0.5.8/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/go-cmp v0.7.0 h1:wk8382ETsv4JYUZwIsn6YpYiWiBsYLSJiTsyBybVuN8=
github.com/google/go-cmp v0.7.0/go.mod h1:pXiqmnSA92OHEEa9HXL2W4E7lf9JzCmGVUdgjX3N/iU=
github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
github.com/google/gofuzz v1.2.0 h1:xRy4A+RhZaiKjJ1bPfwQ8sedCA+YS2YcCHW6ec7JMi0=
github.com/google/gofuzz v1.2.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
@ -798,8 +801,8 @@ github.com/google/pprof v0.0.0-20210226084205-cbba55b83ad5/go.mod h1:kpwsk12EmLe
github.com/google/pprof v0.0.0-20210601050228-01bbb1931b22/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE=
github.com/google/pprof v0.0.0-20210609004039-a478d1d731e9/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE=
github.com/google/pprof v0.0.0-20210720184732-4bb14d4b1be1/go.mod h1:kpwsk12EmLew5upagYY7GY0pfYCcupk39gWOCRROcvE=
github.com/google/pprof v0.0.0-20240525223248-4bfdf5a9a2af h1:kmjWCqn2qkEml422C2Rrd27c3VGxi6a/6HNq8QmHRKM=
github.com/google/pprof v0.0.0-20240525223248-4bfdf5a9a2af/go.mod h1:K1liHPHnj73Fdn/EKuT8nrFqBihUSKXoLYU0BuatOYo=
github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db h1:097atOisP2aRj7vFgYQBbFN4U4JNXUNYpxael3UzMyo=
github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db/go.mod h1:vavhavw2zAxS5dIdcRluK6cSGGPlZynqzFM8NdvU144=
github.com/google/renameio v0.1.0/go.mod h1:KWCgfxg9yswjAJkECMjeO8J8rahYeXnNhOm40UhjYkI=
github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/google/uuid v1.3.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
@ -835,8 +838,8 @@ github.com/ianlancetaylor/demangle v0.0.0-20181102032728-5e5cf60278f6/go.mod h1:
github.com/ianlancetaylor/demangle v0.0.0-20200824232613-28f6c0f3b639/go.mod h1:aSSvb/t6k1mPoxDqO4vJh6VOCGPwU4O0C2/Eqndh1Sc=
github.com/imdario/mergo v0.3.6 h1:xTNEAn+kxVO7dTZGu0CegyqKZmoWFI0rF8UxjlB2d28=
github.com/imdario/mergo v0.3.6/go.mod h1:2EnlNZ0deacrJVfApfmtdGgDfMuh/nq6Ok1EcJh5FfA=
github.com/intel/goresctrl v0.8.0 h1:N3shVbS3kA1Hk2AmcbHv8805Hjbv+zqsCIZCGktxx50=
github.com/intel/goresctrl v0.8.0/go.mod h1:T3ZZnuHSNouwELB5wvOoUJaB7l/4Rm23rJy/wuWJlr0=
github.com/intel/goresctrl v0.9.0 h1:IKI4ZrPTazLyFgdnWEkR9LS+DDATapOgoBtGxVMHePs=
github.com/intel/goresctrl v0.9.0/go.mod h1:1S8GDqL46GuKb525bxNhIEEkhf4rhVcbSf9DuKhp7mw=
github.com/josharian/intern v1.0.0 h1:vlS4z54oSdjm0bgjRigI+G1HpF+tI+9rE5LLzOg8HmY=
github.com/josharian/intern v1.0.0/go.mod h1:5DoeVV0s6jJacbCEi61lwdGj/aVlrQvzHFFd8Hwg//Y=
github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnrnM=
@ -853,6 +856,8 @@ github.com/kisielk/gotool v1.0.0/go.mod h1:XhKaO+MFFWcvkIS/tQcRk01m1F5IRFswLeQ+o
github.com/klauspost/asmfmt v1.3.2/go.mod h1:AG8TuvYojzulgDAMCnYn50l/5QV3Bs/tp6j0HLHbNSE=
github.com/klauspost/compress v1.15.9/go.mod h1:PhcZ0MbTNciWF3rruxRgKxI5NkcHHrHUDtV4Yw2GlzU=
github.com/klauspost/cpuid/v2 v2.0.9/go.mod h1:FInQzS24/EEf25PyTYn52gqo7WaD8xa0213Md/qVLRg=
github.com/knqyf263/go-plugin v0.8.1-0.20240827022226-114c6257e441 h1:Q/sZeuWkXprbKJSs7AwXryuZKSEL/a8ltC7e7xSspN0=
github.com/knqyf263/go-plugin v0.8.1-0.20240827022226-114c6257e441/go.mod h1:CvCrNDMiKFlAlLFLmcoEfsTROEfNKbEZAMMrwQnLXCM=
github.com/kr/fs v0.1.0/go.mod h1:FFnZGqtBN9Gxj7eW1uZ42v5BccTP0vu6NEaFoC2HwRg=
github.com/kr/pretty v0.1.0/go.mod h1:dAy3ld7l9f0ibDNOQOHHMYYIIbhfbHSm3C4ZsoJORNo=
github.com/kr/pretty v0.2.1/go.mod h1:ipq/a2n7PKx3OHsz4KJII5eveXtPO4qwEXGdVfWzfnI=
@ -880,10 +885,10 @@ github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9G
github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA=
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ=
github.com/onsi/ginkgo/v2 v2.19.0 h1:9Cnnf7UHo57Hy3k6/m5k3dRfGTMXGvxhHFvkDTCTpvA=
github.com/onsi/ginkgo/v2 v2.19.0/go.mod h1:rlwLi9PilAFJ8jCg9UE1QP6VBpd6/xj3SRC0d6TU0To=
github.com/onsi/gomega v1.33.1 h1:dsYjIxxSR755MDmKVsaFQTE22ChNBcuuTWgkUDSubOk=
github.com/onsi/gomega v1.33.1/go.mod h1:U4R44UsT+9eLIaYRB2a5qajjtQYn0hauxvRm16AVYg0=
github.com/onsi/ginkgo/v2 v2.21.0 h1:7rg/4f3rB88pb5obDgNZrNHrQ4e6WpjonchcpuBRnZM=
github.com/onsi/ginkgo/v2 v2.21.0/go.mod h1:7Du3c42kxCUegi0IImZ1wUQzMBVecgIHjR1C+NkhLQo=
github.com/onsi/gomega v1.35.1 h1:Cwbd75ZBPxFSuZ6T+rN/WCb/gOc6YgFBXLlZLhC7Ds4=
github.com/onsi/gomega v1.35.1/go.mod h1:PvZbdDc8J6XJEpDK4HCuRBm8a6Fzp9/DmhC9C7yFlog=
github.com/opencontainers/runtime-spec v1.1.0 h1:HHUyrt9mwHUjtasSbXSMvs4cyFxh+Bll4AjJ9odEGpg=
github.com/opencontainers/runtime-spec v1.1.0/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/C5hL+LL3ko9hs6T5lQ0=
github.com/pelletier/go-toml/v2 v2.1.0 h1:FnwAJ4oYMvbT/34k9zzHuZNrhlz48GB3/s6at6/MHO4=
@ -918,8 +923,8 @@ github.com/rogpeppe/fastuuid v1.2.0/go.mod h1:jVj6XXZzXRy/MSR5jhDC/2q6DgLz+nrA6L
github.com/rogpeppe/go-internal v1.3.0/go.mod h1:M8bDsm7K2OlrFYOpmOWEs/qY81heoFRclV5y23lUDJ4=
github.com/rogpeppe/go-internal v1.6.1/go.mod h1:xXDCJY+GAPziupqXw64V24skbSoqbTEfhy4qGm1nDQc=
github.com/rogpeppe/go-internal v1.9.0/go.mod h1:WtVeX8xhTBvf0smdhujwtBcq4Qrzq/fJaraNFVN+nFs=
github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8=
github.com/rogpeppe/go-internal v1.12.0/go.mod h1:E+RYuTGaKKdloAfM02xzb0FW3Paa99yedzYV+kq4uf4=
github.com/rogpeppe/go-internal v1.13.1 h1:KvO1DLK/DRN07sQ1LQKScxyZJuNnedQ5/wKSR38lUII=
github.com/rogpeppe/go-internal v1.13.1/go.mod h1:uMEvuHeurkdAXX61udpOXGD/AzZDWNMNyH2VO9fmH0o=
github.com/ruudk/golang-pdf417 v0.0.0-20181029194003-1af4ab5afa58/go.mod h1:6lfFZQK844Gfx8o5WFuvpxWRwnSoipWe/p622j1v06w=
github.com/ruudk/golang-pdf417 v0.0.0-20201230142125-a7e3863a1245/go.mod h1:pQAZKsJ8yyVxGRWYNEm9oFB8ieLgKFnamEyDmSA0BRk=
github.com/sirupsen/logrus v1.8.1/go.mod h1:yWOB1SBYBC5VeMP7gHvWumXLIWorT60ONWic61uBYv0=
@ -945,8 +950,10 @@ github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO
github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
github.com/stretchr/testify v1.8.3/go.mod h1:sz/lmYIOXD/1dqDmKjjqLyZ2RngseejIcXlSw2iwfAo=
github.com/stretchr/testify v1.8.4/go.mod h1:sz/lmYIOXD/1dqDmKjjqLyZ2RngseejIcXlSw2iwfAo=
github.com/stretchr/testify v1.9.0 h1:HtqpIVDClZ4nwg75+f6Lvsy/wHu+3BoSGCbBAcpTsTg=
github.com/stretchr/testify v1.9.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/stretchr/testify v1.10.0 h1:Xv5erBjTwe/5IxqUQTdXv5kgmIvbHo3QQyRwhJsOfJA=
github.com/stretchr/testify v1.10.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
github.com/tetratelabs/wazero v1.9.0 h1:IcZ56OuxrtaEz8UYNRHBrUa9bYeX9oVY93KspZZBf/I=
github.com/tetratelabs/wazero v1.9.0/go.mod h1:TSbcXCfFP0L2FGkRPxHphadXPjo1T6W+CseNNY7EkjM=
github.com/x448/float16 v0.8.4 h1:qLwI1I70+NjRFUR3zs1JPUCgaCXSh3SW62uAKT1mSBM=
github.com/x448/float16 v0.8.4/go.mod h1:14CWIYCyZA/cWjXOioeEpHeN/83MdbZDRQHoFcYsOfg=
github.com/yuin/goldmark v1.1.25/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
@ -1055,8 +1062,8 @@ golang.org/x/mod v0.6.0-dev.0.20220419223038-86c51ed26bb4/go.mod h1:jJ57K6gSWd91
golang.org/x/mod v0.7.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.8.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.9.0/go.mod h1:iBbtSCu2XBx23ZKBPSOrRkjjQPZFPuis4dIYUhu/chs=
golang.org/x/mod v0.17.0 h1:zY54UmvipHiNd+pm+m0x9KhZ9hl1/7QNMyxXbc6ICqA=
golang.org/x/mod v0.17.0/go.mod h1:hTbmBsO62+eylJbnUtE2MGJUyE7QWk4xUqPFrRgJ+7c=
golang.org/x/mod v0.21.0 h1:vvrHzRwRfVKSiLrG+d4FMl/Qi4ukBCE6kZlTUkDYRT0=
golang.org/x/mod v0.21.0/go.mod h1:6SkKJ3Xj0I0BrPOZoBy3bdMptDDU9oJrpohJ3eWZ1fY=
golang.org/x/net v0.0.0-20180724234803-3673e40ba225/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20180826012351-8a410e7b638d/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/net v0.0.0-20190108225652-1e06a53dbb7e/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
@ -1114,8 +1121,8 @@ golang.org/x/net v0.6.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/net v0.7.0/go.mod h1:2Tu9+aMcznHK/AK1HMvgo6xiTLG5rD5rZLDS+rp2Bjs=
golang.org/x/net v0.8.0/go.mod h1:QVkue5JL9kW//ek3r6jTKnTFis1tRmNAW2P1shuFdJc=
golang.org/x/net v0.9.0/go.mod h1:d48xBJpPfHeWQsugry2m+kC02ZBRGRgulfHnEXEuWns=
golang.org/x/net v0.36.0 h1:vWF2fRbw4qslQsQzgFqZff+BItCvGFQqKzKIzx1rmoA=
golang.org/x/net v0.36.0/go.mod h1:bFmbeoIPfrw4sMHNhb4J9f6+tPziuGjq7Jk/38fxi1I=
golang.org/x/net v0.38.0 h1:vRMAPTMaeGqVhG5QyLJHqNDwecKTomGeqbnfZyKlBI8=
golang.org/x/net v0.38.0/go.mod h1:ivrbrMbzFq5J41QOQh0siUuly180yBYtLp+CKbEaFx8=
golang.org/x/oauth2 v0.0.0-20180821212333-d2e6202438be/go.mod h1:N/0e6XlmueqKjAGxoOufVs8QHGRruUQn6yWY3a++T0U=
golang.org/x/oauth2 v0.0.0-20190226205417-e64efc72b421/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
golang.org/x/oauth2 v0.0.0-20190604053449-0f29369cfe45/go.mod h1:gOpvHmFTYa4IltrdGE7lF6nIHvwfUNPOp7c8zoXwtLw=
@ -1145,8 +1152,8 @@ golang.org/x/oauth2 v0.4.0/go.mod h1:RznEsdpjGAINPTOF0UH/t+xJ75L18YO3Ho6Pyn+uRec
golang.org/x/oauth2 v0.5.0/go.mod h1:9/XBHVqLaWO3/BRHs5jbpYCnOZVjj5V0ndyaAM7KB4I=
golang.org/x/oauth2 v0.6.0/go.mod h1:ycmewcwgD4Rpr3eZJLSB4Kyyljb3qDh40vJ8STE5HKw=
golang.org/x/oauth2 v0.7.0/go.mod h1:hPLQkd9LyjfXTiRohC/41GhcFqxisoUQ99sCUOHO9x4=
golang.org/x/oauth2 v0.21.0 h1:tsimM75w1tF/uws5rbeHzIWxEqElMehnc+iW793zsZs=
golang.org/x/oauth2 v0.21.0/go.mod h1:XYTD2NtWslqkgxebSiOHnXEap4TF09sJSc7H1sXbhtI=
golang.org/x/oauth2 v0.27.0 h1:da9Vo7/tDv5RH/7nZDz1eMGS/q1Vv1N/7FCrBhI9I3M=
golang.org/x/oauth2 v0.27.0/go.mod h1:onh5ek6nERTohokkhCD/y2cV4Do3fxFHFuAejCkRWT8=
golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20181108010431-42b317875d0f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20181221193216-37e7f081c4d4/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
@ -1163,8 +1170,8 @@ golang.org/x/sync v0.0.0-20220722155255-886fb9371eb4/go.mod h1:RxMgew5VJxzue5/jJ
golang.org/x/sync v0.0.0-20220819030929-7fc1605a5dde/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20220929204114-8fcdb60fdcc0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.1.0/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.11.0 h1:GGz8+XQP4FvTTrjZPzNKTMFtSXH80RAzG+5ghFPgK9w=
golang.org/x/sync v0.11.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
golang.org/x/sync v0.12.0 h1:MHc5BpPuC30uJk597Ri8TV3CNZcTLu6B6z4lJy+g6Jw=
golang.org/x/sync v0.12.0/go.mod h1:1dzgHSNfp02xaA81J2MS99Qcpr2w7fw1gpm99rleRqA=
golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20190312061237-fead79001313/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
@ -1248,8 +1255,8 @@ golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.7.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.12.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.30.0 h1:QjkSwP/36a20jFYWkSue1YwXzLmsV5Gfq7Eiy72C1uc=
golang.org/x/sys v0.30.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/sys v0.31.0 h1:ioabZlmFYtWhL+TRYpcnNlLwhyxaM9kWTDEmfnprqik=
golang.org/x/sys v0.31.0/go.mod h1:BJP2sWEmIv4KK5OTEluFJCKSidICx8ciO85XgH3Ak8k=
golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1/go.mod h1:bj7SfCRtBDWHUb9snDiAeCFNEtKQo2Wmx5Cou7ajbmo=
golang.org/x/term v0.0.0-20210927222741-03fcf44c2211/go.mod h1:jbD1KX2456YbFQfuXm/mYQcufACuNUgVhRMnK/tPxf8=
golang.org/x/term v0.2.0/go.mod h1:TVmDHMZPmdnySmBfhjOoOdhjzdE1h4u1VwSiw2l1Nuc=
@ -1258,8 +1265,8 @@ golang.org/x/term v0.4.0/go.mod h1:9P2UbLfCdcvo3p/nzKvsmas4TnlujnuoV9hGgYzW1lQ=
golang.org/x/term v0.5.0/go.mod h1:jMB1sMXY+tzblOD4FWmEbocvup2/aLOaQEp7JmGp78k=
golang.org/x/term v0.6.0/go.mod h1:m6U89DPEgQRMq3DNkDClhWw02AUbt2daBVO4cn4Hv9U=
golang.org/x/term v0.7.0/go.mod h1:P32HKFT3hSsZrRxla30E9HqToFYAQPCMs/zFMBUFqPY=
golang.org/x/term v0.29.0 h1:L6pJp37ocefwRRtYPKSWOWzOtWSxVajvz2ldH/xi3iU=
golang.org/x/term v0.29.0/go.mod h1:6bl4lRlvVuDgSf3179VpIxBF0o10JUpXWOnI7nErv7s=
golang.org/x/term v0.30.0 h1:PQ39fJZ+mfadBm0y5WlL4vlM7Sx1Hgf13sMIY2+QS9Y=
golang.org/x/term v0.30.0/go.mod h1:NYYFdzHoI5wRh/h5tDMdMqCqPJZEuNqVR5xJLd/n67g=
golang.org/x/text v0.0.0-20170915032832-14c0d48ead0c/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
golang.org/x/text v0.3.1-0.20180807135948-17ff2d5776d2/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
@ -1276,15 +1283,16 @@ golang.org/x/text v0.6.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/text v0.7.0/go.mod h1:mrYo+phRRbMaCq/xk9113O4dZlRixOauAjOtrjsXDZ8=
golang.org/x/text v0.8.0/go.mod h1:e1OnstbJyHTd6l/uOt8jFFHp6TRDWZR/bV3emEE/zU8=
golang.org/x/text v0.9.0/go.mod h1:e1OnstbJyHTd6l/uOt8jFFHp6TRDWZR/bV3emEE/zU8=
golang.org/x/text v0.22.0 h1:bofq7m3/HAFvbF51jz3Q9wLg3jkvSPuiZu/pD1XwgtM=
golang.org/x/text v0.22.0/go.mod h1:YRoo4H8PVmsu+E3Ou7cqLVH8oXWIHVoX0jqUWALQhfY=
golang.org/x/text v0.23.0 h1:D71I7dUrlY+VX0gQShAThNGHFxZ13dGLBHQLVl1mJlY=
golang.org/x/text v0.23.0/go.mod h1:/BLNzu4aZCJ1+kcD0DNRotWKage4q2rGVAg4o22unh4=
golang.org/x/time v0.0.0-20181108054448-85acf8d2951c/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.0.0-20190308202827-9d24e82272b4/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.0.0-20191024005414-555d28b269f0/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.0.0-20220922220347-f3bd1da661af/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.1.0/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.3.0 h1:rg5rLMjNzMS1RkNLzCG38eapWhnYLFYXDXj2gOlr8j4=
golang.org/x/time v0.3.0/go.mod h1:tRJNPiyCQ0inRvYxbN9jk5I+vvW/OXSQhTDSoE431IQ=
golang.org/x/time v0.9.0 h1:EsRrnYcQiGH+5FfbgvV4AP7qEZstoyrHB0DzarOQ4ZY=
golang.org/x/time v0.9.0/go.mod h1:3BpzKBy/shNhVucY/MWOyx10tF3SFh9QdLuxbVysPQM=
golang.org/x/tools v0.0.0-20180525024113-a5b4c53f6e8b/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
golang.org/x/tools v0.0.0-20190114222345-bf090417da8b/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
@ -1348,8 +1356,8 @@ golang.org/x/tools v0.1.12/go.mod h1:hNGJHUnrk76NpqgfD5Aqm5Crs+Hm0VOH/i9J2+nxYbc
golang.org/x/tools v0.3.0/go.mod h1:/rWhSS2+zyEVwoJf8YAX6L2f0ntZ7Kn/mGgAWcipA5k=
golang.org/x/tools v0.6.0/go.mod h1:Xwgl3UAJ/d3gWutnCtw505GrjyAbvKui8lOU390QaIU=
golang.org/x/tools v0.7.0/go.mod h1:4pg6aUX35JBAogB10C9AtvVL+qowtN4pT3CGSQex14s=
golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d h1:vU5i/LfpvrRCpgM/VPfJLg5KjxD3E+hfT1SH+d9zLwg=
golang.org/x/tools v0.21.1-0.20240508182429-e35e4ccd0d2d/go.mod h1:aiJjzUbINMkxbQROHiO6hDPo2LHcIPhhQsa9DLh0yGk=
golang.org/x/tools v0.26.0 h1:v/60pFQmzmT9ExmjDv2gGIfi3OqfKoEP6I5+umXlbnQ=
golang.org/x/tools v0.26.0/go.mod h1:TPVVj70c7JJ3WCazhD8OdXcZg/og+b9+tH/KxylGwH0=
golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
@ -1632,8 +1640,8 @@ google.golang.org/protobuf v1.28.0/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqw
google.golang.org/protobuf v1.28.1/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I=
google.golang.org/protobuf v1.29.1/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I=
google.golang.org/protobuf v1.30.0/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I=
google.golang.org/protobuf v1.34.2 h1:6xV6lTsCfpGD21XK49h7MhtcApnLqkfYgPcdHftf6hg=
google.golang.org/protobuf v1.34.2/go.mod h1:qYOHts0dSfpeUzUFpOMr/WGzszTmLH+DiWniOlNbLDw=
google.golang.org/protobuf v1.36.5 h1:tPhr+woSbjfYvY6/GPufUoYizxw1cF/yFoxJ2fmpwlM=
google.golang.org/protobuf v1.36.5/go.mod h1:9fA7Ob0pmnwhb644+1+CVWFRbNajQ6iRojtC/QF5bRE=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
@ -1645,7 +1653,6 @@ gopkg.in/inf.v0 v0.9.1 h1:73M5CoZyi3ZLMOyDlQh031Cx6N9NDJ2Vvfl76EDAgDc=
gopkg.in/inf.v0 v0.9.1/go.mod h1:cWUDdTG/fYaXco+Dcufb5Vnc6Gp2YChqWtbxRZE0mXw=
gopkg.in/yaml.v2 v2.2.2/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.3/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.2.8/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v2 v2.4.0 h1:D8xgwECY7CYvx+Y2n4sBz93Jn9JRvxdiyyo8CTfuKaY=
gopkg.in/yaml.v2 v2.4.0/go.mod h1:RDklbk79AGWmwhnvt/jBztapEOGDOx6ZbXqjP6csGnQ=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
@ -1661,24 +1668,22 @@ honnef.co/go/tools v0.0.1-2020.1.4/go.mod h1:X/FiERA/W4tHapMX5mGpAtMSVEeEUOyHaw9
honnef.co/go/tools v0.1.3/go.mod h1:NgwopIslSNH47DimFoV78dnkksY2EFtX0ajyb3K/las=
k8s.io/api v0.31.2 h1:3wLBbL5Uom/8Zy98GRPXpJ254nEFpl+hwndmk9RwmL0=
k8s.io/api v0.31.2/go.mod h1:bWmGvrGPssSK1ljmLzd3pwCQ9MgoTsRCuK35u6SygUk=
k8s.io/apimachinery v0.31.2 h1:i4vUt2hPK56W6mlT7Ry+AO8eEsyxMD1U44NR22CLTYw=
k8s.io/apimachinery v0.31.2/go.mod h1:rsPdaZJfTfLsNJSQzNHQvYoTmxhoOEofxtOsF3rtsMo=
k8s.io/apimachinery v0.33.1 h1:mzqXWV8tW9Rw4VeW9rEkqvnxj59k1ezDUl20tFK/oM4=
k8s.io/apimachinery v0.33.1/go.mod h1:BHW0YOu7n22fFv/JkYOEfkUYNRN0fj0BlvMFWA7b+SM=
k8s.io/client-go v0.31.2 h1:Y2F4dxU5d3AQj+ybwSMqQnpZH9F30//1ObxOKlTI9yc=
k8s.io/client-go v0.31.2/go.mod h1:NPa74jSVR/+eez2dFsEIHNa+3o09vtNaWwWwb1qSxSs=
k8s.io/code-generator v0.31.2 h1:xLWxG0HEpMSHfcM//3u3Ro2Hmc6AyyLINQS//Z2GEOI=
k8s.io/code-generator v0.31.2/go.mod h1:eEQHXgBU/m7LDaToDoiz3t97dUUVyOblQdwOr8rivqc=
k8s.io/cri-api v0.31.2 h1:O/weUnSHvM59nTio0unxIUFyRHMRKkYn96YDILSQKmo=
k8s.io/cri-api v0.31.2/go.mod h1:Po3TMAYH/+KrZabi7QiwQI4a692oZcUOUThd/rqwxrI=
k8s.io/gengo/v2 v2.0.0-20240228010128-51d4e06bde70 h1:NGrVE502P0s0/1hudf8zjgwki1X/TByhmAoILTarmzo=
k8s.io/gengo/v2 v2.0.0-20240228010128-51d4e06bde70/go.mod h1:VH3AT8AaQOqiGjMF9p0/IM1Dj+82ZwjfxUP1IxaHE+8=
k8s.io/code-generator v0.33.1 h1:ZLzIRdMsh3Myfnx9BaooX6iQry29UJjVfVG+BuS+UMw=
k8s.io/code-generator v0.33.1/go.mod h1:HUKT7Ubp6bOgIbbaPIs9lpd2Q02uqkMCMx9/GjDrWpY=
k8s.io/gengo/v2 v2.0.0-20250207200755-1244d31929d7 h1:2OX19X59HxDprNCVrWi6jb7LW1PoqTlYqEq5H2oetog=
k8s.io/gengo/v2 v2.0.0-20250207200755-1244d31929d7/go.mod h1:EJykeLsmFC60UQbYJezXkEsG2FLrt0GPNkU5iK5GWxU=
k8s.io/klog/v2 v2.130.1 h1:n9Xl7H1Xvksem4KFG4PYbdQCQxqc/tTUyrgXaOhHSzk=
k8s.io/klog/v2 v2.130.1/go.mod h1:3Jpz1GvMt720eyJH1ckRHK1EDfpxISzJ7I9OYgaDtPE=
k8s.io/kube-openapi v0.0.0-20240228011516-70dd3763d340 h1:BZqlfIlq5YbRMFko6/PM7FjZpUb45WallggurYhKGag=
k8s.io/kube-openapi v0.0.0-20240228011516-70dd3763d340/go.mod h1:yD4MZYeKMBwQKVht279WycxKyM84kkAx2DPrTXaeb98=
k8s.io/kube-openapi v0.0.0-20250318190949-c8a335a9a2ff h1:/usPimJzUKKu+m+TE36gUyGcf03XZEP0ZIKgKj35LS4=
k8s.io/kube-openapi v0.0.0-20250318190949-c8a335a9a2ff/go.mod h1:5jIi+8yX4RIb8wk3XwBo5Pq2ccx4FP10ohkbSKCZoK8=
k8s.io/kubelet v0.31.2 h1:6Hytyw4LqWqhgzoi7sPfpDGClu2UfxmPmaiXPC4FRgI=
k8s.io/kubelet v0.31.2/go.mod h1:0E4++3cMWi2cJxOwuaQP3eMBa7PSOvAFgkTPlVc/2FA=
k8s.io/utils v0.0.0-20240711033017-18e509b52bc8 h1:pUdcCO1Lk/tbT5ztQWOBi5HBgbBP1J8+AsQnQCKsi8A=
k8s.io/utils v0.0.0-20240711033017-18e509b52bc8/go.mod h1:OLgZIPagt7ERELqWJFomSt595RzquPNLL48iOWgYOg0=
k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738 h1:M3sRQVHv7vB20Xc2ybTt7ODCeFj6JSWYFzOFnYeS6Ro=
k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738/go.mod h1:OLgZIPagt7ERELqWJFomSt595RzquPNLL48iOWgYOg0=
lukechampine.com/uint128 v1.1.1/go.mod h1:c4eWIwlEGaxC/+H1VguhU4PHXNWDCDMUlWdIWl2j1gk=
lukechampine.com/uint128 v1.2.0/go.mod h1:c4eWIwlEGaxC/+H1VguhU4PHXNWDCDMUlWdIWl2j1gk=
modernc.org/cc/v3 v3.36.0/go.mod h1:NFUHyPn4ekoC/JHeZFfZurN6ixxawE1BnVonP/oahEI=
@ -1719,9 +1724,12 @@ rsc.io/quote/v3 v3.1.0/go.mod h1:yEA65RcK8LyAZtP9Kv3t0HmxON59tX3rD+tICJqUlj0=
rsc.io/sampler v1.3.0/go.mod h1:T1hPZKmBbMNahiBKFy5HrXp6adAjACjK9JXDnKaTXpA=
sigs.k8s.io/controller-runtime v0.16.2 h1:mwXAVuEk3EQf478PQwQ48zGOXvW27UJc8NHktQVuIPU=
sigs.k8s.io/controller-runtime v0.16.2/go.mod h1:vpMu3LpI5sYWtujJOa2uPK61nB5rbwlN7BAB8aSLvGU=
sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd h1:EDPBXCAspyGV4jQlpZSudPeMmr1bNJefnuqLsRAsHZo=
sigs.k8s.io/json v0.0.0-20221116044647-bc3834ca7abd/go.mod h1:B8JuhiUyNFVKdsE8h686QcCxMaH6HrOAZj4vswFpcB0=
sigs.k8s.io/structured-merge-diff/v4 v4.4.1 h1:150L+0vs/8DA78h1u02ooW1/fFq/Lwr+sGiqlzvrtq4=
sigs.k8s.io/structured-merge-diff/v4 v4.4.1/go.mod h1:N8hJocpFajUSSeSJ9bOZ77VzejKZaXsTtZo4/u7Io08=
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3 h1:/Rv+M11QRah1itp8VhT6HoVx1Ray9eB4DBr+K+/sCJ8=
sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3/go.mod h1:18nIHnGi6636UCz6m8i4DhaJ65T6EruyzmoQqI2BVDo=
sigs.k8s.io/randfill v0.0.0-20250304075658-069ef1bbf016/go.mod h1:XeLlZ/jmk4i1HRopwe7/aU3H5n1zNUcX6TM94b3QxOY=
sigs.k8s.io/randfill v1.0.0 h1:JfjMILfT8A6RbawdsK2JXGBR5AQVfd+9TbzrlneTyrU=
sigs.k8s.io/randfill v1.0.0/go.mod h1:XeLlZ/jmk4i1HRopwe7/aU3H5n1zNUcX6TM94b3QxOY=
sigs.k8s.io/structured-merge-diff/v4 v4.6.0 h1:IUA9nvMmnKWcj5jl84xn+T5MnlZKThmUW1TdblaLVAc=
sigs.k8s.io/structured-merge-diff/v4 v4.6.0/go.mod h1:dDy58f92j70zLsuZVuUX5Wp9vtxXpaZnkPGWeqDfCps=
sigs.k8s.io/yaml v1.4.0 h1:Mk1wCc2gy/F0THH0TAp1QYyJNzRm2KCLy3o5ASXVI5E=
sigs.k8s.io/yaml v1.4.0/go.mod h1:Ejl7/uTz7PSA4eKMyQCUTnhZYNmLIl+5c2lQPGR2BPY=

View File

@ -78,3 +78,12 @@ func (a *Agent) GoListPodResources(timeout time.Duration) <-chan *podresapi.PodR
return ch
}
// PurgePodResources removes any cached resources for the given pod.
func (a *Agent) PurgePodResources(ns, pod string) {
if !a.podResCli.HasClient() {
return
}
a.podResCli.PurgePodResources(ns, pod)
}

View File

@ -158,3 +158,10 @@ func (c *Client) Get(ctx context.Context, namespace, pod string) (*PodResources,
return l.GetPodResources(namespace, pod), nil
}
// PurgePodResources removes any cached resources for the given pod.
func (c *Client) PurgePodResources(namespace, pod string) {
if c.cached != nil {
c.cached.PurgePodResources(namespace, pod)
}
}

View File

@ -22,6 +22,10 @@ import (
api "k8s.io/kubelet/pkg/apis/podresources/v1"
)
const (
HintProvider = "podresourceapi:"
)
// PodResources contains resources for a pod.
type PodResources struct {
*api.PodResources
@ -114,6 +118,16 @@ func (l *PodResourcesList) GetContainer(ns, pod, ctr string) *ContainerResources
return l.GetPodResources(ns, pod).GetContainer(ctr)
}
func (l *PodResourcesList) PurgePodResources(ns, pod string) {
if l == nil {
return
}
if podMap, ok := l.m[ns]; ok {
delete(podMap, pod)
}
}
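The purge above relies on a common Go idiom for two-level caches: only the outer namespace lookup needs a guard, because deleting from a nil inner map is a no-op. A minimal, self-contained sketch (the `purge` helper and string-valued cache are hypothetical stand-ins, not the real `PodResourcesList` types):

```go
package main

import "fmt"

// purge removes a pod's entry from a two-level cache keyed by namespace,
// then pod name. delete() on a nil map is a no-op in Go, so only the
// outer lookup needs an existence check.
func purge(cache map[string]map[string]string, ns, pod string) {
	if podMap, ok := cache[ns]; ok {
		delete(podMap, pod)
	}
}

func main() {
	cache := map[string]map[string]string{
		"default": {"pod-a": "res-a", "pod-b": "res-b"},
	}
	purge(cache, "default", "pod-a")
	purge(cache, "missing", "pod-x") // safe: namespace not present
	fmt.Println(len(cache["default"])) // 1
}
```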
// GetDeviceTopologyHints returns topology hints for the given container. checkDenied
// is used to filter out hints that are disallowed.
func (r *ContainerResources) GetDeviceTopologyHints(checkDenied func(string) bool) topology.Hints {
@ -124,7 +138,7 @@ func (r *ContainerResources) GetDeviceTopologyHints(checkDenied func(string) boo
hints := make(topology.Hints)
for _, dev := range r.GetDevices() {
name := "podresourceapi:" + dev.GetResourceName()
name := HintProvider + dev.GetResourceName()
if checkDenied(name) {
log.Info("filtering hints for disallowed device %s", name)
@ -156,3 +170,7 @@ func (r *ContainerResources) GetDeviceTopologyHints(checkDenied func(string) boo
return hints
}
func IsPodResourceHint(provider string) bool {
return strings.HasPrefix(provider, HintProvider)
}

View File

@ -50,5 +50,14 @@ func (c *BalloonsPolicy) Validate() error {
if c == nil {
return nil
}
return c.Spec.Config.Validate()
if err := c.CommonConfig().Validate(); err != nil {
return err
}
if err := c.Spec.Config.Validate(); err != nil {
return err
}
return nil
}

View File

@ -39,3 +39,15 @@ type CommonConfig struct {
Log log.Config
Instrumentation instrumentation.Config
}
func (c *CommonConfig) Validate() error {
if c == nil {
return nil
}
if err := c.Control.RDT.Validate(); err != nil {
return err
}
return nil
}

View File

@ -14,6 +14,32 @@
package rdt
import (
"encoding/json"
"fmt"
"log/slog"
grcpath "github.com/intel/goresctrl/pkg/path"
"github.com/intel/goresctrl/pkg/rdt"
)
var (
// ErrConfigConversion is an error returned if we can't convert our configuration
// to a goresctrl native representation (goresctrl/pkg/rdt.Config).
ErrConfigConversion = fmt.Errorf("failed to convert to native goresctrl configuration")
)
var (
// Expose goresctrl/rdt functions for configuration via this package.
SetPrefix func(string) = grcpath.SetPrefix
Initialize func(string) error = rdt.Initialize
SetLogger func(*slog.Logger) = rdt.SetLogger
SetConfig func(*rdt.Config, bool) error = rdt.SetConfig
// And some that we need for other plumbing.
NewCollector = rdt.NewCollector
)
// Config provides runtime configuration for class based cache allocation
// and memory bandwidth control.
// +kubebuilder:object:generate=true
@ -27,4 +53,134 @@ type Config struct {
// class is used as its RDT class, if this is otherwise unset.
// +optional
UsePodQoSAsDefaultClass bool `json:"usePodQoSAsDefaultClass,omitempty"`
// Options contains common goresctrl/rdt settings.
Options Options `json:"options,omitempty"`
// Partitions configure cache partitions.
Partitions map[string]PartitionConfig `json:"partitions,omitempty"`
// Force indicates if the configuration should be forced to goresctrl.
Force bool `json:"force,omitempty"`
}
// +kubebuilder:object:generate=true
// PartitionConfig provides configuration for a single cache partition.
type PartitionConfig struct {
L2Allocation CatConfig `json:"l2Allocation,omitempty"`
L3Allocation CatConfig `json:"l3Allocation,omitempty"`
MBAllocation MbaConfig `json:"mbAllocation,omitempty"`
Classes map[string]ClassConfig `json:"classes,omitempty"`
}
// +kubebuilder:object:generate=true
// ClassConfig provides configuration for a single named cache CLOS/class.
type ClassConfig struct {
L2Allocation CatConfig `json:"l2Allocation,omitempty"`
L3Allocation CatConfig `json:"l3Allocation,omitempty"`
MBAllocation MbaConfig `json:"mbAllocation,omitempty"`
}
// CatConfig contains the L2 or L3 cache allocation configuration for one partition or class.
type CatConfig map[string]CacheIdCatConfig
// MbaConfig contains the memory bandwidth configuration for one partition or class.
type MbaConfig map[string]CacheIdMbaConfig
// +kubebuilder:object:generate=true
// CacheIdCatConfig is the cache allocation configuration for one cache id.
// Code and Data represent an optional configuration for separate code and data
// paths and only have effect when RDT CDP (Code and Data Prioritization) is
// enabled in the system. Code and Data go in tandem so that both or neither
// must be specified - only specifying the other is considered a configuration
// error.
//
// TODO(klihub): Ideally we'd have a validation rule ensuring that either
// unified or code+data are set here. I tried that using a CEL-expression
// but couldn't avoid hitting the complexity estimation limit (even with
// extra MaxProperties limits thrown in). Maybe we'll be able to do that
// eventually with https://github.com/kubernetes-sigs/controller-tools/pull/1212
type CacheIdCatConfig struct {
Unified CacheProportion `json:"unified,omitempty"`
Code CacheProportion `json:"code,omitempty"`
Data CacheProportion `json:"data,omitempty"`
}
// CacheIdMbaConfig is the memory bandwidth configuration for one cache id.
// It's an array of at most two values, specifying separate values to be used
// for percentage based and MBps based memory bandwidth allocation. For
// example, `{"80%", "1000MBps"}` would allocate 80% if percentage based
// allocation is used by the Linux kernel, or 1000 MBps in case MBps based
// allocation is in use.
type CacheIdMbaConfig []MbProportion
// MbProportion specifies a share of available memory bandwidth. It's an
// integer value followed by a unit. Two units are supported:
//
// - percentage, e.g. `80%`
// - MBps, e.g. `1000MBps`
type MbProportion string
// CacheProportion specifies a share of the available cache lines.
// Supported formats:
//
// - percentage, e.g. `50%`
// - percentage range, e.g. `50-60%`
// - bit numbers, e.g. `0-5`, `2,3`, must contain one contiguous block of bits set
// - hex bitmask, e.g. `0xff0`, must contain one contiguous block of bits set
type CacheProportion string
// +kubebuilder:object:generate=true
// Options contains common settings.
type Options struct {
L2 CatOptions `json:"l2,omitempty"`
L3 CatOptions `json:"l3,omitempty"`
MB MbOptions `json:"mb,omitempty"`
}
// +kubebuilder:object:generate=true
// CatOptions contains the common settings for cache allocation.
type CatOptions struct {
Optional bool `json:"optional"`
}
// +kubebuilder:object:generate=true
// MbOptions contains the common settings for memory bandwidth allocation.
type MbOptions struct {
Optional bool `json:"optional"`
}
type GoresctrlConfig struct {
// Options contain common settings.
Options Options `json:"options,omitempty"`
// Partitions configure cache partitions.
Partitions map[string]PartitionConfig `json:"partitions,omitempty"`
}
// ToGoresctrl returns the configuration in native goresctrl format.
func (c *Config) ToGoresctrl() (*rdt.Config, bool, error) {
if c == nil || !c.Enable {
return nil, false, nil
}
in := GoresctrlConfig{
Options: c.Options,
Partitions: c.Partitions,
}
data, err := json.Marshal(in)
if err != nil {
return nil, false, fmt.Errorf("%w: %w", ErrConfigConversion, err)
}
out := &rdt.Config{}
err = json.Unmarshal(data, &out)
if err != nil {
return nil, false, fmt.Errorf("%w: %w", ErrConfigConversion, err)
}
return out, c.Force, nil
}
// Validate validates the configuration.
func (c *Config) Validate() error {
_, _, err := c.ToGoresctrl()
return err
}
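`ToGoresctrl` converts between two structurally compatible configuration types by marshalling one to JSON and unmarshalling into the other, avoiding field-by-field copying code. A self-contained sketch of that round-trip technique, with hypothetical `Ours` and `Native` stand-in types rather than the real `rdt.Config`:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Ours and Native are two independent types that share JSON field tags,
// mirroring how the plugin's rdt.Config maps onto goresctrl's rdt.Config.
type Ours struct {
	Partitions map[string]string `json:"partitions,omitempty"`
}

type Native struct {
	Partitions map[string]string `json:"partitions,omitempty"`
}

// convert bridges the two types via a JSON round-trip, wrapping any
// failure so callers can detect a conversion error.
func convert(in Ours) (*Native, error) {
	data, err := json.Marshal(in)
	if err != nil {
		return nil, fmt.Errorf("failed to convert configuration: %w", err)
	}
	out := &Native{}
	if err := json.Unmarshal(data, out); err != nil {
		return nil, fmt.Errorf("failed to convert configuration: %w", err)
	}
	return out, nil
}

func main() {
	out, err := convert(Ours{Partitions: map[string]string{"p0": "50%"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out.Partitions["p0"]) // 50%
}
```

This is also why `Validate` can be implemented as a dry-run of the conversion: any field that fails the round-trip fails validation.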

View File

@ -20,9 +20,92 @@ package rdt
import ()
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *CacheIdCatConfig) DeepCopyInto(out *CacheIdCatConfig) {
*out = *in
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new CacheIdCatConfig.
func (in *CacheIdCatConfig) DeepCopy() *CacheIdCatConfig {
if in == nil {
return nil
}
out := new(CacheIdCatConfig)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *CatOptions) DeepCopyInto(out *CatOptions) {
*out = *in
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new CatOptions.
func (in *CatOptions) DeepCopy() *CatOptions {
if in == nil {
return nil
}
out := new(CatOptions)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *ClassConfig) DeepCopyInto(out *ClassConfig) {
*out = *in
if in.L2Allocation != nil {
in, out := &in.L2Allocation, &out.L2Allocation
*out = make(CatConfig, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
if in.L3Allocation != nil {
in, out := &in.L3Allocation, &out.L3Allocation
*out = make(CatConfig, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
if in.MBAllocation != nil {
in, out := &in.MBAllocation, &out.MBAllocation
*out = make(MbaConfig, len(*in))
for key, val := range *in {
var outVal []MbProportion
if val == nil {
(*out)[key] = nil
} else {
inVal := (*in)[key]
in, out := &inVal, &outVal
*out = make(CacheIdMbaConfig, len(*in))
copy(*out, *in)
}
(*out)[key] = outVal
}
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ClassConfig.
func (in *ClassConfig) DeepCopy() *ClassConfig {
if in == nil {
return nil
}
out := new(ClassConfig)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *Config) DeepCopyInto(out *Config) {
*out = *in
out.Options = in.Options
if in.Partitions != nil {
in, out := &in.Partitions, &out.Partitions
*out = make(map[string]PartitionConfig, len(*in))
for key, val := range *in {
(*out)[key] = *val.DeepCopy()
}
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Config.
@ -34,3 +117,88 @@ func (in *Config) DeepCopy() *Config {
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *MbOptions) DeepCopyInto(out *MbOptions) {
*out = *in
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new MbOptions.
func (in *MbOptions) DeepCopy() *MbOptions {
if in == nil {
return nil
}
out := new(MbOptions)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *Options) DeepCopyInto(out *Options) {
*out = *in
out.L2 = in.L2
out.L3 = in.L3
out.MB = in.MB
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new Options.
func (in *Options) DeepCopy() *Options {
if in == nil {
return nil
}
out := new(Options)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *PartitionConfig) DeepCopyInto(out *PartitionConfig) {
*out = *in
if in.L2Allocation != nil {
in, out := &in.L2Allocation, &out.L2Allocation
*out = make(CatConfig, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
if in.L3Allocation != nil {
in, out := &in.L3Allocation, &out.L3Allocation
*out = make(CatConfig, len(*in))
for key, val := range *in {
(*out)[key] = val
}
}
if in.MBAllocation != nil {
in, out := &in.MBAllocation, &out.MBAllocation
*out = make(MbaConfig, len(*in))
for key, val := range *in {
var outVal []MbProportion
if val == nil {
(*out)[key] = nil
} else {
inVal := (*in)[key]
in, out := &inVal, &outVal
*out = make(CacheIdMbaConfig, len(*in))
copy(*out, *in)
}
(*out)[key] = outVal
}
}
if in.Classes != nil {
in, out := &in.Classes, &out.Classes
*out = make(map[string]ClassConfig, len(*in))
for key, val := range *in {
(*out)[key] = *val.DeepCopy()
}
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new PartitionConfig.
func (in *PartitionConfig) DeepCopy() *PartitionConfig {
if in == nil {
return nil
}
out := new(PartitionConfig)
in.DeepCopyInto(out)
return out
}

View File

@ -24,7 +24,7 @@ import ()
func (in *Config) DeepCopyInto(out *Config) {
*out = *in
in.CPU.DeepCopyInto(&out.CPU)
out.RDT = in.RDT
in.RDT.DeepCopyInto(&out.RDT)
out.BlockIO = in.BlockIO
}

View File

@ -142,6 +142,13 @@ func (l CPUTopologyLevel) Value() int {
type BalloonDef struct {
// Name of the balloon definition.
Name string `json:"name"`
// Components is a list of component properties. Every
// component has a balloonType property according to which
// CPUs are allocated for that component. Specifying the
// Components list makes this a composite balloon type whose
// instances use all CPUs of their component instances, and no
// other CPUs.
Components []BalloonDefComponent `json:"components,omitempty"`
// Namespaces control which namespaces are assigned into
// balloon instances from this definition. This is used by
// namespace assign methods.
@ -269,6 +276,16 @@ type BalloonDef struct {
ShowContainersInNrt *bool `json:"showContainersInNrt,omitempty"`
}
// BalloonDefComponent contains a balloon component definition.
// +kubebuilder:object:generate=true
type BalloonDefComponent struct {
// BalloonType is the name of the balloon type of this
// component. It must match the name of a balloon type
// defined in the balloonTypes of the policy.
// +kubebuilder:validation:Required
DefName string `json:"balloonType"`
}
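A composite balloon is thus declared purely by listing component type names. A minimal sketch of what such a definition looks like in code, using illustrative local copies of the types (the `socket0-type`/`socket1-type` names are hypothetical and would have to match entries in the policy's `balloonTypes`):

```go
package main

import "fmt"

// Local stand-ins for the CRD types: a component names another balloon
// type via the balloonType field, and a composite balloon's instances
// span the CPUs of their component instances.
type BalloonDefComponent struct {
	DefName string `json:"balloonType"`
}

type BalloonDef struct {
	Name       string                `json:"name"`
	Components []BalloonDefComponent `json:"components,omitempty"`
}

func main() {
	composite := BalloonDef{
		Name: "two-socket",
		Components: []BalloonDefComponent{
			{DefName: "socket0-type"},
			{DefName: "socket1-type"},
		},
	}
	// A non-empty Components list is what marks the type as composite.
	fmt.Println(len(composite.Components) > 0) // true
}
```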
// LoadClass specifies how a load affects the system and load
// generating containers themselves.
type LoadClass struct {

View File

@ -26,6 +26,11 @@ import (
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *BalloonDef) DeepCopyInto(out *BalloonDef) {
*out = *in
if in.Components != nil {
in, out := &in.Components, &out.Components
*out = make([]BalloonDefComponent, len(*in))
copy(*out, *in)
}
if in.Namespaces != nil {
in, out := &in.Namespaces, &out.Namespaces
*out = make([]string, len(*in))
@ -95,6 +100,21 @@ func (in *BalloonDef) DeepCopy() *BalloonDef {
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *BalloonDefComponent) DeepCopyInto(out *BalloonDefComponent) {
*out = *in
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new BalloonDefComponent.
func (in *BalloonDefComponent) DeepCopy() *BalloonDefComponent {
if in == nil {
return nil
}
out := new(BalloonDefComponent)
in.DeepCopyInto(out)
return out
}
// DeepCopyInto is an autogenerated deepcopy function, copying the receiver, writing into out. in must be non-nil.
func (in *Config) DeepCopyInto(out *Config) {
*out = *in

View File

@ -14,6 +14,8 @@
package v1alpha1
import "github.com/intel/goresctrl/pkg/rdt"
var (
_ ResmgrConfig = &TemplatePolicy{}
)
@ -45,3 +47,7 @@ func (c *TemplatePolicy) PolicyConfig() interface{} {
}
return &c.Spec.Config
}
func (c *TemplatePolicy) RdtConfig() (*rdt.Config, bool, error) {
return nil, false, nil
}

View File

@ -45,3 +45,15 @@ func (c *TopologyAwarePolicy) PolicyConfig() interface{} {
}
return &c.Spec.Config
}
func (c *TopologyAwarePolicy) Validate() error {
if c == nil {
return nil
}
if err := c.CommonConfig().Validate(); err != nil {
return err
}
return nil
}

View File

@ -16,6 +16,7 @@ package log
import (
"fmt"
"log/slog"
"strings"
"sync"
@ -96,6 +97,9 @@ type Logger interface {
// Source returns the source name of this Logger.
Source() string
// SlogHandler returns an slog.Handler for this logger.
SlogHandler() slog.Handler
}
// logger implements Logger.

pkg/log/slog-logger.go Normal file
View File

@ -0,0 +1,80 @@
// Copyright The NRI Plugins Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package log
import (
"context"
"log/slog"
"strings"
)
type slogger struct {
l Logger
}
var _ slog.Handler = &slogger{}
// SetSlogLogger sets up the default logger for the slog package.
func SetSlogLogger(source string) {
var l Logger
if source == "" {
l = Default()
} else {
l = log.get(source)
}
slog.SetDefault(slog.New(l.SlogHandler()))
}
func (l logger) SlogHandler() slog.Handler {
return &slogger{l: l}
}
func (s *slogger) Enabled(_ context.Context, level slog.Level) bool {
switch level {
case slog.LevelDebug:
return log.level <= LevelDebug
case slog.LevelInfo:
return log.level <= LevelInfo
case slog.LevelWarn:
return log.level <= LevelWarn
case slog.LevelError:
return log.level <= LevelError
}
return level >= slog.LevelInfo
}
func (s *slogger) Handle(_ context.Context, r slog.Record) error {
switch r.Level {
case slog.LevelDebug:
s.l.Debug("%s", strings.TrimPrefix(r.Message, r.Level.String()+" "))
case slog.LevelInfo:
s.l.Info("%s", strings.TrimPrefix(r.Message, r.Level.String()+" "))
case slog.LevelWarn:
s.l.Warn("%s", strings.TrimPrefix(r.Message, r.Level.String()+" "))
case slog.LevelError:
s.l.Error("%s", strings.TrimPrefix(r.Message, r.Level.String()+" "))
}
return nil
}
func (s *slogger) WithAttrs(_ []slog.Attr) slog.Handler {
return s
}
func (s *slogger) WithGroup(_ string) slog.Handler {
return s
}

pkg/mempolicy/mempolicy.go Normal file
View File

@ -0,0 +1,131 @@
// Copyright The NRI Plugins Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// Package mempolicy provides low-level functions to set and get
// the default memory policy of a process using the Linux kernel's
// set_mempolicy and get_mempolicy syscalls.
package mempolicy
import (
"fmt"
"syscall"
"unsafe"
)
const (
MPOL_DEFAULT = iota
MPOL_PREFERRED
MPOL_BIND
MPOL_INTERLEAVE
MPOL_LOCAL
MPOL_PREFERRED_MANY
MPOL_WEIGHTED_INTERLEAVE
MPOL_F_STATIC_NODES uint = (1 << 15)
MPOL_F_RELATIVE_NODES uint = (1 << 14)
MPOL_F_NUMA_BALANCING uint = (1 << 13)
SYS_SET_MEMPOLICY = 238
SYS_GET_MEMPOLICY = 239
MAX_NUMA_NODES = 1024
)
var Modes = map[string]uint{
"MPOL_DEFAULT": MPOL_DEFAULT,
"MPOL_PREFERRED": MPOL_PREFERRED,
"MPOL_BIND": MPOL_BIND,
"MPOL_INTERLEAVE": MPOL_INTERLEAVE,
"MPOL_LOCAL": MPOL_LOCAL,
"MPOL_PREFERRED_MANY": MPOL_PREFERRED_MANY,
"MPOL_WEIGHTED_INTERLEAVE": MPOL_WEIGHTED_INTERLEAVE,
}
var Flags = map[string]uint{
"MPOL_F_STATIC_NODES": MPOL_F_STATIC_NODES,
"MPOL_F_RELATIVE_NODES": MPOL_F_RELATIVE_NODES,
"MPOL_F_NUMA_BALANCING": MPOL_F_NUMA_BALANCING,
}
var ModeNames map[uint]string
var FlagNames map[uint]string
func nodesToMask(nodes []int) ([]uint64, error) {
maxNode := 0
for _, node := range nodes {
if node > maxNode {
maxNode = node
}
if node < 0 {
return nil, fmt.Errorf("node %d out of range", node)
}
}
if maxNode >= MAX_NUMA_NODES {
return nil, fmt.Errorf("node %d out of range", maxNode)
}
mask := make([]uint64, (maxNode/64)+1)
for _, node := range nodes {
mask[node/64] |= (1 << (node % 64))
}
return mask, nil
}
func maskToNodes(mask []uint64) []int {
nodes := make([]int, 0)
for i := range MAX_NUMA_NODES {
if (mask[i/64] & (1 << (i % 64))) != 0 {
nodes = append(nodes, i)
}
}
return nodes
}
// SetMempolicy calls set_mempolicy syscall
func SetMempolicy(mpol uint, nodes []int) error {
nodeMask, err := nodesToMask(nodes)
if err != nil {
return err
}
nodeMaskPtr := unsafe.Pointer(&nodeMask[0])
_, _, errno := syscall.Syscall(SYS_SET_MEMPOLICY, uintptr(mpol), uintptr(nodeMaskPtr), uintptr(len(nodeMask)*64))
if errno != 0 {
return syscall.Errno(errno)
}
return nil
}
// GetMempolicy calls get_mempolicy syscall
func GetMempolicy() (uint, []int, error) {
var mpol uint
maxNode := uint64(MAX_NUMA_NODES)
nodeMask := make([]uint64, maxNode/64)
nodeMaskPtr := unsafe.Pointer(&nodeMask[0])
_, _, errno := syscall.Syscall(SYS_GET_MEMPOLICY, uintptr(unsafe.Pointer(&mpol)), uintptr(nodeMaskPtr), uintptr(maxNode))
if errno != 0 {
return 0, []int{}, syscall.Errno(errno)
}
return mpol, maskToNodes(nodeMask), nil
}
func init() {
ModeNames = make(map[uint]string)
for k, v := range Modes {
ModeNames[v] = k
}
FlagNames = make(map[uint]string)
for k, v := range Flags {
FlagNames[v] = k
}
}
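The syscalls above take the node set as a bitmask: node n becomes bit n%64 of word n/64 in a `[]uint64`, which is what `nodesToMask`/`maskToNodes` compute. A self-contained sketch of that encoding round-trip, skipping the actual syscall:

```go
package main

import "fmt"

// encode turns a list of NUMA node numbers into the []uint64 bitmask
// layout expected by set_mempolicy: node n sets bit n%64 of word n/64.
// Input validation is omitted in this sketch.
func encode(nodes []int) []uint64 {
	maxNode := 0
	for _, n := range nodes {
		if n > maxNode {
			maxNode = n
		}
	}
	mask := make([]uint64, maxNode/64+1)
	for _, n := range nodes {
		mask[n/64] |= 1 << (n % 64)
	}
	return mask
}

// decode recovers the node numbers from such a bitmask.
func decode(mask []uint64) []int {
	var nodes []int
	for i := 0; i < len(mask)*64; i++ {
		if mask[i/64]&(1<<(i%64)) != 0 {
			nodes = append(nodes, i)
		}
	}
	return nodes
}

func main() {
	mask := encode([]int{0, 3, 65})
	fmt.Println(len(mask))    // 2 (node 65 needs a second 64-bit word)
	fmt.Println(decode(mask)) // [0 3 65]
}
```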

pkg/resmgr/blockio.go Normal file
View File

@ -0,0 +1,55 @@
// Copyright The NRI Plugins Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package resmgr
import (
"github.com/containers/nri-plugins/pkg/apis/config/v1alpha1/resmgr/control/blockio"
)
//
// Notes:
// This is now just a placeholder, mostly to keep our RDT and block I/O
// control implementation aligned, since both of them are handled using
// goresctrl in the runtime currently. However, unlike for RDT, we can't
// easily split class configuration and class translation to (cgroup io
// control) parameters between two processes (NRI plugins and runtime),
// because class configuration is just an in-process mapping of names
// to parameters. We'll need more work to bring block I/O control up to
// the same level as RDT, for instance by adding block I/O cgroup v2
// support to goresctrl, doing class name to parameter conversion here,
// and using the v2 unified NRI field to pass those to the runtime (and
// check if this works properly with runc/crun).
type blkioControl struct {
resmgr *resmgr
hostRoot string
}
func newBlockioControl(resmgr *resmgr, hostRoot string) *blkioControl {
return &blkioControl{
resmgr: resmgr,
hostRoot: hostRoot,
}
}
func (c *blkioControl) configure(cfg *blockio.Config) error {
if cfg == nil {
return nil
}
c.resmgr.cache.ConfigureBlockIOControl(cfg.Enable)
return nil
}

View File

@ -335,6 +335,7 @@ func (p *nriPlugin) StopPodSandbox(ctx context.Context, podSandbox *api.PodSandb
pod, _ := m.cache.LookupPod(podSandbox.GetId())
released := slices.Clone(pod.GetContainers())
m.agent.PurgePodResources(pod.GetNamespace(), pod.GetName())
if err := p.runPostReleaseHooks(event, released...); err != nil {
nri.Error("%s: failed to run post-release hooks for pod %s: %v",
@@ -370,6 +371,7 @@ func (p *nriPlugin) RemovePodSandbox(ctx context.Context, podSandbox *api.PodSan
pod, _ := m.cache.LookupPod(podSandbox.GetId())
released := slices.Clone(pod.GetContainers())
m.agent.PurgePodResources(pod.GetNamespace(), pod.GetName())
if err := p.runPostReleaseHooks(event, released...); err != nil {
nri.Error("%s: failed to run post-release hooks for pod %s: %v",
@@ -687,6 +689,7 @@ func (p *nriPlugin) getPendingUpdates(skip *api.Container) []*api.ContainerUpdat
for _, ctrl := range c.GetPending() {
c.ClearPending(ctrl)
}
m.policy.ExportResourceData(c)
}
}

View File

@@ -179,6 +179,8 @@ const (
// least 2 CPUs must be allocated to the balloon. This results
// in excess 700 mCPUs available for bursting, for instance.
ExcessCPUsAttribute = "excess cpus"
// ComponentCPUsAttribute lists the CPUs of the components of a composite balloon.
ComponentCPUsAttribute = "component cpusets"
// ContainerAllocationZoneType is the zone type used when exporting containers as topology subzones.
ContainerAllocationZoneType = "allocation for container"
)
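The comment on ExcessCPUsAttribute above describes excess CPUs as the difference between what is allocated to a balloon and what its containers request. A small illustrative sketch of that arithmetic (the helper name is hypothetical, not the policy's actual implementation):

```go
package main

import "fmt"

// excessMilliCPUs computes how many milli-CPUs of a balloon's
// allocated CPUs are not requested by its containers, i.e. are
// available for bursting.
func excessMilliCPUs(allocatedCPUs int, requestedMilliCPUs int) int {
	return allocatedCPUs*1000 - requestedMilliCPUs
}

func main() {
	// Containers request 1300 mCPU in total, but at least 2 full CPUs
	// must be allocated to the balloon: 700 mCPU of excess.
	fmt.Println(excessMilliCPUs(2, 1300))
}
```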

pkg/resmgr/rdt.go (new file, 120 lines)
View File

@@ -0,0 +1,120 @@
// Copyright The NRI Plugins Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
package resmgr
import (
"fmt"
"log/slog"
"github.com/containers/nri-plugins/pkg/apis/config/v1alpha1/resmgr/control/rdt"
logger "github.com/containers/nri-plugins/pkg/log"
"github.com/containers/nri-plugins/pkg/metrics"
"github.com/prometheus/client_golang/prometheus"
)
var (
rdtlog = logger.Get("goresctrl")
)
type rdtControl struct {
hostRoot string
resmgr *resmgr
collector *rdtCollector
}
func newRdtControl(resmgr *resmgr, hostRoot string) *rdtControl {
slog.SetDefault(slog.New(rdtlog.SlogHandler()))
if hostRoot != "" {
rdt.SetPrefix(hostRoot)
}
collector, err := registerRdtCollector()
if err != nil {
log.Error("failed to register RDT metrics collector: %v", err)
}
return &rdtControl{
resmgr: resmgr,
hostRoot: hostRoot,
collector: collector,
}
}
func (c *rdtControl) configure(cfg *rdt.Config) error {
if cfg == nil {
return nil
}
if cfg.Enable {
nativeCfg, force, err := cfg.ToGoresctrl()
if err != nil {
return err
}
if err := rdt.Initialize(""); err != nil {
return fmt.Errorf("failed to initialize goresctrl/rdt: %w", err)
}
log.Info("goresctrl/rdt initialized")
if nativeCfg != nil {
if err := rdt.SetConfig(nativeCfg, force); err != nil {
return fmt.Errorf("failed to configure goresctrl/rdt: %w", err)
}
log.Info("goresctrl/rdt configuration updated")
}
}
c.resmgr.cache.ConfigureRDTControl(cfg.Enable)
c.collector.enable(cfg.Enable)
return nil
}
type rdtCollector struct {
prometheus.Collector
enabled bool
}
func registerRdtCollector() (*rdtCollector, error) {
options := []metrics.RegisterOption{
metrics.WithGroup("policy"),
metrics.WithCollectorOptions(
metrics.WithoutSubsystem(),
),
}
c := &rdtCollector{Collector: rdt.NewCollector()}
if err := metrics.Register("rdt", c, options...); err != nil {
return nil, err
}
return c, nil
}
func (c *rdtCollector) enable(enabled bool) {
c.enabled = enabled
}
func (c *rdtCollector) Describe(ch chan<- *prometheus.Desc) {
rdtlog.Debug("describing RDT metrics")
c.Collector.Describe(ch)
}
func (c *rdtCollector) Collect(ch chan<- prometheus.Metric) {
rdtlog.Debug("collecting RDT metrics")
c.Collector.Collect(ch)
}

View File

@@ -30,7 +30,6 @@ import (
"github.com/containers/nri-plugins/pkg/resmgr/policy"
"github.com/containers/nri-plugins/pkg/sysfs"
"github.com/containers/nri-plugins/pkg/topology"
goresctrlpath "github.com/intel/goresctrl/pkg/path"
"sigs.k8s.io/yaml"
cfgapi "github.com/containers/nri-plugins/pkg/apis/config/v1alpha1"
@@ -59,6 +58,8 @@ type resmgr struct {
events chan interface{} // channel for delivering events
stop chan interface{} // channel for signalling shutdown to goroutines
nri *nriPlugin // NRI plugins, if we're running as such
rdt *rdtControl // control for RDT allocation and monitoring
blkio *blkioControl // control for block I/O prioritization and throttling
running bool
}
@@ -77,7 +78,6 @@ func NewResourceManager(backend policy.Backend, agt *agent.Agent) (ResourceManag
if opt.HostRoot != "" {
sysfs.SetSysRoot(opt.HostRoot)
topology.SetSysRoot(opt.HostRoot)
goresctrlpath.SetPrefix(opt.HostRoot)
}
if opt.MetricsTimer != 0 {
@@ -109,6 +109,9 @@ func NewResourceManager(backend policy.Backend, agt *agent.Agent) (ResourceManag
return nil, err
}
m.rdt = newRdtControl(m, opt.HostRoot)
m.blkio = newBlockioControl(m, opt.HostRoot)
if err := m.setupControllers(); err != nil {
return nil, err
}
@@ -175,8 +178,15 @@ func (m *resmgr) start(cfg cfgapi.ResmgrConfig) error {
log.Warnf("failed to configure logger: %v", err)
}
m.cache.ConfigureRDTControl(mCfg.Control.RDT.Enable)
m.cache.ConfigureBlockIOControl(mCfg.Control.BlockIO.Enable)
// m.cache.ConfigureBlockIOControl(mCfg.Control.BlockIO.Enable)
if err := m.blkio.configure(&mCfg.Control.BlockIO); err != nil {
return err
}
if err := m.rdt.configure(&mCfg.Control.RDT); err != nil {
return err
}
if err := m.policy.Start(m.cfg.PolicyConfig()); err != nil {
return err
@@ -305,8 +315,15 @@ func (m *resmgr) reconfigure(cfg cfgapi.ResmgrConfig) error {
log.Warnf("failed to restart controllers: %v", err)
}
m.cache.ConfigureRDTControl(mCfg.Control.RDT.Enable)
m.cache.ConfigureBlockIOControl(mCfg.Control.BlockIO.Enable)
// m.cache.ConfigureBlockIOControl(mCfg.Control.BlockIO.Enable)
if err := m.blkio.configure(&mCfg.Control.BlockIO); err != nil {
return err
}
if err := m.rdt.configure(&mCfg.Control.RDT); err != nil {
return err
}
err := m.policy.Reconfigure(cfg.PolicyConfig())
if err != nil {

View File

@@ -125,6 +125,7 @@ type System interface {
OfflineCPUs() cpuset.CPUSet
CoreKindCPUs(CoreKind) cpuset.CPUSet
CoreKinds() []CoreKind
IDSetForCPUs(cpuset.CPUSet, func(CPU) idset.ID) idset.IDSet
AllThreadsForCPUs(cpuset.CPUSet) cpuset.CPUSet
SingleThreadForCPUs(cpuset.CPUSet) cpuset.CPUSet
AllCPUsSharingNthLevelCacheWithCPUs(int, cpuset.CPUSet) cpuset.CPUSet
@@ -240,7 +241,6 @@ type cpu struct {
node idset.ID // node id
core idset.ID // core id
threads idset.IDSet // sibling/hyper-threads
baseFreq uint64 // CPU base frequency
freq CPUFreq // CPU frequencies
epp EPP // Energy Performance Preference from cpufreq governor
online bool // whether this CPU is online
@@ -252,8 +252,9 @@ type cpu struct {
// CPUFreq is a CPU frequency scaling range
type CPUFreq struct {
min uint64 // minimum frequency (kHz)
max uint64 // maximum frequency (kHz)
Base uint64 // base frequency
Min uint64 // minimum frequency (kHz)
Max uint64 // maximum frequency (kHz)
}
// EPP represents the value of a CPU energy performance profile
@@ -485,8 +486,8 @@ func (sys *system) Discover(flags DiscoveryFlag) error {
sys.Debug(" node: %d", cpu.node)
sys.Debug(" core: %d (%s)", cpu.core, cpu.coreKind)
sys.Debug(" threads: %s", cpu.threads)
sys.Debug(" base freq: %d", cpu.baseFreq)
sys.Debug(" freq: %d - %d", cpu.freq.min, cpu.freq.max)
sys.Debug(" base freq: %d", cpu.freq.Base)
sys.Debug(" freq: %d - %d", cpu.freq.Min, cpu.freq.Max)
sys.Debug(" epp: %d", cpu.epp)
for idx, c := range cpu.caches {
@@ -785,6 +786,17 @@ func (sys *system) CoreKinds() []CoreKind {
return kinds
}
// IDSetForCPUs collects into a set the IDs returned by idForCPU for each of the given CPUs.
func (sys *system) IDSetForCPUs(cpus cpuset.CPUSet, idForCPU func(cpu CPU) idset.ID) idset.IDSet {
ids := idset.NewIDSet()
for _, id := range cpus.UnsortedList() {
if cpu, ok := sys.cpus[id]; ok {
ids.Add(idForCPU(cpu))
}
}
return ids
}
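IDSetForCPUs lets callers derive an ID set (package, node, core, and so on) from a cpuset via a selector function. A self-contained sketch of the same pattern, using plain Go maps and slices in place of the sysfs, cpuset and idset types (the CPU-to-package data below is made up for illustration):

```go
package main

import (
	"fmt"
	"sort"
)

// cpuInfo is a simplified stand-in for the system's per-CPU record.
type cpuInfo struct {
	pkg  int // package id
	node int // NUMA node id
}

// idSetForCPUs mimics IDSetForCPUs: it collects the IDs returned by
// idForCPU for each CPU present in the system map, deduplicated.
func idSetForCPUs(cpus []int, sys map[int]cpuInfo, idForCPU func(cpuInfo) int) []int {
	seen := map[int]struct{}{}
	for _, id := range cpus {
		if cpu, ok := sys[id]; ok {
			seen[idForCPU(cpu)] = struct{}{}
		}
	}
	ids := make([]int, 0, len(seen))
	for id := range seen {
		ids = append(ids, id)
	}
	sort.Ints(ids)
	return ids
}

func main() {
	sys := map[int]cpuInfo{
		0: {pkg: 0, node: 0}, 1: {pkg: 0, node: 0},
		2: {pkg: 0, node: 1}, 3: {pkg: 1, node: 2},
	}
	// Package IDs covered by CPUs 1-3:
	fmt.Println(idSetForCPUs([]int{1, 2, 3}, sys, func(c cpuInfo) int { return c.pkg }))
}
```

Swapping the selector (`c.node` instead of `c.pkg`) yields the NUMA node set for the same CPUs, which is what makes a single helper cover all ID kinds.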
func (sys *system) AllThreadsForCPUs(cpus cpuset.CPUSet) cpuset.CPUSet {
all := cpuset.New()
for _, id := range cpus.UnsortedList() {
@@ -1039,14 +1051,14 @@ func (sys *system) discoverCPU(path string) error {
}
}
if _, err := readSysfsEntry(path, "cpufreq/base_frequency", &cpu.baseFreq); err != nil {
cpu.baseFreq = 0
if _, err := readSysfsEntry(path, "cpufreq/base_frequency", &cpu.freq.Base); err != nil {
cpu.freq.Base = 0
}
if _, err := readSysfsEntry(path, "cpufreq/cpuinfo_min_freq", &cpu.freq.min); err != nil {
cpu.freq.min = 0
if _, err := readSysfsEntry(path, "cpufreq/cpuinfo_min_freq", &cpu.freq.Min); err != nil {
cpu.freq.Min = 0
}
if _, err := readSysfsEntry(path, "cpufreq/cpuinfo_max_freq", &cpu.freq.max); err != nil {
cpu.freq.max = 0
if _, err := readSysfsEntry(path, "cpufreq/cpuinfo_max_freq", &cpu.freq.Max); err != nil {
cpu.freq.Max = 0
}
if _, err := readSysfsEntry(path, "cpufreq/energy_performance_preference", &cpu.epp); err != nil {
cpu.epp = EPPUnknown
@@ -1131,7 +1143,7 @@ func (c *cpu) ThreadCPUSet() cpuset.CPUSet {
// BaseFrequency returns the base frequency setting for this CPU.
func (c *cpu) BaseFrequency() uint64 {
return c.baseFreq
return c.freq.Base
}
// FrequencyRange returns the frequency range for this CPU.
@@ -1162,23 +1174,23 @@ func (c *cpu) SstClos() int {
// SetFrequencyLimits sets the frequency scaling limits for this CPU.
func (c *cpu) SetFrequencyLimits(min, max uint64) error {
if c.freq.min == 0 {
if c.freq.Min == 0 {
return nil
}
min /= 1000
max /= 1000
if min < c.freq.min && min != 0 {
min = c.freq.min
if min < c.freq.Min && min != 0 {
min = c.freq.Min
}
if min > c.freq.max {
min = c.freq.max
if min > c.freq.Max {
min = c.freq.Max
}
if max < c.freq.min && max != 0 {
max = c.freq.min
if max < c.freq.Min && max != 0 {
max = c.freq.Min
}
if max > c.freq.max {
max = c.freq.max
if max > c.freq.Max {
max = c.freq.Max
}
if _, err := writeSysfsEntry(c.path, "cpufreq/scaling_min_freq", min, nil); err != nil {
@@ -1198,7 +1210,7 @@ func (c *cpu) CacheCount() int {
// GetCaches returns the caches for this CPU.
func (c *cpu) GetCaches() []*Cache {
caches := make([]*Cache, 0, len(c.caches))
caches := make([]*Cache, len(c.caches))
copy(caches, c.caches)
return caches
}
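The clamping in SetFrequencyLimits above can be sketched in isolation. This helper is hypothetical and only mirrors the min/max clamping logic against the hardware range (values already in kHz, with 0 meaning "leave unset"):

```go
package main

import "fmt"

// clampFreqRange clamps requested min/max scaling frequencies (kHz) to
// the hardware range [hwMin, hwMax], mirroring the clamping logic of
// SetFrequencyLimits. A requested value of 0 is passed through so the
// corresponding limit is left unset.
func clampFreqRange(min, max, hwMin, hwMax uint64) (uint64, uint64) {
	if min < hwMin && min != 0 {
		min = hwMin
	}
	if min > hwMax {
		min = hwMax
	}
	if max < hwMin && max != 0 {
		max = hwMin
	}
	if max > hwMax {
		max = hwMax
	}
	return min, max
}

func main() {
	// Hardware range 800000-3600000 kHz; request below and above it.
	min, max := clampFreqRange(400000, 5000000, 800000, 3600000)
	fmt.Println(min, max)
}
```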

View File

@@ -0,0 +1,17 @@
classes:
- name: interleave-all
policy:
mode: MPOL_INTERLEAVE
nodes: allowed-mems
- name: interleave-cpu-packages
policy:
mode: MPOL_INTERLEAVE
nodes: cpu-packages
- name: interleave-cpu-nodes
policy:
mode: MPOL_INTERLEAVE
nodes: cpu-nodes
- name: interleave-local-nodes
policy:
mode: MPOL_INTERLEAVE
nodes: max-dist:19
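The `max-dist:19` selector above picks memory nodes by NUMA distance. A hedged sketch of how such a selector could be resolved given a distance matrix (the matrix values and the exact threshold semantics are illustrative assumptions; see the plugin's documentation for the authoritative behavior):

```go
package main

import (
	"fmt"
	"sort"
)

// nodesWithinDist returns the nodes whose distance from any of the
// local nodes is at most maxDist, illustrating a "max-dist:N" style
// selector. dist[i][j] is the NUMA distance from node i to node j.
func nodesWithinDist(local []int, dist [][]int, maxDist int) []int {
	seen := map[int]struct{}{}
	for _, from := range local {
		for to, d := range dist[from] {
			if d <= maxDist {
				seen[to] = struct{}{}
			}
		}
	}
	nodes := make([]int, 0, len(seen))
	for n := range seen {
		nodes = append(nodes, n)
	}
	sort.Ints(nodes)
	return nodes
}

func main() {
	// Made-up distances for a 4-node topology where nodes 2 and 3 are
	// memory-only nodes close to CPU nodes 0 and 1 respectively.
	dist := [][]int{
		{10, 30, 15, 35},
		{30, 10, 35, 15},
		{15, 35, 10, 40},
		{35, 15, 40, 10},
	}
	fmt.Println(nodesWithinDist([]int{0}, dist, 19))
}
```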

View File

@@ -1,18 +0,0 @@
// Copyright The NRI Plugins Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// This package imports things required by build scripts, to force `go mod` to see them as dependencies
package tools
import _ "k8s.io/code-generator"

View File

@@ -0,0 +1,26 @@
config:
injectMpolset: $( ( [[ "$MPOLSET" == "1" ]] && echo true ) || echo false )
classes:
- name: prefer-one-cxl
policy:
mode: MPOL_PREFERRED
nodes: 4
flags:
- MPOL_F_RELATIVE_NODES
- name: prefer-all-cxls
policy:
mode: MPOL_PREFERRED_MANY
nodes: 4,5
- name: interleave-max-bandwidth
policy:
mode: MPOL_INTERLEAVE
nodes: max-dist:19
flags:
- MPOL_F_STATIC_NODES
- name: bind-hbm
policy:
mode: MPOL_BIND
nodes: 2,3

View File

@@ -0,0 +1,36 @@
helm-terminate
MPOLSET=1
helm_config=$(instantiate dram-hbm-cxl.helm-config.yaml) helm-launch memory-policy
verify-policy() {
local container=$1
local expected=$2
vm-command "for pid in \$(pgrep -f $container); do grep heap /proc/\$pid/numa_maps; done | head -n 1"
local observed="$COMMAND_OUTPUT"
if [[ "$observed" != *"$expected"* ]]; then
command-error "expected memory policy: $expected, got: $observed"
fi
echo "verify $container memory policy is $expected: ok"
}
ANN0="class.memory-policy.nri.io: prefer-one-cxl" \
ANN1="class.memory-policy.nri.io/container.pod0c1: prefer-all-cxls" \
ANN2="class.memory-policy.nri.io/container.pod0c2: interleave-max-bandwidth" \
ANN3="class.memory-policy.nri.io/container.pod0c3: bind-hbm" \
ANN4="policy.memory-policy.nri.io/container.pod0c4: |+
mode: MPOL_BIND
nodes: 4,5
flags:
- MPOL_F_STATIC_NODES" \
ANN5="policy.memory-policy.nri.io/container.pod0c5: \"\"" \
ANN6="class.memory-policy.nri.io/container.pod0c6: \"\"" \
CONTCOUNT=7 \
create besteffort
verify-policy pod0c0 'prefer=relative:4'
verify-policy pod0c1 'prefer (many):4-5'
verify-policy pod0c2 'interleave=static:0-3'
verify-policy pod0c3 'bind:2-3'
verify-policy pod0c4 'bind=static:4-5'
verify-policy pod0c5 'default' # unset pod-default with empty policy
verify-policy pod0c6 'default' # unset pod-default with empty class
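The verification above greps the heap mapping from /proc/&lt;pid&gt;/numa_maps and matches the policy field against the expected string. A small Go sketch of extracting that field from a single numa_maps line (the sample line follows the numa_maps(5) layout where the policy is the second field; the helper itself is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// heapPolicy returns the memory policy field of a numa_maps line that
// describes the heap, or "" if the line is not a heap mapping. In
// numa_maps(5) output the policy is the second whitespace-separated
// field, e.g. "7f1c2a600000 bind:2-3 heap anon=5 dirty=5 N2=5".
func heapPolicy(line string) string {
	fields := strings.Fields(line)
	if len(fields) < 3 || fields[2] != "heap" {
		return ""
	}
	return fields[1]
}

func main() {
	line := "7f1c2a600000 bind:2-3 heap anon=5 dirty=5 N2=5 kernelpagesize_kB=4"
	fmt.Println(heapPolicy(line))
}
```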

View File

@@ -0,0 +1,7 @@
[
{"mem": "2G", "threads":2, "cores": 2, "nodes": 1, "packages": 2},
{"mem": "1G", "node-dist": {"0": 15, "1": 30, "2": 10, "3": 35}},
{"mem": "1G", "node-dist": {"0": 30, "1": 15, "2": 35, "3": 10}},
{"mem": "4G", "node-dist": {"0": 60, "1": 70, "2": 62, "3": 72, "4": 10, "5": 75}},
{"mem": "4G", "node-dist": {"0": 70, "1": 60, "2": 72, "3": 62, "4": 75, "5": 10}}
]

View File

@@ -0,0 +1,57 @@
---
- hosts: all
become: no
become_user: root
vars:
cri_runtime: "{{ cri_runtime }}"
is_containerd: false
is_crio: false
plugin_name: "memory-policy"
image_ref: "unset"
test_image: "localhost/{{ plugin_name }}:testing"
tasks:
- set_fact:
is_containerd: true
when: cri_runtime == "containerd"
- set_fact:
is_crio: true
when: cri_runtime == "crio"
- name: copy helm charts
become: yes
copy:
src: "{{ nri_resource_policy_src }}/deployment/helm/{{ plugin_name }}"
dest: "./helm"
- name: get latest nri-memory-policy deployment image name
delegate_to: localhost
shell: "ls -1t {{ nri_resource_policy_src }}/build/images/nri-{{ plugin_name }}-image-*.tar"
register: nri_plugin_images
failed_when: nri_plugin_images.rc != 0
- name: parse image reference from image tarball
become: no
set_fact:
image_ref: "{{ nri_plugin_images.stdout_lines[0] | regex_replace('.*-image-([0-9a-z]*).tar', '\\1') }}"
delegate_to: localhost
- name: copy latest nri-memory-policy deployment image
copy: src="{{ nri_plugin_images.stdout_lines[0] }}" dest="."
- name: import nri plugin image when using containerd
become: yes
shell: "{{ item }}"
with_items:
- ctr -n k8s.io images import `basename {{ nri_plugin_images.stdout_lines[0] }}`
- ctr -n k8s.io images tag --force `ctr -n k8s.io images ls | grep sha256:{{ image_ref }} | tr -s '\t' ' ' | cut -d ' ' -f1` {{ test_image }}
when: is_containerd
- name: load nri plugin image when using cri-o
become: yes
shell: "{{ item }}"
with_items:
- sudo podman image load -i `basename {{ nri_plugin_images.stdout_lines[0] }}`
- sudo podman image tag {{ image_ref }} {{ test_image }}
when: is_crio

View File

@@ -200,8 +200,6 @@
with_items:
- "{{ containerd_src }}/bin/ctr"
- "{{ containerd_src }}/bin/containerd"
- "{{ containerd_src }}/bin/containerd-shim"
- "{{ containerd_src }}/bin/containerd-shim-runc-v1"
- "{{ containerd_src }}/bin/containerd-shim-runc-v2"
when: is_containerd and containerd_src != ""
@@ -310,10 +308,10 @@
- name: Update CNI plugin directory on Fedora
when: ansible_facts['distribution'] == "Fedora"
ansible.builtin.lineinfile:
ansible.builtin.replace:
path: /etc/containerd/config.toml
regexp: ' *bin_dir *= *./opt/cni/bin. *'
line: 'bin_dir = "/usr/libexec/cni"'
regexp: '/opt/cni/bin'
replace: '/usr/libexec/cni'
- name: Configure bridge CNI plugin
when: cni_plugin == "bridge"

View File

@@ -0,0 +1,54 @@
config:
agent:
nodeResourceTopology: true
allocatorTopologyBalancing: false
reservedResources:
cpu: 750m
pinCPU: true
pinMemory: true
balloonTypes:
- name: node0
preferCloseToDevices:
- /sys/devices/system/node/node0
- name: node1
preferCloseToDevices:
- /sys/devices/system/node/node1
- name: node2
preferCloseToDevices:
- /sys/devices/system/node/node2
- name: node3
preferCloseToDevices:
- /sys/devices/system/node/node3
- name: balance-all-nodes
components:
- balloonType: balance-pkg0-nodes
- balloonType: balance-pkg1-nodes
minCPUs: 4
minBalloons: 1
showContainersInNrt: true
- name: balance-pkg0-nodes
components:
- balloonType: node0
- balloonType: node1
- name: balance-pkg1-nodes
components:
- balloonType: node2
- balloonType: node3
preferNewBalloons: true
- name: kube-system-cpu-core
preferCloseToDevices:
- /sys/devices/system/cpu/cpu6/cache/index0
- name: reserved
components:
- balloonType: kube-system-cpu-core
log:
debug:
- policy

View File

@@ -0,0 +1,100 @@
# Test balloons that are composed of other balloons.
helm-terminate
helm_config=$TEST_DIR/balloons-composite.cfg helm-launch balloons
cleanup() {
vm-command "kubectl delete -n kube-system pod pod2 --now; kubectl delete pods --all --now"
}
verify-nrt() {
jqquery="$1"
expected="$2"
vm-command "kubectl get -n kube-system noderesourcetopologies.topology.node.k8s.io -o json | jq -r '$jqquery'"
if [[ -n "$expected" ]]; then
if [[ "$expected" != "$COMMAND_OUTPUT" ]]; then
command-error "invalid output, expected: '$expected'"
fi
fi
}
cleanup
CPUREQ="500m" MEMREQ="100M" CPULIM="500m" MEMLIM=""
POD_ANNOTATION="balloon.balloons.resource-policy.nri.io: balance-all-nodes" CONTCOUNT=2 create balloons-busybox
report allowed
verify 'len(cpus["pod0c0"]) == 4' \
'len(cpus["pod0c1"]) == 4' \
'nodes["pod0c0"] == nodes["pod0c1"] == {"node0", "node1", "node2", "node3"}'
verify-nrt '.items[0].zones[] | select (.name == "balance-all-nodes[0]")' # no check, print for debugging
verify-nrt '.items[0].zones[] | select (.name == "balance-all-nodes[0]") .attributes[] | select (.name == "excess cpus") .value' 3000m
# Balance a large workload on all NUMA nodes
CPUREQ="9" MEMREQ="100M" CPULIM="" MEMLIM=""
POD_ANNOTATION="balloon.balloons.resource-policy.nri.io: balance-all-nodes" CONTCOUNT=1 create balloons-busybox
report allowed
verify 'len(cpus["pod1c0"]) == 12' \
'cpus["pod0c0"] == cpus["pod0c1"] == cpus["pod1c0"]' \
'len(set.intersection(cpus["pod1c0"], {"cpu00", "cpu01", "cpu02", "cpu03"})) == 3' \
'len(set.intersection(cpus["pod1c0"], {"cpu04", "cpu05", "cpu06", "cpu07"})) == 3' \
'len(set.intersection(cpus["pod1c0"], {"cpu08", "cpu09", "cpu10", "cpu11"})) == 3' \
'len(set.intersection(cpus["pod1c0"], {"cpu12", "cpu13", "cpu14", "cpu15"})) == 3' \
'len(set.intersection(cpus["pod1c0"], {"cpu06", "cpu07"})) == 1' # cpu06 or cpu07 is reserved
verify-nrt '.items[0].zones[] | select (.name == "balance-all-nodes[0]")' # no check, print for debugging
CPUREQ="100m" MEMREQ="" CPULIM="100m" MEMLIM=""
namespace=kube-system create balloons-busybox
report allowed
verify 'cpus["pod2c0"].issubset({"cpu06", "cpu07"})' # allow either or both hyperthreads sharing the cpu6 index0 cache
# Remove the large pod. The size of balance-all-nodes[0] should drop from 12 to 4 CPUs.
# Verify the balance is still there.
vm-command "kubectl delete pod pod1 --now"
report allowed
verify 'len(cpus["pod0c0"]) == 4' \
'cpus["pod0c0"] == cpus["pod0c1"]' \
'len(set.intersection(cpus["pod0c0"], {"cpu00", "cpu01", "cpu02", "cpu03"})) == 1' \
'len(set.intersection(cpus["pod0c0"], {"cpu04", "cpu05", "cpu06", "cpu07"})) == 1' \
'len(set.intersection(cpus["pod0c0"], {"cpu08", "cpu09", "cpu10", "cpu11"})) == 1' \
'len(set.intersection(cpus["pod0c0"], {"cpu12", "cpu13", "cpu14", "cpu15"})) == 1'
# Delete all pods. balance-all-nodes[0] should stay, because of minBalloons: 1.
vm-command "kubectl delete pods --all --now; kubectl delete pod pod2 -n kube-system"
# Create two pods in separate balance-pkg1-nodes balloons, consuming all 3+3 free CPUs in node2+node3.
CPUREQ="1" MEMREQ="100M" CPULIM="" MEMLIM=""
POD_ANNOTATION="balloon.balloons.resource-policy.nri.io: balance-pkg1-nodes" CONTCOUNT=1 create balloons-busybox
report allowed
verify 'len(cpus["pod3c0"]) == 2' \
'len(set.intersection(cpus["pod3c0"], {"cpu08", "cpu09", "cpu10", "cpu11"})) == 1' \
'len(set.intersection(cpus["pod3c0"], {"cpu12", "cpu13", "cpu14", "cpu15"})) == 1'
verify-nrt '.items[0].zones[] | select (.name == "balance-pkg1-nodes[0]")' # no check, print for debugging
CPUREQ="4" MEMREQ="100M" CPULIM="" MEMLIM=""
POD_ANNOTATION="balloon.balloons.resource-policy.nri.io: balance-pkg1-nodes" CONTCOUNT=1 create balloons-busybox
report allowed
verify 'len(cpus["pod4c0"]) == 4' \
'len(set.intersection(cpus["pod4c0"], {"cpu08", "cpu09", "cpu10", "cpu11"})) == 2' \
'len(set.intersection(cpus["pod4c0"], {"cpu12", "cpu13", "cpu14", "cpu15"})) == 2' \
'disjoint_sets(cpus["pod4c0"], cpus["pod3c0"])'
verify-nrt '.items[0].zones[] | select (.name == "balance-pkg1-nodes[1]")' # no check, print for debugging
# Remove pods. Now composite balloons balance-pkg1-nodes[0] and
# balance-pkg1-nodes[1] should be deleted completely (in contrast to
# previously only downsizing balance-all-nodes), so
# balance-all-nodes[0] should be able to grow again.
vm-command "kubectl delete pods --all --now"
# Inflate balance-all-nodes[0] to the max.
CPUREQ="12" MEMREQ="100M" CPULIM="" MEMLIM=""
POD_ANNOTATION="balloon.balloons.resource-policy.nri.io: balance-all-nodes" CONTCOUNT=1 create balloons-busybox
report allowed
verify 'len(cpus["pod5c0"]) == 12' \
'len(set.intersection(cpus["pod5c0"], {"cpu00", "cpu01", "cpu02", "cpu03"})) == 3' \
'len(set.intersection(cpus["pod5c0"], {"cpu04", "cpu05", "cpu06", "cpu07"})) == 3' \
'len(set.intersection(cpus["pod5c0"], {"cpu08", "cpu09", "cpu10", "cpu11"})) == 3' \
'len(set.intersection(cpus["pod5c0"], {"cpu12", "cpu13", "cpu14", "cpu15"})) == 3'
cleanup

View File

@@ -396,6 +396,10 @@ helm-launch() { # script API
ds_name=nri-resource-policy-balloons
[ -z "$cfgresource" ] && cfgresource=balloonspolicies/default
;;
*memory-policy*)
ds_name=nri-memory-policy
ctr_name=nri-memory-policy
;;
*)
error "Can't wait for plugin $plugin to start, daemonset_name not set"
return 0
@@ -415,13 +419,15 @@ helm-launch() { # script API
fi
fi
timeout=$(timeout-for-deadline $deadline)
timeout=$timeout wait-config-node-status $cfgresource
if [ -n "$cfgresource" ]; then
timeout=$(timeout-for-deadline $deadline)
timeout=$timeout wait-config-node-status $cfgresource
result=$(get-config-node-status-result $cfgresource)
if [ "$result" != "Success" ]; then
reason=$(get-config-node-status-error $cfgresource)
error "Plugin $plugin configuration failed: $reason"
result=$(get-config-node-status-result $cfgresource)
if [ "$result" != "Success" ]; then
reason=$(get-config-node-status-error $cfgresource)
error "Plugin $plugin configuration failed: $reason"
fi
fi
vm-start-log-collection -n kube-system ds/$ds_name -c $ctr_name