Currently the system-agent-config is only generated as part of the cloudInitConfigs
when installing, but when registering a pre-installed host it is useful to have
the option to generate the system-agent-config without any OS install.
After discussion, it was suggested to enable this via a --no-toolkit flag
that can optionally be specified for pre-installed hosts. In this mode we
only write out the system-agent config files.
- Added full registration config and statefile path parameters on elemental-register
- Remove support for multiple configuration files
- Added (hardcoded) timer to skip registration updates for 24 hours
- Store emulated TPM seed for future registration updates
- Exit with error code in case of failures (systemd will manage restarts)
- Use virtual filesystem where possible
This commit stops using the ServiceAccount.Secrets list: as noted by
k8s, it should not be used to find a SA's associated secrets, and it
is no longer automatically managed by k8s since v1.24.
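A minimal sketch of one alternative lookup, assuming token Secrets are matched
by their type and by the `kubernetes.io/service-account.name` annotation. The
`secret` struct and `secretsForServiceAccount` are hypothetical stand-ins for
the Kubernetes API types, not necessarily what this commit implements:

```go
package main

import "fmt"

// secret is a hypothetical, trimmed-down stand-in for corev1.Secret.
type secret struct {
	Name        string
	Type        string
	Annotations map[string]string
}

const (
	saTokenType      = "kubernetes.io/service-account-token"
	saNameAnnotation = "kubernetes.io/service-account.name"
)

// secretsForServiceAccount returns the names of the token Secrets that
// reference the given ServiceAccount through their annotation, without
// reading the ServiceAccount.Secrets list at all.
func secretsForServiceAccount(saName string, secrets []secret) []string {
	var found []string
	for _, s := range secrets {
		if s.Type == saTokenType && s.Annotations[saNameAnnotation] == saName {
			found = append(found, s.Name)
		}
	}
	return found
}

func main() {
	secrets := []secret{
		{Name: "reg-token", Type: saTokenType, Annotations: map[string]string{saNameAnnotation: "reg-sa"}},
		{Name: "unrelated", Type: "Opaque"},
	}
	fmt.Println(secretsForServiceAccount("reg-sa", secrets))
}
```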
Signed-off-by: David Cassany <dcassany@suse.com>
Reduce cyclomatic complexity of the `Register` function
----
Error: cyclomatic complexity 21 of func `Register` is high (> 20)
(gocyclo)
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
Add MachineRegistration Elemental Registration Auth to allow
selection of the authentication method.
Add MachineInventory MachineHash key for authentication types different than
TPM (which has its own TPMHash key).
make generate
make build-manifests
...and add a new helper in the util package to verify if a resource is
owned by an object (identified by its UID).
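A rough sketch of what such an ownership check could look like. The
`ownerReference` type and the `isOwnedBy` name are hypothetical stand-ins for
metav1.OwnerReference and the real helper, whose signature is not shown here:

```go
package main

import "fmt"

// ownerReference is a hypothetical, trimmed-down stand-in for
// metav1.OwnerReference.
type ownerReference struct {
	Kind string
	Name string
	UID  string
}

// isOwnedBy reports whether any of the owner references carries the UID
// of the candidate owner object.
func isOwnedBy(refs []ownerReference, ownerUID string) bool {
	for _, ref := range refs {
		if ref.UID == ownerUID {
			return true
		}
	}
	return false
}

func main() {
	refs := []ownerReference{{Kind: "MachineRegistration", Name: "reg", UID: "uid-1"}}
	fmt.Println(isOwnedBy(refs, "uid-1"), isOwnedBy(refs, "uid-2"))
}
```

Matching on the UID rather than on the name avoids treating a deleted and
recreated object with the same name as the owner.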
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* Add client registration config utility
* Use a config-map for the seed-image pod
* Allow ConfigMaps manipulation in SeedImage RBAC
* Drop configmap-uid annotation
* go mod tidy
* Adapt tests
* Add createConfigMapObject tests
Signed-off-by: David Cassany <dcassany@suse.com>
* Add cloud-init support to seedImage
This commit adds a field to the SeedImage Spec for a cloud-config that
will be included in the built ISO.
If the cloud-config field is not set an empty file will be added to the
ISOs iso-config dir.
The reconciliation will take place in case the cloud-config is changed
and the base64 encoded value is used in an annotation in order to see if
the value has changed.
Signed-off-by: Fredrik Lönnegren <fredrik.lonnegren@suse.com>
* Linting
Signed-off-by: Fredrik Lönnegren <fredrik.lonnegren@suse.com>
* Add seedImage unit-tests
Signed-off-by: Fredrik Lönnegren <fredrik.lonnegren@suse.com>
---------
Signed-off-by: Fredrik Lönnegren <fredrik.lonnegren@suse.com>
We used to return the registration yaml in that case: makes no sense.
Just return the error.
Keep returning the registration yaml when using websocket with
no auth, even though we expect a plain HTTP GET to retrieve the
registration yaml.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
We now have the SeedImage resource to start and track image building
tasks: drop the old build-image api.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
This commit adds a new exchange between the registering client and the
operator: the registering client will pass some data that will be put in
the MachineInventory annotations.
This is meant to be a way to track those dynamic data from the host that
could be handy to have in the MachineInventory.
The only data passed in the current commit is the host address used to
register.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: isolate hostinfo data
Since we already wrap the "ghw" library to collect system data in the
hostinfo package, let's move all the logic dealing with conversion from
raw data to labels there for better isolation.
* operator: add a few more fields in System Data collection
In particular, the NICs' MAC addresses
---------
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: build-image API POD lifecycle management
Only one POD at a time allowed (for now).
* elemental-operator: improve build-image logging
* operator: add build-image Service
* operator: build-image API: delete Pod and Service on failure
* operator: add functions to manage registration cache
* operator: finalize build-image API Pod lifecycle
* operator: fix build-image API tests
* operator: ensure clean-up of build-image pods
* operator: add Services creation/deletion to Elemental ClusterRole
* operator: build-image: set download URL when job is completed
* operator: build-image: retry build job Pod creation if needed
* operator: build-image: in case of job Failure leave the Pod there
* operator: build-image: increase the time for job completion
* operator: make Code scanning happy
* operator: build-image: use NodePort Service
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
Remove default MachineInventory labels taken from system (memory, cpu,
gpu, network and block devices).
Make these values available as templates on MachineRegistration instead
under '${System Data/...}', for example '${System Data/Memory/Total Physical
Bytes}'
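A small sketch of how such a template could be resolved against nested
system data. The `lookup` and `render` helpers are hypothetical, and leaving
unresolved templates untouched is an assumption of this sketch, not
necessarily the operator behavior:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// templateRe matches ${some/path} placeholders.
var templateRe = regexp.MustCompile(`\$\{([^}]+)\}`)

// lookup walks a nested map following the "/"-separated path and returns
// the string found at the end of it.
func lookup(data map[string]interface{}, path string) (string, bool) {
	var current interface{} = data
	for _, key := range strings.Split(path, "/") {
		m, ok := current.(map[string]interface{})
		if !ok {
			return "", false
		}
		current, ok = m[key]
		if !ok {
			return "", false
		}
	}
	s, ok := current.(string)
	return s, ok
}

// render replaces every ${path} with the value found in data; templates
// that cannot be resolved are left as they are.
func render(tmpl string, data map[string]interface{}) string {
	return templateRe.ReplaceAllStringFunc(tmpl, func(match string) string {
		path := match[2 : len(match)-1] // strip "${" and "}"
		if value, ok := lookup(data, path); ok {
			return value
		}
		return match
	})
}

func main() {
	data := map[string]interface{}{
		"System Data": map[string]interface{}{
			"Memory": map[string]interface{}{"Total Physical Bytes": "8589934592"},
		},
	}
	fmt.Println(render("mem-${System Data/Memory/Total Physical Bytes}", data))
}
```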
Signed-off-by: Fredrik Lönnegren <fredrik.lonnegren@suse.com>
* register client: aggregate args to the Register function
Since we basically pass almost all the Registration struct parameters
one by one, let's pass a reference to the structure directly.
* register client: introduce authClient interface
The register client code is tightly coupled with TPM attestation.
While this is not a problem right now as we just support authentication
through TPM, it may be good to better separate TPM attestation from the
registration process itself for two reasons:
- better code readability
- support of alternative authentication methods
Note that on the operator side (register "server") the code is already
structured to allow alternative authentication methods.
This commit introduces an interface with the required authentication
methods: the TPM related code is now completely isolated in the interface
implementation.
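To illustrate the decoupling, here is a sketch of the pattern. The method set
of `authClient`, the `tpmAuth` type and the `register` function are all
hypothetical: the real interface methods are not listed in this message, and
no actual TPM attestation happens here:

```go
package main

import (
	"errors"
	"fmt"
)

// authClient is a hypothetical shape for the authentication interface.
type authClient interface {
	// Name identifies the authentication method.
	Name() string
	// Token returns the credential presented to the operator.
	Token() (string, error)
}

// tpmAuth is a placeholder implementation: it only fakes a credential.
type tpmAuth struct{ seed int64 }

func (t tpmAuth) Name() string { return "tpm" }

func (t tpmAuth) Token() (string, error) {
	return fmt.Sprintf("fake-attestation-%d", t.seed), nil
}

// register depends only on the interface, so another authentication
// method can be plugged in without touching the registration flow.
func register(url string, auth authClient) (string, error) {
	if url == "" {
		return "", errors.New("empty registration URL")
	}
	token, err := auth.Token()
	if err != nil {
		return "", fmt.Errorf("%s authentication failed: %w", auth.Name(), err)
	}
	return fmt.Sprintf("%s %s %s", url, auth.Name(), token), nil
}

func main() {
	out, err := register("https://example.invalid/elemental/registration", tpmAuth{seed: 1})
	fmt.Println(out, err)
}
```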
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
By default we collect block device system data as labels. The label name
for the number of block devices found was missing the
elemental.cattle.io prefix: fix it.
Moreover, make the label keys for the number of Network Interfaces and
Block Devices consistent:
elemental.cattle.io/NetIfacesNumber
elemental.cattle.io/NetBlockDevicesNumber
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
The default machine name is m-{UUID}.
The UUID is generated via software if SMBIOS data is disabled, otherwise
the SMBIOS {System Information/UUID} is used.
Since some hardware vendors don't properly fill the UUID SMBIOS data,
let's always provide a machine name based on a software generated UUID, to
ensure name uniqueness by default.
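A sketch of a software generated default name, assuming a random (version 4)
UUID built from crypto/rand. The helper names are hypothetical and the actual
code may rely on a UUID library instead:

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUIDv4 builds a random (version 4) UUID string from crypto/rand.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// defaultMachineName returns a name of the form m-{UUID} that does not
// depend on SMBIOS data.
func defaultMachineName() (string, error) {
	id, err := newUUIDv4()
	if err != nil {
		return "", err
	}
	return "m-" + id, nil
}

func main() {
	name, err := defaultMachineName()
	fmt.Println(name, err)
}
```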
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
The CloudConfig structure was a serialized interface map: if an old
client is detected, convert back to that legacy type.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: add multiple APIs support in the API server callback
- parse incoming request to identify the requested API
- move the registration management in a separate function
- add placeholder for the new build-image API
* operator: move generic API functions from register.go to server.go
We are introducing new APIs: let's keep in register.go
only those functions specific to the register API.
This commit just moves some functions from register.go to server.go.
No changes in the code.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: move getMachineRegistration() to server.go
Move getMachineRegistration() to server.go for usage from all
APIs. Moreover, let it take directly the token as parameter.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: enforce API syntax during registration
We expect to receive a path of the form:
/elemental/{api}
enforce it (or return HTTP 404 - Not found).
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: introduce generic getValue() function
This is a generic function that could stay under server.go.
Move the specific function to retrieve the CACert under register.go
and leverage the newly introduced function.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: rename register.go to api_registration.go
just to make code easier to navigate
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: add api_buildimage.go for build-image API functions
move there also the placeholder function for the build-image API
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: add registration cache to the server
We need it to store ephemeral build image data, like the seed image
and the status of the actual build job.
We could extend it in the future to have a full cache of the
MachineRegistration that the registration server should deal with.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: build-image API scaffolding
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: add tests for build-image api
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: ensure user input from APIs is properly escaped
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: log failure to set read deadline on the websocket
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* update Copyright year in modified files
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: add a small delay before reporting the build job failure
The build job is empty, and right now just reports failure (actual
implementation will be added in the future).
Since the API tests check the state just after starting the build and
expect to find its state updated to "Started", we need the build job to
wait a while before updating the build state to "Failed", otherwise the
tests may miss the Started state.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: make code scanning tools happier
When user input is sanitized, use a different var to store the sanitized
value. This should make the scanning tools' job easier and avoid false
positives.
On the bonus side, the code will be more readable, i.e., it will be
clear where we use the sanitized values.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
Note that `config-dir` options for the install config got lost within the kubebuilder refactor. This is required to be able to pass custom hooks as part of the installation.
Signed-off-by: David Cassany <dcassany@suse.com>
* Return registration errors to client
Introduces two new message-types (MsgError and MsgConfig).
MsgError is sent when an error is encountered during the registration
process.
MsgConfig is used to send the elemental configuration to the client,
before this was just a raw message with no type so we need to check in
the server if the client supports the message, otherwise fallback to the
raw message.
Signed-off-by: Fredrik Lönnegren <fredrik.lonnegren@suse.com>
* Change registration error message
unknown -> unexpected
Co-authored-by: Francesco Giudici <francesco.giudici@gmail.com>
Signed-off-by: Fredrik Lönnegren <fredrik.lonnegren@gmail.com>
* Remove InventoryServer receiver argument
From writeError method
Co-authored-by: Francesco Giudici <francesco.giudici@gmail.com>
Signed-off-by: Fredrik Lönnegren <fredrik.lonnegren@suse.com>
Fixes: #265
* registration: negotiate registration protocol
* operator: always update the MachineInventory for authenticated clients
* register: rename sendData to sendSMBIOSdata
* register: rework the Register() function
* operator: rework the registration protocol loop
* operator: no need to return the msgType from the registration loop
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator/registration: switch to Kubebuilder client
Fixes #239
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator/registration: adapt tests to Kubebuilder client
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* unit-tests: vendor controller-runtime fake client
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: add yaml annotations for correct marshalling
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator/RBAC: add "get" verb to ServiceAccount resources
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: generate rbac
make generate-manifests
make build-rbac
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* Add elementalcli package
Create a separate package to deal with elemental-cli installation.
This brings the elemental-cli functions declared in the config package
to a new package that just parses a map[string]interface{} argument.
This is a step to enable usage of the elemental-cli functions with the
new elementalv1.Config.Elemental.Install type.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* register: switch to Kubebuilder api
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* register: add mapstructure annotations for correct marshalling
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* register: reduce complexity of the run function
Make linter happy:
"cyclomatic complexity 16 of func `run` is high (> 15) (gocyclo)"
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* register: drop io/ioutil in favor of os package
io/ioutil is deprecated
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
This commit adds a rate limiter to the ManagedOSVersionChannel controller to prevent
stacking reconcile loops over the same resource at a fast rate (which doesn't make sense
for a ManagedOSVersionChannel). By default the controller runtime already includes an
equivalent rate limiter, but it starts in the range of milliseconds. Starting the
exponential rate limiter in the range of seconds is more than enough in this context.
In addition, it drops the failures counter in the resource. This counter was supposed to
be used to limit the number of sync attempts in case of failure. This was a bad design:
status should not keep a counter like this, as any change in status triggers a new
immediate reconcile loop. Hence the counter was reaching the maximum as fast as the
controller runtime could execute reconcile loops, without any rate limiting (the rate
limiter applies only when there are no changes, status included).
For now I think we can just live without setting any maximum for failures. If we ever
need it, I believe it should be coded and tracked within the controller itself, not in each
resource, as this prevents the reconcile loop from being idempotent. Alternatively, we could
prevent triggering the reconcile loop on status changes; however, this prevents
reconciling if any third party (or a user from the kubectl client) changes a resource status.
Fixes #257
Part of #240
Signed-off-by: David Cassany <dcassany@suse.com>
This commit adds a few changes to the syncer logic:
* Makes use of the ManagedOSVersionChannel status reason to track whether there
is an ongoing synchronization, rather than polling for the existence of a synchronization pod.
* Adds logic to stop trying to synchronize after 4 consecutive attempts.
If it exceeds the maximum it just programs the next re-sync after the given sync
interval instead of immediately retrying.
* Adds some logging and comments here and there.
Signed-off-by: David Cassany <dcassany@suse.com>
* Implement syncer logic as part of the ManagedOSVersionChannel controller
This commit adds the logic to synchronize managedosversionchannels
within the already existing controller.
* make generate
* make build-manifests
* Update chart
* update e2e tests
Signed-off-by: David Cassany <dcassany@suse.com>
* Update vendor
* Run generation tasks
* Minor fixes in Makefile
* Remove old code
* Add remaining controllers
* Minor e2e tests improvements
* Switch osversionchannel syncer to controller runtime
* Minor fixes in controllers
* Fix unit tests
* Add new package to Dockerfile
* Update dependencies
* Add unit test helpers
* Add new machine registration controller
* Remove old machine registration controller
* Add rbac tag for secrets
* Fix container argument in chart
* Add labels to all created resources
The '+' is not a typo. Changing to use gopkg.in/yaml.v2 as it respects
yaml-tags on structs.
Example of valid file:
```yaml
node-label+:
- name=hostname
- elemental/label=test
```
Signed-off-by: Fredrik Lönnegren <fredrik.lonnegren@suse.com>
Inventory labels are not propagated from the inventory into the node so
they are pretty useless for things like upgrades.
This patch fixes it by using the override yaml in the node k3s/rke2
configuration to append node labels obtained from the inventory
Signed-off-by: Itxaka <igarcia@suse.com>
The OnChange function of the MachineRegistration controller is becoming
too packed: move the ServiceAccount and associated Secret creation and
management in a separate function
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
If the ServiceAccount for the newly created MachineRegistration
is already there, ensure it has a link to the newly created Secret
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
A Secret referencing a missing ServiceAccount will be deleted.
While we create them one after the other, still the safest path is
to create the ServiceAccount first. Otherwise we may be exposed to
a race condition in which:
1. We create the Secret referencing a nonexistent ServiceAccount
2. The Secret controller will detect a Secret referencing a nonexistent
ServiceAccount and will mark it for deletion
3. The ServiceAccount is created with the reference to the Secret
4. The Secret gets removed by the controller: the controller also
updates the ServiceAccount removing the linked Secret
Fixes #197
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: manage empty config in MachineRegistrations
We don't deal with empty Spec:Config in MachineRegistrations: in that
case we would end up with a nil Config structure, which we don't check
causing the operator to panic.
Just check and deal with empty (nil) MachineRegistration config.
Fixes #202
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator:trivial: rename var machineRegistration to registration
In order to manage a MachineRegistration resource we instantiate a var named
'registration' in all the functions of the server package except the
'unauthenticatedResponse' function.
Let's stay consistent: rename the variable.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: tests: expand the data structure TestInitNewInventory
This has no functional change: just extend the configuration parameter
that can be set in the data structure used for the tests.
Make use of it in the following commit.
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
* operator: tests: check empty config in MachineRegistrations
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>
Label objects created by elemental-operator with
"elemental.cattle.io/managed": "true"
It will be used by the rancher-backup operator to filter these objects and
create a proper backup from them.
Fixes https://github.com/rancher/elemental/issues/435
Signed-off-by: Michal Jura <mjura@suse.com>
Mark secrets created and managed by elemental-operator.
It is needed for the rancher-backup operator to select them for backup.
Fixes https://github.com/rancher/elemental/issues/396
Signed-off-by: Michal Jura <mjura@suse.com>
With Kubernetes 1.24, the creation of a ServiceAccount no longer triggers
the automatic creation of an associated Secret resource: we need
it for the ServiceAccount bound to the MachineRegistration resources.
Explicitly create it in any case.
Fixes #176
Signed-off-by: Francesco Giudici <francesco.giudici@suse.com>