Harrison
b8c28d00ff
Merge pull request #201 from HarrisonWAffel/retries-after-restart
...
Add ResetFailureCountOnServiceRestart
2024-10-17 12:25:14 -04:00
Harrison Affel
fb4a027b4d
Add ResetFailureCountOnServiceRestart; if true, reset the plan failure count after each restart of the system-agent
2024-10-16 14:12:27 -04:00
Harrison Affel
7300df0e0e
Add tests and update CI
2024-10-01 15:06:03 -04:00
Harrison Affel
bc9bd0b463
Windows updates
2024-09-23 17:44:11 -04:00
Jiaqi Luo
befb1d33b2
Migrate from Drone to GitHub Action
2024-07-01 13:17:57 -07:00
Chris Kim
3bf716f8e0
Add support for CATTLE_AGENT_STRICT_VERIFY|STRICT_VERIFY environment variables to ensure kubeconfig CA data is valid (#171)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2024-06-27 11:29:39 -07:00
Peter Matseykanets
41c07d0600
Update Go to 1.21 and deps for k8s 1.27 (#152)
...
Ref: https://github.com/rancher/rancher/issues/43318
2024-02-26 16:21:27 -05:00
Chris Kim
806ef425e0
Add interlocks to ensure operations are not interrupted (#150)
...
* Add interlocks to ensure system-agent does not get restarted when it is applying a plan and does not start applying a plan when a restart is pending
* Remove s390x from drone file
* Don't always set CROSS to true when building
Signed-off-by: Chris Kim <oats87g@gmail.com>
2023-12-12 14:18:36 -08:00
Brad Davidson
3d8c2b53c8
Fix repeated time parse error on probes that have not yet run successfully
...
Check that last successful run time is not an empty string before trying to parse it.
Signed-off-by: Brad Davidson <brad.davidson@rancher.com>
2023-09-19 16:04:49 -07:00
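The guard described in commit 3d8c2b53c8 can be sketched roughly as below. This is only an illustration, not the project's code: the function name `parseLastSuccess` and the `time.RFC3339` layout are assumptions made for the example. An empty string means the probe has never succeeded, so it is reported as "no timestamp" instead of being handed to the parser, where it would fail on every pass.

```go
package main

import (
	"fmt"
	"time"
)

// parseLastSuccess is a hypothetical helper. It returns ok=false without an
// error when raw is empty (the probe has not succeeded yet), and only calls
// time.Parse when there is something to parse.
// The RFC3339 layout is an assumption for this sketch.
func parseLastSuccess(raw string) (t time.Time, ok bool, err error) {
	if raw == "" {
		return time.Time{}, false, nil
	}
	t, err = time.Parse(time.RFC3339, raw)
	if err != nil {
		return time.Time{}, false, err
	}
	return t, true, nil
}

func main() {
	// A probe that has never succeeded: no error, no timestamp.
	_, ok, err := parseLastSuccess("")
	fmt.Println(ok, err)

	// A probe with a recorded success time.
	t, ok, err := parseLastSuccess("2023-09-19T16:04:49-07:00")
	fmt.Println(t.Year(), ok, err)
}
```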
Chris Kim
9e827a59b8
Add CATTLE_AGENT_ATTEMPT_NUMBER environment variable that corresponds to failure count for K8s plan application (#115)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2023-05-19 08:37:35 -07:00
Chris Kim
e696ff63fe
Retry update with latest secret if plan still matches the applied plan (#114)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2023-05-04 17:01:12 -07:00
Chris Kim
e57338eef9
Add error handling logic that handles edge cases to force the system-agent to restart if we encounter non-transient errors. Disallow the K8s watcher from manipulating a secret when the UID changes. (#112)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2023-04-19 08:42:44 -07:00
Chris Kim
24c523a440
Bump golang to 1.19.4-alpine3.17, rancher/wharfie to v0.5.3, and dapper to v0.6.0 (#102)
...
* Bump golang to 1.19.4-alpine3.17, rancher/wharfie to v0.5.3, and dapper to v0.6.0
* bump golangci-lint
* fix validate script
* fix CI for validation to run go fmt
Signed-off-by: Chris Kim <oats87g@gmail.com>
2023-01-06 09:19:35 -08:00
Jake Hyde
9f22484617
Add back TLSClientConfig to transport
2022-09-29 18:30:47 -04:00
Jake Hyde
7a0853f892
Add proxy to validate rest config
2022-09-26 21:17:18 -04:00
Jamie Phillips
29a9cda11c
This fixes the TLS handshake on Windows.
2022-07-29 16:25:27 -04:00
Ross Kirkpatrick
bbb696911e
Bump to go1.18 and client-go 1.24, remove windows-specific x509 logic (#86)
...
* initial 1.24 k8s support plus go1.18
* fix gocr and wharfie versioning
* bump dapper
* bump go version for builds to 1.18.3, bump alpine
* handle if
* fix wharfie and gocr version pins
* bump golangci-lint to 1.18 compat
* revert dapper bump for arm
2022-07-13 11:50:29 -04:00
Donnie Adams
a509971a10
Fix nil-pointer dereference on windows context
...
Passing certContext to the deferred function ensures that the value will
be taken at that point. The issue is that it is nil when the deferred
function is called.
This change will capture the variable so its real value is passed.
2022-05-06 13:57:08 -07:00
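The Go behaviour behind commit a509971a10 can be illustrated with a small sketch. Arguments to a deferred call are evaluated when the `defer` statement executes, whereas a deferred closure reads the variables it captured only when it finally runs. The `certContext` type and both functions below are hypothetical stand-ins, not the project's code; the variable is reset to nil before the function returns, as might happen at the end of an enumeration loop.

```go
package main

import "fmt"

// certContext is a hypothetical stand-in for a platform certificate handle.
type certContext struct{ name string }

// describe reports what the deferred function actually received.
func describe(c *certContext) string {
	if c == nil {
		return "nil"
	}
	return c.name
}

// closureDefer defers a closure that reads ctx when it runs. By then ctx has
// been reset to nil, so the deferred function sees nil.
func closureDefer() (seen string) {
	ctx := &certContext{name: "cert"}
	defer func() { seen = describe(ctx) }()
	ctx = nil
	return
}

// argumentDefer passes ctx as an argument. The argument is evaluated at the
// defer statement, so the deferred function keeps the original pointer.
func argumentDefer() (seen string) {
	ctx := &certContext{name: "cert"}
	defer func(c *certContext) { seen = describe(c) }(ctx)
	ctx = nil
	return
}

func main() {
	fmt.Println(closureDefer())  // nil
	fmt.Println(argumentDefer()) // cert
}
```

The second variant corresponds to what the commit message describes: passing the value to the deferred function fixes it at the point of the `defer` statement.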
Ross Kirkpatrick
46fbba3a20
add windows support for root CA cert stores (#84)
...
* add windows support for system cert stores
* fix comment on unix prober
* ensure we loop over every certificate, add nil check
* clean up buffer logic
* nil check for unix prober
* fix goimports
* better code comments
* additional nil check
* check certcontext length
* init new cert pool in case of nil in prober
2022-05-05 17:42:29 -04:00
Chris Kim
5710abb984
Increase max periodic cooldown duration, tidy applyinator, and add debug messages (#81)
Signed-off-by: Chris Kim <oats87g@gmail.com>
2022-04-13 15:16:51 -07:00
Chris Kim
2c80536ae1
add max-retries and periodic cooldown (#80)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2022-04-13 14:18:06 -07:00
Chris Kim
00181cd06b
Correctly pick up on failed apply (#79)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2022-03-08 10:52:51 -08:00
Chris Kim
05d9e51b0b
Move log messages around to prevent unnecessarily redundant messages (#78)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2022-03-08 06:38:07 -08:00
Chris Kim
414141a983
create directory for applied plans before listing the directory (#77)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2022-03-04 15:41:36 -07:00
Chris Kim
ad6e3be9b8
Only write applied plan contents if the plan actually changes (#75)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2022-03-03 12:39:43 -07:00
Chris Kim
b1f6aced9e
change field name (#73)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2022-02-15 13:03:21 -08:00
Chris Kim
278280e64b
Only set LastRunTime if periodic output was successful (#72)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2022-02-15 11:19:25 -08:00
Chris Kim
559b61591c
Set default period to 600 seconds for periodic instructions (#70)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2022-02-08 13:32:10 -08:00
Chris Kim
bd24a4886e
Enhance system-agent for production readiness and periodic probe (#68)
...
* Add periodic instructions (and move existing instructions to OneTimeInstructions)
* Add retention policy for applied plan
* Add more clarity for log messages
Signed-off-by: Chris Kim <oats87g@gmail.com>
2022-02-08 08:53:06 -08:00
Chris Kim
fda6e6a636
K8splan should run probes
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-11-08 10:39:56 -08:00
Jamie Phillips
bd7429c081
Fixes various Windows-specific bugs discovered during testing.
...
File permissions on Windows don't behave the same as on Linux so those needed adjusting.
Wharfie wasn't passing the OS information so the correct image wasn't being pulled for Windows.
Signed-off-by: Jamie Phillips <jamie.phillips@suse.com>
2021-11-05 12:54:14 -04:00
Chris Kim
37031465c0
Fix unnecessary writing of successful output to failure in the event that the prior plan had failed but the current one is successful (#52)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-08-24 15:17:22 -07:00
Brian Downs
7662889c43
update type assertion to prevent panic (#50)
...
Signed-off-by: Brian Downs <brian.downs@gmail.com>
2021-08-23 15:16:26 -07:00
Chris Kim
20d93d3bab
Add failure handling to system-agent (#51)
...
* Deal with failure
* Enhance system agent to store and handle failure cases
Signed-off-by: Chris Kim <oats87g@gmail.com>
Co-authored-by: Brian Downs <brian.downs@gmail.com>
2021-08-23 15:16:12 -07:00
Jamie Phillips
0a4420a4c2
Merge pull request #46 from phillipsj/feature/windows-compilation
...
Adding Windows builds and compiles.
2021-08-20 08:24:59 -04:00
Darren Shepherd
1c0a7aac71
Compare resourceVersions as int
2021-08-18 21:33:43 -07:00
Jamie Phillips
ee713d8bfe
Adding Windows builds and compiles.
...
Signed-off-by: Jamie Phillips <jamie.phillips@suse.com>
2021-08-18 20:39:49 -04:00
Chris Kim
cbfd68459e
Don't apply if resource version is incorrect (#43)
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-08-06 14:34:09 -07:00
Darren Shepherd
c8e51740fe
Wait for command I/O to complete before checking the command exit code (#42)
2021-08-03 10:29:13 -07:00
Chris Kim
8bb75cf28a
Remove CA data if initial connection attempt fails and provide more context when unable to connect to Rancher (#41)
...
* Nullify ca data if initial connection fails
* Add own validate KC
* Only perform PUT if secret is actually changed
* Change message from debug to info and fix imports
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-08-03 10:28:33 -07:00
Chris Kim
81ca4e28c1
Check probes regardless of resource version
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-06-18 16:29:40 -04:00
Chris Kim
bd8caa5944
Initialize probeStatuses if not already initialized
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-06-18 15:40:42 -04:00
Chris Kim
34a1c42b97
Perform safer secret processing
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-06-18 14:18:23 -04:00
Chris Kim
5777eca782
Check K8s cluster is healthy before proceeding to watch for remote K8s plans
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-06-15 21:43:59 -04:00
Darren Shepherd
60aedcadbb
Move probe logic to reusable method to be used from rancherd
2021-06-15 18:31:16 -07:00
Darren Shepherd
f6a8502ce6
Ignore last scanner err
...
We don't want an I/O error to fail execution of the command. Race conditions
can cause the program to exit successfully but the I/O to fail.
2021-06-15 18:30:48 -07:00
Chris Kim
80bc81bbf1
Turn off local plan parser by default and set directory and file permissions to be a little more restrictive
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-06-09 15:24:46 -04:00
Chris Kim
b622b599bf
Add empty working directory support
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-06-07 14:07:38 -04:00
Chris Kim
1faec3e116
simplify duration parsing to just multiplication
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-06-01 13:24:22 -04:00
Chris Kim
391cdf8014
use proper duration parsing to ensure that probes do not time out immediately
...
Signed-off-by: Chris Kim <oats87g@gmail.com>
2021-06-01 13:19:50 -04:00