Commit Graph

683 Commits

Author SHA1 Message Date
Jindrich Novy 7d82b28a6b Enhance k8s-file log rotation test coverage for corruption fix
Add comprehensive test cases to validate the k8s-file log rotation fix that
prevents buffer corruption during log rotation (commit 29d17be). The original
issue manifested as corrupted log entries with repeated timestamps and broken
formatting when logs exceeded the configured max_size.

New test coverage includes:
- Small log size limit validation (50-100 bytes) to trigger rotation scenarios
- Edge case testing with various rotation thresholds (1B to 10KB)
- Log file creation and content integrity validation
- Stress testing with extremely small rotation limits
- Validation that log files can handle proper k8s format content

Signed-off-by: Jindrich Novy <jnovy@redhat.com>
2025-07-24 12:07:41 +02:00
Jindrich Novy 238b1d099c Add test suite for k8s-file log rotation fix
- Add tests for log-size-max option acceptance and validation
- Test k8s-file log driver with size limits and multiple drivers
- Verify proper log file creation and format handling
- Add WithLogSizeMax() option to conmon Go API for testing

Tests ensure the writev_buffer_flush corruption fix in commit 29d17be
works correctly and log rotation doesn't corrupt k8s log entries.

Signed-off-by: Jindrich Novy <jnovy@redhat.com>
2025-07-21 18:56:52 +02:00
Jindrich Novy 29d17becab Fix k8s-file log corruption during log rotation
The k8s-file log driver was corrupting log entries when the log file
reached max_size and needed rotation. This manifested as garbled output
with repeated timestamps and broken log entries.

The root cause was in writev_buffer_flush() which incorrectly handled
partial writes. When writev() returned a partial write, the function
would modify the original iovec base pointers, corrupting the buffer
state. During log rotation, this corrupted state would carry over to
the new file descriptor, causing subsequent log entries to be malformed.

Changes:
- Simplify writev_buffer_flush() to use individual write() calls instead
  of complex writev() partial write handling
- Always reset buffer state after log rotation to ensure clean state
  with new file descriptor
- Remove conditional buffer reset that could leave corrupted state

This fixes the issue where long-running containers (like GitLab CE)
would produce corrupted logs after reaching the configured max_size.

Fixes: Log corruption with --log-driver k8s-file --log-opt max_size=20mb

Signed-off-by: Jindrich Novy <jnovy@redhat.com>
2025-07-18 10:42:25 +02:00
Giuseppe Scrivano c5c98fd510 conn_sock: drop -1 fron snprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-14 14:04:11 -04:00
Giuseppe Scrivano c926ba0bf1 conn_sock: make sure strncpy buffer is NUL terminated
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-14 14:04:11 -04:00
Giuseppe Scrivano 724e771056 oom: drop usage of sprintf in favor of snprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-14 14:04:11 -04:00
Giuseppe Scrivano e6609cebfe conmon: drop usage of sprintf in favor of snprintf
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-14 14:04:11 -04:00
Giuseppe Scrivano f690e02cae cmsg: shrink buffer to effective size
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-07-14 14:04:11 -04:00
Giuseppe Scrivano b2f13b0f12
Merge pull request #566 from sohankunkerkar/fix-terminal-size
OCPBUGS-4556: src: Fix terminal resize event processing
2025-06-13 23:03:56 +02:00
Sohan Kunkerkar 2a1dda8e92 src: Fix terminal resize event processing
The fix ensures each complete line in the control buffer
is processed exactly once by passing the correct line
start position to the processing function and properly
null-terminating each line.

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
2025-06-13 13:03:19 -04:00
Ayato Tokubi 4d374fd67d fix integration github action
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-06-05 09:10:32 -04:00
Ayato Tokubi ede56b9af5 fix wrong conditions of k8s-file logging
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-06-05 09:10:32 -04:00
Daniel J Walsh 9b74c3258c
Merge pull request #562 from p12tic/journald-labels
logging: Add container labels to log entries on journald
2025-06-01 06:46:36 -04:00
Povilas Kanapickas f37e9e795c logging: Add container labels to log entries on journald
At present it's not possible to ship properly labeled logs when using
podman and tools like podman-compose. Container labels are lost which
makes it much harder to understand where a particular log line
originated from. Log processing and analysis is significantly more
inconvenient as well, because it's hard to group related logs, e.g.
coming from the same compose project.

This commit implements the parts necessary to annotate log messages with
container labels. Each label and value pair is specified via --log-label
LABEL=VALUE arguments.

Co-authored-by: OZoneGuy <oalkersh@protonmail.com>
Signed-off-by: Povilas Kanapickas <povilas@radix.lt>
2025-06-01 00:49:41 +03:00
Jindrich Novy de270e6eb9
Merge pull request #391 from dcermak/go-1-19
Switch go version to 1.19
2025-05-05 16:27:42 +02:00
Jindrich Novy e86e03feab
Merge pull request #560 from kolyshkin/ci
Modernize CI (a bit)
2025-05-05 14:41:06 +02:00
Jindrich Novy ecf16be6c3
Merge pull request #559 from kolyshkin/no-pkg-errors
Modernize Go code
2025-05-02 10:42:33 +02:00
Kir Kolyshkin 93dcd632c5 Makefile: simplify fmt
We can use git ls-files for clang-format, and gofmt directly for Go
(as we don't have vendor).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 18:06:47 -07:00
Kir Kolyshkin c85e7bb4ed Remove hack/tree_status.sh
The sole reason of having this was to produce a nice informative error
message in case of failure (and because the message was not explicitly
specified, it was entirely wrong in 1 of 2 use cases).

Let's drop it and just use git to fail the job if a tree is dirty.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 18:06:47 -07:00
Kir Kolyshkin 0a5e93d95b Remove hack/kubernetes-e2e
This is not used since commit 4e61870 ("gh actions: drop perma-failing
jobs").

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 18:06:47 -07:00
Kir Kolyshkin 0b024b29da ci: add go.mod/go.sum validation
This repo never had ./vendor committed, so it's of little use,
especially with modern Go caching and proxying. So, let's remove
"go mod vendor" from the make's vendor target, and also remove
"make vendor" from the integration / conmon CI job.

Instead, add a separate GHA job to check for go.mod/go.sum correctness.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 18:06:44 -07:00
Kir Kolyshkin 7c7b0c5bbc ci/gha: add all-done job
The sole reason is to simplify branch protection rules,
requiring just these to be passed.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 17:56:51 -07:00
Kir Kolyshkin 12c3a59925 ci/gha: fix branch name
Apparently, since the master branch was renamed to main a few years
back, GHA CI jobs no longer run when a PR is merged.

Fix that. While at it, add some vertical whitespace.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 17:56:51 -07:00
Kir Kolyshkin 5b51069234 Remove old vendored go-md2man
The version of go-md2man is quite old, and is not really need to be in
this repository. It is only used by `make docs` which does not make
sense to run in CI, as we don't check the resulting man page.

PS if needed, it is easy to install by this one-liner:

	go install github.com/cpuguy83/go-md2man/v2@latest

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 17:14:13 -07:00
Kir Kolyshkin 58e4cf40fc ci/gha: remove actions/cache
Because actions/setup-go does its own caching now, adding our own:
 - unnecessarily complicates things;
 - results in double caching.

Remove it.

Note that a single common cache will be used between the two jobs (which
is OK).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 13:36:17 -07:00
Kir Kolyshkin 9389c6114a Use gofumpt
Apparently the code is well formatted, except for 0oNNN for octal
numbers (available since Go 1.13).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 13:24:57 -07:00
Kir Kolyshkin 8cb0c760b6 runner/conmon_test: rm unused skopeoPath
Probably a leftover from developing tests.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 13:21:35 -07:00
Kir Kolyshkin adb68be243 runner/conmon: rm unused writeConmonPipeData
This was introduced in commit 31c5a2e ("add tests running a runtime")
but was not used ever.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 13:20:07 -07:00
Kir Kolyshkin c56cab55c4 Replace ioutil.TempDir with t.TempDir
The testing.TempDir function is available since Go 1.15.

Since tests are written using ginkgo, use GinkgoT to obtain
*testing.T.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-05-01 11:57:27 -07:00
Kir Kolyshkin 4d836a4b80 Use os.ReadFile/os.WriteFile instead of ioutil
The io/ioutil package is deprecated since Go 1.16,
so let's switch to os for ReadFile and WriteFile.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-30 19:42:43 -07:00
Kir Kolyshkin c490967a23 runner: stop using pkg/errors
The github.com/pkg/errors is frozen since November 2021, and %w for
fmt.Errorf is available since Go 1.13 (September 2019).

Switch from pkg/errors to Go native way of wrapping errors.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-30 19:26:07 -07:00
Jindrich Novy 475c7de10f
Merge pull request #554 from jnovy/540
Introduce pwarnf() for better diagnosis of socket/fd write issues.
2025-04-24 12:43:34 +02:00
Jindrich Novy 6f5e1d2277
Merge pull request #555 from kolyshkin/log-fixups
Logging fixups and cleanups
2025-04-24 10:51:16 +02:00
Jindrich Novy 503645f3cf
Merge pull request #557 from kolyshkin/nits
seccomp_accept_cb: fix memory leak
2025-04-24 10:47:06 +02:00
Kir Kolyshkin 869f9d25f5 Use %m instead of strerror(errno)
While %m is GNU extension, it is also supported by alternative libc
versions (uclibc, musl) as well as by FreeBSD 12+.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-23 13:10:39 -07:00
Kir Kolyshkin 38ff63772b cmsg: error logging nits
In here, error and errorf print strerror(errno), but the only case
errno is set is when malloc() fails (and the errno is invariably
ENOMEM so it doesn't make much sense to print it anyway).

1. Remove adding strerror(errno) from these macros.

2. Redefine error via errorf to simplify.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-23 13:09:45 -07:00
Kir Kolyshkin f464b5923d seccomp_accept_cb: fix memory leak
Users of recvfd are expected to free the .name.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 14:58:41 -07:00
Kir Kolyshkin 8c35fb532c Remove pwarn macro
And convert 2 of its users to use nwarnf with %m.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 14:54:02 -07:00
Kir Kolyshkin 775ef67955 write_journald: fix logging a warning
1. Use nwarn instead of pwarn since we're not interested in errno here.

2. Add some context to the message.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 14:54:02 -07:00
Kir Kolyshkin 0e7fd175f0 write_oom_adjust: remove extra newlines from ndebugf
The ndebugf macro already adds \n so we don't have to.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-22 14:54:02 -07:00
Jindrich Novy 238f24a5f6 Introduce pwarnf() for better diagnosis of socket/fd write issues.
Related: #540

Signed-off-by: Jindrich Novy <jnovy@redhat.com>
2025-04-22 14:18:27 +02:00
Jindrich Novy 2e66c1cce7
Merge pull request #551 from jnovy/pollfix
Handle descriptor in non-blocking mode properly.
2025-04-15 13:04:25 +02:00
Jindrich Novy 5412374e02 Handle descriptor in non-blocking mode properly.
Resolves: #490

Signed-off-by: Jindrich Novy <jnovy@redhat.com>
2025-04-15 12:26:15 +02:00
Ayato Tokubi 82de887596 Bump conmon version to 2.1.13
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-02-25 13:04:08 -05:00
Ayato Tokubi 24498b54d7 Install some packages to fix CI
Signed-off-by: Ayato Tokubi <atokubi@redhat.com>
2025-02-24 11:43:25 -05:00
Jindrich Novy 41e2c0dc06 Make timestamp generation never fail.
If, for whatever reason, quering current time fails
then the whole log message is dropped.

This PR assures the log message is not lost and
the output is done with the default timestamp.

Signed-off-by: Jindrich Novy <jnovy@redhat.com>
2025-02-11 10:33:33 -05:00
Jindrich Novy 119db20187 Change permissions of logs from 0600 to 0640
Changing log permissions to 0640 would allow the administrator to
set sticky group on the log directory, and for a selected
log-users (in a specific group) without root-permissions to
read the log files.

Fixes #539

Signed-off-by: Jindrich Novy <jnovy@redhat.com>
2025-01-09 10:25:30 -05:00
Jindrich Novy aee638f5b2
Merge pull request #536 from jnovy/retryable
Avoid bogus journal filling errors
2024-12-11 10:10:58 +07:00
Jindrich Novy 6bd3079f62
Merge pull request #535 from jnovy/manfix
Fix typos and clarify man page.
2024-12-10 11:17:06 +07:00
Jindrich Novy 02c6ea6191 Avoid bogus journal filling errors
Issue #454 is likely caused by retryable_error()
not catching all possible retriable errors from
the write(2) syscall.

This PR should be FeeBSD compatible. The only
retriable result in question is ENOBUFS which
might be bound to network latencies. Hope
these are acceptable.

Signed-off-by: Jindrich Novy <jnovy@redhat.com>
2024-12-06 05:09:23 +01:00