Commit Graph

165 Commits

Author SHA1 Message Date
Giuseppe Scrivano 87c6994f67
chunked: allow conversion without zstd compression
This commit introduces the capability to convert tar layers
to the zstd:chunked format, without performing any zstd compression.

This allows using zstd:chunked without the cost of compression and
decompression.

Closes: https://issues.redhat.com/browse/RUN-3056

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-09 13:17:22 +02:00
Giuseppe Scrivano 378f38ca12
chunked: refactor limit reader creation
The reader for the part is now limited before calling
prepareCompressedStreamToFile().

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-06-09 13:17:22 +02:00
Miloslav Trmač ce1a7ec125 Use a os.File.Seek method instead of directly calling unix.Seek
This ultimately amounts to the same thing (and os.File.Seek is
a tiny bit more costly, with some atomic operations) - using
os.File.Seek is more consistent with the surrounding code
relying on uses through os.File, as well as a tiny bit
shorter.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-05-14 20:53:29 +02:00
Miloslav Trmač 59e6d24252 Use io.SeekStart instead of hard-coding 0.
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-05-14 20:53:29 +02:00
Giuseppe Scrivano c9260b973a
chunked: rename GetDiffer to NewDiffer
it is an explicit API breaking change, so that cannot be used by old
users (e.g. an older containers/image version) that are not ported to
support the new semantic that only one ApplyDiffWithDiffer call is
supported for one differ object.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-07 08:52:19 +02:00
Giuseppe Scrivano 719ebe0f4c
chunked: prevent reuse of chunkedDiffer
The chunkedDiffer object holds state and resources that are managed
within a single ApplyDiff call.  Reusing the same differ instance for
multiple ApplyDiff calls could lead to incorrect state or errors related
to already-closed resources.

Add a flag and check to ensure ApplyDiff cannot be called more than
once on the same chunkedDiffer instance, making its usage pattern explicit
and preventing potential misuse.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-07 08:52:19 +02:00
Giuseppe Scrivano 65b1dc9e06
chunked: Fix potential file descriptor leaks
Ensure temporary file descriptors for tar-split data
are closed properly in error paths within zstd:chunked
handling.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-07 08:52:19 +02:00
Giuseppe Scrivano 2931447e0a
chunked: fix file use after close
Transfer ownership of the `tocDigest` file handle upon
successful completion of `ApplyDiff` in the chunked
differ.  This prevents the handle from being closed by
the differ if the operation succeeds, as ownership is
effectively passed along.

Explicitly close the `TarSplit` reader during
`CleanupStagedLayer` and `ApplyDiffWithDiffer`.  This ensures
the associated file handle is released even if the layer
application process is aborted or fails before the reader
is naturally consumed and closed elsewhere.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-07 08:52:19 +02:00
Giuseppe Scrivano 8c9ca13d37
store: remove deprecated ApplyDiffWithDiffer
The ApplyDiffWithDiffer function was marked as deprecated,
with PrepareStagedLayer being the recommended replacement.
Its implementation was just a wrapper around PrepareStagedLayer.

Remove the deprecated function from the Store and LayerStore
interfaces and its implementation.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-07 08:52:18 +02:00
Giuseppe Scrivano 729821c12e
chunked: Refactor temporary file creation
Replace the direct call to unix.Open with the O_TMPFILE flag
with the dedicated openTmpFile helper function.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-29 12:26:57 +02:00
Giuseppe Scrivano f5bdfdc07e
chunked: use temporary file for tar-split data
Replace the in-memory buffer with a O_TMPFILE file.  This reduces the
memory requirements for a partial pull since the tar-split data can be
written to disk.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-29 12:26:57 +02:00
Giuseppe Scrivano 3724a5a0bd
driver: add Close() method to Differ
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-28 22:08:03 +02:00
Kir Kolyshkin b7fb12e894 Remove unneeded conversion
Those are the cases where the value being converted is already of that
type (checked to be that way for all os/arch combinations).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-01 16:18:43 -07:00
Kir Kolyshkin c3ff7f58df Use any instead of interface{}
It's available since Go 1.18 (see https://pkg.go.dev/builtin#any).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-03-31 15:37:25 -07:00
Kir Kolyshkin e1b7a98a21 Use min/max to simplify the code
Should not change functionality.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-03-28 20:08:32 -07:00
Kir Kolyshkin 0ee055d8d2 Do not capitalize error strings
Fix a bunch of linter warnings:

> ST1005: error strings should not be capitalized

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-03-26 14:29:57 -07:00
Марк Булатов 93d250c406
Fix handle leak
Signed-off-by: Марк Булатов <markus7bulatenko@yandex.ru>
2025-03-06 13:04:28 +01:00
Miloslav Trmač 009737bdce Allow falling back from partial pulls if the metadata is too large
... but not if the fallback would be convert_images, again
creating too large metadata.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-30 18:17:50 +01:00
Miloslav Trmač 8e3996ba14 Move the canFallback logic from make...Differ into read...Manifest
That's a logically better place, it pairs the getBlobAt
calls with the ErrBadRequest types specific to those call sites.

We will, also, add more fallback reasons.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-30 18:17:50 +01:00
Miloslav Trmač 0dbf0feab9 Remove the canFallback return value
Instead, have the callees produce ErrFallbackToOrdinaryLayerDownload
directly.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-30 18:17:50 +01:00
Miloslav Trmač 9c42f91cd8 Reorganize chunkedDiffer
Try to split fields by purpose, and document some of the context.

For consistency, use the same order in the struct literals.

Only reorders existing fields, should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-30 18:17:50 +01:00
Miloslav Trmač 9119f36918 When applying a chunked layer with a tar-split, compute its uncompressed digest
This will allow c/image to validate the uncompressed digest against the config's
RootFS.DiffID value (ensuring that the layer's contents are the same when pulled
via TOC and traditionally); and the uncompressed digest will be used as a layer ID,
ensuring users see the traditional layer and image IDs they are used to.

This doesn't work for layers without a tar-split (all estargz, and old zstd:chunked
layers); for those, we fall back to traditional pulls.

Alternatively, for EXTREMELY restricted use cases, add an
"insecure_allow_unpredictable_image_contents" option to storage.conf. This option
allows partial pulls of estargz and old zstd:chunked layers, and skips the costly
uncompressed digest computation. It is then up to the user to worry about
images where the tar representation and the TOC representation don't match,
and about unpredictable image IDs.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-07 16:56:09 +01:00
Miloslav Trmač 3c62db3445 Push the "canFallback" logic from getProperDiffer into the format handlers
We will add more, format-specific, logic to the decision.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-07 16:50:19 +01:00
Miloslav Trmač 25b8040272 Inline parseBooleanPullOption into its only caller
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-07 16:50:19 +01:00
Miloslav Trmač 6efb877bae Centralize parsing of store.PullOptions
... so that it happens
- before we start doing anything destructive
- only once (with no risk of defaults getting out of sync)
- in a single place

Ideally this should happen along with the initial parsing
of the config file; this is not that, but it is a minor step
in that direction.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-07 16:50:19 +01:00
Miloslav Trmač 116abed46c Introduce pkg/chunked/internal/path.RegularFilePathForValidatedDigest
... to centralize the composefs backing file layout design.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-07 16:50:19 +01:00
Miloslav Trmač 603328d069 Change how makeEntriesFlat tracks duplicate entries
- Use a map of struct{} to save on storage
- Track the items by path, not by the digest; that's one byte more
  per entry, but it will allow us to abstract the path logic.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-07 16:50:19 +01:00
Miloslav Trmač 5c67136767 Move pkg/chunked/internal to pkg/chunked/internal/minimal
We now have several internal subpackages of pkg/chunked, so delineate
more explicitly the parts that should be kept as small as possible
because the c/image compression package depends on them.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-12-13 01:27:58 +01:00
Giuseppe Scrivano 478fc7510d
chunked: improve detection of root directory path
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-11 13:34:46 +01:00
Giuseppe Scrivano 40a1ea7fb2
chunked: store the override xattr with the inode type
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-06 08:50:08 +01:00
Giuseppe Scrivano f098bdd5a5
chunked: honor ForceMask for created files
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-12-04 11:52:59 +01:00
Giuseppe Scrivano dcdc061f21
chunked: rework GetBlobAt usage
rewrite how the result from GetBlobAt is used, to make sure 1) that
the streams are always closed, and 2) that any error is processed.

Closes: https://issues.redhat.com/browse/OCPBUGS-43968

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-14 22:59:03 +01:00
Nalin Dahyabhai 435aa93e14 Disable partial pulls (zstd:chunked) by default
Disable the storage.options.pull_options.enable_partial_images option by
default, so that it will have to be explicitly enabled in order to be
used.

Update the apply-diff-from-staging-directory integration test to call
the test helper binary directly, so that the configuration file the test
writes won't have its settings overridden by command line options that
the storage() test helper function adds.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2024-11-04 13:52:48 -05:00
Giuseppe Scrivano 8b5dadb045
chunked: close payload stream
The payload stream must be closed after being used.

Reported here: https://github.com/containers/storage/pull/2041#discussion_r1690061874

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-24 22:43:59 +02:00
Miloslav Trmač 82234e1a36 Fall back from partial pulls to ordinary pulls on VFS
... and other graph drivers which don't support partial pulls.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-18 21:54:35 +02:00
Miloslav Trmač bdbd5ed90d Refactor getProperDiffer further
Use switch to handle the various TOC presence cases

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-18 21:54:27 +02:00
Miloslav Trmač da9f66a4ac Refactor getProperDiffer a bit
Instead of sharing the badRequestErr logic, duplicate it.
That's a bit ugly, but we get better debug messages and a more
traditional control flow.

Should not change behavior, except for debug messages.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-18 21:54:06 +02:00
Miloslav Trmač 8e3938748f Split getProperDiffer from GetDiffer
... to centralize the fallback allowed / required logic.

Should not change behavior, apart from maybe some error text.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-18 21:53:56 +02:00
Miloslav Trmač 1a8fde7e01 Don't discard the error reported by tarSizeFromTarSplit
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-16 00:36:13 +02:00
Miloslav Trmač f979bada64 Compute the layer size from tar-split for zstd:chunked layers
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-14 20:10:07 +02:00
Miloslav Trmač 71fc790550 Don't set UncompressedSize on chunked pull
The current value obtained by summing the sizes of regular file contents
does not match the size of the uncompressed layer tarball.

We don't have a convenient source to compute the correct size
for estargz without pulling the full layer and defeating the point;
so we must allow for the size being unknown.

For recent zstd:chunked images, we have the full tar-split,
so we can compute the correct size; that will happen in
the following commits.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-14 20:10:07 +02:00
Giuseppe Scrivano ed132480d3
chunked: define error for partial pulls not available
define a new error type so that the caller can determine whether it is
safe to ignore the error and retrieve the resource fully.

Closes: https://github.com/containers/storage/issues/2115

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-02 17:13:25 +02:00
Giuseppe Scrivano 3bbd19c9dc
chunked: drop special handling for "< 64" ranges
Avoid handling cases where the server doesn't support at least 64
ranges in a request, in order to prevent falling back to the
traditional pull mechanism.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-10-02 15:49:20 +02:00
Miloslav Trmač f4597cfee5 Use range iterations
... for the trivial cases where the loop body does not reference
the iteration variable at all.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-09-05 18:56:35 +02:00
Miloslav Trmač 28f531e5f6 Remove the context parameter from file loaders
Conceptually, these read from the network and _should_ be interruptible.
But we could, at best, interrupt the wait on the channels returned by
GetBlobAt; we would then still get a ReadCloser where we can call
Read() but not concurrently notice a cancellation.

The cancellation needs to happen at the HTTP client side, i.e. inside
the c/image private.ImageSourceInternalOnly.GetBlobAt call (not ImageSourceSeekable),
and, although it seems not to be quite documented, the http stack's http.NewRequestWithContext
does cause the HTTP body transport to be terminated on a context cancellation.

I.e. it is out of control of this codebase anyway; so don't pretend to handle it.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-07-10 23:54:30 +02:00
Miloslav Trmač 412ff916c2 Remove various unused parameters
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-07-10 23:54:30 +02:00
Giuseppe Scrivano 28cba3014b
chunked: store compressed digest if validated
if the compressed digest was validated, as it happens when
'pull_options = {convert_images = "true"}' is set, then store it as
well so that reusing the blob by its compressed digest works.

Previously, when an image converted to zstd:chunked was pulled a
second time, it would not be recognized by its compressed digest,
resulting in the need to re-pull the image again.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-07-10 09:41:41 +02:00
Jan Rodák fb91f385f4 Fix errcheck: error return value of `unix.Mkdirat` is not checked
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
2024-07-09 16:46:49 +02:00
Jan Rodák 8c11a68185 Fix errcheck: error return value of `c.zstdReader.Reset` is not checked
Signed-off-by: Jan Rodák <hony.com@seznam.cz>
2024-07-09 16:46:49 +02:00
Giuseppe Scrivano f35833c88d
chunked: set correctly the xattr for ForceMask
and also fix a segfault when the xattr map is nil

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-19 23:20:21 +02:00