39 Commits

Author SHA1 Message Date
Miloslav Trmač 0ad3d51e5a Don't document the precise file position of a returned file
The callers don't really need to know, and this way we don't
need to get the details of the syntax correct.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-05-14 20:53:29 +02:00
Miloslav Trmač fbfe821818 Use io.SeekStart instead of hard-coding 0.
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-05-14 20:53:29 +02:00
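As an illustration of the `io.SeekStart` change, here is a minimal sketch of rewinding a file with the named constant instead of a literal `0`. The helper names `rewindAndRead` and `writeThenRead` are hypothetical and do not come from the repository.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// rewindAndRead (hypothetical helper) seeks f back to its beginning and
// returns everything it contains.
func rewindAndRead(f *os.File) (string, error) {
	// io.SeekStart names the "relative to the start of the file" origin.
	// Its value is 0, so this behaves the same as f.Seek(0, 0).
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return "", err
	}
	data, err := io.ReadAll(f)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

// writeThenRead (hypothetical helper) writes content to a temporary file,
// then reads it back through rewindAndRead.
func writeThenRead(content string) (string, error) {
	f, err := os.CreateTemp("", "seek-example-")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()
	if _, err := f.WriteString(content); err != nil {
		return "", err
	}
	return rewindAndRead(f)
}

func main() {
	s, err := writeThenRead("tar-split data")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(s)
}
```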
Giuseppe Scrivano 1f4b7049d1 chunked: Fix potential file descriptor leaks
Ensure temporary file descriptors for tar-split data
are closed properly in error paths within zstd:chunked
handling.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-07 08:52:19 +02:00
Giuseppe Scrivano 51a7982bc8 chunked: make explicit successful return
This change makes the final `nil` error return explicit, making the
successful exit path clearer.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-05-07 08:52:18 +02:00
Giuseppe Scrivano 289948adad chunked: use temporary file for tar-split data
Replace the in-memory buffer with an O_TMPFILE file.  This reduces the
memory requirements for a partial pull since the tar-split data can be
written to disk.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-29 12:26:57 +02:00
Giuseppe Scrivano 9751b84c7f chunked: extract blob validation into standalone function
extract the blob checksum validation logic from decodeAndValidateBlob
into a separate function.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-28 22:07:58 +02:00
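A standalone checksum check of the kind this commit extracts might look roughly like the following. The function name `checkBlobDigest`, the choice of SHA-256, and the hex-encoded expected value are all assumptions made for this sketch. They are not taken from the repository.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// checkBlobDigest (hypothetical) returns an error when the SHA-256 digest of
// blob does not match expectedHex, a lowercase hex-encoded digest.
func checkBlobDigest(blob []byte, expectedHex string) error {
	sum := sha256.Sum256(blob)
	got := hex.EncodeToString(sum[:])
	if got != expectedHex {
		return fmt.Errorf("blob digest mismatch: expected %s, got %s", expectedHex, got)
	}
	return nil
}

func main() {
	// SHA-256 digest of the empty input.
	const emptyDigest = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

	fmt.Println("empty blob:", checkBlobDigest(nil, emptyDigest))
	fmt.Println("other blob:", checkBlobDigest([]byte("x"), emptyDigest))
}
```

Keeping the check in its own function lets the decoding code call it at a single point and makes it testable on its own.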
Kir Kolyshkin 83361ab356 Remove unneeded conversion
Those are the cases where the value being converted is already of that
type (checked to be that way for all os/arch combinations).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-01 16:18:43 -07:00
Miloslav Trmač b6236cd98e Remove the remaining user of golang.org/x/exp/maps
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-03-26 21:00:20 +01:00
Miloslav Trmač 929f785f43 Bump maxTocSize to 150 MB
We have seen an image with:
- total size 1.43 GB
- uncompressed zstd:chunked manifest size of 91.7 MB
- uncompressed tar-split size (not constrained by maxTocSize) 310 MB

Without more infrastructure, we are just guessing about what
the system we are running on can support, so, for now, *shrug*, bump
the number.

Eventually we should stream the data from/to disk, making this
much less relevant; that makes building the infrastructure to
estimate available memory unattractive.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-30 21:04:50 +01:00
Miloslav Trmač 51810f3644 Allow falling back from partial pulls if the metadata is too large
... but not if the fallback would be convert_images, again
creating too large metadata.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-30 18:17:50 +01:00
Miloslav Trmač 85ae2385f6 Move the canFallback logic from make...Differ into read...Manifest
That's a logically better place, it pairs the getBlobAt
calls with the ErrBadRequest types specific to those call sites.

We will, also, add more fallback reasons.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-30 18:17:50 +01:00
Miloslav Trmač eca040fb6e Move pkg/chunked/internal to pkg/chunked/internal/minimal
We now have several internal subpackages of pkg/chunked, so delineate
more explicitly the parts that should be kept as small as possible
because the c/image compression package depends on them.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-12-13 01:27:58 +01:00
Giuseppe Scrivano e9b6d0c436 chunked: refactor value into const
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-14 22:59:03 +01:00
Giuseppe Scrivano 46ccf217a6 chunked: rework GetBlobAt usage
rewrite how the result from GetBlobAt is used, to make sure 1) that
the streams are always closed, and 2) that any error is processed.

Closes: https://issues.redhat.com/browse/OCPBUGS-43968

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-14 22:59:03 +01:00
Miloslav Trmač 1895f29029 Compute the layer size from tar-split for zstd:chunked layers
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-14 20:10:07 +02:00
Miloslav Trmač e652ec86b3 Explicitly differentiate between empty and missing tar-split
Empty tar-split shouldn't ever happen, but being precise
here doesn't hurt.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-14 20:10:07 +02:00
Miloslav Trmač 4fd50da3da Use tar-split/tar/asm.IterateHeaders now that it has been accepted
... instead of our version which makes assumptions on the
internal decisions of the tar-split project, and needs heuristics
to guess where file padding ends.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-09-27 19:47:25 +02:00
Miloslav Trmač 4159a976a6 Ensure that the metadata in the TOC matches the tar-split
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-07-18 23:36:57 +02:00
Miloslav Trmač e3634ea781 Decide on tar-split usage based on trusted data in TOC
Don't ignore the tar-split when the TOC requires one,
otherwise we could deduplicate a layer without tar-split
with a layer with tar-split.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-06-12 16:10:58 +02:00
Giuseppe Scrivano b01b5dff45 chunked: fix deadlock by always consuming tar-split
always consume the tar-split data when present to avoid blocking the
producer. Previously, the tar-split data was only read when the digest
was specified.

commit 12eee6df39 introduced the
regression.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-03 23:01:34 +02:00
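The deadlock described in this commit can be reproduced in miniature with `io.Pipe`, where a write blocks until a reader takes the data. In the sketch below the consumer always drains the stream, even when the caller does not need the contents. The names `consume` and `run` are hypothetical, and the pipe only stands in for the real tar-split producer.

```go
package main

import (
	"fmt"
	"io"
)

// consume (hypothetical) always reads r to the end. When needData is false
// the bytes are discarded, but the producer is still unblocked.
func consume(r io.Reader, needData bool) ([]byte, error) {
	if needData {
		return io.ReadAll(r)
	}
	_, err := io.Copy(io.Discard, r)
	return nil, err
}

// run (hypothetical) starts a producer that writes payload into a pipe and
// waits for it to finish. If consume skipped the stream instead of draining
// it, the producer's Write would block forever and so would the wait below.
func run(payload []byte, needData bool) ([]byte, error) {
	pr, pw := io.Pipe()
	done := make(chan error, 1)
	go func() {
		_, err := pw.Write(payload)
		pw.CloseWithError(err) // a nil error behaves like Close
		done <- err
	}()

	data, err := consume(pr, needData)
	if err != nil {
		pr.CloseWithError(err) // unblock the producer on failure
		return nil, err
	}
	if err := <-done; err != nil {
		return nil, err
	}
	return data, nil
}

func main() {
	kept, err := run([]byte("tar-split"), true)
	fmt.Println(len(kept), err)

	dropped, err := run([]byte("tar-split"), false)
	fmt.Println(len(dropped), err)
}
```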
Giuseppe Scrivano 12eee6df39 chunked: ignore the tar-split data if digest is empty
if a digest was not specified in the TOC, completely ignore the
tar-split data.

Otherwise the clients fail to pull images created before commit
a7247dc6e8.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-03 13:48:49 +02:00
Miloslav Trmač 90ae7307c9 Improve error handling a bit
Include more details in the returned error text.

Don't continue in tests when we fail to obtain a TOC.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-05-14 10:53:03 +02:00
Miloslav Trmač a7247dc6e8 Move the tar-split digest value into the TOC
... so that we can uniquely identify partially-pulled layers
by the TOC digest.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-05-14 10:53:03 +02:00
Miloslav Trmač 6bebedbff7 Unmarshal the TOC already in readZstdChunkedManifest
Other TOC formats don't fill the data in.

For now, this only increases memory usage, but we will
need the data soon.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-05-14 10:53:03 +02:00
Miloslav Trmač 58f5c24d42 Shorten readZstdChunkedManifest a bit further
We have the ImageSourceChunk data type, and we already
construct these values, so scan into them directly instead
of having three separate variables for the two bits of data.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač 827c27abcd Don't use ZstdChunkedFooterData in readZstdChunkedManifest
Replace it by individual variables.

Then formally deprecate the ChecksumAnnotationTarSplit field.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač f468eee1e9 Inline ReadFooterDataFromAnnotations into the only caller
Again, decrease the size of the compression code for c/image.

We will simplify this further immediately.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač ea3e384742 Don't look for the binary digest when pulling layers
This code path is usually never triggered because
the annotations are present; and it was broken until recently.

Remove it to simplify the code and analysis.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač e750609e98 Remove ChecksumAnntation from ZstdChunkedFooterData
Manage the value directly to simplify.

This happens to fix the ReadFooterDataFromBlob code path,
which was not setting ChecksumAnntation at all.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-13 16:57:14 +02:00
Miloslav Trmač 29bca8a07e Only obtain the zstd:chunked TOC digest once
Make it structurally clear that the code is all using the same value,
making it less likely for the verifier and other uses to get out of sync.

Also avoids some redundant parsing and error paths.
The conversion path looks longer, but that's just moving the parsing
from the called function (which is redundant for other callers).

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-13 16:57:07 +02:00
Miloslav Trmač b90f9dfed8 Only obtain the estargz TOC digest once
Make it structurally clear that the code is all using the same value,
making it less likely for the verifier and other uses to get out of sync.

Also avoids some redundant parsing and error paths.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-13 16:26:32 +02:00
Miloslav Trmač bbe282085b Rename a misleading parameter
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-02-26 21:06:37 +01:00
Giuseppe Scrivano 2c0ba1a132 chunked: refactor code in new functions
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-09-05 08:55:29 +02:00
Giuseppe Scrivano c98840f31d chunked: support converting existing images
if the "convert_images" option is set in the configuration file, then
convert traditional images to the chunked format on the fly.

This is very expensive at the moment since the entire zstd:chunked
file is created and then processed.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-07-26 10:46:15 +02:00
Giuseppe Scrivano 97da4fcf8e chunked: drop support for uncompressed metadata blobs
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-06-17 00:31:40 +02:00
Giuseppe Scrivano 8ef163a990 chunked: generate tar-split as part of zstd:chunked
change the file format to store the tar-split as part of the
zstd:chunked image.  This will allow clients to rebuild the entire
tarball without having to download it fully.

also store the uncompressed digest for the tarball, so that it can be
stored into the storage database.

Needs: https://github.com/containers/image/pull/1976

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-06-17 00:31:39 +02:00
Daniel J Walsh 1bf0078883 Move to golang 1.18 and later
GitHub.com is reporting security issues on older versions of
golang.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-04-03 15:26:54 -04:00
Giuseppe Scrivano 5af655e543 chunked: use io.github.containers
we do not own containers.io so let's use io.github.containers, since
the project is part of the containers organization under github.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-11-03 12:48:45 +01:00
Miloslav Trmač c486e1ac82 Move code only called on Linux into a Linux-specific file
Only moves unchanged code, should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-10-14 17:17:53 +02:00