Commit Graph

35 Commits

Author SHA1 Message Date
Giuseppe Scrivano f5bdfdc07e chunked: use temporary file for tar-split data
Replace the in-memory buffer with an O_TMPFILE file. This reduces the
memory requirements for a partial pull since the tar-split data can be
written to disk.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-29 12:26:57 +02:00
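
A minimal sketch of the idea behind the commit above, not the code it introduced: the data is spooled to an anonymous on-disk file instead of a memory buffer. As an assumption, it uses `os.CreateTemp` followed by an immediate unlink as a portable stand-in for `O_TMPFILE` (this works on Unix-like systems). The names `spoolToTempFile` and `roundTrip` are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

// spoolToTempFile copies src into a temporary file and returns the file
// rewound to offset 0. The name is unlinked right away, so the data goes
// away when the file is closed. Assumption: a stand-in for O_TMPFILE.
func spoolToTempFile(src io.Reader) (*os.File, error) {
	f, err := os.CreateTemp("", "tar-split-*")
	if err != nil {
		return nil, err
	}
	// On Unix-like systems the open descriptor keeps the data reachable.
	if err := os.Remove(f.Name()); err != nil {
		f.Close()
		return nil, err
	}
	if _, err := io.Copy(f, src); err != nil {
		f.Close()
		return nil, err
	}
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		f.Close()
		return nil, err
	}
	return f, nil
}

// roundTrip spools s to disk and reads it back, for demonstration only.
func roundTrip(s string) (string, error) {
	f, err := spoolToTempFile(strings.NewReader(s))
	if err != nil {
		return "", err
	}
	defer f.Close()
	data, err := io.ReadAll(f)
	return string(data), err
}

func main() {
	got, err := roundTrip("tar-split payload")
	fmt.Println(got, err)
}
```

The caller only holds a file descriptor, so memory use stays roughly constant regardless of how large the spooled data is.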
Giuseppe Scrivano 8ebe9600e3 chunked: extract blob validation into standalone function
Extract the blob checksum validation logic from decodeAndValidateBlob
into a separate function.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2025-04-28 22:07:58 +02:00
Kir Kolyshkin b7fb12e894 Remove unneeded conversion
Those are the cases where the value being converted is already of that
type (checked to be that way for all os/arch combinations).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
2025-04-01 16:18:43 -07:00
Miloslav Trmač 05de8c7758 Remove the remaining user of golang.org/x/exp/maps
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-03-26 21:00:20 +01:00
Miloslav Trmač e3e4ef8552 Bump maxTocSize to 150 MB
We have seen an image with:
- total size 1.43 GB
- uncompressed zstd:chunked manifest size of 91.7 MB
- uncompressed tar-split size (not constrained by maxTocSize) 310 MB

Without more infrastructure, we are just guessing about what
the system we are running on can support, so, for now, *shrug*, bump
the number.

Eventually we should stream the data from/to disk, making this
much less relevant; that makes building the infrastructure to
estimate available memory unattractive.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-30 21:04:50 +01:00
Miloslav Trmač 009737bdce Allow falling back from partial pulls if the metadata is too large
... but not if the fallback would be convert_images, again
creating too large metadata.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-30 18:17:50 +01:00
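
The fallback rule described above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `errMetadataTooLarge`, `readBounded`, `pullStrategy`, and `strategyFor` are hypothetical names, and the size cap is enforced with `io.LimitReader`.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// errMetadataTooLarge is a hypothetical sentinel for oversized metadata.
var errMetadataTooLarge = errors.New("metadata too large")

// readBounded reads at most max bytes and fails if more data is available.
func readBounded(r io.Reader, max int64) ([]byte, error) {
	// Read one extra byte so that "exactly max" and "more than max" differ.
	data, err := io.ReadAll(io.LimitReader(r, max+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > max {
		return nil, errMetadataTooLarge
	}
	return data, nil
}

// pullStrategy picks "partial" when the metadata fits, falls back to "full"
// when it is too large, and refuses the fallback when convertImages is set,
// because converting would produce oversized metadata again.
func pullStrategy(toc io.Reader, max int64, convertImages bool) (string, error) {
	_, err := readBounded(toc, max)
	switch {
	case err == nil:
		return "partial", nil
	case errors.Is(err, errMetadataTooLarge) && !convertImages:
		return "full", nil
	default:
		return "", err
	}
}

// strategyFor builds a dummy TOC of the given size, for demonstration only.
func strategyFor(size int, max int64, convertImages bool) (string, error) {
	return pullStrategy(strings.NewReader(strings.Repeat("x", size)), max, convertImages)
}

func main() {
	fmt.Println(strategyFor(10, 100, false))
	fmt.Println(strategyFor(200, 100, false))
	fmt.Println(strategyFor(200, 100, true))
}
```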
Miloslav Trmač 8e3996ba14 Move the canFallback logic from make...Differ into read...Manifest
That's a logically better place, it pairs the getBlobAt
calls with the ErrBadRequest types specific to those call sites.

We will also add more fallback reasons.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2025-01-30 18:17:50 +01:00
Miloslav Trmač 5c67136767 Move pkg/chunked/internal to pkg/chunked/internal/minimal
We now have several internal subpackages of pkg/chunked, so delineate
more explicitly the parts that should be kept as small as possible
because the c/image compression package depends on them.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-12-13 01:27:58 +01:00
Giuseppe Scrivano 1f54749ea9 chunked: refactor value into const
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-14 22:59:03 +01:00
Giuseppe Scrivano dcdc061f21 chunked: rework GetBlobAt usage
Rewrite how the result from GetBlobAt is used, to make sure 1) that
the streams are always closed, and 2) that any error is processed.

Closes: https://issues.redhat.com/browse/OCPBUGS-43968

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-11-14 22:59:03 +01:00
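
The two guarantees named in the commit above (every stream is closed, every error is processed) can be illustrated with a small sketch. It assumes a producer that delivers streams on one channel and errors on another, and that closes both channels when done. `consumeAll`, `countingCloser`, and `demo` are hypothetical names, and this is not the code from the commit.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// consumeAll handles every stream, closes each one even after a failure,
// and then drains the error channel so that no error is lost.
// Assumption: the producer closes both channels when it is done.
func consumeAll(streams <-chan io.ReadCloser, errs <-chan error, handle func(io.Reader) error) error {
	var firstErr error
	for s := range streams {
		if firstErr == nil {
			firstErr = handle(s)
		}
		if err := s.Close(); err != nil && firstErr == nil {
			firstErr = err
		}
	}
	for err := range errs {
		if err != nil && firstErr == nil {
			firstErr = err
		}
	}
	return firstErr
}

// countingCloser records how many streams were closed.
type countingCloser struct {
	io.Reader
	closed *int
}

func (c countingCloser) Close() error { *c.closed++; return nil }

// demo produces three streams and, optionally, a trailing error.
func demo(fail bool) (int, error) {
	closed := 0
	streams := make(chan io.ReadCloser)
	errs := make(chan error)
	go func() {
		for _, s := range []string{"a", "b", "c"} {
			streams <- countingCloser{strings.NewReader(s), &closed}
		}
		close(streams)
		if fail {
			errs <- errors.New("range request failed")
		}
		close(errs)
	}()
	err := consumeAll(streams, errs, func(r io.Reader) error {
		_, err := io.Copy(io.Discard, r)
		return err
	})
	return closed, err
}

func main() {
	fmt.Println(demo(false))
	fmt.Println(demo(true))
}
```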
Miloslav Trmač f979bada64 Compute the layer size from tar-split for zstd:chunked layers
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-14 20:10:07 +02:00
Miloslav Trmač 7eb4a104ef Explicitly differentiate between empty and missing tar-split
Empty tar-split shouldn't ever happen, but being precise
here doesn't hurt.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-10-14 20:10:07 +02:00
Miloslav Trmač 39e467aa53 Use tar-split/tar/asm.IterateHeaders now that it has been accepted
... instead of our version which makes assumptions on the
internal decisions of the tar-split project, and needs heuristics
to guess where file padding ends.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-09-27 19:47:25 +02:00
Miloslav Trmač a1acfed89a Ensure that the metadata in the TOC matches the tar-split
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-07-18 23:36:57 +02:00
Miloslav Trmač f065a0a81c Decide on tar-split usage based on trusted data in TOC
Don't ignore the tar-split when the TOC requires one,
otherwise we could deduplicate a layer without tar-split
with a layer with tar-split.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-06-12 16:10:58 +02:00
Giuseppe Scrivano 4595fa2aab chunked: fix deadlock by always consuming tar-split
Always consume the tar-split data when present to avoid blocking the
producer. Previously, the tar-split data was only read when the digest
was specified.

Commit 6875c9fbcf introduced the regression.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-03 23:01:34 +02:00
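
A minimal sketch of why unread data can block a producer, assuming the producer and consumer are connected by something like an `io.Pipe` (an assumption for illustration; the commit does not say which mechanism is involved). Writes to such a pipe block until a reader consumes them, so the consumer drains the data even when it does not need it. `receiveTarSplit` is a hypothetical name.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// receiveTarSplit always reads the producer's output to the end. When keep
// is false the bytes are discarded, but they are still consumed, so the
// producer's Write can return instead of blocking forever.
func receiveTarSplit(keep bool) (int64, error) {
	pr, pw := io.Pipe()
	defer pr.Close()

	produced := make(chan error, 1)
	go func() {
		// Write blocks until the other end has read all of the data.
		_, err := pw.Write([]byte("tar-split payload"))
		pw.CloseWithError(err) // a nil error makes the reader see io.EOF
		produced <- err
	}()

	var kept bytes.Buffer
	var sink io.Writer = io.Discard
	if keep {
		sink = &kept
	}
	n, err := io.Copy(sink, pr)
	if err != nil {
		return n, err
	}
	return n, <-produced
}

func main() {
	fmt.Println(receiveTarSplit(true))
	fmt.Println(receiveTarSplit(false))
}
```

Skipping the `io.Copy` in the `keep == false` case would leave the producer goroutine stuck in `Write`, which is the shape of the deadlock the commit describes.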
Giuseppe Scrivano 6875c9fbcf chunked: ignore the tar-split data if digest is empty
If a digest was not specified in the TOC, completely ignore the
tar-split data.

Otherwise the clients fail to pull images created before commit
b5413c2bd6.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-03 13:48:49 +02:00
Miloslav Trmač 70b2454cde Improve error handling a bit
Include more details in the returned error text.

Don't continue in tests when we fail to obtain a TOC.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-05-14 10:53:03 +02:00
Miloslav Trmač b5413c2bd6 Move the tar-split digest value into the TOC
... so that we can uniquely identify partially-pulled layers
by the TOC digest.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-05-14 10:53:03 +02:00
Miloslav Trmač dfb4b1ff87 Unmarshal the TOC already in readZstdChunkedManifest
Other TOC formats don't fill the data in.

For now, this only increases memory usage, but we will
need the data soon.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-05-14 10:53:03 +02:00
Miloslav Trmač ab5e27004b Shorten readZstdChunkedManifest a bit further
We have the ImageSourceChunk data type, and we already
construct these values, so scan into them directly instead
of having three separate variables for the two bits of data.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač 8eeec33011 Don't use ZstdChunkedFooterData in readZstdChunkedManifest
Replace it by individual variables.

Then formally deprecate the ChecksumAnnotationTarSplit field.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač 2c240ca3f9 Inline ReadFooterDataFromAnnotations into the only caller
Again, decrease the size of the compression code for c/image.

We will simplify this further immediately.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač 9fbd0e0395 Don't look for the binary digest when pulling layers
This code path is almost never triggered because
the annotations are present; and it was broken until recently.

Remove it to simplify the code and analysis.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač 053ac6105d Remove ChecksumAnnotation from ZstdChunkedFooterData
Manage the value directly to simplify.

This happens to fix the ReadFooterDataFromBlob code path,
which was not setting ChecksumAnnotation at all.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-13 16:57:14 +02:00
Miloslav Trmač 1f47b38c09 Only obtain the zstd:chunked TOC digest once
Make it structurally clear that the code is all using the same value,
making it less likely for the verifier and other uses to get out of sync.

Also avoids some redundant parsing and error paths.
The conversion path looks longer, but that's just moving the parsing
from the called function (which is redundant for other callers).

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-13 16:57:07 +02:00
Miloslav Trmač 3beea1e21e Only obtain the estargz TOC digest once
Make it structurally clear that the code is all using the same value,
making it less likely for the verifier and other uses to get out of sync.

Also avoids some redundant parsing and error paths.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-13 16:26:32 +02:00
Miloslav Trmač 3831f44d36 Rename a misleading parameter
Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-02-26 21:06:37 +01:00
Giuseppe Scrivano eedd976e5b chunked: refactor code in new functions
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-09-05 08:55:29 +02:00
Giuseppe Scrivano 303100391e chunked: support converting existing images
If the "convert_images" option is set in the configuration file, then
convert traditional images to the chunked format on the fly.

This is very expensive at the moment since the entire zstd:chunked
file is created and then processed.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-07-26 10:46:15 +02:00
Giuseppe Scrivano 032aae3a70 chunked: drop support for uncompressed metadata blobs
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-06-17 00:31:40 +02:00
Giuseppe Scrivano 7bbf6ed448 chunked: generate tar-split as part of zstd:chunked
Change the file format to store the tar-split as part of the
zstd:chunked image. This will allow clients to rebuild the entire
tarball without having to download it fully.

Also store the uncompressed digest for the tarball, so that it can be
recorded in the storage database.

Needs: https://github.com/containers/image/pull/1976

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2023-06-17 00:31:39 +02:00
Daniel J Walsh a3204cf7e8 Move to golang 1.18 and later
GitHub is reporting security issues on older versions of
golang.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2023-04-03 15:26:54 -04:00
Giuseppe Scrivano f98fa3967d chunked: use io.github.containers
We do not own containers.io, so let's use io.github.containers, since
the project is part of the containers organization on GitHub.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2022-11-03 12:48:45 +01:00
Miloslav Trmač a4870b9761 Move code only called on Linux into a Linux-specific file
Only moves unchanged code, should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2022-10-14 17:17:53 +02:00