Commit Graph

199 Commits

Author SHA1 Message Date
Giuseppe Scrivano cdf703fa81
chunked: return PathError for fsetxattr
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-07 17:25:26 +02:00
Giuseppe Scrivano 145b4f9a52
chunked: return PathError for utimensat
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-07 17:25:25 +02:00
Giuseppe Scrivano 7094880624
chunked: return PathError for Linkat errors
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-07 17:24:53 +02:00
Giuseppe Scrivano 886d4eedba
chunked: return PathError for fchmodat/fchmod
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-07 17:24:27 +02:00
Giuseppe Scrivano d1447ae783
chunked: unify chown calls
refactor all calls to Fchown and Fchownat into a single function, and
provide a better error messages.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-07 17:24:27 +02:00
Giuseppe Scrivano 62c4c243a3
chunked: refactor doHardLink to get full destination path
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-07 17:24:23 +02:00
Colin Walters cfde71b024 chunked: Add helpers for /proc/self/fd accesses
It's not hard to grep for /proc/self/fd, but this way is a bit
cleaner and avoids typos, etc.

Signed-off-by: Colin Walters <walters@verbum.org>
2024-06-07 09:30:55 -04:00
Giuseppe Scrivano 42801b27de
chunked: add tests for filesystem operations
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-07 14:38:24 +02:00
Giuseppe Scrivano c4ba01f635
chunked: use filepath.Split() instead of Dir()/Base()
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-07 14:37:42 +02:00
Giuseppe Scrivano 4c716c8628
chunked: honor mode for mkdirat
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-07 08:32:24 +02:00
Giuseppe Scrivano 61a2c5ddf3
chunked: refactor args to openOrCreateDirUnderRoot()
follow the same pattern used by other functions

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-06 22:33:43 +02:00
Giuseppe Scrivano 7c61cfaee7
chunked: refactor args to openFileUnderRoot()
follow the same pattern used by other functions.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-06 22:33:43 +02:00
Giuseppe Scrivano c811876d13
chunked: fix opening parent dir
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-06 22:33:43 +02:00
Giuseppe Scrivano 2278cb1e0e
chunked: provide constructor for seekableFile
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-06 21:37:13 +02:00
Giuseppe Scrivano 37800897ab
chunked: split file operations to a new file
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-06 13:11:46 +02:00
openshift-merge-bot[bot] de5b13bf73
Merge pull request #1951 from giuseppe/dump-handle-duplicates
dump: handle duplicates
2024-06-05 19:24:14 +00:00
Colin Walters 661531fb0d chunked: Set O_CLOEXEC
I was just reading the code and I have a mental checklist item
for "invoking open without O_CLOEXEC" that triggered here.
(See also e.g.
https://github.com/containers/composefs/pull/185#discussion_r1322925050
)

It has security-relevant properties for us, xref
CVE-2024-21626 for example.

This isn't the only missing variant of this in this codebase,
just using this targeted PR to test the waters for more PRs.

Signed-off-by: Colin Walters <walters@verbum.org>
2024-06-05 09:20:07 -04:00
Giuseppe Scrivano 1848835383
dump: handle duplicates
Add validation for duplicate entries.  Duplicates are ignored, unless
there is a mismatch in the values.  In that case, an error is returned.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-05 13:59:15 +02:00
Giuseppe Scrivano 588dd2767c
dump: refactor dump_test.go
move all the test cases into a struct instead of declaring each of
them separately.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-05 13:13:42 +02:00
Giuseppe Scrivano 06a03bd48f
dump: use the sanitized path for root check
otherwise if the root is stored as "./", it ends up adding the root
node twice causing mkcomposefs to fail.

Closes: https://github.com/containers/storage/issues/1941

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-04 22:12:18 +02:00
openshift-merge-bot[bot] 986e170721
Merge pull request #1940 from giuseppe/fix-escape-for-composefs-dump
dump: replace unicode package with custom functions
2024-06-04 14:44:24 +00:00
Giuseppe Scrivano 12f34d8f6d
dump: replace unicode package with custom functions
To avoid a mismatch with the C composefs library that uses isgraph()
and isprint(), do not use the unicode package but provide the expected
definition for these functions.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-04 16:17:01 +02:00
Giuseppe Scrivano 4595fa2aab
chunked: fix deadlock by always consuming tar-split
always consume the tar-split data when present to avoid blocking the
producer. Previously, the tar-split data was only read when the digest
was specified.

commit 6875c9fbcf introduced the
regression.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-03 23:01:34 +02:00
Colin Walters d78daad6ca compression: Add some doc comments
I'm digging in more to zstd:chunked and I hope these comments are
useful.

Signed-off-by: Colin Walters <walters@verbum.org>
2024-06-03 14:54:32 -04:00
openshift-merge-bot[bot] 44daeaa690
Merge pull request #1937 from giuseppe/chunked-fine-tune-threshold
chunked: change auto merge threshold to 1024
2024-06-03 15:53:01 +00:00
Giuseppe Scrivano 617a808a63
chunked: change auto merge threshold to 1024
Increase the threshold for auto-merging parts from 128 to 1024. This change
aims to reduce the number of parts in an HTTP multi-range request, thus
increasing the likelihood that the server will accept the request.

The previous threshold of 128 often resulted in a large number of small
ranges, which could lead to HTTP multi-range requests being rejected by
servers due to the excessive number of parts.

It partially addresses the reported issue.

Reported-by: https://github.com/containers/storage/issues/1928

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-03 15:00:35 +02:00
Giuseppe Scrivano 6875c9fbcf
chunked: ignore the tar-split data if digest is empty
if a digest was not specified in the TOC, ignore completely the
tar-split data.

Otherwise the clients fail to pull images created before commit
b5413c2bd6.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-06-03 13:48:49 +02:00
Giuseppe Scrivano 6b1c044dc1
composefs: add parent directory if missing
it solves this error with mkcomposefs when the TOC doesn't specify the
parent directory for an entry:

Error: committing the finished image: failed to put layer using a partial pull: failed to convert json to erofs: exit status 1: mkcomposefs: Parent directory missing for /usr/share/locale/ca/LC_MESSAGES/libc.mo

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-05-28 11:52:30 +02:00
Miloslav Trmač 12554ee28c Stop also writing TarSplitChecksumKey
We are already not reading it, so simplify the code.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-05-14 17:12:57 +02:00
Miloslav Trmač 70b2454cde Improve error handling a bit
Include more details in the returned error text.

Don't continue in tests when we fail to obtain a TOC.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-05-14 10:53:03 +02:00
Miloslav Trmač b5413c2bd6 Move the tar-split digest value into the TOC
... so that we can uniquely identify partially-pulled layers
by the TOC digest.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-05-14 10:53:03 +02:00
Miloslav Trmač dfb4b1ff87 Unmarshal the TOC already in readZstdChunkedManifest
Other TOC formats don't fill the data in.

For now, this only increases memory usage, but we will
need the data soon.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-05-14 10:53:03 +02:00
Miloslav Trmač 8dd381ecf3 Refactor unmarshalTOC to use a switch
This is a microptimization, we call strings.ToLower only
once, but more importantly it will make it easier to add
more fields.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-05-14 10:53:03 +02:00
Giuseppe Scrivano 881107d836
chunked: skip cache file for non-partial layers
if the layer does not have a manifest TOC, just ignore it instead of
raising a warning.  There is no need to create a cache file since
there is no manifest file to parse.

Closes: https://github.com/containers/storage/issues/1909

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-24 17:18:33 +02:00
openshift-merge-bot[bot] 408479eae9
Merge pull request #1906 from giuseppe/downgrade-message-to-info
chunked: downgrade loading cache file msg to info
2024-04-23 20:20:24 +00:00
Giuseppe Scrivano e63e3002e1
chunked: downgrade loading cache file msg to info
it can happen for any reason, like for example using a new cache file
format, in this case the file is recreated with the last version.
This is internal only and should not be displayed by default.

Closes: https://github.com/containers/storage/issues/1905

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-23 17:36:18 +02:00
Giuseppe Scrivano 364ff49b7e
chunked: fix divide by zero in bloom filter
if the bloom filter size is zero, the "% size" operation fails with a
divide by zero.

Closes: https://github.com/containers/storage/issues/1903

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-23 13:54:30 +02:00
Miloslav Trmač ab5e27004b Shorten readZstdChunkedManifest a bit further
We have the ImageSourceChunk data type, and we already
construct these values, so scan into them directly instead
of having three separate variables for the two bits of data.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač 8eeec33011 Don't use ZstdChunkedFooterData in readZstdChunkedManifest
Replace it by individual variables.

Then formally deprecate the ChecksumAnnotationTarSplit field.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač 2c240ca3f9 Inline ReadFooterDataFromAnnotations into the only caller
Again, decrease the size of the compression code for c/image.

We will simplify this further immediately.

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač e7a3eae5cf Make ReadFooterDataFromBlob test-only
It has no non-test users any more, so decrease the
size of this package (relevant to non-c/storage
callers of c/image).

Should not change behavior.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
Miloslav Trmač 9fbd0e0395 Don't look for the binary digest when pulling layers
This code path is usually never triggered because
the annotations are present; and it was broken until recently.

Remove it to simplify the code and analysis.

Signed-off-by: Miloslav Trmač <mitr@redhat.com>
2024-04-22 23:56:40 +02:00
openshift-merge-bot[bot] 2431799327
Merge pull request #1893 from giuseppe/convert-zstd-chunked-store-original-tarsize
chunked: store original tar size for converted layers
2024-04-20 10:33:32 +00:00
openshift-merge-bot[bot] 9e2794bf6a
Merge pull request #1895 from giuseppe/fix-escaping-of-paths-with-spaces-composefs
chunked: fix escape of space
2024-04-20 10:28:03 +00:00
Giuseppe Scrivano 839beda40e
chunked: fix escape of space
the code was copied from the composefs C version:

	if (noescape_space)
		hex_escape = !isprint(c);
	else
		hex_escape = !isgraph(c);

but unicode.IsGraphic() seems to behave differently and includes the
space:

isgraph(' ') -> 0
unicode.IsGraphic(' ') -> true

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-19 21:43:10 +02:00
Giuseppe Scrivano 639f1a62f9
chunked: store original tar size for converted layers
if the layer was converted from an existing one, store the original
layer size.

Closes: https://github.com/containers/storage/issues/1892

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-19 21:30:37 +02:00
Giuseppe Scrivano 065a2f3321
chunked: bump version number for cache file
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-19 21:28:16 +02:00
Giuseppe Scrivano 9619a53b91
chunked: store file locations as binary
it reduces the cache file size by ~3%.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-19 21:28:16 +02:00
Giuseppe Scrivano 59ac03970d
chunked: store file names separately
so that the same file path is stored only once in the cache file.

After this change, the cache file measured on the fedora:{38,39,40}
images is in average ~6% smaller.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-19 21:28:16 +02:00
Giuseppe Scrivano e9a96e022d
chunked: use a bloom filter to speedup lookup
use a bloom filter to speed up lookup of digests in a cache file.

The biggest advantage is that it reduces page faults with the mmap'ed
cache file.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2024-04-19 21:28:16 +02:00