1. Move codespell config out of Makefile, simplify (remove unused
stuff).
2. Fix found issues (using codespell -w).
3. Add a codespell CI job.
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
readCacheFileFromMemory() returns nil, nil when the version
mismatches. Do not attempt to use the cache if it was not
loaded. Ignoring the layer will ensure that the cache will be
recreated with the correct version.
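A minimal sketch of the skip-on-nil handling described above. The
cacheFile type, currentVersion, readCache and loadLayers are
hypothetical stand-ins, not the actual code; only the (nil, nil)
contract is taken from this message.

```go
package main

import "fmt"

// cacheFile is a hypothetical stand-in for the parsed cache of one layer.
type cacheFile struct {
	version int
}

// currentVersion is a made-up version number used only for this sketch.
const currentVersion = 2

// readCache mimics the (nil, nil) contract: a version mismatch is not an
// error, it simply yields no cache.
func readCache(version int) (*cacheFile, error) {
	if version != currentVersion {
		return nil, nil
	}
	return &cacheFile{version: version}, nil
}

// loadLayers returns only the caches that were actually loaded.
func loadLayers(versions []int) ([]*cacheFile, error) {
	var loaded []*cacheFile
	for _, v := range versions {
		c, err := readCache(v)
		if err != nil {
			return nil, fmt.Errorf("read cache: %w", err)
		}
		if c == nil {
			// Not loaded: skip the layer so its cache can be recreated later.
			continue
		}
		loaded = append(loaded, c)
	}
	return loaded, nil
}
```

With versions {1, 2, 2}, loadLayers skips the first layer and returns
the other two.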
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
ignore the error if the layer is being deleted while we are processing
it without a lock on the store.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
the global singleton was never updated, causing the cache to always be
recreated for each layer.
It is not possible to hold the layersCache mutex for the entire load(),
because load() calls into some store APIs and findDigestInternal() is
already called while some store locks are held, which would cause a
deadlock.
Another benefit is that now only one goroutine can run load() at a
time, preventing multiple calls to load() from running in parallel and
doing the same work.
Closes: https://github.com/containers/storage/issues/2023
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
My IDE runs a linter by default, and these two show up.
For the file one, it's because `Fd()` returns `uintptr`
which is unsigned and can't be negative. IOW, a `File`
object should always be a valid opened fd.
Signed-off-by: Colin Walters <walters@verbum.org>
This is a micro-optimization: we call strings.ToLower only
once. More importantly, it will make it easier to add
more fields.
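A rough sketch of the pattern, lowercasing the field name once and
switching on the result. The entry type, its fields and setField are
hypothetical and only illustrate why adding a field becomes a single
extra case.

```go
package main

import "strings"

// entry is a hypothetical record with a couple of string fields.
type entry struct {
	Name string
	Type string
}

// setField lowercases the field name a single time and dispatches on it.
// Supporting another field only requires one more case.
func setField(e *entry, field, value string) {
	switch strings.ToLower(field) {
	case "name":
		e.Name = value
	case "type":
		e.Type = value
	}
}
```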
Signed-off-by: Miloslav Trmač <mitr@redhat.com>
if the layer does not have a manifest TOC, just ignore it instead of
raising a warning. There is no need to create a cache file since
there is no manifest file to parse.
Closes: https://github.com/containers/storage/issues/1909
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
it can happen for any reason, for example when a new cache file
format is used; in this case the file is recreated with the latest version.
This is internal only and should not be displayed by default.
Closes: https://github.com/containers/storage/issues/1905
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
so that the same file path is stored only once in the cache file.
After this change, the cache file measured on the fedora:{38,39,40}
images is on average ~6% smaller.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
use a bloom filter to speed up lookup of digests in a cache file.
The biggest advantage is that it reduces page faults with the mmap'ed
cache file.
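To illustrate the idea, here is a minimal bloom filter sketch. It is
not the on-disk format used by the cache: the bloom type, the FNV-based
double hashing and the sizes are assumptions chosen for brevity. A
negative answer is definitive, so the lookup can skip the mmap'ed data
entirely; a positive answer still needs the real lookup.

```go
package main

import "hash/fnv"

// bloom is a hypothetical fixed-size bloom filter (nbits must be > 0).
type bloom struct {
	bits []uint64
	k    uint32 // number of probe positions per item
}

func newBloom(nbits int, k uint32) *bloom {
	return &bloom{bits: make([]uint64, (nbits+63)/64), k: k}
}

// positions derives k bit positions from one 64-bit FNV-1a hash using
// double hashing (an assumption, not the real scheme).
func (b *bloom) positions(data []byte) []uint64 {
	h := fnv.New64a()
	h.Write(data)
	sum := h.Sum64()
	h1, h2 := sum&0xffffffff, sum>>32|1
	n := uint64(len(b.bits)) * 64
	out := make([]uint64, b.k)
	for i := uint64(0); i < uint64(b.k); i++ {
		out[i] = (h1 + i*h2) % n
	}
	return out
}

func (b *bloom) add(data []byte) {
	for _, p := range b.positions(data) {
		b.bits[p/64] |= 1 << (p % 64)
	}
}

// maybeContains returns false only if data was definitely never added.
func (b *bloom) maybeContains(data []byte) bool {
	for _, p := range b.positions(data) {
		if b.bits[p/64]&(1<<(p%64)) == 0 {
			return false
		}
	}
	return true
}
```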
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
The getString() function was used to extract string values, but it
doesn't handle escaped characters. Replace it with iter.ReadString(),
which is slower but handles escaped characters correctly.
Closes: https://github.com/containers/storage/issues/1878
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
reduce memory usage for the process by not loading the layers' cache
files entirely in memory.
The memory mapped files can be shared among multiple instances of
Podman, and they do not need to be fully loaded in memory.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
include the chunk length in the generated file location format.
This enhancement is designed to facilitate the use of the cache by external
tools which may not have knowledge of the chunk size.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
gofumpt is a superset of gofmt, enabling some more code formatting
rules.
This commit is brought to you by
gofumpt -w .
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
We now use the golang error wrapping format specifier `%w` instead of the
deprecated github.com/pkg/errors package.
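A small sketch of the pattern, assuming a hypothetical openCache helper:
wrapping with `%w` keeps the original error reachable through
errors.Is, using only the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
)

// openCache is a hypothetical helper that always fails, to show wrapping.
func openCache(path string) error {
	// %w records fs.ErrNotExist as the wrapped error.
	return fmt.Errorf("open cache %q: %w", path, fs.ErrNotExist)
}

// isMissing reports whether err wraps fs.ErrNotExist anywhere in its chain.
func isMissing(err error) bool {
	return errors.Is(err, fs.ErrNotExist)
}
```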
Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
if the layer cache doesn't already exist, automatically create it from
the layer TOC.
commit 10697a05a2 introduced this
regression.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
automatically detect holes in sparse files (the threshold is hardcoded
at 1kb for now) and add this information to the manifest file.
The receiver will create a hole (using unix.Seek and unix.Ftruncate)
instead of writing the actual zeros.
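A simplified sketch of the detection side, working on an in-memory
buffer. The hole type and findHoles are hypothetical, and the threshold
is assumed here to mean 1024 bytes; the receiver side (seeking and
truncating) is not shown.

```go
package main

// hole describes a run of zero bytes found in the payload (hypothetical type).
type hole struct {
	Offset int64
	Length int64
}

// holeThreshold assumes "1kb" means 1024 bytes.
const holeThreshold = 1024

// findHoles returns every run of zero bytes at least threshold bytes long.
func findHoles(data []byte, threshold int) []hole {
	var holes []hole
	start := -1 // start of the current zero run, or -1 if not inside one
	for i := 0; i <= len(data); i++ {
		if i < len(data) && data[i] == 0 {
			if start < 0 {
				start = i
			}
			continue
		}
		// A non-zero byte or the end of the buffer closes the current run.
		if start >= 0 {
			if i-start >= threshold {
				holes = append(holes, hole{Offset: int64(start), Length: int64(i - start)})
			}
			start = -1
		}
	}
	return holes
}
```

Runs shorter than the threshold are left as regular data, since
recording them would cost more than writing the zeros.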
Closes: https://github.com/containers/storage/issues/1091
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
avoid parsing each JSON TOC file for the layers in the local storage,
but attempt to create a lookaside cache in a custom format that is faster to
load (and potentially mmap'able).
The same cache is used to look up files, chunks and candidates for
deduplication with hard links.
There are 3 kinds of digests stored:
- digest(file.payload)
- digest(digest(file.payload) + file.UID + file.GID + file.mode + file.xattrs)
- digest(i) for each i in chunks(file payload)
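The three kinds of digests can be sketched as follows. This is only an
illustration: the file type, the use of SHA-256, the way the metadata is
serialized and the fixed-size chunking are all assumptions, not the
format used by the cache.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// file is a hypothetical in-memory view of one TOC entry.
type file struct {
	Payload  []byte
	UID, GID int
	Mode     uint32
	Xattrs   map[string]string
}

// digest returns the hex SHA-256 of b (the algorithm is an assumption).
func digest(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// metadataDigest covers the payload digest plus ownership, mode and xattrs,
// so only files identical in all of them match (hard link candidates).
func metadataDigest(f file) string {
	keys := make([]string, 0, len(f.Xattrs))
	for k := range f.Xattrs {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map order is random; sort for a stable digest
	s := fmt.Sprintf("%s:%d:%d:%o", digest(f.Payload), f.UID, f.GID, f.Mode)
	for _, k := range keys {
		s += fmt.Sprintf(":%s=%s", k, f.Xattrs[k])
	}
	return digest([]byte(s))
}

// chunkDigests digests each fixed-size chunk of the payload (size must be > 0).
func chunkDigests(payload []byte, size int) []string {
	var out []string
	for off := 0; off < len(payload); off += size {
		end := off + size
		if end > len(payload) {
			end = len(payload)
		}
		out = append(out, digest(payload[off:end]))
	}
	return out
}
```

Two files with the same payload but different owners share the first
digest and differ in the second, so they can share data but are not hard
link candidates.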
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
try to reuse an existing cache object, instead of creating it for
every layer.
Set a time limit on how long it can be reused, in order to clean up stale
references.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>