Commit Graph

50 Commits

Author SHA1 Message Date
Nalin Dahyabhai 0183a293dc Lock the mounts list with its own lockfile
Separate loading and saving the mountpoints.json table out of the main
layer load/save paths so that they can be called independently, so that
we can mount and unmount layers (which requires that we update that
information) when the layer list itself may only be held with a read
lock.

The new loadMounts() and saveMounts() methods need to be called only for
read-write layer stores.  Callers that just refer to the mount
information can take a read lock on the mounts information, but callers
that modify the mount information need to acquire a write lock.

Break the unwritten "stores don't manage their own locks" rule and have
the layer store handle managing the lock for the mountpoints list, with
the understanding that the layer store's lock will always have been
acquired before we try to take the mounts lock.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2019-02-26 14:19:53 -05:00
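
The lock ordering described in 0183a293dc can be sketched roughly as follows. This is a minimal illustration, not the store's actual code: the `layerStore` type, its fields, and the `mount`/`mounted` methods are hypothetical, and in-process `sync.RWMutex` values stand in for the lockfiles.

```go
package main

import (
	"fmt"
	"sync"
)

// layerStore is a hypothetical stand-in for the real layer store.
// In-process mutexes replace the lockfiles that the commit describes.
type layerStore struct {
	layersLock sync.RWMutex // guards the layer list
	mountsLock sync.RWMutex // guards only the mount information
	layers     []string
	mountCount map[string]int
}

// mount modifies mount information while holding only a read lock on the
// layer list. Lock order: the store's lock first, then the mounts lock.
func (s *layerStore) mount(id string) (int, error) {
	s.layersLock.RLock()
	defer s.layersLock.RUnlock()
	known := false
	for _, l := range s.layers {
		if l == id {
			known = true
			break
		}
	}
	if !known {
		return 0, fmt.Errorf("layer %q not known", id)
	}
	s.mountsLock.Lock() // modifying mount information needs the write lock
	defer s.mountsLock.Unlock()
	s.mountCount[id]++
	return s.mountCount[id], nil
}

// mounted only refers to the mount information, so read locks suffice.
func (s *layerStore) mounted(id string) int {
	s.layersLock.RLock()
	defer s.layersLock.RUnlock()
	s.mountsLock.RLock()
	defer s.mountsLock.RUnlock()
	return s.mountCount[id]
}

func main() {
	s := &layerStore{layers: []string{"base"}, mountCount: map[string]int{}}
	n, err := s.mount("base")
	fmt.Println(n, err, s.mounted("base"))
}
```

Always taking the two locks in the same order is what keeps this arrangement free of lock-order deadlocks.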
Nalin Dahyabhai 45c05928c4 Update a comment
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2019-02-26 14:19:53 -05:00
Nalin Dahyabhai 1194eb9848 layers/images: don't try to clean up with just a read-only lock
Don't attempt to remove conflicting names or finish layer cleanups if we
only have a read-only lock on layer or image stores, since doing either
means we'd have to modify the list of layers or images, and our lock
that we've obtained doesn't allow us to do that.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2019-02-26 14:19:53 -05:00
Nalin Dahyabhai 45b0aa27aa Locker.Locked(): clarify that we're checking for write locks
Clarify that Locker.Locked() checks if we have a write lock, since
that's what we care about whenever we check it.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2019-02-26 14:19:50 -05:00
Valentin Rothberg f58686dcce lockfile: implement reader-writer locks
Implement reader-writer locks to allow multiple readers to hold
the lock in parallel.

* The locks are still based on fcntl(2).

* Changing the lock from a reader to a writer and vice versa will block
  on the syscall.

* A writer lock can be held only by one process.  To protect against
  concurrent accesses by goroutines within the same process space, use a
  writer mutex.

* Extend the Locker interface with the `RLock()` method to acquire a
  reader lock.  If the lock is set to be read-only, all calls to
  `Lock()` will be redirected to `RLock()`.  A reader lock is only
  released via fcntl(2) when all goroutines within the same process space
  have unlocked it.  This is done via an internal counter which is
  protected (among other things) by an internal state mutex.

* Panic on violations of the lock protocol, namely when calling
  `Unlock()` on an unlocked lock.  This helps detect violations in
  the code but also protects the storage from corruption.  Doing this
  has revealed some bugs fixed in earlier commits.

Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
2019-02-15 09:49:44 +01:00
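
The reader counting and panic-on-misuse behavior described in f58686dcce might look roughly like the sketch below. It is only a model under stated assumptions: the `countingLock` type is hypothetical, and the `acquire`/`release` function fields stand in for the fcntl(2) calls so that the example runs anywhere. The read-only redirection of `Lock()` to `RLock()` is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// countingLock is a hypothetical model of the bookkeeping described above.
// The acquire and release hooks stand in for the fcntl(2) calls.
type countingLock struct {
	stateMutex sync.Mutex   // protects readers and writer
	rwMutex    sync.RWMutex // excludes writers from other goroutines in this process
	readers    int
	writer     bool
	acquire    func(write bool)
	release    func()
}

// RLock takes a reader lock; only the first reader acquires the file lock.
func (l *countingLock) RLock() {
	l.rwMutex.RLock()
	l.stateMutex.Lock()
	defer l.stateMutex.Unlock()
	if l.readers == 0 {
		l.acquire(false)
	}
	l.readers++
}

// Lock takes the writer lock, which only one goroutine can hold.
func (l *countingLock) Lock() {
	l.rwMutex.Lock()
	l.stateMutex.Lock()
	defer l.stateMutex.Unlock()
	l.acquire(true)
	l.writer = true
}

// Unlock releases the file lock once the last holder is gone, and panics
// if the lock is not held at all.
func (l *countingLock) Unlock() {
	l.stateMutex.Lock()
	defer l.stateMutex.Unlock()
	switch {
	case l.writer:
		l.writer = false
		l.release()
		l.rwMutex.Unlock()
	case l.readers > 0:
		l.readers--
		if l.readers == 0 {
			l.release()
		}
		l.rwMutex.RUnlock()
	default:
		panic("Unlock() called on an unlocked lock")
	}
}

func main() {
	acquired, released := 0, 0
	l := &countingLock{
		acquire: func(bool) { acquired++ },
		release: func() { released++ },
	}
	l.RLock()
	l.RLock()
	l.Unlock()
	l.Unlock()
	fmt.Println(acquired, released) // prints "1 1"
}
```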
Nalin Dahyabhai c073b43547 Add a CreateFromTemplate() method to drivers, and use it for mapped layers
Add a CreateFromTemplate() method to graph drivers, and use it instead
of a driver-oblivious diff/put method when we want to create a copy of
an image's top layer that has the same parent and which differs from the
original only in its ID maps.

This lets drivers that can quickly make an independent layer based on
another layer do something smarter than we were doing with the
driver-oblivious method.  For some drivers, a native method is
dramatically faster.

Note that the driver needs to be able to do this while still exposing
just one notional layer (i.e., one link in the chain of layers for a
given container) to the higher levels of the APIs, so if the new layer
is actually a child of the template layer, that needs to remain a detail
that's private to the driver.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2019-01-17 14:28:40 -05:00
Giuseppe Scrivano 6ef3b9dafa
layers: use the package name pgzip instead of gzip
Based on comments received on the related PR for containers/image.

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-12-17 12:29:29 +01:00
Giuseppe Scrivano fe775d42b0
vendor: use github.com/klauspost/pgzip instead of compress/gzip
In my tests, I saw a net improvement of around 30% in wall-clock
time when decompressing layers.

These additional packages will need to be re-vendored:

github.com/klauspost/pgzip v1.2.1
github.com/klauspost/compress v1.4.1
github.com/klauspost/cpuid v1.2.0

Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-12-13 16:14:54 +01:00
Šimon Lukašík 531d6cc6d2 Remove unused variable: compressionFlag
Unused since 1f7a56c869

Signed-off-by: Šimon Lukašík <isimluk@fedoraproject.org>
2018-11-28 21:15:07 +01:00
Antonio Murdaca f810101436
layers: do not try to unmount if not mounted at all on delete
This check was wrongly removed in #198.
The check must stay because it is now part of the Stop/Delete API, so reintroduce it.
This fixes #233 and the associated CRI-O issues.
This PR and kubernetes-sigs/cri-o#1910 together fully fix the issue.
I'm going to revendor c/storage in CRI-O to fully fix CRI-O after this is merged.

Signed-off-by: Antonio Murdaca <runcom@linux.com>
2018-11-15 15:50:01 +01:00
Antonio Murdaca 813e37854f
layers: return the in-use layer on ErrDuplicateID
We've seen a panic on Azure with CRI-O/OCP:

Nov 08 17:52:58 master-000002 crio[5779]: panic: runtime error: invalid memory address or nil pointer dereference
Nov 08 17:52:58 master-000002 crio[5779]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x55cec3a16669]
Nov 08 17:52:58 master-000002 crio[5779]: goroutine 127 [running]:
Nov 08 17:52:58 master-000002 crio[5779]: panic(0x55cec467fda0, 0x55cec52cba20)
Nov 08 17:52:58 master-000002 crio[5779]: /opt/rh/go-toolset-7/root/usr/lib/go-toolset-7-golang/src/runtime/panic.go:551 +0x3c5 fp=0xc4206e17f0 sp=0xc4206e1750 pc=0x55cec2f47685
Nov 08 17:52:58 master-000002 crio[5779]: runtime.panicmem()
Nov 08 17:52:58 master-000002 crio[5779]: /opt/rh/go-toolset-7/root/usr/lib/go-toolset-7-golang/src/runtime/panic.go:63 +0x60 fp=0xc4206e1810 sp=0xc4206e17f0 pc=0x55cec2f46520
Nov 08 17:52:58 master-000002 crio[5779]: runtime.sigpanic()
Nov 08 17:52:58 master-000002 crio[5779]: /opt/rh/go-toolset-7/root/usr/lib/go-toolset-7-golang/src/runtime/signal_unix.go:388 +0x17e fp=0xc4206e1860 sp=0xc4206e1810 pc=0x55cec2f5d7fe
Nov 08 17:52:58 master-000002 crio[5779]: github.com/kubernetes-sigs/cri-o/vendor/github.com/containers/image/storage.(*storageImageDestination).Commit(0xc420556540, 0x55cec48b7fe0, 0xc4200ac048, 0x0, 0x0)
Nov 08 17:52:58 master-000002 crio[5779]: /builddir/build/BUILD/cri-o-71cc46544a8d31229c4ef2b88b42485f4d997c03/_output/src/github.com/kubernetes-sigs/cri-o/vendor/github.com/containers/image/storage/storage_imag

That nil pointer dereference happens in the containers/image storage
Commit(): it ignores ErrDuplicateID, but then goes on to use the
returned layer object, which is nil.

This commit fixes the panic above by returning the in-use layer even on
error, so containers/image won't panic.

I'll vendor this in c/image once merged and then in CRI-O.

Signed-off-by: Antonio Murdaca <runcom@linux.com>
2018-11-09 16:14:47 +01:00
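
The fix in 813e37854f amounts to returning a usable value alongside the error. A minimal sketch of that pattern, assuming a hypothetical `store` type, `create` method, and `errDuplicateID` sentinel (not the real c/storage signatures):

```go
package main

import (
	"errors"
	"fmt"
)

// errDuplicateID is a hypothetical stand-in for the store's ErrDuplicateID.
var errDuplicateID = errors.New("that ID is already in use")

type layer struct{ ID string }

type store struct{ byID map[string]*layer }

// create returns the layer that already uses the ID together with
// errDuplicateID, so callers that ignore that error still get a non-nil layer.
func (s *store) create(id string) (*layer, error) {
	if existing, ok := s.byID[id]; ok {
		return existing, errDuplicateID
	}
	l := &layer{ID: id}
	s.byID[id] = l
	return l, nil
}

func main() {
	s := &store{byID: map[string]*layer{}}
	_, _ = s.create("abc")
	l, err := s.create("abc")
	if err != nil && !errors.Is(err, errDuplicateID) {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println(l.ID) // safe: l is non-nil even for the duplicate
}
```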
Daniel J Walsh 24f0de4570
Start to store SELinux labels in layer store
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-10-18 16:16:30 -04:00
Daniel J Walsh b6ccc0acfa
Add MountOpts to stop adding fields to Get Interface
This patch adds a MountOpts field to the drivers so we can simplify
the interface to Get and allow additional options to be passed in the future.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-10-05 09:23:46 -04:00
Nalin Dahyabhai 2805a4374f layerStore.Put(): always check for Create() errors
If we needed to try to update the ID mappings on a just-created layer,
we were inadvertently failing to check that the layer had been
successfully created.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2018-09-21 16:05:28 -04:00
Zac Medico c7ba5749d4
Add lock sanity checks to Save() methods
I have experienced "layer not known" corruption triggered by concurrent
buildah/skopeo processes, and hopefully lock sanity checks will help to
prevent this kind of problem.

Signed-off-by: Zac Medico <zmedico@gmail.com>
2018-08-24 20:31:47 -07:00
Giuseppe Scrivano 3f55e5a3ac
shifting: raise an error if the container needs shifting
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-07-27 19:59:23 +02:00
Giuseppe Scrivano 1897396330
drivers: inform Mount of the mappings used by the container
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
2018-07-26 06:12:42 +02:00
Daniel J Walsh 1538971882
Change Mounted to return the number of times mounted
podman unmount wants to know whether the image is mounted only once,
so that it can refuse to unmount when the container state expects it to be mounted.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-07-17 16:27:27 -04:00
Daniel J Walsh 1075a73cac
Modify storage to allow callers to determine if a mount point is mounted
Add force to umount to force the umount of a container image
Add an interface to indicate whether or not the layer is mounted
Add a boolean return from unmount to indicate when the layer is really unmounted

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-07-17 14:00:15 -04:00
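
Taken together, 1538971882 and 1075a73cac describe a mount count, a forced unmount, and a boolean that says whether the layer is really unmounted. One way to picture that bookkeeping, using a hypothetical `mountTable` type rather than the store's real API:

```go
package main

import "fmt"

// mountTable is a hypothetical model of per-layer mount counts.
type mountTable struct{ counts map[string]int }

// mount records one more mount and returns the new count.
func (m *mountTable) mount(id string) int {
	m.counts[id]++
	return m.counts[id]
}

// mounted returns how many times the layer is mounted (0 if not mounted).
func (m *mountTable) mounted(id string) int { return m.counts[id] }

// unmount drops one reference, or all of them when force is set. It
// returns true only when the layer is really unmounted afterwards.
func (m *mountTable) unmount(id string, force bool) bool {
	if force {
		delete(m.counts, id)
		return true
	}
	if m.counts[id] > 1 {
		m.counts[id]--
		return false
	}
	delete(m.counts, id)
	return true
}

func main() {
	m := &mountTable{counts: map[string]int{}}
	m.mount("abc")
	m.mount("abc")
	fmt.Println(m.mounted("abc"), m.unmount("abc", false), m.unmount("abc", false))
}
```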
Nalin Dahyabhai 16e7f5fb57 Only try to copy new layers/images/containers if we create them
When creating new Layers, Images, or Containers, only try to copy the
newly-created results if we actually created them.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2018-05-07 10:35:28 -04:00
Nalin Dahyabhai aefafeeb85 Add LayerParentOwners()/ContainerParentOwners()
Add store methods for finding the list of UIDs and GIDs which probably
need to be mapped if a given layer, or a container's layer, is going to
be used for a container that is run with the configured ID mappings in a
separate user namespace.  The layer has to have been mounted at least
once in order for us to know where it goes.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2018-04-03 10:34:32 -04:00
Nalin Dahyabhai 45b4b9dd4d LayerStore.Create/CreateWithFlags/Put: tweak order of arguments
Tweak the order of arguments to LayerStore.Create()/CreateWithFlags()/Put()
so that the moreOptions struct is directly after the options map.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2018-04-03 10:34:32 -04:00
Nalin Dahyabhai 97326e1d2f Support for per-container uid/gid mapping: lower
Expose reading and writing ID mapping in the archive and chrootarchive
packages, and in the driver interface.  Generally this means that
when computing or applying diffs, we need to have ID mappings passed in
that are specific to the layers we're using.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2018-04-03 10:34:32 -04:00
Nalin Dahyabhai b71d4c4197 Support for per-container uid/gid mapping: upper
Add support to the Store objects for per-container UID/GID mapping.
* UID and GID maps can be specified when creating layers and containers.
* If mapping options are specified when creating a container, those
  options are used for creating the layer which we create for the
  container and recorded with the container for convenience.
* A layer defaults to using the ID mapping configured for its parent, or
  to the default which was used to initialize the Store object if it has
  no parent.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2018-04-03 10:34:32 -04:00
Nalin Dahyabhai 363b02079c Always return deep-copied layer/image/container info
Always copy slices and maps in Layer, Image, and Container structures
before handing them back to callers so that, even if they modify them
directly, they won't accidentally mess with our in-memory copies of
those fields in the copies of the structures that we're using.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2018-03-07 17:07:38 -05:00
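
The deep copy described in 363b02079c can be illustrated with a simplified, hypothetical `Layer` structure (the real one has more fields). The point is that slices and maps are duplicated rather than shared:

```go
package main

import "fmt"

// Layer is a simplified, hypothetical version of the store's structure.
type Layer struct {
	ID    string
	Names []string
	Flags map[string]interface{}
}

// copyLayer duplicates the slice and the map so that callers cannot
// modify the store's in-memory copy through the returned value.
// Values stored inside Flags are copied shallowly in this sketch.
func copyLayer(l *Layer) *Layer {
	names := make([]string, len(l.Names))
	copy(names, l.Names)
	flags := make(map[string]interface{}, len(l.Flags))
	for k, v := range l.Flags {
		flags[k] = v
	}
	return &Layer{ID: l.ID, Names: names, Flags: flags}
}

func main() {
	internal := &Layer{ID: "abc", Names: []string{"base"}, Flags: map[string]interface{}{}}
	returned := copyLayer(internal)
	returned.Names[0] = "changed"
	returned.Flags["incomplete"] = true
	fmt.Println(internal.Names[0], len(internal.Flags)) // prints "base 0"
}
```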
Daniel J Walsh 5a785c73f4 Pass MountLabel down to diff drivers
Currently when we do a commit, we are mounting the container without using
the mountlabel.  In certain situations we can leak mount points where the
image is already mounted with a label.  If you then attempt to commit the
image, the kernel will attempt to mount the image without a label.  The
kernel will reject this mount since SELinux does not allow the same image
to be mounted with different labels.

Passing the label down to the diff drivers fixes this issue.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2018-02-06 13:42:25 -05:00
Daniel J Walsh 232429cdea Merge pull request #107 from nalind/preallocate-slices
Preallocate some slices that we build up
2017-09-30 07:07:48 -04:00
Nalin Dahyabhai 2f258f168e Initialize Flags and BigDataSizes maps
When we read items from disk, if maps in the structures are empty, they
won't be allocated as part of the decoding process.  When we
subsequently go to read or write something from such a map, make sure
it's been initialized.

Add some validation of names that we convert to file names, and of
digest values, so that we can be more precise about the error code we
return when there's a problem with the values.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-09-29 17:58:47 -04:00
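
The problem addressed by 2f258f168e is easy to reproduce: a map field that is absent from the JSON stays nil after decoding, and writing to a nil map panics. The sketch below initializes the map right after loading, which is one possible place to do it; the `Image` type and its JSON tags are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Image is a simplified, hypothetical structure; the JSON tags are assumptions.
type Image struct {
	ID           string           `json:"id"`
	BigDataSizes map[string]int64 `json:"big-data-sizes,omitempty"`
}

// load decodes an image record and makes sure its map is usable.
func load(data []byte) (*Image, error) {
	var img Image
	if err := json.Unmarshal(data, &img); err != nil {
		return nil, err
	}
	if img.BigDataSizes == nil {
		// The decoder leaves the field nil when the key is absent.
		img.BigDataSizes = make(map[string]int64)
	}
	return &img, nil
}

func main() {
	img, err := load([]byte(`{"id":"abc"}`))
	if err != nil {
		fmt.Println(err)
		return
	}
	img.BigDataSizes["manifest"] = 42 // would panic on a nil map
	fmt.Println(img.BigDataSizes["manifest"])
}
```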
Nalin Dahyabhai 1289ff09a7 Preallocate some slices that we build up
Take a guess at the final size of some slices that we build up item by
item, and try to allocate enough capacity for them before starting to
build them.  It's probably not a big speedup, though.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-09-29 15:00:59 -04:00
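
Preallocation as described in 1289ff09a7 usually means passing a capacity to `make` before appending. A small sketch with a hypothetical `layerIDs` helper:

```go
package main

import "fmt"

// layerIDs is a hypothetical helper that builds a slice item by item.
// The capacity guess equals the number of map entries, so append never
// has to grow the backing array.
func layerIDs(layers map[string]int) []string {
	ids := make([]string, 0, len(layers))
	for id := range layers {
		ids = append(ids, id)
	}
	return ids
}

func main() {
	ids := layerIDs(map[string]int{"a": 1, "b": 2})
	fmt.Println(len(ids), cap(ids))
}
```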
Nalin Dahyabhai a546c6d7a4 Also dedupe layer/image/container names at create
We already deduplicated names in Store.SetNames(), but we weren't also
doing that when creating layers, images, and containers, or in the
individual store SetNames() methods.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-09-29 15:00:09 -04:00
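
Deduplicating names, as in a546c6d7a4, can be done in one pass while keeping the first occurrence of each name. The `dedupeNames` function below is an illustrative sketch, not necessarily the helper the store uses:

```go
package main

import "fmt"

// dedupeNames returns names with duplicates removed, preserving the
// order in which each name first appears.
func dedupeNames(names []string) []string {
	seen := make(map[string]bool, len(names))
	deduped := make([]string, 0, len(names))
	for _, name := range names {
		if !seen[name] {
			seen[name] = true
			deduped = append(deduped, name)
		}
	}
	return deduped
}

func main() {
	fmt.Println(dedupeNames([]string{"web", "db", "web"})) // prints "[web db]"
}
```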
Daniel J Walsh 04c9124148 Start using drivers.Options for passing data to graphdrivers
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2017-09-19 21:15:29 +00:00
Daniel J Walsh f39066fe1b Update packages to match latest code in moby/pkg
Had to vendor in a new version of golang.org/x/net to build
Also had to make some changes to drivers to handle
archive.Reader -> io.Reader
archive.Archive -> io.ReadCloser

Also update .gitignore to ignore emacs files, containers-storage.*
and generated man pages.

Also stop testing against golang 1.7 in Travis; cri-o and moby have
done the same.

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
2017-09-12 18:00:29 +00:00
Nalin Dahyabhai 6986edce00 Create errors using "errors"
Use the standard library's "errors" package to create errors so that
backtraces in wrapped errors terminate at the point where the error was
first wrapped, and not at the line where we created the error, which
isn't as useful for troubleshooting.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-09-01 17:04:56 -04:00
Nalin Dahyabhai 0e21827111 Fix handling of DiffOptions.Compression in Diff()
Properly heed the DiffOptions.Compression value when generating a layer
diff between a layer and its parent, when there's no tarsplit data.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-07-10 11:41:34 -04:00
Nalin Dahyabhai 4bbb989ca8 Correct a comment: compression is not encryption
Correct a reference to "encryption" in a comment that should instead be
referring to "compression".

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-07-07 17:19:49 -04:00
Nalin Dahyabhai 1f7a56c869 Cache the digests, sizes, and compression type
Cache the digests and sizes of a diff, both compressed and uncompressed,
along with the type of compression detected for it, that's supplied to
ApplyDiff() or Put() in the layer structure, and add methods to find a
list of layers that match one or the other digest.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-06-20 12:30:28 -04:00
Nalin Dahyabhai f2de2a43ed Remove some unused values
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-06-19 11:57:19 -04:00
Daniel J Walsh d46b0bfb28 Merge pull request #65 from nalind/diffoptions
Make Diff() methods take an optional *DiffOptions
2017-06-19 11:50:54 -04:00
Daniel J Walsh 8358b1ea98 Merge pull request #66 from runcom/memory-hungry-not
layers|containers: do not allocate slices at every delete
2017-06-19 11:50:32 -04:00
Antonio Murdaca 10844b724d
layers|containers: do not allocate slices at every delete
When deleting a layer or a container, the code was always allocating a
new slice just to remove an element from the original slice.
Profiling cri-o with c/storage showed that doing it at every delete is
pretty expensive:

```
         .          .    309:   newContainers := []Container{}
         .          .    310:   for _, candidate := range r.containers {
         .          .    311:           if candidate.ID != id {
  528.17kB   528.17kB    312:                   newContainers = append(newContainers, candidate)
         .          .    313:           }
         .          .    314:   }

         .          .    552:           newLayers := []Layer{}
         .          .    553:           for _, candidate := range r.layers {
         .          .    554:                   if candidate.ID != id {
    1.03MB     1.03MB    555:                           newLayers = append(newLayers, candidate)
         .          .    556:                   }
         .          .    557:           }
         .          .    558:           r.layers = newLayers
```

This patch just filters the element to remove out of the original
slice without allocating a new slice. After this patch, the profiler no
longer shows this memory overhead.

Signed-off-by: Antonio Murdaca <runcom@redhat.com>
2017-06-19 17:08:40 +02:00
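
Filtering a slice without allocating, as 10844b724d describes, is commonly written in Go by reusing the original backing array. The sketch below shows that idiom with a hypothetical `removeByID` helper; it is not necessarily the exact code of the patch.

```go
package main

import "fmt"

// Container is a simplified, hypothetical structure.
type Container struct{ ID string }

// removeByID filters in place: kept shares the backing array of
// containers, so no new slice is allocated. The caller must use the
// returned slice and stop using the original one.
func removeByID(containers []Container, id string) []Container {
	kept := containers[:0]
	for _, c := range containers {
		if c.ID != id {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	containers := []Container{{ID: "a"}, {ID: "b"}, {ID: "c"}}
	containers = removeByID(containers, "b")
	fmt.Println(containers) // prints "[{a} {c}]"
}
```

The write index never overtakes the read index, which is why appending to `containers[:0]` while ranging over `containers` is safe here.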
Nalin Dahyabhai fb0b0e7cfe Make Diff() methods take an optional *DiffOptions
Add an optional *DiffOptions parameter to Diff() methods (which can be
nil), to allow overriding of default behaviors.

At this time, that's just what type of compression is applied, if we
want something other than what was recorded when the diff was applied,
but we can add more later if needed.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-06-16 10:50:08 -04:00
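
An optional options pointer such as the one fb0b0e7cfe introduces is typically handled with a nil check that falls back to the recorded default. The types and the `effectiveCompression` function below are hypothetical and only illustrate that pattern:

```go
package main

import "fmt"

// Compression and DiffOptions are simplified, hypothetical types.
type Compression int

const (
	Uncompressed Compression = iota
	Gzip
)

type DiffOptions struct {
	// Compression, if set, overrides the recorded compression type.
	Compression *Compression
}

// effectiveCompression picks the override when one is given, and the
// value recorded when the diff was applied otherwise. options may be nil.
func effectiveCompression(recorded Compression, options *DiffOptions) Compression {
	if options != nil && options.Compression != nil {
		return *options.Compression
	}
	return recorded
}

func main() {
	gz := Gzip
	fmt.Println(
		effectiveCompression(Uncompressed, nil),
		effectiveCompression(Uncompressed, &DiffOptions{Compression: &gz}),
	) // prints "0 1"
}
```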
Nalin Dahyabhai 0200465c0b Track creation dates for layers/images/containers
Add a Created field to Layer, Image, and Container structures that we
initialize when creating one of them.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-06-16 10:12:55 -04:00
Nalin Dahyabhai a9b1fe6241 Add read-only layer/image/container stores
Implement read-only versions of layer and image store interfaces which
allocate read-only locks and which return errors whenever a write
function is called (which should only be possible after a type
assertion, since they're not part of the read-only interfaces).

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-06-12 16:31:19 -04:00
Nalin Dahyabhai 198f752fb5 Split layer and image stores into RO and RW kinds
Split the LayerStore and ImageStore interfaces into read-only and
write-only subset interfaces, and make the proper stores into unions of
the read-only and write-only method sets.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-06-12 10:44:36 -04:00
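
The interface split in 198f752fb5, together with the read-only stores from a9b1fe6241, can be pictured with Go interface embedding. All names below are hypothetical and the method sets are reduced to one method each:

```go
package main

import (
	"errors"
	"fmt"
)

// errStoreIsReadOnly is a hypothetical error for write attempts.
var errStoreIsReadOnly = errors.New("store is read-only")

// Hypothetical, heavily reduced method sets.
type roLayerStore interface{ Exists(id string) bool }
type rwLayerStore interface{ Delete(id string) error }

// layerStore is the union of the read-only and write method sets.
type layerStore interface {
	roLayerStore
	rwLayerStore
}

type store struct {
	readOnly bool
	layers   map[string]bool
}

var _ layerStore = (*store)(nil) // compile-time check of the union

func (s *store) Exists(id string) bool { return s.layers[id] }

// Delete returns an error when the store was opened read-only.
func (s *store) Delete(id string) error {
	if s.readOnly {
		return errStoreIsReadOnly
	}
	delete(s.layers, id)
	return nil
}

func main() {
	var ro roLayerStore = &store{readOnly: true, layers: map[string]bool{"abc": true}}
	fmt.Println(ro.Exists("abc"))
	// Write methods are reachable only through a type assertion.
	if rw, ok := ro.(rwLayerStore); ok {
		fmt.Println(rw.Delete("abc"))
	}
}
```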
Dan Walsh 162e676330 Add read-only locks
We need to be able to acquire locks on storage areas which aren't
mounted read-write, which return errors when we attempt to open a file
in the mode where we can take write locks on them.  This patch adds a
read-only lock type for use in those cases.

A given file can be opened for read-locking or write-locking, but not
both.  Our Locker interface gains an IsReadWrite() method to let callers
tell the difference.

Based on patches by Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-06-12 10:40:39 -04:00
Dan Walsh 35dc8dbc67 Unnecessary Touch functions.
We don't need these Touch calls, since the Save function will handle it.

Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2017-06-07 16:58:11 -04:00
Dan Walsh c496eac6c3 Only touch when images, containers, layers save function called
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2017-06-02 16:20:00 -04:00
Nalin Dahyabhai 671721e961 Fix consistency errors after adding/removing items
Fix consistency errors we'd hit after creating or deleting a layer,
image, or container, by replacing the slice of items in their respective
stores with a slice of pointers to items, so that pointers in name- and
ID-based indexes don't become invalid when the slice is resized.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
2017-06-02 13:22:47 -04:00
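
The consistency problem fixed by 671721e961 comes from taking the address of a slice element and then growing the slice. The sketch below reproduces it with a hypothetical `Layer` type and shows why a slice of pointers avoids it:

```go
package main

import "fmt"

// Layer is a simplified, hypothetical structure.
type Layer struct {
	ID    string
	Names []string
}

// staleWithValues indexes into a slice of values. Appending reallocates
// the backing array, so the index keeps pointing at the old element.
func staleWithValues() bool {
	layers := []Layer{{ID: "a"}}
	byID := map[string]*Layer{"a": &layers[0]}
	layers = append(layers, Layer{ID: "b"}) // capacity was 1: must reallocate
	layers[0].Names = []string{"renamed"}
	return len(byID["a"].Names) == 0 // the index never sees the update
}

// freshWithPointers stores pointers, so resizing the slice copies only
// the pointers and the index still refers to the same item.
func freshWithPointers() bool {
	layers := []*Layer{{ID: "a"}}
	byID := map[string]*Layer{"a": layers[0]}
	layers = append(layers, &Layer{ID: "b"})
	layers[0].Names = []string{"renamed"}
	return len(byID["a"].Names) == 1
}

func main() {
	fmt.Println(staleWithValues(), freshWithPointers()) // prints "true true"
}
```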
Dan Walsh 6160e9213e Remove parentlayers lookup.
We don't do anything with these variables, and they break additionalstores
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2017-05-16 17:44:36 -04:00
Dan Walsh 5531c8da65 Move storage/storage go objects to storage.
There is no reason for the extra directory level.

Also fixup some go lint issues
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
2017-05-16 17:25:11 -04:00