This change include filter `name` and `driver`,
and also update related docs to reflect that filters usage.
Closes: #21243
Signed-off-by: Kai Qiang Wu(Kennan) <wkqwu@cn.ibm.com>
Following a journal log almost always requires a descriptor to be
allocated. In cases where we're running out of descriptors, this means
we might get stuck while attempting to start following the journal, at a
point where it's too late to report it to the client and clean up
easily. The journal reading context will cache the value once it's
allocated, so here we move the check earlier, so that we can detect a
problem when we can still report it cleanly.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
When we set up to start following a journal, if we get error results
from sd_journal_get_fd() or sd_journal_get_events() that prevent us from
following the journal, report the error instead of just mysteriously
failing.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.
In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Instead of implementing refcounts at each graphdriver, implement this in
the layer package which is what the engine actually interacts with now.
This means interacting directly with the graphdriver is no longer
explicitly safe with regard to Get/Put calls being refcounted.
In addition, with the containerd, layers may still be mounted after
a daemon restart since we will no longer explicitly kill containers when
we shutdown or startup engine.
Because of this ref counts would need to be repopulated.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Now that the namespace sharing code via runc is vendored with the
containerd changes, we can disable the restrictions on container to
container net and IPC namespace sharing when the daemon has user
namespaces enabled.
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
This allows a user to specify explicitly to enable
automatic copying of data from the container path to the volume path.
This does not change the default behavior of automatically copying, but
does allow a user to disable it at runtime.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This adds support for the passthrough on build, push, login, and search.
Revamp the integration test to cover these cases and make it more
robust.
Use backticks instead of quoted strings for backslash-heavy string
contstands.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
This cleans up some of the use of the filepoller which makes reading
significantly more robust and gives fewer changes to fallback to the
polling based watcher.
In a lot of cases, if the file was being rotated while we were adding it
to the watcher, it would return an error that the file doesn't exist and
would fallback.
In some cases this fallback could be triggered multiple times even if we
were already on the fallback/poll-based watcher.
It also fixes an open file leak caused by not closing files properly on
rotate, as well as not closing files that were read via the `tail`
function until after the log reader is completed.
Prior to the above changes, it was relatively simple to cause the log
reader to error out by having quick rotations, for example:
```
$ docker run --name test --log-opt max-size=10b --log-opt max-files=10
-d busybox sh -c 'while true; do usleep 500000; echo hello; done'
$ docker logs -f test
```
After these changes I can run this forever without error.
Another fix removes 2 `os.Stat` calls when rotating files. The stat
calls are not needed since we are just calling `os.Rename` anyway, which
will in turn also just produce the same error that `Stat` would.
These `Stat` calls were also quite expensive.
Removing these stat calls also seemed to resolve an issue causing slow
memory growth on the daemon.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
so that the user knows what's not in the container but should be.
Its not always easy for the user to know what exact command is being run
when the 'docker run' is embedded deep in something else, like a Makefile.
Saw this while dealing with the containerd migration.
Signed-off-by: Doug Davis <dug@us.ibm.com>
Changes how the Engine interacts with Registry servers on image pull.
Previously, Engine sent a User-Agent string to the Registry server
that included only the Engine's version information. This commit
appends to that string the fields from the User-Agent sent by the
client (e.g., Compose) of the Engine. This allows Registry server
operators to understand what tools are actually generating pulls on
their registries.
Signed-off-by: Mike Goelzer <mgoelzer@docker.com>
Signed-off-by: John Howard <jhoward@microsoft.com>
Signed-off-by: John Starks <jostarks@microsoft.com>
Signed-off-by: Darren Stahl <darst@microsoft.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Docker logs was only closing the logger when the HTTP response writer received a close notification, however in non-follow mode the writer never receives a close. This means that the daemon would leak the file handle to the log, preventing the container from being removed on Windows (file in use error). This change explicitly closes the log when the end of stream is hit.
Signed-off-by: Stefan J. Wernli <swernli@microsoft.com>
When following a journal-based log, it was possible for the worker
goroutine, which reads the journal using the journal context and sends
entry data down the message channel, to be scheduled after the function
which started it had returned. This could create problems, since the
invoking function was closing the journal context object and message
channel before it returned, which could trigger use-after-free segfaults
and write-to-closed-channel panics in the worker goroutine.
Make the cleanup in the invoking function conditional so that it's only
done when we're not following the logs, and if we are, that it's left to
the worker goroutine to close them.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
The journald log reader keeps a map of following readers so that it can
close them properly when the journald reader object itself is closed,
but it was possible for its worker goroutine to be scheduled so that the
worker attempted to remove a reader from the map before the reader had
been added to the map. This patch adds the item to the map before
starting the goroutine which is expected to eventually remove it.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
All other options we have use `=` as separator, labels,
log configurations, graph configurations and so on.
We should be consistent and use `=` for the security
options too.
Signed-off-by: David Calavera <david.calavera@gmail.com>
The GCP logging driver is calling out to GCP cloud service on package
init.
This is regardless if you are using GCP logging or not.
This change makes this happen on the first invocation of a new GCP
logging driver instance instead.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This fixes problems encountered when running with a remapped root (the
syscalls related to the metadata directory will fail under user
namespaces). Using 0711 rather than 0701 (which solved the problem
previously) fixes the issue.
Signed-off-by: Aleksa Sarai <asarai@suse.de>
This allows users to provide a FQDN as hostname or to use distinct hostname and
domainname parts. Depends on https://github.com/docker/libnetwork/pull/950
Signed-off-by: Tim Hockin <thockin@google.com>
this allows user to choose the compression type (i.e. gzip/zlib/none) using
--log-opt=gelf-compression-type=none or the compression level (-1..9) using
--log-opt=gelf-compression-level=0 for gelf driver.
Signed-off-by: Daniel Dao <dqminh@cloudflare.com>
Following #19995 and #17409 this PR enables skipping userns re-mapping
when creating a container (or when executing a command). Thus, enabling
privileged containers running side by side with userns remapped
containers.
The feature is enabled by specifying ```--userns:host```, which will not
remapped the user if userns are applied. If this flag is not specified,
the existing behavior (which blocks specific privileged operation)
remains.
Signed-off-by: Liron Levin <liron@twistlock.com>
When I use `docker exec -ti test ls`, I got error:
```
ERRO[0035] Handler for POST /v1.23/exec/9677ecd7aa9de96f8e9e667519ff266ad26a5be80e80021a997fff6084ed6d75/resize returned error: bad file descriptor
```
It's because `POST /exec/<id>/start` and
`POST /exec/<id>/resize` are asynchronous, it is
possible that exec process finishes and ternimal
is closed before resize. Then `console.Fd()` will
get a large invalid number and we got the above
error.
Fix it by adding synchronization between exec and
resize.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Now what we provide dynamic binaries for all plaforms,
we shouldn't try to run docker without udev sync support.
This change changes the previous warning to an Error,
unless the user explicitly overrides the warning, in
which case they're at their own risk.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Attach can hang forever if there is no data to send. This PR adds notification
of Attach goroutine about container stop.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
This change centralizes the template manipulation in a single package
and adds basic string functions to their execution.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Use token handler options for initialization.
Update auth endpoint to set identity token in response.
Update credential store to match distribution interface changes.
Signed-off-by: Derek McGowan <derek@mcgstyle.net> (github: dmcgowan)
Previously docker used obsolete rfc3164 syslog format for syslog. rfc3164 explicitly
uses semicolon as a separator between 'TAG' and 'Content' section of the log message.
Docker uses semicolon as a separator between image name and version tag.
When {{.ImageName}} was used as a tag expression and contained ":" syslog parser mistreated
"tag" part of the image name as syslog message body, which resulted in incorrect "syslogtag" been reported by syslog
daemon.
Use of rfc5424 log format partually fixes the issue as it does not use semicolon as a separator.
However using default rfc5424 syslog format itroduces backward incompatability because rsyslog template keyword %syslogtag%
is parsed differently. In rfc3164 it uses the "TAG" part reported before the "pid" part. In rfc5424 it uses "appname" part reported
before the pid part, while tag part is introduced by %msgid% part.
For more information on rsyslog configuration properties see: http://www.rsyslog.com/doc/master/configuration/properties.html
Added two options to specify logging in either rfc5424, rfc3164 format or unix format omitting hostname in order to keep backwards compatability with
previous versions.
Signed-off-by: Solganik Alexander <solganik@gmail.com>
Correct creation of a non-existing WORKDIR during docker build to use
remapped root uid/gid on mkdir
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
I can reproduce this easily on one of my servers,
`docker exec -ti my_cont ls` will not print anything,
without `-t` it acts normally.
Signed-off-by: Qiang Huang <h.huangqiang@huawei.com>
Once thin pool gets full, bad things can happen. Especially in case of xfs
it is possible that xfs keeps on retrying IO infinitely (for certain kind
of IO) and container hangs.
One way to mitigate the problem is that once thin pool is about to get full,
start failing some of the docker operations like pulling new images or
creation of new containers. That way user will get warning ahead of time
and can try to rectify it by creating more free space in thin pool. This
can be done either by deleting existing images/containers or by adding more
free space to thin pool.
This patch adds a new option dm.min_free_space to devicemapper graph
driver. Say one specifies dm.min_free_space=10%. This means atleast
10% of data and metadata blocks should be free in pool before new device
creation is allowed, otherwise operation will fail.
By default min_free_space is 10%. User can change it by specifying
dm.min_free_space=X% on command line. A value of 0% will disable the
check.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
This fixes an issue that caused the client to hang forever if the
process died before the code arrived to exit the `Kill` function.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Only register a container once it's successfully started. This avoids a
race condition where the daemon is killed while in the process of
calling `libcontainer.Container.Start`, and ends up killing -1.
There is a time window where the container `initProcess` is not set, and
its PID unknown. This commit fixes the race Engine side.
Signed-off-by: Arnaud Porterie <arnaud.porterie@docker.com>
Check whether or not the file system type of a mountpoint is aufs
by calling statfs() instead of parsing mountinfo. This assumes
that aufs graph driver does not allow aufs as a backing file
system.
Signed-off-by: Tatsushi Inagaki <e29253@jp.ibm.com>
Previously, Windows layer diffs were written using a Windows-internal
format based on the BackupRead/BackupWrite Win32 APIs. This caused
problems with tar-split and tarsum and led to performance problems
in implementing methods such as DiffPath. It also was just an
unnecessary differentiation point between Windows and Linux.
With this change, Windows layer diffs look much more like their
Linux counterparts. They use AUFS-style whiteout files for files
that have been removed, and they encode all metadata directly in
the tar file.
This change only affects Windows post-TP4, since changes to the Windows
container storage APIs were necessary to make this possible.
Signed-off-by: John Starks <jostarks@microsoft.com>
This change adds "KernelMemory" to the /info endpoint and
shows a warning if KernelMemory is not supported by the kernel.
This makes it more consistent with the other memory-limit
options.
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>