So other packages don't need to import the daemon package when they
want to use this struct.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Signed-off-by: Tibor Vass <tibor@docker.com>
A TopicFunc is an interface to let the pubisher decide whether it needs
to send a message to a subscriber or not. It returns true if the
publisher must send the message and false otherwise.
Users of the pubsub package can create a subscriber with a topic
function by calling `pubsub.SubscribeTopic`.
Message delivery has also been modified to use concurrent channels per
subscriber. That way, topic verification and message delivery is not
o(N+M) anymore, based on the number of subscribers and topic verification
complexity.
Using pubsub topics, the API stops controlling the message delivery,
delegating that function to a topic generated with the filtering
provided by the user. The publisher sends every message to the
subscriber if there is no filter, but the api doesn't have to select
messages to return anymore.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Improves the current filtering implementation complixity.
Currently, the best case is O(N) and worst case O(N^2) for key-value filtering.
In the new implementation, the best case is O(1) and worst case O(N), again for key-value filtering.
Signed-off-by: David Calavera <david.calavera@gmail.com>
It will Tar up contents of child directory onto tmpfs if mounted over
This patch will use the new PreMount and PostMount hooks to "tar"
up the contents of the base image on top of tmpfs mount points.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Closes#9798
@maintainers please note that this is a change to the UX. We no longer
require the -f flag on `docker tag` to move a tag from an existing image.
However, this does make us more consistent across our commands,
see https://github.com/docker/docker/issues/9798 for the history.
Signed-off-by: Doug Davis <dug@us.ibm.com>
- avoid empty Names in container list API when fails to remove
a container
- avoid dead containers when fails to create a container
Signed-off-by: Shijiang Wei <mountkin@gmail.com>
ext4 filesystem creation can take a long time on 100G thin device and
systemd might time out and kill docker service. Often user is left thinking
why docker is taking so long and logs don't give any hint. Log an info
message in journal for start and end of filesystem creation. That way
a user can look at logs and figure out that filesystem creation is
taking long time.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
libcontainer v0.0.4 introduces setting `/proc/self/oom_score_adj` to
better tune oom killing preferences for container process. This patch
simply integrates OomScoreAdj libcontainer's config option and adjust
the cli with this new option.
Signed-off-by: Antonio Murdaca <amurdaca@redhat.com>
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Currently, the resources associated with the io.Reader returned by
TarStream are only freed when it is read until EOF. This means that
partial uploads or exports (for example, in the case of a full disk or
severed connection) can leak a goroutine and open file. This commit
changes TarStream to return an io.ReadCloser. Resources are freed when
Close is called.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Docker daemon uses kv-store as the host-discovery backend.
Discovery module tracks the liveness of a node through a simple
keepalive mechanism. The keepalive mechanism depends on every
node performing heartbeat by registering itself with the discovery
module (via KV-Store Put operation). And for every Put operation,
the discovery module in all other nodes will receive a Watch
notification. That keeps the node alive.
Any node that fails to register itself within the TTL timer is
considered dead and removed from the discovery database.
The default timer (heartbeat = 20 seconds & ttl = 60 seconds)
works fine for small clusters. But for large clusters, these
default timers are extremely aggressive and that causes high CPU
& most of the processing is spent managing the node discovery
and that impacts normal daemon operation.
Hence we need a way to make the discovery ttl and heartbeat
configurable. As the cluster size grows, the user can change
these timers to make sure the daemon scales.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Add distribution package for managing pulls and pushes. This is based on
the old code in the graph package, with major changes to work with the
new image/layer model.
Add v1 migration code.
Update registry, api/*, and daemon packages to use the reference
package's types where applicable.
Update daemon package to use image/layer/tag stores instead of the graph
package
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Adjust the docker-default profile for when the docker daemon is running in
AppArmor confinement. To enable 'docker kill' we need to allow the container
to receive kill signals from the daemon.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Remove double reference between containers and exec configurations by
keeping only the container id.
Signed-off-by: David Calavera <david.calavera@gmail.com>
This is a small configuration struct used in two scenarios:
1. To attach I/O pipes to a running containers.
2. To attach to execution processes inside running containers.
Although they are similar, keeping the struct in the same package
than exec and container can generate cycled dependencies if we
move any of them outside the daemon, like we want to do
with the container.
Signed-off-by: David Calavera <david.calavera@gmail.com>
* This commit will mark --before and --since as deprecated, but leave their behavior
unchanged until they are removed, then re-implement them as options for --filter.
* And update the related docs.
* Update the integration tests.
Fixes issue #17716
Signed-off-by: Wen Cheng Ma <wenchma@cn.ibm.com>
- Optional "--shm-size=" was added to the sub-command(run, create,and build).
- The size of /dev/shm in the container can be changed
when container is made.
- Being able to specify is a numerical value that applies number,
b, k, m, and g.
- The default value is 64MB, when this option is not set.
- It deals with both native and lxc drivers.
Signed-off-by: NIWA Hideyuki <niwa.hiedyuki@jp.fujitsu.com>
Volumes can have more mount points beneath them and unmount will fail. This
is the case when a bind mounted directory on host already had a mount point
underneath it. So use lazy unmount instead.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Our implementation of systemd cgroups is mixture of systemd api and
plain filesystem api. It's hard to keep it up to date with systemd and
it already contains some nasty bugs with new versions. Ideally it should
be replaced with some daemon flag which will allow to set parent systemd
slice.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
By removing deprecated volume structures, now that windows mount volumes we don't need a initializer per platform.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Turn BytesPipe's Read and Write functions into blocking, goroutine-safe
functions. Add a CloseWithError function to propagate an error code to
the Read function.
Adjust tests to work with the blocking Read and Write functions.
Remove BufReader, since now its users can use BytesPipe directly.
Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
Add support of `tag`, `env` and `labels` for Splunk logging driver.
Removed from message `containerId` as it is the same as `tag`.
Signed-off-by: Denis Gladkikh <denis@gladkikh.email>
This change will allow us to run SELinux in a container with
BTRFS back end. We continue to work on fixing the kernel/BTRFS
but this change will allow SELinux Security separation on BTRFS.
It basically relabels the content on container creation.
Just relabling -init directory in BTRFS use case. Everything looks like it
works. I don't believe tar/achive stores the SELinux labels, so we are good
as far as docker commit.
Tested Speed on startup with BTRFS on top of loopback directory. BTRFS
not on loopback should get even better perfomance on startup time. The
more inodes inside of the container image will increase the relabel time.
This patch will give people who care more about security the option of
runnin BTRFS with SELinux. Those who don't want to take the slow down
can disable SELinux either in individual containers or for all containers
by continuing to disable SELinux in the daemon.
Without relabel:
> time docker run --security-opt label:disable fedora echo test
test
real 0m0.918s
user 0m0.009s
sys 0m0.026s
With Relabel
test
real 0m1.942s
user 0m0.007s
sys 0m0.030s
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
If platform supports xfs filesystem then use xfs as default filesystem
for container rootfs instead of ext4. Reason being that ext4 is pre-allcating
lot of metadata (around 1.8GB on 100G thin volume) and that can take long
enough on AWS storage that systemd times out and docker fails to start.
If one disables pre-allocation of ext4 metadata, then it will be allocated
when containers are mounted and we will have multiple copies of metadata
per container. For a 100G thin device, it was around 1.5GB of metadata
per container.
ext4 has an optimization to skip zeroing if discards are issued and
underlying device guarantees that zero will be returned when discarded
blocks are read back. devicemapper thin devices don't offer that guarantee
so ext4 optimization does not kick in. In fact given discards are optional
and can be dropped on the floor if need be, it looks like it might not be
possible to guarantee that all the blocks got discarded and if read back
zero will be returned.
Signed-off-by: Anusha Ragunathan <anusha@docker.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
If user wants to use a filesystem it can be specified using dm.fs=<filesystem>
option. It is possible that docker already had base image and a filesystem
on that. Later if user wants to change file system using dm.fs= option
and restarts docker, that's not possible. Warn user about it.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Replace time.Sleep with time.Tick and remove unnecessary var block.
Use Warn log-level instead of error.
Signed-off-by: Alexander Morozov <lk4d4@docker.com>
With the changes merged into runc/libcontainer, are now causing
SELinux to attempt a relabel always, even if the user did not
request the relabel.
If the user does not specify Z or z on the volume mount we should
not attempt a relabel.
Signed-off-by: Dan Walsh <dwalsh@redhat.com>
So we don't print those <no value> in the client and we don't fail
executing inspect templates with API field names.
Make sure those fields are initialized as empty slices when
a container is loaded from disk and their values are nil.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Container has private network namespace can not to connect to host
and container with host network can not be disconnected from host.
Signed-off-by: Lei Jitang <leijitang@huawei.com>
This patch adds the ability to run `docker stats` w/o arguments and get
statistics for all running containers by default. Also add a new
`--all` flag to list statistics for all containers (like `docker ps`).
New running containers are added to the list as they show up also.
Add integration tests for this new behavior.
Docs updated accordingly. Fix missing stuff in man/commandline
reference for `docker stats`.
Signed-off-by: Antonio Murdaca <runcom@redhat.com>
Fixes#17290
Fixes following issues:
- Cache checksums turning off while walking a broken symlink.
- Cache checksums were taken from symlinks while targets were actually copied.
- Copying a symlink pointing to a file to a directory used the basename of the target as a destination basename, instead of basename of the symlink.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Right now if blkid fails we are just logging a debug message and don;t return
the actual error to caller. Caller gets the error message that thin pool
base device UUID verification failed and it might give impression that thin
pool changed. But that's not the case. Thin pool is in such a state that we
could not even query the thin device UUID. Retrun error message appropriately
to make situation more clear.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
The LXC driver was deprecated in Docker 1.8.
Following the deprecation rules, we can remove a deprecated feature
after two major releases. LXC won't be supported anymore starting on Docker 1.10.
Signed-off-by: David Calavera <david.calavera@gmail.com>
`--cluster-store` is of form KV-PROVIDER://KV-URL, this commit makes
sure that KV-URL contains no "://"
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
- During concurrent operations in multihost environment,
it is possible that the implementer of `EndpointInfo`
is nil. It simply means the endpoint is no longer
available in the datastore.
Signed-off-by: Alessandro Boch <aboch@docker.com>
The purpose of this PR is for users to distinguish Docker errors from
contained command errors.
This PR modifies 'docker run' exit codes to follow the chroot standard
for exit codes.
Exit status:
125 if 'docker run' itself fails
126 if contained command cannot be invoked
127 if contained command cannot be found
the exit status otherwise
Signed-off-by: Sally O'Malley <somalley@redhat.com>
Side effects:
- Decouple daemon and container to start containers.
- Decouple daemon and container to copy files.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Currently, we get the network stats each time per subscriber, causing a
high load of cpu when there are several subscribers per container.
This change makes the daemon to collect once and publish N times, where N is the
number of subscribers per container.
Signed-off-by: David Calavera <david.calavera@gmail.com>
- On `docker run --net <network id> ...`
the bug would cause the container to attempt
to connect to the network two times
- Also made sure endpoint creation rollback will
be executed on failures in `func (container *Container) connectToNetwork()`
Signed-off-by: Alessandro Boch <aboch@docker.com>
For graceful restart case it was done when the container was brought
down. But for ungraceful cases, the persistence is missing for nw
connect
Signed-off-by: Madhu Venugopal <madhu@docker.com>
And do not try to unmount empty paths.
Because nobody should be woken up in the middle of the night for them.
Signed-off-by: David Calavera <david.calavera@gmail.com>
This fixes a bug introduced in #15786:
* if a pre-v1.20 client requested docker stats, the daemon
would return both an API-compatible JSON blob *and* an API-incompatible JSON
blob: see https://gist.github.com/donhcd/338a5b3681cd6a071629
Signed-off-by: Donald Huang <don.hcd@gmail.com>