- create pass through in daemon for access to functions on daemon
member
- import image
push image
export image and corrections
lookup image & comments
load image
list images
image history & comments
Signed-off-by: Morgan Bauer <mbauer@us.ibm.com>
Right now we check for the existence of device but don't make sure it is
a thin pool device. We assume it is a thin pool device and call poolStatus()
on the device which returns an error EOF. And that error does not tell
anything.
So before we reach the stage of calling poolStatus() make sure we are working
with a thin pool device otherwise error out.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
* Moving Network Remote APIs out of experimental
* --net can now accept user created networks using network drivers/plugins
* Removed the experimental services concept and --default-network option
* Neccessary backend changes to accomodate multiple networks per container
* Integration Tests
Signed-off-by: David Calavera <david.calavera@gmail.com>
Signed-off-by: Madhu Venugopal <madhu@docker.com>
progressreader.Broadcaster becomes broadcaster.Buffered and
broadcastwriter.Writer becomes broadcaster.Unbuffered.
The package broadcastwriter is thus renamed to broadcaster.
Signed-off-by: Tibor Vass <tibor@docker.com>
This patch creates interfaces in builder/ for building Docker images.
It is a first step in a series of patches to remove the daemon
dependency on builder and later allow a client-side Dockerfile builder
as well as potential builder plugins.
It is needed because we cannot remove the /build API endpoint, so we
need to keep the server-side Dockerfile builder, but we also want to
reuse the same Dockerfile parser and evaluator for both server-side and
client-side.
builder/dockerfile/ and api/server/builder.go contain implementations
of those interfaces as a refactoring of the current code.
Signed-off-by: Tibor Vass <tibor@docker.com>
Start a goroutine which runs every 30 seconds and if there are deferred
deleted devices, it tries to clean those up.
Also it moves the call to cleanupDeletedDevices() into goroutine and
moves the locking completely inside the function. Now function does not
assume that device lock is held at the time of entry.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Finally here is the patch to implement deferred deletion functionality.
Deferred deleted devices are marked as "Deleted" in device meta file.
First we try to delete the device and only if deletion fails and user has
enabled deferred deletion, device is marked for deferred deletion.
When docker starts up again, we go through list of deleted devices and
try to delete these again.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Provide a command line option dm.use_deferred_deletion to enable deferred
device deletion feature. By default feature will be turned off.
Not sure if there is much value in deferred deletion being turned on
without deferred removal being turned on. So for now, this feature can
be enabled only if deferred removal is on.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Currently during startup we walk through all the device files and read
their device ID and mark in a bitmap that device id is used.
We are anyway going through all device files. So we can as well load all
that data into device hash map. This will save us little time when
container is actually launched later.
Also this will help with later patches where cleanup deferred device
wants to go through all the devices and see which have been marked for
deletion and delete these.
So re-organize the code a bit.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Simplify setupBaseImage() even further. Move some more code in a separate
function. Pure code reorganization. No functionality change.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Move thin pool related checks in a separate function. Pure code reorganization.
Makes reading code easier.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
This moves base device creation function in a separate function. Pure
code reorganization. Makes reading code little easier.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
This patch does three things. Following are the descriptions.
===
Create a separate function for delete transactions so that parent function
is little smaller.
Also close transaction if an error happens.
===
When docker is being shutdown, save deviceset metadata first before
trying to remove the devices. Generally caller gives only 10 seconds
for shutdown to complete and then kills it after that. So if some device
is busy, we will wait 20 seconds for it removal and never be able to save
metadata. So first save metadata and then deal with device removal.
===
Move issue discard operation in a separate function. This makes reading code
little easier.
Also don't issue discards if device is still open. That means devices is
still probably being used and issuing discards is not a good idea.
This is especially true in case of deferred deletion. We want to issue
discards when device is not open. At that time device can be deleted too.
Otherwise we will issue discards and deletion will actually fail. Later
we will try deletion again and issue discards again and deletion will
fail again as device is open and busy.
So this will ensure that discards are issued once when device is not open
and it can actually be deleted.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Fixes an issue where a `Dead` container has no names so the API returns
`null` instead of an empty array.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Exec start was sending HTTP 500 for every error.
Fixed an error where pausing a container and then calling exec start
caused the daemon to freeze.
Updated API docs which incorrectly showed that a successful exec start
was an HTTP 201, in reality it is HTTP 200.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Right now we seem to have 3 locks.
- devinfo.lock
This is a per device lock
- metaData.devicesLock
This is supposedely protecting map of devices.
- Global DeviceSet lock
This is protecting map of devices as well as serializing calls to libdevmapper.
Semantics of per devices lock and global deviceset lock seem to be very clear.
Even ordering between these two locks has been defined properly.
What is not clear is the need and ordering of metaData.devicesLock. Looks like
this lock is not necessary and global DeviceSet lock should be used to
protect map of devices as it is part of DeviceSet.
This patchset gets rid of metaData.devicesLock and instead uses DeviceSet
lock to protect map of devices.
Also at couple of places during initialization takes devices.Lock(). That
is not strictly necessary as there is supposed to be one thread of execution
during initializaiton. Still it makes the code clearer.
I think this makes code more clear and easier to understand and easier to
make further changes.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
maxDeviceID is upper limit on device Id thin pool can support. Right now
we have this check only during startup. It is a good idea to move this
check in loadMetadata so that any time a device file is loaded and if it
is corrupted and device Id is more than maxDevieceID, it will be detected
right then and there.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Use deactivateDevice() instead of removeDevice() directly. This will make
sure for device deletion, deferred removal is used if user has configured
it in. Also this makes reading code litle easier as there is single function
to remove a device and that is deactivateDevice().
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
If a device is still mounted at the time of DeleteDevice(), that means
higher layers have not called Put() properly on the device and are trying
to delete it. This is a bug in the code where Get() and Put() have not been
properly paired up. Fail device deletion if it is still mounted.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Exists() and HasDevice() just check if device file exists or not. It does
not say anything about if device is mounted or not. Fix comments.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
device has map (device.Devices), contains valid devices and we skip all
the files which are not device files. transaction metadata file is not
device file. Skip this file when devices files are being read and loaded
into map.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Implement basic interfaces to write custom routers that can be plugged
to the server. Remove server coupling with the daemon.
Signed-off-by: David Calavera <david.calavera@gmail.com>
This passes through the container hostname to HCS, which in Windows Server
2016 TP4 will set the container's hostname in the registry before starting
it. This will be silently ignored by TP3.
Signed-off-by: John Starks <jostarks@microsoft.com>
Although having a request ID available throughout the codebase is very
valuable, the impact of requiring a Context as an argument to every
function in the codepath of an API request, is too significant and was
not properly understood at the time of the review.
Furthermore, mixing API-layer code with non-API-layer code makes the
latter usable only by API-layer code (one that has a notion of Context).
This reverts commit de41640435, reversing
changes made to 7daeecd42d.
Signed-off-by: Tibor Vass <tibor@docker.com>
Conflicts:
api/server/container.go
builder/internals.go
daemon/container_unix.go
daemon/create.go
This reverts commit ff92f45be4, reversing
changes made to 80e31df3b6.
Reverting to make the next revert easier.
Signed-off-by: Tibor Vass <tibor@docker.com>
This fixes the case where directory is removed in
aufs and then the same layer is imported to a
different graphdriver.
Currently when you do `rm -rf /foo && mkdir /foo`
in a layer in aufs the files under `foo` would
only be be hidden on aufs.
The problems with this fix:
1) When a new diff is recreated from non-aufs driver
the `opq` files would not be there. This should not
mean layer differences for the user but still
different content in the tar (one would have one
`opq` file, the others would have `.wh.*` for every
file inside that folder). This difference also only
happens if the tar-split file isn’t stored for the
layer.
2) New files that have the filenames before `.wh..wh..opq`
when they are sorted do not get picked up by non-aufs
graphdrivers. Fixing this would require a bigger
refactoring that is planned in the future.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Fixes an issue where `VOLUME some_name:/foo` would be parsed as a named
volume, allowing access from the builder to any volume on the host.
This makes sure that named volumes must always be passed in as a bind.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Before this patch libcontainer badly errored out with `invalid
argument` or `numerical result out of range` while trying to write
to cpuset.cpus or cpuset.mems with an invalid value provided.
This patch adds validation to --cpuset-cpus and --cpuset-mems flag along with
validation based on system's available cpus/mems before starting a container.
Signed-off-by: Antonio Murdaca <runcom@linux.com>
Use `pkg/discovery` to provide nodes discovery between daemon instances.
The functionality is driven by two different command-line flags: the
experimental `--cluster-store` (previously `--kv-store`) and
`--cluster-advertise`. It can be used in two ways by interested
components:
1. Externally by calling the `/info` API and examining the cluster store
field. The `pkg/discovery` package can then be used to hit the same
endpoint and watch for appearing or disappearing nodes. That is the
method that will for example be used by Swarm.
2. Internally by using the `Daemon.discoveryWatcher` instance. That is
the method that will for example be used by libnetwork.
Signed-off-by: Arnaud Porterie <arnaud.porterie@docker.com>
* Thanks to the Default gateway service in libnetwork, we dont have to add
containers explicitly to secondary public network. This is handled
automatically regardless of the primary network driver.
* Fixed the URL convention for kv-store to be aligned with the upcoming
changes to discovery URL
* Also, in order to bring consistency between external and internal network
drivers, we moved the driver configs via controller Init.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Avoid creating a global context object that will be used while the daemon is running.
Not only this object won't ever be garbage collected, but it won't ever be used for anything else than creating other contexts in each request. I think it's a bad practive to have something like this sprawling aroud the code.
This change removes that global object and initializes a context in the cases we don't have already one, like shutting down the server.
This also removes a bunch of context arguments from functions that did nothing with it.
Signed-off-by: David Calavera <david.calavera@gmail.com>
The mount syscall does not handle string flags like "noatime",
we must use bitmasks like MS_NOATIME instead.
pkg/mount.Mount already handles this work.
Signed-off-by: Carl Henrik Lunde <chlunde@ping.uio.no>
This PR adds a "request ID" to each event generated, the 'docker events'
stream now looks like this:
```
2015-09-10T15:02:50.000000000-07:00 [reqid: c01e3534ddca] de7c5d4ca927253cf4e978ee9c4545161e406e9b5a14617efb52c658b249174a: (from ubuntu) create
```
Note the `[reqID: c01e3534ddca]` part, that's new.
Each HTTP request will generate its own unique ID. So, if you do a
`docker build` you'll see a series of events all with the same reqID.
This allow for log processing tools to determine which events are all related
to the same http request.
I didn't propigate the context to all possible funcs in the daemon,
I decided to just do the ones that needed it in order to get the reqID
into the events. I'd like to have people review this direction first, and
if we're ok with it then I'll make sure we're consistent about when
we pass around the context - IOW, make sure that all funcs at the same level
have a context passed in even if they don't call the log funcs - this will
ensure we're consistent w/o passing it around for all calls unnecessarily.
ping @icecrime @calavera @crosbymichael
Signed-off-by: Doug Davis <dug@us.ibm.com>
- Print the mount table as in /proc/self/mountinfo
- Do not exit prematurely when one of the ipc mounts doesn't exist.
- Do not exit prematurely when one of the ipc mounts cannot be unmounted.
- Add a unit test to see if the cleanup really works.
- Use syscall.MNT_DETACH to cleanup mounts after a crash.
- Unmount IPC mounts when the daemon unregisters an old running container.
Signed-off-by: David Calavera <david.calavera@gmail.com>
- Add unit tests to make sure the functionality is correct.
- Add FilterByDriver to allow filtering volumes by driver, for future
`volume ls` filtering and whatnot.
Signed-off-by: David Calavera <david.calavera@gmail.com>
If an invalid logger address is provided on daemon start it will
silently fail. As syslog driver is doing, this check should be done on
daemon start and prevent it from starting even in other drivers.
This patch also adds integration tests for this behavior.
Signed-off-by: Antonio Murdaca <runcom@linux.com>
`docker rename foo ''` would result in:
```
usage: docker rename OLD_NAME NEW_NAME
```
which is the old engine's way of return errors - yes that's in the
daemon code. So I fixed that error msg to just be normal.
While doing that I noticed that using an empty string for the
source container name failed but didn't print any error message at all.
This is because we would generate a URL like: ../containers//rename/..
which would cause a 301 redirect to ../containers/rename/..
however the CLI code doesn't actually deal with 301's - it just ignores
them and returns back to the CLI code/caller.
Rather than changing the CLI to deal with 3xx error codes, which would
probably be a good thing to do in a follow-on PR, for this immediate
issue I just added a cli-side check for empty strings for both old and
new names. This way we catch it even before we hit the daemon.
API callers will get a 404, assuming they follow the 301, for the
case of the src being empty, and the new error msg when the destination
is empty - so we should be good now.
Add tests for both cases too.
Signed-off-by: Doug Davis <dug@us.ibm.com>
This way provide both Time and TimeNano in the event. For the display of
the JSONMessage, use either, but prefer TimeNano Proving only TimeNano
would break Subscribers that are using the `Time` field, so both are set
for backwards compatibility.
The events logging uses nano formatting, but only provides a Unix()
time, therefor ordering may get lost in the output. Example:
```
2015-09-15T14:18:51.000000000-04:00 ee46febd64ac629f7de9cd8bf58582e6f263d97ff46896adc5b508db804682da: (from busybox) resize
2015-09-15T14:18:51.000000000-04:00 a78c9149b1c0474502a117efaa814541926c2ae6ec3c76607e1c931b84c3a44b: (from busybox) resize
```
By having a field just for Nano time, when set, the marshalling back to
`time.Unix(sec int64, nsec int64)` has zeros exactly where it needs to.
This does not break any existing use of jsonmessage.JSONMessage, but now
allows for use of `UnixNano()` and get event formatting that has
distinguishable order. Example:
```
2015-09-15T15:37:23.810295632-04:00 6adcf8ed9f5f5ec059a915466cd1cde86a18b4a085fc3af405e9cc9fecbbbbaf: (from busybox) resize
2015-09-15T15:37:23.810412202-04:00 6b7c5bfdc3f902096f5a91e628f21bd4b56e32590c5b4b97044aafc005ddcb0d: (from busybox) resize
```
Including tests for TimeNano and updated event API reference doc.
Signed-off-by: Vincent Batts <vbatts@redhat.com>
GET /containers/json route used to reply with and empty array `[]` when no
containers where available. Daemon containers list refactor introduced
this bug by declaring an empty slice istead of initializing it as well
and it was now replying with `null`.
Signed-off-by: Antonio Murdaca <runcom@linux.com>
- refactor to make it easier to split the api in the future
- addition to check the existing test case and make sure it contains
some expected output
Signed-off-by: Morgan Bauer <mbauer@us.ibm.com>
Using @mavenugo's patch for enabling the libcontainer pre-start hook to
be used for network namespace initialization (correcting the conflict
with user namespaces); updated the boolean check to the more generic
SupportsHooks() name, and fixed the hook state function signature.
Signed-off-by: Madhu Venugopal <madhu@docker.com>
Docker-DCO-1.1-Signed-off-by: Phil Estes <estesp@linux.vnet.ibm.com> (github: estesp)
With go1.5's concurrency, the use of a goroutine in Log'ing events was
causing the resulting events to not be in order.
Signed-off-by: Vincent Batts <vbatts@redhat.com>
For now docker stats will sum the rxbytes, txbytes, etc. of all
the interfaces.
It is OK for the output of CLI `docker stats` but not good for
the API response, especially when the container is in sereval
subnets.
It's better to leave these origianl data to user.
Signed-off-by: Hu Keping <hukeping@huawei.com>
Volumes are accounted when a container is created.
If the creation fails we should remove the reference from the counter.
Do not log ErrVolumeInUse as an error, having other volume references is
not an error.
Signed-off-by: David Calavera <david.calavera@gmail.com>
This is the first step in converting out static strings into well-defined
error types. This shows just a few examples of it to get a feel for how things
will look. Once we agree on the basic outline we can then work on converting
the rest of the code over.
Signed-off-by: Doug Davis <dug@us.ibm.com>
@noxiouz points out that we don't need to check for a nil result from
C.CString(), since an out-of-memory condition causes a runtime panic
instead.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
Changes include :
* libnetwork support for userns
* driver api change to have 1 interface per endpoint
Signed-off-by: Madhu Venugopal <madhu@docker.com>
If a logdriver doesn't register a callback function to validate log
options, it won't be usable. Fix the journald driver by adding a dummy
validator.
Teach the client and the daemon's "logs" logic that the server can also
supply "logs" data via the "journald" driver. Update documentation and
tests that depend on error messages.
Add support for reading log data from the systemd journal to the
journald log driver. The internal logic uses a goroutine to scan the
journal for matching entries after any specified cutoff time, formats
the messages from those entries as JSONLog messages, and stuffs the
results down a pipe whose reading end we hand back to the caller.
If we are missing any of the 'linux', 'cgo', or 'journald' build tags,
however, we don't implement a reader, so the 'logs' endpoint will still
return an error.
Make the necessary changes to the build setup to ensure that support for
reading container logs from the systemd journal is built.
Rename the Jmap member of the journald logdriver's struct to "vars" to
make it non-public, and to make it easier to tell that it's just there
to hold additional variable values that we want journald to record along
with log data that we're sending to it.
In the client, don't assume that we know which logdrivers the server
implements, and remove the check that looks at the server. It's
redundant because the server already knows, and the check also makes
using older clients with newer servers (which may have new logdrivers in
them) unnecessarily hard.
When we try to "logs" and have to report that the container's logdriver
doesn't support reading, send the error message through the
might-be-a-multiplexer so that clients which are expecting multiplexed
data will be able to properly display the error, instead of tripping
over the data and printing a less helpful "Unrecognized input header"
error.
Signed-off-by: Nalin Dahyabhai <nalin@redhat.com> (github: nalind)
Noteworthy changes:
- Add Prestart/Poststop hook support
- Fix bug finding cgroup mount directory
- Add OomScoreAdj as a container configuration option
- Ensure the cleanup jobs in the deferrer are executed on error
- Don't make modifications to /dev when it is bind mounted
Other changes in runc:
https://github.com/opencontainers/runc/compare/v0.0.3...v0.0.4
Signed-off-by: David Calavera <david.calavera@gmail.com>
This PR makes a user visible behavior change with userland
proxy disabled by default and rely on hairpin NAT to be enabled
by default. This may not work in older (unsupported) kernels
where the user will be forced to enable userlandproxy if needed.
- Updated the Docs
- Changed the integration-cli to start with userlandproxy
desiabled by default.
Signed-off-by: Jana Radhakrishnan <mrjana@docker.com>
This changeset creates /dev/shm and /dev/mqueue mounts for each container under
/var/lib/containers/<id>/ and bind mounts them into the container. When --ipc:container<id/name>
is used, then the /dev/shm and /dev/mqueue of the ipc container are used instead of creating
new ones for the container.
Signed-off-by: Mrunal Patel <mrunalp@gmail.com>
Docker-DCO-1.1-Signed-off-by: Dan Walsh <dwalsh@redhat.com> (github: rhatdan)
(cherry picked from commit d88fe447df)
Allow to set the signal to stop a container in `docker run`:
- Use `--stop-signal` with docker-run to set the default signal the container will use to exit.
Signed-off-by: David Calavera <david.calavera@gmail.com>
Allows people to create out-of-process graphdrivers that can be used
with Docker.
Extensions must be started before Docker otherwise Docker will fail to
start.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>