mirror of https://github.com/docker/docs.git
Rewrite and break out Troubleshoot Batch Jobs into two pages
Restructure job logs content: Job Queue, View Job Logs (on interface), Troubleshoot Jobs via the API, and Enable Auto-deletion of Job Logs
This commit is contained in:
parent e26559e270
commit b8ea668fbf
@ -0,0 +1,48 @@
---
title: Enable Auto-Deletion of Job Logs
description: Enable auto-deletion of old or unnecessary job logs for maintenance.
keywords: registry, events, log, activity stream
---

> BETA DISCLAIMER
>
> This is beta content. It is not yet complete and should be considered a work in progress. This content is subject to change without notice.

## Overview

Docker Trusted Registry has a global setting for auto-deletion of job logs, which allows them to be removed as part of [garbage collection](../configure/garbage-collection.md). In DTR 2.6, DTR admins can enable auto-deletion of job logs based on specified conditions, which are covered below.

## Steps

1. In your browser, navigate to `https://<dtr-url>` and log in with your UCP credentials.

2. Select **System** on the left navigation pane, which displays the **Settings** page by default.

3. Scroll down to **Job Logs** and turn on **Auto-Deletion**.

4. Specify the conditions that will trigger job log auto-deletion.

DTR allows you to set your auto-deletion conditions based on the following optional job log attributes:

| Name | Description | Example |
|:---------------------|:------------------------------------------------------------------------------------------------------|:-----------|
| Age | Lets you remove job logs which are older than your specified number of hours, days, weeks, or months. | `2 months` |
| Max number of events | Lets you specify the maximum number of job logs allowed within DTR. | `100` |

If you check and specify both conditions, job logs will be removed from DTR during garbage collection if either condition is met. You should see a confirmation message right away.

5. Click **Start GC** if you're ready. Read more about [garbage collection](../configure/garbage-collection/#under-the-hood) if you're unsure about this operation.

6. Navigate to **System > Job Logs** to confirm that `onlinegc` jobs have run. For a detailed breakdown of individual job logs, see [View Job Logs](view-job-logs-on-interface.md).
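
If you prefer to verify this from the command line, you can also list recent jobs through the jobs API described in [Troubleshoot Jobs via the API](troubleshoot-jobs-via-api.md). The following is a minimal sketch using the Python `requests` library; the URL and credentials are placeholders, basic authentication with UCP admin credentials is assumed, and the endpoint may limit or paginate results depending on your DTR version.

```python
# Minimal sketch: list recent garbage-collection jobs via the DTR jobs API.
# Placeholder URL and credentials; verify=False is for quick experiments only,
# prefer pointing verify at your DTR CA bundle.
import requests

DTR_URL = "https://<dtr-url>"      # replace with your DTR URL
AUTH = ("admin", "password")       # replace with your UCP credentials

resp = requests.get(f"{DTR_URL}/api/v0/jobs/", auth=AUTH, verify=False)
resp.raise_for_status()

for job in resp.json().get("jobs", []):
    # Job log auto-deletion runs as part of garbage collection, so look for
    # gc/onlinegc-style actions such as onlinegc_joblogs.
    if job["action"].startswith(("gc", "onlinegc")):
        print(job["id"], job["action"], job["status"], job["lastUpdated"])
```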

## Where to go next

- [View Job Logs](view-job-logs-on-interface.md)

@ -0,0 +1,80 @@

---
title: Job Queue
description: Learn how Docker Trusted Registry runs batch jobs for troubleshooting job-related issues.
keywords: dtr, job queue, job management
---

Docker Trusted Registry (DTR) uses a job queue to schedule batch jobs. Jobs are added to a cluster-wide job queue, and then consumed and executed by a job runner within DTR.



All DTR replicas have access to the job queue, and have a job runner component
that can get and execute work.

## How it works

When a job is created, it is added to a cluster-wide job queue and enters the `waiting` state.
When one of the DTR replicas is ready to claim the job, it waits a random time of up
to `3` seconds to give every replica the opportunity to claim the task.

A replica claims a job by adding its replica ID to the job. That way, other
replicas will know the job has been claimed. Once a replica claims a job, it adds
that job to an internal queue, which in turn sorts the jobs by their `scheduledAt` time.
Once that happens, the replica updates the job status to `running`, and
starts executing it.

The job runner component of each DTR replica keeps a `heartbeatExpiration`
entry on the database that is shared by all replicas. If a replica becomes
unhealthy, other replicas notice the change and update the status of the failing worker to `dead`.
Also, all the jobs that were claimed by the unhealthy replica enter the `worker_dead` state,
so that other replicas can claim the job.
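
To make the claim flow concrete, the sketch below illustrates the behavior described above with hypothetical names; it is not DTR's actual implementation. A replica waits a random time of up to 3 seconds, claims unclaimed jobs by writing its replica ID to them, orders its internal queue by `scheduledAt`, and then marks the jobs as `running`.

```python
# Illustrative sketch of the claim flow (hypothetical names, not DTR code).
import random
import time
from operator import itemgetter

jobs = [
    {"id": "0", "workerID": "", "status": "waiting", "scheduledAt": "2019-01-01T00:00:05Z"},
    {"id": "1", "workerID": "", "status": "waiting", "scheduledAt": "2019-01-01T00:00:01Z"},
]

def claim_jobs(replica_id, queue):
    time.sleep(random.uniform(0, 3))            # give every replica a chance to claim
    claimed = []
    for job in queue:
        if job["status"] == "waiting" and not job["workerID"]:
            job["workerID"] = replica_id        # claiming: other replicas see the ID
            claimed.append(job)
    claimed.sort(key=itemgetter("scheduledAt"))  # internal queue ordered by scheduledAt
    for job in claimed:
        job["status"] = "running"               # the replica starts executing the job
    return claimed

print(claim_jobs("000000000000", jobs))
```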

## Job Types

DTR runs periodic and long-running jobs. The following is a complete list of jobs you can filter for via [the user interface](view-job-logs-on-interface.md) or [the API](troubleshoot-jobs-via-api.md).

| Job | Description |
|:-----------------------|:------------------------------------------------------------------------------------------------------------------------------|
| gc | A garbage collection job that deletes layers associated with deleted images. |
| onlinegc | A garbage collection job that deletes layers associated with deleted images without putting the registry in read-only mode. |
| onlinegc_metadata | A garbage collection job that deletes metadata associated with deleted images. |
| onlinegc_joblogs | A garbage collection job that deletes job logs based on a set job history setting. |
| metadatastoremigration | A necessary migration that enables the online garbage collection feature. |
| sleep | Used to test the correctness of the jobrunner. It sleeps for 60 seconds. |
| false | Used to test the correctness of the jobrunner. It runs the `false` command and immediately fails. |
| tagmigration | Used to synchronize tag and manifest information between the DTR database and the storage backend. |
| bloblinkmigration | A 2.1 to 2.1 upgrade process that adds references for blobs to repositories in the database. |
| license_update | Checks for license expiration extensions if online license updates are enabled. |
| scan_check | An image security scanning job. This job does not perform the actual scanning; rather, it spawns `scan_check_single` jobs (one for each layer in the image). Once all of the `scan_check_single` jobs are complete, this job terminates. |
| scan_check_single | A security scanning job for a particular layer given by the parameter `SHA256SUM`. This job breaks up the layer into components and checks each component for vulnerabilities. |
| scan_check_all | A security scanning job that updates all of the currently scanned images to display the latest vulnerabilities. |
| update_vuln_db | A job that is created to update DTR's vulnerability database. It uses an Internet connection to check for database updates through `https://dss-cve-updates.docker.com/` and updates the `dtr-scanningstore` container if there is a new update available. |
| scannedlayermigration | A 2.4 to 2.5 upgrade process that restructures scanned image data. |
| push_mirror_tag | A job that pushes a tag to another registry after a push mirror policy has been evaluated. |
| poll_mirror | A global cron that evaluates poll mirroring policies. |
| webhook | A job that is used to dispatch a webhook payload to a single endpoint. |
| nautilus_update_db | Is this different from `update_vuln_db`? |
| ro_registry | What is this? |
| tag_pruning | A configurable job which cleans up unnecessary or unwanted repository tags. For configuration options, see [Tag Pruning](../user/tag-pruning). |

## Job Status

Jobs can have one of the following status values:

| Status | Description |
|:--------------------|:---------------------------------------------------------------------------------------------------------------------------------------------|
| waiting | Unclaimed job waiting to be picked up by a worker. |
| running | The job is currently being run by the specified `workerID`. |
| done | The job has successfully completed. |
| error | The job has completed with errors. |
| cancel_request | The status of a job is monitored by the worker in the database. If the job status changes to `cancel_request`, the job is canceled by the worker. |
| cancel | The job has been canceled and was not fully executed. |
| deleted | The job and its logs have been removed. |
| worker_dead | The worker for this job has been declared `dead` and the job will not continue. |
| worker_shutdown | The worker that was running this job has been gracefully stopped. |
| worker_resurrection | The worker for this job has reconnected to the database and will cancel this job. |
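
When troubleshooting, the `error`, `worker_dead`, and `worker_shutdown` statuses are usually the ones worth investigating first. The sketch below summarizes job statuses using the `GET /api/v0/jobs/` endpoint described in [Troubleshoot Jobs via the API](troubleshoot-jobs-via-api.md); it assumes basic authentication with UCP admin credentials (placeholder URL and credentials), and the endpoint may limit or paginate results depending on your DTR version.

```python
# Minimal sketch: summarize job statuses and surface jobs needing attention.
# Placeholder URL and credentials; prefer a CA bundle over verify=False.
import collections
import requests

DTR_URL = "https://<dtr-url>"     # replace with your DTR URL
AUTH = ("admin", "password")      # replace with your UCP credentials

resp = requests.get(f"{DTR_URL}/api/v0/jobs/", auth=AUTH, verify=False)
resp.raise_for_status()
jobs = resp.json().get("jobs", [])

print(collections.Counter(job["status"] for job in jobs))

# Jobs in these states usually warrant a closer look.
for job in jobs:
    if job["status"] in {"error", "worker_dead", "worker_shutdown"}:
        print(job["id"], job["action"], job["status"], job["workerID"])
```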

## Where to go next

- [View Job Logs](view-job-logs-on-interface.md)
- [Troubleshoot Jobs via the API](troubleshoot-jobs-via-api.md)

@ -0,0 +1,174 @@

---
title: Troubleshoot Jobs via the API
description: Learn how Docker Trusted Registry runs batch jobs for job-related troubleshooting.
keywords: dtr, troubleshoot
redirect_from: /ee/dtr/admin/monitor-and-troubleshoot/troubleshoot-batch-jobs/
---

## Overview

This page covers troubleshooting batch jobs via the API, which was introduced in DTR 2.2. Starting in DTR 2.6, admins also have the ability to [manage job logs](view-job-logs-on-interface.md) from the web interface. Troubleshooting jobs via the API requires familiarity with the [DTR Job Queue](job-queue.md).
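
The sections below show the JSON returned by endpoints such as `GET /api/v0/workers` and `GET /api/v0/jobs/`. As a minimal sketch of how to call them with the Python `requests` library, assuming basic authentication with UCP admin credentials (the URL, credentials, and CA bundle path are placeholders):

```python
# Minimal sketch: query the DTR workers and jobs endpoints used below.
# DTR_URL, credentials, and the CA bundle path are placeholders.
import requests

DTR_URL = "https://<dtr-url>"
AUTH = ("admin", "password")          # UCP admin credentials
VERIFY = "/path/to/dtr-ca.pem"        # DTR CA bundle; use False only for quick tests

workers = requests.get(f"{DTR_URL}/api/v0/workers", auth=AUTH, verify=VERIFY)
jobs = requests.get(f"{DTR_URL}/api/v0/jobs/", auth=AUTH, verify=VERIFY)

print(workers.json())   # worker capacity, as shown in the next section
print(jobs.json())      # queued, running, and completed jobs
```

The equivalent calls can also be made with `curl --user <username>:<password>` against the same endpoints.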

### Job capacity

Each job runner has a limited capacity and won't claim jobs that require a
higher capacity. You can see the capacity of a job runner using the
`GET /api/v0/workers` endpoint:

```json
{
  "workers": [
    {
      "id": "000000000000",
      "status": "running",
      "capacityMap": {
        "scan": 1,
        "scanCheck": 1
      },
      "heartbeatExpiration": "2017-02-18T00:51:02Z"
    }
  ]
}
```

This means that the worker with replica ID `000000000000` has a capacity of 1
`scan` and 1 `scanCheck`. Suppose this worker notices that the following jobs
are available:

```json
{
  "jobs": [
    {
      "id": "0",
      "workerID": "",
      "status": "waiting",
      "capacityMap": {
        "scan": 1
      }
    },
    {
      "id": "1",
      "workerID": "",
      "status": "waiting",
      "capacityMap": {
        "scan": 1
      }
    },
    {
      "id": "2",
      "workerID": "",
      "status": "waiting",
      "capacityMap": {
        "scanCheck": 1
      }
    }
  ]
}
```

Our worker will be able to pick up jobs `0` and `2`, since it has the capacity
for both, while job `1` will have to wait until the previous scan job is complete:

```json
{
  "jobs": [
    {
      "id": "0",
      "workerID": "000000000000",
      "status": "running",
      "capacityMap": {
        "scan": 1
      }
    },
    {
      "id": "1",
      "workerID": "",
      "status": "waiting",
      "capacityMap": {
        "scan": 1
      }
    },
    {
      "id": "2",
      "workerID": "000000000000",
      "status": "running",
      "capacityMap": {
        "scanCheck": 1
      }
    }
  ]
}
```

You can get the list of jobs using the `GET /api/v0/jobs/` endpoint. Each job
looks like this:

```json
{
  "id": "1fcf4c0f-ff3b-471a-8839-5dcb631b2f7b",
  "retryFromID": "1fcf4c0f-ff3b-471a-8839-5dcb631b2f7b",
  "workerID": "000000000000",
  "status": "done",
  "scheduledAt": "2017-02-17T01:09:47.771Z",
  "lastUpdated": "2017-02-17T01:10:14.117Z",
  "action": "scan_check_single",
  "retriesLeft": 0,
  "retriesTotal": 0,
  "capacityMap": {
    "scan": 1
  },
  "parameters": {
    "SHA256SUM": "1bacd3c8ccb1f15609a10bd4a403831d0ec0b354438ddbf644c95c5d54f8eb13"
  },
  "deadline": "",
  "stopTimeout": ""
}
```

The fields of interest here are:

* `id`: The ID of the job.
* `workerID`: The ID of the worker in a DTR replica that is running this job.
* `status`: The current state of the job.
* `action`: The type of job the worker will actually perform.
* `capacityMap`: The available capacity a worker needs for this job to run.

### Cron jobs

Several of the jobs performed by DTR run on a recurring schedule. You can
see those jobs using the `GET /api/v0/crons` endpoint:

```json
{
  "crons": [
    {
      "id": "48875b1b-5006-48f5-9f3c-af9fbdd82255",
      "action": "license_update",
      "schedule": "57 54 3 * * *",
      "retries": 2,
      "capacityMap": null,
      "parameters": null,
      "deadline": "",
      "stopTimeout": "",
      "nextRun": "2017-02-22T03:54:57Z"
    },
    {
      "id": "b1c1e61e-1e74-4677-8e4a-2a7dacefffdc",
      "action": "update_db",
      "schedule": "0 0 3 * * *",
      "retries": 0,
      "capacityMap": null,
      "parameters": null,
      "deadline": "",
      "stopTimeout": "",
      "nextRun": "2017-02-22T03:00:00Z"
    }
  ]
}
```

The `schedule` field uses a cron expression with a leading seconds field (seconds, minutes, hours, day of month, month, day of week). For example, `57 54 3 * * *` runs daily at 03:54:57 UTC, which matches the `nextRun` value above.
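
To see when each recurring job will fire next, you can read the same endpoint programmatically. A minimal sketch, assuming basic authentication with UCP admin credentials (placeholder URL and credentials):

```python
# Minimal sketch: list recurring jobs and their next scheduled run.
import requests

DTR_URL = "https://<dtr-url>"
AUTH = ("admin", "password")      # UCP admin credentials

resp = requests.get(f"{DTR_URL}/api/v0/crons", auth=AUTH, verify=False)
resp.raise_for_status()

for cron in resp.json().get("crons", []):
    print(f'{cron["action"]:20} {cron["schedule"]:15} next run {cron["nextRun"]}')
```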

## Where to go next

- [Enable auto-deletion of job logs](./auto-delete-job-logs.md)

@ -0,0 +1,64 @@

---
title: View Job Logs
description: View a list of jobs happening within DTR and review the detailed logs for each job.
keywords: registry, jobs, log, system management, job queue
---

> BETA DISCLAIMER
>
> This is beta content. It is not yet complete and should be considered a work in progress. This content is subject to change without notice.

Since DTR 2.2, admins have been able to [view and troubleshoot jobs within DTR](troubleshoot-jobs-via-api.md) using the API. DTR 2.6 enhances those capabilities by adding a **Job Logs** tab under **System** settings on the user interface. The tab displays a sortable and paginated list of jobs along with links to associated job logs.

## View Jobs List

To view the list of jobs within DTR, do the following:

1. Navigate to `https://<dtr-url>` and log in with your UCP credentials.

2. Select **System** from the left navigation pane, and then click **Job Logs**. You should see a paginated list of past, running, and queued jobs. By default, **Job Logs** shows the latest `10` jobs on the first page.

3. Specify a filtering option. **Job Logs** lets you filter by:

* Action: See [Job Queue: Job Types](job-queue.md#job-types) for an explanation of the different actions or job types.

* Worker ID: The ID of the worker in a DTR replica that is responsible for running the job.

### Job Details

The following is an explanation of the job-related fields displayed in **Job Logs**, using the filtered `onlinegc` action from above.

| Job Detail | Description | Example |
|:----------------|:-------------------------------------------------|:--------|
| Action | The type of action or job being performed. See [Job Queue: Job Types](job-queue.md#job-types) for a full list of job types. | `onlinegc` |
| ID | The ID of the job. | `ccc05646-569a-4ac4-b8e1-113111f63fb9` |
| Worker | The ID of the worker node responsible for running the job. | `8f553c8b697c` |
| Status | Current status of the action or job. See [Job Queue: Job Status](job-queue.md#job-status) for more details. | `done` |
| Start Time | Time when the job started. | `9/23/2018 7:04 PM` |
| Last Updated | Time when the job was last updated. | `9/23/2018 7:04 PM` |
| View Logs | Links to the full logs for the job. | `[View Logs]` |

4. Optional: Click **Edit Settings** on the right of the filtering options to update your **Job Logs** settings. See [Enable auto-deletion of job logs](./auto-delete-job-logs.md) for more details.
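
The same list is also available programmatically through the jobs API described in [Troubleshoot Jobs via the API](troubleshoot-jobs-via-api.md), which can be handy when the web interface is unavailable. The following is a minimal sketch using the Python `requests` library, assuming basic authentication with UCP admin credentials (placeholder URL and credentials; the endpoint may limit or paginate results depending on your DTR version).

```python
# Minimal sketch: print a table similar to the Job Logs tab.
# Placeholder URL and credentials; prefer a CA bundle over verify=False.
import requests

DTR_URL = "https://<dtr-url>"
AUTH = ("admin", "password")      # UCP admin credentials

resp = requests.get(f"{DTR_URL}/api/v0/jobs/", auth=AUTH, verify=False)
resp.raise_for_status()

print(f'{"ACTION":25} {"ID":40} {"WORKER":15} {"STATUS":10} LAST UPDATED')
for job in resp.json().get("jobs", []):
    print(f'{job["action"]:25} {job["id"]:40} {job["workerID"]:15} '
          f'{job["status"]:10} {job["lastUpdated"]}')
```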

## View Job-specific Logs

To view the log details for a specific job, do the following:

1. Click **View Logs** next to the job's **Last Updated** value. You will be redirected to the log detail page of your selected job.

Notice how the job `ID` is reflected in the URL, while the `Action` and the abbreviated form of the job `ID` are reflected in the heading. Also, the JSON lines displayed are job-specific [DTR container logs](https://success.docker.com/article/how-to-check-the-docker-trusted-registry-dtr-logs). See [DTR Internal Components](../architecture/#dtr-internal-components) for more details.

2. Enter or select a different line count to truncate the number of lines displayed. Lines are cut off from the end of the logs.

## Where to go next

- [Enable auto-deletion of job logs](./auto-delete-job-logs.md)