From 1dc3efd6ec987fc07fe0231e367f7e18b382ed12 Mon Sep 17 00:00:00 2001
From: Dawn W Docker
Date: Wed, 24 Jul 2019 13:32:21 -0700
Subject: [PATCH 01/11] adding ucp note per Jira ENGDOCS-58

---
 ee/ucp/release-notes.md | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/ee/ucp/release-notes.md b/ee/ucp/release-notes.md
index 1b7ee48ef8..025de700eb 100644
--- a/ee/ucp/release-notes.md
+++ b/ee/ucp/release-notes.md
@@ -186,7 +186,34 @@ In order to optimize user experience and security, support for Internet Explorer
 ### Known issues
-- kubelets or Calico-node pods are Down
+- Kubelet fails mounting local volumes in "Block" mode on SLES 12 and SLES 15 hosts
+ The error message from the kubelet looks like this, with mount returning error code 32.
+ ```
+ Operation for "\"kubernetes.io/local-volume/local-pxjz5\"" failed. No retries permitted until 2019-07-18 20:28:28.745186772 +0000 UTC m=+5936.009498175 (durationBeforeRetry 2m2s). Error: "MountVolume.MountDevice failed for volume \"local-pxjz5\" (UniqueName: \"kubernetes.io/local-volume/local-pxjz5\") pod \"pod-subpath-test-local-preprovisionedpv-l7k9\" (UID: \"364a339d-a98d-11e9-8d2d-0242ac11000b\") : local: failed to mount device /dev/loop0 at /var/lib/kubelet/plugins/kubernetes.io/local-volume/mounts/local-pxjz5 (fstype: ), error exit status 32"
+ ```
+ Issuing "dmesg" on the system will show something like:
+ ```
+ [366633.029514] EXT4-fs (loop3): Couldn't mount RDWR because of SUSE-unsupported optional feature METADATA_CSUM. Load module with allow_unsupported=1.
+ ```
+ Rootcause:
+ For block volumes, if a specific filesystem is not specified, then "ext4" is used as the default to format the volume. "mke2fs" is the util used for formatting and is part of the hyperkube image. The config file for mke2fs is at /etc/mke2fs.conf. The config file by default has the following line for ext4. Note that the features list includes something called "metadata_csum", which enables storing checksums to ensure filesystem integrity.
+ ```
+ {{[fs_types]...
+ ext4 = {features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isizeinode_size = 256}}}
+ ```
+ "metadata_csum" for ext4 is great. However, SLES12 and SLES15 call this an "experimental feature" and doesnt allow mounting of such blocks. The kubelet's mke2fs util looks up /etc/mke2fs.conf and formats the block volume with the checksum feature. Then the kubelet calls mount to mount the volume. But the kernel refuses to mount such a volume and errors with exit 32.
+
+ Resolution:
+ On SLES12 and SLES15 hosts, use `sed` to remove the `metadata_csum` feature from the ucp-kubelet container:`sed -i 's/metadata_csum,//g' /etc/mke2fs.conf`
+
+ This resolution can be automated across your cluster of SLES12 and SLES15 hosts, by creating a docker swarm service as follows. Note that, for this, the hosts should be in "swarm" mode:
+
+ Create a global docker service that remove the "metadata_csum" feature from the mke2fs config file (/etc/mke2fs.conf) in ucp-kubelet container. For this, use the UCP client bundle to point to the UCP cluster and run the following swarm commands:
+ ```
+ docker service create --mode=global --restart-condition none --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock mavenugo/swarm-exec:17.03.0-ce docker exec ucp-kubelet "/bin/bash" "-c" "sed -i 's/metadata_csum,//g' /etc/mke2fs.conf"
+ ```
+
+- Kubelets or Calico-node pods are Down
  The symptom of this issue is that kubelets or Calico-node pods are down with one of the following error messages:
  - Kubelet is unhealthy

From ba36083f4b2a286b279c7640886837d89747657c Mon Sep 17 00:00:00 2001
From: Dawn W Docker
Date: Wed, 24 Jul 2019 13:36:45 -0700
Subject: [PATCH 02/11] formatting

---
 ee/ucp/release-notes.md | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/ee/ucp/release-notes.md b/ee/ucp/release-notes.md
index 025de700eb..796e14e1dc 100644
--- a/ee/ucp/release-notes.md
+++ b/ee/ucp/release-notes.md
@@ -189,11 +189,18 @@ In order to optimize user experience and security, support for Internet Explorer
 - Kubelet fails mounting local volumes in "Block" mode on SLES 12 and SLES 15 hosts
  The error message from the kubelet looks like this, with mount returning error code 32.
  ```
- Operation for "\"kubernetes.io/local-volume/local-pxjz5\"" failed. No retries permitted until 2019-07-18 20:28:28.745186772 +0000 UTC m=+5936.009498175 (durationBeforeRetry 2m2s). Error: "MountVolume.MountDevice failed for volume \"local-pxjz5\" (UniqueName: \"kubernetes.io/local-volume/local-pxjz5\") pod \"pod-subpath-test-local-preprovisionedpv-l7k9\" (UID: \"364a339d-a98d-11e9-8d2d-0242ac11000b\") : local: failed to mount device /dev/loop0 at /var/lib/kubelet/plugins/kubernetes.io/local-volume/mounts/local-pxjz5 (fstype: ), error exit status 32"
+ Operation for "\"kubernetes.io/local-volume/local-pxjz5\"" failed. No retries
+ permitted until 2019-07-18 20:28:28.745186772 +0000 UTC m=+5936.009498175
+ (durationBeforeRetry 2m2s). Error: "MountVolume.MountDevice failed for volume \"local-pxjz5\"
+ (UniqueName: \"kubernetes.io/local-volume/local-pxjz5\") pod
+ \"pod-subpath-test-local-preprovisionedpv-l7k9\" (UID: \"364a339d-a98d-11e9-8d2d-0242ac11000b\")
+ : local: failed to mount device /dev/loop0 at /var/lib/kubelet/plugins/kubernetes.io/local-volume/mounts/local-pxjz5 (fstype: ), error exit status 32"
  ```
  Issuing "dmesg" on the system will show something like:
  ```
- [366633.029514] EXT4-fs (loop3): Couldn't mount RDWR because of SUSE-unsupported optional feature METADATA_CSUM. Load module with allow_unsupported=1.
+ [366633.029514] EXT4-fs (loop3): Couldn't mount RDWR
+ because of SUSE-unsupported optional feature METADATA_CSUM.
+ Load module with allow_unsupported=1.
  ```
  Rootcause:
  For block volumes, if a specific filesystem is not specified, then "ext4" is used as the default to format the volume. "mke2fs" is the util used for formatting and is part of the hyperkube image. The config file for mke2fs is at /etc/mke2fs.conf. The config file by default has the following line for ext4. Note that the features list includes something called "metadata_csum", which enables storing checksums to ensure filesystem integrity.
@@ -210,7 +217,9 @@ In order to optimize user experience and security, support for Internet Explorer
  Create a global docker service that remove the "metadata_csum" feature from the mke2fs config file (/etc/mke2fs.conf) in ucp-kubelet container. For this, use the UCP client bundle to point to the UCP cluster and run the following swarm commands:
  ```
- docker service create --mode=global --restart-condition none --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock mavenugo/swarm-exec:17.03.0-ce docker exec ucp-kubelet "/bin/bash" "-c" "sed -i 's/metadata_csum,//g' /etc/mke2fs.conf"
+ docker service create --mode=global --restart-condition none --mount
+ type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock mavenugo/swarm-exec:17.03.0-ce docker
+ exec ucp-kubelet "/bin/bash" "-c" "sed -i 's/metadata_csum,//g' /etc/mke2fs.conf"
  ```
 - Kubelets or Calico-node pods are Down

From f6bcd72d664015ff2209779c85197e3036c85306 Mon Sep 17 00:00:00 2001
From: Dawn W Docker
Date: Wed, 24 Jul 2019 13:37:52 -0700
Subject: [PATCH 03/11] formatting

---
 ee/ucp/release-notes.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/ee/ucp/release-notes.md b/ee/ucp/release-notes.md
index 796e14e1dc..8543e13e11 100644
--- a/ee/ucp/release-notes.md
+++ b/ee/ucp/release-notes.md
@@ -194,7 +194,9 @@ In order to optimize user experience and security, support for Internet Explorer
  (durationBeforeRetry 2m2s). Error: "MountVolume.MountDevice failed for volume \"local-pxjz5\"
  (UniqueName: \"kubernetes.io/local-volume/local-pxjz5\") pod
  \"pod-subpath-test-local-preprovisionedpv-l7k9\" (UID: \"364a339d-a98d-11e9-8d2d-0242ac11000b\")
- : local: failed to mount device /dev/loop0 at /var/lib/kubelet/plugins/kubernetes.io/local-volume/mounts/local-pxjz5 (fstype: ), error exit status 32"
+ : local: failed to mount device /dev/loop0 at
+ /var/lib/kubelet/plugins/kubernetes.io/local-volume/mounts/local-pxjz5 (fstype: ),
+ error exit status 32"
  ```
  Issuing "dmesg" on the system will show something like:
  ```

From df82a23d6495c18fd459c53176b5dbc9e20b1559 Mon Sep 17 00:00:00 2001
From: Dawn W Docker
Date: Wed, 24 Jul 2019 13:51:35 -0700
Subject: [PATCH 04/11] formatting

---
 ee/ucp/release-notes.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/ee/ucp/release-notes.md b/ee/ucp/release-notes.md
index 8543e13e11..0624d5023e 100644
--- a/ee/ucp/release-notes.md
+++ b/ee/ucp/release-notes.md
@@ -200,9 +200,7 @@ In order to optimize user experience and security, support for Internet Explorer
  ```
  Issuing "dmesg" on the system will show something like:
  ```
- [366633.029514] EXT4-fs (loop3): Couldn't mount RDWR
- because of SUSE-unsupported optional feature METADATA_CSUM.
- Load module with allow_unsupported=1.
+ [366633.029514] EXT4-fs (loop3): Couldn't mount RDWR because of SUSE-unsupported optional feature METADATA_CSUM. Load module with allow_unsupported=1.
  ```
  Rootcause:
  For block volumes, if a specific filesystem is not specified, then "ext4" is used as the default to format the volume. "mke2fs" is the util used for formatting and is part of the hyperkube image. The config file for mke2fs is at /etc/mke2fs.conf. The config file by default has the following line for ext4. Note that the features list includes something called "metadata_csum", which enables storing checksums to ensure filesystem integrity.
@@ -210,7 +208,7 @@ In order to optimize user experience and security, support for Internet Explorer
  {{[fs_types]...
  ext4 = {features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isizeinode_size = 256}}}
  ```
- "metadata_csum" for ext4 is great. However, SLES12 and SLES15 call this an "experimental feature" and doesnt allow mounting of such blocks. The kubelet's mke2fs util looks up /etc/mke2fs.conf and formats the block volume with the checksum feature. Then the kubelet calls mount to mount the volume. But the kernel refuses to mount such a volume and errors with exit 32.
+ "metadata_csum" for ext4 is great. However, SLES12 and SLES15 call this an "experimental feature" and doesnt allow mounting of such blocks. The kubelet's mke2fs util looks up /etc/mke2fs.conf and formats the block volume with the checksum feature. Then the kubelet tries to mount the volume. But the kernel refuses to mount such a volume and errors with exit 32.
  Resolution:
  On SLES12 and SLES15 hosts, use `sed` to remove the `metadata_csum` feature from the ucp-kubelet container:`sed -i 's/metadata_csum,//g' /etc/mke2fs.conf`
@@ -223,6 +221,7 @@ In order to optimize user experience and security, support for Internet Explorer
  type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock mavenugo/swarm-exec:17.03.0-ce docker
  exec ucp-kubelet "/bin/bash" "-c" "sed -i 's/metadata_csum,//g' /etc/mke2fs.conf"
  ```
+ You can now make nodes kubernetes workers.
 - Kubelets or Calico-node pods are Down

From 583e7dd36b5f616faae6a0b6fb7aa4e0694d52bf Mon Sep 17 00:00:00 2001
From: Dawn W Docker
Date: Wed, 24 Jul 2019 14:01:29 -0700
Subject: [PATCH 05/11] formatting

---
 ee/ucp/release-notes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ee/ucp/release-notes.md b/ee/ucp/release-notes.md
index 0624d5023e..a2a7d9faef 100644
--- a/ee/ucp/release-notes.md
+++ b/ee/ucp/release-notes.md
@@ -221,7 +221,7 @@ In order to optimize user experience and security, support for Internet Explorer
  type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock mavenugo/swarm-exec:17.03.0-ce docker
  exec ucp-kubelet "/bin/bash" "-c" "sed -i 's/metadata_csum,//g' /etc/mke2fs.conf"
  ```
- You can now make nodes kubernetes workers.
+ You can now switch nodes to be kubernetes workers.
 - Kubelets or Calico-node pods are Down

From 6311dfd7ecb1e9f804709f22d9ae799a2d6ce884 Mon Sep 17 00:00:00 2001
From: Adrian Plata <51415348+adrian-plata@users.noreply.github.com>
Date: Wed, 24 Jul 2019 14:44:40 -0700
Subject: [PATCH 06/11] Update release-notes.md

---
 ee/ucp/release-notes.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ee/ucp/release-notes.md b/ee/ucp/release-notes.md
index a2a7d9faef..6e09f9c9bc 100644
--- a/ee/ucp/release-notes.md
+++ b/ee/ucp/release-notes.md
@@ -208,14 +208,14 @@ In order to optimize user experience and security, support for Internet Explorer
  {{[fs_types]...
  ext4 = {features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isizeinode_size = 256}}}
  ```
- "metadata_csum" for ext4 is great. However, SLES12 and SLES15 call this an "experimental feature" and doesnt allow mounting of such blocks. The kubelet's mke2fs util looks up /etc/mke2fs.conf and formats the block volume with the checksum feature. Then the kubelet tries to mount the volume. But the kernel refuses to mount such a volume and errors with exit 32.
+ "metadata_csum" for ext4 is great. However, SLES12 and SLES15 call this an "experimental feature" and doesn't allow mounting of such blocks. The kubelet's mke2fs util looks up /etc/mke2fs.conf and formats the block volume with the checksum feature. Then the kubelet tries to mount the volume. But the kernel refuses to mount such a volume and errors with exit 32.
  Resolution:
  On SLES12 and SLES15 hosts, use `sed` to remove the `metadata_csum` feature from the ucp-kubelet container:`sed -i 's/metadata_csum,//g' /etc/mke2fs.conf`
  This resolution can be automated across your cluster of SLES12 and SLES15 hosts, by creating a docker swarm service as follows. Note that, for this, the hosts should be in "swarm" mode:
- Create a global docker service that remove the "metadata_csum" feature from the mke2fs config file (/etc/mke2fs.conf) in ucp-kubelet container. For this, use the UCP client bundle to point to the UCP cluster and run the following swarm commands:
+ Create a global docker service that remove the "metadata_csum" feature from the mke2fs config file (/etc/mke2fs.conf) in ucp-kubelet container. For this, use the UCP client bundle to point to the UCP cluster and run the following swarm commands:
  ```
  docker service create --mode=global --restart-condition none --mount
  type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock mavenugo/swarm-exec:17.03.0-ce docker

From 884eb9d3629e7e8dd93a1ef0e4a77ef1ae1d1c40 Mon Sep 17 00:00:00 2001
From: Adrian Plata <51415348+adrian-plata@users.noreply.github.com>
Date: Wed, 24 Jul 2019 16:47:41 -0700
Subject: [PATCH 07/11] Update release-notes.md

---
 ee/ucp/release-notes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ee/ucp/release-notes.md b/ee/ucp/release-notes.md
index 6e09f9c9bc..b4e69d7039 100644
--- a/ee/ucp/release-notes.md
+++ b/ee/ucp/release-notes.md
@@ -208,7 +208,7 @@ In order to optimize user experience and security, support for Internet Explorer
  {{[fs_types]...
  ext4 = {features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isizeinode_size = 256}}}
  ```
- "metadata_csum" for ext4 is great. However, SLES12 and SLES15 call this an "experimental feature" and doesn't allow mounting of such blocks. The kubelet's mke2fs util looks up /etc/mke2fs.conf and formats the block volume with the checksum feature. Then the kubelet tries to mount the volume. But the kernel refuses to mount such a volume and errors with exit 32.
+ "metadata_csum" for ext4 on SLES12 and SLES15 is an "experimental feature" and the kernel does not allow mounting of volumes that have been formatted with "metadata checksum" enabled. In the ucp-kubelet container, mke2fs is configured to enable metadata check-summing while formatting block volumes. The kubelet tries to mount such a block volume, but the kernel denies the mount with exit error 32.
  Resolution:
  On SLES12 and SLES15 hosts, use `sed` to remove the `metadata_csum` feature from the ucp-kubelet container:`sed -i 's/metadata_csum,//g' /etc/mke2fs.conf`

From 7ab4350c891d8bbd3f3aeef0c1db9c88c1444530 Mon Sep 17 00:00:00 2001
From: Dawn W <51414965+DawnWood-Docker@users.noreply.github.com>
Date: Wed, 24 Jul 2019 17:01:08 -0700
Subject: [PATCH 08/11] Update release-notes.md

---
 ee/ucp/release-notes.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/ee/ucp/release-notes.md b/ee/ucp/release-notes.md
index b4e69d7039..6dea4a1d97 100644
--- a/ee/ucp/release-notes.md
+++ b/ee/ucp/release-notes.md
@@ -203,10 +203,10 @@ In order to optimize user experience and security, support for Internet Explorer
  [366633.029514] EXT4-fs (loop3): Couldn't mount RDWR because of SUSE-unsupported optional feature METADATA_CSUM. Load module with allow_unsupported=1.
  ```
  Rootcause:
- For block volumes, if a specific filesystem is not specified, then "ext4" is used as the default to format the volume. "mke2fs" is the util used for formatting and is part of the hyperkube image. The config file for mke2fs is at /etc/mke2fs.conf. The config file by default has the following line for ext4. Note that the features list includes something called "metadata_csum", which enables storing checksums to ensure filesystem integrity.
+ For block volumes, if a specific filesystem is not specified, then "ext4" is used as the default to format the volume. "mke2fs" is the util used for formatting and is part of the hyperkube image. The config file for mke2fs is at /etc/mke2fs.conf. The config file by default has the following line for ext4. Note that the features list includes "metadata_csum", which enables storing checksums to ensure filesystem integrity.
  ```
- {{[fs_types]...
- ext4 = {features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isizeinode_size = 256}}}
+ [fs_types]...
+ ext4 = {features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isizeinode_size = 256}
  ```
  "metadata_csum" for ext4 on SLES12 and SLES15 is an "experimental feature" and the kernel does not allow mounting of volumes that have been formatted with "metadata checksum" enabled. In the ucp-kubelet container, mke2fs is configured to enable metadata check-summing while formatting block volumes. The kubelet tries to mount such a block volume, but the kernel denies the mount with exit error 32.
From 5d7ef9cb1711a9cdbe41e2cae065f7be0b474a09 Mon Sep 17 00:00:00 2001
From: Dawn W <51414965+DawnWood-Docker@users.noreply.github.com>
Date: Wed, 24 Jul 2019 17:02:34 -0700
Subject: [PATCH 09/11] Update release-notes.md

---
 ee/ucp/release-notes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ee/ucp/release-notes.md b/ee/ucp/release-notes.md
index 6dea4a1d97..e0fa82eed1 100644
--- a/ee/ucp/release-notes.md
+++ b/ee/ucp/release-notes.md
@@ -187,7 +187,7 @@ In order to optimize user experience and security, support for Internet Explorer
 ### Known issues
 - Kubelet fails mounting local volumes in "Block" mode on SLES 12 and SLES 15 hosts
- The error message from the kubelet looks like this, with mount returning error code 32.
+ The error message from the kubelet looks like this, with `mount` returning error code 32.
  ```
  Operation for "\"kubernetes.io/local-volume/local-pxjz5\"" failed. No retries
  permitted until 2019-07-18 20:28:28.745186772 +0000 UTC m=+5936.009498175

From b59be9aa350c922742b4912d9da4051d923cea84 Mon Sep 17 00:00:00 2001
From: Dawn W <51414965+DawnWood-Docker@users.noreply.github.com>
Date: Wed, 24 Jul 2019 17:03:51 -0700
Subject: [PATCH 10/11] Update release-notes.md

---
 ee/ucp/release-notes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ee/ucp/release-notes.md b/ee/ucp/release-notes.md
index e0fa82eed1..d00d2c77ee 100644
--- a/ee/ucp/release-notes.md
+++ b/ee/ucp/release-notes.md
@@ -215,7 +215,7 @@ In order to optimize user experience and security, support for Internet Explorer
  This resolution can be automated across your cluster of SLES12 and SLES15 hosts, by creating a docker swarm service as follows. Note that, for this, the hosts should be in "swarm" mode:
- Create a global docker service that remove the "metadata_csum" feature from the mke2fs config file (/etc/mke2fs.conf) in ucp-kubelet container. For this, use the UCP client bundle to point to the UCP cluster and run the following swarm commands:
+ Create a global docker service that removes the "metadata_csum" feature from the mke2fs config file (/etc/mke2fs.conf) in ucp-kubelet container. For this, use the UCP client bundle to point to the UCP cluster and run the following swarm commands:
  ```
  docker service create --mode=global --restart-condition none --mount
  type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock mavenugo/swarm-exec:17.03.0-ce docker

From 6883fe994843a8df00d6b6091479c7a1afe377bf Mon Sep 17 00:00:00 2001
From: Dawn W <51414965+DawnWood-Docker@users.noreply.github.com>
Date: Wed, 24 Jul 2019 17:06:39 -0700
Subject: [PATCH 11/11] Update release-notes.md

---
 ee/ucp/release-notes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ee/ucp/release-notes.md b/ee/ucp/release-notes.md
index d00d2c77ee..16989d78ac 100644
--- a/ee/ucp/release-notes.md
+++ b/ee/ucp/release-notes.md
@@ -203,7 +203,7 @@ In order to optimize user experience and security, support for Internet Explorer
  [366633.029514] EXT4-fs (loop3): Couldn't mount RDWR because of SUSE-unsupported optional feature METADATA_CSUM. Load module with allow_unsupported=1.
  ```
  Rootcause:
- For block volumes, if a specific filesystem is not specified, then "ext4" is used as the default to format the volume. "mke2fs" is the util used for formatting and is part of the hyperkube image. The config file for mke2fs is at /etc/mke2fs.conf. The config file by default has the following line for ext4. Note that the features list includes "metadata_csum", which enables storing checksums to ensure filesystem integrity.
+ For block volumes, if a specific filesystem is not specified, "ext4" is used as the default to format the volume. "mke2fs" is the util used for formatting and is part of the hyperkube image. The config file for mke2fs is at /etc/mke2fs.conf. The config file by default has the following line for ext4. Note that the features list includes "metadata_csum", which enables storing checksums to ensure filesystem integrity.
  ```
  [fs_types]...
  ext4 = {features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isizeinode_size = 256}
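
The `sed` expression that the release note applies to `/etc/mke2fs.conf` can be tried on a sample line first, without touching any file. This is a minimal sketch, assuming a POSIX shell: the helper name `strip_metadata_csum` and the shortened sample features line are illustrative and not part of the patch. Only the `s/metadata_csum,//g` expression is taken from the release note.

```shell
# Hypothetical helper (not part of the patch): apply the release note's sed
# expression to stdin instead of editing /etc/mke2fs.conf in place.
strip_metadata_csum() {
    sed 's/metadata_csum,//g'
}

# Shortened sample line, loosely modeled on the ext4 entry quoted in the note.
sample='ext4 = {features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink}'

printf '%s\n' "$sample" | strip_metadata_csum
# prints: ext4 = {features = has_journal,extent,huge_file,flex_bg,64bit,dir_nlink}
```

The expression matches `metadata_csum` only when a comma follows it, so a features list that ends with `metadata_csum` would pass through unchanged. That does not affect the line quoted in the release note, where the feature sits mid-list.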