Our setup is a 3-node RHEL 7.3 bare-metal Kubernetes cluster running on Docker.
We have a multipath FC SAN block device that is discovered on all three nodes. The device is used as a Kubernetes PersistentVolume with an ext4 filesystem. The object is defined as follows:
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: 2019-01-04T13:49:42Z
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    ...
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 15Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: ...
    namespace: ...
    resourceVersion: ...
    uid: ...
  fc:
    fsType: ext4
    lun: 1
    targetWWNs:
    - ...04
    - ...15
  persistentVolumeReclaimPolicy: Retain
status:
  phase: Bound
The pod using this volume crashed. When it restarted, the kubelet failed to mount the volume, reporting a filesystem inconsistency and asking for fsck to be run manually:
Warning FailedMount 1m (x13 over 26m) kubelet, node2 MountVolume.WaitForAttach failed for volume "rtbm-prod-influxdb-pv" : fc: failed to mount fc volume /dev/dm-9 [ext4] to /var/lib/kubelet/plugins/kubernetes.io/fc/50060e801232d404-lun-1, error 'fsck' found errors on device /dev/dm-9 but could not correct them: fsck from util-linux 2.23.2
k8s-san-0 contains a file system with errors, check forced.
k8s-san-0: Entry '675' in /data/_internal/monitor (262147) has an incorrect filetype (was 2, should be 1).
k8s-san-0: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
However, we were unable to start fsck. Even after we undeployed the pod, fsck still refused to run:
# fsck.ext4 /dev/mapper/mpathb
e2fsck 1.42.9 (28-Dec-2013)
/dev/mapper/mpathb is in use.
e2fsck: Cannot continue, aborting.
I tried to see what exactly was using the device:
# mount -l | grep -i mpathb
# lsof /dev/mapper/mpathb
# grep mpathb /proc/mounts
# fuser -m /dev/mapper/mpathb
None of these tools reported anything using the device. What else can I check to find out what is holding my block device?
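One further check I could script, offered only as a rough sketch: the kubelet log calls the device /dev/dm-9 while my commands used /dev/mapper/mpathb, so a search by name may miss a mount recorded under the other name, or one that exists only in another process's mount namespace. The functions below match on the kernel's major:minor number instead and read every `/proc/<pid>/mountinfo`. The function names are hypothetical, and I am assuming the standard mountinfo layout, where field 3 is major:minor and field 5 is the mount point.

```shell
#!/bin/sh
# Sketch: look for a block device in the mount tables of all processes,
# matching on major:minor rather than on the device name.

# Print the mount points listed in one mountinfo-format file that sit on
# the given major:minor pair (field 3 = major:minor, field 5 = mount point).
mountpoints_for_devno() {
    # $1 = mountinfo file, $2 = major:minor
    awk -v devno="$2" '$3 == devno { print $5 }' "$1"
}

# Walk /proc/<pid>/mountinfo for every process and report each hit.
# Hypothetical helper; run as root so that all processes are readable.
scan_all_namespaces() {
    # $1 = major:minor
    for f in /proc/[0-9]*/mountinfo; do
        [ -r "$f" ] || continue
        pid=${f#/proc/}
        pid=${pid%/mountinfo}
        mountpoints_for_devno "$f" "$1" 2>/dev/null |
            while read -r mp; do
                echo "pid $pid: $mp"
            done
    done
}
```

It would be called as `scan_all_namespaces 253:9`, where 253:9 is only a placeholder for the device's real major:minor number (as shown by `ls -l` on the device node or by `lsblk`). An empty result would suggest that the holder is not a mounted filesystem at all.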