Installing the NVMe Host Module Patch on Compute Nodes

When you are ready to set up your orchestration environment, you must apply a patch to all of your compute nodes. This patch addresses an issue where commands get stuck while the host NVMe-oF controller is in a reconnect state. This section describes the patch and provides installation instructions for Ubuntu and CentOS.

The NVMe controller enters the reconnect state when it loses connectivity with the target. It tries to reconnect every 10 seconds (by default) until it either reconnects successfully or times out. However, the host may not enforce the timeout and can continue trying to reconnect for a longer period, or indefinitely.

The patch provided resolves this issue and is based on the NVMe community patch. To fix this long delay, we:

  • Introduced a new session parameter, fast_io_fail_tmo. The timeout is measured in seconds from the start of the controller reconnect, and any command outstanding beyond that timeout is rejected. The new parameter value may be passed during CONNECT. The default value of -1 means there is no timeout, matching the previous behavior.
  • Added a new controller flag, NVME_CTRL_FAILFAST_EXPIRED, and a corresponding delayed work item that updates it. When the controller enters the CONNECTING state, we schedule the delayed work based on the failfast timeout value. If the controller transitions out of CONNECTING, we cancel the delayed work item and ensure failfast_expired is false. If the delayed work item expires, NVME_CTRL_FAILFAST_EXPIRED is set to true.
  • Updated the nvmf_fail_nonready_command() and nvme_available_path() functions to check the NVME_CTRL_FAILFAST_EXPIRED controller flag.

For multipath (function nvme_available_path()), the path will not be considered available if the NVME_CTRL_FAILFAST_EXPIRED controller flag is set and the controller is in the NVME_CTRL_CONNECTING state. This prevents commands from getting stuck when available paths have tried to reconnect for too long.
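Once the patched host module is loaded, the new session parameter can be supplied at connect time. The following is a sketch only: the address, port, and subsystem NQN are placeholders, and it assumes a recent nvme-cli release that exposes the parameter as --fast_io_fail_tmo.

```shell
# Placeholder target details; adjust for your fabric.
# --fast_io_fail_tmo=30 rejects outstanding commands 30 seconds after the
# controller starts reconnecting; the default of -1 keeps the old behavior
# (no timeout).
nvme connect --transport=tcp --traddr=10.0.0.5 --trsvcid=4420 \
     --nqn=nqn.2021-01.example:subsys1 --fast_io_fail_tmo=30
```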

Installing the NVMe Host Module Patch on Compute Nodes under Ubuntu

To install the NVMe host module patch for Ubuntu 20.04.2 (kernel 5.4.0-x-generic), complete the steps below on each compute node:

  1. Confirm that Secure Boot is disabled in your firmware (BIOS) settings. Note that disabling it removes the protection provided by UEFI Secure Boot.
  2. Install the NVMe host patch with
sudo dpkg -i nvmf-host-fast-io-fail-patch-5.4.0-91-generic-1.0-0-all.deb
  3. Verify that the patch has been installed by running:
modinfo nvme-core | grep description

If the patch is installed, the description line returns:

nvme host Kioxia patch (nvmf-host-fast-io-fail-patch-5.4.0-91-generic-1.0-0-all)

To uninstall the NVMe patch, execute the command:

sudo apt-get remove nvmf-host-fast-io-fail-patch-5.4.0-91-generic
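The verification in step 3 can also be scripted so each node reports its state. This is a sketch; the helper name patch_installed is ours, not part of the package.

```shell
# Hypothetical helper: classify a node from the modinfo description line.
patch_installed() {
  # $1 = output of: modinfo nvme-core | grep description
  case "$1" in
    *"nvme host Kioxia patch"*) echo "patched" ;;
    *)                          echo "stock"   ;;
  esac
}

# On a compute node:
# patch_installed "$(modinfo nvme-core | grep description)"
```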

Installing the NVMe Host Module Patch on Compute Nodes under CentOS

The steps in this section apply to CentOS 7 with a 5.x kernel and must be completed on each compute node. They have been tested on CentOS 7.9 and CentOS 8.3.

Compile the NVMe host kernel modules

  1. Install common development tools and other required components:
yum -y groupinstall "Development Tools"

yum -y install ncurses-devel hmaccalc zlib-devel binutils-devel elfutils-libelf-devel
2.  If you are running CentOS 7, install these additional tools:
yum -y install centos-release-scl

yum -y install devtoolset-8-gcc

scl enable devtoolset-8 -- bash

wget https://www.openssl.org/source/openssl-1.1.0i.tar.gz

tar -zxf openssl-1.1.0i.tar.gz

cd openssl-1.1.0i

./config

make

make install

cp /usr/local/lib64/libcrypto.so.1.1 /lib64

cd ..

3. Install kernel source and development packages for the target kernel (e.g. kernel 5.10.61):

rpm -i kernel-devel-5.10.61-1.x86_64.rpm

rpm -i kernel-headers-5.10.61-1.x86_64.rpm

rpm -i kernel-5.10.61-1.src.rpm

tar xzf /root/rpmbuild/SOURCES/kernel-5.10.61.tar.gz
4.  Apply the patch received from Kioxia:
SRC_DIR=/root/rpmbuild/SOURCES/kernel-5.10.61/drivers/nvme/host

PATH_TO_PATCH=<path-to-the-patch-dir>/nvmf_host_fast_io_fail_patch_5.10.61-1.1-1.patch

cd ${SRC_DIR}

patch --dry-run -p1 -i ${PATH_TO_PATCH}

patch -p1 -i ${PATH_TO_PATCH}
5.  Define MODULE_DESCRIPTION to identify the applied patch:
BUILD_TIME=$(date +%y%m%d%H%M)

DESCRIPTION="\"nvme host Kioxia patch (${BUILD_TIME})\""

sed -i "/MODULE_DESCRIPTION.*/d" core.c

echo -e "\nMODULE_DESCRIPTION(${DESCRIPTION});" >> core.c
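Step 5 can be rehearsed on a scratch file before editing the real core.c. This sketch (the file contents and BUILD_TIME value are illustrative) shows that the sed removes any existing MODULE_DESCRIPTION line and the echo appends the patched one.

```shell
# Demonstrate the step on a scratch file instead of the real core.c.
tmp=$(mktemp)
printf 'MODULE_DESCRIPTION("stock nvme core");\nMODULE_LICENSE("GPL");\n' > "$tmp"

DESCRIPTION='"nvme host Kioxia patch (2112010000)"'   # BUILD_TIME is illustrative
sed -i "/MODULE_DESCRIPTION.*/d" "$tmp"               # drop the old description
echo -e "\nMODULE_DESCRIPTION(${DESCRIPTION});" >> "$tmp"  # append the new one

grep MODULE_DESCRIPTION "$tmp"   # now shows only the Kioxia description
rm -f "$tmp"
```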
6.  Build the patched kernel modules and sign them:
    1. Build the modules
KDIR=/lib/modules/5.10.61/build/

make -C ${KDIR} M=${SRC_DIR}

Note: If the make fails, try upgrading gcc as shown below:

wget https://ftp.mirrorservice.org/sites/sourceware.org/pub/gcc/releases/gcc-8.5.0/gcc-8.5.0.tar.gz

tar zxf gcc-8.5.0.tar.gz

cd gcc-8.5.0

yum -y install bzip2

yum -y install libmpc-devel

./configure --disable-multilib --enable-languages=c,c++

make -j 4

make install

 

    2. Sign the modules

cd ${SRC_DIR}

KOS=($(ls *.ko))

for module in ${KOS[*]}; do
    echo "sign ${module}"
    strip --strip-unneeded ${module}
    ${KDIR}/scripts/sign-file sha256 ${KDIR}/certs/signing_key.pem ${KDIR}/certs/signing_key.x509 ${module}
done

NOTE: If the signing fails, generate keys to use for signing the kernel modules and copy them to /usr/src/kernels/$(uname -r)/certs/:

openssl req -new -x509 -newkey rsa:2048 -keyout signing_key.pem -outform DER -out signing_key.x509 -nodes -subj "/CN=Owner/"

cp signing_key.x509 signing_key.pem /usr/src/kernels/$(uname -r)/certs/

7.  Verify that the following kernel modules were produced:

nvme-core.ko

nvme.ko

nvme-fabrics.ko

nvme-tcp.ko

nvme-rdma.ko
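The presence of all five modules can be checked with a short script run from the build directory. This is a sketch; the function name check_built_modules is ours.

```shell
# Confirm that all five patched modules listed in step 7 were produced.
check_built_modules() {
  local missing=0 ko
  for ko in nvme-core.ko nvme.ko nvme-fabrics.ko nvme-tcp.ko nvme-rdma.ko; do
    if [ ! -f "$ko" ]; then
      echo "missing: $ko"
      missing=1
    fi
  done
  if [ "$missing" -eq 0 ]; then
    echo "all patched modules present"
  fi
}

# Run from ${SRC_DIR}:
check_built_modules
```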

Install the patched modules

  1. Assuming the patched modules are in the current directory, verify that their version matches the currently running kernel version:
modinfo *.ko | grep vermagic | awk '{print $2}'

uname -r
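The two outputs above can also be compared automatically. A sketch, assuming the modules are in the current directory (the helper name check_module_kver is ours):

```shell
# Compare a module's vermagic kernel string against the running kernel.
check_module_kver() {
  # $1 = kernel string from the module's vermagic, $2 = output of uname -r
  if [ "$1" = "$2" ]; then
    echo "ok"
  else
    echo "mismatch: $1 != $2"
  fi
}

# Intended usage on a compute node (modinfo prints "vermagic: <kver> SMP ..."):
# for ko in *.ko; do
#   check_module_kver "$(modinfo "$ko" | awk '/^vermagic/{print $2}')" "$(uname -r)"
# done
```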

2.  Define bash variables:

KVER=$(uname -r)

SAVED_DIR="/root/tmp/orig_$KVER"

MODULES_DIR="/lib/modules/$KVER/kernel/drivers/nvme/host"
3.  Create a backup of the initramfs image:
cp /boot/initramfs-"$KVER".img /boot/saved-initramfs-"$KVER".img
4.  Back up the original modules and replace them with the patched versions:
mkdir -p ${SAVED_DIR}

cd ${MODULES_DIR}

tar -cvf ${SAVED_DIR}/"$KVER"_kos.tar ./*.ko*

rm -f ${MODULES_DIR}/*.ko*

cp -f ${SRC_DIR}/*.ko ${MODULES_DIR}/
5.  Regenerate dependencies and initrd:
/sbin/depmod -ae -F /boot/System.map-$(uname -r)

dracut --force -H
6.  If the nvme (pci) module cannot be unloaded, reboot the machine:
reboot

Otherwise, unload the original modules and load the patched version:

modprobe -r nvme-tcp

modprobe -r nvme-rdma

modprobe -r nvme-fc

modprobe -r nvme-fabrics

modprobe -r nvme

modprobe -r nvme-core

modprobe nvme-core

modprobe nvme

modprobe nvme-fabrics

modprobe nvme-tcp

modprobe nvme-rdma
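The unload/reload sequence above can be wrapped in a single function so the ordering is not mistyped. This is a sketch (the function name and DRY_RUN switch are ours): transports are removed first and nvme-core last, then the patched set is loaded back in dependency order.

```shell
# Print a command instead of running it when DRY_RUN=1 (useful for review).
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

reload_nvme_modules() {
  local m
  # Unload in reverse dependency order: transport modules first, core last.
  for m in nvme-tcp nvme-rdma nvme-fc nvme-fabrics nvme nvme-core; do
    run modprobe -r "$m"
  done
  # Load the patched modules back, core first.
  for m in nvme-core nvme nvme-fabrics nvme-tcp nvme-rdma; do
    run modprobe "$m"
  done
}

# Preview the commands:  DRY_RUN=1 reload_nvme_modules
# On a compute node:     reload_nvme_modules
```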

Restore the original kernel modules

  1. Define bash variables:
KVER=$(uname -r)

SAVED_DIR="/root/tmp/orig_$KVER"

MODULES_DIR="/lib/modules/$KVER/kernel/drivers/nvme/host"
2.  Restore the original kernel modules from the saved tar:
cd ${MODULES_DIR}

rm -f *.ko*

tar -xvf ${SAVED_DIR}/"$KVER"_kos.tar
3.  Regenerate dependencies and initrd:
/sbin/depmod -ae -F /boot/System.map-$(uname -r)

dracut --force -H

If the nvme (pci) module cannot be unloaded, reboot the machine:

reboot

Otherwise, unload the patched modules and load the original version:

modprobe -r nvme-tcp

modprobe -r nvme-rdma

modprobe -r nvme-fc

modprobe -r nvme-fabrics

modprobe -r nvme

modprobe -r nvme-core

modprobe nvme-core

modprobe nvme

modprobe nvme-fabrics

modprobe nvme-tcp

modprobe nvme-rdma

 

Next: KumoScale v.3.20 documentation list