I/O very slow on hardware RAID after upgrading to kernel 4.9.0, Debian 9

Peter Apian-Bennewitz, 9.9.2019

problem of slowing I/O rate

After upgrading to Debian 9, I/O performance came to a grinding halt: write performance to a single file drops from around 120 MB/s to 500 kB/s and stays at that low level on an otherwise idle machine.
Characteristically, echo 2 > /proc/sys/vm/drop_caches (which frees reclaimable slab objects such as dentries and inodes) brings I/O back up to normal for a while. Note that this is not the usual effect of user I/O first filling the cache and then falling back to the hardware I/O rate, since 500 kB/s is far below what the disks can sustain.
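
One way to put numbers on the write rates quoted above is to time a synced write to a single file. The following is only a minimal sketch, assuming GNU dd and GNU date (as shipped with Debian); the helper name write_rate_kbs and its arguments are made up for illustration and are not part of the original measurements.

```shell
#!/bin/sh
# Sketch: measure the sustained write rate to one file, in kB/s.
# write_rate_kbs is a hypothetical helper, not taken from the original tests.
write_rate_kbs() {
    target=$1          # file to write (will be overwritten)
    mbytes=${2:-256}   # amount to write, in MiB
    start=$(date +%s%N)                        # GNU date: nanoseconds since epoch
    # conv=fsync makes dd flush the data to disk before it exits,
    # so the page cache does not hide the real device rate
    dd if=/dev/zero of="$target" bs=1M count="$mbytes" conv=fsync 2>/dev/null || return 1
    end=$(date +%s%N)
    elapsed_ms=$(( (end - start) / 1000000 ))
    [ "$elapsed_ms" -gt 0 ] || elapsed_ms=1    # avoid division by zero on tiny writes
    echo $(( mbytes * 1024 * 1000 / elapsed_ms ))
}
```

Calling write_rate_kbs /path/on/raid/testfile 256 once in the slow state and once right after echo 2 > /proc/sys/vm/drop_caches (as root) should make the difference visible; the path is a placeholder.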

hardware

cache info and statistic dumps

Here's a tar file with the output of vmstat, iotop and slabtop during normal operation (iofine directory) and in the "slow state" (iobad directory). Included in the tar is the Bourne shell script that generated the output: ioslow_dumpinfo.
Any pointers are very welcome.
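
For readers who want to collect comparable dumps on their own machine, here is a rough sketch of such a snapshot helper. It is not the ioslow_dumpinfo script from the tar file; the function name dump_io_state, the output file names and the extra /proc/meminfo copy are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: collect one snapshot of I/O and cache statistics into a directory.
# dump_io_state is a hypothetical helper; it is NOT the ioslow_dumpinfo script.
dump_io_state() {
    out=$1                 # output directory, e.g. "iofine" or "iobad"
    samples=${2:-5}        # number of samples for vmstat and iotop
    mkdir -p "$out" || return 1
    date > "$out/timestamp"
    # run each tool only if it is installed; iotop and slabtop usually need root
    if command -v vmstat >/dev/null 2>&1; then
        vmstat 1 "$samples" > "$out/vmstat.txt" 2>&1           # one sample per second
    fi
    if command -v iotop >/dev/null 2>&1; then
        iotop -b -o -n "$samples" > "$out/iotop.txt" 2>&1      # batch mode, active tasks only
    fi
    if command -v slabtop >/dev/null 2>&1; then
        slabtop -o > "$out/slabtop.txt" 2>&1                   # print once and exit
    fi
    if [ -r /proc/meminfo ]; then
        cat /proc/meminfo > "$out/meminfo.txt"
    fi
    return 0
}
```

Running dump_io_state iofine during normal operation and dump_io_state iobad in the slow state gives two directories that can be compared with diff.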

solutions ?

No specific information was found on the web. However, most helpful was https://cromwell-intl.com/open-source/performance-tuning/disks.html.
Varying the /proc/sys/vm/*dirt* parameters did not change the low I/O rate.
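
For reference, the writeback parameters in question can be listed with a few lines of shell. This is a sketch only; VM_DIR and show_dirty_params are hypothetical names, with /proc/sys/vm as the real default location.

```shell
#!/bin/sh
# Sketch: print every writeback ("dirty") tunable with its current value.
# VM_DIR and show_dirty_params are hypothetical names; /proc/sys/vm is the real default.
VM_DIR=${VM_DIR:-/proc/sys/vm}

show_dirty_params() {
    for f in "$VM_DIR"/*dirt*; do
        [ -r "$f" ] || continue              # skip if the glob matched nothing
        printf '%s = %s\n' "${f##*/}" "$(cat "$f")"
    done
}
```

A single value can then be varied as root, for example with echo 10 > /proc/sys/vm/dirty_ratio; the value 10 is just an example, not a setting tested here.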
What did help was echo deadline > /sys/block/sdb/queue/scheduler, which selects a different I/O scheduler.
This was reproducible: switching back to the cfq scheduler brought the slow I/O back. This was verified on a JFS and an ext4 partition on the same controller, and over a longer period. With deadline scheduling the I/O rate stays at the level expected from the underlying hardware.
Note: Changing the scheduler requires restarting the writing program to take effect. Changing the scheduler while the program writes did not show any difference in the I/O rate of the program.
update: Using deadline does not solve the problem in the long term. It just takes a bit longer to occur.
Note: cfq had been the scheduler on this hardware, running Debian 8, 3.16.0-9-686-pae, without any I/O problems.
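
The scheduler switch described above can be wrapped in two small helpers, which makes it easier to flip back and forth while testing. This is a sketch under assumptions: the function names and the SYSFS_BLOCK variable are hypothetical, and sdb is simply the device used on this machine.

```shell
#!/bin/sh
# Sketch: inspect and switch the I/O scheduler of one block device.
# SYSFS_BLOCK and both function names are hypothetical; /sys/block is the real default.
SYSFS_BLOCK=${SYSFS_BLOCK:-/sys/block}

# Print the active scheduler. The kernel marks it with brackets,
# so a file containing "noop deadline [cfq]" yields "cfq".
active_scheduler() {
    sed -n 's/.*\[\(.*\)\].*/\1/p' "$SYSFS_BLOCK/$1/queue/scheduler"
}

# Select a scheduler (needs root), refusing names the kernel does not offer.
set_scheduler() {
    dev=$1
    want=$2
    file=$SYSFS_BLOCK/$dev/queue/scheduler
    # strip the brackets, put one name per line, then look for an exact match
    if ! tr -d '[]' < "$file" | tr ' ' '\n' | grep -qx -- "$want"; then
        echo "scheduler '$want' not offered for $dev" >&2
        return 1
    fi
    echo "$want" > "$file"
}
```

Typical use would be active_scheduler sdb followed by set_scheduler sdb deadline, then restarting the writing program. The name check matters on newer kernels, where the list of offered schedulers differs (for example mq-deadline instead of deadline).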

As a side effect, using deadline scheduling seems to bring read performance through MariaDB back to normal levels, as experienced on Debian 8 and MySQL.

The Debian kernel 4.9.0 is a bit dated; the current version of this long-term kernel is 4.9.190. Recompiling a kernel had been a no-brainer until Debian started to require a ramdisk to boot. Bad concept, IMHO.
However, https://www.debian.org/releases/stretch/i386/ch08s06.html.en has a how-to for building a complete Debian package from a vanilla kernel. That worked so far.
But in this case booting this 4.9.190 kernel hangs and, as a bonus, damages the boot partition, so a rescue disk is required to restore the boot partition and make GRUB work again. Great.

Next step: trying a kernel backported from Debian 10, which uses the 4.19.x series. First thing to note: the I/O schedulers changed in 4.12 (see https://www.phoronix.com/scan.php?page=article&item=linux-412-io&num=1).


Peter Apian-Bennewitz, info[AT]pab-opto.de. Text and images are under the GNU Free Documentation License; reference to this text appreciated.