Grub does not recognize ext4 partition with certain features enabled
I was recently installing Arch Linux in a VM, following my own notes, which are a simplified and shortened version of the official instructions. The same procedure worked about 6 months ago, but for some unknown reason the new system would not boot and instead dropped into the Grub prompt. It turned out that Grub was not able to read the ext4 filesystem produced by the recent mkfs.ext4 command.
After typing some semi-random commands into the Grub prompt I became more or less certain that Grub wasn't able to identify the ext4 partition as such and so was not able to boot from it. But it remained unclear why this would happen.
The first reasonable thing I tried, to narrow the problem down, was to use the 202302 version of the installation media (the one that worked when I created my notes). Everything worked fine, so I regained some hope that I was not insane.
So the old media created a readable filesystem while the new one did not. The next thing to check was whether the problem was introduced at filesystem creation time (the mkfs.ext4 call) or later (for example, when installing Grub). To answer this question I first booted with the old installation media, partitioned the disk and formatted the filesystem using it, and then switched to a more recent installation ISO and continued with it. This way I got a bootable system.
With this outcome it was more or less clear that there was some incompatibility between mkfs.ext4 and Grub. At first the Internet wasn't helpful, but then I found Red Hat bug 1669772, "Grub does not support ext4 meta_bg flag. unbootable system after FS resize". While the specific feature reported in this bug was not the cause (it was enabled on both the 202302 and 202307 images), the bug showed me what to look for. Using debugfs I identified 3 features present in the filesystem formatted with the 202307 image that were absent from the one formatted with the 202302 mkfs.ext4. Further checking revealed that the metadata_csum_seed feature was responsible for my issues. Booting with the 202307 image and formatting the filesystem with mkfs.ext4 -O '^metadata_csum_seed' /dev/vda1 produces a bootable installation.
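A feature comparison of this kind can be sketched in shell. This is a minimal sketch under assumptions, not necessarily the exact commands used for this investigation: fs_features and feature_diff are hypothetical helper names, dumpe2fs -h is used here to print the superblock feature list (debugfs can show the same information), and the two feature lists in the example call are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the feature list stored in the superblock of an
# ext4 device or image file (reading a block device usually requires root).
fs_features() {
    dumpe2fs -h "$1" 2>/dev/null | sed -n 's/^Filesystem features:[[:space:]]*//p'
}

# Hypothetical helper: print every feature in the second space-separated list
# that does not appear in the first, one per line.
feature_diff() {
    old=" $1 "
    for feature in $2; do
        case "$old" in
            *" $feature "*) ;;                 # present in both lists
            *) printf '%s\n' "$feature" ;;     # only in the newer list
        esac
    done
}

# Example with made-up lists (real lists are much longer):
feature_diff "has_journal extent" "has_journal extent metadata_csum_seed"
```

On real filesystems the two arguments would come from the helper, for example feature_diff "$(fs_features old.img)" "$(fs_features /dev/vda1)", where old.img is a placeholder for a filesystem created with the older mkfs.ext4.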
I later found Launchpad bug #1844012, "grub2 doesn't recognize ext4 with metadata_csum_seed enabled", which lists a few more features that may prevent Grub from detecting an ext4 filesystem.
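Since Grub itself gives no hint, a check before installing the bootloader could catch this earlier. The following is only a sketch: unsupported_features is a hypothetical helper, the feature list passed to it is made up, and the deny list contains only metadata_csum_seed because that is the one feature confirmed in this post. Any further entries would have to come from the bug report for the Grub version in use.

```shell
#!/bin/sh
# Hypothetical helper: print every feature from the deny list ($2) that is
# enabled in the feature list ($1). Exit status is 0 if at least one was
# found and 1 otherwise.
unsupported_features() {
    enabled=" $1 "
    found=1
    for feature in $2; do
        case "$enabled" in
            *" $feature "*) printf '%s\n' "$feature"; found=0 ;;
        esac
    done
    return "$found"
}

# Only metadata_csum_seed is taken from this post; extend as needed.
deny_list="metadata_csum_seed"

# Made-up feature list for illustration; in practice it could be read with
# something like: dumpe2fs -h /dev/vda1
if unsupported_features "has_journal extent metadata_csum_seed" "$deny_list"; then
    echo "warning: Grub may not recognize this filesystem"
fi
```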
It is fairly sad to run into such an issue. I think it highlights the problem with the fragmented approach used in Linux, where many small components are developed independently without enough synchronisation. Grub could also have better diagnostic messages and say directly that it cannot recognize an ext4 filesystem because of a particular feature it does not support. This example also suggests that Arch Linux would benefit from better integration testing. The bug above has been open for almost 3 years by now, which is also sad. But at least I learnt something new here.
P.S. As soon as I finished writing this, I received an email saying that the aforementioned bug #1844012 had changed status: "Confirmed → Fix Released". Better late than never, I guess.