Re: GRUB bug with Btrfs multiple devices

On Tue, Nov 26, 2019 at 11:07 PM Goffredo Baroncelli <kreijack@xxxxxxxxx> wrote:
>
> On 27/11/2019 02.35, Chris Murphy wrote:
> > On Tue, Nov 26, 2019 at 4:53 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
> >>
> >> On Tue, Nov 26, 2019 at 2:11 PM Goffredo Baroncelli <kreijack@xxxxxxxxx> wrote:
> >>>
> >>> I think the errors are due to the "rescan" logic (see grub commit [1]). Could you try a more recent grub (2.04 instead of 2.02)?
> >>
> >> Yes Fedora Rawhide has 2.04 in it, so I'll give that a shot next time
> >> I rebuild this particular laptop, which should be relatively soon; or
> >> even maybe I can reproduce this problem in a VM with two virtio
> >> devices.
> >
> > I was able to just update to the Fedora 2.04-4.fc32 packages. It's not
> > upstream's but it's a quick and dirty way to give it a shot. Turns
> > out, the same errors happen, although the line number for efidisk.c
> > has changed:
> > https://photos.app.goo.gl/aKWRYhJkkJRDtC1W7
> >
> > For grins, I dropped to a grub prompt, and issued ls and get a different result:
> > https://photos.app.goo.gl/MvL9QZa6zGsiktAf9
>
> Looking at the second picture, it seems that grub had problems accessing disks 0..3, and not only when it is doing btrfs activity.
> No problem accessing hd4 and hd5*
>
> Could you enable debugging with
>
>         set pager=1
>         set debug=all

I need to narrow the scope. With 'set debug=all' there is far too
much output to video: minutes of pages while holding down the space
bar the whole time, and they scroll by too fast to capture. There must
be over 1000 pages; a tiny minority contain efidisk.c references and
the vast majority are btrfs.c references. With that many pages, I was
never able to stop right on a boundary between efidisk.c and btrfs.c,
so I gave up on that approach.

Since the errors happen in efidisk.c, I've enabled 'set
debug=efidisk' and captured 74 photos, available at the link below
(they are in pager order):

https://photos.app.goo.gl/nuDH5hFMRxUVKXpX6
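
For anyone wanting to repeat the narrowed capture, it amounts to
roughly the following. This is a sketch only: it assumes the commands
are entered at the grub prompt (or placed near the top of grub.cfg)
before the failing reads occur, which this message does not spell out.

```
# Pause after each screenful so the output can be photographed.
set pager=1
# Limit debug output to the efidisk facility instead of "all".
set debug=efidisk
```

The debug variable takes a list of facility names separated by commas
or whitespace, so something like 'set debug=efidisk,btrfs' should bring
the btrfs.c messages back alongside the efidisk.c ones (assuming
"btrfs" is the facility name that btrfs.c logs under), at the cost of
much more output.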

It does seem that the errors only happen in efidisk.c, and only when
trying to read from what might be phantom devices; I do not know how a
second device in a Btrfs volume triggers this, though. Is there
some interaction between efidisk.c and btrfs.c? The grubx64.efi,
grubenv, grub.cfg, and grub modules are all on an HFS+ (no journal)
file system acting as the EFI System partition (which has been the
default behavior in Fedora on Macs for many years now). Only vmlinuz
and initramfs are on Btrfs. So I'm not really even sure why btrfs.c
gets called before the GRUB menu is displayed.

I'll see about reproducing this in a VM using edk2 UEFI and two
virtio devices, to at least get to a cleaner environment so we're not
conflating multiple system-specific oddities. I can also leave this
particular Mac laptop as it is for further study.


-- 
Chris Murphy


