Re: Removing a failed device - stuck in a loop or normal?

On 2019/6/14 7:17 AM, Steven Fosdick wrote:
> I have a BTRFS volume with four devices, one of which has failed
> and is no longer present in the machine.  The volume is mounted in
> degraded mode.  I am trying to remove the failed device with:
> 
> btrfs device remove missing /data
> 
> There should be enough space to consolidate the data onto the three
> remaining discs before adding a fourth.  The first few attempts have
> failed with errors of the form:
> 
> Jun 12 14:54:36 meije kernel: BTRFS info (device sda): relocating
> block group 10436799889408 flags data|raid5
> Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed
> root -9 ino 272 off 519241728 csum 0x9cb8912f expected csum 0x73ba6e2a
> mirror 2

That's common if the device is really failing.
Raid5 should re-build the corrupted blocks.

> Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed
> root -9 ino 272 off 519245824 csum 0x98f94189 expected csum 0x4ab823e6
> mirror 2
> Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed
> root -9 ino 272 off 519254016 csum 0xd3f53909 expected csum 0x94ab4db4
> mirror 2
> Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed
> root -9 ino 272 off 519249920 csum 0xcb29eade expected csum 0x65d28b9e
> mirror 2
> Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed
> root -9 ino 272 off 519258112 csum 0x714821f5 expected csum 0xeed771e2
> mirror 2
> Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed
> root -9 ino 272 off 519262208 csum 0x574f1bdc expected csum 0x5a78e046
> mirror 2
> Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed
> root -9 ino 272 off 519266304 csum 0x63ec8641 expected csum 0xcee67afe
> mirror 2
> Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed
> root -9 ino 272 off 519270400 csum 0xb3d8a215 expected csum 0x39db0f0a
> mirror 2
> Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed
> root -9 ino 272 off 519274496 csum 0x910dd641 expected csum 0x3599ad7d
> mirror 2
> Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed
> root -9 ino 272 off 519278592 csum 0xe6ca8bc2 expected csum 0x413d5da7
> mirror 2
> 
> Deleting the files concerned allows it to progress further and the

The csum failures are reported against the data reloc tree (root -9),
so I'm not sure deleting files would really help.
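
To see which root, inode and offsets the warnings cover, the csum lines
can be summarised with a short script. This is only a sketch: it assumes
each kernel message has been rejoined onto a single line (the mail client
wrapped them above), and the names summarize_csum_failures and SAMPLE are
made up for illustration.

```python
import re

# Matches the "csum failed" warning format quoted above, one record per line.
CSUM_RE = re.compile(
    r"csum failed root (?P<root>-?\d+) ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<got>0x[0-9a-f]+) expected csum (?P<want>0x[0-9a-f]+) "
    r"mirror (?P<mirror>\d+)"
)


def summarize_csum_failures(lines):
    """Group csum failures by (root, ino); track count and offset range."""
    summary = {}
    for line in lines:
        m = CSUM_RE.search(line)
        if m is None:
            continue  # not a csum warning
        key = (int(m.group("root")), int(m.group("ino")))
        off = int(m.group("off"))
        entry = summary.setdefault(
            key, {"count": 0, "min_off": off, "max_off": off}
        )
        entry["count"] += 1
        entry["min_off"] = min(entry["min_off"], off)
        entry["max_off"] = max(entry["max_off"], off)
    return summary


# Three of the warnings from the report, rejoined onto single lines.
SAMPLE = [
    "Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed "
    "root -9 ino 272 off 519241728 csum 0x9cb8912f expected csum 0x73ba6e2a "
    "mirror 2",
    "Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed "
    "root -9 ino 272 off 519245824 csum 0x98f94189 expected csum 0x4ab823e6 "
    "mirror 2",
    "Jun 12 14:54:41 meije kernel: BTRFS warning (device sda): csum failed "
    "root -9 ino 272 off 519254016 csum 0xd3f53909 expected csum 0x94ab4db4 "
    "mirror 2",
]

if __name__ == "__main__":
    for (root, ino), info in summarize_csum_failures(SAMPLE).items():
        print(root, ino, info)
```

If every entry comes back under root -9, the failures are all in the
relocation inode rather than in a user-visible file.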

> device remove has been logging messages of the form:
> 
> Jun 13 21:14:01 meije kernel: BTRFS info (device sda): relocating
> block group 7956456275968 flags data|raid5
> Jun 13 21:14:36 meije kernel: BTRFS info (device sda): found 785 extents
> Jun 13 21:14:46 meije kernel: BTRFS info (device sda): found 785 extents
> 
> The numbers obviously vary but the pattern of those three lines, with
> the block group and two identical "found extents" lines, has been
> repeating for several hours and the amount of data reported by:
> 
> btrfs fi usage /data
> 
> as being on the missing disc has been gradually reducing and the
> amount on the other three gradually increasing just as I would expect.
> Now, however, there is a new pattern:
> 
> Jun 13 21:14:54 meije kernel: BTRFS info (device sda): relocating
> block group 7955382534144 flags metadata|raid1
> Jun 13 21:18:51 meije kernel: BTRFS info (device sda): found 51353 extents
> Jun 13 21:19:18 meije kernel: BTRFS info (device sda): found 1 extents
> Jun 13 21:19:23 meije kernel: BTRFS info (device sda): found 1 extents
> Jun 13 21:19:27 meije kernel: BTRFS info (device sda): found 1 extents
> Jun 13 21:19:32 meije kernel: BTRFS info (device sda): found 1 extents
> Jun 13 21:19:36 meije kernel: BTRFS info (device sda): found 1 extents
> Jun 13 21:19:40 meije kernel: BTRFS info (device sda): found 1 extents
> Jun 13 21:19:44 meije kernel: BTRFS info (device sda): found 1 extents
> Jun 13 21:19:48 meije kernel: BTRFS info (device sda): found 1 extents
> Jun 13 21:19:52 meije kernel: BTRFS info (device sda): found 1 extents
> 
> With the last line repeating.  So far there have been 9,347 of the
> "found 1 extents" messages with no other BTRFS messages in between.
> The amount of data on the missing disc does not seem to be decreasing
> now.
> 
> Does this seem like normal behaviour, or has it got stuck in an
> infinite loop, i.e. is it finding the same extent over and over again?

Looks like a dead loop.
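
One rough way to confirm that from the log is to count how many times the
same "found N extents" message repeats within a single block group. The
sketch below assumes unwrapped log lines; the function names and the
threshold of 3 are arbitrary choices for illustration (the healthy
pattern in the report shows two identical lines per block group), not
anything the kernel defines.

```python
import re

RELOC_RE = re.compile(r"relocating block group (\d+)")
FOUND_RE = re.compile(r"found (\d+) extents")


def longest_runs(lines):
    """Map block group -> (extent_count, run_length) for its longest run
    of identical consecutive "found N extents" messages."""
    result = {}
    current_bg = None
    last_count = None
    run = 0
    for line in lines:
        m = RELOC_RE.search(line)
        if m:
            # A new block group starts; reset the run tracking.
            current_bg = int(m.group(1))
            last_count, run = None, 0
            continue
        m = FOUND_RE.search(line)
        if m is None or current_bg is None:
            continue
        count = int(m.group(1))
        run = run + 1 if count == last_count else 1
        last_count = count
        if run > result.get(current_bg, (0, 0))[1]:
            result[current_bg] = (count, run)
    return result


def stuck_block_groups(lines, threshold=3):
    """Block groups whose longest identical run reaches the threshold."""
    return [bg for bg, (_, run) in longest_runs(lines).items()
            if run >= threshold]


# Messages from the report, unwrapped and with timestamps dropped.
LOG = [
    "BTRFS info (device sda): relocating block group 7956456275968 flags data|raid5",
    "BTRFS info (device sda): found 785 extents",
    "BTRFS info (device sda): found 785 extents",
    "BTRFS info (device sda): relocating block group 7955382534144 flags metadata|raid1",
    "BTRFS info (device sda): found 51353 extents",
] + ["BTRFS info (device sda): found 1 extents"] * 9

if __name__ == "__main__":
    print(stuck_block_groups(LOG))
```

Here only the metadata block group 7955382534144 is flagged, which
matches the point where the amount of data on the missing disc stopped
decreasing.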

Would you please provide the kernel version?

Also, have you tried cancelling the current balance and starting a new one?

Thanks,
Qu

>  What should I do?
> 
> Regards,
> Steve.
> 
