On Mon, Jan 6, 2020 at 4:14 AM Leszek Dubiel <leszek@xxxxxxxxx> wrote:
>
> On 02.01.2020 at 22:57, Chris Murphy wrote:
>
> > but I would say that in retrospect it would have been better to NOT
> > delete the device with a few bad sectors, and instead use `btrfs
> > replace` to do a 1:1 replacement of that particular drive.
>
>
> Tested "replace" on another server:
>
>      root@zefir:~# btrfs replace start /dev/sde1 /dev/sdb3 /
>
> and speed was quite normal:
>
>      1.49 TiB * ( 1024 * 1024 MiB/TiB ) / ( 4.5 hours * 3600 sec/hour )
>        = 1.49 * ( 1024 * 1024 ) / ( 4.5 * 3600 ) = 96.44 MiB / sec
>
>
> Questions:
>
> 1. it is a little bit confusing that kernel reports sdc1 and sde1 on the
> same line: "BTRFS warning (device sdc1): i/o error ... on dev
> /dev/sde1"....

Can you provide the entire line? It's probably already confusing but
the ellipsis makes it more confusing.

>
> # reduce slack
> root@zefir:~# btrfs fi resize 4:max /
> Resize '/' of '4:max'
> root@zefir:~# btrfs dev usage /
> ...
> /dev/sdb3, ID: 4
>     Device size:             1.80TiB
>     Device slack:            3.50KiB <<<<<<<<<<<<<<<<<<<<

Maybe the partition isn't aligned on a 4KiB boundary? *shrug*

But yeah, one gotcha with 'btrfs replace' is that the replacement must
be equal to or bigger than the drive being replaced; and once
complete, the file system is not resized to fully utilize the
replacement drive. That's intentional because by default you may want
allocations to have the same balance as with the original device. If
you resize to max, Btrfs will favor allocations to the drive with the
most free space.

> Jan  5 13:52:09 zefir kernel: [1291932.446093] BTRFS warning (device
> sdc1): i/o error at logical 11658111352832 on dev /dev/sde1, physical
> 867246145536: metadata leaf (level 0) in tree 9109477097472

Ahh yeah I see what you mean. I think that's confusing also. The error
is on sde1. But I guess sdc1 is reported first because it's the device
the kernel considers mounted; there's not really a good facility (or
maybe it's in the newer VFS mount code, not sure) for showing two
devices mounted on a single mount point.

> [/dev/sda1].write_io_errs    10418
> [/dev/sda1].read_io_errs     227
> [/dev/sda1].flush_io_errs    117
> [/dev/sda1].corruption_errs  77
> [/dev/sda1].generation_errs  47

This isn't good either. I'd keep an eye on that. Read errors can be
fixed up if there's a good copy: Btrfs will use the good copy and
overwrite the bad sector, *if* the drive's SCT ERC duration is shorter
than the SCSI command timer. But write and flush errors are bad. You
need to find out what that's about.

-- 
Chris Murphy
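
The SCT ERC versus SCSI command timer condition above can be checked
with something like the following. This is only a rough sketch under
a few assumptions: the function names are made up for illustration,
the ERC value is whatever `smartctl -l scterc /dev/sdX` reports (in
units of 100 ms, with 0 treated here as "disabled"), and the command
timer is read from /sys/block/<disk>/device/timeout, which on Linux
holds the value in seconds (30 by default).

```python
from pathlib import Path


def read_command_timer_seconds(disk: str) -> int:
    """Read the kernel's SCSI command timer for a disk such as 'sda'.

    Hypothetical helper: it only reads the sysfs file, which holds the
    timer in whole seconds.
    """
    return int(Path(f"/sys/block/{disk}/device/timeout").read_text().strip())


def erc_fits_within_timer(erc_deciseconds: int, timer_seconds: int) -> bool:
    """Return True if the drive gives up on a bad sector before the kernel
    gives up on the drive.

    erc_deciseconds: SCT ERC value in units of 100 ms, as reported by
    `smartctl -l scterc`. Assumption: 0 (or less) means ERC is disabled,
    so the drive may keep retrying for an unbounded time.
    """
    if erc_deciseconds <= 0:
        return False
    return erc_deciseconds / 10 < timer_seconds
```

For example, erc_fits_within_timer(70, read_command_timer_seconds("sda"))
compares a 7.0 second ERC setting against the current timer for sda.
If it returns False, the kernel may reset the link before the drive
reports the read error, and Btrfs never gets the chance to rewrite the
bad sector from the good copy.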
