Re: very slow "btrfs dev delete" 3x6Tb, 7Tb of data

On Wed, Dec 25, 2019 at 3:42 PM Leszek Dubiel <leszek@xxxxxxxxx> wrote:
>
>
> Hello!
>
> I have a server: 3 disks, 6TB each, total 17TB space, occupied with data
> 6TB.
>
>
> One of the disks got smart errors:
>
>      197 Current_Pending_Sector  0x0022   100   100   000 Old_age  Always   -   16
>      198 Offline_Uncorrectable   0x0008   100   100   000 Old_age  Offline  -   2
>
> And didn't pass tests:
>
>      Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
>      # 1  Extended offline    Completed: read failure 90%        3575             -
>      # 2  Short offline       Completed without error 00%        3574             -
>      # 3  Extended offline    Completed: read failure 90%        3574             -
>      # 4  Extended offline    Completed: read failure 90%        3560             -
>      # 5  Extended offline    Completed: read failure 50%        3559             -
>
> I decided to remove that drive from BTRFS system:


What is the SCT ERC setting for each drive? This applies to mdadm, LVM,
and Btrfs RAID. While you are not using raid for data, you are using it
for metadata. Also, mismatching SCT ERC with the kernel's command timer
is not good for any configuration. The SCT ERC must be shorter than
the kernel command timer; otherwise bad sector errors are inevitably
masked by the kernel resetting the link to the device.

https://raid.wiki.kernel.org/index.php/Timeout_Mismatch
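To make the "shorter than" rule concrete, here is a minimal Python sketch
of the comparison. It is only an illustration: the function names are
made up for this example, and it assumes you already know the drive's
SCT ERC value in deciseconds (for instance from "smartctl -l scterc").
The kernel command timer is read from /sys/block/<dev>/device/timeout,
which is in seconds and normally defaults to 30. The check only covers
drives with SCT ERC enabled; with it disabled the drive's recovery time
has no fixed bound to compare against.

```python
from pathlib import Path


def read_kernel_timeout(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Return the kernel command timer for a device such as 'sda', in seconds."""
    return int(Path(sysfs_root, dev, "device", "timeout").read_text().strip())


def timeout_mismatch(scterc_ds: int, kernel_timeout_s: int) -> bool:
    """Return True if the drive may still be recovering when the kernel gives up.

    scterc_ds is the drive's SCT ERC limit in deciseconds (tenths of a
    second). The setup is safe only if that limit is strictly shorter
    than the kernel command timer.
    """
    if scterc_ds <= 0:
        raise ValueError("SCT ERC is disabled or invalid; no bound to compare")
    return scterc_ds / 10 >= kernel_timeout_s


if __name__ == "__main__":
    # 70 ds = 7 s against the usual 30 s kernel default: fine.
    print(timeout_mismatch(70, 30))
    # 1200 ds = 120 s against the 30 s default: the kernel resets the link first.
    print(timeout_mismatch(1200, 30))
```

On a live system the second argument could come from
read_kernel_timeout("sda"), where "sda" stands in for whatever your
device is actually called.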

And when was the last time a scrub was done on the volume? Were there
any errors reported by either user space tools or kernel? And what
were they?

I do agree, however, that this configuration should have higher
performing reads from the device being deleted, unless part of the
reason why it's so slow is that one or more drives are trying to do
deep recoveries.

My suggestion for a single-profile, multiple-device volume is to leave
the per-drive SCT ERC disabled (or set to a high value, e.g. 1200
deciseconds) and also change the per-block-device command timer (this
is a kernel timer set per device) to be at least 120 seconds. This will
allow the drive to do deep recovery, which will make it dog slow, but
if necessary it'll improve the chances of getting data off the drives.
If you don't care about getting data off the drives, i.e. you have a
backup, then you can set the SCT ERC value to 70 deciseconds. Any bad
sector will then quickly result in a read error, and Btrfs will report
the affected file. If it's metadata that's affected, Btrfs will get a
good copy from the mirror and fix up the bad copy automatically.
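The two options above can be written down as a small Python sketch that
only builds the commands as text and does not run anything. The function
name, the device name "sda", and the 180 second timer are assumptions
for illustration (the advice above only says "at least 120 seconds").
"smartctl -l scterc,READ,WRITE" takes deciseconds, and whether a drive
accepts a value as high as 1200 depends on the drive.

```python
from typing import List


def erc_plan(dev: str, have_backup: bool) -> List[str]:
    """Return the shell commands (as strings) for one drive, e.g. dev='sda'."""
    if have_backup:
        # Fail fast: 70 deciseconds = 7 s for read and write recovery,
        # so a bad sector turns into a read error quickly.
        return [f"smartctl -l scterc,70,70 /dev/{dev}"]
    # Recovery first: let the drive retry for up to 1200 ds = 120 s, and
    # raise the kernel command timer above that (180 s is an assumed value).
    return [
        f"smartctl -l scterc,1200,1200 /dev/{dev}",
        f"echo 180 > /sys/block/{dev}/device/timeout",
    ]


if __name__ == "__main__":
    for cmd in erc_plan("sda", have_backup=False):
        print(cmd)
```

Neither setting is guaranteed to survive a reboot or power cycle, so
the commands would need to be reapplied at boot for each drive.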



-- 
Chris Murphy


