On 01/01/2018 08:44 PM, Stirling Westrup wrote:
> On Mon, Jan 1, 2018 at 7:15 AM, Kai Krakow <hurikhan77@xxxxxxxxx> wrote:
>> Am Mon, 01 Jan 2018 18:13:10 +0800 schrieb Qu Wenruo:
>>
>>> On 2018年01月01日 08:48, Stirling Westrup wrote:
>>>>
>>>> 1) I had a 2T drive die with exactly 3 hard-sector errors and those 3
>>>> errors exactly coincided with the 3 super-blocks on the drive.
>>>
>>> WTF, why all these corruption all happens at btrfs super blocks?!
>>>
>>> What a coincident.
>>
>> Maybe it's a hybrid drive with flash? Or something that went wrong in the
>> drive-internal cache memory the very time when superblocks where updated?
>>
>> I bet that the sectors aren't really broken, just the on-disk checksum
>> didn't match the sector. I remember such things happening to me more than
>> once back in the days when drives where still connected by molex power
>> connectors. Those connectors started to get loose over time, due to
>> thermals or repeated disconnect and connect. That is, drives sometimes
>> started to no longer have a reliable power source which let to all sorts
>> of very strange problems, mostly resulting in pseudo-defective sectors.
>>
>> That said, the OP would like to check the power supply after this
>> coincidence... Maybe it's aging and no longer able to support all four
>> drives, CPU, GPU and stuff with stable power.
>
> You may be right about the cause of the error being a power-supply issue.
> For those that are curious, the drive that failed was a Seagate Barracuda
> LP 2000G drive (ST2000DL003).
>
Forgive me if it's not relevant, but I own quite a few disks from that
series, like:
root@iomega-ordo:~# hdparm -i /dev/sda
/dev/sda:
Model=ST2000DM001-1CH164, FwRev=CC27, SerialNo=Z1E6EV85
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
root@iomega-acm:~# smartctl -d sat -a /dev/sda
=== START OF INFORMATION SECTION ===
Device Model: ST3000DM001-9YN166
Serial Number: S1F0PGQJ
LU WWN Device Id: 5 000c50 0516fce00
Firmware Version: CC4B
root@iomega-europol:~# smartctl -d sat -a /dev/sda
smartctl 5.41 2011-06-09 r3365 [armv5tel-linux-2.6.31.8] (local build)
=== START OF INFORMATION SECTION ===
Device Model: ST3000DM001-9YN166
Serial Number: Z1F1H5KA
LU WWN Device Id: 5 000c50 04ec18fda
Different locations, different environments, different boards one more
stable (the power) than others.
I replaced at least three four in the past 3 years. All of them died
because heavy random wirte workload. (rsnapshot, massive cp -al of
millions of files every day). In my case every time bad sectors occurred
too, but I didn't analyze where exactly, it was just a backup
destination drive. I pretty convinced it could be ext2 supers too though.
--
PGP Public Key (RSA/4096b):
ID: 0xF2C6EA10
SHA-1: 51DA 40EE 832A 0572 5AD8 B3C0 7AFF 69E1 F2C6 EA10
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html