Re: Possible MD RAID 5 or sata_sil driver issues.

I was pretty sure both the md and sata_sil drivers are solid; I just
didn't know if there might be some weird interaction between the two.
I might try blktrace to narrow it down.

When I get some time, I will try downgrading to some other SiI BIOS
versions.  I do see some exceptions for certain hard drives in the
sata_sil kernel code, so it could be an interaction with my particular
drives.

If it is RAM on the hard drive, I have no real idea how to check for
that.  Maybe disabling the drive's read/write cache would minimize the
effect.

System RAM is good, and has been tested 24+ hours with memtest+.  This
is an older system, and I am contemplating just building another
system with a MB with many built in sata ports.

Is there an easy way to figure out how MD maps a given array offset to
a particular drive, rather than me having to do the math by hand?
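
The manual math, at least, is short enough to script.  A rough sketch in
Python, assuming the array uses md's default RAID5 layout
(left-symmetric) and that the chunk size is known in 512-byte sectors.
The function name and parameters are made up for illustration.  The
returned member sector is relative to the start of the data area on the
member drive, so any per-device data offset from the md superblock still
has to be added, and the slot number is the position within the array,
not a /dev/sdX name.

```python
def raid5_left_symmetric_map(array_sector, raid_disks, chunk_sectors):
    """Map a sector of the md array to (data slot, parity slot, member sector).

    Assumes RAID5 with the left-symmetric layout. All values are in
    512-byte sectors. The member sector does NOT include the data offset
    of the member device.
    """
    data_disks = raid_disks - 1

    # Which chunk of the array the sector falls in, and where inside it.
    chunk_number, chunk_offset = divmod(array_sector, chunk_sectors)

    # Which stripe that chunk belongs to, and its index among the
    # data chunks of that stripe.
    stripe, dd_idx = divmod(chunk_number, data_disks)

    # Left-symmetric: parity starts on the last slot and moves one slot
    # to the left per stripe; data chunks start right after the parity.
    parity_slot = data_disks - (stripe % raid_disks)
    data_slot = (parity_slot + 1 + dd_idx) % raid_disks

    member_sector = stripe * chunk_sectors + chunk_offset
    return data_slot, parity_slot, member_sector


if __name__ == "__main__":
    # Hypothetical example: 4 drives, 64 KiB chunks (128 sectors).
    print(raid5_left_symmetric_map(389, raid_disks=4, chunk_sectors=128))
```

If the array was created with a different layout or the chunk size is
wrong, the result will point at the wrong drive, so both should be
checked against what mdadm reports for the array before trusting it.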



On Thu, Sep 29, 2011 at 4:50 PM, Jim Paris <jim@xxxxxxxx> wrote:
> Jim Mills wrote:
>> Possible MD RAID 5 or sata_sil driver issues.
>>
>> Summary:
>> I created an XFS filesystem on top of a MD RAID5 across 4 SATA drives
>> connected to a single SIL3114 PCI card.
>>
>> Problem:  I am seeing errors and corrupted files, as checked by CRC
>> and PAR2.  This is a brand new filesystem, on new drives.  The drives
>> do not have smart errors, and have even been zeroed out, as well as
>> reading all blocks with offline smart checks, badblocks, and even
>> ddrescue.  This also shows up in the mismatch_cnt after sending check
>> to sync_action.  Sending repair to sync_action, and then later sending
>> check doesn't fix it.
>>
>> I have seen this issue regardless of whether it is XFS or EXT4, so I am
>> assuming it is not related to the filesystem.  Although I did note
>> that MD didn't start recovering after being created until a filesystem
>> was created.
>>
>> I do not see these issues when using the drives without RAID.
>>
>> This leaves the card and the md software as the only common pieces,
>> which is why I am writing both of you.  It might be something weird
>> with the interaction of the two.
>>
>> I have tried looking at the sata_sil code, and don't see an easy way
>> to enable debugging via insmod.  I have not tried turning on any debug
>> in md, and can't unload it as my root, etc. is on a md mirror.
>>
>> Linux Kernel: SUSE 3.0.4-2-desktop.
>> SiI 3114 IDE BIOS     4/22/2008       5.5.0.0
>>
>> Please let me know what additional details would be helpful, and if I
>> should point this at a particular email distribution.
>
> Just some random input from a bystander:
>
> The md raid5 code and sata_sil drivers can usually be considered
> really solid, they're very commonly used and well-tested.
>
> I had similar issues once, with file corruption sometimes showing up
> on a MD raid5.  The disks always tested out fine individually (writing
> pseudorandom data and reading it back), and they were still fine with
> a raid1 across all disks, so I thought it might be raid5 related.  It
> turns out that it was actually bad RAM on one of the HDDs, and the
> glitch was triggered only by certain access patterns that showed up
> while writing to the raid5 array.
>
> It's probably worth looking into hardware issues.  Maybe your power
> supply isn't good enough and these particular access patterns trigger
> a problem.  Or system RAM could be bad, or maybe your motherboard has
> problems with heavy traffic on the PCI bus, etc.
>
> It could help to figure out exactly what the corruption is, by writing
> known data to the entire raid5 array and seeing where it differs when
> you read it back.
>
> -jim
>
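
The "write known data and see where it differs" test suggested above
could be sketched roughly as follows.  This is only an illustration: the
function names and the 4 KiB block size are arbitrary choices, and
writing the pattern destroys whatever is on the target.  Each block's
content is derived from its own index, so a block that comes back with
another block's data or with flipped bits shows up as a mismatch at a
known byte offset.

```python
import hashlib
import os
import struct

BLOCK = 4096  # arbitrary block size chosen for this sketch


def pattern_block(index, block_size=BLOCK):
    """Return a deterministic block whose content depends only on its index."""
    seed = struct.pack("<Q", index)
    out = bytearray()
    counter = 0
    while len(out) < block_size:
        out += hashlib.sha256(seed + struct.pack("<Q", counter)).digest()
        counter += 1
    return bytes(out[:block_size])


def write_pattern(path, n_blocks, block_size=BLOCK):
    """Overwrite the first n_blocks of an existing file or device. DESTRUCTIVE."""
    with open(path, "r+b") as f:
        for i in range(n_blocks):
            f.write(pattern_block(i, block_size))
        f.flush()
        os.fsync(f.fileno())


def find_mismatches(path, n_blocks, block_size=BLOCK):
    """Return the byte offsets of blocks that differ from the expected pattern."""
    bad = []
    with open(path, "rb") as f:
        for i in range(n_blocks):
            if f.read(block_size) != pattern_block(i, block_size):
                bad.append(i * block_size)
    return bad
```

Reads shortly after the write may be served from the page cache rather
than from the drives, so the verify pass is only meaningful once the
cache has been dropped or the array has been stopped and reassembled.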