Re: Questions about bitrot and RAID 5/6

On 01/24/2014 12:59 PM, Chris Murphy wrote:
> 
> On Jan 24, 2014, at 10:03 AM, Phil Turmel <philip@xxxxxxxxxx> wrote:
>>> How many bits of loss occur with one URE?
>> 
>> Complete physical sector.
> 
> 
> A complete physical sector represents 512 bytes / 4096 bits, or in
> the case of AF disks 4096 bytes / 32768 bits, of loss for one URE.
> Correct?
> 
> So a URE is either 4096 bits nonrecoverable, or 32768 bits
> nonrecoverable, for HDDs. Correct?

Yes.  Note that the specification is for an *event*, not for a specific
number of bits lost.  The error rate is not "bits lost per bits read";
it is "loss events per bits read".
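
To make the distinction concrete, here is a minimal sketch of the
arithmetic.  The sector sizes are the ones quoted above; the rate of one
event per 1e14 bits read and the 1 TB read are assumed example values,
not figures taken from any particular datasheet, and the function names
are hypothetical.

```python
# Sketch only: sector sizes come from the thread; the rate and the amount
# read are assumed example values, not figures from any datasheet.


def bits_lost_per_event(sector_bytes: int) -> int:
    """One URE loses a complete physical sector, so the loss is its size in bits."""
    return sector_bytes * 8


def expected_events(bits_read: float, events_per_bit_read: float) -> float:
    """Expected number of URE events, treating the spec as an average event rate."""
    return bits_read * events_per_bit_read


if __name__ == "__main__":
    for sector_bytes in (512, 4096):
        print(sector_bytes, "byte sector:",
              bits_lost_per_event(sector_bytes), "bits lost per URE")

    assumed_rate = 1e-14      # assumption: 1 event per 1e14 bits read
    bits_read = 1e12 * 8      # assumption: reading 1 TB (decimal) once
    print("expected URE events:", expected_events(bits_read, assumed_rate))
```

The sector size only determines how much data one event costs; it does
not enter the expected number of events at all.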

>>>> Your comments suggest you've completely discounted the fact
>>>> that published URE rates are now close to, or within, drive
>>>> capacities.
>>>> 
>>>> Spend some time with the math and you will be very concerned.
>>> 
>>> Yeah I tried that a year ago and when it came to really super
>>> basic questions, no one was willing to answer them and the thread
>>> died as if we don't actually know what we're talking about. So I
>>> think some rather basic definitions are in order and an agreement
>>> that we don't get to redefine mathematics by saying a max error
>>> rate is a mean.
>>> 
>>> http://www.spinics.net/lists/raid/msg41669.html
>> 
>> I participated in that thread.  Some of your comments there imply
>> that the math is simple.  It's not (unless you are a whiz with
>> statistics). Look at the Poisson distribution I referenced and the
>> computation examples I gave.
> 
> At the moment a Poisson distribution is out of scope because my
> questions have nothing to do with how often, when, or how many such
> UREs will occur. At the moment I only want complete, utter clarity on
> what a URE/nonrecoverable error (not even the rate) is in terms of
> quantity. That's my main problem.

Ok, but the earlier arguments in this thread over the relative merits of
raid5 versus raid6 very much depend on the error rate.
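
As a rough illustration of why the rate matters, the sketch below uses
the Poisson model mentioned earlier to estimate the chance of at least
one URE while reading every surviving member of a degraded raid5.  All
inputs are assumptions chosen for the example: 4 drives of 4 TB each, a
rate of one event per 1e14 bits read, and independent events.  Because
the spec is a maximum, the result is better read as an upper bound than
as a prediction.  The function names are hypothetical.

```python
import math


def raid5_rebuild_bits(n_drives: int, drive_bytes: int) -> int:
    """Bits read when every surviving member of a degraded raid5 is read in full."""
    return (n_drives - 1) * drive_bytes * 8


def prob_at_least_one_ure(bits_read: float, events_per_bit_read: float) -> float:
    """Poisson model: P(at least one event) = 1 - exp(-lambda)."""
    lam = bits_read * events_per_bit_read
    # expm1 keeps precision when lambda is very small
    return -math.expm1(-lam)


if __name__ == "__main__":
    # Assumed example values, not measurements.
    bits = raid5_rebuild_bits(n_drives=4, drive_bytes=4 * 10**12)
    p = prob_at_least_one_ure(bits, events_per_bit_read=1e-14)
    print(f"bits read during rebuild: {bits:.3g}")
    print(f"P(at least one URE) under these assumptions: {p:.3f}")
```

With raid5 a URE during that rebuild hits a stripe that has no
redundancy left, while raid6 with a single failed member can still
reconstruct the affected sector, which is why the comparison hinges on
this number.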

>> Note that a statement about the rate of a randomly occurring error
>> is implicitly stating an average.
> 
> Except that it has only one limiter, with the next notch a whole
> order of magnitude less error. So I don't see how you get an average
> unless you're willing to just make assumptions about the bottom end.
> It doesn't make sense that a manufacturer would state a maximum error
> rate of X and then target that as an average. The average is
> certainly well below the max.

You are confused.  The specification is a maximum of an average: an
average that changes with time and cannot be measured from single events.

Phil



