On Thu, 2007-04-05 at 23:37, Greg Smith wrote:
> On Thu, 5 Apr 2007, Scott Marlowe wrote:
>
> > On Thu, 2007-04-05 at 14:30, James Mansion wrote:
> >> Can you cite any statistical evidence for this?
> > Logic?
>
> OK, everyone who hasn't already needs to read the Google and CMU papers.
> I'll even provide links for you:
>
> http://www.cs.cmu.edu/~bianca/fast07.pdf
> http://labs.google.com/papers/disk_failures.pdf
>
> There are several things their data suggests that are completely at odds
> with the lore suggested by traditional logic-based thinking in this area.
> Section 3.4 of Google's paper basically disproves that "mechanical devices
> have decreasing MTBF when run in hotter environments" applies to hard
> drives in the normal range they're operated in.

On the Google study:

The Google study ONLY looked at consumer grade drives. It did not
compare them to server class drives.

Their temperature finding only holds when the temperature is fairly low.
Note that the drive temperatures in the Google study are <=55C. If the
drive temp is below 55C, then the environment, by extension, must be
lower than that by some fair bit, likely 10-15C, since the drive is a
heat source and the environment is the heat sink. So, the environment
here is likely in the 35C range.

Most server drives are rated for 55-60C environmental temperature
operation, which means the drive itself would be even hotter.

As for the CMU study:

It didn't expressly compare server to consumer grade hard drives.
Remember, there are server class SATA drives, and there were (once upon
a time) consumer class SCSI drives. If they had separated out the drives
by server / consumer grade I think the study would have been more
interesting. But we just don't know from that study.

Personal Experience:

In my last job we had three very large storage arrays (big black
refrigerator looking boxes, you know the kind). Each one had somewhere
in the range of 150 or so drives in it. The first two we purchased were
based on 9Gig server class SCSI drives.
The third, and newer one, was based on commodity IDE drives. I'm not
sure of the size, but I believe they were somewhere around 20Gigs or so.
So, this was 5 or so years ago, not recently.

We had a cooling failure in our hosting center, and the internal
temperature of the data center rose to about 110F to 120F (43C to 49C).
We ran at that temperature for about 12 hours, before we got a
refrigerator on a flatbed brought in (btw, I highly recommend Aggreko if
you need large scale portable air conditioners or generators) to cool
things down.

In the months that followed, the drives in the IDE based storage array
failed by the dozens. We eventually replaced ALL the drives in that
storage array because of the failure rate. The SCSI based arrays had a
few more drives fail than usual, but nothing too shocking.

Now, maybe Seagate et al. are now making their consumer grade drives
from yesterday's server grade technology, but 5 or 6 years ago that was
not the case from what I saw.

> Your comments about
> server hard drives being rated to higher temperatures is helpful, but
> conclusions drawn from just thinking about something I don't trust when
> they conflict with statistics to the contrary.

Actually, as I looked up some more data on this, I found it interesting
that 5 to 10 years ago, consumer grade drives were rated for 35C
environments, while today consumer grade drives seem to be rated to 55C
or 60C, the same as server drives were 5 to 10 years ago. I do think
that server grade drive tech has been migrating into the consumer realm
over time. I can imagine that today's high performance game / home
systems, with their heat generating video cards and tendency towards
RAID1 / RAID0 drive setups, are pushing the drive manufacturers to
improve the reliability of consumer disk drives.

> The main thing I wish they'd published is breaking some of the statistics
> down by drive manufacturer. For example, they suggest a significant
> number of drive failures were not predicted by SMART.
> I've seen plenty of
> drives where the SMART reporting was spotty at best (yes, I'm talking
> about you, Maxtor) and wouldn't be surprised that they were quiet right up
> to their bitter (and frequent) end. I'm not sure how that factor may have
> skewed this particular bit of data.

I too have pretty much given up on Maxtor drives when it comes to things
like SMART or sleep mode, or just plain working properly.

In recent months, we had an AC unit fail here at work, and we have two
drive manufacturers for our servers, manufacturer F and manufacturer S.

The drives from F failed at a much higher rate and developed lots and
lots of bad sectors. The drives from manufacturer S, OTOH, have not had
an increased failure rate. While both manufacturers claim that their
drives can survive in an environment of 55/60C, I'm pretty sure one of
them was lying. We are silently replacing the failed drives with drives
from manufacturer S.

Based on experience I think that on average server drives are more
reliable than consumer grade drives, and can take more punishment. But
the variables of manufacturer, model, and batch often make even more
difference than grade.