Re: Software RAID and TRIM



On Wed, 29 Jun 2011 11:32:55 +0100 (BST) Tom De Mulder <tdm27@xxxxxxxxx>
wrote:

> On Tue, 28 Jun 2011, Mathias Burén wrote:
> 
> > IIRC md can already pass TRIM down, but I think the filesystem needs
> > to know about the underlying architecture, or something, for TRIM to
> > work in RAID.
> 
> Yes, it's (usually/ideally) the filesystem's job to invoke the TRIM 
> command, and that's what ext4 can do. I have it working just fine on 
> single drives, but for reasons of service reliability would need to get 
> RAID to work.
> 
> I tried (on an admittedly vanilla Ubuntu 2.6.38 kernel) the same on a two 
> drive RAID1 md and it definitely didn't work (the blocks didn't get marked 
> as unused and zeroed).
> 
> > There's numerous discussions on this in the archives of
> > this mailing list.
> 
> Given how fast things move in the world of SSDs at the moment, I wanted to 
> check if any progress was made since. :-) I don't seem to be able to find 
> any reference to this in recent kernel source commits (but I'm a complete 
> amateur when it comes to git).


Trim support for md is a long way down my list of interesting projects (and
no-one else has volunteered).

It is not at all straightforward to implement.

For stripe/parity RAID (RAID4/5/6) it is only safe to discard full stripes at
a time, and the md layer would need to keep a record of which stripes had been
discarded so that it didn't risk trusting data (and parity) read from those
stripes.  So you would need some sort of bitmap of invalid stripes, and you
would need the fs to discard in very large chunks for it to be useful at all.
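
To illustrate the full-stripe restriction, here is a minimal sketch (not md
code) of trimming a discard request down to the stripes it covers completely.
The function name and the geometry parameters are hypothetical; offsets are
array-relative and in whatever unit the chunk size uses.

```python
def full_stripes_in_range(start, length, chunk_size, data_disks):
    """Return (first_stripe, stripe_count) for the stripes that lie wholly
    inside the discard request [start, start + length).

    Hypothetical helper: chunk_size and data_disks describe the array
    geometry, and one stripe holds chunk_size * data_disks units of data.
    """
    stripe_size = chunk_size * data_disks
    end = start + length
    first = -(-start // stripe_size)  # round the start up to a stripe boundary
    last = end // stripe_size         # round the end down (exclusive)
    return first, max(0, last - first)
```

With 128-sector chunks and 3 data disks a stripe is 384 sectors, so a
1000-sector discard starting at sector 100 contains only one full stripe
(stripe 1); the partial stripes at either end would have to be left alone.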

For copying RAID (RAID1, RAID10) you really need the same bitmap.  There
isn't the same risk of reading and trusting discarded parity, but a resync
which didn't know about discarded ranges would undo the discard for you.

So it basically requires another bitmap to be stored with the metadata, and a
fairly fine-grained bitmap it would need to be.  Then every read and resync
checks the bitmap and ignores or returns 0 for discarded ranges, and every
write needs to check and, if the range was discarded, clear the bit and write to
the whole range.
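
As a rough toy model of that bookkeeping (purely illustrative, not how md
would lay it out): one bit per fixed-size region, consulted on read, write and
resync. The class name, the region granularity and the in-memory bytearray
standing in for the member devices are all assumptions.

```python
class DiscardTrackingArray:
    """Toy model: one bit per fixed-size region, set while it is discarded."""

    def __init__(self, size, region_size):
        assert size % region_size == 0
        self.region_size = region_size
        self.store = bytearray(size)  # stands in for the member devices
        self.discarded = [False] * (size // region_size)

    def discard(self, start, length):
        # Only regions wholly inside the request can be marked discarded.
        rs = self.region_size
        first = -(-start // rs)        # round up
        last = (start + length) // rs  # round down, exclusive
        for r in range(first, last):
            self.discarded[r] = True
            # Device contents are now undefined; model that with junk.
            self.store[r * rs:(r + 1) * rs] = b"\xff" * rs

    def read(self, start, length):
        # Return 0 for anything that falls in a discarded region.
        out = bytearray(self.store[start:start + length])
        for i in range(length):
            if self.discarded[(start + i) // self.region_size]:
                out[i] = 0
        return bytes(out)

    def write(self, start, data):
        assert start + len(data) <= len(self.store)
        if not data:
            return
        rs = self.region_size
        for r in range(start // rs, (start + len(data) - 1) // rs + 1):
            if self.discarded[r]:
                # Clear the bit and rewrite the whole region, so nothing
                # undefined is left behind next to the new data.
                self.store[r * rs:(r + 1) * rs] = bytes(rs)
                self.discarded[r] = False
        self.store[start:start + len(data)] = data

    def regions_to_resync(self):
        # A resync skips discarded regions so it does not undo the discard.
        return [r for r, d in enumerate(self.discarded) if not d]
```

The smaller the region, the more discards can actually be honoured, but the
larger the bitmap that has to be stored and updated, which is why the
granularity matters.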

So: do-able, but definitely non-trivial.

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

