On Wed, 2009-04-08 at 10:21 -0500, Patrick Goetz wrote:
> Hi -
>
> I've been trying to get up to speed on new linux filesystem efforts and
> stumbled upon the following post from a btrfs developer to lwn.net:
>
> -----------------------------------------------------------
> Posted Mar 16, 2009 16:50 UTC (Mon) by masoncl (subscriber, #47138)
> The btrfs data=ordered implementation is different from ext34 and
> reiserfs. It decouples data writes from the metadata transaction, and
> simply updates the metadata for file extents after the data blocks are
> on disk.
>
> This means the transaction commit doesn't have to wait for the data
> blocks because the metadata for the file extents always reflects extents
> that are actually on disk.
>
> When you rename one file over another, the destination file is
> atomically replaced with the new file. The new file is fully consistent
> with the data that has already been written, which in the worst case
> means it has a size of zero after a crash.
> ...
> -----------------------------------------------------------
>
> Frankly this comment doesn't make any sense to me at all. First of all,
> "this means the transaction commit doesn't have to wait for the data
> blocks...". Is the data ordered or not? If you commit the transaction
> -- i.e. update the metadata before the data blocks are committed -- then
> the operations are occurring out of order and ext4
> open-write-close-rename mayhem ensues.
>
> Second, atomicity in this context means that when executing a rename,
> you always get either the old data (exactly) or the new data (exactly)
> even after a crash. The "worst case scenario" described above -- a size
> of zero after crash -- precisely violates atomicity.
>
> Any comments on this?

There isn't a quick and short description for this. Before 2.6.30, btrfs would allow renames to result in zero length files after a crash.
Filesystem developers have always considered the rename-is-atomic requirement to refer only to the directory entries themselves. With 2.6.30, extra ordering is added to btrfs, making sure that metadata and data are both atomically replaced during a rename. In other words, for renames it will work like ext3 data=ordered mode.

-chris
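
For readers following the open-write-close-rename discussion, here is a rough sketch of how an application can avoid depending on the filesystem's rename ordering at all: flush the new file's data with fsync before renaming it over the old one. This is only an illustration, not anything taken from btrfs itself. The function name replace_file_durably and the ".tmp-" prefix are made up for the example, and the directory fsync assumes a Linux-like system where a directory can be opened read-only and synced.

```python
import os
import tempfile


def replace_file_durably(path, data):
    """Replace `path` with `data` (bytes) via write, fsync, rename.

    Hypothetical helper for illustration only. Note that mkstemp
    creates the temporary file with mode 0600, so the original
    file's permissions are not preserved by this sketch.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # The temporary file must live in the same directory (and so on the
    # same filesystem) as the target, otherwise rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Force the data blocks to disk BEFORE the rename, so the
            # new directory entry never points at an empty file.
            os.fsync(tmp.fileno())
        # Atomically swap the directory entry (rename(2) on POSIX).
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # Sync the directory so the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With the explicit fsync, a crash should leave either the complete old file or the complete new file regardless of the filesystem's data ordering mode. Without it, the result depends on the behavior described above: a zero length file is possible on btrfs before 2.6.30.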
