On Sat, Aug 6, 2011 at 4:37 AM, Liu Bo <liubo2009@xxxxxxxxxxxxxx> wrote:
> I've fixed a bug and rebased this onto the latest for-linus branch.
> With my previously posted patch applied:
>
>   [PATCH] Btrfs: fix an oops of log replay
>
> I also tested this sub transaction patchset with
>   a) the sysbench 0.4.12 tool and
>   b) Chris's synctest tool
> in both _crash_ and _uncrash_ cases, and it works well.
>
> Please test this and feel free to notify me if there are any problems.
> Hope that it can get through with no bugs and be ready for merge this time :)
>
> ===
> I've been working to improve the write-ahead log's performance,
> and I found that the bottleneck lies in the checksum items,
> especially when we want to make a random write on a large file, e.g. a 4G file.
>
> An idea suggested by Chris is to use sub transaction ids and to log only
> the part of the inode that has changed since either the last log commit or
> the last transaction commit. And as we also push the sub transid into the btree
> blocks, we'll get much faster tree walks. As a result, we abandon the original
> brute-force approach, which is "to delete all items of the inode in the log"
> to make sure we get the most up-to-date copies of everything, and instead
> we "find and merge", i.e. find extents in the log tree and merge
> in the new extents from the file.
>
> This patchset puts the above idea into code, and although the code is now more
> complex, it brings us a great deal of performance improvement:
>
> in my sysbench "write + fsync" test:
>
>   451.01Kb/sec -> 4.3621Mb/sec
>
> In v2, thanks to Chris, we worked together to solve 2 bugs, and after that it
> works as expected.
> In v3, thanks to Josef, we simplified some code.
> In v4, we rebased onto the latest for-linus branch; Chris hit two problems,
> and we solved them.
>
> Since there are some vital changes in the recent rc, like "kill trans_mutex" and
> "use cur_trans", as David asked, I rebased the patchset onto the latest for-linus
> branch.
>
> More tests are welcome!
>
>
> Liu Bo (12):
>   Revert "Btrfs: do not flush csum items of unchanged file data during
>     treelog"
>   Btrfs: introduce sub transaction stuff
>   Btrfs: update block generation if should_cow_block fails
>   Btrfs: modify btrfs_drop_extents API
>   Btrfs: introduce first sub trans
>   Btrfs: still update inode trans stuff when size remains unchanged
>   Btrfs: improve log with sub transaction
>   Btrfs: add checksum check for log
>   Btrfs: fix a bug of log check
>   Btrfs: kick off useless code
>   Btrfs: do not iput inode when inode is still in log
>   Btrfs: use the right generation number to read log_root_tree
>
>  fs/btrfs/btrfs_inode.h |   12 ++-
>  fs/btrfs/ctree.c       |   87 +++++++++++++------
>  fs/btrfs/ctree.h       |    5 +-
>  fs/btrfs/disk-io.c     |   23 ++++--
>  fs/btrfs/extent-tree.c |   10 ++-
>  fs/btrfs/file.c        |   22 ++---
>  fs/btrfs/inode.c       |   39 ++++++---
>  fs/btrfs/ioctl.c       |    6 +-
>  fs/btrfs/relocation.c  |    6 +-
>  fs/btrfs/transaction.c |   13 ++-
>  fs/btrfs/transaction.h |   19 ++++-
>  fs/btrfs/tree-defrag.c |    2 +-
>  fs/btrfs/tree-log.c    |  225 ++++++++++++++++++++++++++++++++----------------
>  13 files changed, 312 insertions(+), 157 deletions(-)
>

I've had the v5 stack of patches in my kernel for about 3 weeks now.
I've just been testing for general stability in a 3.0 series kernel,
and I haven't run across any issues or obvious performance effects.
I've been testing on both x86 and x86_64 installations in desktop use
without RAID.
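
For anyone following the thread who wants a feel for the idea in the quoted cover letter, here is a toy model of "log only what changed since the last log commit" next to the brute-force variant. It is only a sketch under loud assumptions: it is not code from the patchset, every name in it (Extent, Inode, logged_sub_transid, log_inode_brute_force, log_inode_incremental) is made up for illustration, extents are keyed by file offset and never overlap, and checksums and the btree itself are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Extent:
    offset: int       # byte offset in the file (hypothetical key)
    length: int       # length in bytes
    sub_transid: int  # sub transaction id of the last modification


@dataclass
class Inode:
    extents: dict = field(default_factory=dict)  # offset -> Extent
    logged_sub_transid: int = 0  # highest sub transid already logged


def write(inode, offset, length, sub_transid):
    """Record a write by stamping the extent with the current sub transid."""
    inode.extents[offset] = Extent(offset, length, sub_transid)


def log_inode_brute_force(inode, log):
    """Toy 'delete all items in the log, then copy everything' approach."""
    log.clear()
    log.update(inode.extents)
    return len(inode.extents)  # number of items copied


def log_inode_incremental(inode, log, current_sub_transid):
    """Toy 'find and merge': copy only extents changed since the last log."""
    copied = 0
    for offset, extent in inode.extents.items():
        if extent.sub_transid > inode.logged_sub_transid:
            log[offset] = extent  # add a new entry or replace the stale one
            copied += 1
    inode.logged_sub_transid = current_sub_transid
    return copied


if __name__ == "__main__":
    inode, log = Inode(), {}
    for i in range(1000):
        write(inode, i * 4096, 4096, sub_transid=1)
    print("first fsync copies", log_inode_incremental(inode, log, 1))
    write(inode, 8192, 4096, sub_transid=2)  # one random overwrite
    print("second fsync copies", log_inode_incremental(inode, log, 2))
```

In this model, a single random overwrite followed by an fsync makes the incremental version copy one item, whereas the brute-force version copies every extent of the file each time. That is the intuition behind the cover letter's description, not a measurement of the real patches.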
