Re: [PATCH] Btrfs: deal with error on secondary log properly

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 08/28/2015 09:23 AM, Filipe David Manana wrote:
On Wed, Aug 26, 2015 at 8:06 PM, Josef Bacik <jbacik@xxxxxx> wrote:
On 08/25/2015 10:06 PM, Liu Bo wrote:

On Tue, Aug 25, 2015 at 01:09:43PM -0400, Josef Bacik wrote:

If we have an fsync at the same time in two seperate subvolumes we could
end up
with the tree log pointing at invalid blocks.  We need to notice if our
writeout


Mind to share more details of the problem?


It's the problem Filipe was trying to solve.  Say we fsync() on two
different subvols, one of them will race in and be the one who commits the
log root tree.  So process A waits on process B to add its log to the log
root tree and write out its log.  If we get an IO error while writing out
the log the log root tree will be pointing to invalid crap, and we also
won't return an error back to userspace.  We need to notice if there was an
error, turn on the transaction commit stuff since we've already updated the
the log root tree with our subvol log so we don't get garbage on the disk,
and we need to return an error to process B. Thanks,

Josef,

So the problem was that without forcing A to trigger a transaction
commit if B gets an error when writing one or more log tree
nodes/leafs, A could write a superblock pointing to a log root tree
for which not all nodes/leafs were persisted. Then B would fall back
to a transaction commit and everything would be ok - i.e. we would
only have a "small" time window where a superblock points to an
invalid log root tree.

So the fix here shouldn't be only to force A do a transaction commit
(call btrfs_set_log_full_commit())? Why do we need to make B return an
error to userspace and not fallback to a transaction commit too (as it
was before this change)? After all this is the kind of failure for
which we can fallback to a transaction commit without losing any inode
metadata (links, owner, group, xattrs, etc) nor file data.

The change looks good, just puzzled why we need to return the error to
userspace.


Ah well there's a reason in upcoming patches! Basically I'm moving the wait on ordered extents back into the sync log stuff so I can have my go fast stripes back, and when I do that we'll want to return an error since it could be from wait_ordered_extents. But you are right, if we only error out writing the metadata we can just commit the transaction and not return an error. I can change this bit to not return an error if we want, or wait for my other patches which will fix it up. Thanks,

Josef

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux