On Tue, Jan 14, 2020 at 01:07:22PM -0800, Josef Bacik wrote: > >> - num_devices = btrfs_num_devices(fs_info); > >> + /* > >> + * rw_devices can be messed with by rm_device and device replace, so > >> + * take the chunk_mutex to make sure we have a relatively consistent > >> + * view of the fs at this point. > > > > Well, what does 'relatively consistent' mean here? There are enough > > locks and exclusion that device remove or replace should not change the > > value until btrfs_balance ends, no? > > > > Again I don't have the code in front of me, but there's nothing at this point to > stop us from running in at the tail end of device replace or device rm. This should be prevented by the EXCL_OP mechanism, so even the end of device remove or replace will not be running at this time because it cannot even start. > The > mutex keeps us from getting weirdly inflated values when we increment and > decrement at the end of device replace, but there's nothing (that I can > remember) that will stop rw devices from changing right after we check it, thus > relatively. rw_devices is changed in a handful of places on a mounted filesystem, not counting device open/close. Device remove and replace are excluded from running at that time, rw_devices can't change at this point of balance. btrfs_dev_replace_finishing - when removing srcdev, rw_devices-- - when adding the target device as new, rw_devices++ btrfs_rm_device - rw_devices-- btrfs_init_new_device (called by device add) - rw_devices++ So the chunk mutex is either redundant or there's something I'm missing.
