On Wed, Nov 13, 2019 at 11:27:23AM +0100, Johannes Thumshirn wrote:
> In btrfs_close_one_device() we're allocating a new device and if this
> fails we BUG().
>
> Move the allocation to the top of the function and return an error in case
> it failed.
>
> The BUG_ON() is temporarily moved to close_fs_devices(), the caller of
> btrfs_close_one_device() as further work is pending to untangle this.
>
> Signed-off-by: Johannes Thumshirn <jthumshirn@xxxxxxx>
> ---
> fs/btrfs/volumes.c | 27 +++++++++++++++++++++------
> 1 file changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 5ee26e7fca32..0a2a73907563 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1061,12 +1061,17 @@ static void btrfs_close_bdev(struct btrfs_device *device)
>  	blkdev_put(device->bdev, device->mode);
>  }
>
> -static void btrfs_close_one_device(struct btrfs_device *device)
> +static int btrfs_close_one_device(struct btrfs_device *device)
>  {
>  	struct btrfs_fs_devices *fs_devices = device->fs_devices;
>  	struct btrfs_device *new_device;
>  	struct rcu_string *name;
>
> +	new_device = btrfs_alloc_device(NULL, &device->devid,
> +					device->uuid);
> +	if (IS_ERR(new_device))
> +		goto err_close_device;
> +
>  	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
>  	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
>  		list_del_init(&device->dev_alloc_list);
> @@ -1080,10 +1085,6 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>  	if (device->bdev)
>  		fs_devices->open_devices--;
>
> -	new_device = btrfs_alloc_device(NULL, &device->devid,
> -					device->uuid);
> -	BUG_ON(IS_ERR(new_device)); /* -ENOMEM */
> -
>  	/* Safe because we are under uuid_mutex */
>  	if (device->name) {
>  		name = rcu_string_strdup(device->name->str, GFP_NOFS);
> @@ -1096,18 +1097,32 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>
>  	synchronize_rcu();
>  	btrfs_free_device(device);
> +
> +	return 0;
> +
> +err_close_device:
> +	btrfs_close_bdev(device);
> +	if (device->bdev) {
> +		fs_devices->open_devices--;
> +		btrfs_sysfs_rm_device_link(fs_devices, device);
> +		device->bdev = NULL;
> +	}

I don't understand this part: the 'device' pointer comes from the
argument, so it is the device we want to delete from the list. For that,
all of the state bit tests, the bdev close, list_replace_rcu() and
synchronize_rcu() should happen, in the case where we have a newly
allocated new_device.

What I don't understand is how the short version after the label
err_close_device: is correct. The device is still left in the list, but
with a NULL bdev, while rw_devices and missing_devices are untouched.
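
To make the concern concrete, here is a minimal userspace sketch, not
kernel code. The toy_* types and helpers are hypothetical stand-ins for
btrfs_fs_devices and btrfs_device, and the counter handling in
toy_close_full() is an assumption modeled on the description above, not
a copy of volumes.c. It only shows which counters each path touches.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the real btrfs structures. */
struct toy_fs_devices {
	int open_devices;
	int rw_devices;
	int missing_devices;
};

struct toy_device {
	struct toy_fs_devices *fs_devices;
	bool has_bdev;		/* models device->bdev != NULL */
	bool writeable;		/* models the WRITEABLE state bit */
	bool missing;		/* models the MISSING state bit */
};

/* Assumed full close path: drop every counter the device contributed to. */
static void toy_close_full(struct toy_device *dev)
{
	struct toy_fs_devices *fs = dev->fs_devices;

	if (dev->writeable)
		fs->rw_devices--;
	if (dev->missing)
		fs->missing_devices--;
	if (dev->has_bdev)
		fs->open_devices--;

	/* The replacement device starts from a clean state. */
	dev->writeable = false;
	dev->missing = false;
	dev->has_bdev = false;
}

/* Short path modeled on err_close_device: only bdev and open_devices. */
static void toy_close_short(struct toy_device *dev)
{
	if (dev->has_bdev) {
		dev->fs_devices->open_devices--;
		dev->has_bdev = false;
	}
}
```

Starting from one open, writeable device (open_devices = 1,
rw_devices = 1), the full path in this model leaves both counters at 0,
while the short path leaves rw_devices at 1 for a device that no longer
has a bdev.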

That closing a device needs to allocate memory for a new device instead
of reinitializing the existing one is stupid, but with the simplified
device closing I'm not sure the state is well defined.