Re: [PATCH 2/6] Btrfs: fix race between device replace and block group removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 20, 2016 at 12:44 AM,  <fdmanana@xxxxxxxxxx> wrote:
> From: Filipe Manana <fdmanana@xxxxxxxx>
>
> When it's finishing, the device replace code iterates all extent maps
> representing block group and for each one that has a stripe that refers
> to the source device, it replaces its device with the target device.
> However when it replaces the source device with the target device it,
> the target device still has an ID of 0ULL (BTRFS_DEV_REPLACE_DEVID),
> only after its ID is changed to match the one from the source device.
> This leads to races with the chunk removal code that can temporarly see
> a device with an ID of 0ULL and then attempt to use that ID to remove
> items from the device tree and fail, causing a transaction abort:
>
> [ 9238.594364] BTRFS info (device sdf): dev_replace from /dev/sdf (devid 3) to /dev/sde finished
> [ 9238.594377] ------------[ cut here ]------------
> [ 9238.594402] WARNING: CPU: 14 PID: 21566 at fs/btrfs/volumes.c:2771 btrfs_remove_chunk+0x2e5/0x793 [btrfs]
> [ 9238.594403] BTRFS: Transaction aborted (error 1)
> [ 9238.594416] Modules linked in: btrfs crc32c_generic acpi_cpufreq xor tpm_tis tpm raid6_pq ppdev parport_pc processor psmouse parport i2c_piix4 evdev sg i2c_core se
> rio_raw pcspkr button loop autofs4 ext4 crc16 jbd2 mbcache sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix virtio_pci libata virtio_ring virtio e1000 scsi_mod fl
> oppy [last unloaded: btrfs]
> [ 9238.594418] CPU: 14 PID: 21566 Comm: btrfs-cleaner Not tainted 4.6.0-rc7-btrfs-next-29+ #1
> [ 9238.594419] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
> [ 9238.594421]  0000000000000000 ffff88017f1dbc60 ffffffff8126b42c ffff88017f1dbcb0
> [ 9238.594422]  0000000000000000 ffff88017f1dbca0 ffffffff81052b14 00000ad37f1dbd18
> [ 9238.594423]  0000000000000001 ffff88018068a558 ffff88005c4b9c00 ffff880233f60db0
> [ 9238.594424] Call Trace:
> [ 9238.594428]  [<ffffffff8126b42c>] dump_stack+0x67/0x90
> [ 9238.594430]  [<ffffffff81052b14>] __warn+0xc2/0xdd
> [ 9238.594432]  [<ffffffff81052b7a>] warn_slowpath_fmt+0x4b/0x53
> [ 9238.594434]  [<ffffffff8116c311>] ? kmem_cache_free+0x128/0x188
> [ 9238.594450]  [<ffffffffa04d43f5>] btrfs_remove_chunk+0x2e5/0x793 [btrfs]
> [ 9238.594452]  [<ffffffff8108e456>] ? arch_local_irq_save+0x9/0xc
> [ 9238.594464]  [<ffffffffa04a26fa>] btrfs_delete_unused_bgs+0x317/0x382 [btrfs]
> [ 9238.594476]  [<ffffffffa04a961d>] cleaner_kthread+0x1ad/0x1c7 [btrfs]
> [ 9238.594489]  [<ffffffffa04a9470>] ? btree_invalidatepage+0x8e/0x8e [btrfs]
> [ 9238.594490]  [<ffffffff8106f403>] kthread+0xd4/0xdc
> [ 9238.594494]  [<ffffffff8149e242>] ret_from_fork+0x22/0x40
> [ 9238.594495]  [<ffffffff8106f32f>] ? kthread_stop+0x286/0x286
> [ 9238.594496] ---[ end trace 183efbe50275f059 ]---
>
> The sequence of steps leading to this is like the following:
>
>               CPU 1                                           CPU 2
>
>  btrfs_dev_replace_finishing()
>
>    at this point
>    dev_replace->tgtdev->devid ==
>    BTRFS_DEV_REPLACE_DEVID (0ULL)
>
>    ...
>
>    btrfs_start_transaction()
>    btrfs_commit_transaction()
>
>                                                      btrfs_delete_unused_bgs()
>                                                        btrfs_remove_chunk()
>
>                                                          looks up for the extent map
>                                                          corresponding to the chunk
>
>                                                          lock_chunks() (chunk_mutex)
>                                                          check_system_chunk()
>                                                          unlock_chunks() (chunk_mutex)
>
>    locks fs_info->chunk_mutex
>
>    btrfs_dev_replace_update_device_in_mapping_tree()
>      --> iterates fs_info->mapping_tree and
>          replaces the device in every extent
>          map's map->stripes[] with
>          dev_replace->tgtdev, which still has
>          an id of 0ULL (BTRFS_DEV_REPLACE_DEVID)
>
>                                                          iterates over all stripes from
>                                                          the extent map
>
>                                                            --> calls btrfs_free_dev_extent()
>                                                                passing it the target device
>                                                                that still has an ID of 0ULL
>
>                                                            --> btrfs_free_dev_extent() fails
>                                                              --> aborts current transaction
>
>    finishes setting up the target device,
>    namely it sets tgtdev->devid to the value
>    of srcdev->devid (which is necessarily > 0)
>
>    frees the srcdev
>
>    unlocks fs_info->chunk_mutex
>
> So fix this by taking the device list mutex while processing the stripes
> for the chunk's extent map. This is similar to the race between device
> replace and block group creation that was fixed by commit 50460e37186a
> ("Btrfs: fix race when finishing dev replace leading to transaction abort").
>
> Signed-off-by: Filipe Manana <fdmanana@xxxxxxxx>


Reviewed-by: Josef Bacik <jbacik@xxxxxx>

Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux