Re: kernel BUG at btrfs/scrub.c:638 (kernel 3.6.5)

Sure, Stefan. Thank you! Looking forward to the fix.


On Wed, Nov 14, 2012 at 4:46 PM, Stefan Behrens
<sbehrens@xxxxxxxxxxxxxxxx> wrote:
> On Wed, 14 Nov 2012 15:27:28 +0100, Joeri Vanthienen wrote:
>> Hi,
>>
>> I was testing a new HBA (lsi SAS2008 based) in combination with BTRFS
>> and kernel 3.6.5
>>
>> # mkfs.btrfs -m raid1 -d raid1 /dev/sdf /dev/sdg
>> # btrfs filesystem show
>> Label: none  uuid: fe542409-7346-4ea1-af04-fd1765b6a1a2
>>  Total devices 2 FS bytes used 123.02MB
>>  devid    2 size 298.09GB used 19.01GB path /dev/sdg
>>  devid    1 size 298.09GB used 19.03GB path /dev/sdf
>>
>> Btrfs v0.19+
>>
>> I was simulating a faulty disk by physically removing the disk and
>> connecting it again.
>> After reconnecting, the disk appeared again, but I get a
>> kernel BUG report in dmesg after running a scrub.
>>
>> [  936.138067] Btrfs loaded
>> [  936.138252] device fsid fe542409-7346-4ea1-af04-fd1765b6a1a2 devid
>> 1 transid 3 /dev/sdf
>> [  936.190574] device fsid fe542409-7346-4ea1-af04-fd1765b6a1a2 devid
>> 2 transid 3 /dev/sdg
>> [  950.208483] device fsid fe542409-7346-4ea1-af04-fd1765b6a1a2 devid
>> 1 transid 4 /dev/sdf
>> [  950.216385] btrfs: disk space caching is enabled
>> [ 1079.577103] mpt2sas0: log_info(0x3003010a): originator(IOP),
>> code(0x03), sub_code(0x010a)
>> [ 1079.577151] mpt2sas0: log_info(0x3003010a): originator(IOP),
>> code(0x03), sub_code(0x010a)
>> [ 1079.577416] mpt2sas0: log_info(0x30030101): originator(IOP),
>> code(0x03), sub_code(0x0101)
>>
>>
>> => after disconnection of one disk
>> [ 1253.444417] sd 8:0:1:0: [sdg] Synchronizing SCSI cache
>> [ 1253.444442] sd 8:0:1:0: [sdg]
>> [ 1253.444444] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
>> [ 1253.444588] mpt2sas0: removing handle(0x000a), sas_addr(0x4433221106000000)
>>
>> testsan:/btrfs # btrfs filesystem show
>> Label: none  uuid: fe542409-7346-4ea1-af04-fd1765b6a1a2
>>  Total devices 2 FS bytes used 123.02MB
>>  devid    1 size 298.09GB used 19.03GB path /dev/sdf
>>  *** Some devices missing
>>
>> Btrfs v0.19+
>>
>> => after connecting the same disk again
>> => it seems that the disk is now sdh instead of sdg; it could be that
>> I connected the disk to another port of the HBA
>>
>> # btrfs filesystem show
>> Label: none  uuid: fe542409-7346-4ea1-af04-fd1765b6a1a2
>>  Total devices 2 FS bytes used 123.02MB
>>  devid    2 size 298.09GB used 19.01GB path /dev/sdh
>>  devid    1 size 298.09GB used 19.03GB path /dev/sdf
>>
>> Btrfs v0.19+
>>
>> After running a scrub command, I now get the following errors in dmesg:
>>
>> [ 1253.444417] sd 8:0:1:0: [sdg] Synchronizing SCSI cache
>> [ 1253.444442] sd 8:0:1:0: [sdg]
>> [ 1253.444444] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK
>> [ 1253.444588] mpt2sas0: removing handle(0x000a), sas_addr(0x4433221106000000)
>> [ 1385.440298] scsi 8:0:2:0: Direct-Access     ATA      WDC
>> WD3200AAJS-0 3E01 PQ: 0 ANSI: 6
>> [ 1385.440307] scsi 8:0:2:0: SATA: handle(0x000a),
>> sas_addr(0x4433221106000000), phy(6), device_name(0x0000000000000000)
>> [ 1385.440310] scsi 8:0:2:0: SATA:
>> enclosure_logical_id(0x500605b0054dc1f0), slot(5)
>> [ 1385.440415] scsi 8:0:2:0: atapi(n), ncq(y), asyn_notify(n),
>> smart(y), fua(y), sw_preserve(y)
>> [ 1385.440421] scsi 8:0:2:0: qdepth(32), tagged(1), simple(0),
>> ordered(0), scsi_level(7), cmd_que(1)
>> [ 1385.440627] sd 8:0:2:0: Attached scsi generic sg0 type 0
>> [ 1385.441276] sd 8:0:2:0: [sdh] 625142448 512-byte logical blocks:
>> (320 GB/298 GiB)
>> [ 1385.444743] sd 8:0:2:0: [sdh] Write Protect is off
>> [ 1385.444747] sd 8:0:2:0: [sdh] Mode Sense: 7f 00 10 08
>> [ 1385.445860] sd 8:0:2:0: [sdh] Write cache: enabled, read cache:
>> enabled, supports DPO and FUA
>> [ 1385.464525]  sdh: unknown partition table
>> [ 1385.472633] sd 8:0:2:0: [sdh] Attached SCSI disk
>> [ 1593.048743] ------------[ cut here ]------------
>> [ 1593.050188] kernel BUG at
>> /usr/src/packages/BUILD/kernel-default-3.6.5/linux-3.6/fs/btrfs/scrub.c:638!
>> [ 1593.051654] invalid opcode: 0000 [#1] SMP
>> [ 1593.052712] Modules linked in: btrfs zlib_deflate libcrc32c
>> af_packet cpufreq_conservative cpufreq_userspace cpufreq_powersave
>> dm_mod snd_hda_codec_hdmi iTCO_wdt snd_hda_codec_realtek gpio_ich
>> iTCO_vendor_support sg i2c_i801 acpi_cpufreq mperf coretemp serio_raw
>> pcspkr sr_mod cdrom snd_hda_intel mei lpc_ich mfd_core e1000e
>> kvm_intel kvm microcode snd_hda_codec snd_hwdep snd_pcm snd_timer snd
>> usb_storage tpm_tis tpm hid_generic wmi usbhid soundcore
>> snd_page_alloc tpm_bios edd autofs4 uhci_hcd ehci_hcd usbcore
>> usb_common i915 drm_kms_helper drm i2c_algo_bit video button processor
>> thermal_sys scsi_dh_hp_sw scsi_dh_rdac scsi_dh_alua scsi_dh_emc
>> scsi_dh mpt2sas scsi_transport_sas raid_class ata_generic
>> [ 1593.052712] CPU 2
>> [ 1593.052712] Pid: 2823, comm: btrfs-scrub-1 Not tainted
>> 3.6.5-0-default #1 Acer Veriton M67WS/EQ45M
>> [ 1593.052712] RIP: 0010:[<ffffffffa0526032>]  [<ffffffffa0526032>]
>> scrub_handle_errored_block+0x972/0x980 [btrfs]
>> [ 1593.052712] RSP: 0018:ffff88022d6c1ca0  EFLAGS: 00010246
>> [ 1593.052712] RAX: 0000000000000007 RBX: ffff88022d012800 RCX: 0000000000010000
>> [ 1593.052712] RDX: 0000000000000000 RSI: ffff88022d4a79a0 RDI: ffff88022d012800
>> [ 1593.052712] RBP: ffff88022d4a71f0 R08: ffff88022d6c0000 R09: dead000000100100
>> [ 1593.052712] R10: dead000000200200 R11: 0000000000000001 R12: 0000000000000001
>> [ 1593.052712] R13: 0000000000000000 R14: ffff88022d4a7278 R15: 0000000000000000
>> [ 1593.052712] FS:  0000000000000000(0000) GS:ffff88023bd00000(0000)
>> knlGS:0000000000000000
>> [ 1593.052712] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [ 1593.052712] CR2: 00007f014f8dc000 CR3: 0000000001a0c000 CR4: 00000000000407e0
>> [ 1593.052712] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 1593.052712] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [ 1593.052712] Process btrfs-scrub-1 (pid: 2823, threadinfo
>> ffff88022d6c0000, task ffff88022517a180)
>> [ 1593.052712] Stack:
>> [ 1593.052712]  0000000000000300 ffff88023bd0e000 0000000300000000
>> 0000000000001000
>> [ 1593.052712]  ffff88023bd13140 ffff88022e09f000 000000010004ee83
>> ffff88022d012800
>> [ 1593.052712]  ffff88022517a620 ffff8802253e8000 0000000000010000
>> 0000000000000000
>> [ 1593.052712] Call Trace:
>> [ 1593.052712]  [<ffffffffa05265bc>] scrub_bio_end_io_worker+0x57c/0x720 [btrfs]
>> [ 1593.052712]  [<ffffffffa0502f83>] worker_loop+0x153/0x540 [btrfs]
>> [ 1593.052712]  [<ffffffff81065645>] kthread+0x85/0x90
>> [ 1593.052712]  [<ffffffff81568034>] kernel_thread_helper+0x4/0x10
>> [ 1593.052712] Code: c7 e4 35 54 a0 e8 9f d8 ff ff e9 7b ff ff ff 48
>> 8b 74 24 38 48 c7 c7 cb 35 54 a0 e8 89 d8 ff ff e9 0d fd ff ff 0f 0b
>> 0f 0b 0f 0b <0f> 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 41 57 41 56 49
>> 89 fe
>> [ 1593.052712] RIP  [<ffffffffa0526032>]
>> scrub_handle_errored_block+0x972/0x980 [btrfs]
>> [ 1593.052712]  RSP <ffff88022d6c1ca0>
>> [ 1593.109840] ---[ end trace 6f23598a7da7ea0c ]---
>> [ 1983.558609] mpt2sas0: log_info(0x3003010a): originator(IOP),
>> code(0x03), sub_code(0x010a)
>> [ 1983.560570] mpt2sas0: log_info(0x3003010a): originator(IOP),
>> code(0x03), sub_code(0x010a)
>> [ 1983.562707] mpt2sas0: log_info(0x30030101): originator(IOP),
>> code(0x03), sub_code(0x0101)
>>
>> Scrub keeps running...
>> # btrfs scrub status /btrfs
>> scrub status for fe542409-7346-4ea1-af04-fd1765b6a1a2
>>  scrub started at Wed Nov 14 15:07:33 2012, running for 840 seconds
>>  total bytes scrubbed: 0.00 with 0 errors
>>
>> What is going on?
>
>
> This issue is reproducible here with v3.6.5 and with the latest 3.7-rc as well. I'll prepare a fix for btrfs-next.
>
> Thank you for finding and reporting this issue! I'll add a Reported-by tag with your name and address, if you don't mind.
>
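
For readers trying to make sense of the mpt2sas log_info lines in the quoted dmesg output, here is a minimal Python sketch of how such a value can be split into the fields the driver prints. The bit layout (top 4 bits bus type, next 4 bits originator, next 8 bits code, low 16 bits sub_code) is an assumption chosen to match the log lines above. Only the originator name IOP appears in this thread; the names PL and IR, and the function name decode_log_info, are assumptions for illustration and not taken from the report.

```python
# Assumed bit layout (consistent with the dmesg lines quoted in this thread):
#   bits 31-28: bus type, bits 27-24: originator,
#   bits 23-16: code,     bits 15-0:  sub_code
# Only "IOP" is confirmed by the log; "PL" and "IR" are assumed names.
ORIGINATORS = {0: "IOP", 1: "PL", 2: "IR"}


def decode_log_info(value: int) -> dict:
    """Split a 32-bit log_info value into its assumed fields."""
    originator = (value >> 24) & 0xF
    return {
        "bus_type": (value >> 28) & 0xF,
        "originator": ORIGINATORS.get(originator, f"unknown({originator})"),
        "code": (value >> 16) & 0xFF,
        "sub_code": value & 0xFFFF,
    }


def format_log_info(value: int) -> str:
    """Render the decoded fields in the same style as the driver's message."""
    f = decode_log_info(value)
    return (
        f"log_info(0x{value:08x}): originator({f['originator']}), "
        f"code(0x{f['code']:02x}), sub_code(0x{f['sub_code']:04x})"
    )


if __name__ == "__main__":
    # The two distinct values that appear in the quoted dmesg output.
    for raw in (0x3003010A, 0x30030101):
        print(format_log_info(raw))
```

This only reproduces the field split already visible in the log. It does not say what the individual code and sub_code values mean, which would need the HBA vendor's documentation.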