Re: Interrupted and resumed scrubs seem to have caused filesystem to go readonly (EFBIG error)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2020/1/2 上午7:35, Graham Cobb wrote:
> I have a problem on one BTRFS filesystem. It is not a critical
> filesystem (it is used for backups) and I have not yet tried even
> unmounting and remounting, let alone a "btrfs check".
> 
> The problem seems to be that after several iterations of running 'btrfs
> scrub' for 30 minutes, then pausing for a while, then resuming the
> scrub, I got a transaction aborted with an EFBIG error and a warning in
> the kernel log. The fs went readonly, and transid verify errors are now
> reported. The original log extract is available at
> http://www.cobb.uk.net/kern.log.bug-010120 but I have pasted the key
> part below.

EFBIG in btrfs is very rare, and can only be caused by too many system
chunks.

The most common reason is the chunk pre-alllocation for scrub, which
also matches your situation.

There is already a fix for it, and will land in v5.5 kernel.
It looks like we should backport it.

Thanks,
Qu


> 
> The kernel is a Debian Testing kernel:
> Linux black 5.3.0-2-amd64 #1 SMP Debian 5.3.9-3 (2019-11-19) x86_64
> GNU/Linux
> 
> I run this same script monthly, and I have not seen this problem before,
> so I cannot be certain it is caused by the scrub. I have not yet tried
> to reproduce it, or to investigate the filesystem (check, etc).
> 
> Does anyone recognise this as a known/fixed problem? If not, is there
> any particular further information I could gather before or during my
> attempt to either recover the filesystem or just rebuild it?
> 
> Here is the log (starting with the 7th resumed scrub):
> 
> 
> Jan  1 06:41:45 black kernel: [1930660.938782] BTRFS info (device sdc3):
> scrub: started on devid 1
> Jan  1 06:41:45 black kernel: [1930660.939195] BTRFS info (device sdc3):
> scrub: started on devid 4
> Jan  1 06:41:45 black kernel: [1930661.475557] ------------[ cut here
> ]------------
> Jan  1 06:41:45 black kernel: [1930661.475562] BTRFS: Transaction
> aborted (error -27)
> Jan  1 06:41:45 black kernel: [1930661.475667] WARNING: CPU: 0 PID:
> 771075 at fs/btrfs/extent-tree.c:8247 btrfs_create_pending_block_
> groups+0x1db/0x230 [btrfs]
> Jan  1 06:41:45 black kernel: [1930661.475669] Modules linked in: fuse
> nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache bnep nf_t
> ables snd_hrtimer snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq
> snd_seq_device cpufreq_userspace cpufreq_powersave cpufreq_cons
> ervative nfnetlink_queue nfnetlink_log nfnetlink bluetooth drbg
> ansi_cprng ecdh_generic ecc binfmt_misc hid_generic usbhid hid it87 h
> wmon_vid radeon edac_mce_amd kvm_amd eeepc_wmi ccp asus_wmi rng_core
> evdev sparse_keymap kvm snd_hda_codec_realtek rfkill irqbypass s
> nd_hda_codec_generic ttm video wmi_bmof ledtrig_audio pcspkr
> snd_hda_codec_hdmi drm_kms_helper fam15h_power k10temp snd_hda_intel snd
> _hda_codec snd_hda_core snd_hwdep sp5100_tco drm snd_pcm_oss
> snd_mixer_oss watchdog snd_pcm snd_timer sg snd soundcore i2c_algo_bit b
> utton acpi_cpufreq eeprom i2c_nforce2 firewire_sbp2 firewire_core
> crc_itu_t psmouse nfsd parport_pc ppdev auth_rpcgss lp nfs_acl parp
> ort lockd grace sunrpc ip_tables x_tables autofs4 btrfs xor
> zstd_decompress zstd_compress raid6_pq libcrc32c
> Jan  1 06:41:45 black kernel: [1930661.475710]  ext4 crc16 mbcache jbd2
> crc32c_generic sr_mod cdrom uas usb_storage sd_mod dm_crypt d
> m_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel
> ohci_pci aesni_intel ahci libahci xhci_pci aes_x86_64 xhci_hcd c
> rypto_simd libata ehci_pci ohci_hcd ehci_hcd cryptd glue_helper scsi_mod
> usbcore r8169 i2c_piix4 realtek libphy usb_common wmi
> Jan  1 06:41:45 black kernel: [1930661.475737] CPU: 0 PID: 771075 Comm:
> btrfs Not tainted 5.3.0-2-amd64 #1 Debian 5.3.9-3
> Jan  1 06:41:45 black kernel: [1930661.475739] Hardware name: To be
> filled by O.E.M. To be filled by O.E.M./M5A97, BIOS 0705 08/22/20
> 11
> Jan  1 06:41:45 black kernel: [1930661.475767] RIP:
> 0010:btrfs_create_pending_block_groups+0x1db/0x230 [btrfs]
> Jan  1 06:41:45 black kernel: [1930661.475770] Code: e9 26 ff ff ff 48
> 8b 45 50 f0 48 0f ba a8 38 17 00 00 02 72 17 41 83 fc fb 74 2d
>  44 89 e6 48 c7 c7 50 2e 7a c0 e8 23 9d 19 e3 <0f> 0b 44 89 e1 ba 37 20
> 00 00 48 c7 c6 20 80 79 c0 48 89 ef e8 73
> Jan  1 06:41:45 black kernel: [1930661.475772] RSP:
> 0018:ffff9c69804cfb00 EFLAGS: 00010286
> Jan  1 06:41:45 black kernel: [1930661.475775] RAX: 0000000000000000
> RBX: ffff909444e7a520 RCX: 0000000000000006
> Jan  1 06:41:45 black kernel: [1930661.475777] RDX: 0000000000000007
> RSI: 0000000000000096 RDI: ffff90957aa17680
> Jan  1 06:41:45 black kernel: [1930661.475779] RBP: ffff90946c745d68
> R08: 0000000000010ec1 R09: 0000000000000007
> Jan  1 06:41:45 black kernel: [1930661.475781] R10: 0000000000000000
> R11: 0000000000000001 R12: 00000000ffffffe5
> Jan  1 06:41:45 black kernel: [1930661.475783] R13: ffff90946c745dc0
> R14: ffff909575d2e000 R15: ffff909574444000
> Jan  1 06:41:45 black kernel: [1930661.475786] FS:
> 00007ff2eb4c7700(0000) GS:ffff90957aa00000(0000) knlGS:0000000000000000
> Jan  1 06:41:45 black kernel: [1930661.475788] CS:  0010 DS: 0000 ES:
> 0000 CR0: 0000000080050033
> Jan  1 06:41:45 black kernel: [1930661.475790] CR2: 00005634edab7008
> CR3: 00000000bd0f2000 CR4: 00000000000406f0
> Jan  1 06:41:45 black kernel: [1930661.475792] Call Trace:
> Jan  1 06:41:45 black kernel: [1930661.475826]
> __btrfs_end_transaction+0x3f/0x1b0 [btrfs]
> Jan  1 06:41:45 black kernel: [1930661.475855]
> btrfs_inc_block_group_ro+0x10e/0x150 [btrfs]
> Jan  1 06:41:45 black kernel: [1930661.475891]
> scrub_enumerate_chunks+0x162/0x560 [btrfs]
> Jan  1 06:41:45 black kernel: [1930661.475900]  ?
> remove_wait_queue+0x20/0x60
> Jan  1 06:41:45 black kernel: [1930661.475936]
> btrfs_scrub_dev+0x26b/0x590 [btrfs]
> Jan  1 06:41:45 black kernel: [1930661.475942]  ? _cond_resched+0x15/0x30
> Jan  1 06:41:45 black kernel: [1930661.475946]  ?
> __kmalloc_track_caller+0x16e/0x260
> Jan  1 06:41:45 black kernel: [1930661.475980]  ?
> btrfs_ioctl+0x82f/0x2e10 [btrfs]
> Jan  1 06:41:45 black kernel: [1930661.475984]  ?
> __check_object_size+0x136/0x147
> Jan  1 06:41:45 black kernel: [1930661.476019]  btrfs_ioctl+0x87a/0x2e10
> [btrfs]
> Jan  1 06:41:45 black kernel: [1930661.476024]  ?
> tomoyo_path_number_perm+0x66/0x1d0
> Jan  1 06:41:45 black kernel: [1930661.476030]  ? do_vfs_ioctl+0x40e/0x670
> Jan  1 06:41:45 black kernel: [1930661.476033]  do_vfs_ioctl+0x40e/0x670
> Jan  1 06:41:45 black kernel: [1930661.476036]  ?
> create_task_io_context+0x95/0x100
> Jan  1 06:41:45 black kernel: [1930661.476040]  ksys_ioctl+0x5e/0x90
> Jan  1 06:41:45 black kernel: [1930661.476044]  __x64_sys_ioctl+0x16/0x20
> Jan  1 06:41:45 black kernel: [1930661.476048]  do_syscall_64+0x53/0x140
> Jan  1 06:41:45 black kernel: [1930661.476052]
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
> Jan  1 06:41:45 black kernel: [1930661.476055] RIP: 0033:0x7ff2eb5b95b7
> Jan  1 06:41:45 black kernel: [1930661.476058] Code: 00 00 90 48 8b 05
> d9 78 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84
> 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b
> 0d a9 78 0c 00 f7 d8 64 89 01 48
> Jan  1 06:41:45 black kernel: [1930661.476061] RSP:
> 002b:00007ff2eb4c6d38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> Jan  1 06:41:45 black kernel: [1930661.476064] RAX: ffffffffffffffda
> RBX: 000055eeaf2e94b0 RCX: 00007ff2eb5b95b7
> Jan  1 06:41:45 black kernel: [1930661.476066] RDX: 000055eeaf2e94b0
> RSI: 00000000c400941b RDI: 0000000000000003
> Jan  1 06:41:45 black kernel: [1930661.476067] RBP: 0000000000000000
> R08: 00007ff2eb4c7700 R09: 0000000000000000
> Jan  1 06:41:45 black kernel: [1930661.476069] R10: 00007ff2eb4c7700
> R11: 0000000000000246 R12: 00007ffc4cfa511e
> Jan  1 06:41:45 black kernel: [1930661.476071] R13: 00007ffc4cfa511f
> R14: 00007ff2eb4c7700 R15: 00007ff2eb4c6e40
> Jan  1 06:41:45 black kernel: [1930661.476075] ---[ end trace
> 6429c1bf293fecb8 ]---
> Jan  1 06:41:45 black kernel: [1930661.476079] BTRFS: error (device
> sdc3) in btrfs_create_pending_block_groups:8247: errno=-27 unknown
> Jan  1 06:41:45 black kernel: [1930661.476082] BTRFS info (device sdc3):
> forced readonly
> Jan  1 06:41:45 black kernel: [1930661.489816] BTRFS warning (device
> sdc3): failed setting block group ro: -30
> Jan  1 06:41:45 black kernel: [1930661.489821] BTRFS info (device sdc3):
> scrub: not finished on devid 1 with status: -30
> Jan  1 06:41:52 black kernel: [1930668.052295] BTRFS warning (device
> sdc3): failed setting block group ro: -30
> Jan  1 06:41:52 black kernel: [1930668.052301] BTRFS info (device sdc3):
> scrub: not finished on devid 4 with status: -30
> Jan  1 06:51:56 black kernel: [1931271.801468] BTRFS error (device
> sdc3): parent transid verify failed on 16216583520256 wanted 301800
> found 301756
> Jan  1 06:51:56 black kernel: [1931271.822215] BTRFS error (device
> sdc3): parent transid verify failed on 16216583520256 wanted 301800
> found 301756
> Jan  1 06:51:57 black kernel: [1931273.492798] BTRFS error (device
> sdc3): parent transid verify failed on 16216583520256 wanted 301800
> found 301756
> Jan  1 06:51:57 black kernel: [1931273.493041] BTRFS error (device
> sdc3): parent transid verify failed on 16216583520256 wanted 301800
> found 301756
> 

Attachment: signature.asc
Description: OpenPGP digital signature


[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux