Re: btrfs crashing the kernel with Seagate 8TB SMR drives.

On Thu, Dec 03, 2015 at 06:07:52PM +0000, Codebird wrote:
> I've got a nice bug for you - because I can offer you what everyone likes to
> see, a precise error message.
> 
> I've got a btrfs filesystem spread over six devices, RAID1 mode. Four of
> these are Seagate 8TB archive drives - those SMR ones that a few others have
> reported failing when used with btrfs. I've had that issue too, and I just
> can't explain why, other than to say that it only occurs when using them on
> my mainboard SATA ports, not via USB dock. But that's not what I'm reporting
> - that's just the source of the problem that causes the crash I am
> reporting.
> 
> The crash occurs when scrubbing, after some time and some terabytes - or
> possibly just when reading a certain place, I'm not sure - and it gives this
> helpful error left on the screen along with a system so unresponsive numlock
> won't flash:
> 
> BTRFS: Error (device sdg1) in  __btrfs_free_extent:6360: errno=-5 IO failure
> BTRFS: Error (device sdg1) in  __btrfs_free_extent:6360: errno=-5 IO failure
> BTRFS: Error (device sdg1) in  btrfs_run_delayed_refs:2851: errno=-5 IO failure
> BTRFS: Error (device sdg1) in  btrfs_run_delayed_refs:2851: errno=-5 IO failure
> BTRFS: Error (device sdg1) in  btrfs_run_delayed_refs:2851: errno=-5 IO failure
> <long indent, as if a CR was lost> BTRFS: assertion failed:
> f(fs_info->sb->s_flags & MS  <Cut by edge of screen>
> -----------[ cut here ]------------
> kernel BUG at ../fs/btrfs/ctree.h:4057!
> 
> Not sure if some of those 5 might be 6, as I was in a hurry to get it back
> up both times and just got a blurry photo. But it looks to me like there
> might be a chunk of code that doesn't handle a hardware fault - rather than
> cleanly return an error it's causing the kernel to hang entirely. I've
> managed to get this to happen twice now, so it's certainly something worth
> looking into. This is on SUSE tumbleweed, with kernel 4.3.0-2-default.

We do set btrfs to a read-only state when handling this EIO error, but
what's happening here is that btrfs failed to stop the scrub workers
from calling repair_io_failure(), so they hit that ASSERT.
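
To illustrate the race described above, here is a rough user-space sketch in
C. It is not the actual patch and does not use real kernel code: struct
toy_fs, TOY_FS_RDONLY and every toy_* function are hypothetical stand-ins,
and it assumes the assertion in question checks that the filesystem is still
writable. The idea is only that a scrub worker should test the read-only
flag itself before it reaches the repair path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these are btrfs symbols. */
#define TOY_FS_RDONLY 0x1u

struct toy_fs {
	unsigned int flags;     /* TOY_FS_RDONLY is set once the fs aborts on EIO */
	int repairs_attempted;  /* number of writes issued by the repair path */
};

/* Models the error path: an I/O failure flips the filesystem to read-only. */
static void toy_handle_eio(struct toy_fs *fs)
{
	fs->flags |= TOY_FS_RDONLY;
}

/*
 * Models a repair routine that rewrites a bad block from a good copy.
 * It asserts that the filesystem is writable (an assumption about what
 * the real assertion checks).
 */
static void toy_repair_io_failure(struct toy_fs *fs)
{
	assert(!(fs->flags & TOY_FS_RDONLY));
	fs->repairs_attempted++;
}

/*
 * Scrub worker with a guard: skip the repair once the fs is read-only
 * instead of tripping the assertion. Returns true if a repair was issued.
 */
static bool toy_scrub_fixup(struct toy_fs *fs)
{
	if (fs->flags & TOY_FS_RDONLY)
		return false;
	toy_repair_io_failure(fs);
	return true;
}
```

With this guard, calling toy_scrub_fixup() after toy_handle_eio() returns
false and leaves the repair counter unchanged. Without it, the same call
would reach the assert and abort, which is the user-space analogue of the
kernel BUG in the report.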

Will send a patch to you.

Thanks,

-liubo

> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html