Hello,
this:
>Some prefer bug report in mail list directly like me, some prefer
>kernel bugzilla.
and this:
>Not sure if other is looking into this.
>Btrfs bug tracking is somewhat tricky.
may be related...
>Not likely. You can do a scrub to check for metadata and data
>corruption.
Did that. All good.
>And you can do an offline (unmounted) 'btrfs check
>--readonly' to check the validity of the metadata.
Will do that.
> The Btrfs call
>traces during the blocked task are INFO, not warnings or errors, so
>the file system and data is likely fine. There's no read, write,
>corruption, or generation errors in the dmesg; but you can also check
>'btrfs dev stats <mountpoint>' which is a persistent counter for this
>particular device.
[/dev/sdh1].write_io_errs 0
[/dev/sdh1].read_io_errs 0
[/dev/sdh1].flush_io_errs 0
[/dev/sdh1].corruption_errs 0
[/dev/sdh1].generation_errs 0
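
For checking these counters repeatedly during a long copy, the output
above can be parsed mechanically. A minimal sketch, assuming the text has
already been captured from 'btrfs dev stats <mountpoint>' and that every
line has the "[device].counter value" shape shown above. The function
names are made up for illustration.

```python
import re

# Matches lines such as "[/dev/sdh1].write_io_errs 0".
STAT_LINE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<name>\w+)\s+(?P<value>\d+)$")


def parse_dev_stats(text):
    """Return {device: {counter_name: value}} from 'btrfs dev stats' text."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match is None:
            continue  # skip blank or unrelated lines
        stats.setdefault(match["dev"], {})[match["name"]] = int(match["value"])
    return stats


def nonzero_counters(stats):
    """List (device, counter, value) for every counter above zero."""
    return [
        (dev, name, value)
        for dev, counters in stats.items()
        for name, value in counters.items()
        if value > 0
    ]


# The output quoted in this mail, used as sample input.
SAMPLE = """\
[/dev/sdh1].write_io_errs 0
[/dev/sdh1].read_io_errs 0
[/dev/sdh1].flush_io_errs 0
[/dev/sdh1].corruption_errs 0
[/dev/sdh1].generation_errs 0
"""

if __name__ == "__main__":
    print(nonzero_counters(parse_dev_stats(SAMPLE)))
```

An empty list means no counter has moved, which matches the output above.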
>I should have read this before replying earlier.
>
>You can also do a one time clean mount with '-o
>clear_cache,space_cache=v2' which will remove the v1 (default) space
>cache, and create a v2 cache. Subsequent mount will see the flag for
>this feature and always use the v2 cache. It's a totally different
>implementation and shouldn't have this problem.
So you already have a suspicion about what caused the problem? Why is
v2 not the default, then? Is it worth chasing the bug in v1?
For me, the question now is whether we should chase this bug or not. I
encountered it three times while filling an 8TB drive with 7TB. Now I
have 1TB left and I am not sure I can reproduce it, but I can try.
>Qu would know better but usually developers ask for sysrq+w when
>there's blocked tasks.
I am wondering whether, in the long term, there is a better way than
this. Ideally, btrfs would automatically create a
btrfs-bug-DD-MM-YY-hh-mm-ss.tar.gz with all the info you need and tell
the user about it and where to report the bug. I am aware that this is
tricky. But in order to mature btrfs further, I assume you need more
real-life data of good quality (that is, the right logs) without too
much work (asking for logs). What's your view on this?
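
To make the idea concrete, here is a minimal sketch in Python of the
packaging step only. It assumes the diagnostic texts (dmesg, device
stats and so on) have already been collected as strings; which texts
belong in such a bundle is exactly the open question. Nothing like this
exists in btrfs today as far as I know, and the function and member
names are hypothetical.

```python
import io
import tarfile
from datetime import datetime


def bundle_name(now):
    """Build a name following the btrfs-bug-DD-MM-YY-hh-mm-ss.tar.gz pattern."""
    return now.strftime("btrfs-bug-%d-%m-%y-%H-%M-%S.tar.gz")


def write_bundle(fileobj, reports):
    """Write each {member_name: text} entry into a gzip-compressed tar archive."""
    with tarfile.open(fileobj=fileobj, mode="w:gz") as tar:
        for name, text in sorted(reports.items()):
            data = text.encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # tar needs the size before the payload
            tar.addfile(info, io.BytesIO(data))


if __name__ == "__main__":
    # Hypothetical member names; the contents here are placeholders.
    reports = {
        "dmesg.txt": "kernel log goes here\n",
        "dev-stats.txt": "[/dev/sdh1].write_io_errs 0\n",
    }
    with open(bundle_name(datetime.now()), "wb") as out:
        write_bundle(out, reports)
```

Taking the texts as plain strings keeps the sketch independent of how
the logs are gathered, which would need root and differs per system.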
>You know what? Try changing the scheduler from mq-deadline to none.
>Change nothing else. Now try to reproduce. Let's see if it still
>happens.
Wouldn't it make sense to first try to reproduce without changing
anything?
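
For reference, the active scheduler can be read from
/sys/block/<disk>/queue/scheduler, where the kernel lists the available
schedulers and puts the active one in square brackets. Writing a name
such as "none" to that file as root switches it. A small sketch that
only reads and interprets the file, assuming the whole-disk name "sdh"
for /dev/sdh1. The sample string is illustrative, since the set of
schedulers depends on the kernel, and the helper names are made up.

```python
def scheduler_path(disk):
    """Sysfs file for a whole disk, e.g. 'sdh' for the partition /dev/sdh1."""
    return f"/sys/block/{disk}/queue/scheduler"


def available_schedulers(sysfs_text):
    """All scheduler names listed in the file, without the brackets."""
    return [token.strip("[]") for token in sysfs_text.split()]


def active_scheduler(sysfs_text):
    """The scheduler shown in square brackets, or None if none is marked."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


if __name__ == "__main__":
    # Reading needs no special privileges; the file exists only on Linux.
    with open(scheduler_path("sdh")) as handle:
        text = handle.read()
    print(active_scheduler(text), available_schedulers(text))
```

Recording this value before and after each reproduction attempt makes it
clear which scheduler was actually in use when the hang occurred.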
>Also, what are the mount options?
rw,noatime,nospace_cache,subvolid=5,subvol=/
But I added noatime and nospace_cache only today.
Greetings,
Hendrik