Re: first mount(s) after unclean shutdown always fail

On 2020/7/2 9:11 AM, Marc Lehmann wrote:
> On Thu, Jul 02, 2020 at 08:02:52AM +0800, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>> Well, if you want to go this way, let me show the code here.
>>
>> From fs/btrfs/volumes.c:btrfs_read_chunk_tree():
>>
>>         if (btrfs_super_total_bytes(fs_info->super_copy) <
>>             fs_info->fs_devices->total_rw_bytes) {
>>                 btrfs_err(fs_info,
>>         "super_total_bytes %llu mismatch with fs_devices total_rw_bytes %llu",
>>                           btrfs_super_total_bytes(fs_info->super_copy),
>>                           fs_info->fs_devices->total_rw_bytes);
>>                 ret = -EINVAL;
>>                 goto error;
>>         }
>>
>> Doesn't this explain why we abort the mount?
> 
> I wouldn't see how, especially if the code doesn't do anything _unless_ it
> also prints the message.
> 
> When it doesn't produce the message, all it does is compare two numbers
> (unless btrfs_super_total_bytes does something very funny) - how does this
> explain that the mount fails, then succeeds, in the cases where the message
> is _not_ logged, as reported?

When the error is logged, this snippet gets triggered and aborts the mount.

And you have reported that this happened at least once.

For that case, you should run btrfs rescue fix-device-size.
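
To make the comparison concrete, here is a minimal userspace sketch of the same check. It is only a model under stated assumptions: struct size_model and check_total_bytes() are hypothetical names that do not exist in the kernel, and the two fields merely stand in for btrfs_super_total_bytes(fs_info->super_copy) and fs_info->fs_devices->total_rw_bytes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the two values compared at mount time. */
struct size_model {
	uint64_t super_total_bytes; /* total_bytes recorded in the superblock */
	uint64_t total_rw_bytes;    /* sum over the writable devices found */
};

/*
 * Models the quoted check: returns 0 if the mount may proceed, or
 * -EINVAL (after printing the message) if the superblock value is
 * smaller than the device sum.
 */
int check_total_bytes(const struct size_model *m)
{
	assert(m != NULL);
	if (m->super_total_bytes < m->total_rw_bytes) {
		fprintf(stderr,
			"super_total_bytes %llu mismatch with fs_devices total_rw_bytes %llu\n",
			(unsigned long long)m->super_total_bytes,
			(unsigned long long)m->total_rw_bytes);
		return -EINVAL;
	}
	return 0;
}
```

In this model the failure and the message always occur together, and only when the superblock value is strictly smaller than the device sum; a superblock value that is equal or larger passes silently.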

> 
>>> Also, shouldn't btrfs be fixed instead? I was under the impression that
>>> one of the goals of btrfs is to be safe w.r.t. crashes.
>>
>> That's why we provide the btrfs rescue fix-device-size.
> 
> Not sure how that follows - there is a bug in the kernel filesystem and
> you provide a userspace tool that should be run on every crash, to what
> end?

Nope, it gets executed once and that specific problem will be gone.

As said, that's caused by some older kernels; newer kernels have an extra
safety net to ensure the accounting numbers are safe.
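
As a rough illustration of why a one-time repair is enough, the sketch below models the idea of recomputing the superblock total from the per-device sizes. This is a conceptual model only, not the btrfs-progs implementation: sum_device_bytes() and fix_super_total_bytes() are hypothetical names, and the real tool works on on-disk structures rather than a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: add up the sizes recorded for each device. */
uint64_t sum_device_bytes(const uint64_t *dev_total_bytes, size_t ndevs)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < ndevs; i++)
		sum += dev_total_bytes[i];
	return sum;
}

/*
 * Rewrite the superblock value only when it disagrees with the device
 * sum. Returns 1 if the value was changed, 0 if it was already right.
 */
int fix_super_total_bytes(uint64_t *super_total_bytes,
			  const uint64_t *dev_total_bytes, size_t ndevs)
{
	uint64_t expected;

	assert(super_total_bytes != NULL);
	expected = sum_device_bytes(dev_total_bytes, ndevs);
	if (*super_total_bytes == expected)
		return 0;
	*super_total_bytes = expected;
	return 1;
}
```

Running the model a second time on the same input changes nothing, which matches the point that the fix only needs to be applied once as long as nothing writes a stale value again.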

> 
> Spurious mount failures are a bug in the btrfs kernel driver.

Then report them as separate bugs.

The bug behind that message is well known, and we have had a solution for a while.

> 
>>> The bug I reported has very little or nothing to do with strict checking.
>>
>> I have provided the code to prove why it's related.
> 
> The code proves only that you are wrong - the code _always_ prints the
> message. Unless btrfs_super_total_bytes does more than just read some
> data, it cannot explain the bug I reported, simply because the message is
> not always produced, and the mount is not always aborted.

Solve one problem and go on to solve the next one.

If you don't even bother to try the solution to that specific problem, you
won't bother with any debug procedure provided by any developer.

> 
>> Whether you believe it is your problem then.
> 
> No, it's not, simply because I don't have a problem...
> 
> btrfs has problems, and I reported one, that's all that has happened.

You reported several problems without a proper reproducer.

"You can reproduce it on your system" is not a proper reproducer.
I provided one solution to one of your problems; you ignored it, and that's
your problem.

I don't see any point in debugging bugs reported by someone who doesn't
even want to try a known solution but insists that whatever he believes is
correct.

> 
> I slowly get the distinct feeling that reporting bugs in btrfs is a futile
> exercise, though.
> 
