Re: BUG: BTRFS and O_DIRECT could lead to wrong checksum and wrong data

15.09.2017 08:50, Goffredo Baroncelli wrote:
> On 09/15/2017 05:55 AM, Andrei Borzenkov wrote:
>> 15.09.2017 01:00, Goffredo Baroncelli wrote:
>>>
>>> 2) The second bug is more severe. If a second process updates the buffer while it is being written with O_DIRECT, the checksum may be incorrect.
>>>
>>
>> Is it btrfs-specific? If the buffer is updated before the kernel has
>> actually consumed it, this likely means data corruption on any filesystem.
> 
> I don't see any corruption on other filesystems. The fact that an application pushes garbage to the filesystem does not entitle the filesystem to become corrupted.

I did not say "filesystem corruption", I said "data corruption".

> In this case the filesystem became corrupted, because another application that tries to read the data (without O_DIRECT) may get -EIO.
> 

No. *Data* on this filesystem was corrupted, and luckily btrfs makes you
aware of it. On a different filesystem you may still have the same data
corruption, but silently.
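
A toy illustration of why the same race can be loud on a checksumming filesystem and silent elsewhere. It is a simplification and not btrfs code: it assumes the checksum is taken from the caller's buffer at one moment and the device reads the same buffer at a later moment. `zlib.crc32` only stands in for the filesystem's checksum, and `simulate_direct_write` and `read_ok` are hypothetical names.

```python
import zlib


def simulate_direct_write(buf, mutate_between):
    """Model one direct write: checksum first, then the 'device' reads buf."""
    checksum = zlib.crc32(bytes(buf))  # step 1: checksum taken from the buffer
    if mutate_between:
        buf[0] ^= 0xFF                 # another thread changes the buffer here
    on_disk = bytes(buf)               # step 2: what actually reaches the disk
    return checksum, on_disk


def read_ok(checksum, on_disk):
    """A checksumming filesystem verifies on read; a mismatch would mean -EIO."""
    return zlib.crc32(on_disk) == checksum


if __name__ == "__main__":
    print(read_ok(*simulate_direct_write(bytearray(4096), False)))
    print(read_ok(*simulate_direct_write(bytearray(4096), True)))
```

In this model a filesystem without data checksums would simply return `on_disk` in both cases, so the reader would never learn that the block differs from what the application saw when it issued the write.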

> I repeat, the problem is a data race while the data is on the filesystem side, and the kernel computes a wrong checksum.
> 

Of course it is a race. But again, I expect that when pwrite() returns,
the data buffer can be reused. Otherwise I cannot see how O_DIRECT can
be sensibly used at all. So you need to demonstrate that the data
corruption happens after pwrite() returns; that would indeed make it a
btrfs issue. If the data corruption happens while the thread is waiting
for pwrite() to return, I say this is expected behavior and an
application fault: the application needs to protect the buffer against
concurrent write and modification.
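
A minimal sketch of what "protect against concurrent write and modification" could look like in an application, assuming a lock shared by the writing thread and the thread that updates the buffer. The names `open_direct` and `guarded_write` are hypothetical, the 4096-byte block size is an assumption (real code should use the device's logical block size), and the fallback to buffered I/O exists only so the sketch also runs on filesystems that reject O_DIRECT.

```python
import mmap
import os
import tempfile
import threading

BLOCK = 4096  # assumed alignment and transfer size


def open_direct(path):
    """Open path for writing with O_DIRECT, falling back to buffered I/O."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    direct = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    try:
        return os.open(path, flags | direct, 0o600)
    except OSError:
        # Some filesystems refuse O_DIRECT at open time.
        return os.open(path, flags, 0o600)


def guarded_write(path, rounds=50):
    """Write a shared buffer repeatedly while another thread keeps updating it."""
    buf = mmap.mmap(-1, BLOCK)  # anonymous mapping: page-aligned, zero-filled
    lock = threading.Lock()
    stop = threading.Event()

    def mutator():
        value = 0
        while not stop.is_set():
            value = (value + 1) % 256
            pattern = bytes([value]) * BLOCK
            with lock:          # never touch buf while a pwrite is in flight
                buf[:] = pattern

    worker = threading.Thread(target=mutator)
    worker.start()
    fd = open_direct(path)
    try:
        for _ in range(rounds):
            with lock:          # buf stays stable until pwrite() returns
                written = os.pwrite(fd, buf, 0)
            if written != BLOCK:
                raise OSError("short write")
    finally:
        stop.set()
        worker.join()
        os.close(fd)
        buf.close()
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        block = guarded_write(os.path.join(tmp, "blk"))
        print(len(block), len(set(block)))
```

Because every update fills the whole buffer with one byte value and only happens while the lock is free, the block read back should consist of a single repeated byte. Dropping the lock around `os.pwrite` would reintroduce the race discussed in this thread.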

> 
> IMHO, BTRFS should disallow O_DIRECT (which is what ZFS on Linux does); I think it could be allowed only for nodatasum files.
> 
>> I.e. there should be a clear indication from the kernel that the buffer
>> can be reused by the application, in your example when pwrite returns.
>> So when does the data corruption happen: during pwrite or after?
>> If data is corrupted during pwrite, it is arguably the application's
>> fault; it should disallow concurrent access.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html