Re: qcow2 images make scrub believe the filesystem is corrupted.

On 08/18/2017 07:43 PM, Josef Bacik wrote:
> On Fri, Aug 18, 2017 at 06:23:18PM +0200, Goffredo Baroncelli wrote:
>> On 08/18/2017 01:39 AM, Josef Bacik wrote:
>> [...]
>>> This is happening because the app (the guest OS in this case, we saw this a lot
>>> with windows guests) is changing the pages while they are in flight.  We
>>> calculate the checksum of the page before it's written, so if it changes while
>>> in flight we'll end up with a csum mismatch.
>>>
>>> To fix this change kvm to not use O_DIRECT or set NODATASUM on your qcow2 image.
>>> You'll have to re-create the image because NODATASUM won't apply to the already
>>> invalid checksums.  Thanks,
>>
>> Hi Josef,
>>
>> could you elaborate: are you saying that using O_DIRECT is incompatible with DATASUM?
>>
> 
> No, I'm saying using O_DIRECT with applications that don't protect in-flight
> memory is incompatible with DATASUM.

This is what I call an 'incompatibility'. Even if it is a "corner" case, it is still an incompatibility. And to be honest, it is hard to call a "VM" a "corner" case.

> We have no way of making sure nobody
> touches the page while we're writing it out, so after we calculate the checksum
> any changes to the page are going to cause a checksum mismatch.  O_DIRECT buffers are
> user space pages, there's nothing we can do to stop user space from doing stupid
> things.

I understand the technical difficulties; however, I can't agree about "user space [...] doing *stupid* things". If it is not explicitly forbidden, it is legal, not "stupid".

How does the application know that the pages aren't in flight anymore? Is it sufficient to wait for the write() syscall to return? Or does it have to wait for an fsync() to complete?
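
To make the race described in the quoted text concrete, here is a minimal userspace sketch in Python. It is only a model, not btrfs code: zlib.crc32 stands in for the filesystem checksum, and write_with_race() and its modify_in_flight flag are hypothetical names I made up for illustration.

```python
import zlib

PAGE_SIZE = 4096

def checksum(page) -> int:
    # Assumption: zlib.crc32 is only a stand-in for the real filesystem checksum.
    return zlib.crc32(page)

def write_with_race(page: bytearray, modify_in_flight: bool):
    """Model one write; return (stored_csum, csum_of_data_on_disk)."""
    stored_csum = checksum(page)      # checksum is taken before the write
    if modify_in_flight:
        page[0] ^= 0xFF               # the application touches the buffer mid-write
    on_disk = bytes(page)             # what the device actually receives
    return stored_csum, checksum(on_disk)

# Well-behaved writer: the buffer is left alone, so the checksums agree.
stored, actual = write_with_race(bytearray(b"A" * PAGE_SIZE), modify_in_flight=False)
print("stable buffer matches:", stored == actual)

# Writer that changes the buffer in flight: scrub would later report a mismatch.
stored, actual = write_with_race(bytearray(b"A" * PAGE_SIZE), modify_in_flight=True)
print("modified buffer matches:", stored == actual)
```

The data that lands on disk is perfectly valid from the application's point of view; only the stored checksum is stale.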
 
> The options I looked into before were things like detecting the page had changed
> since we calculated the checksum, and re-submitting the write.  This punishes
> applications that do the right thing (databases for example) by forcing us to
> calculate checksums twice.

Are there other cases where the same problem can occur? Is it the same for mmap()?
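
As I understand the "detect the change and re-submit" option quoted above, it could be sketched roughly as follows. Again this is a Python model under my own assumptions, not the kernel implementation: submit_write(), the mutate hook, flip_once() and the retry limit are all hypothetical, and zlib.crc32 stands in for the real checksum.

```python
import zlib

def checksum(data) -> int:
    # Assumption: zlib.crc32 is only a stand-in for the real filesystem checksum.
    return zlib.crc32(data)

def submit_write(page: bytearray, device: list, mutate=None, max_retries: int = 3) -> int:
    """Checksum, 'write', checksum again; resubmit if the buffer changed.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_retries + 1):
        csum_before = checksum(page)
        if mutate is not None:
            mutate(page, attempt)         # hypothetical hook: userspace changing the buffer
        written = bytes(page)             # data that reaches the device
        if checksum(written) == csum_before:
            device.append((written, csum_before))
            return attempt
        # Mismatch: the buffer changed in flight, so try the whole write again.
    raise IOError("buffer kept changing during the write")

def flip_once(page: bytearray, attempt: int) -> None:
    # Simulated misbehaving writer: modifies the buffer during the first attempt only.
    if attempt == 1:
        page[0] ^= 0xFF

disk = []
print("attempts, stable buffer:", submit_write(bytearray(b"A" * 4096), disk))
print("attempts, racy buffer:", submit_write(bytearray(b"A" * 4096), disk, mutate=flip_once))
```

Note that even the well-behaved writer pays for two checksum computations per write, which is the cost objected to in the quoted text.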

> 
> This is a shit situation because users aren't going to understand this
> limitation, and it bites them in the ass with all these weird errors.  I think
> maybe we need to go back to the double-checksum thing by default, and have a
> flag or something for users to set if they know their application behaves
> properly.  

Or... disable checksums for O_DIRECT writes. If you can't trust the checksums 100%, they don't make sense.

> 
> Josef
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5