Re: qcow2 images make scrub believe the filesystem is corrupted.

Hi all,

So this sucks: having to check every single disk image so that it won't
get corrupted, when the same setup works on other filesystems, isn't
exactly stellar.

I changed the qcow2 settings per this page:
https://pve.proxmox.com/wiki/Performance_Tweaks

I went with cache=writeback since it uses neither O_DSYNC nor O_DIRECT
semantics.

I understand the reasons checksumming might fail, and that it isn't
exactly btrfs's fault, but I scoured the entire btrfs wiki and couldn't
find any warning about btrfs + qcow2 (or other image types) in the
wiki pages. Maybe adding a warning would help unlucky people like
myself?

Also, is it safe to enable compress=lzo, or is that also a no-no? I'm
also starting to suspect that discard wasn't the culprit, since I have
had it enabled for more than a year and the only corruption I have
seen is precisely in these images.

| Paulo Dias
| paulo.miguel.dias@xxxxxxxxx

Tempora mutantur, nos et mutamur in illis.


On Fri, Aug 18, 2017 at 2:59 PM, Liu Bo <bo.li.liu@xxxxxxxxxx> wrote:
> On Fri, Aug 18, 2017 at 06:23:18PM +0200, Goffredo Baroncelli wrote:
>> On 08/18/2017 01:39 AM, Josef Bacik wrote:
>> [...]
>> > This is happening because the app (the guest OS in this case, we saw this a lot
>> > with windows guests) is changing the pages while they are in flight.  We
>> > calculate the checksum of the page before it's written, so if it changes while
>> > in flight we'll end up with a csum mismatch.
>> >
>> > To fix this change kvm to not use O_DIRECT or set NODATASUM on your qcow2 image.
>> > You'll have to re-create the image because NODATASUM won't apply to the already
>> > invalid checksums.  Thanks,
>>
>> Hi Josef,
>>
>> could you elaborate: do you are saying that using O_DIRECT is incompatible with DATASUM ?
>>
>
> They're compatible, but applications need to be careful.
>
> O_DIRECT takes userspace page, and it works like
> DIO write
>   p = get_user_page();
>   add p to bio
>   #btrfs submits this bio
>   calc_checksum(bio);
>   submit_bio();
>
> There's a chance that page p got changed between calc_checksum() and
> submit_bio(), which then causes the mismatch.
>
> For buffered IO, dirty page cache pages is synchronized with page
> fault by page lock and page writeback bit.
>
> thanks,
> -liubo
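
For anyone finding this thread later, the race Liu Bo describes above can
be sketched as a toy simulation. This is not kernel code: dio_write and
scrub_ok are hypothetical names made up for illustration, a bytearray
stands in for the userspace page, and zlib.crc32 stands in for the
crc32c that btrfs uses by default. It only shows why a page that changes
after the checksum is taken ends up failing verification later.

```python
import zlib


def dio_write(page, mutate_in_flight):
    """Toy model of the quoted O_DIRECT write path (illustration only)."""
    # calc_checksum(bio): the checksum is taken from the user page first.
    csum = zlib.crc32(page)
    if mutate_in_flight:
        # The guest changes the page between calc_checksum() and submit_bio().
        page[0] ^= 0xFF
    # submit_bio(): whatever is in the page now is what lands on disk.
    on_disk = bytes(page)
    return on_disk, csum


def scrub_ok(on_disk, csum):
    """Recompute the checksum over the stored data, as a scrub would."""
    return zlib.crc32(on_disk) == csum


page = bytearray(4096)
print(scrub_ok(*dio_write(page, mutate_in_flight=False)))
print(scrub_ok(*dio_write(page, mutate_in_flight=True)))
```

The first call models a page that stays stable while in flight, so the
stored data matches its checksum. The second models the guest touching
the page mid-write, which leaves a mismatch even though the disk itself
did nothing wrong.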



