On 2020-05-16 19:18, Filipe Manana wrote:
On Sat, May 16, 2020 at 5:51 PM A L <mail@xxxxxxxxxxxxxx> wrote:
Dear all,
I did some testing on copying files with the +c (compression) xattrs set.
As far as I can tell, 'cp -a' only sets any xattrs after copying the data. This means that a compressed file should end up without compression, but still with the +c xattr set. However, this is not entirely true. Some small amount of data is still getting compressed.
I would like to understand why.
As discussed on the mailing list:
cp copies the xattr only after copying the file data. Since the data
is written to the destination using buffered IO, it is possible that
while copying the data the system flushes dirty pages for whatever
reason (due to memory pressure, someone called sync(2), etc) - this
data will not be compressed since the file doesn't yet have the
compression xattr. If the remaining data is flushed after cp finishes,
then that data can end up compressed, since the file has the
compression xattr at that point. Typically for small files, all the
data ends up getting flushed after cp finishes, so we don't see any
surprising behaviour.
I'll look into changing cp's behaviour to copy xattrs before file
data next week, unless you or someone else is interested in doing it.
Thanks.
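For illustration only, the reordering proposed above (xattrs first, data second) could look roughly like the sketch below. It is written in Python rather than cp's C and is not the coreutils implementation; the function name copy_xattrs_then_data is hypothetical, and the os.listxattr/os.getxattr/os.setxattr calls are only available on Linux. Attributes that cannot be read or set are silently skipped, which a real tool would probably want to report.

```python
import os
import shutil


def copy_xattrs_then_data(src, dst):
    """Create dst, copy src's xattrs onto it, and only then write the data.

    Hypothetical helper for illustration; not how cp is implemented.
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # Step 1: copy extended attributes onto the still-empty destination,
        # so that any later writeback already sees them.
        try:
            # os.listxattr only exists on Linux builds of Python.
            names = os.listxattr(fin.fileno()) if hasattr(os, "listxattr") else []
        except OSError:
            names = []  # filesystem without xattr support
        for name in names:
            try:
                value = os.getxattr(fin.fileno(), name)
                os.setxattr(fout.fileno(), name, value)
            except OSError:
                pass  # e.g. no permission for this attribute namespace

        # Step 2: write the data with ordinary buffered IO. The kernel may
        # flush dirty pages at any time from here on.
        shutil.copyfileobj(fin, fout)
```

With this ordering, a flush that happens in the middle of the copy would find the attributes already present on the destination, which is the situation the explanation above says is missing today.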
Based on what you say, the file operations are happening asynchronously in
the background, rather than synchronously. I guess 'cp' and other tools
like it should issue a 'fsync' call between setting the xattrs and
writing data? Is this specific to Btrfs, or is it a Linux design choice?
Also, thanks for looking into changing cp to do the xattrs before
writing data. I had also asked about this on the coreutils mailing list:
https://lists.gnu.org/archive/html/coreutils/2020-05/msg00011.html
Thanks