Re: zstd compression

2017-11-16 19:32 GMT+03:00 Austin S. Hemmelgarn <ahferroin7@xxxxxxxxx>:
> On 2017-11-16 08:43, Duncan wrote:
>>
>> Austin S. Hemmelgarn posted on Thu, 16 Nov 2017 07:30:47 -0500 as
>> excerpted:
>>
>>> On 2017-11-15 16:31, Duncan wrote:
>>>>
>>>> Austin S. Hemmelgarn posted on Wed, 15 Nov 2017 07:57:06 -0500 as
>>>> excerpted:
>>>>
>>>>> The 'compress' and 'compress-force' mount options only impact newly
>>>>> written data.  The compression used is stored with the metadata for
>>>>> the extents themselves, so any existing data on the volume will be
>>>>> read just fine with whatever compression method it was written with,
>>>>> while new data will be written with the specified compression method.
>>>>>
>>>>> If you want to convert existing files, you can use the '-c' option to
>>>>> the defrag command to do so.
>>>>
>>>>
>>>> ... Being aware of course that using defrag to recompress files like
>>>> that will break 100% of the existing reflinks, effectively (near)
>>>> doubling data usage if the files are snapshotted, since the snapshot
>>>> will now share 0% of its extents with the newly compressed files.
>>>
>>> Good point, I forgot to mention that.
>>>>
>>>>
>>>> (The actual effect shouldn't be quite that bad, as some files are
>>>> likely to be uncompressed due to not compressing well, and I'm not sure
>>>> if defrag -c rewrites them or not.  Further, if there's multiple
>>>> snapshots data usage should only double with respect to the latest one,
>>>> the data delta between it and previous snapshots won't be doubled as
>>>> well.)
>>>
>>> I'm pretty sure defrag is equivalent to 'compress-force', not
>>> 'compress', but I may be wrong.
>>
>>
>> But... compress-force doesn't actually force compression _all_ the time.
>> Rather, it forces btrfs to continue checking whether compression is worth
>> it for each "block"[1] of the file, instead of giving up if the first
>> quick try at the beginning says that block won't compress.
>>
>> So what I'm saying is that if the snapshotted data is already compressed,
>> think (pre-)compressed tarballs or image files such as jpeg that are
>> unlikely to /easily/ compress further and might well actually be _bigger_
>> once the compression algorithm is run over them, defrag -c will likely
>> fail to compress them further even if it's the equivalent of compress-
>> force, and thus /should/ leave them as-is, not breaking the reflinks of
>> the snapshots and thus not doubling the data usage for that file, or more
>> exactly, that extent of that file.
>>
>> Tho come to think of it, is defrag -c that smart, to actually leave the
>> data as-is if it doesn't compress further, or does it still rewrite it
>> even if it doesn't compress, thus breaking the reflink and doubling the
>> usage regardless?
>
> I'm not certain how compression factors in, but if you aren't compressing
> the file, it will only get rewritten if it's fragmented (which is why
> defragmenting the system root directory is usually insanely fast on most
> systems; stuff there is almost never fragmented).
>>
>>
>> ---
>> [1] Block:  I'm not positive it's the usual 4K block in this case.  I
>> think I read that it's 16K, but I might be confused on that.  But
>> regardless the size, the point is, with compress-force btrfs won't give
>> up like simple compress will if the first "block" doesn't compress, it'll
>> keep trying.
>>
>> Of course the new compression heuristic changes this a bit too, but the
>> same general idea holds, compress-force continues to try for the entire
>> file, compress will give up much faster.
>
> I'm not actually sure, I would think it checks 128k blocks of data (the
> effective block size for compression), but if it doesn't it should be
> checking at the filesystem block size (which means 16k on most recently
> created filesystems).
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Defragmenting data on btrfs simply rewrites the data if it doesn't
meet some criteria.
And all -c does is say which compression method to apply to newly
written data, no more, no less.
On the write side, the FS sees long/short data ranges to write (see
compress_file_range()); if compression is needed, it splits the data
into 128 KiB chunks and passes them to the compression logic.
The compression logic gives up in 2 cases:
1. Compressing the first 2 (or 3?) page-sized blocks of a 128 KiB
chunk makes the data bigger -> give up -> write data as is
2. After compression is done, compression didn't free at least one
sector size -> write data as is
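A minimal sketch of those two bail-out rules (the function names, the page
size, and the "first couple of pages" threshold are my assumptions for
illustration, not the actual kernel code):

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096 /* assumed 4 KiB pages; illustrative only */

/* Case 1: compressing the first page-sized blocks of the chunk has
 * already produced more output than input -> give up early. */
static int early_give_up(size_t pages_in, size_t bytes_out)
{
    return bytes_out > pages_in * PAGE_SIZE;
}

/* Case 2: compression finished, but did not free at least one
 * sector -> write the data as is. */
static int final_give_up(size_t in_len, size_t out_len, size_t sectorsize)
{
    return out_len + sectorsize > in_len;
}
```

So a 128 KiB chunk that compresses to, say, 126 KiB on a 4 KiB-sector
filesystem would still be stored uncompressed, since less than one full
sector was saved.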

i.e.
If you write 16 KiB at a time, btrfs will compress each separate 16 KiB write.
If you write 1 MiB at a time, btrfs will split it into 128 KiB chunks.
If you write 1025 KiB, btrfs will split it into 128 KiB chunks and the
last 1 KiB will be written as is.
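The splitting arithmetic above can be sketched like this (`split_write`
and `CHUNK_SIZE` are names I made up for illustration):

```c
#include <stddef.h>

#define CHUNK_SIZE (128 * 1024) /* btrfs compresses in 128 KiB chunks */

/* A write of `len` bytes yields some number of full 128 KiB chunks
 * handed to the compressor, plus a tail of len % 128 KiB. */
static size_t split_write(size_t len, size_t *tail)
{
    *tail = len % CHUNK_SIZE;
    return len / CHUNK_SIZE;
}
```

For the examples above: 16 KiB gives 0 full chunks and a 16 KiB range,
1 MiB gives exactly 8 chunks, and 1025 KiB gives 8 chunks plus a 1 KiB
tail.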

JFYI:
All the heuristic logic does (i.e. with compress, not compress-force) is:
On every write, the kernel checks whether compression is needed via
inode_need_compress(),
i.e. it checks flags like compress, nocompress, compress-force and
defrag-compress (which works like compress-force, AFAIK).

Internal logic:
 - Up to kernel 4.14:
   If compression of the first 128 KiB of a file fails by any criterion
-> mark the file as incompressible -> skip compression for new data
 - On 4.15+, if the heuristic works as expected (it does by its logic):
   While checking a file (see inode_need_compress()), if it's marked for
compression and it's not compress-force, the heuristic scans the written
data range for patterns and anti-patterns of compressible data, and can
decide for every written range whether it's worth compressing, instead
of a blind decision based on prefix estimation.
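That per-write decision flow might be sketched like this (the enum and
function names here are invented for illustration; they mirror the mount
options, not the kernel's actual identifiers):

```c
/* Hypothetical sketch of the 4.15+ decision flow described above. */
enum compress_mode { MODE_NOCOMPRESS, MODE_COMPRESS, MODE_COMPRESS_FORCE };

static int need_compress(enum compress_mode mode, int heuristic_ok)
{
    if (mode == MODE_NOCOMPRESS)
        return 0;
    if (mode == MODE_COMPRESS_FORCE) /* force: skip the heuristic */
        return 1;
    /* plain 'compress': the per-write heuristic decides, instead of
     * a blind guess based on compressing a prefix of the file */
    return heuristic_ok;
}
```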

Thanks
-- 
Have a nice day,
Timofey.



