Re: compress-force not really forcing compression?


On 12/23/2018 1:16 AM, Adam Borowski wrote:
> On Sun, Dec 23, 2018 at 12:24:02AM +0000, Paul Jones wrote:
>>> IMHO the more pertinent question is:
>>>
>>> If a file has portions which are not easily compressible, does that imply all
>>> future writes are also incompressible? IMO no, so I think what will be prudent
>>> is to remove FORCE_COMPRESS altogether and make the code act as if it's
>>> always on.
>>>
>>> Any opinions?
>>
>> That is a good idea.  If I turn on compression I would expect everything
>> to be compressed, except in cases where there is no size benefit.
>
> I expect that the vast majority of files consist of blocks of similar
> compressibility.  Thus, finding a block that fails to compress strongly
> suggests other blocks are either incompressible as well or compress only
> minimally.  Refusing to waste time, electricity and fragmentation in such
> a case is a good default, I think.
>
> But, if you believe this should be changed, there's an easy experiment you
> can try: for all files on your filesystem, chop every file into 128KB pieces
> and compress each of them with your chosen algorithm.  Noting the compressed
> size of every block in a file that had at least one block fail to compress
> would give us some data.

I would suggest looking at Windows DLL files installed as part of a Wine setup as a potential candidate for this. They tend to have very long runs of null bytes scattered seemingly randomly throughout the file (padding meant for hot patching, even though you can't reliably hot-patch DLLs on Windows) and use UTF-16 strings. As a result, the actual machine code generally doesn't compress well, but most of the rest of the file does. Fixed-size preallocated VM disk images would be another good candidate; just wipe the free space with zeroes from inside the VM before testing them.
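
As a rough illustration of the experiment Adam describes above, here is a minimal Python sketch. It is not anything taken from btrfs: it uses userspace zlib (at an arbitrarily chosen level) as a stand-in for the filesystem's compressor, it assumes a chunk has "failed" when the compressed output is no smaller than the input, and the names `chunk_sizes`, `failed` and `survey` are made up for this example.

```python
import os
import zlib

CHUNK_SIZE = 128 * 1024  # 128 KiB pieces, as suggested in the quoted experiment


def chunk_sizes(path, chunk_size=CHUNK_SIZE, level=3):
    """Return [(raw_size, compressed_size), ...] for each chunk of a file.

    zlib and the level are assumptions; swap in whichever algorithm you
    actually want to evaluate.
    """
    sizes = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            sizes.append((len(chunk), len(zlib.compress(chunk, level))))
    return sizes


def failed(raw, compressed):
    """Assumed criterion: a chunk fails if compression saves nothing."""
    return compressed >= raw


def survey(root):
    """Yield (path, sizes) for each regular file with at least one failed chunk."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, sockets, device nodes and the like.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                sizes = chunk_sizes(path)
            except OSError:
                continue  # unreadable file, ignore it
            if any(failed(raw, comp) for raw, comp in sizes):
                yield path, sizes
```

Iterating over `survey("/some/dir")` and printing the per-chunk ratio `comp / raw` for each reported file would show whether files with one incompressible chunk really are incompressible throughout, or whether (as with the DLL and disk-image cases) most of the other chunks still shrink nicely.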

Realistically though, I see a couple of issues with the default behavior:

* There's no way for a regular user to figure out if a file actually is transparently compressed or not.
* Without editing the filesystem directly, there's no way to preemptively set the bit in metadata that tells BTRFS to not try to compress a file, and there's no way to reset it either.
* The default behavior happens to be what `chattr +c` honors, which sometimes leads to unexpected behavior (I, and most people I know, would expect `chattr +c` to behave like `compress-force`, not `compress`).


