Re: compression disk space saving - what are your results?

On 2015-12-02 04:46, Tomasz Chmielewski wrote:
What are your disk space savings when using btrfs with compression?

I have a 200 GB btrfs filesystem which uses compress=zlib, only stores
text files (logs), mostly multi-gigabyte files.


It's a "single" filesystem, so "df" output matches "btrfs fi df":

# df -h
Filesystem      Size  Used Avail Use% Mounted on
(...)
/dev/xvdb       200G  124G   76G  62% /var/log/remote


# du -sh /var/log/remote/
153G    /var/log/remote/


From these numbers (124 GB used where data size is 153 GB), it appears
that we save around 20% with zlib compression enabled.
Is 20% reasonable saving for zlib? Typically text compresses much better
with that algorithm, although I understand that we have several
limitations when applying that on a filesystem level.
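
The arithmetic behind that estimate can be checked with a short sketch.
It assumes that the "Used" column of df is the compressed on-disk size
and that du reports the uncompressed size, and it ignores metadata
overhead. The function and variable names are made up for illustration.

```python
def savings_fraction(logical_gb: float, on_disk_gb: float) -> float:
    """Return the fraction of space saved: 1 - on_disk / logical."""
    if logical_gb <= 0:
        raise ValueError("logical size must be positive")
    return 1.0 - on_disk_gb / logical_gb


# Figures taken from the df and du output quoted above.
du_reported_gb = 153.0  # data size according to `du -sh` (assumed uncompressed)
df_used_gb = 124.0      # space used according to `df -h` (assumed compressed)

saved = savings_fraction(du_reported_gb, df_used_gb)
print(f"space saved: {saved:.1%}")  # prints "space saved: 19.0%"
```

So "around 20%" is a fair rounding of the numbers shown.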

This is actually an excellent question. A couple of things to note
before I share what I've seen:

1. Text compresses better than most other data with any compression
   algorithm. It is by nature highly patterned and moderately redundant
   data, which is what benefits the most from compression.
2. When BTRFS does in-line compression, it uses 128k blocks. Because of
   this, there are diminishing returns for smaller files when using
   compression.
3. The best compression ratio I've ever seen from zlib on real data is
   about 65-70%, and that was using SquashFS, which is designed to take
   up as little room as possible.
4. LZO gets a worse compression ratio than zlib (around 40-50% if
   you're lucky), but is a _lot_ faster.
5. By playing around with the -c option for defrag, you can compress or
   uncompress different parts of the filesystem, and get a rough idea
   of what compresses best.
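
To illustrate the point about 128k blocks and small files, here is a
rough sketch that compares compressing a buffer as one zlib stream with
compressing it in independent 128 KiB pieces. It is only a model: the
4 KiB allocation unit, the "keep the raw data if compression does not
help" rule, and the synthetic log lines are my assumptions, and it
ignores inline extents and metadata, so it does not reproduce what BTRFS
actually writes.

```python
import zlib

CHUNK = 128 * 1024  # compression unit size mentioned above
BLOCK = 4 * 1024    # assumed on-disk allocation unit (assumption)


def round_up(n: int, unit: int) -> int:
    """Round n up to the next multiple of unit."""
    return -(-n // unit) * unit


def whole_stream_size(data: bytes) -> int:
    """Size of data compressed as a single zlib stream."""
    return len(zlib.compress(data))


def chunked_size(data: bytes, chunk: int = CHUNK, block: int = BLOCK) -> int:
    """Modelled on-disk size when each chunk is compressed independently."""
    total = 0
    for start in range(0, len(data), chunk):
        piece = data[start:start + chunk]
        compressed = zlib.compress(piece)
        # Assumption: never store a result larger than the raw piece.
        stored = min(len(piece), len(compressed))
        # Each stored piece occupies whole allocation blocks.
        total += round_up(stored, block)
    return total


def make_log(lines: int) -> bytes:
    """Build synthetic, repetitive log text (hypothetical sample data)."""
    return b"".join(
        b"2015-12-02T04:46:%02d host%d sshd[%d]: session opened for user u%d\n"
        % (i % 60, i % 7, 1000 + i % 500, i % 13)
        for i in range(lines)
    )


big_log = make_log(20000)  # a bit over a megabyte, spans several chunks
small_log = make_log(10)   # well under one allocation block

for name, data in (("big", big_log), ("small", small_log)):
    print(name, len(data), whole_stream_size(data), chunked_size(data))
```

In this model the small file still occupies a full block no matter how
well it compresses, which is the diminishing return for small files
described in point 2.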

Now, to my results. These are all from my desktop system, with no deduplication, and the data for zlib is somewhat outdated (I've not used it since LZO support stabilized).

For the filesystems I have on traditional hard disks:
1. For /home (mostly text files, some SQLite databases, and a couple of
   git repositories), I get about 15-20% space savings with zlib, and
   about a 2-4% performance hit. I get about 5-10% space savings with
   lzo, but performance is about 5-8% better than uncompressed.
2. For /usr/src (50/50 mix of text and executable code), I get about
   25% space savings with zlib with a 5-7% hit to performance, and
   about 10% with lzo with a 7% boost in performance relative to
   uncompressed.
3. For /usr/portage and /var/lib/layman (lots of small text files, a
   number of VCS repos, and about 2000 compressed source archives), I
   get about 25% space savings with zlib, with a 15% performance hit
   (yes, seriously 15%), and with lzo I get about 25% space savings
   with no measurable performance difference relative to uncompressed.

For the filesystems I have on SSDs:
1. For /var/tmp (huge assortment of different things, but usually
   similar to /usr/src because this is where packages get built), I get
   almost no space savings with either type of compression, and see a
   performance reduction of about 5% for both.
2. For /var/log (lots of text files; notably, I don't compress rotated
   logs, and I don't have systemd's insane binary log files), I get
   about 30% space savings with zlib, but it makes the _whole_ system
   run about 5% slower, and I get about 20% space savings with lzo,
   with no measurable performance difference relative to uncompressed.
3. For /var/spool (lots of really short text files, mostly stuff from
   postfix and CUPS), I actually see higher disk usage with both types
   of compression, but almost zero performance impact from either of
   them.
4. For /boot (a couple of big binary files that already have built-in
   compression), I see no net space savings, and don't have any numbers
   regarding performance impact.
5. For / (everything that isn't on one of the other filesystems I
   listed above), I see about 10-20% space savings from zlib, with a
   roughly 5% performance hit, and about 5-15% space savings with lzo,
   with no measurable performance difference.
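
On the /var/spool result: one thing that may contribute is that very
short inputs tend to come out of a compressor larger than they went in,
because of the stream header and checksum. The sketch below shows this
for a raw zlib stream only. It says nothing about how BTRFS stores such
files on disk, and the sample message is made up.

```python
import zlib


def zlib_overhead(payload: bytes) -> int:
    """Bytes added (positive) or saved (negative) by zlib-compressing payload."""
    return len(zlib.compress(payload)) - len(payload)


# Hypothetical spool-sized message, and a much longer repetitive one.
short_message = b"Subject: cron job done\n"
long_message = short_message * 200

print("short:", len(short_message), "bytes, overhead", zlib_overhead(short_message))
print("long: ", len(long_message), "bytes, overhead", zlib_overhead(long_message))
```

The short message grows when compressed, while the long repetitive one
shrinks a lot.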
