On 2015-12-02 22:03, Austin S Hemmelgarn wrote:
>> From these numbers (124 GB used where data size is 153 GB), it appears
>> that we save around 20% with zlib compression enabled.
>>
>> Is 20% a reasonable saving for zlib? Typically text compresses much
>> better with that algorithm, although I understand that we have several
>> limitations when applying it at the filesystem level.
> This is actually an excellent question. A couple of things to note
> before I share what I've seen:
>
> 1. Text compresses better with any compression algorithm. It is by
> nature highly patterned and moderately redundant data, which is what
> benefits the most from compression.
It looks like compress=zlib does not compress very well. Following
Duncan's suggestion, I've changed it to compress-force=zlib and
re-copied the data to make sure the files are compressed.
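
(Roughly what that looked like - a remount with the new option, then a
fresh copy so existing extents get rewritten compressed; the staging
path here is just an example:

# mount -o remount,compress-force=zlib /var/log/remote
# cp -a /mnt/staging/remote/. /var/log/remote/

See the btrfs mount option documentation for details.)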
The compression ratio is much better now (on a slightly changed data
set):
# df -h
/dev/xvdb 200G 24G 176G 12% /var/log/remote
# du -sh /var/log/remote/
138G /var/log/remote/
So, 138 GB of files use just 24 GB on disk - nice!
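
That works out to roughly 24/138 ~= 17% of the original size, i.e.
about 83% saved, versus the ~20% we saw before. A quick check of the
division from the shell (bc is just one way to do it):

# echo "scale=2; 24/138" | bc
.17

btrfs filesystem df /var/log/remote can also confirm how much of those
24 GB is data rather than metadata.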
However, I would still expect compress=zlib to have almost the same
effect as compress-force=zlib for 100% text files/logs.
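
(As an aside: if re-copying the data ever becomes inconvenient, forcing
recompression in place should also be possible with something like

# btrfs filesystem defragment -r -czlib /var/log/remote

though I haven't compared its results against a plain re-copy.)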
Tomasz Chmielewski
http://wpkg.org