Hi,

For the last several days I have been working on an entropy calculation that could be used in the btrfs compression code (to detect badly compressible data). I have implemented:

 - average of byte values (has accuracy problems)
 - Shannon entropy
 - Shannon entropy with integer-only logic (accuracy within +-0.5% of the float version)

All of it is written in C with some C++ inserts and can easily be ported to kernel code if needed.

The repo is here: https://github.com/Nefelim4ag/Entropy_Calculation

A simplified sketch of the integer variant is appended at the end of this mail.

It would be great if someone were interested in profiling and performance-testing this, because my naive tests with ~$ time <binary> and 8 MB of test data show that lzo at levels 1-6 is the fastest way to detect whether data is compressible, and that integer Shannon entropy is about five times faster than any gzip level.

Thanks!

P.S. I got this idea from https://btrfs.wiki.kernel.org/index.php/Project_ideas:
  "Compression enhancements - heuristics - try to learn in a simple way how well the file data compress, or not"

--
Have a nice day,
Timofey.
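P.P.S. For anyone who wants the idea without cloning the repo, here is a minimal, simplified sketch of the integer-only Shannon entropy (illustrative only, not the exact code from the repository; FP_SHIFT, ilog2_fp and shannon_entropy_int are names invented for this example, and it uses the GCC/Clang __builtin_clzll). It builds a byte histogram and evaluates H = sum(c_i * (log2(N) - log2(c_i))) / N with a fixed-point log2, so the result is bits per byte scaled by 2^FP_SHIFT:

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

#define FP_SHIFT 10	/* fraction bits of the fixed point, ~0.1% resolution */

/*
 * Fixed-point base-2 logarithm: returns log2(x) << FP_SHIFT.
 * The integer part is the position of the most significant bit;
 * each fraction bit is recovered by squaring the mantissa, which
 * is kept in Q31 so the 64-bit multiply cannot overflow.
 */
static uint64_t ilog2_fp(uint64_t x)
{
	int msb = 63 - __builtin_clzll(x);	/* x must be non-zero */
	uint64_t result = (uint64_t)msb << FP_SHIFT;
	uint64_t m = (x << (63 - msb)) >> 32;	/* mantissa in [1,2), Q31 */

	for (int i = FP_SHIFT - 1; i >= 0; i--) {
		m = (m * m) >> 31;		/* square; still Q31 */
		if (m >= (1ULL << 32)) {	/* mantissa reached [2,4)? */
			m >>= 1;		/* renormalize to [1,2) */
			result |= 1ULL << i;	/* this fraction bit is 1 */
		}
	}
	return result;
}

/*
 * Shannon entropy in bits per byte, scaled by 1 << FP_SHIFT.
 * Uses H = sum(c * (log2(N) - log2(c))) / N over the byte histogram,
 * which needs only integer math. 0 means a single repeated byte,
 * 8 << FP_SHIFT means uniformly random (incompressible) data.
 */
static uint32_t shannon_entropy_int(const uint8_t *buf, size_t len)
{
	uint32_t counts[256] = {0};
	uint64_t sum = 0, log_len;

	if (len == 0)
		return 0;
	for (size_t i = 0; i < len; i++)
		counts[buf[i]]++;

	log_len = ilog2_fp(len);
	for (int c = 0; c < 256; c++) {
		if (counts[c])	/* skip zero counts: log2(0) is undefined */
			sum += counts[c] * (log_len - ilog2_fp(counts[c]));
	}
	return (uint32_t)(sum / len);
}

int main(void)
{
	uint8_t sample[4096];
	uint32_t seed = 42;

	/* fill with LCG pseudo-random bytes: entropy should come out near 8.0 */
	for (size_t i = 0; i < sizeof(sample); i++) {
		seed = seed * 1664525u + 1013904223u;
		sample[i] = seed >> 24;
	}

	uint32_t h = shannon_entropy_int(sample, sizeof(sample));
	/* a heuristic could skip compression when h exceeds e.g. 7.5 bits/byte,
	 * i.e. (15 << FP_SHIFT) / 2 in this fixed-point scale */
	printf("entropy: %u.%03u bits/byte\n", h >> FP_SHIFT,
	       ((h & ((1u << FP_SHIFT) - 1)) * 1000) >> FP_SHIFT);
	return 0;
}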
