Based on kdave for-next
Patches in short:
1. Move heuristic to use compression workspaces
A bit tricky, but it works.
2. Add heuristic counters and buffer to workspaces
3. Implement simple input data sampling
It takes 16-byte samples at 256-byte shifts
over the input data, and collects info about how many
different bytes (symbols) have been found in the sample data
4. Implement check for a zeroed sample
Just check that all bytes in the sample are 0
5. Add code to calculate
how many unique bytes have been found in the sample data
That can quickly detect easily compressible data
6. Add code to calculate the byte core set size
i.e. how many unique bytes make up 90% of the sample data
That code requires the counters in the bucket to be sorted
That can detect easily compressible data with many repeated bytes
That can detect incompressible data with evenly distributed bytes
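
For illustration only, here is a minimal userspace sketch of the steps in
patches 3-6. It is not the code from fs/btrfs/heuristic.c: all function and
macro names are hypothetical, qsort() stands in for the kernel sort(), and
only the 16-byte sample, 256-byte shift and 90% values come from the
description above. The thresholds that turn these numbers into a
compress/skip decision are not stated here, so they are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SAMPLE_LEN   16   /* bytes copied per sample (from the description) */
#define SAMPLE_SHIFT 256  /* distance between sample starts (from the description) */
#define BUCKET_SIZE  256  /* one counter per possible byte value */
#define CORE_SET_PCT 90   /* core set covers 90% of the sample */

/* Copy SAMPLE_LEN bytes every SAMPLE_SHIFT bytes; return bytes collected. */
static size_t sample_input(const uint8_t *in, size_t in_len,
                           uint8_t *sample, size_t sample_cap)
{
    size_t off, used = 0;

    assert(in != NULL && sample != NULL);
    for (off = 0; off + SAMPLE_LEN <= in_len; off += SAMPLE_SHIFT) {
        if (used + SAMPLE_LEN > sample_cap)
            break;
        memcpy(sample + used, in + off, SAMPLE_LEN);
        used += SAMPLE_LEN;
    }
    return used;
}

/* Return 1 if every sampled byte is zero, 0 otherwise. */
static int sample_is_zeroed(const uint8_t *sample, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (sample[i] != 0)
            return 0;
    return 1;
}

/* Count how often each byte value occurs in the sample. */
static void fill_bucket(const uint8_t *sample, size_t len, uint32_t *bucket)
{
    size_t i;

    memset(bucket, 0, BUCKET_SIZE * sizeof(*bucket));
    for (i = 0; i < len; i++)
        bucket[sample[i]]++;
}

/* Byte set size: number of distinct byte values seen in the sample. */
static unsigned byte_set_size(const uint32_t *bucket)
{
    unsigned i, n = 0;

    for (i = 0; i < BUCKET_SIZE; i++)
        if (bucket[i])
            n++;
    return n;
}

/* qsort() comparator: larger counters first. */
static int cmp_desc(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;

    return (x < y) - (x > y);
}

/*
 * Byte core set size: the fewest symbols whose counters add up to
 * CORE_SET_PCT percent of the sample. Sorts the bucket in place.
 */
static unsigned byte_core_set_size(uint32_t *bucket, size_t sample_len)
{
    uint64_t need = (uint64_t)sample_len * CORE_SET_PCT / 100;
    uint64_t sum = 0;
    unsigned i;

    qsort(bucket, BUCKET_SIZE, sizeof(*bucket), cmp_desc);
    for (i = 0; i < BUCKET_SIZE && sum < need; i++)
        sum += bucket[i];
    return i;
}
```

With this sketch, an all-zero 4096-byte input gives a 256-byte zeroed sample
with a byte set and core set of 1. Because sorting discards which symbol each
counter belongs to, byte_set_size() has to run before byte_core_set_size().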
Changes v1 -> v2:
- Change input data iterator shift 512 -> 256
- Replace magic macro numbers with direct values
- Drop useless symbol population in bucket,
as no one cares where and what symbol is stored
in the bucket for now
Changes v2 -> v3 (only patch #3 updated):
- Fix u64 division problem by using u32 for input_size
- Fix input size calculation start - end -> end - start
- Add missing sort.h header
Changes v3 -> v4 (only patch #1 updated):
- Change counter type in bucket item u16 -> u32
- Drop other fields from bucket item for now,
no one uses them
Changes v4 -> v5:
- Move heuristic code to external file
- Make heuristic use compression workspaces
- Add check for a zeroed sample
Timofey Titovets (6):
Btrfs: heuristic make use compression workspaces
Btrfs: heuristic workspace add bucket and sample items
Btrfs: Implement heuristic sampling logic
Btrfs: heuristic add detection of zeroed sample
Btrfs: heuristic add byte set calculation
Btrfs: heuristic add byte core set calculation
fs/btrfs/Makefile | 2 +-
fs/btrfs/compression.c | 18 ++--
fs/btrfs/compression.h | 7 +-
fs/btrfs/heuristic.c | 220 +++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 234 insertions(+), 13 deletions(-)
create mode 100644 fs/btrfs/heuristic.c
--
2.14.1