On Fri, Jan 3, 2020 at 7:39 AM Leszek Dubiel <leszek@xxxxxxxxx> wrote:
>
> ** number of files by given size
>
> root@wawel:/mnt/root/orion# cat disk_usage | perl -MData::Dumper -e '
>   $Data::Dumper::Sortkeys = 1;
>   while (<>) {
>       chomp;
>       my ($byt, $nam) = split /\t/, $_, -1;
>       if (index("$las/", $nam) == 0) { $dir++; }
>       else {
>           $filtot++;
>           for $p (1 .. 99) { if ($byt < 10 ** $p) { $fil{"num of files size <10^$p"}++; last; } }
>       };
>       $las = $nam;
>   };
>   print "\ndirectories: $dir\ntotal num of files: $filtot\n",
>         "\nnumber of files grouped by size: \n", Dumper(\%fil) '
>
> directories: 1314246
> total num of files: 10123960
>
> number of files grouped by size:
> $VAR1 = {
> 'num of files size <10^1' => 3325886,
> 'num of files size <10^2' => 3709276,
> 'num of files size <10^3' => 789852,
> 'num of files size <10^4' => 1085927,
> 'num of files size <10^5' => 650571,
> 'num of files size <10^6' => 438717,
> 'num of files size <10^7' => 116757,
> 'num of files size <10^8' => 6638,
> 'num of files size <10^9' => 323,
> 'num of files size <10^10' => 13
> };
Is that really ~7.8 million files under 1KiB?? (Totalling the first
three buckets: 3,325,886 + 3,709,276 + 789,852 = 7,825,014.)
Compression may not do much for files that small, and I'm not sure
which algorithm would do the best job; they all probably want quite a
bit more than 1KiB of input before they become efficient.
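
If you want numbers instead of guesses, a quick experiment is easy
enough; the device /dev/sdX, mount point /mnt/test, and sample data
path below are placeholders, and compsize is the separate
btrfs-compsize tool:

  mkfs.btrfs /dev/sdX
  mount -o compress=zstd /dev/sdX /mnt/test
  cp -a /path/to/sample/data /mnt/test/
  compsize /mnt/test

Keep in mind plain compress= lets btrfs skip files its heuristic
thinks won't compress; compress-force=zstd makes it try everything,
which could matter with this many tiny files.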
But nodesize 64KiB might be a big deal...worth testing.
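
Nodesize can only be set at mkfs time, so that test needs a fresh
filesystem anyway; something like this (scratch device /dev/sdX is a
placeholder):

  mkfs.btrfs --nodesize 64k /dev/sdX

If I remember right, files up to max_inline (2048 bytes by default)
get stored inline in the metadata leaves, so with ~7.8 million files
under 1KiB a lot of this data ends up in the trees that nodesize
affects.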
--
Chris Murphy