Re: How can btrfs take 23sec to stat 23K files from an SSD?

Hi Marc,

On Wednesday, 1 August 2012, Marc MERLIN wrote:
> On Wed, Aug 01, 2012 at 01:08:46PM +0700, Fajar A. Nugraha wrote:
> > > If it were a random crappy SSD from a random vendor, I'd blame the
> > > SSD, but I have a hard time believing that Samsung is selling SSDs
> > > that are slower than hard drives at random IO and 'seeks'.
> > 
> > You'd be surprised on how badly some vendors can screw up :)
> 
> At some point, it may come down to that indeed :-/
> I'm still hopeful that Samsung didn't, but we'll see.

It's getting quite strange.

I lost track of whether you have done this already, but if you haven't,
please post some

vmstat 1

iostat -xd 1

on the device while it is being slow.

I am interested in I/O wait, latencies, and disk utilization.
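
Something like this should work to capture both while reproducing the
slowness (an untested sketch; /dev/sdX and /path/to/files are placeholders
for your actual device and the directory you run du on):

vmstat 1 > vmstat.log &
iostat -xd 1 /dev/sdX > iostat.log &
echo 3 > /proc/sys/vm/drop_caches ; du -sch /path/to/files
kill %1 %2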


For comparison, here is data from an Intel SSD 320 in a ThinkPad T520 during

merkaba:~> echo 3 > /proc/sys/vm/drop_caches ; du -sch /usr

on BTRFS with kernel 3.5:


martin@merkaba:~> vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  1  21556 4442668   2056 502352    0    0   194    85  247  120 11  2 87  0
 1  2  21556 4408888   2448 514884    0    0 11684   328 4975 24585  5 16 65 14
 1  0  21556 4389880   2448 528060    0    0 13400     0 4574 23452  2 16 68 14
 3  1  21556 4370068   2448 545052    0    0 18132     0 5499 27220  1 18 64 16
 1  0  21556 4350228   2448 555580    0    0 10856     0 4122 25339  3 16 67 14
 1  1  21556 4315604   2448 569756    0    0 12648     0 4647 31153  5 14 66 15
 0  1  21556 4295652   2456 581480    0    0 11548    56 4093 24618  2 13 69 16
 0  1  21556 4286720   2456 591580    0    0 10824     0 3750 21445  1 12 71 16
 0  1  21556 4266308   2456 603620    0    0 12932     0 4841 26447  4 12 68 17
 1  0  21556 4248228   2456 613808    0    0 10264     4 3703 22108  1 13 71 15
 5  1  21556 4231976   2456 624356    0    0 10540     0 3581 20436  1 10 72 17
 0  1  21556 4197168   2456 639108    0    0 12952     0 4738 28223  4 15 66 15
 4  1  21556 4178456   2456 650552    0    0 11656     0 4234 23480  2 14 68 16
 0  1  21556 4163616   2456 662992    0    0 13652     0 4619 26580  1 16 70 13
 4  1  21556 4138288   2456 675696    0    0 13352     0 4422 22254  1 16 70 13
 1  0  21556 4113204   2456 689060    0    0 13232     0 4312 21936  1 15 70 14
 0  1  21556 4085532   2456 704160    0    0 14972     0 4820 24238  1 16 69 14
 2  0  21556 4055740   2456 719644    0    0 15736     0 5099 25513  3 17 66 14
 0  1  21556 4028612   2456 734380    0    0 14504     0 4795 25052  3 15 68 14
 2  0  21556 3999108   2456 749040    0    0 14656    16 4672 21878  1 17 69 13
 1  1  21556 3972732   2456 762108    0    0 12972     0 4717 22411  1 17 70 13
 5  0  21556 3949684   2584 773484    0    0 11528    52 4837 24107  3 15 67 15
 1  0  21556 3912504   2584 787420    0    0 12156     0 4883 25201  4 15 67 14


martin@merkaba:~> iostat -xd 1 /dev/sda
Linux 3.5.0-tp520 (merkaba)     01.08.2012      _x86_64_        (4 CPU)

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               1,29     1,44   11,58   12,78   684,74   299,75    80,81     0,24    9,86    0,95   17,93   0,29   0,71

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 2808,00    0,00 11232,00     0,00     8,00     0,57    0,21    0,21    0,00   0,19  54,50

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 2967,00    0,00 11868,00     0,00     8,00     0,63    0,21    0,21    0,00   0,21  60,90

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00    11,00 2992,00    4,00 11968,00    56,00     8,03     0,64    0,22    0,22    0,25   0,21  62,00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 2680,00    0,00 10720,00     0,00     8,00     0,70    0,26    0,26    0,00   0,25  66,70

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 3153,00    0,00 12612,00     0,00     8,00     0,72    0,23    0,23    0,00   0,22  69,30

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 2769,00    0,00 11076,00     0,00     8,00     0,63    0,23    0,23    0,00   0,21  58,00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 2523,00    1,00 10092,00     4,00     8,00     0,74    0,29    0,29    0,00   0,28  71,30

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 3026,00    0,00 12104,00     0,00     8,00     0,73    0,24    0,24    0,00   0,21  64,00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 3069,00    0,00 12276,00     0,00     8,00     0,67    0,22    0,22    0,00   0,20  62,00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 3346,00    0,00 13384,00     0,00     8,00     0,64    0,19    0,19    0,00   0,18  59,90

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 3188,00    0,00 12752,00     0,00     8,00     0,80    0,25    0,25    0,00   0,17  54,00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 3433,00    0,00 13732,00     0,00     8,00     1,03    0,30    0,30    0,00   0,17  57,00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 3565,00    0,00 14260,00     0,00     8,00     0,92    0,26    0,26    0,00   0,16  57,30

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 3972,00    0,00 15888,00     0,00     8,00     1,13    0,29    0,29    0,00   0,16  62,90

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 3743,00    0,00 14972,00     0,00     8,00     1,03    0,28    0,28    0,00   0,16  59,40

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 3408,00    0,00 13632,00     0,00     8,00     1,08    0,32    0,32    0,00   0,17  56,70

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0,00     0,00 3730,00    3,00 14920,00    16,00     8,00     1,14    0,31    0,31    0,00   0,15  56,30




I also suggest running fio with the ssd-test example on the SSD. I have
some comparison data available for my setup; it should even be publicly
available in my ADMIN magazine article about fio. I used slightly
different fio jobs with block sizes of 2k to 16k, but it's similar enough,
and I might even have some 4k examples at hand or can easily create one.
I also raised the size and duration a bit.

An example based on what's in my article:

[global]
ioengine=libaio
direct=1
iodepth=64
runtime=60

filename=testfile
size=2G
bsrange=2k-16k

refill_buffers=1

[randomwrite]
stonewall
rw=randwrite

[sequentialwrite]
stonewall
rw=write

[randomread]
stonewall
rw=randread

[sequentialread]
stonewall
rw=read

Remove bsrange if you want 4k blocks only. I put the writes above the
reads because the writes with refill_buffers=1 initialize the test file
with random data; otherwise it would contain only zeros, which modern
SandForce controllers compress very nicely. The above job is untested,
but it should do.
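
To run it, save the job description to a file and pass it to fio. The
file name ssd-test.fio below is just a placeholder:

fio ssd-test.fio

The randomread results are probably the closest match to what a cold-cache
du does, since that is mostly small random reads.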

Please remove the test file and fstrim your disk after you have run any
write tests, unless you have the discard mount option enabled. (Not
necessarily after each single test ;)
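
For example (assuming the filesystem with the test file is mounted at
/mnt/btrfs; adjust the path to your setup):

rm testfile
fstrim -v /mnt/btrfs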


I am also interested in the

merkaba:~> hdparm -I /dev/sda | grep -i queue
        Queue depth: 32
           *    Native Command Queueing (NCQ)

output for your SSD.

Try to load the SSD with different iodepths, up to about twice the queue
depth reported by hdparm.
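
With the job file above that just means changing the iodepth line in the
[global] section between runs, for example:

; run fio once per queue depth, e.g. 1, 8, 16, 32, 64
iodepth=16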

But note: a du -sch will not reach such a high iodepth. I expect it to
drop to an iodepth of about one in that case. You can see in my comparison
output above that a single du -sch is not able to load the Intel SSD 320
fully: utilization is only about 50-70%, and iowait only about 10-20%.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7

