Reiterate: btrfs stuck with lots of files

Hi guys, again. Looking at this issue, I suspect this is a bug in btrfs.
We'll have to clean up this installation soon, so if there is any
request to do some debugging, please ask. I'll try to reiterate what
was said in this thread.

Short story: a btrfs filesystem made of 22 1 TB disks with lots of files
(~30,240,000). The write load is 25 MB/s. After some time the file system
became unable to cope with this load. At the same time `sync` takes ages
to finish and `shutdown -r` hangs (I guess related to sync).
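
If it helps, while sync hangs I could also watch how much dirty data is
queued up; a simple sketch (the 2-second interval is just an arbitrary
choice) would be:

  $ watch -n 2 'grep -E "^(Dirty|Writeback):" /proc/meminfo'
  $ time sync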

I also see one kernel kworker thread that is the main suspect for this
behavior: it constantly takes 100% of a CPU core, jumping from core to
core. At the same time, according to iostat, read/write throughput is
close to zero and everything is stuck.
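
If it would be useful, I can try to sample what that kworker is spending
its time on; a rough sketch (assuming its PID is still 8644, as in the top
output below, and that perf is installed on this box) would be:

  # for i in $(seq 10); do cat /proc/8644/stack; echo ----; sleep 1; done
  # perf record -g -p 8644 -- sleep 10 && perf report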

Citing some details from previous messages:

> > top - 13:10:58 up 1 day,  9:26,  5 users,  load average: 157.76, 156.61, 149.29
> > Tasks: 235 total,   2 running, 233 sleeping,   0 stopped,   0 zombie
> > %Cpu(s): 19.8 us, 15.0 sy,  0.0 ni, 60.7 id,  3.9 wa,  0.0 hi,  0.6 si, 0.0 st
> > KiB Mem:  65922104 total, 65414856 used,   507248 free,     1844 buffers
> > KiB Swap:        0 total,        0 used,        0 free. 62570804 cached Mem
> >
> >    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+
> > COMMAND
> >   8644 root      20   0       0      0      0 R  96.5  0.0 127:21.95 kworker/u16:16
> >   5047 dvr       20   0 6884292 122668   4132 S   6.4  0.2 258:59.49 dvrserver
> > 30223 root      20   0   20140   2600   2132 R   6.4  0.0   0:00.01 top
> >      1 root      20   0    4276   1628   1524 S   0.0  0.0   0:40.19 init
> >
> > There are about 300 threads on the server, some of which are writing to disk.
> > A bit of information about this btrfs filesystem: it is a 22-disk file
> > system with raid1 for metadata and raid0 for data:
> >
> >   # btrfs filesystem df /store/
> > Data, single: total=11.92TiB, used=10.86TiB
> > System, RAID1: total=8.00MiB, used=1.27MiB
> > System, single: total=4.00MiB, used=0.00B
> > Metadata, RAID1: total=46.00GiB, used=33.49GiB
> > Metadata, single: total=8.00MiB, used=0.00B
> > GlobalReserve, single: total=512.00MiB, used=128.00KiB
> >   # btrfs property get /store/
> > ro=false
> > label=store
> >   # btrfs device stats /store/
> > (shows all zeros)
> >   # btrfs balance status /store/
> > No balance found on '/store/'

 # btrfs filesystem show
Label: 'store'  uuid: 296404d1-bd3f-417d-8501-02f8d7906bcf
	Total devices 22 FS bytes used 6.50TiB
	devid    1 size 931.51GiB used 558.02GiB path /dev/sdb
	devid    2 size 931.51GiB used 559.00GiB path /dev/sdc
	devid    3 size 931.51GiB used 559.00GiB path /dev/sdd
	devid    4 size 931.51GiB used 559.00GiB path /dev/sde
	devid    5 size 931.51GiB used 559.00GiB path /dev/sdf
	devid    6 size 931.51GiB used 559.00GiB path /dev/sdg
	devid    7 size 931.51GiB used 559.00GiB path /dev/sdh
	devid    8 size 931.51GiB used 559.00GiB path /dev/sdi
	devid    9 size 931.51GiB used 559.00GiB path /dev/sdj
	devid   10 size 931.51GiB used 559.00GiB path /dev/sdk
	devid   11 size 931.51GiB used 559.00GiB path /dev/sdl
	devid   12 size 931.51GiB used 559.00GiB path /dev/sdm
	devid   13 size 931.51GiB used 559.00GiB path /dev/sdn
	devid   14 size 931.51GiB used 559.00GiB path /dev/sdo
	devid   15 size 931.51GiB used 559.00GiB path /dev/sdp
	devid   16 size 931.51GiB used 559.00GiB path /dev/sdq
	devid   17 size 931.51GiB used 559.00GiB path /dev/sdr
	devid   18 size 931.51GiB used 559.00GiB path /dev/sds
	devid   19 size 931.51GiB used 559.00GiB path /dev/sdt
	devid   20 size 931.51GiB used 559.00GiB path /dev/sdu
	devid   21 size 931.51GiB used 559.01GiB path /dev/sdv
	devid   22 size 931.51GiB used 560.01GiB path /dev/sdw

Btrfs v3.17.1

> > iostat 1 exposes following problem:
> >
> > avg-cpu:  %user   %nice %system %iowait  %steal   %idle
> >            16.96    0.00   17.09   65.95    0.00    0.00
> >
> > Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
> > sda               0.00         0.00         0.00          0          0
> > sdc               0.00         0.00         0.00          0          0
> > sdb               0.00         0.00         0.00          0          0
> > sde               0.00         0.00         0.00          0          0
> > sdd               0.00         0.00         0.00          0          0
> > sdf               0.00         0.00         0.00          0          0
> > sdg               0.00         0.00         0.00          0          0
> > sdj               0.00         0.00         0.00          0          0
> > sdh               0.00         0.00         0.00          0          0
> > sdk               0.00         0.00         0.00          0          0
> > sdi               1.00         0.00       200.00          0        200
> > sdl               0.00         0.00         0.00          0          0
> > sdn              48.00         0.00     17260.00          0      17260
> > sdm               0.00         0.00         0.00          0          0
> > sdp               0.00         0.00         0.00          0          0
> > sdo               0.00         0.00         0.00          0          0
> > sdq               0.00         0.00         0.00          0          0
> > sdr               0.00         0.00         0.00          0          0
> > sds               0.00         0.00         0.00          0          0
> > sdt               0.00         0.00         0.00          0          0
> > sdv               0.00         0.00         0.00          0          0
> > sdw               0.00         0.00         0.00          0          0
> > sdu               0.00         0.00         0.00          0          0

At that time I saw this kind of load profile. The write load moved from disk
to disk over time, so I do not suspect a broken disk. Currently the write
profile is different:
https://drive.google.com/file/d/0BygFL6N3ZVUAVmxaZ1Q5VTZpSGc/view?usp=sharing
Sometimes it looks like the above, sometimes all zeros; most of the time the
load is very low.

> > Writes go to one disk. I've tried to debug what is going on in the
> > kworker and did
> >
> > $ echo workqueue:workqueue_queue_work
> >> /sys/kernel/debug/tracing/set_event
> > $ cat /sys/kernel/debug/tracing/trace_pipe > trace_pipe.out2

I've put the result here:
https://drive.google.com/file/d/0BygFL6N3ZVUAMWxCQ0tDREE1Uzg/view?usp=sharing
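
To get a rough idea which work functions dominate there, a quick-and-dirty
summary over that file (assuming the usual workqueue_queue_work trace format
with a function= field) would be:

  $ grep -o 'function=[^ ]*' trace_pipe.out2 | sort | uniq -c | sort -rn | head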

> > Server has 64 GB of RAM.
The kernel is 3.16.7-gentoo.

--
Peter.

