Re: [RFC PATCH 0/6] Understanding delays due to throttling under very heavy write load
On 02/03/2012 05:03 PM, Yehuda Sadeh Weinraub wrote:
On Fri, Feb 3, 2012 at 3:33 PM, Jim Schutt<jaschut@xxxxxxxxxx> wrote:
You can try running 'iostat -t -kx -d 1' on the OSDs, and see whether %util
reaches 100%, and when it does, whether that is due to a large number of I/O
operations thrashing the disk, or to a high volume of data.
FWIW, you may try setting 'filestore flusher = false', and setting
'/proc/sys/vm/dirty_background_ratio' to a small number (e.g., 1M).
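As a rough illustration of the "number of operations vs. amount of data" question, here is a minimal Python sketch that works from three iostat columns. It relies on avgrq-sz being reported in 512-byte sectors, so w/s * avgrq-sz * 512 / 2^20 should reproduce wMB/s when there are no reads. The function names and both thresholds (95% util, 64 KiB requests) are my own assumptions for illustration, not anything iostat or Ceph defines.

```python
SECTOR_BYTES = 512  # iostat reports avgrq-sz in 512-byte sectors


def write_mb_per_s(w_per_s, avgrq_sz_sectors):
    """Write throughput (MiB/s) implied by request rate and mean request size.

    Only valid as a write figure when r/s is zero, because avgrq-sz
    averages over reads and writes together.
    """
    return w_per_s * avgrq_sz_sectors * SECTOR_BYTES / 2**20


def classify(util_pct, avgrq_sz_sectors,
             busy_util=95.0, small_request_kib=64.0):
    """Label one device sample.

    busy_util and small_request_kib are illustrative thresholds
    (assumptions); tune them for the hardware in question.
    """
    if util_pct < busy_util:
        return "not saturated"
    request_kib = avgrq_sz_sectors * SECTOR_BYTES / 1024
    if request_kib < small_request_kib:
        return "saturated by many small requests"
    return "saturated by data volume"


if __name__ == "__main__":
    # sdc at 09:14:13 in the data below: w/s=206.00, avgrq-sz=1009.79, %util=100.10
    print(round(write_mb_per_s(206.00, 1009.79), 2))
    print(classify(100.10, 1009.79))
```

With requests averaging around 1000 sectors (roughly 500 KiB), a device at 100% util is being kept busy by data volume rather than by a flood of small operations.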
Here's some iostat data from early in a run, when things are
running well:
02/02/2012 09:14:13 AM
avg-cpu: %user %nice %system %iowait %steal %idle
23.24 0.00 61.99 7.38 0.00 7.38
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 0.00 0.00 206.00 0.00 101.57 1009.79 54.80 251.27 4.86 100.10
sdd 0.00 0.00 0.00 202.00 0.00 98.10 994.61 27.85 132.42 4.96 100.10
sde 0.00 4.00 0.00 212.00 0.00 105.09 1015.25 96.06 588.43 4.72 100.10
sdh 0.00 0.00 0.00 200.00 0.00 97.11 994.40 69.77 535.01 5.00 100.10
sdg 0.00 2.00 0.00 221.00 0.00 109.59 1015.60 82.05 298.71 4.53 100.10
sda 0.00 1.00 0.00 212.00 0.00 83.93 810.75 18.26 84.82 4.68 99.30
sdf 0.00 0.00 0.00 208.00 0.00 102.55 1009.73 77.23 383.19 4.50 93.70
sdb 0.00 0.00 0.00 205.00 0.00 98.66 985.68 19.97 133.98 4.84 99.20
sdj 0.00 0.00 0.00 202.00 0.00 99.59 1009.66 69.97 257.47 4.95 100.00
sdk 0.00 0.00 0.00 204.00 0.00 98.10 984.86 20.83 100.34 4.87 99.30
sdm 0.00 0.00 0.00 216.00 0.00 106.55 1010.22 77.73 268.67 4.63 100.00
sdn 0.00 0.00 0.00 205.00 0.00 98.60 985.05 19.33 95.88 4.81 98.60
sdo 0.00 0.00 0.00 232.00 0.00 106.25 937.93 23.26 82.19 4.29 99.50
sdl 0.00 0.00 0.00 181.00 0.00 85.12 963.09 24.73 131.71 4.80 86.80
sdp 0.00 4.00 0.00 207.00 0.00 87.41 864.77 37.01 111.13 4.49 93.00
sdi 0.00 0.00 0.00 208.00 0.00 103.04 1014.54 72.30 263.72 4.70 97.70
sdr 0.00 0.00 0.00 191.00 0.00 76.75 822.95 11.51 83.69 4.59 87.60
sds 0.00 0.00 0.00 209.00 0.00 101.91 998.58 49.95 278.08 4.70 98.20
sdt 0.00 0.00 0.00 209.00 0.00 99.57 975.69 27.31 157.44 4.79 100.10
sdu 0.00 0.00 0.00 216.00 0.00 107.09 1015.41 79.82 345.88 4.63 100.10
sdw 0.00 0.00 0.00 208.00 0.00 103.09 1015.00 74.55 308.15 4.81 100.10
sdv 0.00 0.00 0.00 201.00 0.00 98.05 999.08 76.87 265.88 4.98 100.10
sdx 0.00 0.00 0.00 202.00 0.00 100.50 1018.93 110.40 327.68 4.96 100.10
sdq 0.00 0.00 0.00 228.00 0.00 112.59 1011.30 54.84 281.04 4.39 100.10
02/02/2012 09:14:14 AM
avg-cpu: %user %nice %system %iowait %steal %idle
22.11 0.00 54.03 15.38 0.00 8.48
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 0.00 0.00 233.00 0.00 99.68 876.15 95.98 384.42 4.29 100.00
sdd 0.00 0.00 0.00 205.00 0.00 96.64 965.46 20.37 108.51 4.84 99.30
sde 0.00 0.00 0.00 225.00 0.00 99.54 906.03 92.38 420.67 4.44 100.00
sdh 0.00 0.00 0.00 198.00 0.00 97.05 1003.84 79.39 410.56 5.05 100.00
sdg 0.00 0.00 0.00 245.00 0.00 108.38 905.99 84.40 385.47 4.08 100.00
sda 0.00 4.00 0.00 220.00 0.00 96.23 895.78 63.24 294.59 4.44 97.60
sdf 0.00 0.00 0.00 216.00 0.00 107.09 1015.41 87.67 399.14 4.57 98.80
sdb 0.00 0.00 0.00 156.00 0.00 72.05 945.95 11.61 58.94 4.84 75.50
sdj 0.00 0.00 0.00 199.00 0.00 95.41 981.95 56.28 366.11 4.84 96.40
sdk 0.00 0.00 0.00 206.00 0.00 100.14 995.57 54.69 241.41 4.86 100.10
sdm 0.00 0.00 0.00 200.00 0.00 99.09 1014.72 79.51 506.47 4.74 94.70
sdn 0.00 0.00 0.00 191.00 0.00 91.29 978.81 26.82 128.39 5.18 98.90
sdo 0.00 0.00 0.00 234.00 0.00 106.75 934.32 49.82 231.07 4.27 100.00
sdl 0.00 0.00 0.00 214.00 0.00 103.62 991.70 33.03 168.13 4.62 98.80
sdp 0.00 0.00 0.00 219.00 0.00 106.08 992.00 64.69 328.92 4.57 100.00
sdi 0.00 0.00 0.00 210.00 0.00 104.09 1015.09 100.98 421.01 4.76 100.00
sdr 0.00 0.00 0.00 180.00 0.00 81.66 929.07 10.31 63.59 5.12 92.20
sds 0.00 0.00 0.00 201.00 0.00 95.15 969.47 32.60 144.16 4.98 100.00
sdt 0.00 0.00 0.00 198.00 0.00 95.72 990.10 33.26 155.98 4.84 95.90
sdu 0.00 0.00 0.00 219.00 0.00 108.59 1015.53 66.10 347.91 4.57 100.00
sdw 0.00 0.00 0.00 204.00 0.00 100.75 1011.41 81.20 456.47 4.80 98.00
sdv 0.00 0.00 0.00 197.00 0.00 96.09 998.90 44.19 284.65 5.08 100.00
sdx 0.00 0.00 0.00 211.00 0.00 104.19 1011.26 84.87 542.85 4.69 99.00
sdq 0.00 0.00 0.00 216.00 0.00 105.10 996.52 36.63 134.40 4.63 100.00
This is later in the same run, when things are not going as well:
02/02/2012 09:21:52 AM
avg-cpu: %user %nice %system %iowait %steal %idle
5.13 0.00 13.31 8.52 0.00 73.04
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 0.00 0.00 36.00 0.00 16.02 911.11 1.43 39.72 5.64 20.30
sdd 0.00 0.00 0.00 18.00 0.00 8.01 911.11 0.85 47.28 6.39 11.50
sde 0.00 0.00 0.00 4.00 0.00 0.01 6.00 0.08 20.00 13.00 5.20
sdh 0.00 0.00 0.00 20.00 0.00 8.01 820.40 0.65 32.40 5.30 10.60
sdg 0.00 0.00 0.00 19.00 0.00 8.01 863.58 0.60 31.63 4.63 8.80
sda 0.00 0.00 0.00 82.00 0.00 36.04 900.10 3.13 37.05 5.15 42.20
sdf 0.00 0.00 0.00 18.00 0.00 8.01 911.11 0.80 44.22 6.39 11.50
sdb 0.00 8.00 0.00 42.00 0.00 1.75 85.52 0.14 3.43 1.40 5.90
sdj 0.00 16.00 0.00 103.00 0.00 25.64 509.83 2.21 21.36 3.65 37.60
sdk 0.00 14.00 0.00 152.00 0.00 47.93 645.79 3.96 27.31 4.12 62.60
sdm 0.00 0.00 0.00 21.00 0.00 9.39 915.81 0.94 44.57 5.71 12.00
sdn 0.00 34.00 0.00 197.00 0.00 64.61 671.72 28.66 85.62 4.02 79.10
sdo 0.00 0.00 0.00 92.00 0.00 42.54 946.87 6.22 55.58 4.85 44.60
sdl 0.00 0.00 0.00 6.00 0.00 2.01 685.33 0.09 59.67 6.33 3.80
sdp 0.00 10.00 0.00 58.00 0.00 9.56 337.52 1.20 20.60 3.05 17.70
sdi 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdr 0.00 0.00 0.00 37.00 0.00 16.02 886.92 1.19 32.27 5.11 18.90
sds 0.00 18.00 0.00 115.00 0.00 26.54 472.70 4.03 25.94 3.20 36.80
sdt 0.00 0.00 0.00 131.00 0.00 60.05 938.87 6.13 46.33 5.11 67.00
sdu 0.00 12.00 0.00 119.00 0.00 31.40 540.44 2.93 24.65 3.05 36.30
sdw 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdv 0.00 4.00 0.00 63.00 0.00 9.46 307.68 0.83 14.32 2.38 15.00
sdx 0.00 0.00 0.00 35.00 0.00 15.51 907.66 0.79 28.20 4.89 17.10
sdq 0.00 0.00 0.00 37.00 0.00 16.02 886.70 1.52 41.00 5.86 21.70
02/02/2012 09:21:53 AM
avg-cpu: %user %nice %system %iowait %steal %idle
3.74 0.00 8.75 6.60 0.00 80.90
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdh 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdg 0.00 0.00 0.00 18.00 0.00 8.01 911.11 0.88 48.94 6.83 12.30
sda 0.00 0.00 0.00 45.00 0.00 7.38 335.64 0.54 18.87 1.78 8.00
sdf 0.00 0.00 0.00 18.00 0.00 8.01 911.11 0.93 51.44 6.78 12.20
sdb 0.00 0.00 0.00 5.00 0.00 0.74 302.40 0.05 10.20 8.20 4.10
sdj 0.00 0.00 0.00 72.00 0.00 32.03 911.11 2.51 34.99 5.01 36.10
sdk 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdm 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdn 0.00 0.00 0.00 123.00 0.00 52.60 875.84 13.83 209.72 4.84 59.50
sdo 0.00 0.00 0.00 13.00 0.00 5.52 868.92 0.30 108.31 4.69 6.10
sdl 0.00 0.00 0.00 27.00 0.00 12.47 945.78 1.33 47.15 6.59 17.80
sdp 0.00 0.00 0.00 11.00 0.00 4.50 838.55 0.51 14.09 5.09 5.60
sdi 0.00 0.00 0.00 19.00 0.00 8.01 863.58 0.72 38.05 5.74 10.90
sdr 0.00 0.00 0.00 18.00 0.00 8.01 911.11 0.69 38.33 5.89 10.60
sds 0.00 0.00 0.00 56.00 0.00 19.66 718.86 1.31 39.16 5.11 28.60
sdt 0.00 0.00 0.00 161.00 0.00 72.57 923.18 6.97 37.39 5.07 81.70
sdu 0.00 0.00 0.00 66.00 0.00 30.02 931.64 2.77 39.85 5.09 33.60
sdw 0.00 0.00 0.00 20.00 0.00 8.51 871.60 1.47 27.80 4.85 9.70
sdv 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdx 0.00 0.00 0.00 36.00 0.00 16.02 911.11 1.37 38.08 5.72 20.60
sdq 0.00 0.00 0.00 44.00 0.00 19.46 906.00 1.15 26.02 4.50 19.80
And finally, this is still later, near the end of the run, when things have recovered
somewhat:
02/02/2012 09:22:34 AM
avg-cpu: %user %nice %system %iowait %steal %idle
15.25 0.00 52.27 20.88 0.00 11.60
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 1.00 0.00 217.00 0.00 95.20 898.51 84.43 413.56 4.60 99.90
sdd 0.00 0.00 0.00 40.00 0.00 16.86 863.00 1.59 28.45 5.55 22.20
sde 0.00 0.00 0.00 206.00 0.00 99.27 986.95 89.64 452.92 4.85 99.90
sdh 0.00 0.00 0.00 51.00 0.00 22.53 904.63 2.02 35.45 5.47 27.90
sdg 0.00 0.00 0.00 230.00 0.00 112.49 1001.63 92.87 283.01 4.33 99.60
sda 0.00 0.00 0.00 215.00 0.00 106.10 1010.68 94.45 253.40 4.65 99.90
sdf 0.00 0.00 0.00 73.00 0.00 32.04 898.74 2.20 30.08 5.11 37.30
sdb 0.00 0.00 0.00 92.00 0.00 40.05 891.48 2.55 27.70 4.85 44.60
sdj 0.00 44.00 0.00 280.00 0.00 91.61 670.03 109.32 314.59 3.57 99.90
sdk 0.00 1.00 0.00 210.00 0.00 100.63 981.41 97.79 419.98 4.76 99.90
sdm 0.00 42.00 0.00 282.00 0.00 100.27 728.23 92.86 285.38 3.54 99.90
sdn 0.00 0.00 0.00 213.00 0.00 100.81 969.31 41.62 301.33 4.67 99.40
sdo 0.00 39.00 0.00 306.00 0.00 102.84 688.29 82.44 279.69 3.26 99.70
sdl 0.00 0.00 0.00 219.00 0.00 104.16 974.06 83.05 421.80 4.56 99.90
sdp 0.00 46.00 0.00 277.00 0.00 97.01 717.23 106.44 324.31 3.61 99.90
sdi 0.00 0.00 0.00 56.00 0.00 24.03 878.86 1.73 30.91 5.05 28.30
sdr 0.00 34.00 0.00 266.00 0.00 97.66 751.91 63.86 304.39 3.76 100.00
sds 0.00 18.00 0.00 67.00 0.00 17.41 532.18 1.68 25.03 3.79 25.40
sdt 0.00 0.00 0.00 130.00 0.00 64.01 1008.37 56.33 166.52 4.99 64.90
sdu 0.00 0.00 0.00 197.00 0.00 95.02 987.82 44.70 282.45 4.95 97.60
sdw 0.00 0.00 0.00 207.00 0.00 93.39 923.98 90.21 448.08 4.83 99.90
sdv 0.00 0.00 0.00 204.00 0.00 100.52 1009.14 84.16 425.70 4.85 98.90
sdx 0.00 0.00 0.00 203.00 0.00 88.75 895.33 87.10 475.92 4.92 99.90
sdq 0.00 0.00 0.00 18.00 0.00 8.01 911.11 0.52 28.83 4.83 8.70
02/02/2012 09:22:35 AM
avg-cpu: %user %nice %system %iowait %steal %idle
14.63 0.00 50.99 22.22 0.00 12.16
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 0.00 0.00 209.00 0.00 99.54 975.35 84.02 409.76 4.78 99.90
sdd 0.00 0.00 0.00 13.00 0.00 5.50 867.08 0.34 57.31 6.23 8.10
sde 0.00 0.00 0.00 204.00 0.00 98.12 985.06 87.28 418.62 4.88 99.50
sdh 0.00 0.00 0.00 78.00 0.00 34.12 895.79 2.15 30.26 5.37 41.90
sdg 0.00 0.00 0.00 226.00 0.00 108.48 983.04 93.54 336.46 4.42 99.80
sda 0.00 0.00 0.00 219.00 0.00 108.07 1010.63 80.90 510.96 4.53 99.20
sdf 0.00 6.00 0.00 81.00 0.00 21.20 535.90 1.99 24.47 3.59 29.10
sdb 0.00 0.00 0.00 71.00 0.00 32.03 923.94 2.46 34.63 4.65 33.00
sdj 0.00 0.00 0.00 192.00 0.00 83.87 894.62 83.33 459.53 5.21 100.10
sdk 0.00 41.00 0.00 285.00 0.00 94.12 676.32 104.34 310.17 3.51 100.10
sdm 0.00 0.00 0.00 202.00 0.00 90.44 916.91 86.45 506.52 4.96 100.10
sdn 0.00 0.00 0.00 208.00 0.00 101.48 999.23 87.79 323.35 4.79 99.70
sdo 0.00 1.00 0.00 228.00 0.00 108.63 975.75 89.79 327.24 4.38 99.80
sdl 0.00 28.00 0.00 270.00 0.00 97.64 740.65 52.06 281.67 3.54 95.60
sdp 0.00 0.00 0.00 195.00 0.00 85.65 899.57 92.28 453.54 5.14 100.20
sdi 0.00 14.00 0.00 31.00 0.00 9.02 595.61 0.96 30.94 4.77 14.80
sdr 0.00 0.00 0.00 192.00 0.00 83.11 886.46 14.22 142.39 5.06 97.10
sds 0.00 0.00 0.00 18.00 0.00 8.01 911.11 0.73 40.39 5.89 10.60
sdt 0.00 0.00 0.00 201.00 0.00 98.66 1005.29 65.87 425.37 4.89 98.30
sdu 0.00 0.00 0.00 209.00 0.00 103.01 1009.38 87.49 285.51 4.74 99.10
sdw 0.00 0.00 0.00 204.00 0.00 96.74 971.22 82.66 410.50 4.89 99.70
sdv 0.00 0.00 0.00 198.00 0.00 96.61 999.23 83.39 420.17 5.03 99.50
sdx 0.00 0.00 0.00 204.00 0.00 98.79 991.80 86.54 428.67 4.90 100.00
sdq 0.00 0.00 0.00 36.00 0.00 16.02 911.11 0.88 24.33 4.44 16.00
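The above suggests to me that the slowdown is a result
of requests not getting submitted to the disks at the
same rate as when things are running well: during the
slow period %util and avgqu-sz drop sharply on nearly
every device, while svctm stays near 5 ms on most of
the devices that are still doing I/O.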
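One way to put numbers on that comparison is to summarize each iostat sample instead of eyeballing it. The following is only a sketch: the parser assumes the exact column layout shown above, the helper names are hypothetical, and for brevity it is fed just three devices (sdc, sdd, sde) from the 09:14:13 and 09:21:52 samples rather than all 24.

```python
from statistics import mean

# Column layout of the extended device report shown above.
COLUMNS = ("rrqm/s wrqm/s r/s w/s rMB/s wMB/s "
           "avgrq-sz avgqu-sz await svctm %util").split()


def parse_rows(text):
    """Turn 'sdX v1 v2 ...' lines into {device: {column: float}}."""
    rows = {}
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != len(COLUMNS) + 1:
            continue  # timestamps, avg-cpu lines, blank lines
        try:
            values = [float(v) for v in fields[1:]]
        except ValueError:
            continue  # the 'Device:' header line
        rows[fields[0]] = dict(zip(COLUMNS, values))
    return rows


def summarize(rows):
    """Aggregate one sample: total write rate, mean utilization, mean queue depth."""
    return {
        "total_wMB/s": sum(r["wMB/s"] for r in rows.values()),
        "mean_%util": mean(r["%util"] for r in rows.values()),
        "mean_avgqu-sz": mean(r["avgqu-sz"] for r in rows.values()),
    }


# Three devices copied from the 09:14:13 sample (running well).
GOOD = """
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 0.00 0.00 206.00 0.00 101.57 1009.79 54.80 251.27 4.86 100.10
sdd 0.00 0.00 0.00 202.00 0.00 98.10 994.61 27.85 132.42 4.96 100.10
sde 0.00 4.00 0.00 212.00 0.00 105.09 1015.25 96.06 588.43 4.72 100.10
"""

# The same three devices from the 09:21:52 sample (running poorly).
SLOW = """
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 0.00 0.00 36.00 0.00 16.02 911.11 1.43 39.72 5.64 20.30
sdd 0.00 0.00 0.00 18.00 0.00 8.01 911.11 0.85 47.28 6.39 11.50
sde 0.00 0.00 0.00 4.00 0.00 0.01 6.00 0.08 20.00 13.00 5.20
"""

if __name__ == "__main__":
    for label, sample in (("good", GOOD), ("slow", SLOW)):
        print(label, summarize(parse_rows(sample)))
```

A low mean queue depth together with low utilization is what I would expect if the disks are waiting for work rather than struggling to complete it.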
-- Jim
Yehuda