Am Samstag, 25. Februar 2012 schrieb Arne Jansen:
> On 02/24/12 16:51, Martin Steigerwald wrote:
> > Am Samstag, 21. Januar 2012 schrieb Martin Steigerwald:
> >> Am Samstag, 21. Januar 2012 schrieb Martin Steigerwald:
> >>> I still have this with 3.2.0-1-pae - which is a debian kernel based
> >>> on 3.2.1.
> >>>
> >>> When I do btrfs scrub start / the machine locks immediately up
> >>> hard.
> >>>
> >>> Then usually on next boot it stops on space_cache enabled message,
> >>> but not the one for /, but the one for /home which is mounted
> >>> later.
> >>>
> >>> When I then boot with 3.1 it works. BTRFS redos the space_cache
> >>> then while the machine takes ages to boot - I mean ages - 10
> >>> minutes till KDM prompt is no problem there.
> >>
> >> I now tested scrubbing /home which is a different BTRFS filesystem
> >> on the same machine.
> >>
> >> Then the scrub is started, scrub status tells me so, but nothing
> >> happens, no block in/out activity in vmstat, no CPU related activity
> >> in top.
> >>
> >> btrfs scrub cancel then hangs, but not the complete machine, only
> >> the process.
> >>
> >> I had this once on my T520 with the internal Intel SSD 320 as well.
> >> The other time it worked.
> >>
> >> Well maybe that is due to BTRFS doing something else on my T23 now:
> >>
> >> deepdance:~> ps aux | grep ino-cache | grep -v grep
> >> root 1992 5.5 0.0 0 0 ? D 12:15 0:09
> >> [btrfs- ino-cache]
> >>
> >> Hmmm, so I just let it sit for a while, maybe eventually it will
> >> scrub /home.
> >>
> >> At least it doesn´t lock up hard, so there might really be something
> >> strange with /.
> >
> > FWIW a btrfs filesystem balance / does work. After this a btrfs scrub
> > start / still locks the kernel.
>
> Hi Martin,
>
> I just sent 2 patches to the list. Could you please test if these
> fix your problem with scrub?
I didn´t yet test it but I tried the first balance then scrub stuff again:
deepdance:~> btrfs filesystem balance /
This time scrub didn´t lock up hard:
deepdance:~> btrfs scrub start /
scrub started on /, fsid 2bf5b1dc-1d89-4f0d-a561-1a5551a27275 (pid=3347)
deepdance:~> btrfs scrub status /
scrub status for 2bf5b1dc-1d89-4f0d-a561-1a5551a27275
scrub started at Thu Mar 15 18:10:52 2012, running for 10 seconds
total bytes scrubbed: 272.59MB with 0 errors
deepdance:~> cat /proc/version
Linux version 3.2.0-1-686-pae (Debian 3.2.6-1) (ben@xxxxxxxxxxxxxxx)
(gcc version 4.6.2 (Debian 4.6.2-14) ) #1 SMP Fri Feb 17 06:27:21 UTC 2012
deepdance:~> btrfs scrub status /
scrub status for 2bf5b1dc-1d89-4f0d-a561-1a5551a27275
scrub started at Thu Mar 15 18:10:52 2012, running for 55 seconds
total bytes scrubbed: 1.29GB with 0 errors
deepdance:~> btrfs scrub status /
scrub status for 2bf5b1dc-1d89-4f0d-a561-1a5551a27275
scrub started at Thu Mar 15 18:10:52 2012, running for 175 seconds
total bytes scrubbed: 2.58GB with 0 errors
But it seems to be stuck now:
deepdance:~> btrfs scrub status /
scrub status for 2bf5b1dc-1d89-4f0d-a561-1a5551a27275
scrub started at Thu Mar 15 18:10:52 2012, running for 295 seconds
total bytes scrubbed: 2.58GB with 0 errors
deepdance:~> btrfs scrub status /
scrub status for 2bf5b1dc-1d89-4f0d-a561-1a5551a27275
scrub started at Thu Mar 15 18:10:52 2012, running for 395 seconds
total bytes scrubbed: 2.58GB with 0 errors
deepdance:~> btrfs scrub status /
scrub status for 2bf5b1dc-1d89-4f0d-a561-1a5551a27275
scrub started at Thu Mar 15 18:10:52 2012, running for 636 seconds
total bytes scrubbed: 2.58GB with 0 errors
There is almost no activity in vmstat 5:
2 0 108812 63812 36 412716 0 22 0 437 515 1723 47 13 37 3
2 0 108812 63320 36 413168 0 0 81 0 453 1600 44 12 43 1
2 0 108812 68056 36 413512 0 0 64 0 480 1589 51 11 38 0
2 0 108812 68112 36 413516 0 0 0 0 460 1454 46 11 43 0
1 0 108812 68112 36 413536 0 0 0 1 454 1456 45 11 44 0
1 1 108812 67924 36 413692 0 0 0 330 491 1578 49 13 37 1
2 0 108812 67808 36 413792 0 0 0 6 473 1556 48 11 41 0
2 0 108812 67436 36 414124 0 0 0 424 523 1922 47 13 39 2
1 0 108812 67460 36 414136 0 0 0 0 453 1578 46 9 45 0
2 0 108812 67476 36 414136 0 0 0 0 465 1493 46 12 41 0
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
3 0 108812 67492 36 414140 0 0 0 0 467 1609 48 10 42 0
2 0 108812 67500 36 414148 0 0 0 0 470 1686 46 12 41 0
1 0 108812 66872 36 414152 0 0 0 1 463 1585 46 11 43 0
1 0 108812 67492 36 414156 0 0 0 1 483 1612 48 13 39 0
2 0 108812 67428 36 414156 0 0 0 325 497 1654 48 9 41 1
0 0 108812 67452 36 414160 0 0 0 3 478 1588 46 15 39 0
2 0 108812 65840 36 415228 0 0 214 0 465 1408 54 10 36 0
0 0 108812 66124 36 415352 0 0 24 0 510 1476 62 14 22 2
Given the general slowness of the machine, I waited about 15 seconds on a
tail -f /var/log/syslog
while just a Konsole and Amarok was running and wait about that time or
even longer when logging via SSH at times, I think next I try newest
Debian kernel which should be up to 3.2.10.
If that doesn´t work, I consider trying your patches.
At least its a step forward: The machine doesn´t lock up hard anymore. Its
still playing music.
FWIW a cancel hangs:
deepdance:~> btrfs scrub cancel /
martin@deepdance:~> ps aux | grep btrfs | grep cancel
root 3429 0.0 0.0 2128 304 pts/5 D+ 18:24 0:00 btrfs scrub cancel /
Ah, now there we go, from dmesg:
16453.391004] btrfs: relocating block group 41393586176 flags 1
[16568.504524] btrfs: found 9891 extents
[16660.810000] btrfs: found 9891 extents
[16664.818271] btrfs: relocating block group 40319844352 flags 1
[16729.029078] btrfs: found 3593 extents
[16755.771257] btrfs: found 3593 extents
[16757.816472] btrfs: relocating block group 39246102528 flags 1
[16843.042862] btrfs: found 13147 extents
[16898.078594] btrfs: found 13147 extents
[16905.444650] btrfs: relocating block group 39237713920 flags 34
[16910.579963] btrfs: found 1 extents
[16915.946553] btrfs: relocating block group 39103496192 flags 36
[16997.308374] btrfs: found 14552 extents
[17002.239868] btrfs: relocating block group 38969278464 flags 36
[17165.253567] btrfs: found 22792 extents
[17166.522513] btrfs: relocating block group 38835060736 flags 36
[17376.519168] btrfs: found 22857 extents
[17377.526508] btrfs: relocating block group 38700843008 flags 36
[17559.413243] btrfs: found 24810 extents
[17563.767590] btrfs: relocating block group 37627101184 flags 1
[17646.272226] btrfs: found 10900 extents
[17694.323897] btrfs: found 10900 extents
[17697.171187] btrfs: relocating block group 36553359360 flags 1
[17779.364231] btrfs: found 12149 extents
[17828.395574] btrfs: found 12149 extents
[17831.593055] btrfs: relocating block group 35479617536 flags 1
[17923.352624] btrfs: found 22174 extents
[18014.147595] btrfs: found 22174 extents
[18015.821756] btrfs: relocating block group 35345399808 flags 36
[18055.201043] btrfs: found 9166 extents
[18061.692337] btrfs: relocating block group 34271657984 flags 1
[18173.986306] btrfs: found 29124 extents
[18328.733328] btrfs: found 29124 extents
[18333.141061] btrfs: relocating block group 33197916160 flags 1
[18427.523898] btrfs: found 26881 extents
[18587.015489] btrfs: found 26880 extents
[18590.183797] btrfs: relocating block group 33063698432 flags 36
[18590.183797] btrfs: relocating block group 33063698432 flags 36
[18624.883576] btrfs: found 6522 extents
[18628.618415] btrfs: relocating block group 32929480704 flags 36
[18670.147756] btrfs: found 9876 extents
[18671.426925] btrfs: relocating block group 32795262976 flags 36
[18863.313381] btrfs: found 27324 extents
[18864.392712] btrfs: relocating block group 31721521152 flags 1
[18945.941315] btrfs: found 21748 extents
[19076.164135] btrfs: found 21721 extents
[20880.564211] INFO: task btrfs:3348 blocked for more than 120 seconds.
[20880.564225] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[20880.564234] btrfs D 000012d2 0 3348 1 0x00000000
[20880.564251] c0b72080 00000086 2cae1a17 000012d2 ec3ae5a0 00000000 000012d2 c14769c0
[20880.564275] c0b7222c c14769c0 00008050 e0b96ce0 ee000120 d1d5dae0 d1d5db20 00000028
[20880.564298] c105d451 00000092 c12b949b 00000092 f0b7037c 00000246 eee6073c c105d451
[20880.564321] Call Trace:
[20880.564348] [<c105d451>] ? arch_local_irq_save+0xf/0x14
[20880.564368] [<c12b949b>] ? _raw_spin_lock_irqsave+0x8/0x21
[20880.564482] [<f0b7037c>] ? btrfs_queue_worker+0x1cd/0x1fc [btrfs]
[20880.564497] [<c105d451>] ? arch_local_irq_save+0xf/0x14
[20880.564511] [<c12b949b>] ? _raw_spin_lock_irqsave+0x8/0x21
[20880.564526] [<c104d402>] ? prepare_to_wait+0x12/0x50
[20880.564601] [<f0b8a8c8>] ? btrfs_reada_wait+0x5b/0x84 [btrfs]
[20880.564615] [<c104d4d0>] ? add_wait_queue+0x30/0x30
[20880.564687] [<f0b87e33>] ? scrub_stripe+0x34c/0xbda [btrfs]
[20880.564729] [<f0a4da1f>] ? scsi_alloc_sgtable+0x1a/0x38 [scsi_mod]
[20880.564786] [<f0b37a7f>] ? comp_keys+0x2f/0x34 [btrfs]
[20880.564842] [<f0b37b4d>] ? generic_bin_search.constprop.31+0xc9/0x104 [btrfs]
[20880.564856] [<c104d4d0>] ? add_wait_queue+0x30/0x30
[20880.564912] [<f0b37bbf>] ? bin_search+0x37/0x3d [btrfs]
[20880.564987] [<f0b5e3d4>] ? __lookup_extent_mapping+0xe5/0x101 [btrfs]
[20880.565061] [<f0b88781>] ? scrub_chunk.isra.13+0xc0/0xef [btrfs]
[20880.565134] [<f0b889a3>] ? scrub_enumerate_chunks+0x1f3/0x23a [btrfs]
[20880.565209] [<f0b89486>] ? btrfs_scrub_dev+0x215/0x34e [btrfs]
[20880.565230] [<c10c061a>] ? __kmalloc_track_caller+0x9b/0xa7
[20880.565248] [<c102a280>] ? should_resched+0x5/0x1e
[20880.565320] [<f0b74b14>] ? btrfs_ioctl+0xc55/0xda1 [btrfs]
[20880.565335] [<c1029188>] ? kmap_atomic_prot+0x2f/0xe0
[20880.565353] [<c10ad3d5>] ? handle_mm_fault+0x1ee/0x1fd
[20880.565426] [<f0b73ebf>] ? btrfs_ioctl_trans_end+0x49/0x49 [btrfs]
[20880.565442] [<c10d7023>] ? do_vfs_ioctl+0x459/0x48f
[20880.565457] [<c12bc3d7>] ? do_page_fault+0x2e0/0x2fc
[20880.565469] [<c12bc3c4>] ? do_page_fault+0x2cd/0x2fc
[20880.565484] [<c10d709d>] ? sys_ioctl+0x44/0x67
[20880.565499] [<c12bdd1f>] ? sysenter_do_call+0x12/0x28
[21000.564277] INFO: task btrfs:3348 blocked for more than 120 seconds.
[21000.564292] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[21000.564302] btrfs D 000012d2 0 3348 1 0x00000000
[21000.564319] c0b72080 00000086 2cae1a17 000012d2 ec3ae5a0 00000000 000012d2 c14769c0
[21000.564344] c0b7222c c14769c0 00008050 e0b96ce0 ee000120 d1d5dae0 d1d5db20 00000028
[21000.564366] c105d451 00000092 c12b949b 00000092 f0b7037c 00000246 eee6073c c105d451
[21000.564390] Call Trace:
[21000.564422] [<c105d451>] ? arch_local_irq_save+0xf/0x14
[21000.564442] [<c12b949b>] ? _raw_spin_lock_irqsave+0x8/0x21
[21000.564582] [<f0b7037c>] ? btrfs_queue_worker+0x1cd/0x1fc [btrfs]
[21000.564600] [<c105d451>] ? arch_local_irq_save+0xf/0x14
[21000.564614] [<c12b949b>] ? _raw_spin_lock_irqsave+0x8/0x21
[21000.564630] [<c104d402>] ? prepare_to_wait+0x12/0x50
[21000.564707] [<f0b8a8c8>] ? btrfs_reada_wait+0x5b/0x84 [btrfs]
[21000.564721] [<c104d4d0>] ? add_wait_queue+0x30/0x30
[21000.564793] [<f0b87e33>] ? scrub_stripe+0x34c/0xbda [btrfs]
[21000.564840] [<f0a4da1f>] ? scsi_alloc_sgtable+0x1a/0x38 [scsi_mod]
[21000.564899] [<f0b37a7f>] ? comp_keys+0x2f/0x34 [btrfs]
[21000.564954] [<f0b37b4d>] ? generic_bin_search.constprop.31+0xc9/0x104 [btrfs]
[21000.564968] [<c104d4d0>] ? add_wait_queue+0x30/0x30
[21000.565024] [<f0b37bbf>] ? bin_search+0x37/0x3d [btrfs]
[21000.565099] [<f0b5e3d4>] ? __lookup_extent_mapping+0xe5/0x101 [btrfs]
[21000.565173] [<f0b88781>] ? scrub_chunk.isra.13+0xc0/0xef [btrfs]
[21000.565246] [<f0b889a3>] ? scrub_enumerate_chunks+0x1f3/0x23a [btrfs]
[21000.565322] [<f0b89486>] ? btrfs_scrub_dev+0x215/0x34e [btrfs]
[21000.565345] [<c10c061a>] ? __kmalloc_track_caller+0x9b/0xa7
[21000.565363] [<c102a280>] ? should_resched+0x5/0x1e
[21000.565436] [<f0b74b14>] ? btrfs_ioctl+0xc55/0xda1 [btrfs]
[21000.565451] [<c1029188>] ? kmap_atomic_prot+0x2f/0xe0
[21000.565471] [<c10ad3d5>] ? handle_mm_fault+0x1ee/0x1fd
[21000.565544] [<f0b73ebf>] ? btrfs_ioctl_trans_end+0x49/0x49 [btrfs]
[21000.565561] [<c10d7023>] ? do_vfs_ioctl+0x459/0x48f
[21000.565577] [<c12bc3d7>] ? do_page_fault+0x2e0/0x2fc
[21000.565589] [<c12bc3c4>] ? do_page_fault+0x2cd/0x2fc
[21000.565603] [<c10d709d>] ? sys_ioctl+0x44/0x67
[21000.565619] [<c12bdd1f>] ? sysenter_do_call+0x12/0x28
[… it seems to go on this way …]
Maybe something of that is helpful?
I leave the machine running for a while, but I think it won´t hibernate
with a process stuck in D state so eventually at some time I will shut it
down and restart. OTOH I could leave it running till tomorrow or what.
I will put log files aside for further diagnostics in any case.
Ciao,
--
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA B82F 991B EAAC A599 84C7
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html