I have a system running the Debian package of 3.11.5 with an AMD Opteron 1212 processor (2*64bit cores), 8G of RAM, and an Intel 120G SSD for the root and home subvols. It has a RAID-1 array of 2*3TB disks for bulk storage (movies etc) but that probably isn't relevant to this problem.

On the root filesystem I have cron jobs making daily snapshots of / and /home and additional snapshots of /home every 15 minutes. At midnight a cron job removes older snapshots.

For the last 8 days the system has been reliably hanging at about 5 minutes after midnight, and the subvol removal cron job is the only thing that runs at that time. So it seems clear to me that on my system 3.11.5 crashes a few minutes after removing ~98 subvols at the same time.

Last night I watched it happen and deleted a few dozen extra subvols to test whether it would repeat. That wasn't such a good idea: I rebooted the system many times before giving up and booting 3.10.11, which is now working correctly.

When running 3.11.5 I was seeing kernel log messages such as the following shortly after boot. After that it got into a state where an ssh session didn't work and the X login prompt didn't even flash its cursor. In that state it could still forward packets (the system in question is an ethernet bridge which I use to connect my workstation to the Internet) but couldn't do much else. The NFS server processes locked up and sshd wouldn't complete the login process for new connection attempts.

[   68.056003] BUG: soft lockup - CPU#0 stuck for 22s! [btrfs-cleaner:270]
[   68.144004] BUG: soft lockup - CPU#1 stuck for 22s! [btrfs-transacti:271]

Prior to the lockup those two kernel processes had used most of the CPU time. I'm not sure whether prior to the lockup they were in some sort of CPU loop or whether they were just reading a lot of data from a fast SSD and acting correctly.

As an aside I ordered a replacement server last week when I wasn't sure if this was a hardware or a software problem.
This will allow me to test some things in more detail on the old server after the new one is running. However, I don't own a spare SSD, so if it's an SSD-specific issue then I have limited ability to test.

--
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/
