On Sun, Oct 05, 2014 at 01:29:37PM -0700, Marc MERLIN wrote:
> Deadlocks have been less frequent (good), but here is one.
>
> An rsync from 5 days ago got stuck on btrfs it seems, and things just
> started piling up on top until the system deadlocked

This gave me a chance to fix my cronjob, which should have detected this
earlier. There is no fix other than rebooting, but I can reboot sooner,
before the watchdog kills everything without syncing my software RAID 5
arrays first.

I just polished and released the crontab below (posted on
http://marc.merlins.org/perso/btrfs/post_2014-10-05_Btrfs-Tips_-Catch-Btrfs-Deadlocks.html )

You can paste this template in your crontab:

SHELL=/bin/bash
# If load average is more than MAXLA, show load average and all blocked processes
# At any time show anything blocked on wait_current_trans.isra.15 (used to be a btrfs hang bug)
# Also show swap if it drops below MINSWAP
# We pipe into bc because shell comparison doesn't do floating point.
*/5 * * * * nobody MAXLA=25; MINSWAP=10; if [[ $(echo "$(awk '{print $1}' < /proc/loadavg) > $MAXLA" | bc) == 1 ]]; then cat /proc/loadavg; ps -eo state,pid,etime,wchan:30,args | grep W | grep -v "^[RS]"; fi; ps -eo pid,etime,wchan:30,args | grep '[w]ait_current_trans.isra.15'; if [[ $(echo "$(free | grep 'Swap' | awk '{t = $2; f = $4; print (f/t*100)}') < $MINSWAP" | bc) == 1 ]]; then free; fi

Cheers,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
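
For readability, the three checks in the crontab one-liner above can be sketched as a small script. This is only a sketch under stated assumptions, not the posted crontab entry: the function names (load_exceeds, swap_below, not_running, main) and the "check" argument are hypothetical, the comparisons are done in awk rather than bc, and a machine with no swap configured is treated as "not low" to avoid dividing by zero.

```shell
#!/bin/bash
# Sketch of the one-liner's checks as functions (all names are hypothetical).

MAXLA=${MAXLA:-25}      # load average threshold, as in the crontab
MINSWAP=${MINSWAP:-10}  # minimum free swap in percent, as in the crontab

# Succeeds if the 1-minute load average in file $1 is greater than $2.
# awk compares as floating point, so bc is not needed here.
load_exceeds() {
    awk -v max="$2" 'NR == 1 && $1 > max { over = 1 } END { exit !over }' "$1"
}

# Reads `free` output on stdin; succeeds if free swap is below $1 percent.
# Assumption: a swap total of 0 (no swap configured) counts as "not low".
swap_below() {
    awk -v min="$1" '/^Swap/ && $2 > 0 && ($4 / $2 * 100) < min { low = 1 }
                     END { exit !low }'
}

# Reads `ps -eo state,...` output on stdin and drops processes whose state
# is R (running) or S (sleeping); this also drops the header line.
not_running() {
    grep -v '^[RS]'
}

main() {
    if load_exceeds /proc/loadavg "$MAXLA"; then
        cat /proc/loadavg
        ps -eo state,pid,etime,wchan:30,args | not_running
    fi
    # Matches the symbol prefix only, on the assumption that the .isra.N
    # suffix is compiler generated and may differ between kernel builds.
    ps -eo pid,etime,wchan:30,args | grep '[w]ait_current_trans'
    if free | swap_below "$MINSWAP"; then
        free
    fi
}

# Run all checks only when invoked with "check", so the functions can also
# be sourced and tried one at a time.
if [[ "${1:-}" == "check" ]]; then
    main
fi
```

As with the original entry, cron only sends mail when something is printed, so a quiet run means none of the thresholds were crossed. Saved under a path of your choosing and called with the "check" argument, it would replace the long command in the crontab line.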
