On Tue, Aug 25, 2015 at 10:29:15PM +0000, Duncan wrote:
> Swâmi Petaramesh posted on Tue, 25 Aug 2015 16:03:55 +0200 as excerpted:
> 
> > On Tuesday 25 August 2015 15:25:12, Swâmi Petaramesh wrote:
> >> Uh, I started [btrfs check --repair] hours ago, and it has been
> >> eating 100% CPU on one of my cores since, without apparently yet
> >> finding a single error.
> > 
> > btrfs check finally came to an end, and fixed dir isizes.
> > 
> > The first good news is that my FS didn't die in the process; the
> > second good news is that I was then actually able to delete the
> > "empty" directory that couldn't be deleted before.
> > 
> > So, it did the job ;-)
> 
> =:^)
> 
> FWIW, there was something in the corner of my mind about those root
> nnnnn numbers that was bothering me a bit, but I was focused on the
> "errors 200, dir isize wrong" end of the output and couldn't quite
> place it, so didn't mention it.
> 
> Now I know what it was. Roots above 256 (255?) are subvolume/snapshot
> roots.

FS tree IDs for subvolumes start at 256. The rest of the metadata trees
(chunk, dev, extent, csum, UUID, top-level FS) are all small integers
under 20.
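(For anyone who wants to eyeball check output, here's a rough Python
sketch of how those IDs break down. The constant values are from my
memory of ctree.h, so treat them as illustrative rather than
authoritative.)

# Rough classifier for btrfs tree/root IDs; purely illustrative.
# The well-known IDs are recalled from the on-disk format headers
# (ctree.h); verify against your kernel sources before relying on them.

WELL_KNOWN_TREES = {
    1: "root tree",
    2: "extent tree",
    3: "chunk tree",
    4: "dev tree",
    5: "top-level FS tree",
    7: "csum tree",
    8: "quota tree",
    9: "UUID tree",
    10: "free-space tree",
}

FIRST_SUBVOL_ID = 256  # subvolume/snapshot FS trees start here

def describe_root(root_id):
    """Return a human-readable guess at what a root ID refers to."""
    if root_id in WELL_KNOWN_TREES:
        return WELL_KNOWN_TREES[root_id]
    if root_id >= FIRST_SUBVOL_ID:
        return "subvolume/snapshot FS tree"
    return "reserved/unknown"

if __name__ == "__main__":
    # Hypothetical IDs, including a "root over 24k" like the one above.
    for rid in (3, 5, 7, 256, 24697):
        print(rid, "->", describe_root(rid))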
> That's not bad in itself; indeed, it's normal (which was why I
> couldn't place what was bothering me about it), but with the output
> including a root over 24k, it does indicate that you're very likely a
> heavy snapshotter.

It happens to us all eventually... ;)

> And that /can/ be a bit of a problem when doing btrfs maintenance, as
> btrfs is known to have scaling issues with large numbers of snapshots,
> dramatically increasing the time required for maintenance tasks such
> as check and balance.

From something Josef said on IRC about 6 months ago, I think balancing
is O(n^2) where n is the maximum number of shared snapshots. I don't
know what check is like for that, but I'd hazard a guess at O(n) in the
total size of the metadata.

> That's almost certainly why the check --repair took so long -- all
> those snapshots.
> 
> My recommendation has been to try to keep the number of snapshots
> under 10K, worst-case, and preferably under 2K. Even if you're doing
> automated snapshots at, say, half-hour intervals, a reasonable
> thinning program, say hourly after 12 hours, two-hourly after a day,
> six-hourly after two days, daily after a week, weekly after a
> quarter, ... and switching to longer-term backups after six months or
> a year so older snapshots can be deleted, can easily keep
> per-subvolume snapshots to 250 or so. 250 snapshots per subvolume lets
> you do four subvolumes at a 1000-snapshot-per-filesystem cap, or eight
> subvolumes at the 2000-snapshot cap. Beyond that, and certainly beyond
> 10K, scaling really gets to be an issue, with btrfs maintenance time
> quickly increasing beyond the practical range.

(A rough back-of-the-envelope sketch of how that kind of thinning
schedule adds up is at the end of this mail, below my sig.)

> Hours for a btrfs check --repair? You're actually lucky. We've had
> reports of days, weeks... when people have tens of thousands of
> snapshots.
> 
> If I'd properly placed that nagging feeling about those root numbers
> that was in the back of my mind with the go-ahead post, I could at
> least have warned you about the time it could take. Well, I guess I
> know for next time, anyway. Speaking of which... thanks for confirming
> the fix. Very helpful for next time. =:^)
> 
> Meanwhile, talking scaling, one more thing... btrfs quotas...
> 
> Btrfs quotas just don't work well at this point: they're not always
> reliable, and they dramatically increase the scaling issues.

Even after the recent rewrite? I'd expect that to drop back to
"unproven".

> My recommendation is that if you really need them, use a more mature
> filesystem where they're known to work reliably. If you can get by
> without them on btrfs, definitely do so, unless you're specifically
> working with the devs to develop and test that feature, because
> keeping quotas off will simplify your btrfs life considerably! The
> devs are actively working on the problem and should eventually get it
> right, but it's just an incredibly difficult issue that has taken
> multiple tries.

-- 
Hugo Mills             | Great oxymorons of the world, no. 5:
hugo@... carfax.org.uk | Manifesto Promise
http://carfax.org.uk/  | PGP: E2AB1DE4
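As referenced above, here's a rough back-of-the-envelope Python sketch
of how a thinning schedule like the one Duncan describes adds up per
subvolume. The tier boundaries are my own approximation of his example
rather than an exact policy, so treat the output as ballpark only.

# Count of snapshots retained per subvolume under an approximate
# thinning schedule: half-hourly for 12 hours, then hourly to a day,
# two-hourly to two days, six-hourly to a week, daily to a quarter,
# weekly to six months, with older snapshots deleted in favour of
# longer-term backups.

HOUR = 1
DAY = 24 * HOUR

# (length of tier in hours, interval between kept snapshots in hours)
tiers = [
    (12 * HOUR,             0.5),      # half-hourly for the first 12 hours
    (1 * DAY - 12 * HOUR,   1),        # hourly up to one day
    (2 * DAY - 1 * DAY,     2),        # two-hourly up to two days
    (7 * DAY - 2 * DAY,     6),        # six-hourly up to a week
    (91 * DAY - 7 * DAY,    DAY),      # daily up to a quarter
    (182 * DAY - 91 * DAY,  7 * DAY),  # weekly up to six months
]

total = sum(length / interval for length, interval in tiers)
print("snapshots kept per subvolume: about %d" % round(total))

That works out to roughly 165 snapshots per subvolume, comfortably
inside the "250 or so" figure, so four such subvolumes stay under a
1000-snapshot cap and eight under 2000.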
