On Mon, Oct 28, 2019 at 12:44 PM Atemu <atemu.main@xxxxxxxxx> wrote:
>
> > That's quite a lot of extents shared many times.
> > That indeed slows backreference walking and therefore send which uses it.
> > While the slowdown is known, the memory consumption I wasn't aware of,
> > but from your logs, it's not clear
>
> Is there anything else I could monitor to find out?

You can run 'slabtop' while doing the send operation.
That might be enough.

It's very likely the backreference walking code, due to huge ulists
(kmalloc-N slab), lots of btrfs_prelim_ref structures
(btrfs_prelim_ref slab), etc.

>
> > where it comes exactly from, something to be looked at. There's also a
> > significant number of data checksum errors.
>
> As I said, those seem to be false; the file is in-tact (it happens to
> be a 7z archive) and scrubs before triggering the bug don't report
> anything either.
>
> Could be related to running OOM or its own bug.

Yes, it's likely a different bug. I don't think it's related either.

>
> > I think in the meanwhile send can just skip backreference walking and
> > attempt to clone whenever the number of
> > backreferences for an inode exceeds some limit, in which case it would
> > fallback to writes instead of cloning.
>
> Wouldn't it be better to make it dynamic in case it's run under low
> memory conditions?

Ideally yes. But that's a lot harder to do for several reasons and in
the end might not be worth it.

Thanks.

>
> > I'll look into it, thanks for the report (and Qu for telling how to
> > get the backreference counts).
>
> Thanks to you both!
> -Atemu

--
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”
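
If watching slabtop interactively is inconvenient, the same numbers can
be logged from /proc/slabinfo while the send runs. What follows is only
a rough sketch, not something taken from the btrfs tooling: the function
names (parse_slabinfo, read_slab_usage) are made up for illustration, it
assumes the slabinfo 2.x column order (name, active_objs, num_objs,
objsize), and it only looks at the caches mentioned above
(btrfs_prelim_ref and the kmalloc-N family). Reading /proc/slabinfo
usually requires root, and slab merging may fold a cache into another
one, so a missing entry does not prove the cache is unused.

```python
import re
from typing import Dict

# Caches named in the message above; the exact kmalloc-N names vary per system.
WATCHED = re.compile(r"^(btrfs_prelim_ref|kmalloc-\S+)$")


def parse_slabinfo(text: str) -> Dict[str, int]:
    """Return approximate bytes held per watched slab cache.

    Assumes the slabinfo 2.x layout, where each data line starts with:
    name <active_objs> <num_objs> <objsize> ...
    """
    usage: Dict[str, int] = {}
    for line in text.splitlines():
        # Skip the version banner and the column header.
        if line.startswith("slabinfo") or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4 or not WATCHED.match(fields[0]):
            continue
        num_objs, objsize = int(fields[2]), int(fields[3])
        # Allocated objects times object size; ignores per-slab overhead.
        usage[fields[0]] = num_objs * objsize
    return usage


def read_slab_usage(path: str = "/proc/slabinfo") -> Dict[str, int]:
    """Read and parse the live slabinfo file (typically needs root)."""
    with open(path) as f:
        return parse_slabinfo(f.read())
```

Calling read_slab_usage() every few seconds during the send and printing
the largest entries should show whether btrfs_prelim_ref or one of the
kmalloc-N caches is the one growing.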
