On Fri, Jan 3, 2020 at 10:38 PM Zygo Blaxell <ce3g8jdj@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Jan 02, 2020 at 04:22:37PM -0700, Chris Murphy wrote:
> > I've seen with 16KiB leaf size, often small files that could be
> > inlined, are instead put into a data block group, taking up a minimum
> > 4KiB block size (on x64_64 anyway). I'm not sure why, but I suspect
> > there just isn't enough room in that leaf to always use inline
> > extents, and yet there is enough room to just reference it as a data
> > block group extent. When using a larger node size, a larger percentage
> > of small files ended up using inline extents. I'd expect this to be
> > quite a bit more efficient, because it eliminates a time expensive (on
> > HDD anyway) seek.
>
> Putting a lot of inline file data into metadata pages makes them less
> dense, which is either good or bad depending on which bottleneck you're
> currently hitting. If you have snapshots there is an up-to-300x metadata
> write amplification penalty to update extent item references every time
> a shared metadata page is unshared. Inline extents reduce the write
> amplification. On the other hand, if you are doing a lot of 'find'-style
> tree sweeps, then inline extents will reduce their efficiency because more
> pages will have to be read to scan the same number of dirents and inodes.

Egads! Soo... total tangent. I'll change the subject.

I have had multiple flash drive failures while using Btrfs: all Samsung, several SD cards, and so far two USB sticks. They all fail in essentially the same way: the media itself becomes read only.

USB: writes succeed but they do not persist. Write data to the media and there is no error; read that same sector back, and the old data is still there.

SD card: writes fail with a call trace and diagnostic info unique to the SD card kernel code, and everything just goes belly up.

This happens inside of 6 months of rather casual use as rootfs. And by the way, Samsung always replaces the media under warranty without complaint.

It's not a scientific sample. It could be the host device, which is the same in each case. It could be a bug in the firmware. I have nothing to go on, really. But I wonder if this is due to write amplification that's just not anticipated by the manufacturers. Is there any way to test for this or estimate the amount of amplification? This class of media doesn't report LBAs written, so I'm quite short on useful information to know what the cause is.

The relevance here, though, is that I really like the idea of Btrfs as a rootfs for things like IoT, because of compression, the ostensible ssd optimizations, and always-on checksumming to catch what can often be questionable media: USB sticks, SD cards, eMMC, etc. But not if the write amplification has a good chance of killing people's hardware (I have no proof of this, but now I wonder, as I read your email).

I'm aware of write amplification, I just didn't realize it could be this massive. Is it 300x just by having snapshots at all? Or does it get worse with each additional snapshot? And is it multiplicative or exponentially worse?

In the most prolific snapshotting case, I had two subvolumes, each with 20 snapshots (at most). I used the default ssd mount option for the SD cards, most recently ssd_spread with the USB sticks, and now nossd with the most recent USB stick I just started to use.

-- 
Chris Murphy
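
P.S. For what it's worth, the best I've come up with for estimating this on media that doesn't report LBAs written is to watch the host side: sample the sectors-written counter in /proc/diskstats before and after a workload, then compare against how much data the workload actually wrote. A rough sketch of that idea is below; the device name is just a placeholder, and it only sees what the kernel issues to the device, so whatever extra amplification the flash controller adds internally stays invisible and the real wear can only be worse than this number.

    #!/usr/bin/env python3
    # Sketch: estimate host-side write volume to a block device by sampling
    # /proc/diskstats before and after a workload. This measures only what
    # the kernel sends to the device; FTL-internal write amplification in
    # the flash controller is not visible from the host.

    DEVICE = "sdb"        # placeholder device name; point it at the USB stick/SD card
    SECTOR_BYTES = 512    # /proc/diskstats counts 512-byte sectors regardless of
                          # the device's logical block size

    def sectors_written(device: str) -> int:
        """Return cumulative 512-byte sectors written to `device`."""
        with open("/proc/diskstats") as f:
            for line in f:
                fields = line.split()
                # fields: major, minor, name, then the I/O statistics;
                # sectors written is the 7th statistic (index 9 overall)
                if fields[2] == device:
                    return int(fields[9])
        raise ValueError(f"device {device!r} not found in /proc/diskstats")

    if __name__ == "__main__":
        before = sectors_written(DEVICE)
        input("Run the workload now, then press Enter to sample again... ")
        after = sectors_written(DEVICE)
        written = (after - before) * SECTOR_BYTES
        print(f"Host wrote {written / (1 << 20):.1f} MiB to /dev/{DEVICE}")
        # Dividing this by the amount of data the application actually wrote
        # gives a rough host-side write-amplification factor.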
