On Thu, Mar 14, 2019 at 09:10:42AM +0200, Nikolay Borisov wrote: > >> The correct way to free the buffer is call free_extent_buffer_stale. > >> This function will correctly call atomic_dec explicitly for the buffer > >> and subsequently call release_extent_buffer which will decrement the > >> final reference thus correctly remove the invalid buffer from buffer > >> cache. This change affects only newly allocated buffers since they have > >> eb->refs == 2. > > > > Following the same pattern, should every use of > > btrfs_find_create_tree_block be followed by free_extent_buffer_stale in > > case of errors? > > > > There's almost the same sequence in readahead_tree_block as you update > > in reada_tree_block_flagged: > > > > btrfs_find_create_tree_block > > read_extent_buffer_pages > > free_extent_buffer > > > > so should be the _stale variant used there too? > > You are right it should also be using the _stale variant. I will fix it. > > Despite my cleanup of the extent buffer freeing code I'm still not > entirely happy with the special cases we have with the stale or unmapped > logic. That's why we need all these special cases ;\ > > > > > free_extent_buffer_stale is normally used in other contexts where the > > 'stale' part means that the eb is known to be disconnected. > > > > I think that btrfs_find_create_tree_block needs a dedicated freeing > > function that will make sure all invariants (refs == 2, no other > > structures connected) are met and then do effectively the same what > > _stale does now. > Sounds good though I'm a bit reluctant to add yet another > free_extent_buffer variant. IMO the logic of the extent buffer radix > tree need to be revised which could lead to simplifications in the whole > ref counting code. Revised maybe, but this is the core of the MM <-> FS interaction and must be correct, even after the simplification. There's RCU in the mix, non-standard refcounting, page tracking. This won't be easy. Adding a admitedly why-yet-another special freeing helper would at least stop abusing the API and annotate the context. That's an intermediate step.
