On 17.10.19 at 22:39, David Sterba wrote:
> Signed-off-by: David Sterba <dsterba@xxxxxxxx>
> ---
>  fs/btrfs/locking.c | 110 +++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 96 insertions(+), 14 deletions(-)
>
> diff --git a/fs/btrfs/locking.c b/fs/btrfs/locking.c
> index e0e0430577aa..2a0e828b4470 100644
> --- a/fs/btrfs/locking.c
> +++ b/fs/btrfs/locking.c
> @@ -13,6 +13,48 @@
>  #include "extent_io.h"
>  #include "locking.h"
>
> +/*
> + * Extent buffer locking
> + * ~~~~~~~~~~~~~~~~~~~~~
> + *
> + * The locks use a custom scheme that allows to do more operations than are
> + * available fromt current locking primitives. The building blocks are still
> + * rwlock and wait queues.
> + *
> + * Required semantics:

Thinking out loud here...

> + *
> + * - reader/writer exclusion

RWSEM has that.

> + * - writer/writer exclusion

RWSEM has that.

> + * - reader/reader sharing

RWSEM has that.

> + * - spinning lock semantics

RWSEM has that in the form of optimistic spinning, which would be shorter
than what the custom implementation provides.

> + * - blocking lock semantics

RWSEM has that.

> + * - try-lock semantics for readers and writers

down_write_trylock, down_read_trylock.

> + * - one level nesting, allowing read lock to be taken by the same thread that
> + *   already has write lock

This might be somewhat problematic, but there is downgrade_write which could
be used. I'll have to check.

> + *
> + * The extent buffer locks (also called tree locks) manage access to eb data.
> + * We want concurrency of many readers and safe updates. The underlying locking
> + * is done by read-write spinlock and the blocking part is implemented using
> + * counters and wait queues.
> + *
> + * spinning semantics - the low-level rwlock is held so all other threads that
> + *                      want to take it are spinning on it.
> + *
> + * blocking semantics - the low-level rwlock is not held but the counter
> + *                      denotes how many times the blocking lock was held;
> + *                      sleeping is possible
> + *
> + * Write lock always allows only one thread to access the data.

<snip>
