On 1.02.20 at 0:36, Josef Bacik wrote:
> Nikolay reported a problem where generic/371 would fail sometimes with a
> slow drive. The gist of the test is that we fallocate a file in
> parallel with a pwrite of a different file. These two files combined
> are smaller than the file system, but sometimes the pwrite would ENOSPC.
>
> A fair bit of investigation uncovered the fact that the fallocate
> workload was racing in and grabbing the free space that the pwrite
> workload was trying to free up so it could make its own reservation.
> After a few loops of this eventually the pwrite workload would error out
> with an ENOSPC.
>
> We've had the same problem with metadata as well, and we serialized all
> metadata allocations to satisfy this problem. This wasn't usually a
> problem with data because data reservations are more straightforward,
> but obviously could still happen.
>
> Fix this by not allowing reservations to occur if there are any pending
> tickets waiting to be satisfied on the space info.
>
> Signed-off-by: Josef Bacik <josef@xxxxxxxxxxxxxx>

Reviewed-by: Nikolay Borisov <nborisov@xxxxxxxx>
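
For anyone following the thread without the diff in front of them, the rule described in the last paragraph of the commit message can be sketched in user space roughly as below. This is only an illustration under my own assumptions, not the btrfs code: struct space_info here is a simplified stand-in, and reserve_bytes(), release_bytes(), has_room() and MAX_TICKETS are hypothetical names chosen for the example. The point it tries to show is that a new reservation is refused the fast path whenever a ticket is already waiting, even if the request would fit, so freed space goes to the oldest waiter first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_TICKETS 8   /* arbitrary bound for this sketch */

/* Simplified, hypothetical stand-in; not the kernel's structure. */
struct space_info {
    uint64_t total_bytes;
    uint64_t bytes_used;
    uint64_t tickets[MAX_TICKETS];  /* FIFO of waiting reservation sizes */
    size_t nr_tickets;
};

enum reserve_result { RESERVED, QUEUED, NO_SPACE };

static int has_room(const struct space_info *s, uint64_t bytes)
{
    return bytes <= s->total_bytes - s->bytes_used;
}

/*
 * Reserve immediately only if nobody is waiting AND the request fits.
 * Otherwise queue a ticket behind the existing waiters.
 */
static enum reserve_result reserve_bytes(struct space_info *s, uint64_t bytes)
{
    if (s->nr_tickets == 0 && has_room(s, bytes)) {
        s->bytes_used += bytes;
        return RESERVED;
    }
    if (s->nr_tickets == MAX_TICKETS)
        return NO_SPACE;
    s->tickets[s->nr_tickets++] = bytes;
    return QUEUED;
}

/* Free space, then satisfy waiting tickets strictly in FIFO order. */
static void release_bytes(struct space_info *s, uint64_t bytes)
{
    assert(bytes <= s->bytes_used);
    s->bytes_used -= bytes;

    while (s->nr_tickets > 0 && has_room(s, s->tickets[0])) {
        s->bytes_used += s->tickets[0];
        /* Drop the head ticket and shift the rest forward. */
        memmove(&s->tickets[0], &s->tickets[1],
                (s->nr_tickets - 1) * sizeof(s->tickets[0]));
        s->nr_tickets--;
    }
}
```

In terms of generic/371: with 10 bytes free, a 50-byte "pwrite" reservation queues a ticket, and a later 10-byte "fallocate" reservation also queues instead of taking the free space, which is the race the patch closes. The real code would of course also have to handle locking and flushing, which this sketch leaves out entirely.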
