On Mon, Jan 06, 2020 at 05:30:01PM +0100, David Sterba wrote:
> Is it expected to leave the counters in a state where are discardable
> extents but not process after a long period of time? I found
>
> discard_bitmap_bytes:316833792
> discard_bytes_saved:59390722048
> discard_extent_bytes:26122764288
> discardable_bytes:44863488
> discardable_extents:883
> iops_limit:10
> kbps_limit:0
> max_discard_size:67108864
>
> there was activity when the number of extents went from about 2000 to
> that value (833), so this could be another instance of the -1 accounting
> bug.

There is no guarantee that each invocation of the work item will find
something to discard. This was designed to prevent any edge case from
consuming the CPU. If free space is added back while a block_group has
its cursor being moved (unless it's fully free), it will not go back and
trim those extents. So we may leave stuff untrimmed until the next time
around. This is also to prevent a pathological case of just resetting in
the same block_group.

Therefore, we may be in a situation where we have discardable extents,
but we aren't actively discarding them. The premise is that some
filesystem usage will eventually occur and kick the block_group back onto
the list. This also works because btrfs tries to reuse block groups
before allocating another one.

The -1 case is special because it really has to be that we're blowing up
the block_group with something left on the table. Because of the size,
I'm guessing it's bitmap related and I added removal of discardable_*
inside free_bitmap().
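
As a rough illustration of the cursor behavior described above, here is
a toy userspace model. It is not the btrfs code: every name in it
(toy_extent, toy_block_group, toy_discard_step, toy_requeue, and so on)
is made up for the example, it trims one extent per step, and it ignores
the "fully free" exception and bitmaps entirely. It only shows why
discardable extents can be left behind once the cursor has passed them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EXTENTS 16

/* Hypothetical free-extent record; not a btrfs structure. */
struct toy_extent {
	uint64_t offset;
	uint64_t bytes;
	bool trimmed;
};

/* Hypothetical block group with a discard cursor. */
struct toy_block_group {
	struct toy_extent extents[MAX_EXTENTS];
	size_t nr_extents;
	uint64_t cursor;	/* discard position inside the block group */
};

/* Record newly freed space. Note that the cursor is left alone. */
static void toy_add_free_space(struct toy_block_group *bg,
			       uint64_t offset, uint64_t bytes)
{
	assert(bg->nr_extents < MAX_EXTENTS);
	bg->extents[bg->nr_extents++] =
		(struct toy_extent){ offset, bytes, false };
}

/*
 * One invocation of the (toy) work item: trim the lowest untrimmed
 * extent at or past the cursor. Returns false if nothing was found,
 * which is allowed and does not rewind the cursor.
 */
static bool toy_discard_step(struct toy_block_group *bg)
{
	struct toy_extent *next = NULL;

	for (size_t i = 0; i < bg->nr_extents; i++) {
		struct toy_extent *e = &bg->extents[i];

		if (e->trimmed || e->offset < bg->cursor)
			continue;	/* already done, or behind the cursor */
		if (!next || e->offset < next->offset)
			next = e;
	}
	if (!next)
		return false;

	next->trimmed = true;
	bg->cursor = next->offset + next->bytes;
	return true;
}

/* Number of extents that are still discardable. */
static size_t toy_discardable_extents(const struct toy_block_group *bg)
{
	size_t n = 0;

	for (size_t i = 0; i < bg->nr_extents; i++)
		if (!bg->extents[i].trimmed)
			n++;
	return n;
}

/* Later filesystem activity puts the block group back on the list. */
static void toy_requeue(struct toy_block_group *bg)
{
	bg->cursor = 0;
}
```

In this model, freeing space at an offset below the cursor leaves
toy_discardable_extents() non-zero while toy_discard_step() reports
nothing to do; only a later toy_requeue() lets that extent be trimmed.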
