On 02/10/2014 01:24 PM, cwillu wrote:
I concur. The regular df "used" number should be the amount of space required to hold a backup of that content (assuming that the backup maintains reflinks and compression and so forth).

There's no good answer for available space; the statfs syscall isn't rich enough to cover all the bases even in the face of dup metadata and single data (i.e., the common case), and a truly conservative estimate (report based on the highest-usage raid level in use) would report space/2 on that same common case. "Highest-usage data raid level in use" is probably the best compromise, with a big warning that large numbers of small files will not actually fit, posted in some mythical place that users look.

I would like to see the information from btrfs fi df and btrfs fi show summarized somewhere (ideally as a new btrfs fi df output), as both sets of numbers are really necessary, or at least have btrfs fi df include the amount of space not allocated to a block group.

Re regular df: are we adding space allocated to a block group (raid1, say) but not in actual use in a file as the N/2 space available in the block group, or the N space it takes up on disk? This probably matters a bit less than it used to, but if it's N/2, that leaves us open to "empty filesystem, 100GB free, write an 80GB file and then delete it, wtf, only 60GB free now?" reporting issues.
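For reference, the numbers statfs can carry are easy to inspect from userspace. The following is a minimal sketch, assuming Python's os.statvfs wrapper; the helper name statfs_summary is made up for illustration. It shows that the call only offers a total, a free count and an available-to-unprivileged count, with no field for per-profile (single, dup, raid1) detail.

```python
import os

def statfs_summary(path="/"):
    """Hypothetical helper: the space figures statvfs exposes, in bytes."""
    st = os.statvfs(path)
    unit = st.f_frsize            # block counts are expressed in this unit
    total = st.f_blocks * unit    # size of the filesystem
    free = st.f_bfree * unit      # free blocks, including root-reserved ones
    avail = st.f_bavail * unit    # free blocks available to unprivileged users
    # "used" is derived the simple way: total minus free
    return {"total": total, "free": free, "avail": avail, "used": total - free}

if __name__ == "__main__":
    for name, value in statfs_summary("/").items():
        print(f"{name:>5}: {value / 2**30:.1f} GiB")
```

Everything a filesystem wants to say about space has to be squeezed into those three counts, which is why any single "available" figure is a compromise.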
The only case where we add the actual allocated chunk space is metadata; for data we only add the actual used number. So say you write an 80GB file and then delete it, but during the write we allocated a 1GB chunk for metadata: you'll see only 99GB free. Make sense? We could (should?) roll this into the b_avail magic and make "used" really only reflect data usage. Opinions on this? Thanks,
Josef
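The accounting described in the reply above can be sketched in a few lines. This is only an illustration under stated assumptions: the SpaceInfo class and the df_used/df_free functions are hypothetical names, not the kernel's structures, and raid profiles are ignored (everything is treated as single). It reproduces the 80GB example: data counts only bytes in use, metadata counts whole allocated chunks.

```python
from dataclasses import dataclass

GIB = 1024 ** 3

@dataclass
class SpaceInfo:
    """Hypothetical per-type accounting (not the kernel's data structures)."""
    allocated: int  # bytes reserved in chunks for this type
    used: int       # bytes actually holding content

def df_used(data: SpaceInfo, metadata: SpaceInfo) -> int:
    # Data contributes only what is in use; metadata contributes
    # its full allocated chunk space, used or not.
    return data.used + metadata.allocated

def df_free(total: int, data: SpaceInfo, metadata: SpaceInfo) -> int:
    return total - df_used(data, metadata)

total = 100 * GIB

# While the 80GB file exists, with a 1GB metadata chunk allocated
# (the metadata "used" value here is an arbitrary placeholder).
during = df_free(total, SpaceInfo(80 * GIB, 80 * GIB), SpaceInfo(1 * GIB, GIB // 10))

# After deleting the file: data "used" drops to zero, but the
# metadata chunk stays allocated and is still counted.
after = df_free(total, SpaceInfo(80 * GIB, 0), SpaceInfo(1 * GIB, GIB // 10))

print(during // GIB, after // GIB)
```

Under these assumptions the data chunks left behind after the delete do not reduce the reported free space, while the single metadata chunk does, giving 99GB rather than 100GB.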
