On 08/10/2019 11:26, Qu Wenruo wrote:
>
>
> On 2019/10/8 5:14 PM, Johannes Thumshirn wrote:
>>> [[Benchmark]]
>>> Since I have upgraded my rig to all NVME storage, there is no HDD
>>> test result.
>>>
>>> Physical device: NVMe SSD
>>> VM device: VirtIO block device, backed by a sparse file
>>> Nodesize: 4K (to bump up tree height)
>>> Extent data size: 4M
>>> Fs size used: 1T
>>>
>>> All file extents on disk are 4M in size, preallocated to reduce space
>>> usage (as the VM uses a loopback block device backed by a sparse file)
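>>>
>>> For scale, a quick back-of-the-envelope count (a sketch, assuming all
>>> data extents are exactly 4M as preallocated above):
>>>
>>> # 1T of data split into 4M extents gives 262144 EXTENT_ITEMs in the
>>> # extent tree; with 4K nodes each leaf holds only a fraction of what
>>> # a default 16K leaf would, so the tree grows taller.
>>> echo $(( (1024 * 1024) / 4 ))    # 262144 data extents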
>>
>> Do you have some additional details about the test setup? I tried to
>> do the same testing for a bug Felix (added to Cc) reported to me at
>> the ALPSS Conference, but I couldn't reproduce the issue.
>>
>> My test setup was a 100TB sparse file passed into a VM, running this
>> script to touch all block groups:
>
> Here is my test scripts:
> ---
> #!/bin/bash
>
> dev="/dev/vdb"
> mnt="/mnt/btrfs"
>
> nr_subv=16
> nr_extents=16384
> extent_size=$((4 * 1024 * 1024)) # 4M
>
> _fail()
> {
>     echo "!!! FAILED: $* !!!"
>     exit 1
> }
>
> fill_one_subv()
> {
>     path=$1
>     if [ -z "$path" ]; then
>         _fail "wrong parameter for fill_one_subv"
>     fi
>     btrfs subv create "$path" || _fail "create subv"
>
>     # Preallocate nr_extents back-to-back 4M extents into one file
>     for i in $(seq 0 $((nr_extents - 1))); do
>         fallocate -o $((i * extent_size)) -l $extent_size \
>             "$path/file" || _fail "fallocate"
>     done
> }
>
> declare -a pids
> umount $mnt &> /dev/null
> umount $dev &> /dev/null
>
> #~/btrfs-progs/mkfs.btrfs -f -n 4k $dev -O bg-tree
> mkfs.btrfs -f -n 4k $dev
> mount $dev $mnt -o nospace_cache
>
> for i in $(seq 1 $nr_subv); do
>     fill_one_subv $mnt/subv_${i} &
>     pids[$i]=$!
> done
>
> for i in $(seq 1 $nr_subv); do
>     wait ${pids[$i]}
> done
> sync
> umount $dev
>
> ---
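>
> To sanity-check the resulting layout (a rough sketch; assumes a
> reasonably recent btrfs-progs with dump-tree support, run against the
> unmounted device):
>
> # Count EXTENT_ITEMs vs. BLOCK_GROUP_ITEMs in the extent tree; with
> # the fill pattern above there should be many EXTENT_ITEMs between
> # any two BLOCK_GROUP_ITEMs.
> btrfs inspect-internal dump-tree -t extent "$dev" | grep -c 'EXTENT_ITEM'
> btrfs inspect-internal dump-tree -t extent "$dev" | grep -c 'BLOCK_GROUP_ITEM'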
>
>>
>> #!/bin/bash
>>
>> FILE=/mnt/test
>>
>> add_dirty_bg() {
>>     off="$1"
>>     len="$2"
>>     touch $FILE
>>     xfs_io -c "falloc $off $len" $FILE
>>     rm $FILE
>> }
>>
>> mkfs.btrfs /dev/vda
>> mount /dev/vda /mnt
>>
>> # dirty one 1G block group per iteration across the 100T sparse device
>> for ((i = 1; i < 100000; i++)); do
>>     add_dirty_bg $i"G" "1G"
>> done
>
> This won't really build a problematic enough extent tree layout.
>
> A 1G fallocate only creates 8 file extents of 128M each (the maximum
> data extent size), thus only 8 EXTENT_ITEMs.
>
> Thus a leaf (16K by default) can still hold a lot of BLOCK_GROUP_ITEMs
> packed together.
>
> To build a case that really shows the problem, you need a lot of
> EXTENT_ITEMs/METADATA_ITEMs filling the gaps between the
> BLOCK_GROUP_ITEMs.
>
> My test script does that, but it may still not represent the real
> world, as real-world usage can create even smaller extents due to
> snapshots.
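>
> Rough math behind that (a sketch; byte counts are approximate and
> assume the default 16K nodesize):
>
> # a 1G falloc produces 1024M / 128M = 8 EXTENT_ITEMs, while a 16K
> # leaf can hold a few hundred items (~16384 bytes over ~50-80 bytes
> # per EXTENT_ITEM plus its inline backref)
> echo $(( 1024 / 128 ))    # 8 extents per 1G falloc
> echo $(( 16384 / 64 ))    # ~256 items per leaf, order of magnitude
>
> So a single leaf covers the EXTENT_ITEMs of dozens of 1G block groups,
> and the BLOCK_GROUP_ITEMs end up packed close together instead of
> spread across many leaves.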
>
Ah, thanks for the explanation. I'll give your test script a try.
--
Johannes Thumshirn SUSE Labs Filesystems
jthumshirn@xxxxxxx +49 911 74053 689
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5
90409 Nürnberg
Germany
(HRB 247165, AG München)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
