On 5/30/20 6:59 AM, Qu Wenruo wrote:
[...]
This new mode is enabled by passing the option ssd_metadata at mount time.
This allocation policy is the "preferred" one. If it doesn't permit
a chunk allocation, the "classic" one is used.
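For example (a minimal sketch; the device name and mount point are placeholders):

# mount -o ssd_metadata /dev/sdX /mnt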
One thing to improve here: in fact we could use existing members to
store the device-related info:
- btrfs_dev_item::seek_speed
- btrfs_dev_item::bandwidth (which I'd be inclined to rename to IOPS)
Hi Qu,
this patch was an older version; the current one (sent two days ago) stores the setting
of which disks have to be considered as "preferred_metadata".
In fact, what you're trying to do is to provide a policy to allocate
chunks based on each device's performance characteristics.
I believe it would be super awesome, but to get it upstream, I guess we
would prefer a more flexible framework, so it would be pretty slow to merge.
I agree. And considering that in the near future SSDs will become more
widespread, I don't know if the effort (and the time required) is worth it.
But still, thanks for your awesome idea.
Thanks,
Qu
Some examples: (/dev/sd[abc] are ssd, and /dev/sd[ef] are rotational)
Non striped profile: metadata->raid1, data->raid1
The data is stored on /dev/sd[ef], metadata is stored on /dev/sd[abc].
When /dev/sd[ef] are full, the data chunks are allocated also on
/dev/sd[abc].
Striped profile: metadata->raid6, data->raid6
raid6 requires at least 3 disks, so /dev/sd[ef] alone are not enough for a
raid6 data profile. To allow a data chunk allocation, the raid6 data chunks
will be stored on all the disks /dev/sd[abcdef].
The raid6 metadata chunks, instead, will be allocated on /dev/sd[abc] only,
because these are enough to host them.
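As a sketch, the raid1 example above could be created like this (device names
are the ones from the example; the allocation policy only takes effect when the
filesystem is mounted with the option of this patch version):

# mkfs.btrfs -m raid1 -d raid1 /dev/sda /dev/sdb /dev/sdc /dev/sde /dev/sdf
# mount -o ssd_metadata /dev/sda /mnt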
Changelog:
v1: - first issue
v2: - rebased to v5.6.2
- correct the comparison about the rotational disks (>= instead of >)
- add the flag rotational to the struct btrfs_device_info to
simplify the comparison function (btrfs_cmp_device_info*() )
v3: - correct the collision between BTRFS_MOUNT_DISCARD_ASYNC and
BTRFS_MOUNT_SSD_METADATA.
Below I collected some data to highlight the performance improvement.
Test setup:
As a test, I performed a "dist-upgrade" of a Debian system from stretch to buster.
The test consisted of an image of a Debian stretch[1] with the packages
needed already under /var/cache/apt/archives/ (so no networking was involved).
For each test I formatted the filesystem from scratch, un-tarred the
image and then ran "apt-get dist-upgrade" [2]. For each disk(s)/filesystem
combination I measured the time of apt dist-upgrade with and
without the flag "force-unsafe-io", which reduces the use of sync(2) and
fsync(2). The ssd was 20GB big, the hdd was 230GB big.
I considered the following scenarios:
- btrfs over ssd
- btrfs over ssd + hdd with my patch enabled
- btrfs over bcache over hdd+ssd (setup sketched after this list)
- btrfs over hdd (very, very slow....)
- ext4 over ssd
- ext4 over hdd
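The bcache scenario could be set up roughly like this (a sketch, assuming the
bcache-tools userspace; device names are placeholders):

# make-bcache -B /dev/sd<hdd> -C /dev/sd<ssd>
# mkfs.btrfs /dev/bcache0
# mount /dev/bcache0 test/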
The test machine was an "AMD A6-6400K" with 4GB of RAM, of which about 3GB
was used as cache/buffers.
Data analysis:
Of course btrfs is slower than ext4 when a lot of syncs/flushes are involved. Using
apt on a rotational disk was a dramatic experience. IMHO this should be replaced
by using the btrfs snapshot capabilities, but this is another (not easy) story.
Unsurprisingly, bcache performs better than my patch. But this is an expected
result, because bcache can also cache the data chunks (reads can go directly to
the ssd). bcache is about +60% slower than the ssd-only case when there are a
lot of syncs/flushes, and only +20% slower in the other case.
Regarding the test with force-unsafe-io (fewer syncs/flushes), my patch reduces
the delta over the ssd-only case from +256% (hdd only) to +113% (ssd+hdd), which
I consider a good result considering how small the patch is.
Raw data:
The data below is the "real" time (as returned by the time command) consumed by
apt.
With sync/flush (default dpkg I/O):

Test description       real (mmm:ss)   Delta %
--------------------   -------------   -------
btrfs hdd w/sync           142:38        +533%
btrfs ssd+hdd w/sync        81:04        +260%
ext4 hdd w/sync             52:39        +134%
btrfs bcache w/sync         35:59         +60%
btrfs ssd w/sync            22:31     reference
ext4 ssd w/sync             12:19         -45%
With force-unsafe-io:

Test description       real (mmm:ss)   Delta %
--------------------   -------------   -------
btrfs hdd                   56:02        +256%
ext4 hdd                    51:32        +228%
btrfs ssd+hdd               33:30        +113%
btrfs bcache                18:57         +20%
btrfs ssd                   15:44     reference
ext4 ssd                    11:49         -25%
[1] I created the image using "debootstrap stretch", then I installed a set
of packages using the commands:
# debootstrap stretch test/
# chroot test/
# mount -t proc proc proc
# mount -t sysfs sys sys
# apt --option=Dpkg::Options::=--force-confold \
--option=Dpkg::options::=--force-unsafe-io \
install mate-desktop-environment* xserver-xorg vim \
task-kde-desktop task-gnome-desktop
Then I updated the release from stretch to buster by changing the file /etc/apt/sources.list.
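For example (a sketch; the actual mirror lines may differ):

# sed -i 's/stretch/buster/g' /etc/apt/sources.list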
Then I downloaded the packages for the dist-upgrade:
# apt-get update
# apt-get --download-only dist-upgrade
Then I created a tar of this image.
Before the dist-upgrade, the space used was about 7GB with 2281 packages
installed. After the dist-upgrade, the space used was 9GB with 2870 packages.
The upgrade installed/updated about 2251 packages.
[2] The command was a bit more complex, in order to avoid an interactive session:
# mkfs.btrfs -m single -d single /dev/sdX
# mount /dev/sdX test/
# cd test
# time tar xzf ../image.tgz
# chroot .
# mount -t proc proc proc
# mount -t sysfs sys sys
# export DEBIAN_FRONTEND=noninteractive
# time apt-get -y --option=Dpkg::Options::=--force-confold \
--option=Dpkg::options::=--force-unsafe-io dist-upgrade
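For the ssd+hdd scenario, the filesystem presumably spanned both devices and
was mounted with the new option, e.g. (a sketch; device names are placeholders,
and the option name follows this version of the patch):

# mkfs.btrfs -m single -d single /dev/sd<ssd> /dev/sd<hdd>
# mount -o ssd_metadata /dev/sd<ssd> test/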
BR
G.Baroncelli
--
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5