On 24 December 2019 13:40:56, "Austin S. Hemmelgarn" <ahferroin7@xxxxxxxxx>
wrote:
On 2019-12-23 21:04, Wang Shilong wrote:
On Tue, Dec 24, 2019 at 7:38 AM Hans van Kranenburg <hans@xxxxxxxxxxx> wrote:
Hi Stéphane,
On 12/23/19 2:44 PM, Stéphane Lesimple wrote:
Has it ever been considered to implement a feature so that metadata
chunks would always be allocated on a given subset of the disks in a btrfs
filesystem?
Yes, many times.
I implemented it locally before, for my own testing.
As metadata use can be intensive and some operations are known to be slow
(such as backref walking), I'm under the (maybe wrong) impression that
having a set of small ssd's just for the metadata would give quite a boost
to a filesystem. Maybe even make qgroups more usable with volumes having 10
snapshots?
No, it's not wrong. For bigger filesystems this would certainly help.
This could just be a preference set on the allocator,
Yes. Now, the big question is, how do we 'just' set this preference?
Be sure to take into account that the filesystem has no way to find out
by itself which disks are the SSDs. There's no easy way to discover this
in a running system.
No, there is an API for the filesystem to detect whether the underlying
device is an SSD or not.
Something like:
    if (!blk_queue_nonrot(q))
        fs_devices->rotating = 1;
Currently, btrfs treats the whole filesystem as rotational if any one of
its disks is rotational.
We might record how many disks are non-rotational, and make chunk
allocation try the SSDs first if possible.
This doesn't tell you that the device is an SSD though, just that it
reports to the kernel as non-rotational. For example, NBD devices
present as non-rotational by default, and in most cases you do _not_
want hot data on a network disk.
The important thing here is disk performance, not whether it's an SSD or
not. An SD card is non-rotational and solid-state, but on most systems
the performance is going to be sufficiently bad for BTRFS-type workloads
that it's almost useless for this type of thing.
That's a good point, which is why I think this kind of preference should be
set manually by the user on fs creation, on device add/replace or anytime
later with "btrfs device set allocator.hint.metadata always /tank".
Now, we might still want to add some autodetection routine candy to the
btrfs user space tool, or for mkfs.btrfs, albeit not enabled by default as
your counter-examples indicate. But that's entirely optional. A manual mode
would already be awesome.
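
For reference, here is a minimal sketch of what such an optional
autodetection hint could look like from user space, assuming it simply
reads the /sys/block/<disk>/queue/rotational sysfs attribute. The helper
names parse_rotational_flag() and device_is_rotational() are hypothetical
and not part of btrfs-progs. The result carries the same caveat raised
above: it only says how the device reports itself, not how fast it is.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: interpret the contents of
 * /sys/block/<disk>/queue/rotational, which is "0\n" or "1\n".
 * Returns 1 for rotational, 0 for non-rotational, -1 if unparseable.
 */
int parse_rotational_flag(const char *text)
{
    if (text == NULL)
        return -1;
    if (text[0] == '0' && (text[1] == '\n' || text[1] == '\0'))
        return 0;
    if (text[0] == '1' && (text[1] == '\n' || text[1] == '\0'))
        return 1;
    return -1;
}

/*
 * Hypothetical helper: read the flag for a whole-disk name such as "sda".
 * Returns 1, 0, or -1 if the attribute cannot be read or parsed.
 */
int device_is_rotational(const char *disk_name)
{
    char path[256];
    char buf[8] = {0};
    FILE *f;
    int n;

    assert(disk_name != NULL);

    n = snprintf(path, sizeof(path),
                 "/sys/block/%s/queue/rotational", disk_name);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    return parse_rotational_flag(buf);
}
```

A tool could call device_is_rotational("sda") for each member disk and
only suggest a default, leaving the final choice to the user.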
so that a 6-disk
raid1 FS with 4 spinning disks and 2 SSDs prefers to allocate metadata on
the SSDs rather than on the slow drives (falling back to the spinning disks
if the SSDs are full, with the possibility to rebalance later).
Would such a feature make sense?
Absolutely.
Hans
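
To make the "prefer some devices, fall back to the rest" idea discussed
above concrete, here is a rough user-space model in C. It is only a
sketch under stated assumptions: struct dev_info, the prefer_metadata
flag and pick_metadata_devices() are hypothetical names for illustration,
and none of this reflects the actual btrfs chunk allocator or its data
structures. Devices marked as preferred are tried first, and devices
without enough free space are skipped, which gives the fallback to the
spinning disks once the SSDs are full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-device record; not the btrfs in-kernel layout. */
struct dev_info {
    int id;
    bool prefer_metadata;   /* hint set manually by the user */
    uint64_t free_bytes;    /* unallocated space on the device */
};

/* Order: preferred devices first, then most free space, then lowest id. */
static int cmp_for_metadata(const void *a, const void *b)
{
    const struct dev_info *x = a, *y = b;

    if (x->prefer_metadata != y->prefer_metadata)
        return x->prefer_metadata ? -1 : 1;
    if (x->free_bytes != y->free_bytes)
        return x->free_bytes > y->free_bytes ? -1 : 1;
    return (x->id > y->id) - (x->id < y->id);
}

/*
 * Pick `copies` distinct devices for one metadata chunk of `chunk_bytes`.
 * Sorts `devs` in place. On success, stores the chosen ids in `out_ids`
 * (which must have room for `copies` entries) and returns 0. Returns -1
 * if fewer than `copies` devices have room; `out_ids` may then be
 * partially filled.
 */
int pick_metadata_devices(struct dev_info *devs, size_t ndevs,
                          size_t copies, uint64_t chunk_bytes,
                          int *out_ids)
{
    size_t picked = 0;

    assert(devs != NULL && out_ids != NULL);

    qsort(devs, ndevs, sizeof(*devs), cmp_for_metadata);
    for (size_t i = 0; i < ndevs && picked < copies; i++) {
        /* A full device is skipped, so allocation falls through
         * to the next candidate, preferred or not. */
        if (devs[i].free_bytes >= chunk_bytes)
            out_ids[picked++] = devs[i].id;
    }
    return picked == copies ? 0 : -1;
}
```

With 4 spinning disks and 2 SSDs flagged as preferred, a raid1 metadata
chunk (2 copies) lands on the two SSDs as long as both have room. Once
they fill up, the same call returns two spinning disks, and a later
rebalance could move those chunks back.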