On 2016-05-19 19:23, Henk Slager wrote:
On Thu, May 19, 2016 at 8:51 PM, Austin S. Hemmelgarn
<ahferroin7@xxxxxxxxx> wrote:
On 2016-05-19 14:09, Kai Krakow wrote:
On Wed, 18 May 2016 22:44:55 +0000 (UTC),
Ferry Toth <ftoth@xxxxxxxxxxxxxx> wrote:
On Tue, 17 May 2016 20:33:35 +0200, Kai Krakow wrote:
Bcache is actually low maintenance, with no knobs to turn. Converting to
bcache protective superblocks is a one-time procedure which can be done
online. The bcache devices act as normal HDDs if not attached to a
caching SSD. It's really less pain than you may think. And it's a
solution available now. Converting back later is easy: just detach the
HDDs from the SSDs and use them for some other purpose if you feel like
it later. Having the bcache protective superblock still in place doesn't
hurt then. Bcache is a no-op without a caching device attached.
No, bcache is _almost_ a no-op without a caching device. From a userspace
perspective, it does nothing, but it is still another layer of indirection
in the kernel, which does have a small impact on performance. The same is
true of using LVM with a single volume taking up the entire partition: it
looks almost no different from just using the partition, but it will perform
worse than using the partition directly. I've actually done profiling of
both to figure out base values for the overhead, and while bcache with no
cache device is not as bad as the LVM example, it can still be a roughly
0.5-2% slowdown (it gets more noticeable the faster your backing storage
is).
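
As a rough illustration of how a slowdown figure like the one above can be
derived from two throughput measurements, here is a minimal sketch. The
function name and the numbers in the usage line are placeholders chosen for
the example, not measurements from this thread.

```python
def slowdown_percent(direct_mb_s, layered_mb_s):
    """Relative slowdown of a layered device versus the bare partition.

    direct_mb_s:  throughput measured on the partition directly
    layered_mb_s: throughput measured through the extra layer
                  (e.g. bcache with no cache device attached)
    """
    if direct_mb_s <= 0:
        raise ValueError("baseline throughput must be positive")
    return (direct_mb_s - layered_mb_s) / direct_mb_s * 100.0


# Placeholder values for illustration only, not real benchmark results.
print(f"{slowdown_percent(200.0, 198.0):.2f}% slower")
```

The same relative overhead costs more absolute throughput as the backing
device gets faster, which is one way to read the remark that the effect is
more noticeable on fast storage.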
You also lose the ability to mount that filesystem directly on a kernel
without bcache support (this may or may not be an issue for you).
The bcache (protective) superblock is in an 8KiB block in front of the
file system device. In case the current, non-bcached HDDs use modern
partitioning, you can do a 5-minute remove or add of bcache, without
moving/copying filesystem data. So in case you have a bcache-formatted
HDD that had just 1 primary partition (512 byte logical sectors), the
partition start is at sector 2048 and the filesystem start is at 2064
(8KiB is 16 sectors of 512 bytes).
Hard removing bcache (so making sure the module is not
needed/loaded/used the next boot) can be done by changing the
start-sector of the partition from 2048 to 2064. In gdisk one has to
change the alignment to 16 first, otherwise it refuses. And of
course, also first flush+stop+de-register bcache for the HDD.
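
The sector arithmetic behind the 2048 to 2064 change can be sketched as
follows. This only computes the numbers; it does not touch any disk. The
function names are made up for the example, and the 8KiB offset and 512-byte
sector size are the values stated above, so check them against your own
device before editing a partition table.

```python
SECTOR_SIZE = 512               # logical sector size in bytes (from the example above)
BCACHE_OFFSET_BYTES = 8 * 1024  # 8KiB reserved in front of the filesystem


def bcache_offset_sectors(offset_bytes=BCACHE_OFFSET_BYTES,
                          sector_size=SECTOR_SIZE):
    """Number of sectors occupied by the area in front of the filesystem."""
    if offset_bytes % sector_size:
        raise ValueError("offset is not a whole number of sectors")
    return offset_bytes // sector_size


def start_without_bcache(partition_start, offset_bytes=BCACHE_OFFSET_BYTES,
                         sector_size=SECTOR_SIZE):
    """New partition start so the partition begins at the filesystem itself."""
    return partition_start + bcache_offset_sectors(offset_bytes, sector_size)


print(bcache_offset_sectors())     # 16
print(start_without_bcache(2048))  # 2064
```

The 16-sector result is also why the gdisk alignment has to be lowered to 16
before it accepts 2064 as a start sector.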
The other way around is also possible, i.e. changing the start-sector
from 2048 to 2032. So that makes adding bcache to an existing
filesystem a 5-minute action and not a GB- or TB-sized copy action. It
is not online of course, but just one reboot is needed (or just umount,
gdisk, partprobe, add bcache etc).
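
For the reverse direction, a minimal sketch of the check one would want
before moving the start sector down: the 16 sectors in front of the
filesystem must actually be free. The function name and the
`first_free_sector` parameter are hypothetical; what counts as "free" depends
on the partition table and on whether a bootloader is embedded there, so the
value passed in the usage line is only an illustration.

```python
def start_with_bcache(fs_start, first_free_sector,
                      offset_bytes=8 * 1024, sector_size=512):
    """Partition start that leaves room for bcache in front of an existing FS.

    fs_start:          sector where the existing filesystem begins
    first_free_sector: lowest sector not used by the partition table,
                       a bootloader, or another partition (assumed known)
    """
    if offset_bytes % sector_size:
        raise ValueError("offset is not a whole number of sectors")
    new_start = fs_start - offset_bytes // sector_size
    if new_start < first_free_sector:
        raise ValueError("not enough free space in front of the filesystem")
    return new_start


# Illustrative value for first_free_sector; verify it for your own disk.
print(start_with_bcache(2048, first_free_sector=63))  # 2032
```

If the check fails, the in-place approach does not apply and the data would
have to be moved instead.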
For RAID setups, one could just do 1 HDD first.
My argument about the overhead was not about the superblock, it was
about the bcache layer itself. It isn't practical to just access the
data directly if you plan on adding a cache device, because then you
couldn't do so online unless you're going through bcache. This extra
layer of indirection in the kernel does add overhead, regardless of the
on-disk format.
Secondarily, having a HDD with just one partition is not a typical use
case, and that argument about the slack space resulting from the 1M
alignment only holds true if you're using an MBR instead of a GPT layout
(or for that matter, almost any other partition table format), and
you're not booting from that disk (because GRUB embeds itself there).
It's also fully possible to have an MBR formatted disk which doesn't
have any spare space there too (which is how most flash drives get
formatted).
This also doesn't change the fact that without careful initial
formatting (it is possible on some filesystems to embed the bcache SB at
the beginning of the FS itself, many of them have some reserved space at
the beginning of the partition for bootloaders, and this space doesn't
have to exist when mounting the FS) or manual alteration of the
partition, it's not possible to mount the FS on a system without bcache
support.
There is also a tool doing the conversion in-place (I haven't used it
myself, my python(s) had trouble; I could do the partition table edit
much faster/easier):
https://github.com/g2p/blocks#bcache-conversion
I actually hadn't known about this tool, thanks for mentioning it.