On 2017-04-10 14:18, Kai Krakow wrote:
On Mon, 10 Apr 2017 13:13:39 -0400,
"Austin S. Hemmelgarn" <ahferroin7@xxxxxxxxx> wrote:
On 2017-04-10 12:54, Kai Krakow wrote:
On Mon, 10 Apr 2017 18:44:44 +0200,
Kai Krakow <hurikhan77@xxxxxxxxx> wrote:
On Mon, 10 Apr 2017 08:51:38 -0400,
"Austin S. Hemmelgarn" <ahferroin7@xxxxxxxxx> wrote:
[...]
[...]
[...]
[...]
[...]
Did you put it in /etc/fstab only for the rootfs? If yes, it
probably has no effect. You would need to give it as rootflags on
the kernel cmdline.
I did a "fgrep lazytime /usr/src/linux -ir" and it reveals only ext4
and f2fs know the flag. Kernel 4.10.
So you're probably seeing a placebo effect. If you put lazytime for the
rootfs only in fstab, it won't have any effect: on the initial mount
this file cannot be opened (for obvious reasons), and on remount btrfs
seems to happily accept lazytime without it doing anything. It won't
show up in /proc/mounts. Try using it in rootflags on the kernel cmdline
and you should see that the kernel won't accept the lazytime flag.
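The /proc/mounts check described above can be sketched in a few lines of Python. This is only an illustration: the function names and the SAMPLE text are made up for the example, and on a real system you would read /proc/mounts instead.

```python
# Hypothetical helper names; SAMPLE is invented /proc/mounts-style text.
SAMPLE = (
    "/dev/sda2 / btrfs rw,noatime,space_cache,subvolid=5,subvol=/ 0 0\n"
    "/dev/sda1 /boot ext4 rw,relatime,lazytime 0 0\n"
)


def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point, or None if it is not mounted.

    Each line has the fields: device, mount point, fstype, options, dump, pass.
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Keep scanning: a later entry for the same mount point wins.
            options = fields[3].split(",")
    return options


def has_option(mounts_text, mount_point, option):
    """True if the option is listed as active for mount_point."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and option in options


print("lazytime on /:", has_option(SAMPLE, "/", "lazytime"))
print("lazytime on /boot:", has_option(SAMPLE, "/boot", "lazytime"))
```

To check the running system, pass the contents of /proc/mounts (for example open("/proc/mounts").read()) in place of SAMPLE.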
The kernel command-line also rejects a number of perfectly legitimate
options that BTRFS does understand, though, so that's not much of a
test.
Which are those? I didn't encounter any...
I'm not sure there are any anymore, but I know that a handful (mostly
really uncommon ones) used to be rejected (and BTRFS is not alone in
this respect, some of the more esoteric ext4 options aren't accepted on
the kernel command-line either). I know at a minimum that at some point
in the past alloc_start, check_int, and inode_cache did not work from
the kernel command-line.
I've just finished some quick testing though, and it looks
like you're right, BTRFS does not support this, which means I now
need to figure out what the hell was causing the IOPS counters in
collectd to change in rough correlation with remounting (especially
since it appears to happen mostly independent of the options being
changed).
I think that noatime (which I remember you also used?), lazytime, and
relatime are mutually exclusive: they all handle the inode updates.
Maybe that is the effect you see?
They're not exactly exclusive. The lazytime option will prevent changes
to the mtime or atime fields of a file from forcing inode write-out for
up to 24 hours, but it does not change the value of the timestamps. If
the inode would be written out for some other reason (such as a
file-size change or the inode being evicted from the cache), then the
timestamps will be written too. So if you have lazytime enabled and use
touch to update the mtime on an otherwise idle file, the mtime will
still be correct as far as userspace is concerned, as long as you don't
crash before the update hits the disk (but userspace will only see the
discrepancy _after_ the crash).
By comparison, relatime causes the atime not to be updated at all if it
has changed in the last 24 hours, and noatime completely prevents atime
updates. In both cases, the atime isn't correct at all in userspace as
far as POSIX is concerned.
So, you have the following combinations:
* strictatime, nolazytime: Both atime and mtime updates happen, and are
flushed to disk (almost) immediately.
* relatime, nolazytime (the upstream default): atime updates happen only
if the atime hasn't changed in 24 hours, mtime updates happen as normal,
and both types of update are flushed to disk (almost) immediately.
* noatime, nolazytime (the default on some specific kernels; this is
easy to patch, so a lot of people who already carry custom patches and
don't use mutt patch it in): atime updates never happen, mtime updates
happen as normal and are flushed to disk (almost) immediately.
* strictatime, lazytime: Both atime and mtime updates happen, but the
actual update may not hit the disk for up to 24 hours (this will let
mutt work correctly as long as your system shuts down cleanly, but still
improve performance noticeably on at least ext4).
* relatime, lazytime: atime updates happen only if the atime hasn't
changed in 24 hours, mtime updates happen as normal, and both may not
hit the disk for up to 24 hours.
* noatime, lazytime (what I'm trying to run): atime updates never
happen, mtime updates happen as normal, but may not hit the disk for up
to 24 hours.
In essence, lazytime only impacts inode writeback (deferring it under
special circumstances), while {no,rel,strict}atime impacts the actual
value of the time-stamps.
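The six combinations above can be written down as a small model. This is a sketch of the behaviour as described in this mail, not of the kernel code: the function names are made up, and the relatime rule is the simplified "24 hours" version (the real kernel also updates the atime when it is not newer than the mtime or ctime).

```python
# Toy model of the combinations listed above; all names are hypothetical.
DAY = 24 * 60 * 60  # seconds


def atime_updated(atime_mode, now, old_atime):
    """Does a read at time `now` change the in-memory atime?"""
    if atime_mode == "strictatime":
        return True
    if atime_mode == "noatime":
        return False
    if atime_mode == "relatime":
        # Simplified rule from this thread: only if the atime is a day old.
        return now - old_atime >= DAY
    raise ValueError("unknown atime mode: %r" % atime_mode)


def describe(atime_mode, lazytime, now, old_atime):
    """Summarise what happens to the timestamps for one combination."""
    return {
        "atime_updated": atime_updated(atime_mode, now, old_atime),
        "mtime_updated": True,  # none of these options suppress mtime updates
        # lazytime changes only when the update reaches the disk.
        "writeback": "deferred up to 24h" if lazytime else "almost immediate",
    }


for mode in ("strictatime", "relatime", "noatime"):
    for lazy in (False, True):
        print(mode, "lazytime" if lazy else "nolazytime",
              describe(mode, lazy, now=3600, old_atime=0))
```

The point of the split into two functions is the same as in the summary above: the atime mode decides the value of the timestamps, lazytime decides only when they get written back.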
This is somewhat disappointing though, as supporting this would
probably help with the write-amplification issues inherent in COW
filesystems.
Well, relatime is mostly the same, and so doesn't perfectly follow the
POSIX standard either. I think the only software that relies on atime
is mutt...
This very much depends on what you're doing. If you have a WORM
workload, then yeah, it's pretty much the same. If however you have
something like a database workload where a specific set of files get
internally rewritten regularly, then it actually has a measurable impact.
As a very specific example, I run collectd on my systems using RRD files
as data storage. An RRD file is essentially a really fancy circular
buffer, so it remains fixed size but gets a _lot_ of internal rewrites
(by the way, if anyone wants to test fragmentation behavior on BTRFS,
RRD files are a great way to do it). Because of how I have things set
up, each file gets a batch of data points every 1-2 minutes. This in
turn means that the mtime is updating every 1-2 minutes for each of the
1000+ RRD files. In this case, writing out the timestamps results in an
overhead of roughly 256 bytes per file, which is about 0.15% based on
the average file size of roughly 169k. If I use noatime on this
filesystem, then it has near zero impact because the average number of
times per hour that these files are read is near zero. Turning on
lazytime, however, results in mtime updates getting deferred until the
hourly forced fssync for this filesystem hits (this is something I'm
doing, not the OS). That reduces the overhead by a factor of roughly 45
(the average number of writes per file per hour) to about 0.003%, which
is a pretty serious difference.
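For anyone who wants to redo the overhead estimate above with their own numbers, here is the arithmetic as a short Python sketch. The constant names are made up, and reading "169k" as 169 KiB is an assumption.

```python
# Overhead estimate from the figures in this mail; names are hypothetical.
TIMESTAMP_WRITE_BYTES = 256      # extra bytes written per timestamp update
AVG_FILE_BYTES = 169 * 1024      # assumption: "169k" means 169 KiB
WRITES_PER_FILE_PER_HOUR = 45    # one batch of data every 1-2 minutes


def as_percent(fraction):
    """Convert a fraction (0.01) to a percentage (1.0)."""
    return fraction * 100.0


# Without lazytime, every write pays the timestamp overhead.
overhead_per_write = TIMESTAMP_WRITE_BYTES / AVG_FILE_BYTES

# With lazytime plus an hourly forced sync, the timestamps are written
# once per hour instead of once per write.
overhead_deferred = overhead_per_write / WRITES_PER_FILE_PER_HOUR

print("without lazytime: %.3f%%" % as_percent(overhead_per_write))
print("with lazytime:    %.4f%%" % as_percent(overhead_deferred))
```

As a fraction the deferred figure is about 0.00003, which is roughly 0.003% when expressed as a percentage.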