- Subject: Re: [Lsf-pc] [LSF/MM TOPIC] a few storage topics
- From: Wu Fengguang <fengguang.wu@xxxxxxxxx>
- Date: Fri, 3 Feb 2012 20:55:43 +0800
- Cc: Andreas Dilger <adilger@xxxxxxxxx>, Andrea Arcangeli <aarcange@xxxxxxxxxx>, Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>, Jan Kara <jack@xxxxxxx>, Mike Snitzer <snitzer@xxxxxxxxxx>, linux-scsi@xxxxxxxxxxxxxxx, Christoph Hellwig <hch@xxxxxxxxxxxxx>, dm-devel@xxxxxxxxxx, "Loke, Chetan" <Chetan.Loke@xxxxxxxxxxxx>, Jeff Moyer <jmoyer@xxxxxxxxxx>, Boaz Harrosh <bharrosh@xxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, lsf-pc@xxxxxxxxxxxxxxxxxxxxxxxxxx, Chris Mason <chris.mason@xxxxxxxxxx>
- In-reply-to: <1327509623.2720.52.camel@menhir>
- References: <20120124190732.GH4387@shiny> <x49vco0kj5l.fsf@segfault.boston.devel.redhat.com> <20120124200932.GB20650@quack.suse.cz> <x49pqe8kgej.fsf@segfault.boston.devel.redhat.com> <20120124203936.GC20650@quack.suse.cz> <20120125032932.GA7150@localhost> <F6F2DEB8-F096-4A3B-89E3-3A132033BC76@dilger.ca> <1327502034.2720.23.camel@menhir> <D3F292ADF945FB49B35E96C94C2061B915A638A6@nsmail.netscout.com> <1327509623.2720.52.camel@menhir>
- Reply-to: device-mapper development <dm-devel@xxxxxxxxxx>
- User-agent: Mutt/1.5.20 (2009-06-14)
On Wed, Jan 25, 2012 at 04:40:23PM +0000, Steven Whitehouse wrote:
> Hi,
>
> On Wed, 2012-01-25 at 11:22 -0500, Loke, Chetan wrote:
> > > If the reason for not setting a larger readahead value is just that it
> > > might increase memory pressure and thus decrease performance, is it
> > > possible to use a suitable metric from the VM in order to set the value
> > > automatically according to circumstances?
> > >
> >
> > How about tracking heuristics for 'read-hits from previous read-aheads'? If the hits are in acceptable range(user-configurable knob?) then keep seeking else back-off a little on the read-ahead?
> >
> > > Steve.
> >
> > Chetan Loke
>
> I'd been wondering about something similar to that. The basic scheme
> would be:
>
> - Set a page flag when readahead is performed
> - Clear the flag when the page is read (or on page fault for mmap)
> (i.e. when it is first used after readahead)
>
> Then when the VM scans for pages to eject from cache, check the flag and
> keep an exponential average (probably on a per-cpu basis) of the rate at
> which such flagged pages are ejected. That number can then be used to
> reduce the max readahead value.
>
> The questions are whether this would provide a fast enough reduction in
> readahead size to avoid problems? and whether the extra complication is
> worth it compared with using an overall metric for memory pressure?
>
> There may well be better solutions though,
The caveat is that on a consistently thrashed machine, the readahead
size is better determined separately for each read stream.

Repeated readahead thrashing typically happens on a file server with
a large number of concurrent clients. For example, if there are 1000
read streams each doing 1MB readahead, then since each stream has 2
readahead windows, there could be up to 2GB of readahead pages, which
are sure to be thrashed on a server with only 1GB of memory.

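As a quick check of the arithmetic above, here is a minimal Python
sketch. The function name readahead_footprint and the constants are
illustrative only and are not taken from any kernel code.

```python
MB = 1 << 20
GB = 1 << 30

def readahead_footprint(streams, readahead_bytes, windows_per_stream=2):
    """Worst-case bytes of readahead pages held in the page cache."""
    return streams * readahead_bytes * windows_per_stream

# Numbers from the example: 1000 streams, 1MB readahead, 2 windows each.
footprint = readahead_footprint(1000, 1 * MB)
memory = 1 * GB

print(f"readahead footprint: {footprint / GB:.2f} GB")
print("exceeds memory:", footprint > memory)
```

The footprint comes out at roughly twice the memory of the server, so
a large share of the readahead pages cannot survive until they are
read.
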
Typically the 1000 clients will have different read speeds. A few of
them will be doing 1MB/s, while most others may be doing 100KB/s. In
this case, we should only decrease the readahead size for the 100KB/s
clients. The 1MB/s clients actually won't see readahead thrashing at
all, and we'll want them to do large 1MB I/O to achieve good disk
utilization.

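One way to see why only the slow streams thrash is a toy model in
Python. It assumes (this is not stated in the thread) that an unused
page survives in the page cache for some fixed lifetime before being
reclaimed; the 5 second value and all names below are hypothetical.

```python
KB = 1 << 10
MB = 1 << 20

def worst_case_wait_seconds(readahead_bytes, read_rate, windows=2):
    """Longest time a readahead page may sit in cache before being read."""
    return windows * readahead_bytes / read_rate

def is_thrashed(readahead_bytes, read_rate, page_lifetime_s):
    """True if readahead pages may be reclaimed before the stream reads them."""
    return worst_case_wait_seconds(readahead_bytes, read_rate) > page_lifetime_s

PAGE_LIFETIME_S = 5.0  # hypothetical cache lifetime under memory pressure

fast = is_thrashed(1 * MB, 1 * MB, PAGE_LIFETIME_S)    # 1MB/s client
slow = is_thrashed(1 * MB, 100 * KB, PAGE_LIFETIME_S)  # 100KB/s client
print("1MB/s client thrashed:", fast)
print("100KB/s client thrashed:", slow)
```

Under these assumptions the 1MB/s client consumes its two windows in
about 2 seconds and is safe, while the 100KB/s client needs about 20
seconds and loses pages before it gets to them.
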
So we need something better than the "global feedback" scheme, and we
do have such a solution ;) As said in my other email, the number of
history pages remaining in the page cache is a good estimate of that
particular read stream's thrashing-safe readahead size.

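The per-stream estimate can be illustrated with a simplified Python
model. It is a sketch of the idea only, not the kernel implementation:
the page cache is modelled as a set of cached page indices for one
file, and the function names and the max_readahead cap are
assumptions made for the example.

```python
def count_history_pages(cached, offset, max_scan):
    """Count contiguous cached pages immediately before `offset`."""
    count = 0
    index = offset - 1
    while index >= 0 and count < max_scan and index in cached:
        count += 1
        index -= 1
    return count

def safe_readahead_pages(cached, offset, max_readahead):
    """Estimate a thrashing-safe readahead size (in pages) for one stream."""
    return count_history_pages(cached, offset, max_readahead)

# Slow stream: only the last 10 pages it read are still cached.
slow_cache = set(range(90, 100))
print(safe_readahead_pages(slow_cache, offset=100, max_readahead=256))

# Fast stream: everything it read so far is still cached.
fast_cache = set(range(0, 1000))
print(safe_readahead_pages(fast_cache, offset=1000, max_readahead=256))
```

The reasoning behind the estimate, as read from the text above: the
pages a stream has already read are older than any page it is about to
read ahead, so if N of them are still cached, a readahead of up to N
pages should also survive until the stream reaches it.
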
Thanks,
Fengguang
--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel