- To: Chris Mason <chris.mason@xxxxxxxxxx>, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>, "Loke, Chetan" <Chetan.Loke@xxxxxxxxxxxx>, Steven Whitehouse <swhiteho@xxxxxxxxxx>, Andreas Dilger <adilger@xxxxxxxxx>, Andrea Arcangeli <aarcange@xxxxxxxxxx>, Jan Kara <jack@xxxxxxx>, Mike Snitzer <snitzer@xxxxxxxxxx>, linux-scsi@xxxxxxxxxxxxxxx, neilb@xxxxxxx, dm-devel@xxxxxxxxxx, Christoph Hellwig <hch@xxxxxxxxxxxxx>, linux-mm@xxxxxxxxx, Jeff Moyer <jmoyer@xxxxxxxxxx>, Wu Fengguang <fengguang.wu@xxxxxxxxx>, Boaz Harrosh <bharrosh@xxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, lsf-pc@xxxxxxxxxxxxxxxxxxxxxxxxxx, "Darrick J.Wong" <djwong@xxxxxxxxxx>
- Subject: Re: [Lsf-pc] [dm-devel] [LSF/MM TOPIC] a few storage topics
- From: Dave Chinner <david@xxxxxxxxxxxxx>
- Date: Fri, 27 Jan 2012 09:38:01 +1100
- In-reply-to: <20120125200613.GH15866@shiny>
- User-agent: Mutt/1.5.21 (2010-09-15)
On Wed, Jan 25, 2012 at 03:06:13PM -0500, Chris Mason wrote:
> On Wed, Jan 25, 2012 at 12:37:48PM -0600, James Bottomley wrote:
> > On Wed, 2012-01-25 at 13:28 -0500, Loke, Chetan wrote:
> > > > So there are two separate problems mentioned here. The first is to
> > > > ensure that readahead (RA) pages are treated as more disposable than
> > > > accessed pages under memory pressure and then to derive a statistic for
> > > > futile RA (those pages that were read in but never accessed).
> > > >
> > > > The first sounds really like it's an LRU thing rather than adding yet
> > > > another page flag. We need a position in the LRU list for never
> > > > accessed ... that way they're first to be evicted as memory pressure
> > > > rises.
> > > >
> > > > The second is you can derive this futile readahead statistic from the
> > > > LRU position of unaccessed pages ... you could keep this globally.
> > > >
> > > > Now the problem: if you trash all unaccessed RA pages first, then when,
> > > > say, playing a movie under moderate memory pressure, we do the RA,
> > > > trash the RA pages, and then have to re-read them to display to the
> > > > user, resulting in an undesirable uptick in read I/O.
> > > >
> > > > Based on the above, it sounds like a better heuristic would be to evict
> > > > accessed clean pages at the top of the LRU list before unaccessed clean
> > > > pages because the expectation is that the unaccessed clean pages will
> > > > be accessed (that is, after all, why we did the readahead). As RA pages age
> > >
> > > Well, the movie example is one case where evicting unaccessed pages may not be the right thing to do. But what about a workload that performs a random one-shot search?
> > > Once the search is done, the RA'd blocks are of no use anymore. So it seems a solution for one would hurt the other.
> >
> > Well, not really: RA is always wrong for random reads. The whole purpose
> > of RA rests on the assumption of sequential access patterns.
>
> Just to jump back, Jeff's benchmark that started this (on xfs and ext4):
>
> - buffered 1MB reads get down to the scheduler in 128KB chunks
>
> The really hard part about readahead is that you don't know what
> userland wants. In Jeff's test, he's telling the kernel he wants 1MB
> ios and our RA engine is doing 128KB ios.
>
> We can talk about scaling up how big the RA windows get on their own,
> but if userland asks for 1MB, we don't have to worry about futile RA, we
> just have to make sure we don't oom the box trying to honor 1MB reads
> from 5000 different procs.

Right - if we know the read request is larger than the RA window,
then we should ignore the RA window and just service the request in
a single bio. Well, at least, in chunks as large as the underlying
device will allow us to build....
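
As a rough sketch of that sizing rule (plain userspace C, not actual kernel code; the function names and the idea of describing the device limit as a single byte count are assumptions made purely for illustration):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel function: pick the size of each I/O
 * issued for a buffered read of req_bytes.
 *
 *   ra_window  - current readahead window, in bytes
 *   dev_max_io - largest single I/O the device accepts, in bytes
 */
static size_t read_chunk_size(size_t req_bytes, size_t ra_window,
                              size_t dev_max_io)
{
    size_t chunk;

    assert(dev_max_io > 0);

    /* A request larger than the RA window tells us what userland wants,
     * so the window no longer limits the I/O size. */
    chunk = (req_bytes > ra_window) ? req_bytes : ra_window;

    /* Never build an I/O larger than the device will accept. */
    if (chunk > dev_max_io)
        chunk = dev_max_io;

    return chunk;
}

/* Number of I/Os needed to cover the request at that chunk size. */
static size_t read_io_count(size_t req_bytes, size_t ra_window,
                            size_t dev_max_io)
{
    size_t chunk;

    if (req_bytes == 0)
        return 0;

    chunk = read_chunk_size(req_bytes, ra_window, dev_max_io);
    return (req_bytes + chunk - 1) / chunk;   /* round up */
}
```

With a 1MB request, a 128KB RA window and an assumed 512KB device limit, this gives two 512KB I/Os instead of the eight 128KB I/Os that the window alone would produce. A request that fits inside the window is still sized by the window.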

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx