- To: "Loke, Chetan" <Chetan.Loke@xxxxxxxxxxxx>
- Subject: Re: [Lsf-pc] [dm-devel] [LSF/MM TOPIC] a few storage topics
- From: Wu Fengguang <fengguang.wu@xxxxxxxxx>
- Date: Fri, 3 Feb 2012 20:37:53 +0800
- Cc: Andrea Arcangeli <aarcange@xxxxxxxxxx>, Chris Mason <chris.mason@xxxxxxxxxx>, James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx>, Steven Whitehouse <swhiteho@xxxxxxxxxx>, Andreas Dilger <adilger@xxxxxxxxx>, Jan Kara <jack@xxxxxxx>, Mike Snitzer <snitzer@xxxxxxxxxx>, linux-scsi@xxxxxxxxxxxxxxx, neilb@xxxxxxx, dm-devel@xxxxxxxxxx, Christoph Hellwig <hch@xxxxxxxxxxxxx>, linux-mm@xxxxxxxxx, Jeff Moyer <jmoyer@xxxxxxxxxx>, Boaz Harrosh <bharrosh@xxxxxxxxxxx>, linux-fsdevel@xxxxxxxxxxxxxxx, lsf-pc@xxxxxxxxxxxxxxxxxxxxxxxxxx, "Darrick J.Wong" <djwong@xxxxxxxxxx>, Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
- In-reply-to: <D3F292ADF945FB49B35E96C94C2061B915A64111@nsmail.netscout.com>
- User-agent: Mutt/1.5.20 (2009-06-14)
On Thu, Jan 26, 2012 at 11:40:47AM -0500, Loke, Chetan wrote:
> > From: Andrea Arcangeli [mailto:aarcange@xxxxxxxxxx]
> > Sent: January 25, 2012 5:46 PM
>
> ....
>
> > Way more important is to have feedback on the readahead hits and be
> > sure when readahead is raised to the maximum the hit rate is near 100%
> > and fallback to lower readaheads if we don't get that hit rate. But
> > that's not a VM problem and it's a readahead issue only.
> >
>
> A quick google showed up - http://kerneltrap.org/node/6642
>
> Interesting thread to follow. I haven't looked further as to what was
> merged and what wasn't.
>
> A quote from the patch - " It works by peeking into the file cache and
> check if there are any history pages present or accessed."
> Now I don't understand anything about this but I would think digging the
> file-cache isn't needed(?). So, yes, a simple RA hit-rate feedback could
> be fine.
>
> And 'maybe' for adaptive RA just increase the RA-blocks by '1'(or some
> N) over period of time. No more smartness. A simple 10 line function is
> easy to debug/maintain. That is, a scaled-down version of
> ramp-up/ramp-down. Don't go crazy by ramping-up/down after every RA(like
> SCSI LLDD madness). Wait for some event to happen.
>
> I can see where Andrew Morton's concerns could be(just my
> interpretation). We may not want to end up like a protocol state machine
> code: tcp slow-start, then increase , then congestion, then let's
> back-off. hmmm, slow-start is a problem for my business logic, so let's
> speed-up slow-start ;).
Loke,
Thrashing safe readahead can work as simple as:
readahead_size = min(nr_history_pages, MAX_READAHEAD_PAGES)
No need for more slow-start or back-off magics.
This is because nr_history_pages is a lower estimation of the threshing
threshold:
chunk A chunk B chunk C head
l01 l11 l12 l21 l22
| |-->|-->| |------>|-->| |------>|
| +-------+ +-----------+ +-------------+ |
| | # | | # | | # | |
| +-------+ +-----------+ +-------------+ |
| |<==============|<===========================|<============================|
L0 L1 L2
Let f(l) = L be a map from
l: the number of pages read by the stream
to
L: the number of pages pushed into inactive_list in the mean time
then
f(l01) <= L0
f(l11 + l12) = L1
f(l21 + l22) = L2
...
f(l01 + l11 + ...) <= Sum(L0 + L1 + ...)
<= Length(inactive_list) = f(thrashing-threshold)
So the count of continuous history pages left in inactive_list is always a
lower estimation of the true thrashing-threshold. Given a stable workload,
the readahead size will keep ramping up and then stabilize in range
(thrashing_threshold/2, thrashing_threshold)
Thanks,
Fengguang
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
[SCSI Target Devel]
[Linux SCSI Target Infrastructure]
[Kernel Newbies]
[Share Photos]
[IDE]
[Security]
[Git]
[Netfilter]
[Bugtraq]
[Photos]
[Yosemite]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Linux ATA RAID]
[Linux IIO]
[Samba]
[Video 4 Linux]
[Device Mapper]
[Linux Resources]