Re: [PATCH V3 0/2] memcg softlimit reclaim rework

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]


On Thu, Apr 19, 2012 at 10:04 AM, Michal Hocko <mhocko@xxxxxxx> wrote:
> On Wed 18-04-12 11:00:40, Ying Han wrote:
>> On Wed, Apr 18, 2012 at 5:24 AM, Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>> > On Tue, Apr 17, 2012 at 09:37:46AM -0700, Ying Han wrote:
>> >> The "soft_limit" was introduced in memcg to support over-committing the
>> >> memory resource on the host. Each cgroup configures its "hard_limit" where
>> >> it will be throttled or OOM killed by going over the limit. However, the
>> >> cgroup can go above the "soft_limit" as long as there is no system-wide
>> >> memory contention. So, the "soft_limit" is the kernel mechanism for
>> >> re-distributing system spare memory among cgroups.
>> >>
>> >> This patch reworks the softlimit reclaim by hooking it into the new global
>> >> reclaim scheme. So the global reclaim path including direct reclaim and
>> >> background reclaim will respect the memcg softlimit.
>> >>
>> >> v3..v2:
>> >> 1. rebase the patch on 3.4-rc3
>> >> 2. squash the commits of replacing the old implementation with new
>> >> implementation into one commit. This is to make sure to leave the tree
>> >> in stable state between each commit.
>> >> 3. removed the commit which changes the nr_to_reclaim for global reclaim
>> >> case. The need of that patch is not obvious now.
>> >>
>> >> Note:
>> >> 1. the new implementation of softlimit reclaim is rather simple and first
>> >> step for further optimizations. there is no memory pressure balancing between
>> >> memcgs for each zone, and that is something we would like to add as follow-ups.
>> >>
>> >> 2. this patch is slightly different from the last one posted from Johannes
>> >> http://comments.gmane.org/gmane.linux.kernel.mm/72382
>> >> where his patch is closer to the reverted implementation by doing hierarchical
>> >> reclaim for each selected memcg. However, that is not expected behavior from
>> >> user perspective. Considering the following example:
>> >>
>> >> root (32G capacity)
>> >> --> A (hard limit 20G, soft limit 15G, usage 16G)
>> >>    --> A1 (soft limit 5G, usage 4G)
>> >>    --> A2 (soft limit 10G, usage 12G)
>> >> --> B (hard limit 20G, soft limit 10G, usage 16G)
>> >>
>> >> Under global reclaim, we shouldn't add pressure on A1 although its parent(A)
>> >> exceeds softlimit. This is what admin expects by setting softlimit to the
>> >> actual working set size and only reclaim pages under softlimit if system has
>> >> trouble to reclaim.
>> >
>> > Actually, this is exactly what the admin expects when creating a
>> > hierarchy, because she defines that A1 is a child of A and is
>> > responsible for the memory situation in its parent.
>
> Hmm, I guess that both approaches have cons and pros.
> * Hierarchical soft limit reclaim - reclaim the whole subtree of the over
>  soft limit memcg
>  + it is consistent with the hard limit reclaim
Not sure why we want them to be consistent. Soft_limit is serving
different purpose and the one of the main purpose is to preserve the
working set of the cgroup.

>  + easier for top to bottom configuration - especially when you allow
>    subgroups to create deeper hierarchies. Does anybody do that?

As far as I heard, most (if not all) are using flat configuration
where everything is running under root.

>  - harder to set up if soft limit should act as a guarantee - might lead
>    to an unexpected reclaim.
>
> * Targeted soft limit reclaim - only reclaim LRUs of over limit memcgs
>  + easier to set up for the working set guarantee because admin can focus
>    on the working set of a single group and not the whole hierarchy
This is true.

>  - easier to construct soft unreclaimable hierarchies - whole subtree
>    contributes but nobody wants to take the responsibility when we reach
>    the limit.
>
> Both approaches don't play very well with the default 0 limit because we
> either reclaim unless we set up the whole hierarchy properly or we just
> burn cycles by trying to reclaim groups wit no or only few pages.

Setting the default to 0 is a good optimization which makes everybody
to be eligible for reclaim if admin doesn't do anything.

In reality, if admin want to preserve working set of cgroups and
he/she has to set the softlimit. By doing that, it is easier to only
focus on the cgroup itself without looking up its ancestors.

> The second approach leads to more expected results though because we do
> not touch "leaf" groups unless they are over limit.
> I have to think about that some more but it seems that the second approach
> is much easier to implement and matches the "guarantee" expectations
> more.

Agree.

> I guess we could converge both approaches if we could reclaim from the
> leaf groups upwards to the root but I didn't think about this very much.

That is what the current patch does, which only consider softlimit
under global pressure :)

--Ying
>
> [...]
> --
> Michal Hocko
> SUSE Labs
> SUSE LINUX s.r.o.
> Lihovarska 1060/12
> 190 00 Praha 9
> Czech Republic

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href


[Site Home]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [ECOS]     [Tools]     [DDR & Rambus]     [Asterisk Internet PBX]     [Linux API]     [Monitors]

Add to Google Google PageRank Checking tool