Re: [PATCH -V6 07/14] memcg: Add HugeTLB extension

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 16 Apr 2012, Aneesh Kumar K.V wrote:

> This patch implements a memcg extension that allows us to control HugeTLB
> allocations via memory controller. The extension allows to limit the
> HugeTLB usage per control group and enforces the controller limit during
> page fault. Since HugeTLB doesn't support page reclaim, enforcing the limit
> at page fault time implies that, the application will get SIGBUS signal if it
> tries to access HugeTLB pages beyond its limit. This requires the application
> to know beforehand how much HugeTLB pages it would require for its use.
> 
> The charge/uncharge calls will be added to HugeTLB code in later patch.
> Support for memcg removal will be added in later patches.
> 

Again, I disagree with this approach because it's adding the functionality 
to memcg when it's unnecessary; it would be a complete legitimate usecase 
to want to limit the number of globally available hugepages to a set of 
tasks without incurring the per-page tracking from memcg.

This can be implemented as a seperate cgroup and as we move to a single 
hierarchy, you lose no functionality if you mount both cgroups from what 
is done here.

It would be much cleaner in terms of

 - build: not requiring ifdefs and dependencies on CONFIG_HUGETLB_PAGE, 
   which is a prerequisite for this functionality and is not for 
   CONFIG_CGROUP_MEM_RES_CTLR,

 - code: seperating hugetlb bits out from memcg bits to avoid growing 
   mm/memcontrol.c beyond its current 5650 lines, and

 - performance: not incurring any overhead of enabling memcg for per-
   page tracking that is unnecessary if users only want to limit hugetlb 
   pages.

Kmem accounting and swap accounting is really a seperate topic and makes 
sense to be incorporated directly into memcg because their usage is a 
single number, the same is not true for hugetlb pages where charging one 
1GB page is not the same as charging 512 2M pages.  And we have no 
usecases for wanting to track kmem or swap only without user page 
tracking, what would be the point?

There's a reason we don't enable CONFIG_CGROUP_MEM_RES_CTLR in the 
defconfig, we don't want the extra 1% metadata overhead of enabling it and 
the potential performance regression from doing per-page tracking if we 
only want to limit a global resource (hugetlb pages) to a set of tasks.

So please consider seperating this functionality out into its own cgroup, 
there's no reason not to do it and it would benefit hugetlb users who 
don't want to incur the disadvantages of enabling memcg entirely.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[Index of Archives]

  Powered by Linux

[Older Kernel Discussion]     [Yosemite National Park Forum]     [Large Format Photos]     [Gimp]     [Yosemite Photos]     [Stuff]     [Index of Other Archives]