Re: [PATCH tip/core/rcu 01/15] rcu: Control RCU_FANOUT_LEAF from boot-time parameter



On Fri, Jun 15, 2012 at 02:05:56PM -0700, Paul E. McKenney wrote:
> From: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> 
> Although making RCU_FANOUT_LEAF a kernel configuration parameter rather
> than a fixed constant makes it easier for people to decrease cache-miss
> overhead for large systems, it is of little help for people who must
> run a single pre-built kernel binary.
> 
> This commit therefore allows the value of RCU_FANOUT_LEAF to be
> increased (but not decreased!) via a boot-time parameter named
> rcutree.rcu_fanout_leaf.
> 
> Reported-by: Mike Galbraith <efault@xxxxxx>
> Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> ---
>  Documentation/kernel-parameters.txt |    4 ++
>  kernel/rcutree.c                    |   97 ++++++++++++++++++++++++++++++-----
>  kernel/rcutree.h                    |   23 +++++----
>  kernel/rcutree_plugin.h             |    4 +-
>  kernel/rcutree_trace.c              |    2 +-
>  5 files changed, 104 insertions(+), 26 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index c45513d..88bd3ef 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2367,6 +2367,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  			Set maximum number of finished RCU callbacks to process
>  			in one batch.
>  
> +	rcutree.fanout_leaf=	[KNL,BOOT]
> +			Set maximum number of finished RCU callbacks to process
> +			in one batch.

Copy-paste problem: this description duplicates the preceding entry's
text and says nothing about the leaf fanout.

>  	rcutree.qhimark=	[KNL,BOOT]
>  			Set threshold of queued
>  			RCU callbacks over which batch limiting is disabled.
> diff --git a/kernel/rcutree.c b/kernel/rcutree.c
> index 0da7b88..a151184 100644
> --- a/kernel/rcutree.c
> +++ b/kernel/rcutree.c
> @@ -60,17 +60,10 @@
>  
>  /* Data structures. */
>  
> -static struct lock_class_key rcu_node_class[NUM_RCU_LVLS];
> +static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];

I assume that the requirement to only increase the fanout and never
decrease it comes from the desire to not increase the sizes of all of
these arrays to MAX_RCU_LVLS?
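
To make the arithmetic behind that assumption concrete, here is a minimal userspace sketch (not kernel code) that counts the rcu_node structures a tree needs. The function name rcu_node_count() and its parameters are hypothetical, and the model assumes each level simply needs enough nodes to cover the level below it.

```c
/*
 * Userspace model only: count the nodes in a tree covering nr_cpus CPUs,
 * with fanout_leaf CPUs per leaf node and fanout children per interior
 * node.  Assumes nr_cpus >= 1, fanout_leaf >= 1 and fanout >= 2.
 */
static int rcu_node_count(int nr_cpus, int fanout_leaf, int fanout)
{
	int nodes = 0;
	/* Number of leaf nodes, rounded up. */
	int width = (nr_cpus + fanout_leaf - 1) / fanout_leaf;

	for (;;) {
		nodes += width;
		if (width == 1)
			break;	/* Reached the root. */
		/* Number of nodes in the next level up, rounded up. */
		width = (width + fanout - 1) / fanout;
	}
	return nodes;
}
```

With 4096 CPUs and an interior fanout of 64, a leaf fanout of 16 needs 256 + 4 + 1 = 261 nodes, while a leaf fanout of 64 needs only 64 + 1 = 65.  In this model, a larger leaf fanout never needs more nodes or levels, so arrays sized for the compile-time value stay large enough.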

> +/*
> + * Compute the rcu_node tree geometry from kernel parameters.  This cannot
> + * replace the definitions in rcutree.h because those are needed to size
> + * the ->node array in the rcu_state structure.
> + */
> +static void __init rcu_init_geometry(void)
> +{
> +	int i;
> +	int j;
> +	int n = NR_CPUS;
> +	int rcu_capacity[MAX_RCU_LVLS + 1];
> +
> +	/* If the compile-time values are accurate, just leave. */
> +	if (rcu_fanout_leaf == CONFIG_RCU_FANOUT_LEAF)
> +		return;
> +
> +	/*
> +	 * Compute number of nodes that can be handled an rcu_node tree
> +	 * with the given number of levels.  Setting rcu_capacity[0] makes
> +	 * some of the arithmetic easier.
> +	 */
> +	rcu_capacity[0] = 1;
> +	rcu_capacity[1] = rcu_fanout_leaf;
> +	for (i = 2; i <= MAX_RCU_LVLS; i++)
> +		rcu_capacity[i] = rcu_capacity[i - 1] * CONFIG_RCU_FANOUT;
> +
> +	/*
> +	 * The boot-time rcu_fanout_leaf parameter is only permitted
> +	 * to increase the leaf-level fanout, not decrease it.  Of course,
> +	 * the leaf-level fanout cannot exceed the number of bits in
> +	 * the rcu_node masks.  Finally, the tree must be able to accommodate
> +	 * the configured number of CPUs.  Complain and fall back to the
> +	 * compile-timer values if these limits are exceeded.

Typo: s/timer/time/

> +	 */
> +	if (rcu_fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
> +	    rcu_fanout_leaf > sizeof(unsigned long) * 8 ||
> +	    n > rcu_capacity[4]) {

4 seems like a magic number here; did you mean MAX_RCU_LVLS or similar?

Also, why have n as a variable when it never changes?
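
A minimal userspace sketch of the same check with both points addressed: the capacity is computed up to MAX_RCU_LVLS rather than a literal 4, and the CPU count is passed in directly instead of being copied into a local.  The helper name rcu_fanout_leaf_ok() and the three macro values are assumptions made so that the sketch compiles on its own; they are not taken from the patch.

```c
#include <limits.h>
#include <stdbool.h>

/* Stand-in values, assumed for illustration only. */
#define MAX_RCU_LVLS		4
#define CONFIG_RCU_FANOUT	64
#define CONFIG_RCU_FANOUT_LEAF	16

/*
 * Return true if a boot-time leaf fanout is acceptable: not below the
 * compile-time value, not wider than an unsigned long bitmask, and
 * large enough for a MAX_RCU_LVLS-level tree to cover nr_cpus CPUs.
 */
static bool rcu_fanout_leaf_ok(int fanout_leaf, int nr_cpus)
{
	long capacity;
	int i;

	if (fanout_leaf < CONFIG_RCU_FANOUT_LEAF ||
	    fanout_leaf > (int)(sizeof(unsigned long) * CHAR_BIT))
		return false;

	/* Capacity of a tree with the maximum number of levels. */
	capacity = fanout_leaf;
	for (i = 2; i <= MAX_RCU_LVLS; i++)
		capacity *= CONFIG_RCU_FANOUT;

	return nr_cpus <= capacity;
}
```

In the patch itself, the equivalent would presumably be comparing NR_CPUS against rcu_capacity[MAX_RCU_LVLS].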

> --- a/kernel/rcutree.h
> +++ b/kernel/rcutree.h
> @@ -42,28 +42,28 @@
>  #define RCU_FANOUT_4	      (RCU_FANOUT_3 * CONFIG_RCU_FANOUT)
>  
>  #if NR_CPUS <= RCU_FANOUT_1
> -#  define NUM_RCU_LVLS	      1
> +#  define RCU_NUM_LVLS	      1

I assume you made this change to make it easier to track down and update
all the uses of the macro; however, now that this is done, the rename
itself seems rather gratuitous, and inconsistent with the other NUM_RCU_*
macros (NUM_RCU_LVL_4, for example).  Would you consider changing it back?

> +extern int rcu_num_lvls;
> +extern int rcu_num_nodes;

Given the above, you might also want to change these for consistency.

Also, have you checked the various loops using these variables to figure
out if GCC emits less optimal code now that it can't rely on a
compile-time constant?  I don't expect it to make much of a difference,
but it seems worth checking.

You might also consider marking these as __read_mostly, at a minimum.

> diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
> index 2411000..e9b44c3 100644
> --- a/kernel/rcutree_plugin.h
> +++ b/kernel/rcutree_plugin.h
> @@ -68,8 +68,10 @@ static void __init rcu_bootup_announce_oddness(void)
>  	printk(KERN_INFO "\tAdditional per-CPU info printed with stalls.\n");
>  #endif
>  #if NUM_RCU_LVL_4 != 0
> -	printk(KERN_INFO "\tExperimental four-level hierarchy is enabled.\n");
> +	printk(KERN_INFO "\tFour-level hierarchy is enabled.\n");

This change seems entirely unrelated to this patch.  Seems simple enough
to split it into a separate one-line patch ("Mark four-level hierarchy
as no longer experimental").

>  #endif
> +	if (rcu_fanout_leaf != CONFIG_RCU_FANOUT_LEAF)
> +		printk(KERN_INFO "\tExperimental boot-time adjustment of leaf fanout.\n");

You might consider printing rcu_fanout_leaf in this message.
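
A minimal userspace sketch of what that could look like, using snprintf() in place of printk() so that it runs outside the kernel.  The helper name format_fanout_announcement(), the "to %d" wording, and the stand-in value of CONFIG_RCU_FANOUT_LEAF are all assumptions.

```c
#include <stdio.h>

/* Stand-in value, assumed for illustration only. */
#define CONFIG_RCU_FANOUT_LEAF	16

/*
 * Format the boot-time announcement into buf, including the new leaf
 * fanout.  Returns 0 and leaves buf empty when the boot-time value
 * matches the compile-time one; otherwise returns snprintf()'s result.
 */
static int format_fanout_announcement(char *buf, size_t len,
				      int rcu_fanout_leaf)
{
	if (rcu_fanout_leaf == CONFIG_RCU_FANOUT_LEAF) {
		if (len > 0)
			buf[0] = '\0';
		return 0;
	}
	return snprintf(buf, len,
			"\tExperimental boot-time adjustment of leaf fanout to %d.\n",
			rcu_fanout_leaf);
}
```

In the patch, this would amount to adding the format specifier and the rcu_fanout_leaf argument to the existing printk() call.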

- Josh Triplett