Re: building kernel with gcc-4.3.x


On Thu, 3 Jul 2008, Russell King - ARM Linux wrote:

> On Wed, Jul 02, 2008 at 09:21:11PM -0400, Nicolas Pitre wrote:
> > Either a new constraint identifier needs to be created (and therefore 
> > making backward compatibility a problem), or we should simply use 
> > __builtin_prefetch() instead of our own open coded implementation.
> 
> Does that always generate a pld instruction when used?

Only when the architecture level is v5e or above. The PLD instruction was
not defined before that, although it is supposed to be a no-op.

I just wrote a small test case like this:

|static inline void prefetch(const void *ptr)
|{
|	__asm__ __volatile__(
|			"pld\t%0"
|			:
|			: "o" (*(char *)ptr)
|			: "cc");
|}
|
|struct foo {
|	struct foo *next;
|	int a;
|	int b;
|};
|
|int bar(struct foo *p)
|{
|	int x = 0;
|	do {
|		prefetch(p->next);
|		x += p->a + p->b;
|		p = p->next;
|	} while (p);
|	return x;
|}

and we can clearly see the "optimization" here: the summation and the
loop exit test are gone entirely, as gcc-4.3.x compiles the above to:

        .type   bar, %function
bar:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
.L2:
        ldr     r0, [r0, #0]
        pld     [r0, #0]
        b       .L2
        .size   bar, .-bar

Replacing prefetch() with __builtin_prefetch(), we now get:

        .type   bar, %function
bar:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        mov     r1, #0
.L3:
        ldr     r3, [r0, #0]
        ldmib   r0, {r2, ip}
        add     r2, ip, r2
        cmp     r3, #0
        mov     r0, r3
        add     r1, r1, r2
        pld     [r3, #0]
        bne     .L3
        mov     r0, r1
        bx      lr
        .size   bar, .-bar

The fix, if we wish to preserve our own implementation, would be to pass
the pointer itself as an address operand instead of dereferencing it:

diff --git a/include/asm-arm/processor.h b/include/asm-arm/processor.h
index bd8029e..13fda19 100644
--- a/include/asm-arm/processor.h
+++ b/include/asm-arm/processor.h
@@ -112,9 +112,9 @@ extern int kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
 static inline void prefetch(const void *ptr)
 {
 	__asm__ __volatile__(
-		"pld\t%0"
+		"pld\t%a0"
 		:
-		: "o" (*(char *)ptr)
+		: "p" (ptr)
 		: "cc");
 }

Or simply redefine prefetch(ptr) as __builtin_prefetch(ptr). 
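
For illustration, here is a minimal sketch of that second option applied
to the test case above. It assumes a GCC-compatible compiler that provides
__builtin_prefetch(); the bar_demo() helper and its three-node list are
hypothetical and exist only to exercise the loop, they are not part of the
kernel or of the proposed patch.

```c
#include <stddef.h>

/* Assumption: GCC or Clang, both of which provide __builtin_prefetch(). */
#define prefetch(ptr) __builtin_prefetch(ptr)

struct foo {
	struct foo *next;
	int a;
	int b;
};

/* Same loop as the test case: sum a + b over a non-empty list. */
int bar(struct foo *p)
{
	int x = 0;
	do {
		/* Only a hint: the pointer is not dereferenced here, so a
		 * NULL next pointer on the last node does not fault. */
		prefetch(p->next);
		x += p->a + p->b;
		p = p->next;
	} while (p);
	return x;
}

/* Hypothetical helper: build a three-node list on the stack and sum it. */
int bar_demo(void)
{
	struct foo n3 = { NULL, 5, 6 };
	struct foo n2 = { &n3, 3, 4 };
	struct foo n1 = { &n2, 1, 2 };

	return bar(&n1);
}
```

With the list above, bar_demo() should return 21, i.e. the loop both
accumulates and terminates, which is the behaviour the open coded version
lost with gcc-4.3.x.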


Nicolas
