RE: Unable to remove pipeline stalls




> -----Original Message-----
> From: Ian Lance Taylor [mailto:iant@xxxxxxxxxx]
> Sent: 18 July 2012 19:48
> To: Deepti Sharma
> Cc: gcc-help@xxxxxxxxxxx
> Subject: Re: Unable to remove pipeline stalls
> 
> On Wed, Jul 18, 2012 at 2:19 AM, Deepti Sharma <deepti.gccretarget@xxxxxxxxx> wrote:
> >
> > I am facing one issue with pipeline stalls. My compiler generates the
> > assembly below:
> >  MOVE AR2, _rec2
> >  MOVE AR0, _rec1
> >  MOVE R3, (AR2)+  //Writes to R3
> >  MOVE (AR0)+, R3  //Reads from R3; Stall Created
> >  MOV  AR1, AR0
> >  MOV  AR0, AR2
> >  MOVE R2, (AR0)+  //
> >  MOVE (AR1)+, R2  // stall
> >  MOVE R1, (AR0)+  //
> >  MOVE (AR1)+, R1  // stall
> >  MOVE R0, (AR0,0)
> >  MOVE (AR1,0), R0  //stall
> 
> ...
> 
> > ;;   ======================================================
> > ;;   -- basic block 2 from 7 to 20 -- before reload
> > ;;   ======================================================
> >
> > ;;        0-->     7 r62=`r1'
> > ;;        1-->     8 r63=`r2'                          :UNIT3
> > ;;        2-->    11 [r62++]=[r63++]                   :UNIT1|UNIT2
> > ;;        3-->    12 r65=r62                           :UNIT4
> > ;;        4-->    13 r64=r63                           :UNIT4
> > ;;        5-->    14 [r65++]=[r64++]                   :UNIT1|UNIT2
> > ;;        6-->    17 [r65++]=[r64++]                   :UNIT1|UNIT2
> > ;;        7-->    20 [r65]=[r64]                       :UNIT1|UNIT2
> >
> > Any pointers for this issue will be helpful.
> 
> I don't understand how insns like [r65++]=[r64++] correspond to the
> assembler instruction sequence you show above.  Do you have a
> define_split in there that only happens after reload is complete?
> Perhaps you need to be looking at the scheduler dump after reload,
> which is to say the one from -fschedule-insns2.

I have also checked the sched2 dump, which looks like this:

;;   ======================================================
;;   -- basic block 2 from 8 to 27 -- after reload
;;   ======================================================

;;   0-->     8 call <...>                        
;;   1-->    12 AR0=`r1'                        
;;   2-->    41 AR2=`r2'                        
;;   3-->    13 AR1=[AR0]                         
;;   4-->    15 AR0=[AR2]                         
;;   5-->    43 R4=[AR0++]                        
;;   6-->    18 [AR1++]=R4     //Stall                    
;;   7-->    44 R5=[AR0++]                        
;;   8-->    21 [AR1++]=R5     //Stall                   
;;   9-->    45 R6=[AR0++]                        
;;  10-->    24 [AR1++]=R6                        
;;  11-->    46 R7=[AR0]                          
;;  12-->    27 [AR1]=R7                          

As the dump shows, the scheduler is not able to reorder the instructions to
remove the stalls.
My understanding is that stalls should be removed by the first scheduling
pass (sched1).
However, sched1 sees each copy as a single instruction, "[r62++]=[r63++]",
so it does not know that the instruction is later broken down into two MOVE
operations.
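
To make the stall pattern concrete, here is a minimal Python sketch. It is
illustrative only and not part of the port. It assumes a hypothetical
pipeline model in which an instruction stalls when it reads a data register
written by the immediately preceding instruction, and in which the
post-increments of the address registers never stall. It also assumes the
source and destination do not overlap, so loads may move ahead of earlier
stores. The names count_stalls, load and store are made up for this example.

```python
# Hypothetical model: an instruction stalls when it reads a data register
# that the immediately preceding instruction wrote. Address-register
# post-increments are assumed not to cause stalls.

def count_stalls(insns):
    """insns: list of (written_regs, read_regs) pairs in issue order."""
    stalls = 0
    prev_written = set()
    for written, read in insns:
        if prev_written & read:
            stalls += 1
        prev_written = written
    return stalls

def load(reg):
    """Models Rn = [AR0++]: writes the data register."""
    return ({reg}, set())

def store(reg):
    """Models [AR1++] = Rn: reads the data register."""
    return (set(), {reg})

# Order from the sched2 dump: every store directly follows its load.
as_scheduled = [load("R4"), store("R4"), load("R5"), store("R5"),
                load("R6"), store("R6"), load("R7"), store("R7")]

# One possible reordering: each load is issued one slot ahead of the
# store that consumes it. Loads stay in order, and so do stores.
interleaved = [load("R4"), load("R5"), store("R4"), load("R6"),
               store("R5"), load("R7"), store("R6"), store("R7")]

print(count_stalls(as_scheduled), count_stalls(interleaved))  # 4 0
```

Under this assumed model every store in the dump order stalls, while the
interleaved order has none. A scheduler can only produce the second order
if it sees the load and the store as separate instructions.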

I don't have a define_split pattern for MOVE. I have a define_expand, shown
below:

 (define_expand "movsi"
   [(set (match_operand:SI 0 "nonimmediate_operand" "")
 	(match_operand:SI 1 "general_operand" ""))]
  ""
  "
  { 
    if ((GET_CODE (operands[1]) == CONST_INT)
        || (GET_CODE (operands[1]) == LABEL_REF)
        || (GET_CODE (operands[1]) == SYMBOL_REF))
    {
	  if (register_operand(operands[0],SImode))
	  {
	    emit_insn (gen_movsi_high (operands[0], operands[1]));
	    DONE;
	  }
	  else if (!(reload_in_progress || reload_completed))
	  {
	    rtx REG1 = gen_reg_rtx(SImode);
	    emit_insn (gen_movsi_high (REG1, operands[1]));
	    emit_insn (gen_movsi_lo_sum (REG1, REG1, operands[1]));        
	    emit_insn(gen_rtx_SET(SImode, operands[0], REG1));
	    DONE;
	  }
	  else FAIL;
    }
  }")

(define_insn "*movsi"
  [(set (match_operand:SI 0 "nonimmediate_operand" "=z,z,x,x, xz,co1 ")
        (match_operand:SI 1 "general_operand"      " z,x,z,x, co1,xz "))]
  ""
  "@
  MOV %0, %1
  MOVE %0, %1
  MOVE %0, %1
  MOVE %0, %1
  MOVE %0, %1
  MOVE %0, %1"
  [(set_attr "ATTR1" "mem1, mem1, mem1, mem1, mem1, mem1")
   (set_attr "ATTR2" "none, op1, op1, op1, op2, op1")])


-- Deepti





