Strength reduction

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Elliotbay (talk | contribs) at 02:14, 22 June 2007 (formatting). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Jump to navigation Jump to search

Strength reduction is a compiler optimization where a function of some systematically changing variable is calculated more efficiently by using previous values of the function. In a procedural programming language this would apply to an expression involving a loop variable and in a declarative language it would apply to the argument of a recursive function. E.g.

f x = ... (2**x) ... (f (x+1)) ...

becomes

f x = f' x (2**x)

where

f ' x z = ... z ... (f' (x+1) 2*z) ...


Here the expensive operation (2**x) has been replaced by the cheaper 2*z in the recursive function f'. This maintains the invariant that z = 2**x for any call to f'.

Strength reduction in loop address computation

One of the most important uses of the strength reduction is computing memory addresses inside a loop. For example a naive translation of the following loop would multiply y by 320 and add it to x in each iteration of the inner loop:

/* Initialize 320x200 array to zeros */
void foo(int *img)
{
    for(int y=0; y<320; y++)
        for(int x=0; x<200; x++)
            img[y*320 + x] = 0;
}

This is indeed the case with GCC with optimizations turned off:

.LFB3:
       pushl   %ebp
.LCFI0:
       movl    %esp, %ebp
.LCFI1:
       subl    $8, %esp
.LCFI2:
       movl    $0, -4(%ebp)
.L2:
       cmpl    $319, -4(%ebp)
       jle     .L5
       jmp     .L1
.L5:
       movl    $0, -8(%ebp)
.L6:
       cmpl    $199, -8(%ebp)
       jle     .L9
       jmp     .L4
# .L9 is the inner loop iteration, the code takes y (-4(%ebp)) and x (-8(%ebp))
# and computes the index based on them
.L9:
       movl    -4(%ebp), %edx
       movl    %edx, %eax
       sall    $2, %eax
       addl    %edx, %eax
       sall    $6, %eax
       addl    -8(%ebp), %eax
       leal    0(,%eax,4), %edx
       movl    8(%ebp), %eax
       movl    $0, (%eax,%edx)
       leal    -8(%ebp), %eax
       incl    (%eax)
       jmp     .L6
.L4:
       leal    -4(%ebp), %eax
       incl    (%eax)
       jmp     .L2
.L1:
       leave
       ret

With optimizations turned on the code gets compiled to much more efficient form:

.LFB3:
       pushl   %ebp
.LCFI0:
       xorl    %ecx, %ecx
       movl    %esp, %ebp
.LCFI1:
       pushl   %esi
.LCFI2:
       movl    8(%ebp), %esi
       pushl   %ebx
.LCFI3:
       xorl    %ebx, %ebx
.L11:
       leal    (%esi,%ecx), %eax
       movl    $199, %edx
       .p2align 4,,15
# Inner loop is just the next four instructions
# It computes address by adding 4 to the previous one (1 times size of integer)
.L10:
       movl    $0, (%eax)
       addl    $4, %eax
       decl    %edx
       jns     .L10
       incl    %ebx
# And to compute the address in the outer loop it doesn't use a multiplication,
# but adds 1280 (320 * size of integer) to the previous address
       addl    $1280, %ecx
       cmpl    $319, %ebx
       jle     .L11
       popl    %ebx
       popl    %esi
       popl    %ebp
       ret

This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.