## Saturday, February 5, 2011

### Goto Hell

There was a recent stackoverflow challenge question where users were asked to present code that prints the numbers 1 through 1000 without using loops or conditionals. A major theme for answers to the thread was recursion - both compile time and run time. There were a mix of other entries including consecutive function calls to iterate through 1000 increment/print operations, shelling-out to a program such as seq, and several other approaches to the problem. My first opinion was that the only correct solution (according to the spec) was the one that used constructors. I say correct because, technically, recursion is looping.

I posted my own solution for fun which aimed to solve the problem in a way not currently presented. This is both an abuse of compiler extensions and language constructs and is nothing more than me having some fun. Really, don't write code like this. Ever.

#include <stdio.h>

int main () {
void * label[1001] = { [0 ... 999] = &&again };
label[1000] = &&done;
int i = 0;
again:
printf ("%d\n", i + 1);
goto *label[++i];
done:
return 0;
}


I knew that if I considered recursion a loop that my code was also a loop construct. The point of my post, however, was to have a solution that was entirely distinct from what had already been posted and not necessarily to solve the puzzle exactly. This got me thinking though; how close is state machine looping, and recursion, and compile-time recursion to an actual coded loop?

I coded up and compiled three implementations with the intent of determining what the code looked like after the compiler was finished with it. The solutions I chose were: compile time recursion, run time recursion, and my goto looping. Each of these was compared to a basic for loop. Because of template stack depth issues I've only printed up to 10 in these examples. All files were compiled by g++ -O0 (except my solution, see below). In all assembly output the comments are my own.

Compile Time Template Recursion

template< int N > void fun () {
fun< N-1 >();
std::cout << N << '\n';
}


As asm

...

_Z3funILi1EEvv:                 ; fun< 1 >
pushq   %rbp
movq    %rsp, %rbp
movl    $1, %esi movl$_ZSt4cout, %edi
call    _ZNSolsEi
movq    %rax, %rdi
movl    $10, %esi call _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_c leave ret ... _Z3funILi2EEvv: ; fun< 2 > pushq %rbp movq %rsp, %rbp call _Z3funILi1EEvv ; fun< 1 >() ... _Z3funILi3EEvv: ; fun< 3 > pushq %rbp movq %rsp, %rbp call _Z3funILi2EEvv ; fun< 2 >() ... main: pushq %rbp movq %rsp, %rbp call _Z3funILi10EEvv ; fun< 10 >() movl$0, %eax
leave
ret


This was a big surprise to me. I assumed that since the compiler was dealing with constant values and there was no class involved that this would actually be generated as a recursive expression. Instead, as you can see, the compiler actually generates all 10 function calls and arranges them to call each other in order where main calls _Z3funILi10EEvv which calls _Z3funILi9EEvv which, in turn, calls _Z3funILi8EEvv and so on - each printing a value after invoking the next function. The process bottoms out at _Z3funILi1EEvv where no function is called and only a value is printed. So this is essentially the same as creating a function for each value to be printed and calling them in order with the exception that the calls are actually nested instead of sequential. This is not recursion.

Run Time Recursion

void fun(int n) {
n && (fun (n-1), std::cout << n << '\n');
}


As asm

_Z3funi:
pushq   %rbp
movq    %rsp, %rbp
subq    $16, %rsp movl %edi, -4(%rbp) cmpl$0, -4(%rbp)    ; hidden conditional ;)
je  .L9
movl    -4(%rbp), %eax
leal    -1(%rax), %edi
call    _Z3funi         ; the recursion
movl    -4(%rbp), %esi
movl    $_ZSt4cout, %edi ; print setup stuff here .L9: leave ret ... main: pushq %rbp movq %rsp, %rbp movl$10, %edi
call    _Z3funi         ; start recursive call
movl    $0, %eax leave ret  It is interesting to note that there is a hidden conditional in there. It applies to the short-circuit operation used to terminate the recursion. Other than that, there is nothing unexpected here. The only thing that might have changed this up a bit is if we could have done tail recursion; the compiler might have turned it into a loop. Requiring the print call to be the final executed statement of the function spoils any chance of that. Goto State Table As a side note, I was not able to compile my submission as C++ code. In order to make the comparisons I've just used gcc instead - I don't imagine the result would be any different had g++ been willing to accept the code. main: pushq %rbp movq %rsp, %rbp subq$8048, %rsp
leaq    -8016(%rbp), %rax
movq    %rax, -8024(%rbp)
movq    $0, -8032(%rbp) movl$8008, %eax
cmpl    $8, %eax ; part of initialization, not my code jb .L2 movq$1001, -8040(%rbp)
movq    -8024(%rbp), %rdi
movq    -8040(%rbp), %rcx
movq    -8032(%rbp), %rax
rep stosq
.L2:
movq    $.L3, -8016(%rbp) ; again label ; lots of lines initializing jump table movq$.L3, -24(%rbp)   ; again label
movq    $.L4, -16(%rbp) ; end lable movl$0, -4(%rbp)
.L3:
movl    -4(%rbp), %eax
leal    1(%rax), %esi
movl    $.LC0, %edi movl$0, %eax
call    printf
addl    $1, -4(%rbp) movl -4(%rbp), %eax cltq movq -8016(%rbp,%rax,8), %rax jmp *%rax ; non-conditional jump .L4: movl$0, %eax
leave
ret


And, for completeness, the for loop

main:
pushq   %rbp
movq    %rsp, %rbp
subq    $16, %rsp movl$1, -4(%rbp)
jmp .L7
.L8:
movl    -4(%rbp), %esi
movl    $_ZSt4cout, %edi call _ZNSolsEi movq %rax, %rdi movl$10, %esi
call    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_c
addl    $1, -4(%rbp) .L7: cmpl$1000, -4(%rbp)     ; conditional
jle .L8
movl    \$0, %eax
leave
ret


The really surprising thing to me was the result of C++ template recursion and that the generated code is not recursive. So, in addition to the constructor submission it seems that compile time recursion satisfies both requirements of the problem: no conditionals and no looping.

As expected, there were conditionals in the resulting assembly for the run time recursion - they are necessary to ensure an exit condition. This is where I believe that I have a leg up on the recursive solutions: the goto approach does not rely on a condition to terminate. In both the run time recursion and straight for loop implementation the resulting [assembly] code boiled down to basically a check for continuation (the conditional) and a jump to a previous location (the loop). Neither are present in the template recursion or the constructor solution. My code sits somewhere in between. Though, I likely lose major points for style.