Monday, May 26, 2014

Of Binary Bombs (part 4)

Part 3 of this series explored phase_3 of this riddle which marked the first phase that accepted more than a single correct input sequence. Here I'll continue with the next phase: sym.phase_4.


At first, some arguments are loaded onto the stack to prep for a call to sscanf but rather than an immediate string as the format string argument, an address is provided. Recall in in phase_2 that we could get a hint of the input by printing str._d_d_d_d_d_d (ps @ str._d_d_d_d_d_d) - here we need to understand the arguments to sscanf a little to know that the address 0x8049808 should contain a format string. Indeed, it does (ps @ 0x8049808 yields %d) - our input string needs to be a number. That number needs to be larger than 0.

    cmp dword [ebp-0x4], 0x0
    jg 0x8048d0e

After the input is checked, its value is pushed to the stack and control is passed to sym.func4.


The second requirement of our input is now revealed:

    cmp ebx, 0x1
    jle 0x8048cd0

For values larger than 1 sym.func4 is called recursively.

    lea eax, [ebx-0x1]
    push eax
    call sym.func4

So the input value to sym.func4 (our input, initially) is decremented by 1 and passed to the recursive call. The result of that call is saved to esi and then sym.func4 is called recursively yet again - this time after decrementing the input by 2.

    lea eax, [ebx-0x2]
    push eax
    call sym.func4

The return value of the second recursion is added to the result of the first and that sum becomes the return value of this depth of the recursive call. In C, this looks something like:

int func4 (int n) {
    if (n < 2) { return 1; }
    return func4 (n - 1) + func4 (n - 2);
}

Back in sym.phase_4 there is the following check:

    cmp eax, 0x37

The input, when fed to sym.func4, should return 55. The recursion of n-1 + n-2 is one way to calculate the Fibonacci number at index n. That, coupled with the fact that 55 is a Fibonacci number, allows us to derive the necessary input value of 9.


Next up, part 5.

Thursday, May 22, 2014

Of Binary Bombs (part 3)

In part 2 I explained both sym.read_line and the solution to sym.phase_2. Here I work through the third phase of Dr. Evil's nasty binary bomb.


I'll assume to start directly at sym.phase_3 beyond the input handling routines previously discussed.



Phase 3 starts with a call to sscanf with a format string of "%d %c %d". So we should provide a number, character, and a number. The first number should be less than or equal to 7

    cmp dword [ebp-0xc], 0x7

and is used to calculate a jump address

    mov eax, [ebp-0xc]
    jmp dword [eax*4+0x80497e8]

If we start with eax containing 0 this jumps us to a location stored at 0x80497e8. That address is 0x08048be0 - or just over the next instruction.

    mov bl, 0x71
    cmp dword [ebp-0x4], 0x309

This sets bl to 0x71 (ASCII 'q') and compares the third input to 777. If the third input is 777 control jumps to

    cmp bl, [ebp-0x5]

So, to avoid the bomb we can provide the following input: 0 q 777 and we have a valid solution.


But what about setting eax to something other than 0 to start? Let's look at the other possible jump addresses for values less than 8 but greater than 0 for the first input. I've abbreviated the code and commented what is different from the above description.

; eax == 1 - 0x08048c00
    mov bl, 0x62                ; ASCII 'b'
    cmp dword [ebp-0x4], 0xd6   ; 214

; eax == 2 - 0x08048c16
    mov bl, 0x62                ; ASCII 'b'
    cmp dword [ebp-0x4], 0x2f3  ; 755
        
; eax == 3 - 0x08048c28
    mov bl, 0x6b                ; ASCII 'k'
    cmp dword [ebp-0x4], 0xfb   ; 251
        
; eax == 4 - 0x08048c40
    mov bl, 0x6f                ; ASCII 'o'
    cmp dword [ebp-0x4], 0xa0   ; 160
  
; eax == 5 - 0x08048c52
    mov bl, 0x74                ; ASCII 't'
    cmp dword [ebp-0x4], 0x1ca  ; 458

; eax == 6 - 0x08048c64
    mov bl, 0x76                ; ASCII 'v'
    cmp dword [ebp-0x4], 0x30c  ; 780

; eax == 7 - 0x08048c76
    mov bl, 0x62                ; ASCII 'b'
    cmp dword [ebp-0x4], 0x20c  ; 524

So it looks as if there are several answers to this part of the riddle. Let's verify at least one of the others work.

Next, phase 4.

Wednesday, May 21, 2014

Of Binary Bombs (part 2)

In part 1 I worked out the setup of the binary bomb and walked through the solution to the first phase of the riddle. This post continues from there to examine phase 2.

After defusing phase 1 the program reads the next line from the user (sym.read_line) and calls sym.phase_2. In my first post I glazed over the sym.read_line with some hand-waving so I'd like to revisit that here with a little more attention. As a reminder, in radare2 you can search for a symbol while in visual mode with s (s sym.read_line).


The first thing that happens in sym.read_line (after managing the pointers and local variables) is a call to sym.skip. sym.skip has a little more to it than just this but basically determines which input line is to be read by looking at the global variable sym.num_input_strings and using that as an index into sym.input_strings (a global holding each input string in an 80-byte char array). It returns the string read in eax.

Once a line is read from the input stream the following instructions check the return value of sym.skip.

    test eax, eax
    jne 0x804925f

Only if the return value of sym.skip is zero (when null is returned) does execution not jump to 0x0804925f. I will cover later what happens in the null return case. For now, non-null returns follow this explanation.

    mov eax, [sym.num_input_strings]

At this point, we have entered 2 strings but the global counter sym.num_input_strings has only been updated once so contains the value 1.

    lea eax, [eax+eax*4]
    shl eax, 0x4
    lea edi, [eax+sym.input_strings]

The above sets eax to eax * 5, multiplies the result by 16, and loads the string at sym.input_strings+eax into edi. In this case (sym.num_input_strings == 1) this yields the second index in the sym.input_strings array (recall that each entry there is 80 bytes).

The next set of instructions calculate the length of the input string by decrementing a counter for each non-null byte in the string. The calculation uses some shortcuts for efficiency so is not directly mapped to traditional C-like counting (google 'asm string length' for details).

    mov al, 0x0
    cld
    mov ecx, 0xffffffff 
    repne scasb
    mov eax, ecx
    not eax
    lea edi, [eax-0x1]

Our input strings must fit within the 80-byte array so checking that the string (less newline) is less than 0x4f (79) bytes is done next

    cmp edi, 0x4f

Finally, the newline is replaced with a zero byte

    mov byte [edi+eax+0x804b67f], 0x0

the input string is placed in eax and the global sym.num_input_strings is incremented by 1.

At this point eax contains the input string and I can resume with the second phase of the bomb.


After the prologue and setting aside 32 bytes of local space the input string is copied to edx and the local buffer address is loaded into eax. These two arguments are pushed onto the stack and sym.read_six_numbers is invoked. To avoid making this post too long I'll skip over the detail of sym.read_six_numbers and just mention that the numbers in the input string are read into the local array via sscanf with a format string of "%d %d %d %d %d %d" (Hint #1). 

The following check verifies that the first entry in the local array has the value 1 and is the second hint to solving this phase.

    cmp dword [ebp-0x18], 0x1                

The next section of code is the meat of phase 2 and how we determine what numbers to include in our string.

    mov ebx, 0x1
    lea esi, [ebp-0x18] 
    lea eax, [ebx+0x1]  ; target of jump below (0x8048b76)
    imul eax, [esi+ebx*4-0x4] 
    cmp [esi+ebx*4], eax   
    je 0x8048b88 
    call sym.explode_bomb
    inc ebx             ; target of jump above (0x8048b88)
    cmp ebx, 0x5  
    jle 0x8048b76 

This is a loop from 1 to 5 using ebx as the counter. The initialization sets ebx to 1 and esi to the beginning of the array of converted numbers (the local array). On each iteration of the loop eax is set to ebx + 1 then multiplied by the nth entry of the local array (n is ebx - 1). The result of that is compared to the next entry in the array. In C parlance this might look like:

    int xs[] = {...}; 
    for (i = 0; i < 5; ++i)
        if ((i+1) * xs[i] != xs[i+1]) 
            FAIL();

If we know that xs[0] must be 1 we can now use the above to calculate the remainder of the array.

    x[1] = 2 * x[0] => 2
    x[2] = 3 * x[1] => 6 
    x[3] = 4 * x[2] => 24
    ...

Ultimately, our input requirements become: 1 2 6 24 120 720. This works as expected and phase 2 is now solved.


On to phase 3.

Tuesday, May 6, 2014

Of Binary Bombs (part 1)

I was recently turned on to radare2 (http://www.radare.org/y/) and, in an attempt to use-it-to-learn-it, I am going to try and solve the sample binary bomb provided as an example lab for the CMU computer systems course. I'm aiming to make my next few posts an account of trying to solve that puzzle (and learn this new tool).

Caveat lector: I've basically no experience with binary tools and assembly language so this will be deliberate and likely full of amateur mistakes.


I expect to be able to analyze this as the program would be run so I'm staring at the main entry point and assuming that there is nothing tricky happening prior to that. Load up radare2 and enter visual mode (press 'V' at the command prompt) then press 'p' until the view is one that resembles assembly instructions. Once there search for the main entry point (at the command prompt: s main). That will bring you to a screen that looks similar to:


First thing to happen in this program is to check the arguments to main to determine whether to read from stdin or from a file. This progresses from the function prologue: saving the base pointer (ebp); copy the stack pointer (esp); and reserve 20 bytes for local use (sub esp, 0x14). The program then continues to look at the value of argc and acts accordingly eventually assigning the input stream (stdin or otherwise) to sym.infile.

Without error, control flow reaches address 0x08048a30 (in my version) which is a call to sym.initialize_bomb. Radare2 allows you to set marks much like you would in vim so if you are familiar with that, set a mark here with ma and hit enter at this address to follow to sym.initialize_bomb. Alternatively, at the command prompt, search for the function symbol with s sym.initialize_bomb. In either case you should see the following:


Here, the program simply sets up to handle SIGINT with the function sig_handler and returns. If you are using marks you can return to the previous address with 'a.

The program continues with a couple of generic prompts, proceeds to read a line of input (which does some internal bookkeeping), and jumps to phase 1 (sys.phase_1). Place a local mark (ma) and jump to the first phase code: we are now ready to start the actual process of solving Dr. Evil's nasty riddle.


Phase 1 is intended to be easy. The first line of input provided to the program is copied to eax, and then two arguments are set up: a global string (at address 0x080497c0) and eax. These arguments are sent to strings_not_equal which returns 0 if the two arguments are the same length and contain the same strings. The check test eax, eax (true when eax is 0) avoids exploding the bomb so we want eax (the first input string) to be the same as whatever is in 0x080497c0.

The radare2 command to print memory as a string is ps. Entering command mode (:) and typing ps @ 0x080497c0 yields: 'Public speaking is very easy.' (the period is significant). Entering that provides:


Next, I'll tackle phase 2.