Monday, December 1, 2014

Readability

I have written before about some of the differences between Ruby and Python and my quirks generally tend to the Ruby approach. I think readability is another dimension between the two languages that highlights this for me - especially as it applies to understanding new code.

I prefer to read code from left-to-right (LTR), top-to-bottom. This is natural for me as it models how I read other text. Code that processes right-to-left (RTL) and, in the severe case, bottom-to-top challenges my ability to easily understand intent. Method chaining highlights this quite nicely. For example, to transform the numbers of a list stored as a string in Python one might write:
','.join(map(lambda x : str(int(x) ** 2), "1,2,3,4".split(',')))
If I am reading that for the first time I need to mentally maintain a stack of operations (join, map, lambda) until I've parsed most of the statement and arrived at the object being operated on: "1,2,3,4". This is due to the RTL application of the code. I've then got to backtrack over each operation on my mental stack to understand what the type and/or result of the overall statement will be. This is complicated by the fact that Python allows for some LTR ("1,2,3,4".split(',')) mixed with the RTL.

For first-time readers of the language this process is even more difficult if the behavior of join or map are not yet well understood.

Ruby makes this significantly easier.
"1,2,3,4".split(',').map { |x| x.to_i ** 2 }.join(',')
When I read code similar to that I can store the type and result as I am parsing the statement. The initial object is immediately available and I can read the expression LTR as: split a string, apply a function to each element of the resulting array, and join that final array with the comma character. The fact that Ruby supports method chaining (on the built-in types) makes for much more readable code.

I've singled out Python above but that was only for the sake of the example. As far as RTL languages go I think Python is middle of the road. Haskell, for example, has a much nicer syntax to deal with function composition (a similar, but not identical situation). On the other end of the spectrum is Lisp which is basically a bottom-to-top, RTL language.

I can (and have) used these languages and many more; RTL vs. LTR in no way prevents one from being proficient over time. Certainly, most RTL code can be written in a way that it flows mostly LTR, top-to-bottom. Even when it isn't, well-written code can read by anyone with enough practice. For newcomers looking to read a new language however, there is less difficulty when the process more closely models how they read in general.

Tuesday, November 25, 2014

Order of Events

inet_ntoa uses static storage for the result it returns. The GNU implementation of inet_ntoa uses the following internally:

static __thread char buffer[18];

This makes the function thread-safe but this safety does not remove the need to worry about use within a single thread. Consider the following snippet of code:

#include <arpa/inet.h>
#include <stdio.h>
int main () {
    struct in_addr a = { .s_addr = 1234567 }, b = { .s_addr = 7654321 };
    return printf ("%s : %s\n", inet_ntoa (a), inet_ntoa (b));
}

Since inet_ntoa is used twice within the argument list the result is dependent on the order of evaluation of the arguments to printf. Regardless of which call gets evaluated first the output will always print the same IP address twice. On my system, the result is:

135.214.18.0 : 135.214.18.0

This is a result of two things: arguments are evaluated before their results are used by printf; and inet_ntoa overwrites static storage on each invocation. Looking at the instructions for this C code makes this clear:

.LC0:                                                                                                                                                                       
        .string "%s : %s\n"
        .text
        .globl  main
        .type   main, @function
main:
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        andl    $-16, %esp
        subl    $32, %esp
        movl    $1234567, 24(%esp)
        movl    $7654321, 28(%esp)
        movl    28(%esp), %eax
        movl    %eax, (%esp)
        call    inet_ntoa
        movl    %eax, %ebx        ; pointer stored to static memory
        movl    24(%esp), %eax
        movl    %eax, (%esp)
        call    inet_ntoa
        movl    $.LC0, %edx
        movl    %ebx, 8(%esp)     ; arg2 to printf; pointer from above
        movl    %eax, 4(%esp)     ; arg1 to printf; new pointer, same static memory
        movl    %edx, (%esp)      ; arg0 (format string)
        call    printf
         movl    -4(%ebp), %ebx
        leave
        ret

The correct way to call inet_ntoa consecutively is to save each result to a local variable.

#include <arpa/inet.h>
#include <string.h>
#include <stdio.h>
int main () {
    struct in_addr a = { .s_addr = 1234567 }, b = { .s_addr = 7654321 };
    char ipa[18] = { 0 }, ipb[18] = { 0 };
    strcpy (ipa, inet_ntoa (a));
    strcpy (ipb, inet_ntoa (b));
    return printf ("%s : %s\n", ipa, ipb);
}

Sunday, October 5, 2014

Automating keystrokes via evdev

In a previous post I talked about how to capture keys out from under the X11 windowing system by reading from /dev/input/eventX. These character devices can also be useful to generate input simulating keyboard activity.

I circled back to this topic after having to automate user keyboard activity. I've accomplished similar tasks in the past with a tool named xdotool - unfortunately, in this case I did not have the luxury of being able to install software. The remainder of this post highlights the differences between consuming and producing events. (By the way, If you have the need to automate X actions I highly suggest looking at what xdotool can do for you.)

Consuming events is the easier of the two tasks: you simply read open the device and read events into the following structure:

/* See: /usr/include/linux/input.h */
struct input_event {
    struct timeval time;
    __u16 type;
    __u16 code;
    __s32 value;
};      

Filter input with type == 1 and read the code to get the key and value to get the event (eg. press, release).

To produce a compliant event the process is a little more complicated since the input needs to be synchronized. For each event there are three distinct sets of data that are required: setup (EV_MSC); the event (EV_KEY); and event synchronize (EV_SYN). In addition to that, certain events are captured over time so this is a stateful process. An example of this is pressing Ctrl-L; the control key is held down while another key is pressed and then released.

The easiest way I found to initially grok the protocol is to capture all events while there is keyboard activity and see what the output looks like. Obviously, to produce fully compliant input you should consult API documentation or source code.

An example of automatically entering a URL in the Chrome browser (Ctrl-L [URL]) would require the following inputs (the type, code, and value members of struct input_event). The input goes to the focused window (the standard behavior for X) so you need to place focus on the Chrome window for the following example.

4,  4, 29  # Setup  
1, 29,  1  # Press Ctrl key
0,  0,  0  # Sync
4,  4, 29  # Setup  
1, 29,  2  # Ctrl (value == 2 -> autorepeat)
0,  0,  0  # Sync
4,  4, 38
1, 38,  1  # Press 'L' key
0,  0,  0
4,  4, 38
1, 38,  0  # Release 'L' key
0,  0,  0
4,  4, 29
1, 29,  0  # Release Ctrl key
0,  0,  0
  
# and so on for the URL string 

4,  4, 28
1, 28,  1  # Press Enter key
0,  0,  0
4,  4, 28
1, 28,  0  # Release Enter key
0,  0,  0 

Monday, August 4, 2014

Blocking ptrace

I've had occasion to change the functionality of binary programs for a variety of purposes - mostly to instrument for debugging or logging purposes. The techniques used to do this vary but can be used for both passive monitoring or actively changing the functionality of a program. I'd like to consider one of those techniques (ptrace) in a little more detail here - specifically the ability to stop and arbitrarily modify a running process (think gdb).

I'm going to walk through a few examples of how to prevent a ptrace-based approach to modifying a program. For illustrative purposes I'll use the following sample program that maintains a global variable to influence control flow at run time.


long global_flag = 1;

int main () {
    while (global_flag) {
        fprintf (stderr, "Running ...\n");
        sleep (5);
    }   
    fprintf (stderr, "Someone captured my flag!\n");
    return 0;
}       

The goal in these examples is to prevent the global variable (global_flag) from being modified from an external process. I'm going to step through a few methods that could be used to modify this variable and how to prevent these techniques in turn.

First, I'll look to just overwrite the value directly. Since we can look at the symbols it is trivial to construct a program that will place data into the memory of our choosing within the running process using ptrace. Obviously, this case is easier than would be for most programs due to the simplicity of the example. The approach holds, however, regardless of the scale of the actual process.

Suppose our process is PID 11896; we can find the memory location to modify using nm


...
08048410 t frame_dummy
         U fwrite@@GLIBC_2.0   
0804a018 D global_flag
08048434 T main
         U sleep@@GLIBC_2.0    
... 


If you don't have the program available you can still get at the symbols by looking in /proc (e.g. nm /proc/11896/exe).

The program I'm using to change memory in a particular process:


#include <sys/ptrace.h>
#include <stdio.h>
#include <stdlib.h>
#include <libgen.h>
#include <string.h>
#include <errno.h>
  
void usage (char * prog) {
    fprintf (stderr, "USAGE: %s <pid> <addr> <value>\n", basename (prog));
    fprintf (stderr, "-------------------------\n");
    fprintf (stderr, " pid      Process to modify\n");
    fprintf (stderr, " addr     Address to change\n");             
    fprintf (stderr, " value    Value to write\n");
    exit (42);
}
      
int main (int argc, char **argv) {
  
    pid_t pid = 0;
    unsigned long addr = 0;
    long value = 0, old_value = 0;                  
    
    if (4 != argc) { usage (argv[0]); }

    pid = strtol (argv[1], NULL, 10);
    addr = strtol (argv[2], NULL, 16);
    value = strtol (argv[3], NULL, 10);

    if (ptrace (PTRACE_ATTACH, pid, 0, 0)) {
        fprintf (stderr, "Unable to attach to PID: %d (%s)\n", 
                pid, strerror (errno));
        return 1;
    }

    old_value = ptrace (PTRACE_PEEKDATA, pid, addr, 0);
    fprintf (stderr, "Original value: %ld\n", old_value);
    if (ptrace (PTRACE_POKEDATA, pid, addr, value)) {
        fprintf (stderr, "Unable to overwrite data @ 0x%lx (%s)\n",
                addr, strerror (errno));
        ptrace (PTRACE_DETACH, pid, 0, 0);
        return 1;
    }

    ptrace (PTRACE_DETACH, pid, 0, 0);
    return 0;
}

Considering the output of nm and the PID, I'll call that as follows:

   ./modify 11896 0804a018 0

Then, in the terminal running the original process, you see the output "Someone captured my flag!" and the process ends.
             
To prevent the above result, we need to prevent ptrace from attaching to our running process. We can use ptrace against itself within our program to achieve this goal. Since a process can only be traced by a single process at a time we can immediately set to trace ourselves when the program starts. The new program looks like this:


long global_flag = 1;

int main () {
    ptrace (PTRACE_TRACEME, 0, 0, 0);
    while (global_flag) {
        fprintf (stderr, "Running ...\n");
        sleep (5);
    }
    fprintf (stderr, "Someone captured my flag!\n");
    return 0;
}       

Now, when we try to connect to the process at run time we get an error from ptrace. This is true for any process that attempts to use ptrace to this end (e.g. strace will report: "Unable to attach to PID: 11940 (Operation not permitted)"). Notice that this is also the case when trying to attach to the process as root.

Note for Ubuntu users: it is now the default behavior to prevent attaching to a process unless it is a direct child of the tracing process. The root user can still attach to arbitrary processes but other users are restricted (see /etc/sysctl.d/10-ptrace.conf or man prctl).

Unfortunately, that does not entirely solve the problem. If, instead of having the running process, a user can spawn the process within a debugger the above mechanism can still be defeated. Consider the following example.


[ezpz@mercury (ptrace)]$ gdb prevent_2 
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
(gdb) b main
Breakpoint 1 at 0x8048467      
(gdb) r
Starting program: prevent_2    
      
Breakpoint 1, 0x08048467 in main ()
(gdb) set {int}0x0804a01c = 0  
(gdb) c
Continuing.
Someone captured my flag!      
[Inferior 1 (process 12130) exited normally]
(gdb) 

Since gdb can set a breakpoint at main control can be gained (by the debugger) prior to being able to self-trace. This situation can be identified from within the traced program, however, by looking at the return value of the call to ptrace.


--- prevent_2.c    2014-08-02 23:33:03.091366946 -0400 
+++ prevent_3.c    2014-08-02 23:33:06.939366991 -0400 
@@ -5,7 +5,10 @@
 long global_flag = 1;
 
 int main () {
-    ptrace (PTRACE_TRACEME, 0, 0, 0);
+    if (0 != ptrace (PTRACE_TRACEME, 0, 0, 0)) {
+        fprintf (stderr, "Tsk tsk tsk...");
+        return 1;
+    }
     while (global_flag) {     
         fprintf (stderr, "Running ...\n");
         sleep (5);

Now gdb can set the breakpoint and modify the memory but when execution continues the program will exit when the call to ptrace (from within gdb) fails.

The observant reader will realize that, from within the debugger, the return value check can also be modified. In fact, nothing prevents someone from directly modifying the binary prior to running the program. There are a variety of mechanisms - both static and dynamic - that can get around the above methods. Some can be prevented; others not. What these mechanisms do provide is a relatively cheap investment that raises the bar when trying to dynamically change program behavior.

Sunday, June 29, 2014

Of Binary Bombs (the secret)

So far, I've described six stages of this bomb along with their solution. These stages have built up in difficulty while describing often used programming constructs such as: string comparison, arrays, a switch statement, recursion, lookup tables, linked lists, and here in the final stage a binary search tree.

While solving the 6th phase will successfully defuse the bomb there is a curious section of code executed at the end. The most important thing to notice is that we can not trigger the bomb from this point on; the entire function will only jump to a graceful exit unless we unlock the secrets. Recall the code for sym.phase_defused:


Initially, there is a check for the total number of lines entered so far; until this point that check has failed. Here the jump is bypassed and execution proceeds to call to sscanf. Two important arguments to sscanf here are: the format string: str._d_s (%d %s) and 0x804b770. From that first argument we can infer the types that will be read and the second indicates from where we will read that data. Unlike in prior phases, there is no input line read to start this phase so 0x804b770 must already have data located in it.

If we look at what is stored there we find nothing special - certainly not something that looks like a number followed by a string.


This analysis is using a static binary, however, so this memory may get filled in at runtime. We have looked at each function in turn and the only changes in memory are driven by the inputs we provide. So where, is this address in memory? If we look for known addresses around this we see that 0x804b770 is located at sym.input_strings+240. Remember in phase 2 we determined that sym.input_strings was a global array of 80-byte character arrays to hold the inputs we provide. So 240 bytes beyond that is the 4th solution we provided (the number 9). There was no string after that but that is part of the secret...

The sym.read_line grabs the entire input line and in phase 4 sscanf only looked for %d which leaves the remainder of the buffer untouched. Nothing prevents us from providing some trailing values after the number so long as there is a space between them.

Supposing we did provide a trailing string the next step is to check that string against str.austinpowers. So that is the secret to accessing the secret phase: update the 4th input to be '9 austinpowers'. 


The secret phase reads in an additional line from the input stream and converts it to a long value using strtol. That value is decremented and compared against 0x3e8 (1000) - the bomb is triggered if our decremented value is greater than that. If the input passes that check we enter the final function: sym.fun7. Prior to going into detail, however, it is important to note that the return value from this function needs to be 0x7 to avoid triggering the bomb. The initial value to sym.fun7 is sym.n1 (0x0804b320).


This is a recursive function very similar to the one explained in stage 4. To understand what is happening with the control flow it is important to first understand what is contained in sym.n1. However, unlike stage 4 this variable name gives us little indication of what the memory may contain.

Looking at the first 16 bytes of that memory location we see the values are (after adjusting for endianess and assuming 32-byte values): 0x24, 0x0804b314, 0x0804b308, 0x0. The second two look very much like memory addresses in a range very close to sym.n1.


Following these two addresses we arrive at a very similar layout. This begins to resemble a recursive data structure most people will recognize: a binary tree. In C it is represented as:

struct bst {
    int value;
    struct bst *left, *right;
};      

Mapping out the entire tree yields the following:


Now, that will make it easier to follow the control flow in sym.fun7 but there are still some pieces that are needed before a solution can be derived directly. Back in sym.fun7, there is an initial check for a nil next pointer and then the remainder of the function follows a pre-order traversal of the binary search tree.

The main concern at this point is understanding how the return value is calculated. Ultimately, we need to understand when the return value will be 7 so that we can provide input that will force a return at that particular point. The control flow on the left subtree either continues down the next left subtree when the argument node value is less than the current node or the right subtree if the value is greater than the current node. If the value is equal to the current node, eax is set to zero and the function returns.

The return path from a left tree traversal simply doubles the value of eax and returns to the caller. The return from the right subtree is a little more interesting - in addition to eax being doubled it is also incremented by one prior to returning to the caller. Since eax is used to hold intermediate memory addresses, the calculation probably only makes sense when the search value is found in the tree (thus setting eax to 0).

Since a found value returns 0 initially any return from a left subtree will only propagate the zero value; in order to get to seven we need to rely on the increment on the return path of the right subtree path. The only path that leads to the target return value is the one from the rightmost leaf in the tree.


To force a return value of 7 we must provide a value of 1001.



Tuesday, June 10, 2014

Of Binary Bombs (part 6)

In the last installment (phase 5) Dr. Evil used masking and a lookup table to try and defeat any secret agent. I will continue on here with the final phase of this binary bomb: phase 6. (This isn't really the final stage - check out the secret stage)

Our input string is loaded into the edx register as usual but then there is a strange reference to a sym.node1 that gets loaded into local stack space. That makes our first order of business to find what is stored in sym.node1.


The name node1 gives a fairly blatant hint at how we should look at this memory (without the symbols, this task would be a whole lot less straightforward). The first several bytes are pretty sparse: interpreting as 32-bit values we get 0xfd (253), 0x01 (1), and then the value 0x0804b260 (this is stored in little endian). That looks like another memory address; lets see.


Same structure. 0x02d5 (725), 0x02 (2), 0x0804b254. And the pattern continues. I'll take a leap and say that we have something that looks like the following C structure:

struct list_ {
    int value_;
    int index_;
    struct list_ *next_;
};      

I'm going to walk the list for a while to collect the values (and verify the counter continues in order). That results in the following (value_,index_) pairs starting from sym.node1.

(253, 1)
(725, 2)
(301, 3)
(997, 4)
(212, 5)
(432, 6)

The list is terminated at that point with a null next_ pointer. At this point, the values of the list are known so it is appropriate to resume walking the body of sym.phase_6.

Currently, the input string is loaded into edx and the linked list is stored in a local value; next a local buffer is loaded to eax and sym.read_six_numbers is called. I described this function in phase 2 and we can expect that the local buffer will contain our six input numbers after the call. I have a guess at this point what they should be but I want to verify first to avoid any of Dr. Evil's tricks.

The remainder of this phase can be broken down into four distinct loops. They are:
  1. Verify the input values
  2. Collect the nodes of the above list according to the input values
  3. Reorder the original list with that collection
  4. Verify the resulting list
While the input verification has a nested loop it is the most straightforward of the steps: it checks that all values are unique and less than 7.


Initially, collecting nodes according to the input values is a little harder to grasp as it too is a nested loop construct but is now dealing with offsetting into structures and moving memory locations (C pointers) around.


Specifically, the commented line below walks the linked list. This is something that would have not been evident had I not understood the memory in sys.node1.

    mov eax, [edx+ecx]
    lea esi, [esi]
    mov esi, [esi+0x8]  ; this uses the 'next' pointer  
    inc ebx
    cmp ebx, eax

The third step, reordering the original list, is short and looks simple enough but took me some time to fully grok. I needed to understand that the previous step was storing local copies of the nodes in the original list. From that the original list is overwirtten here in the order specified by the input.


Finally, the overwritten list is checked to ensure that the value_ elements are arranged in decreasing order.

With that final piece of information the necessary input sequence becomes clear - the solution is to provide index_ values that order the value_ members from largest to smallest.


Below is a mapping of this functionality to some C code that it may have come from.


struct list_ {
    int value_, index_;
    struct list_ *next;
};

void phase_6 (const char * input) {                 
    
    int i = 0;
    struct list_ *list = ..., *node = list;
    int values[6] = {0};     
    struct list_ *nodes[6] = {0};
    read_six_numbers (input, values);
        
    // 0x08048db8 - 0x08048e00
    for (; i < 6; ++i) {
        int j = i + 1;
        if (values[i] > 6) explode_bomb ();
        for (; j < 6; ++j) 
            if (values[i] == values[j]) 
                explode_bomb ();
    }

    // 0x08048e02 - 0x08048e42
    for (i = 0 ; i < 6; ++i) {
        node = list;
        while (node) {
            if (node->index_ == values[i]) {
                nodes[i] = node;
                break;
            }        
            node = node->next;
        }        
    }

    // 0x08048e44 - 0x08048e60
    i = 1;
    list = nodes[0];
    node = list;
    while (i <= 5) {
        node->next = nodes[i];
        node = node->next;
        ++i;
    }
    node->next = 0;

    // 0x08048e67 - 0x08048e85
    node = list;
    for (i = 0; i < 5; ++i)
        if (node->value_ < node->next->value_)
            explode_bomb ();

}

Tuesday, June 3, 2014

Of Binary Bombs (part 5)

Part 4 detailed a recursive function that calculated the nth entry into the Fibonacci sequence. Here we continue with the next stage to defeating Dr. Evil.


There is a familiar face here: sym.string_length. Recall in phase 1 I glazed over sym.string_not_equal which had buried inside a call to sym.string_length - if you've been following along at home this is not a surprise. The result of this call (which expects our input string as an argument) should be 6.

    cmp eax, 0x6

This is our first clue to solving the riddle.

Peeking ahead a little there are two memory locations referenced directly: sym.array.123 and str.giants. Before we get too far into the details of sym.phase_5 lets look at what each of these contain. Using the memory printing capabilities of radare2 we can do this with: px @ sym.array.123 and ps @ str.giants to get the hex and ASCII representations, respectively.

Not surprisingly str.giants contains the string 'giants' and the content of sym.array.123 is listed below:


Alright, now that we've got some context lets continue with the code.

    lea ecx, [ebp-0x8]      ; load an empty local array
    mov esi, sym.array.123  ; set a pointer to the first element of the memory above
    mov al, [edx+ebx]       ; target of the jump below 
    and al, 0xf
    movsx eax, al
    mov al, [eax+esi]
    mov [edx+ecx], al
    inc edx
    cmp edx, 0x5
    jle 0x8048d57

After loading the address of a local array the code enters a loop from 0 to 5 (for the six characters of our input). The body of that loop does the following:

Selects the nth byte from the user input string, masks off the bottom 4 bits, and then uses that as an index into sym.array.123. The byte at that index is then copied to the local array.

    mov al, [edx+ebx]
    and al, 0xf
    movsx eax, al
    mov al, [eax+esi]
    mov [edx+ecx], al

In C, that might look similar to

char array123[] = "isrveawhobpnutfg", local[6] = {0}, *input = ...;
int i = 0;

for (; i < 6; ++i)
    local[i] = array123[input[i] & 0xf];


After the loop the local array is null terminated and compared against str.giants; matching strings avoids triggering the bomb. Now all we need is to determine what indices from sym.array.123 yield the string 'giants.'

Recall the memory stored in sym.array.123 - isrveawhobpnutfg. The necessary index sequence then becomes: 0xf, 0x0, 0x5, 0xb, 0xd, 0x1. Since our ASCII input is masked we need to find ASCII strings with lower-order bits matching these values. I list the valid combinations (for printable ASCII) below:

0xf : / ? O _ o
0x0 : 0 @ P ` p
0x5 : % 5 E U e u
0xb : + ; K [ k {
0xd : - = M ] m }
0x1 : ! 1 A Q a q

Any combination of those values should be a valid input to solve this stage. Let's try one: 'opekma'


Sweet, almost there. Next up is phase 6 the [supposed] last stage...

Monday, May 26, 2014

Of Binary Bombs (part 4)

Part 3 of this series explored phase_3 of this riddle which marked the first phase that accepted more than a single correct input sequence. Here I'll continue with the next phase: sym.phase_4.


At first, some arguments are loaded onto the stack to prep for a call to sscanf but rather than an immediate string as the format string argument, an address is provided. Recall in in phase_2 that we could get a hint of the input by printing str._d_d_d_d_d_d (ps @ str._d_d_d_d_d_d) - here we need to understand the arguments to sscanf a little to know that the address 0x8049808 should contain a format string. Indeed, it does (ps @ 0x8049808 yields %d) - our input string needs to be a number. That number needs to be larger than 0.

    cmp dword [ebp-0x4], 0x0
    jg 0x8048d0e

After the input is checked, its value is pushed to the stack and control is passed to sym.func4.


The second requirement of our input is now revealed:

    cmp ebx, 0x1
    jle 0x8048cd0

For values larger than 1 sym.func4 is called recursively.

    lea eax, [ebx-0x1]
    push eax
    call sym.func4

So the input value to sym.func4 (our input, initially) is decremented by 1 and passed to the recursive call. The result of that call is saved to esi and then sym.func4 is called recursively yet again - this time after decrementing the input by 2.

    lea eax, [ebx-0x2]
    push eax
    call sym.func4

The return value of the second recursion is added to the result of the first and that sum becomes the return value of this depth of the recursive call. In C, this looks something like:

int func4 (int n) {
    if (n < 2) { return 1; }
    return func4 (n - 1) + func4 (n - 2);
}

Back in sym.phase_4 there is the following check:

    cmp eax, 0x37

The input, when fed to sym.func4, should return 55. The recursion of n-1 + n-2 is one way to calculate the Fibonacci number at index n. That, coupled with the fact that 55 is a Fibonacci number, allows us to derive the necessary input value of 9.


Next up, part 5.

Thursday, May 22, 2014

Of Binary Bombs (part 3)

In part 2 I explained both sym.read_line and the solution to sym.phase_2. Here I work through the third phase of Dr. Evil's nasty binary bomb.


I'll assume to start directly at sym.phase_3 beyond the input handling routines previously discussed.



Phase 3 starts with a call to sscanf with a format string of "%d %c %d". So we should provide a number, character, and a number. The first number should be less than or equal to 7

    cmp dword [ebp-0xc], 0x7

and is used to calculate a jump address

    mov eax, [ebp-0xc]
    jmp dword [eax*4+0x80497e8]

If we start with eax containing 0 this jumps us to a location stored at 0x80497e8. That address is 0x08048be0 - or just over the next instruction.

    mov bl, 0x71
    cmp dword [ebp-0x4], 0x309

This sets bl to 0x71 (ASCII 'q') and compares the third input to 777. If the third input is 777 control jumps to

    cmp bl, [ebp-0x5]

So, to avoid the bomb we can provide the following input: 0 q 777 and we have a valid solution.


But what about setting eax to something other than 0 to start? Let's look at the other possible jump addresses for values less than 8 but greater than 0 for the first input. I've abbreviated the code and commented what is different from the above description.

; eax == 1 - 0x08048c00
    mov bl, 0x62                ; ASCII 'b'
    cmp dword [ebp-0x4], 0xd6   ; 214

; eax == 2 - 0x08048c16
    mov bl, 0x62                ; ASCII 'b'
    cmp dword [ebp-0x4], 0x2f3  ; 755
        
; eax == 3 - 0x08048c28
    mov bl, 0x6b                ; ASCII 'k'
    cmp dword [ebp-0x4], 0xfb   ; 251
        
; eax == 4 - 0x08048c40
    mov bl, 0x6f                ; ASCII 'o'
    cmp dword [ebp-0x4], 0xa0   ; 160
  
; eax == 5 - 0x08048c52
    mov bl, 0x74                ; ASCII 't'
    cmp dword [ebp-0x4], 0x1ca  ; 458

; eax == 6 - 0x08048c64
    mov bl, 0x76                ; ASCII 'v'
    cmp dword [ebp-0x4], 0x30c  ; 780

; eax == 7 - 0x08048c76
    mov bl, 0x62                ; ASCII 'b'
    cmp dword [ebp-0x4], 0x20c  ; 524

So it looks as if there are several answers to this part of the riddle. Let's verify at least one of the others work.

Next, phase 4.

Wednesday, May 21, 2014

Of Binary Bombs (part 2)

In part 1 I worked out the setup of the binary bomb and walked through the solution to the first phase of the riddle. This post continues from there to examine phase 2.

After defusing phase 1 the program reads the next line from the user (sym.read_line) and calls sym.phase_2. In my first post I glazed over the sym.read_line with some hand-waving so I'd like to revisit that here with a little more attention. As a reminder, in radare2 you can search for a symbol while in visual mode with s (s sym.read_line).


The first thing that happens in sym.read_line (after managing the pointers and local variables) is a call to sym.skip. sym.skip has a little more to it than just this but basically determines which input line is to be read by looking at the global variable sym.num_input_strings and using that as an index into sym.input_strings (a global holding each input string in an 80-byte char array). It returns the string read in eax.

Once a line is read from the input stream the following instructions check the return value of sym.skip.

    test eax, eax
    jne 0x804925f

Only if the return value of sym.skip is zero (when null is returned) does execution not jump to 0x0804925f. I will cover later what happens in the null return case. For now, non-null returns follow this explanation.

    mov eax, [sym.num_input_strings]

At this point, we have entered 2 strings but the global counter sym.num_input_strings has only been updated once so contains the value 1.

    lea eax, [eax+eax*4]
    shl eax, 0x4
    lea edi, [eax+sym.input_strings]

The above sets eax to eax * 5, multiplies the result by 16, and loads the string at sym.input_strings+eax into edi. In this case (sym.num_input_strings == 1) this yields the second index in the sym.input_strings array (recall that each entry there is 80 bytes).

The next set of instructions calculate the length of the input string by decrementing a counter for each non-null byte in the string. The calculation uses some shortcuts for efficiency so is not directly mapped to traditional C-like counting (google 'asm string length' for details).

    mov al, 0x0
    cld
    mov ecx, 0xffffffff 
    repne scasb
    mov eax, ecx
    not eax
    lea edi, [eax-0x1]

Our input strings must fit within the 80-byte array so checking that the string (less newline) is less than 0x4f (79) bytes is done next

    cmp edi, 0x4f

Finally, the newline is replaced with a zero byte

    mov byte [edi+eax+0x804b67f], 0x0

the input string is placed in eax and the global sym.num_input_strings is incremented by 1.

At this point eax contains the input string and I can resume with the second phase of the bomb.


After the prologue and setting aside 32 bytes of local space the input string is copied to edx and the local buffer address is loaded into eax. These two arguments are pushed onto the stack and sym.read_six_numbers is invoked. To avoid making this post too long I'll skip over the detail of sym.read_six_numbers and just mention that the numbers in the input string are read into the local array via sscanf with a format string of "%d %d %d %d %d %d" (Hint #1). 

The following check verifies that the first entry in the local array has the value 1 and is the second hint to solving this phase.

    cmp dword [ebp-0x18], 0x1                

The next section of code is the meat of phase 2 and how we determine what numbers to include in our string.

    mov ebx, 0x1
    lea esi, [ebp-0x18] 
    lea eax, [ebx+0x1]  ; target of jump below (0x8048b76)
    imul eax, [esi+ebx*4-0x4] 
    cmp [esi+ebx*4], eax   
    je 0x8048b88 
    call sym.explode_bomb
    inc ebx             ; target of jump above (0x8048b88)
    cmp ebx, 0x5  
    jle 0x8048b76 

This is a loop from 1 to 5 using ebx as the counter. The initialization sets ebx to 1 and esi to the beginning of the array of converted numbers (the local array). On each iteration of the loop eax is set to ebx + 1 then multiplied by the nth entry of the local array (n is ebx - 1). The result of that is compared to the next entry in the array. In C parlance this might look like:

    int xs[] = {...}; 
    for (i = 0; i < 5; ++i)
        if ((i+1) * xs[i] != xs[i+1]) 
            FAIL();

If we know that xs[0] must be 1 we can now use the above to calculate the remainder of the array.

    x[1] = 2 * x[0] => 2
    x[2] = 3 * x[1] => 6 
    x[3] = 4 * x[2] => 24
    ...

Ultimately, our input requirements become: 1 2 6 24 120 720. This works as expected and phase 2 is now solved.


On to phase 3.