Sunday, June 29, 2014

Of Binary Bombs (the secret)

So far, I've described six stages of this bomb along with their solution. These stages have built up in difficulty while describing often used programming constructs such as: string comparison, arrays, a switch statement, recursion, lookup tables, linked lists, and here in the final stage a binary search tree.

While solving the 6th phase will successfully defuse the bomb there is a curious section of code executed at the end. The most important thing to notice is that we can not trigger the bomb from this point on; the entire function will only jump to a graceful exit unless we unlock the secrets. Recall the code for sym.phase_defused:


Initially, there is a check for the total number of lines entered so far; until this point that check has failed. Here the jump is bypassed and execution proceeds to call to sscanf. Two important arguments to sscanf here are: the format string: str._d_s (%d %s) and 0x804b770. From that first argument we can infer the types that will be read and the second indicates from where we will read that data. Unlike in prior phases, there is no input line read to start this phase so 0x804b770 must already have data located in it.

If we look at what is stored there we find nothing special - certainly not something that looks like a number followed by a string.


This analysis is using a static binary, however, so this memory may get filled in at runtime. We have looked at each function in turn and the only changes in memory are driven by the inputs we provide. So where, is this address in memory? If we look for known addresses around this we see that 0x804b770 is located at sym.input_strings+240. Remember in phase 2 we determined that sym.input_strings was a global array of 80-byte character arrays to hold the inputs we provide. So 240 bytes beyond that is the 4th solution we provided (the number 9). There was no string after that but that is part of the secret...

The sym.read_line grabs the entire input line and in phase 4 sscanf only looked for %d which leaves the remainder of the buffer untouched. Nothing prevents us from providing some trailing values after the number so long as there is a space between them.

Supposing we did provide a trailing string the next step is to check that string against str.austinpowers. So that is the secret to accessing the secret phase: update the 4th input to be '9 austinpowers'. 


The secret phase reads in an additional line from the input stream and converts it to a long value using strtol. That value is decremented and compared against 0x3e8 (1000) - the bomb is triggered if our decremented value is greater than that. If the input passes that check we enter the final function: sym.fun7. Prior to going into detail, however, it is important to note that the return value from this function needs to be 0x7 to avoid triggering the bomb. The initial value to sym.fun7 is sym.n1 (0x0804b320).


This is a recursive function very similar to the one explained in stage 4. To understand what is happening with the control flow it is important to first understand what is contained in sym.n1. However, unlike stage 4 this variable name gives us little indication of what the memory may contain.

Looking at the first 16 bytes of that memory location we see the values are (after adjusting for endianess and assuming 32-byte values): 0x24, 0x0804b314, 0x0804b308, 0x0. The second two look very much like memory addresses in a range very close to sym.n1.


Following these two addresses we arrive at a very similar layout. This begins to resemble a recursive data structure most people will recognize: a binary tree. In C it is represented as:

struct bst {
    int value;
    struct bst *left, *right;
};      

Mapping out the entire tree yields the following:


Now, that will make it easier to follow the control flow in sym.fun7 but there are still some pieces that are needed before a solution can be derived directly. Back in sym.fun7, there is an initial check for a nil next pointer and then the remainder of the function follows a pre-order traversal of the binary search tree.

The main concern at this point is understanding how the return value is calculated. Ultimately, we need to understand when the return value will be 7 so that we can provide input that will force a return at that particular point. The control flow on the left subtree either continues down the next left subtree when the argument node value is less than the current node or the right subtree if the value is greater than the current node. If the value is equal to the current node, eax is set to zero and the function returns.

The return path from a left tree traversal simply doubles the value of eax and returns to the caller. The return from the right subtree is a little more interesting - in addition to eax being doubled it is also incremented by one prior to returning to the caller. Since eax is used to hold intermediate memory addresses, the calculation probably only makes sense when the search value is found in the tree (thus setting eax to 0).

Since a found value returns 0 initially any return from a left subtree will only propagate the zero value; in order to get to seven we need to rely on the increment on the return path of the right subtree path. The only path that leads to the target return value is the one from the rightmost leaf in the tree.


To force a return value of 7 we must provide a value of 1001.



2 comments :

  1. This comment has been removed by a blog administrator.

    ReplyDelete
  2. recap:
    Phase 1: Basics, global string storage access
    Phase 2: Six numbers, solve through logic or getting the number when bomb would explode
    Phase 3: Number and character (for me it was each number corresponding to a switch case)
    Phase 4: Fibonacci sequence in function
    Phase 5: Cipher shift
    Phase 6: Linked list with pointers
    Secret Phase: Binary tree with harder-to-decipher Fibonacci sequence

    ReplyDelete