uu kk: 03.2011

Thursday, March 31, 2011

The Price of Parking

I live in the Philadelphia area. If you've ever been to a Philadelphia sporting event you understand that there exists a certain kind of crazy that envelops the area just before game time and hangs around to varying degrees after the game depending on the outcome. If you are a true Philly fan you drink the crazy Kool-Aid. I don't, but my father-in-law has season tickets to the Flyers so occasionally I pretend to get a little punch drunk.

I don't live in an area where public transportation is feasible, so I usually end up driving to the stadium from work. Parking is an assumed cost (as are $7 beers), but at the last game I went to the price was pretty steep: $15. Just three years earlier I could get a spot for a mere $10. That was painful enough to make me look into it when I got home, and so I did.

There wasn't much to be gleaned from the fact that prices had risen over the past few years, especially since Philadelphia built three new stadiums in the last ten years and, unfortunately, there has lately been a blanket excuse for raising prices without cause: the poor economy. Either way, in thinking about all of this I started to consider how information can be used to manipulate people. More specifically, I imagined how the owner of the lot might try to explain away the increase to his (or her) customers.

As a consumer, I might approach the situation with the argument that the cost of parking has risen exponentially over the past five years. Certainly, from the following representation that might make sense:

Of course, that is presented in precisely the right way: the scale is only large enough to include the data points being shown; there is no history previous to 2007 where a consistent price may have been held (the two data points at $10 hint at this, however); and there is no comparison to other lots and/or prices in the area.

The counter to that point of view might look something like this:

Plenty going on here. First, and probably most important, is the scale: adjusting the scale results in a trend that appears closer to linear than exponential. Other factors: extending the range (going back to 2001); a carefully chosen cost of living number - modest, but keeps total profit negative; keeping the number of games high (it is a multiplier of the profit). In addition, the loss is not exactly negative income per patron, it is lost profit against the chosen cost of living increase.

In exploring this thought experiment I've only reinforced what I already know: we, as presenters of data, have an obligation to be honest and straightforward. Data can be made to tell any convenient story; we would do well to remember that when consuming and producing information.

"The most dangerous untruths are truths moderately distorted." - Georg Lichtenberg

Incidentally, I had to leave that game early due to my daughter being very tired. On the way out I was helping my daughter put on her coat and a boy came over and asked if he could give her a puck. It seems he caught the puck during the game and wanted my daughter to have it. Well her eyes lit up and, of course, it completely made my night. Certainly worth the price of parking if you ask me.

Sunday, March 13, 2011

Disappearing Ink

This is something I've been wanting to post about for a while now. There was a thread on a forum I used to frequent (under a different handle). You can follow the link for details, but the main concept was that, in a transition to a GUI environment, the developers needed to intercept printf and fprintf calls, doing something GUI-related when the GUI was running and leaving the software as-is when it was not. Their approach was to define their own versions of the functions and handle the context there. Something like:

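#include <stdarg.h>
#include <stdio.h>

/* is_gui and showMessageInGui() are defined elsewhere in the application. */
extern "C" int fprintf (FILE *__restrict __stream,
       __const char *__restrict __format, ...)
{
   va_list args;
   int return_status = 0;

   va_start(args,__format);

   /* Send console output to the GUI only while the GUI is running. */
   if (is_gui && (__stream == stdout || __stream == stderr)) {
       return_status = showMessageInGui(NULL, __format, args);
   }
   else {
       /* Otherwise behave like the normal fprintf. */
       return_status = vfprintf(__stream, __format, args);
   }

   va_end(args);
   return return_status;
}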

This only worked part of the time. Complicating matters, the behavior differed across compiler versions: gcc 4.2.4 exhibited the problem while gcc 3.4.2 did not.

I did some digging around and found out that the newer version of gcc was applying an optimization that undermined the approach of redefining printf. Basically, any call to printf whose format string contained no conversion specifier (%) was translated by later versions of gcc into a call to puts (or to fwrite, in the case of fprintf). For example:

#include <stdio.h>
int main () {
    printf ("With args %d\n", 10);
    printf ("Without\n");
    return 0;
}

Translates to

.LC0:
    .string "With args %d\n"
.LC1:
    .string "Without"
    .text
    ...
    movl    $.LC0, %edi
    movl    $0, %eax
    call    printf
    movl    $.LC1, %edi
    call    puts
    movl    $0, %eax

Notice the puts call for the second call to printf. So regardless of what definition was provided for printf the code would never be executed. Try it with the following:

#include <stdio.h>

int printf (const char * fmt, ...) { return 1/0; }

int main () {
    printf ("w00t\n");
    return 0;
}

You'll see that a warning is presented (for division by zero) but the code runs without issue.
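
To make the trigger concrete, here is a rough sketch in C of the one case shown in the assembly above: a format string with no % that ends in a newline. The helper name puts_equivalent is made up for illustration and is not part of gcc. The compiler's real logic covers more cases than this, so treat it as a simplified model only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not gcc code): models the single rewrite shown
 * above. If a printf format string has no '%' and ends in '\n', copy
 * the text without the trailing newline into out and return 1, which
 * is the string gcc hands to puts. Return 0 in every other case.
 */
int puts_equivalent(const char *fmt, char *out, size_t out_size)
{
    assert(fmt != NULL && out != NULL);

    size_t len = strlen(fmt);

    /* A '%' means there may be something to format: printf is kept. */
    if (strchr(fmt, '%') != NULL)
        return 0;

    /* puts appends its own newline, so the string must end in one. */
    if (len == 0 || fmt[len - 1] != '\n')
        return 0;

    /* Need len - 1 characters plus the terminating NUL. */
    if (out_size < len)
        return 0;

    memcpy(out, fmt, len - 1);
    out[len - 1] = '\0';
    return 1;
}
```

Fed the two format strings from the earlier example, this accepts "Without\n" (producing "Without", matching the .LC1 string in the assembly) and rejects "With args %d\n".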

There were several solutions presented in the thread but the OP eventually chose to use gcc flags to prevent this behavior. Using -fno-builtin-fprintf and -fno-builtin-printf allowed compilation without the translation and the redefinition of printf worked as expected.
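
For what it's worth, another way to sidestep the problem is to not reuse the standard names at all. The sketch below is my own illustration, not one of the solutions from the thread: app_fprintf, show_message_in_gui and gui_last_message are hypothetical names, and the GUI routine is a stub that only records the text. Because the compiler has no builtin called app_fprintf, it has nothing to rewrite, at the cost of having to change every call site.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-ins for the application's GUI state and display call. */
static int is_gui = 0;
static char gui_last_message[256];

static int show_message_in_gui(const char *text)
{
    /* A real GUI would display the text; this stub only records it. */
    return snprintf(gui_last_message, sizeof gui_last_message, "%s", text);
}

/* Wrapper with its own name, so gcc cannot turn calls into puts/fwrite. */
int app_fprintf(FILE *stream, const char *format, ...)
{
    va_list args;
    int status;

    assert(stream != NULL && format != NULL);

    va_start(args, format);
    if (is_gui && (stream == stdout || stream == stderr)) {
        /* Format into a local buffer first, then hand it to the GUI. */
        char text[256];
        status = vsnprintf(text, sizeof text, format, args);
        if (status >= 0)
            show_message_in_gui(text);
    } else {
        /* No GUI running: behave like the normal fprintf. */
        status = vfprintf(stream, format, args);
    }
    va_end(args);
    return status;
}
```

With is_gui set, both app_fprintf(stdout, "Without\n") and app_fprintf(stderr, "With args %d\n", 10) end up in the GUI stub, whether or not the format string contains a %.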

Anatomy of an update

I recently put ntrace up on github.

While I was doing testing for that version I saw my package manager blinking at me telling me I needed updates. I decided to take that as an opportunity to test ntrace on a more realistic use case (at the time I was just doing scp or iperf tests). Instead of using the Synaptic Package Manager GUI I opened up a terminal and used aptitude to do the update. Since I know nothing about how aptitude works, this was a good opportunity to see what ntrace could tell me.

Here was the command I ran:

LD_PRELOAD=./libntrace.so aptitude safe-upgrade

The update was small; it consisted of only 5 packages:

The following packages will be upgraded:
      libmozjs1d libsvn1 subversion xulrunner-1.9 xulrunner-1.9-gnome-support
    5 packages upgraded, 0 newly installed, 0 to remove and 7 not upgraded.
    Need to get 10.4MB of archives. After unpacking 8192B will be used.
    Do you want to continue? [Y/n/?]

After typing Y to continue the update completed without issue. I parsed the resulting output and compiled the following graph:

Interesting points:

Between seconds 3 and 12 there is no activity. This is the time I was reading the screen before I confirmed and continued with the update.
Each package was retrieved by its own process. The dots are color-coded according to the PID that was associated with the traffic (see the legend).
Packages are downloaded sequentially instead of in parallel.
The long delay after 17 seconds is the time it took to actually install the updates on my system. The original process then communicates some more (presumably about the status of the children) and the update exits.
The traffic patterns for the parent process are nearly identical at the front and back end of the communication. Maybe all details of the update are advertised at both ends instead of a diff? Dunno, but it is interesting.

This is how ntrace is useful to me: quick insight into application activity as it relates to the network.

As part of my testing I am doing performance impact analysis. Perhaps some of those details will appear here as well.

Wednesday, March 9, 2011

github

So I've finally caught up with all the hype and got myself a github account. You can find more randomness there.

Please let me know if you find anything there useful.

ezpz on github :)