Sunday, December 12, 2010

Know Your Tools

I was recently pulled into a project that needed a developer for a short sprint to make a portable version of a set of already existing tools. The functional version of this suite consists mostly of C and Perl code written to one of three platforms: FreeBSD, Linux, or Windows. The Windows support was provided via Cygwin and the version I was to develop intended to remove this and other crutches. It's also relevant to mention that I would only be converting the client side code; the server would remain untouched.

The port itself was relatively straightforward. Moving Perl and C to Python is not terribly taxing in and of itself, it's more of a burden ensuring that you do not break any existing functionality. These tools essentially provide a remote query/response mechanism across multiple platforms which simplified the testing approach. I would just iterate through all supported queries, store the output, and do a diff on each target platform.

I had two weeks to deliver. At the end of the second week the C and Perl files were converted to Python and I was testing the functionality of the system but was getting a 50% fail rate. A random set of commands were only getting a portion of the response - no error codes, just partial responses.

From trace output in each version the behavior looked sane. From tcpdump output I could tell that the handshake and shutdown was happening properly it just looked as if the server had less to send my version of the client. Because it can be easier to elicit salient information through Wireshark's interface I plugged the pcap traces in and looked a them side-by-side. There it was: a single byte was different between the two. The original query:



And the version I was sending:


The difference is a single space (the 0x20 sitting at the end of the first image). It threw me the first time since I was actually looking for non-printable characters in the ASCII-formatted output. Non-printable characters show up as '.' but 0x20 is a space character. Grrr. For whatever reason, the server will only return a full response if the command has a trailing space character. It's intriguing that a partial response is returned for an invalid request without warning. My guess is that there is an off-by-one error somewhere in the server command parsing code.

I pushed a bug up to the server group though I doubt it will see the light of day since I've already covered the ground necessary to make the new tool work. If it ain't broke costing us money, don't fix it.

The moral of the story is that you should know the tools that can help you track down problems. You should also understand the strengths of different versions of those tools (Wireshark vs. tcpdump). Tried and true tools are invaluable to getting good work done and it is your responsibility as a developer to have your set at the ready at all times.

No comments :

Post a Comment