Sunday, June 28, 2009


Enabling "stealth mode" (which should kill all responses to unsolicited [from the OS's point of view] SYN/ACK packets) does not prevent the OS from responding on lo0. Evidently stealth mode is not active on the loopback interface.



Power is back on, but network problems continue

Power's back on, woooooooooo.

Okay, so I have two problems:
[1] If I attempt to 'send' packets 'from' an IP address that my computer actually has, the host OS's TCP stack sends a RST packet when it sees a SYN/ACK response to my 'sent' SYN packet.
[2] If I attempt to 'send' packets 'from' a different IP address, the packets get bounced back and forth several times between my computer and the router. For whatever reason, the listening server ('nc -l <port>') doesn't respond at all.


Power is out. Posting this from my iPhone.

Fighting with the host OS

After finding out that Wireshark is stupid, and considers 0x2000 to be the same as 0x0002 for Loopback-layer family fields, I am now battling with the host OS. Sending a connection request properly elicits a SYN/ACK response, but the OS is quickly sending out a RST packet to kill the connection.

Changing the source IP to be something other than an IP the operating system should care about results in no SYN/ACK packet. I tried moving back over to a non-loopback interface, with a different source IP address (using the correct destination IP address). It gets sent back from the router (the packet is duplicated with the source MAC being the router MAC), but it does not elicit a SYN/ACK response.

Weird. And frustrating.



So, evidently, it doesn't matter if you have a host listening (e.g. "nc -l 12345") and inject packets onto an active interface (e.g. "en1") pointed at that interface's IP address: any normally-generated packets to/from a local IP address go over "lo0".

That would have been good to know, what, a few hours ago?


Saturday, June 27, 2009


Lots of troubleshooting the code right now, trying to get the 3-way connect to work. Lots of issues with typos, unfortunately. A lot of it had to do with using (for example) "off" in place of "offset" to access the offset of a TCP packet, because of the way it's declared:

off = pcs.Field( "offset", 4 )

My eyes just saw the first part and ignored the second half when I started writing code. I thought I had gone back and fixed all of the fields; evidently not!


Friday, June 26, 2009

P4 Is Getting Kind of Annoying

I don't know if it's P4 itself, or P4 integration with Eclipse, but a versioning system should never prevent editing local files because a connection attempt failed, or because the host-name of the machine changed.

That is all.


Saturday, June 20, 2009

Checksumming No Se Funciona

Alright, so the last part of the puzzle that needs to fall in place is the checksum. Conveniently, the algorithm for checksumming IP and TCP packets is the same (albeit you feed them different sets of information).

The algorithm works somewhat like so (source IBM):

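/* 16-bit one's-complement checksum */
ushort checksum16(uchar* data, int len)
{
    uint sum = 0;
    /* Convert the byte count into a count of 16-bit words, rounding up. */
    if ((len & 1) == 0)
        len = len >> 1;
    else
        len = (len >> 1) + 1;
    /* Add up all of the 16-bit words. */
    while (len > 0) {
        sum += *((ushort*)data);
        data += sizeof(ushort);
        len--;
    }
    /* Fold the carries back into the low 16 bits. */
    sum = (sum >> 16) + (sum & 0xffff);
    sum += (sum >> 16);
    return htons(~sum);
}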

That is, it sums all of the 16-bit words, does some one's complement fanciness (folding the carries back in), and then returns the 16 least significant bits, inverted. Of course, the summing can be done in any multiple of 16 bits. Since most machines nowadays use 32- or 64-bit arithmetic, there's no reason not to do it with 32 bits (or 64). However, the final result is always 16 bits wide.

In Python, the code looks like (or so I think) the following:

from struct import unpack

def checksum( bytes ):
    tmpbytes = bytes
    total = 0
    # Must be a number of bytes equal to a multiple of two.
    if len( tmpbytes ) % 2 == 1:
        tmpbytes += "\0"
    # For each pair, add the value to the total.
    for i in range( len( tmpbytes ) // 2 ):
        total += ( unpack( "!H", tmpbytes[( 2 * i ):( 2 * i ) + 2] )[0] )
    # Keep folding the carries back in
    while total >> 16:
        total = ( total >> 16 ) + ( total & 0xffff )
    # One's complement
    total = ~total
    # Make into 2 bytes
    total = total & 0xffff
    # Return value
    return total

However, given an IP layer: 0x4500004096da40004006eca1c0a80168d04524e6, the checksum is 0xECA1. When calculating the checksum, the checksum field is set to zero (0x0000). So, the checksum is calculated like so:

>>> from binascii import *
>>> from struct import *
>>> data = unhexlify('4500004096da400040060000c0a80168d04524e6')
>>> data
>>> cksum = checksum(data)
>>> cksum
>>> hex(cksum)

Obviously, the end result is incorrect. Turns out that the values that I was using for testing were incorrect. I've since fixed them in the blog post and the offending code.
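
For reference, here is a minimal Python 3 sketch of the same one's-complement sum (the code above is Python 2). The name ip_checksum is my own, and the test value is the IP header quoted above with its checksum field zeroed out.

```python
import struct
from binascii import unhexlify


def ip_checksum(data: bytes) -> int:
    """Return the 16-bit one's-complement checksum of data."""
    # Pad to an even number of bytes.
    if len(data) % 2:
        data += b"\x00"
    # Sum the data as big-endian 16-bit words.
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total >> 16) + (total & 0xFFFF)
    # Invert and keep 16 bits.
    return ~total & 0xFFFF


header = unhexlify("4500004096da400040060000c0a80168d04524e6")
print(hex(ip_checksum(header)))  # 0xeca1
```

A handy sanity check: running the same function over a header that already contains the correct checksum should come out as zero.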


Friday, June 19, 2009

Glad I got up early

So I got up early before class today to work on the problem. I don't know for sure what the issue was, but I think it was a matter of incorrectly setting the TCP offset, which made Wireshark freak out.

Another PCS Architecture Rant
This time, I'm a bit curious about the design of the pcs.Chain class. It inherits from the 'list' base type, but instead of using itself to store the data, it actually *contains* a separate list. Example:

>>> isinstance(myChain,list)
>>> myChain
>>> myChain.packets
[<ethernet: src: '\x00\x1bc\x06\x82\xb2', dst: '\x00!)\xa5\xa9?', type: 2048>, <ipv4: hlen: 5, protocol: 6, src: 2886729738L, tos: 0, dst: 3494107623L, ttl: 64, length: 44, version: 4, flags: , offset: , checksum: , id: 56649>, <tcp: reset: 0, reserved: , sequence: 3915041816L, ack: , checksum: 60177, offset: 5, syn: 1, urgent: , window: 65535, push: , ack_number: 0L, dport: 80, sport: 52722, fin: , urg_pointer: 0>, <payload:>]

Not sure what's going on with that. Its constructor only takes a list object:

class Chain(list):
    def __init__(self, packets = None ):
        self.packets = packets
# Versus
class Chain(list):
    def __init__(self, packets = None ):
        # list.__init__ does not accept None, so fall back to an empty list
        list.__init__(self, packets or [])

This would end up saving a lot of time and make the resulting code look cleaner.

That would make it easier to perform nifty operations. pcs.Chain offers a "checksum" operation, that operates on the whole chain. However, consider the utility of "myChain[-2:].calc_checksum()". Now we can checksum an arbitrary range of packets! Not immediately useful, unless you realize that this is a nifty way to calculate the IP checksum ;-).
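
Here is a rough Python 3 sketch of what a list-backed chain could look like. To be clear, this is not the real pcs.Chain: the to_bytes() method and the byte-string "packets" are stand-ins I made up for illustration. The one catch is that slicing a list subclass hands back a plain list, so __getitem__ has to re-wrap slices.

```python
class Chain(list):
    """Hypothetical list-backed chain; not the real pcs.Chain."""

    def __init__(self, packets=None):
        # An empty chain when no packets are supplied.
        super().__init__(packets or [])

    def __getitem__(self, index):
        result = super().__getitem__(index)
        # Slices come back as Chain objects so chain methods still work.
        if isinstance(index, slice):
            return Chain(result)
        return result

    def to_bytes(self):
        # Stand-in for real encoding: each "packet" here is already bytes.
        return b"".join(self)


my_chain = Chain([b"eth", b"ip", b"tcp", b"payload"])
tail = my_chain[-2:]
print(type(tail).__name__, tail.to_bytes())
```

With that in place, any method defined on the chain (a checksum, for instance) would work on an arbitrary slice of it.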

Also, I have to wonder about the duplication of functionality offered by the pcs.Chain class and the "data" member. It seems that, for the most part, they perform the exact same function. Consider:

>>> p = payload()
>>> t = tcp()
>>> i = ipv4()
>>> e = ethernet()
>>> e.data = i
>>> i.data = t
>>> t.data = p
# Finally, these two should have the same ultimate result
>>> e.chain()
>>> chain([e,i,t,p])

I can see the reasoning behind the "data" member: it makes it easy to set-and-forget what the next level is, and the "chain()" method is a nice helper that builds a chain. But consider that the earlier example, myChain[-2:].calc_checksum(), currently looks like: myChain.packets[-2].chain().calc_checksum().


Thursday, June 18, 2009

Malformed Packet Bull@$!#

Getting pretty sick and tired of Wireshark being useless in telling me exactly why my TCP packets are malformed.
Just sayin'.


Saturday, June 13, 2009

Unrelated Note

On an unrelated note, FreeBSD handles pthread_mutex's shared status properly. Mac OS X does not.

Debugging code that relies on this on Mac OS X may cause hair loss.

(This was from my first CSE410 project, thought I'd post it here while I was posting things).


Bailed on previous idea

Also of note, I bailed on the idea from last night (and before) of using raw IP sockets to do the sending. The idea was to avoid needing to ever touch the physical or IP layer, thus removing some amount of work (e.g., no setting of the IP length fields, IP checksum calculation).

However, using raw sockets has the caveat of not providing a receive buffer. If using raw IP sockets, one cannot override the 'protocol' field to be IPPROTO_TCP. Haven't put much more thought into using raw TCP sockets.

Short Rant on pcs.Field
The class pcs.Field is used as a base class to represent any field inside of a packet. It provides bit-width flexibility, and the ability to encode/decode values of arbitrary bit-width. This is very useful when defining the various fields of a packet, as you can specify their exact size, and then with the pcs.Layout class, specify their exact order.

However, in order to string these two things together, you get somewhat of a kluge. Consider the following code from the tcp class's constructor:

def __init__(self, bytes = None):
    """initialize a TCP packet"""
    sport = pcs.Field("sport", 16)
    dport = pcs.Field("dport", 16)

And then the corresponding code to manipulate those fields...

>>> from pcs.packets.tcp import tcp
>>> t = tcp()
>>> type(t.offset)
<type 'int'>
>>> type(t.sport)
<type 'int'>

Okay, great. In the __init__ method, they were working on local variables with the same names as the properties to be used. The pcs.Field objects are supplied with a name that corresponds to the object property (e.g. the pcs.Field object with (... name="sport", ...) corresponds to the actual field "t.sport"). Great.

However, the approach falls short here. pcs.Field provides no way of setting the byte order of whatever field it is manipulating, so users are stuck doing this:

t.sport = socket.htons(80)
# Instead of
t.sport = 80

And having the "magic" just happen. This prevents simple comparisons, as well. Which of the following is easier to read?

if ntohs(t.sport) == 80: ...
if t.sport == htons(80): ...
if t.sport == 20480: ...
if t.sport == 80: ...
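
In case the 20480 looks like it came out of nowhere: it is just port 80 with its two bytes swapped, which is what htons(80) gives you on a little-endian machine. A quick sketch using struct, so the result doesn't depend on the byte order of whatever machine runs it:

```python
import struct

port = 80
# Encode in network (big-endian) order, then reinterpret as little-endian.
swapped = struct.unpack("<H", struct.pack("!H", port))[0]
print(swapped, hex(swapped))  # 20480 0x5000
```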

This is less trivially simple for fields that may have more than one representation. This was discussed in a previous post (IIRC) with relation to IP and hardware addresses. Consider the following. The IP address "127.0.0.1" can be represented as:

  • 16777343 (host-byte-order)
  • 2130706433 (network-byte-order)
  • '\x01\x00\x00\x7f' (host-byte-order byte-string)
  • '\x7f\x00\x00\x01' (network-byte-order byte-string)
  • And of course, "127.0.0.1"
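
Those values can be reproduced with the standard socket and struct modules. This is a Python 3 sketch, and note that the "host-byte-order" entries above assume a little-endian host, so the code spells out little-endian explicitly instead of relying on the machine it runs on.

```python
import socket
import struct

address = "127.0.0.1"
nbo_bytes = socket.inet_aton(address)        # b'\x7f\x00\x00\x01'
nbo_int = struct.unpack("!L", nbo_bytes)[0]  # 2130706433
le_int = struct.unpack("<L", nbo_bytes)[0]   # 16777343
le_bytes = struct.pack("<L", nbo_int)        # b'\x01\x00\x00\x7f'

print(nbo_int, le_int, nbo_bytes, le_bytes, socket.inet_ntoa(nbo_bytes))
```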

Obviously, this makes things a bit difficult to work with. However, the current method does not just use network-byte order. An IP address is a NBO long. A port is a NBO short. An ethernet address is a NBO byte-string. Figuring out the proper internal representation for everything is a pain, and translating between HBO and NBO makes it worse.

The simple solution is to allow multiple ways of getting and setting a field. Example using ports (also works with IP addresses, ethernet addresses):

>>> == 80
>>> == 80
>>> == '80'
>>> == 'P\x00\x00\x00'
>>> == '\x00\x00\x00P'
>>> == '\x00\x00\x00P'

This allows us all the flexibility in the world that is needed. Assuming that all of the different types support the base functions (get/set for ASCII, Integer, and Bytes in both host- and network-byte order), everything is quickly extendible.
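
To make the idea concrete, here is a minimal Python 3 sketch of a 16-bit field that stores one canonical integer and converts at the edges. The class name PortField and every accessor name in it are hypothetical, made up for this sketch; none of this is existing PCS API.

```python
import struct


class PortField:
    """Hypothetical 16-bit field; the accessor names are illustrative only."""

    def __init__(self, value=0):
        self._value = value  # always stored as a plain integer

    def set_int(self, value):
        self._value = int(value) & 0xFFFF

    def set_ascii(self, text):
        self.set_int(int(text))

    def set_network_bytes(self, raw):
        self._value = struct.unpack("!H", raw)[0]

    def get_int(self):
        return self._value

    def get_ascii(self):
        return str(self._value)

    def get_network_bytes(self):
        return struct.pack("!H", self._value)


sport = PortField()
sport.set_ascii("80")
print(sport.get_int(), sport.get_network_bytes())
```

Because the stored value is always a plain integer, comparisons never need htons() or ntohs(); byte order only matters at the moment of encoding or decoding.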

Revisiting the original code, things have improved slightly. No longer do we need to worry about byte-ordering.

# Old
t.sport = htons(80)
t.sport = htons(str('80'))
t.sport = struct.unpack('!L','\x00\x00\x00P')
# New'80')'\x00\x00\x00P')

Alright, now we face the problem of making the encoding 'cleaner'. Encoding things is extremely simple if everything is aligned on an 8-bit byte boundary. One could simply use "t.sport.getNetworkBytes() + t.dport.getNetworkBytes()" ad infinitum. However, that approach is ugly. Override __add__() with some gentle application of isinstance(), and you can get the bytes pretty easily (even without 8-bit byte boundaries).
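
For fields that do not sit on byte boundaries, one simple approach is to shift everything into a single big integer and only convert to bytes at the end. The sketch below is Python 3 and is not how PCS does it; pack_fields is a name I made up, and the example input is just a handful of fields, not a complete TCP header.

```python
def pack_fields(fields):
    """Pack (value, bit_width) pairs, most significant field first."""
    total_bits = sum(width for _, width in fields)
    if total_bits % 8:
        raise ValueError("fields must add up to a whole number of bytes")
    acc = 0
    for value, width in fields:
        if value < 0 or value >> width:
            raise ValueError("value %d does not fit in %d bits" % (value, width))
        # Make room for the next field, then drop it into the low bits.
        acc = (acc << width) | value
    return acc.to_bytes(total_bits // 8, "big")


# Two 16-bit ports followed by a 4-bit offset and 4 reserved bits.
print(pack_fields([(52722, 16), (80, 16), (5, 4), (0, 4)]).hex())  # cdf2005050
```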

Will do some work tonight on this front. Will make for a very worthwhile patch to PCS.



If Red Bull would just...

...sponsor Google Summer of Code, it would make my life much easier (and less expensive).

Works pretty well

After moving everything to Python 2.6 (read: re-installing Pyrex, PyLint, PyLibPcap, and PCS), the packet-queueing process was extremely easy to do. Much easier than I thought it would be. Here's an example of it in action (note that the TCP data that is interpreted by PCS is whatever the equivalent of the ASCII data is).

Note that the output gets a little bit wacky as both processes fight for StdOut.

Python 2.6.2 (r262:71600, May  2 2009, 17:25:25)
[GCC 4.0.1 (Apple Inc. build 5490)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from packetQueue import *
>>> ipl = IpListener("")
>>> ipl.process.start()
>>> from socket import *
>>> host = gethostbyname(gethostname())
>>> port = 0
>>> text = "Hello"
>>> def send(data=text):
...     sender = socket(AF_INET,SOCK_RAW,IPPROTO_IP)
...     sender.sendto(data, (host,port))
>>> send()
>>> send(Hello
'This is a test')
>>> ipl.recvQueue.get()
This is a test
<TCP: reset: 0, reserved: 0, dport: 27756, sequence: 1862270976, ack: 0, checksum: 0, ack_number: 0, syn: 0, urgent: 0, window: 0, offset: 0, push: 0, sport: 18533, fin: 0, urg_pointer: 0>
>>> ipl.recvQueue.get()
<TCP: reset: 29, reserved: 13, dport: 26995, sequence: 543781664, ack: 7, checksum: 0, ack_number: 1629516901, syn: 58, urgent: 3, window: 0, offset: 7, push: 14, sport: 21608, fin: 0, urg_pointer: 0>
>>> ipl.process.terminate()


First Round Done

Not really a project update, but a worthwhile note: the first round of exams is done at MSU, and I have already completed my second-semester project for my CSE course. There's a paper due on the 25th, and the homework for all of next week is already done. I should be able to put in some extra time this weekend and the upcoming week. Right now we're about 3 weeks from the end of the semester; then it's hyperspeed.


Thursday, June 11, 2009


Although there IS the catch that if we do not call recv() before the response TCP packet arrives, it will be lost forever (since we are using raw sockets). The easy solution to this is to have a thread that does all of the buffering, as I imagine happens in a real TCP stack. Might have to look into the 'multiprocessing' module. It seems like passing pcs.packets.tcp.tcp objects via a multiprocessing.Queue would be the way to go about doing things.

Actually, after reading a bit about the multiprocessing module, this may end up being a non-issue. Just have a reader thread that reads all of the raw IP data, checks to see if it is the right stuff (i.e. the IP addresses are what were expected, the TCP ports are the expected ports) and then passes it along to the queue. Behind the scenes, a call to recv() will just fetch the next item off the queue.
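
The filtering half of that idea can be sketched without touching a socket at all. This is Python 3, the function names are mine, and it uses queue.Queue and a plain list of packets so it runs anywhere; in the real thing the packets would come from a raw socket, and a multiprocessing.Queue (which has the same put()/get() calls) would carry them across processes.

```python
import queue
import struct
from socket import inet_aton

IPPROTO_TCP = 6  # IP protocol number for TCP


def matches(raw, src_ip, dst_ip, sport, dport):
    """Return True if a raw IPv4 packet carries the expected TCP endpoints."""
    if len(raw) < 20:
        return False
    header_len = (raw[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    if raw[9] != IPPROTO_TCP or len(raw) < header_len + 4:
        return False
    if raw[12:16] != inet_aton(src_ip) or raw[16:20] != inet_aton(dst_ip):
        return False
    ports = struct.unpack("!HH", raw[header_len:header_len + 4])
    return ports == (sport, dport)


def reader(packets, recv_queue, **expected):
    """Queue every packet from an iterable that matches the expected endpoints."""
    for raw in packets:
        if matches(raw, **expected):
            recv_queue.put(raw)


ip_header = bytes.fromhex("4500004096da40004006eca1c0a80168d04524e6")
wanted = ip_header + struct.pack("!HH", 52722, 80)
other = ip_header + struct.pack("!HH", 52722, 8080)

recv_queue = queue.Queue()
reader([other, wanted], recv_queue,
       src_ip="192.168.1.104", dst_ip="208.69.36.230", sport=52722, dport=80)
print(recv_queue.qsize())  # 1
```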


Working with Raw IP

I figure that messing with IP stuff at any point in time is a Bad Idea (TM), so doing anything at all with the pcs IP or Ethernet layer shouldn't be done. Ideally, I should be able to point an IP packet (filled with TCP data and payload) at some IP address, and everything will get sorted out.

I figure it'll work something like the code below (need root, open two Terminals, call recv() before send()). Not sure that much more robustness is needed...

from socket import *
host = gethostbyname(gethostname())
port = 0
text = "Hello"

def send():
    sender = socket(AF_INET,SOCK_RAW,IPPROTO_IP)
    sender.sendto(text, (host,port))

def recv():
    recver = socket(AF_INET,SOCK_RAW,IPPROTO_IP)
    data = recver.recv(1024)
    # Skip the 20-byte IP header
    print data[20:]


Sunday, June 7, 2009


Is what pylint is.
Integration with PyDev (Eclipse) makes it even better.


Thursday, June 4, 2009

Hi, George


Looks like there's been an issue again...



Delivery to the following recipient has been delayed:

Message will be retried for 2 more day(s)
For help, please quote incident ID 13500304. (state 18).