Packet dissection
How it works
Each dissector decodes its part of the protocol, and then hands off decoding to subsequent dissectors for an encapsulated protocol. So it might all start with a Frame dissector, which dissects the packet details of the capture file itself (e.g. timestamps), passes the data on to an Ethernet frame dissector that decodes the Ethernet header, and then passes the payload to the next dissector (e.g. IP), and so on. At each stage, details of the packet are decoded and displayed.
Dissection can be implemented in two possible ways. One is to have a dissector module compiled into the main program, which means it's always available. The other is to make a plugin (a shared library/DLL) that registers itself to handle dissection.
Adding a basic dissector
Let's step through adding a basic dissector. We'll start with the made-up "foo" protocol. It consists of the following basic items.
- A packet type - 8 bits, possible values: 1 - initialisation, 2 - terminate, 3 - data.
- A set of flags stored in 8 bits: 0x01 - start packet, 0x02 - end packet, 0x04 - priority packet.
- A sequence number - 16 bits.
- An IP address.
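The field layout above can be sketched in plain C, independent of any dissector API. This is only an illustration under stated assumptions: the IP address is taken to be IPv4 (4 bytes), multi-byte fields are taken to be in network byte order, and the names foo_header and foo_parse_header are hypothetical, not part of the main program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field values taken from the protocol description above. */
enum { FOO_TYPE_INIT = 1, FOO_TYPE_TERMINATE = 2, FOO_TYPE_DATA = 3 };
enum { FOO_FLAG_START = 0x01, FOO_FLAG_END = 0x02, FOO_FLAG_PRIORITY = 0x04 };

/* Assumption: IPv4 address, so the header is 1 + 1 + 2 + 4 bytes. */
#define FOO_HEADER_LEN 8

struct foo_header {
    uint8_t  type;      /* packet type, 8 bits */
    uint8_t  flags;     /* flag bits, 8 bits */
    uint16_t sequence;  /* sequence number, 16 bits */
    uint8_t  ip[4];     /* IP address (assumed IPv4) */
};

/* Decode the header from raw bytes.
 * Returns 0 on success, -1 if the buffer is too short. */
int foo_parse_header(const uint8_t *buf, size_t len, struct foo_header *out)
{
    size_t offset = 0;

    assert(buf != NULL && out != NULL);
    if (len < FOO_HEADER_LEN)
        return -1;

    out->type = buf[offset];
    offset += 1;
    out->flags = buf[offset];
    offset += 1;
    /* Assumption: network (big-endian) byte order. */
    out->sequence = (uint16_t)((buf[offset] << 8) | buf[offset + 1]);
    offset += 2;
    for (size_t i = 0; i < 4; i++)
        out->ip[i] = buf[offset + i];
    return 0;
}
```

A caller would pass the raw payload bytes and its length, then test individual flags with a mask, for example `h.flags & FOO_FLAG_PRIORITY`.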
Setting up the dissector
The first decision you need to make is whether this dissector will be a built-in dissector, included in the main program, or a plugin. Plugins are the easiest to write initially, as they don't need write permission on the main code base, so let's start with that. With a little care, the plugin can easily be made to run as a built-in dissector too, so we haven't lost anything.
Basic Plugin setup.
    #include <gmodule.h>
    #include <epan/packet.h>
    /* forward reference */
    void proto_register_foo(void);
    void proto_reg_handoff_foo(void);
    void dissect_foo(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree);
    /* Define version if we are not building ethereal statically */
    #ifndef ENABLE_STATIC
    G_MODULE_EXPORT const gchar version[] = "0.0";
    #endif
    static int proto_foo = -1;
    static int global_foo_port = 1234;
    static dissector_handle_t foo_handle;
    #ifndef ENABLE_STATIC
    G_MODULE_EXPORT void plugin_register(void)
    {
        /* register the new protocol, protocol fields, and subtrees */
        if (proto_foo == -1) { /* execute protocol initialization only once */
            proto_register_foo();
        }
    }
    G_MODULE_EXPORT void plugin_reg_handoff(void)
    {
        proto_reg_handoff_foo();
    }
    #endif
Let's go through this a bit at a time. First we have some boilerplate include files. These will be pretty constant to start with. Here we also pre-declare some functions that we'll be writing shortly.
Next we have a section surrounded by #ifndef ENABLE_STATIC. This is what makes this a plugin rather than a built-in dissector. The version is a simple string that is used to report the version of this dissector. You should increase this number each time you make changes that you need to keep track of.
Next we have an int, initialised to -1, that records our protocol. This will get updated when we register this plugin with the main program. We can use this as a handy way to detect whether we've been initialised yet. It's good practice to make all variables and functions that aren't exported static, to keep namespace pollution down.
Normally this isn't a problem unless your dissector gets so big it has to span multiple files.
Then comes a global variable which contains the UDP port that we'll assume we are dissecting traffic for. Next is a dissector handle that we'll initialise later.
Next, the first plugin entry point. The function plugin_register() is called when the plugin is loaded and allows you to do some initialisation, which includes telling the main program what your plugin's capabilities are.
The plugin_reg_handoff routine is used when dissecting sub-protocols. As our hypothetical protocol will be carried over UDP, we will need to do this.
Now that we have the basics in place to interact with the main program, we had better fill in those missing functions. Let's start with the register function.
Plugin Initialisation.
First comes a call to proto_register_protocol, which registers the protocol. We can give it three names that will be used in various places to display it.
Then we call the preference register function. At the moment we have no specific protocol preferences, so this will be all that we need. This takes a function parameter, which is our handoff function. We had better write that next.
Plugin Handoff.
What's happening here? We are initialising the dissector if it hasn't been initialised yet. First we create the dissector. This registers a routine to be called to do the actual dissecting. Then we associate it with a UDP port number so that the main program will know to call us when it gets UDP traffic on that port.
Now at last we get to write some dissecting code. For the moment we'll leave it as a basic placeholder.
Plugin Dissection.
    void dissect_foo(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree)
    {
        if (check_col(pinfo->cinfo, COL_PROTOCOL))
            col_set_str(pinfo->cinfo, COL_PROTOCOL, "FOO");
        /* Clear out stuff in the info column */
        if (check_col(pinfo->cinfo, COL_INFO)) {
            col_clear(pinfo->cinfo, COL_INFO);
        }
    }
This function is called to dissect the packets presented to it.
The packet data is held in a special buffer referenced here as tvb. We shall become fairly familiar with this as we get deeper into the details of the protocol. The packet info structure contains general data about the protocol, and we can update information here. The tree parameter is where the detailed dissection takes place.
For now we'll do the minimum we can get away with. The first two lines check whether the Protocol column is being displayed in the UI. If it is, we set its text to our protocol name, so everyone can see it's been recognised. The only other thing we do is clear out any data in the INFO column, if it's being displayed.
At this point we should have a basic dissector ready to compile and install. It doesn't do much at present other than identify the protocol and label it. Compile the dissector to a DLL or shared library, and copy it into the plugin directory of the installation.
To finish this off, a Makefile of some sort will be required: a Makefile.nmake for Windows platforms and a Makefile.am for Unix/Linux types.
Makefile.nmake for Windows.
Makefile.am for unix/linux.
Dissecting the details of the protocol
Now that we have our basic dissector up and running, let's do something with it. The simplest thing to start with is to just label the payload. This will allow us to set up some of the parts we will need.
The first thing we will do is build a subtree to decode our results into. This helps to keep things looking nice in the detailed display. Now, the dissector is called in two different cases. In one case it is called to get a summary of the packet; in the other it is called to look into the details of the packet. These two cases can be distinguished by the tree pointer. If the tree pointer is NULL, we are being asked for a summary. If it is non-NULL, we can pick apart the protocol for display. So with that in mind, let's enhance our dissector.
Plugin Packet Dissection.
    void dissect_foo(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree)
    {
        if (check_col(pinfo->cinfo, COL_PROTOCOL))
            col_set_str(pinfo->cinfo, COL_PROTOCOL, "FOO");
        /* Clear out stuff in the info column */
        if (check_col(pinfo->cinfo, COL_INFO)) {
            col_clear(pinfo->cinfo, COL_INFO);
        }
        if (tree) { /* we are being asked for details */
            proto_item *ti = NULL;
            ti = proto_tree_add_item(tree, proto_foo, tvb, 0, -1, FALSE);
        }
    }
What we're doing here is adding a subtree to the dissection. This subtree will hold all the details of this protocol, and so will not clutter up the display when not required.
We are also marking the area of data that is being consumed by this protocol. In our case it's everything that has been passed to us, as we're assuming this protocol does not encapsulate another. Therefore, we add the new tree node with proto_tree_add_item, adding it to the passed-in tree, labelling it with the protocol, using the passed-in tvb buffer as the data, and consuming from 0 to the end (-1) of this data. The FALSE we'll ignore for now.
After this change, there should be a label in the detailed display for the protocol, and selecting it will highlight the remaining contents of the packet.
Now let's go to the next step and add some protocol dissection.
For this step we'll need to construct a couple of tables that help with dissection. This needs some changes to proto_register_foo. First, a couple of statically declared arrays.
Plugin Registering data structures.
Then, after the registration code, we register these arrays.
Plugin Registering data structures.
The variables hf_foo_pdu_type and ett_foo also need to be declared somewhere near the top of the file.
Plugin data structure globals.
Now we can enhance the protocol display with some detail.
Plugin starting to dissect the packets.
Now the dissection is starting to look more interesting. We have picked apart our first bit of the protocol: one byte of data at the start of the packet that defines the packet type for the foo protocol.
The proto_item_add_subtree call has added a child node to the protocol tree, which is where we will do our detailed dissection. The expansion of this node is controlled by the ett_foo variable. This remembers whether the node should be expanded or not as you move between packets. All subsequent dissection will be added to this tree, as you can see from the next call: a call to proto_tree_add_item on the foo_tree, this time using hf_foo_pdu_type to control the formatting of the item. The PDU type is one byte of data, starting at offset 0. We assume it is in network order, which is why we use FALSE. Although for a single byte there are no byte-order issues, it's best to be consistent.
If we look in detail at the hf_foo_pdu_type declaration in the static array, we can see the details of the definition.
    hf_foo_pdu_type - the index for this node.
    FOO PDU Type - the label for this item.
    foo.type - this is the filter string. It enables us to type constructs such as foo.type == 1 into the filter box.
    FT_UINT8 - this specifies that this item is an 8-bit unsigned integer. This tallies with our call above, where we tell it to look at only one byte.
    BASE_DEC - for an integer type, this tells it to be printed as a decimal number. It could be BASE_HEX or BASE_OCT if that made more sense.
We'll ignore the rest of the structure for now. If you install this plugin and try it out, you'll see something that begins to look useful.
Now let's finish off dissecting the simple protocol. We need to add a few more variables to the hf array, and a couple more procedure calls.
Plugin wrapping up the packet dissection.
This dissects all the bits of this simple hypothetical protocol. We've introduced a new variable, offset, to help keep track of where we are in the packet dissection. With these extra bits in place, the whole protocol is now dissected.
Improving the dissection information
We can certainly improve the display of the protocol with a bit of extra data. The first step is to add some text labels. Let's start by labelling the packet types. There is some useful support for this sort of thing, which we can use by adding a couple of extra things. First we add a simple table mapping type values to names.
Naming the packet types.
This is a handy data structure that can be used to look up names from values. There are routines to access this lookup table directly, but we don't need to do that, as the support code already has that built in. We just have to give these details to the appropriate part of the data, using the VALS macro.
Adding Names to the protocol.
This helps in deciphering the packets, and we can do a similar thing for the flags structure. For this we need to add some more data to the table, though.
Adding Flags to the protocol.
Some things to note here. First, as each bit is a different flag, we use the type FT_BOOLEAN, since a flag is either on or off. Second, we include the flag mask in the 7th field of the data, which allows the system to mask the relevant bit. We've also changed the 5th field to 8, to indicate that we are looking at an 8-bit quantity when the flags are extracted. Finally, we add the extra constructs to the dissection routine. Note that we keep the same offset for each of the flags.
This is starting to look fairly full-featured now, but there are a couple of other things we can do to make things look even prettier. At the moment our dissection shows the packets as "Foo Protocol", which whilst correct is a little uninformative. We can enhance this by adding a little more detail. First, let's get hold of the actual value of the protocol type. We can use the handy function tvb_get_guint8 to do this. With this value in hand, there are a couple of things we can do. First, we can set the INFO column of the non-detailed view to show what sort of PDU it is, which is extremely helpful when looking at protocol traces.
Second, we can also display this information in the dissection window.
Enhancing the display.
    void dissect_foo(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree)
    {
        guint8 packet_type = tvb_get_guint8(tvb, 0);
        if (check_col(pinfo->cinfo, COL_PROTOCOL))
            col_set_str(pinfo->cinfo, COL_PROTOCOL, "FOO");
        /* Clear out stuff in the info column */
        if (check_col(pinfo->cinfo, COL_INFO)) {
            col_clear(pinfo->cinfo, COL_INFO);
        }
        if (check_col(pinfo->cinfo, COL_INFO)) {
            col_add_fstr(pinfo->cinfo, COL_INFO, "Type %s",
                val_to_str(packet_type, packettypenames, "Unknown (0x%02x)"));
        }
        if (tree) { /* we are being asked for details */
            proto_item *ti = NULL;
            proto_tree *foo_tree = NULL;
            gint offset = 0;
            ti = proto_tree_add_item(tree, proto_foo, tvb, 0, -1, FALSE);
            proto_item_append_text(ti, ", Type %s",
                val_to_str(packet_type, packettypenames, "Unknown (0x%02x)"));
            foo_tree = proto_item_add_subtree(ti, ett_foo);
            proto_tree_add_item(foo_tree, hf_foo_pdu_type, tvb, offset, 1, FALSE);
            offset += 1;
            ...
So here, after grabbing the value of the first 8 bits, we use it with one of the built-in utility routines, val_to_str, to look up the value. If the value isn't found, we provide a fallback which just prints the value in hex. We use this twice: once in the INFO field of the columns, if it's displayed, and again when we append this data to the base of our dissecting tree.
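The lookup-with-fallback idea behind val_to_str can be illustrated with a small standalone sketch. This is not the main program's implementation; the names value_name, lookup_name and format_info are hypothetical, and the table contents are assumed from the packet types listed for the foo protocol.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* A value-to-name table entry; the table ends with a NULL name. */
struct value_name {
    uint32_t    value;
    const char *name;
};

/* Assumed contents, based on the foo packet types described earlier. */
static const struct value_name packet_type_names[] = {
    { 1, "Initialisation" },
    { 2, "Terminate" },
    { 3, "Data" },
    { 0, NULL }
};

/* Return the name for value, or NULL if the table has no entry for it. */
const char *lookup_name(uint32_t value, const struct value_name *table)
{
    assert(table != NULL);
    for (; table->name != NULL; table++) {
        if (table->value == value)
            return table->name;
    }
    return NULL;
}

/* Build the summary text, falling back to the raw value in hex
 * when the type is not in the table. */
int format_info(uint8_t packet_type, char *out, size_t out_len)
{
    const char *name = lookup_name(packet_type, packet_type_names);

    assert(out != NULL && out_len > 0);
    if (name != NULL)
        return snprintf(out, out_len, "Type %s", name);
    return snprintf(out, out_len, "Type Unknown (0x%02x)",
                    (unsigned)packet_type);
}
```

Keeping the names in one table means the summary column and the tree label cannot drift apart, which is the same reason the tutorial passes one table to both calls.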
How to handle transformed data
Some protocols do clever things with data. They might encrypt the data, or compress all or part of it. If you know how these steps are taken, it is possible to reverse them within the dissector.
As encryption can be tricky, let's consider the case of compression. These techniques can also work for other transformations of data, where some step is required before the data can be examined.
What basically needs to happen here is to identify the data that needs conversion, take that data and transform it into a new stream, and then call a dissector on it. Often this needs to be done "on-the-fly", based on clues in the packet. Sometimes this needs to be used in conjunction with other techniques, such as packet reassembly. The following shows a technique to achieve this effect.
Decompressing data packets for dissection.
The first step here is to recognise the compression. In this case a flag byte alerts us to the fact that the remainder of the packet is compressed. Next we retrieve the original size of the packet, which in this case is conveniently carried within the protocol. If it's not, the decompression routine may work it out for you, in which case the logic would be different.
So, armed with the size, a buffer is allocated to receive the uncompressed data using g_malloc, and the packet is decompressed into it. The tvb_get_ptr function is useful for getting a pointer to the raw data of the packet from the offset onwards. In this case the decompression routine also needs to know the length, which is given by the tvb_length_remaining function.
Next we build a new tvb buffer from this data, using the tvb_new_real_data call. This data is a child of our original data, so we acknowledge that in the next call, to tvb_set_child_real_data_tvbuff. Finally we add this data as a new data source, so that the detailed display can show the decompressed bytes as well as the original.
One procedural step is to add a handler to free the data when it's no longer needed. In this case, as g_malloc was used to allocate the memory, g_free is the appropriate function.
After this has been set up, the remainder of the dissector can dissect the buffer next_tvb. As it's a new buffer, the offset needs to be 0, since we start again from the beginning of this buffer. To make the rest of the dissector work regardless of whether compression was involved, in the case that compression was not signalled we use tvb_new_subset to deliver a new buffer based on the old one, but starting at the current offset and extending to the end. This makes dissecting the packet from this point on exactly the same regardless of compression.
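The overall shape of this technique (check a flag, either decode into a freshly allocated buffer or take a subset of the original, then continue from offset 0) can be sketched without the tvb API. Everything here is a stand-in under stated assumptions: struct view plays the role of next_tvb, a toy run-length decoder stands in for the real decompression routine, and the wire layout (one flag byte, then a 16-bit original size in network order) is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define FLAG_COMPRESSED 0x01  /* hypothetical flag bit */

/* Stand-in for the "next buffer" handed to the rest of the dissector. */
struct view {
    const uint8_t *data;
    size_t         len;
    uint8_t       *owned;  /* non-NULL if allocated here and must be freed */
};

/* Toy decoder: input is (count, byte) pairs. Returns 0 only if the
 * output is filled exactly. */
static int rle_decode(const uint8_t *in, size_t in_len,
                      uint8_t *out, size_t out_len)
{
    size_t used = 0;

    if (in_len % 2 != 0)
        return -1;
    for (size_t i = 0; i < in_len; i += 2) {
        uint8_t count = in[i];
        if (count > out_len - used)
            return -1;
        memset(out + used, in[i + 1], count);
        used += count;
    }
    return used == out_len ? 0 : -1;
}

/* Produce the view that later parsing should read from offset 0. */
int next_view(const uint8_t *pkt, size_t len, struct view *next)
{
    size_t offset = 0;

    assert(pkt != NULL && next != NULL);
    if (len < 1)
        return -1;
    uint8_t flags = pkt[offset++];
    next->owned = NULL;

    if (flags & FLAG_COMPRESSED) {
        if (len - offset < 2)
            return -1;
        size_t orig_size = ((size_t)pkt[offset] << 8) | pkt[offset + 1];
        offset += 2;
        uint8_t *buf = malloc(orig_size ? orig_size : 1);
        if (buf == NULL)
            return -1;
        if (rle_decode(pkt + offset, len - offset, buf, orig_size) != 0) {
            free(buf);
            return -1;
        }
        next->data = buf;
        next->len = orig_size;
        next->owned = buf;
    } else {
        /* Not compressed: a subset of the original, from offset to end. */
        next->data = pkt + offset;
        next->len = len - offset;
    }
    return 0;
}

/* Counterpart of the free handler: release the buffer if we own it. */
void view_release(struct view *v)
{
    assert(v != NULL);
    free(v->owned);
    v->owned = NULL;
}
```

Either branch leaves the caller with the same kind of object, so the code that follows does not need to know whether decompression took place.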
How to reassemble split packets
Some protocols sometimes have to split a large packet across multiple other packets. In this case the dissection can't be carried out correctly until you have all the data. The first packet doesn't have enough data, and the subsequent packets don't have the expected format. To dissect these packets you need to wait until all the parts have arrived and then start the dissection.
How to reassemble split UDP packets
As an example, let's examine a protocol that is layered on top of UDP and splits up its own data stream. If a packet is bigger than some given size, it will be split into chunks, and this is somehow signalled within the protocol.
To deal with such streams, we need several things to trigger from. We need to know that this packet is part of a multi-packet sequence. We need to know how many packets are in the sequence. We also need to know when we have all the packets.
For this example we'll assume there is a simple in-protocol signalling mechanism to give details: a flag byte that signals the presence of a multi-packet sequence and also the last packet, followed by an ID of the sequence and a packet sequence number.
Reassembling fragments - Part 1
    ...
    save_fragmented = pinfo->fragmented;
    flags = tvb_get_guint8(tvb, offset); offset++;
    if (flags & FL_FRAGMENT) { /* fragmented */
        tvbuff_t* new_tvb = NULL;
        fragment_data *frag_msg = NULL;
        guint16 msg_seqid = tvb_get_ntohs(tvb, offset); offset += 2;
        guint16 msg_num = tvb_get_ntohs(tvb, offset); offset += 2;
        pinfo->fragmented = TRUE;
        frag_msg = fragment_add_seq_check (tvb, offset, pinfo,
            msg_seqid,              /* guint32 ID for fragments belonging together */
            msg_fragment_table,     /* list of message fragments */
            msg_reassembled_table,  /* list of reassembled messages */
            msg_num,                /* guint32 fragment sequence number */
            -1,                     /* guint32 fragment length - to the end */
            flags & FL_FRAG_LAST);  /* More fragments? */
We start by saving the fragmented state of this packet, so we can restore it later. Next comes some protocol-specific code to dig the fragment data out of the stream, if it's present. Having decided it is present, we let the function fragment_add_seq_check do its work. We need to provide this with a certain amount of data: the tvb buffer we are dissecting, the offset where the partial packet starts, the provided packet info, and the sequence number of the fragment stream.
There may be several streams of fragments in flight, and the sequence ID is used to select the relevant one for reassembly. The msg_fragment_table and the msg_reassembled_table are variables we need to declare. We'll consider these in detail later. msg_num is the packet number within the sequence. The length here is specified as -1, as we want the rest of the packet data. Finally, there is a parameter that signals whether this is the last fragment or not. This might be a flag, as in this case, or there may be a counter in the protocol.
Reassembling fragments part 2
    ...
            if (check_col (pinfo->cinfo, COL_INFO))
                col_append_str (pinfo->cinfo, COL_INFO, " (Message Reassembled)");
        } else { /* Not last packet of reassembled Short Message */
            if (check_col (pinfo->cinfo, COL_INFO))
                col_append_fstr (pinfo->cinfo, COL_INFO, " (Message fragment %u)", msg_num);
        }
        if (new_tvb) { /* take it all */
            next_tvb = new_tvb;
        } else /* make a new subset */
            next_tvb = tvb_new_subset(tvb, offset, -1, -1);
    } else {
        next_tvb = tvb_new_subset(tvb, offset, -1, -1);
    }
    offset = 0;
    pinfo->fragmented = save_fragmented;
Having passed the fragment data to the reassembly handler, we can now check whether we have the whole message. We can only do this if we're in display mode, as we need to pass the display tree parameter into this routine. If there is enough information, this routine will return the newly reassembled data buffer.
After that, we add a couple of informative messages to the display to show that this is part of a sequence. Then comes a bit of manipulation of the buffers, and the dissection can proceed. Normally you will probably not bother dissecting further unless the fragments have been reassembled, as there won't be much to find. Sometimes the first packet in the sequence can be partially decoded, though, if you wish.
Now for the mysterious data we passed into fragment_add_seq_check.
Reassembling fragments - Initialisation
First a couple of hash tables are declared, and these are initialised in the protocol initialisation routine.
Following that, a fragment_items structure is allocated and filled in with a series of ett items, hf data items, and a string tag. The ett and hf values should be included in the relevant tables, like all the other variables your protocol may use. The hf variables need to be placed in the structure something like the following. Of course the names may need to be adjusted.
Reassembling fragments - Data
These hf variables are used internally within the reassembly routines to make useful links, and to add data to the dissection. They produce links from one packet to another, such as a partial packet having a link to the fully reassembled packet. Likewise there are back pointers to the individual packets from the reassembled one. The other variables are used for flagging up errors.
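To make the bookkeeping behind reassembly more concrete, here is a standalone sketch that collects fragments keyed by a sequence ID and returns the joined message once every fragment up to the last one has arrived. It is an illustration only, not how the reassembly routines in the main program are implemented; frag_table, frag_add and the fixed size limits are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_STREAMS  4   /* hypothetical limits for the sketch */
#define MAX_FRAGS    8
#define MAX_FRAG_LEN 64

struct reassembly {
    int      in_use;
    uint16_t id;                        /* ID shared by all fragments */
    int      last_index;                /* index of final fragment, -1 if unseen */
    uint8_t  have[MAX_FRAGS];
    size_t   len[MAX_FRAGS];
    uint8_t  data[MAX_FRAGS][MAX_FRAG_LEN];
};

struct frag_table {
    struct reassembly streams[MAX_STREAMS];
};

/* Find the stream for id, or claim a free slot for it. */
static struct reassembly *find_stream(struct frag_table *t, uint16_t id)
{
    struct reassembly *free_slot = NULL;

    for (int i = 0; i < MAX_STREAMS; i++) {
        if (t->streams[i].in_use && t->streams[i].id == id)
            return &t->streams[i];
        if (!t->streams[i].in_use && free_slot == NULL)
            free_slot = &t->streams[i];
    }
    if (free_slot != NULL) {
        memset(free_slot, 0, sizeof *free_slot);
        free_slot->in_use = 1;
        free_slot->id = id;
        free_slot->last_index = -1;
    }
    return free_slot;
}

/* Add one fragment. Returns the total length once the message is
 * complete (bytes copied to out), 0 if more fragments are needed,
 * and -1 on error. */
long frag_add(struct frag_table *t, uint16_t id, unsigned seq, int is_last,
              const uint8_t *payload, size_t len,
              uint8_t *out, size_t out_cap)
{
    assert(t != NULL && payload != NULL && out != NULL);
    if (seq >= MAX_FRAGS || len > MAX_FRAG_LEN)
        return -1;
    struct reassembly *r = find_stream(t, id);
    if (r == NULL)
        return -1;

    memcpy(r->data[seq], payload, len);
    r->len[seq] = len;
    r->have[seq] = 1;
    if (is_last)
        r->last_index = (int)seq;
    if (r->last_index < 0)
        return 0;               /* final fragment not seen yet */

    size_t total = 0;
    for (int i = 0; i <= r->last_index; i++) {
        if (!r->have[i])
            return 0;           /* still a gap in the sequence */
        total += r->len[i];
    }
    if (total > out_cap)
        return -1;

    size_t pos = 0;
    for (int i = 0; i <= r->last_index; i++) {
        memcpy(out + pos, r->data[i], r->len[i]);
        pos += r->len[i];
    }
    r->in_use = 0;              /* release the slot for reuse */
    return (long)total;
}
```

Fragments may arrive out of order and interleaved with other streams; the ID selects the stream and the sequence number selects the slot within it.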
How to tap protocols
Adding a tap interface to a protocol allows it to do some useful things. In particular, you can produce protocol statistics from the tap interface.
A tap is basically a way of allowing other items to see what's happening as a protocol is dissected. A tap is registered with the main program, and then called on each dissection. Some arbitrary protocol-specific data is provided with the routine, which listeners can use.
To create a tap, you first need to register it. A tap is identified by an integer handle and registered with the routine register_tap. This takes a string name with which to find it again.
Initialising a tap
    static int foo_tap = -1;
    struct FooTap {
        gint packet_type;
        gint priority;
        ...
    };
    ...
    foo_tap = register_tap("foo");
Whilst you can program a tap without protocol-specific data, it is generally not very useful. Therefore it's a good idea to declare a structure that can be passed through the tap. This needs to be a static structure, as it will be used after the dissection routine has returned. It's generally best to pick out some generic parts of the protocol you are dissecting for the tap data: a packet type, a priority, a status code maybe. The structure really needs to be declared in a header file, so that it can be included by other components that want to listen in to the tap.
Once you have these defined, it's simply a case of populating the protocol-specific structure and then calling tap_queue_packet, probably as the last part of the dissector.
Calling a protocol tap
This now enables interested parties to listen in on the details of this protocol conversation.
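The relationship between a tap and its listeners can be pictured as a small publish/subscribe arrangement. The sketch below is standalone and simplified: it calls listeners immediately rather than queueing data for later delivery, and the names foo_tap, foo_tap_listen and foo_tap_publish are hypothetical stand-ins, not the main program's tap API.

```c
#include <assert.h>
#include <stddef.h>

/* Protocol-specific data handed to listeners (fields as in the text). */
struct foo_tap_info {
    int packet_type;
    int priority;
};

typedef void (*foo_listener_fn)(void *state, const struct foo_tap_info *info);

#define FOO_MAX_LISTENERS 4  /* hypothetical limit for the sketch */

struct foo_tap {
    foo_listener_fn fn[FOO_MAX_LISTENERS];
    void           *state[FOO_MAX_LISTENERS];
    size_t          count;
};

/* Attach a listener and its private state. Returns 0, or -1 if full. */
int foo_tap_listen(struct foo_tap *tap, foo_listener_fn fn, void *state)
{
    assert(tap != NULL && fn != NULL);
    if (tap->count == FOO_MAX_LISTENERS)
        return -1;
    tap->fn[tap->count] = fn;
    tap->state[tap->count] = state;
    tap->count++;
    return 0;
}

/* Called by the dissector once the tap data has been filled in. */
void foo_tap_publish(const struct foo_tap *tap, const struct foo_tap_info *info)
{
    assert(tap != NULL && info != NULL);
    for (size_t i = 0; i < tap->count; i++)
        tap->fn[i](tap->state[i], info);
}

/* Example listener: counts packets that have the priority field set. */
void foo_count_priority(void *state, const struct foo_tap_info *info)
{
    int *count = state;

    if (info->priority)
        (*count)++;
}
```

The dissector only knows about the shared structure; what each listener does with it is entirely the listener's business, which is why the structure belongs in a header both sides can include.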
How to produce protocol stats
Given that you have a tap interface for the protocol, you can use it to produce some interesting statistics (well, presumably interesting!) from protocol traces.
This can be done in a separate plugin, or in the same plugin that is doing the dissection. The latter scheme is better, as the tap and stats module typically rely on sharing protocol-specific data, which might get out of step between two different plugins.
Here is a mechanism to produce statistics from the above tap interface.
Initialising a stats interface
Working from the bottom up, first the plugin interface entry point, plugin_register_tap_listener, is defined. This simply calls the initialisation function register_foo_stat_trees. This in turn calls the stats_tree_register function, which takes three strings and three functions. The strings are the name of the tap that is registered, an abbreviation of the stats name, and the name of the stats module (a '/' character can be used to make sub menus). The functions are the function that will be called to generate the stats, a function that can be called to initialise the stats data, and a function that will be called to clean up the stats data.
In this case we only need the first two functions, as there is nothing specific to clean up.
Initialising a stats session
In this case we create a new tree node to handle the total packets, and as a child of that we create a pivot table to handle the stats about different packet types.
Generating the stats
    ...packet_type, msgtypevalues, "Unknown packet type (%d)"));
        return 1;
    }
In this case the processing of the stats is quite simple. First we call tick_stat_node for the st_str_packets packet node, to count packets. Then a call to stats_tree_tick_pivot on the st_node_packet_types subtree allows us to record statistics by packet type.
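The counting described here (one total, plus a breakdown by packet type) can be sketched without the stats tree API. This is a simplified stand-in: foo_stats, foo_stats_init and foo_stats_packet are hypothetical names, and a fixed array replaces the pivot node.

```c
#include <assert.h>
#include <string.h>

#define FOO_TYPE_MAX 3  /* highest packet type defined for foo */

struct foo_stats {
    unsigned total;                      /* all packets seen */
    unsigned by_type[FOO_TYPE_MAX + 1];  /* index 0 counts unknown types */
};

/* Reset all counters; plays the role of the stats init function. */
void foo_stats_init(struct foo_stats *s)
{
    assert(s != NULL);
    memset(s, 0, sizeof *s);
}

/* Record one packet; plays the role of the per-packet stats function.
 * Returns 1, mirroring the return value in the fragment above. */
int foo_stats_packet(struct foo_stats *s, int packet_type)
{
    assert(s != NULL);
    s->total++;
    if (packet_type >= 1 && packet_type <= FOO_TYPE_MAX)
        s->by_type[packet_type]++;
    else
        s->by_type[0]++;
    return 1;
}
```

In a tap-based design, the per-packet function would be called with the tap structure filled in by the dissector, so the packet type arrives already decoded.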
How to use conversations
Some information about how to use conversations in a dissector can be found in the file doc/README.developer.