$Id$ TVBUFFs and Exceptions NOTE: This document describes the ideas and some of the tvbuff functions. If it describes a function, the description may not be complete. For a description of the functions, see epan/tvbuff.h This document describes the changes made to the Ethereal dissector routines in Ethereal 0.8.9. All protocol dissectors need to be modified, but can be modified one at a time. During this transition time, this document will stand apart from 'README.developer'. Once all the protocol dissectors are converted to use the new tvbuff routines, the information in this document will be merged into 'README.developer'. While Ethereal does a grand job of dissecting frames that are complete, it has done only a mediocre job of dissecting partial frames. Frames can be incomplete for two reasons: the user used a capture length which is smaller than the MTU of the interface (which is the default behavior of tcpdump, BTW), or the frame on the wire is corrupted. In either case, Ethereal should gracefully handle the incomplete frame. With the aid of two C preprocessor macros, BYTES_ARE_IN_FRAME() and IS_DATA_IN_FRAME(), dissector authors are supposed to check that the data they are trying to read from the frame actually exists. Some dissectors used these macros diligently, and others not. In the end we realized that depending on human diligence would get us nowhere and that a programmed solution would be necessary. The idea was to encapsulate the byte array which represented the data in the frame with a "class" that would check the enforce limits regarding the boundaries of the byte array. In the event of an improper data access, it is not enough to return an error condition since we knew that it would be impractical to check this error flags after every data access. Instead, we needed to implement exceptions in Ethereal. Other languages (Java, C++, Python) have exceptions, but we had to introduce an exception library and some magic C preprocess macros to implement them in C. The encapsulating class is called a "tvbuff", or "testy, virtual(-izable) buffer". They are testy in that they get mad when an attempt is made to access data beyond the bounds of their array. In that case, they throw an exception. They are virtualizable in that new tvbuff's can be made from other tvbuffs, while only the original tvbuff may have data. That is, the new tvbuff has virtual data. There are three types of tvbuffs, defined by an enum in tvbuff.h. A TVBUFF_REAL_DATA contains a guint8* that points to real data. The data is allocated by the user and is contiguous, since is an array of guint8's. A TVBUFF_SUBSET has a backing tvbuff. The TVBUFF_SUBSET is a "window" through which the program sees only a portion of the backing tvbuff. A TVBUFF_COMPOSITE combines multiple tvbuffs sequentually to produce a larger byte array. The top-most dissector, dissect_packet(), creates a TVBUFF_REAL_DATA that points the frame's data. As each dissector completes its portion of the protocl analysis, it is expected to create a new tvbuff of type TVBUFF_SUBSET which contains the payload portion of the protocol (that is, the bytes that are relevant to the next dissector). The syntax for creating a new TVBUFF_SUBSET is: next_tvb = tvb_new_subset(tvb, offset, length, reported_length) Where: tvb is the tvbuff that the dissector has been working on. It can be a tvbuff of any type. next_tvb is the new TVBUFF_SUBSET. offset is the byte offset of 'tvb' at which the new tvbuff should start. The first byte is the 0th byte. length is the number of bytes in the new TVBUFF_SUBSET. A length argument of -1 says to use as many bytes as are available in 'tvb'. reported_length is the number of bytes that the current protocol says should be in the payload. A reported_length of -1 says that the protocol doesn't say anything about the size of its payload. The tvb_new_subset() function will throw an exception if the offset/length pair go beyond the boundaries of 'tvb'. You do not need to free your new tvbuff. When the root tvbuff is freed in epan_dissect_free(), all the tvbuff's that were created from it are also freed. If you really wanted to keep one of the tvbuffs around after the root tvbuff is freed, you can do it by incrementing the usage count on the tvbuff (tvb_increment_usage_count()). But in all normal cases you really want to take advantage of the default behavior. Even though the definition of tvbuff is in tvbuff.h, you should treat a tvbuff as an opaque structure. You should not access its members directly. You must use one of the provided accessor methods to retrieve data from the tvbuff. All accessors will throw an exception if an attempt is made to read beyond the boundaries of the data in the tvbuff. If reported_length is set, then if the attempt to access data goes beyond reported_length, a ReportedBoundsError exception is thrown. Otherwise, if an attempt to access data beyond the bounds of the tvbuff is made, a BoundsError exception is thrown. The accessors are: Single-byte accessor: guint8 tvb_get_guint8(tvbuff_t*, gint offset); Network-to-host-order access for 16-bit integers (guint16), 32-bit integers (guint32), 24-bit integers, and 64-bit integers (guint64): guint16 tvb_get_ntohs(tvbuff_t*, gint offset); guint32 tvb_get_ntohl(tvbuff_t*, gint offset); guint32 tvb_get_ntoh24(tvbuff_t*, gint offset); guint64 tvb_get_ntoh64(tvbuff_t*, gint offset); Little-Endian-to-host-order access for 16-bit integers (guint16), 32-bit integers (guint32), 24-bit integers, and 64-bit integers (guint64): guint16 tvb_get_letohs(tvbuff_t*, gint offset); guint32 tvb_get_letohl(tvbuff_t*, gint offset); guint32 tvb_get_letoh24(tvbuff_t*, gint offset); guint64 tvb_get_letoh64(tvbuff_t*, gint offset); Copying memory: guint8* tvb_memcpy(tvbuff_t*, guint8* target, gint offset, gint length); guint8* tvb_memdup(tvbuff_t*, gint offset, gint length); Pointer-retrieval: /* WARNING! This function is possibly expensive, temporarily allocating * another copy of the packet data. Furthermore, it's dangerous because once * this pointer is given to the user, there's no guarantee that the user will * honor the 'length' and not overstep the boundaries of the buffer. */ guint8* tvb_get_ptr(tvbuff_t*, gint offset, gint length); The reason that tvb_get_ptr() have to allocate a copy of its data only occurs with TVBUFF_COMPOSITES. If the user request a pointer to a range of bytes that spans the member tvbuffs that make up the TVBUFF_COMPOSITE, the data will have to be copied to another memory region to assure that all the bytes are contiguous. Modifications to the Dissectors The dissector prototype will now be: void/gboolean dissector(tvbuff_t*, packet_info*, proto_tree*) The packet_info struct now has the frame_data struct that used to be passed to each dissector. Additionally, packet_info has a char* called 'current_proto'. The first thing a dissector should do is set pinfo->current_proto to point to a string referring to the name of the protocol (use the same name that appears in the COL_PROTO column, if possible). If an exception jumps us out of a dissector, dissect_packet() will use pinfo->current_proto to report which dissector encountered an exception. The packet_info struct also has a tvbuff_t* called 'compat_top_tvb'. This points to the same tvbuff_t that dissect_packet() creates. This is useful for creating a tvbuff (TVBUFF_SUBSET) inside a dissector which itself does not use tvbuffs. Once all the dissectors are converted to use tvbuffs, 'compat_top_tvb' will be removed. A dissector that uses tvbuffs can call another dissector that does not. This code snippet shows how: tvbuff_t *next_tvb; const guint8 *next_pd; int next_offset; .... next_tvb = tvb_new_subset(tvb, offset_of_next_protocol, -1, -1); tvb_compat(next_tvb, &next_pd, &next_offset); That is, next_pd and next_offset will get assigned values relative to the start of the byte array, not relative to the tvbuff. This function, tvb_compat(), is only useful while the dissectors are in transition; once all dissectors are converted, this function can be removed. A dissector that is called via the dissector tables needs to preserve its old-style argument list until all such dissectors are converted to use tvbuffs. The dissector can create its own tvbuff by using pi.compat_top_tvb, which is the top-level tvbuff created in dissect_packet(). "compat_top_tvb" will only be available during the conversion process; once all dissector have been converted to use tvbuff's, that variable will disappear. A macro called tvb_create_from_top() has been provided to ease your work. It takes one argument --- the name of the offset variable. Here is an example, from packet-cops.c, of how to create your own tvbuff. The use of the #if/#endif block is optional. /* Code to actually dissect the packets */ #if 0 static void dissect_cops(tvbuff_t *tvb, packet_info *pinfo, proto_tree *tree) { #else static void dissect_cops(const guchar *pd, int offset, frame_data *fd, proto_tree *tree) { tvbuff_t *tvb; packet_info *pinfo = π tvb = tvb_create_from_top(offset); #endif Once we convert all the dissector-table dissectors, the second half of the #if-block will disappear. Exceptions The exception module from Kazlib was copied into the Ethereal tree. A header file "exceptions.h" was created which defines C preprocess macros that make the usage of the exception functions easier. The exception routines in Kazlib have a lot of features, but in Ethereal we only need a subset of those features, so the macros hide the complexity of the Kazlib calls, and try to emulate the syntax of languages which have native support for exceptions.