$Id: proto_tree,v 1.5 1999/10/12 06:21:15 gram Exp $

The Ethereal Protocol Tree
==========================

Up until version 0.6.3 of Ethereal, the protocol tree that is displayed
in the middle pane of the Ethereal GUI had been created by having
the protocol dissection routines add strings to a GTK+ tree. This
GUI container was not easily manipulated; the print routines had to
reach inside what should be an opaque GUI structure and pull out the
data. The tree of strings also did not lend itself to filtering on the
data available in the tree.

Mostly to solve the display filter problem, I decided to have the
protocol dissection routines put their data into a logical tree instead
of a GUI tree. This tree structure would provide a generic way for
multiple routines, like the dissection routines, the display filter
routines, and the print routines, to retrieve data about the protocol
fields. The GUI routines would then be modified to draw the GUI tree
based on the data in the logical tree. By structuring this logical tree
well, with well-defined field types, Ethereal can have a very powerful
display filter option. No longer would display filters be limited to
the ability of the BPF compiler (libpcap or wiretap), but would have
access to the full range of C field types available within Ethereal.

In Ethereal 0.7.6, I decided to extend the information that the
programmer must provide about each field. I was frustrated by the way in
which the original proto_tree code handled bitfields. By providing a
small amount of extra info, bitfields can now be added very easily to
the proto_tree. In addition, filtering on bitfields now works more
naturally.

The protocol tree, or proto_tree, is a GNode, the N-way tree structure
available within GLIB. Of course the protocol dissectors don't care
what a proto_tree really is; they just pass the proto_tree pointer as an
argument to the routines which allow them to add items and new branches
to the tree.

When a packet is selected in the packet-list pane, a new logical
protocol tree (proto_tree) is created. The pointer to the proto_tree (in
this case, 'protocol tree') is passed to the top-level protocol
dissector, and then to all subsequent protocol dissectors for that
packet, and then the GUI tree is drawn via proto_tree_draw().

Programming for the proto_tree
==============================

The logical proto_tree needs to know detailed information about the
protocols and fields about which information will be collected from the
dissection routines. By strictly defining (or "typing") the data that
can be attached to a proto tree, searching and filtering becomes
possible. This means that for every protocol and field (which I also
call "header fields", since they are fields in the protocol headers)
which might be attached to a tree, some information is needed.

Every dissector routine will need to register its protocols and fields
with the central protocol routines (in proto.c). At first I thought I
might keep all the protocol and field information about all the
dissectors in one file, but decentralization seemed like a better idea.
That one file would have gotten very large; one small change would have
required a re-compilation of the entire file. Also, by allowing
registration of protocols and fields at run-time, loadable modules of
protocol dissectors (perhaps even user-supplied) become feasible.

For every protocol or field that a dissector wants to register, a
variable of type int needs to be used to keep track of its registration
ID. The IDs are needed for establishing parent/child relationships
between protocols and fields, as well as associating data with a
particular field so that it can be stored in the logical tree and
displayed in the GUI protocol tree.

Some dissectors will need to create branches within their tree to help
organize header fields. These branches should be registered as header
fields. Only true protocols should be registered as protocols.
This is so that a display filter user interface knows how to distinguish
protocols from fields.

A protocol is registered with the name of the protocol and its
abbreviation.

Here is how the frame "protocol" is registered.

    int proto_frame;

    proto_frame = proto_register_protocol (
        /* name */      "Frame",
        /* abbrev */    "frame" );

A header field is also registered with its name and abbreviation, but
information about its data type is needed as well. It helps to look at
the header_field_info struct to see what information is expected:

    struct header_field_info {
        char        *name;
        char        *abbrev;
        enum ftenum type;
        int         display;
        void        *strings;
        guint       bitmask;
        char        *blurb;

        int         id;         /* calculated */
        int         parent;
        int         bitshift;   /* calculated */
    };

name
----
A string representing the name of the field. This is the name that will
appear in the graphical protocol tree.

abbrev
------
A string with an abbreviation of the field. We concatenate the
abbreviation of the parent protocol with an abbreviation for the field,
using a period as a separator. For example, the "src" field in an IP
packet would have "ip.src" as an abbreviation. It is acceptable to have
multiple levels of periods if, for example, you have fields in your
protocol that are then subdivided into subfields. For example, TRMAC has
multiple error fields, so the abbreviations follow this pattern:
"trmac.errors.iso", "trmac.errors.noniso", etc.

The abbreviation is the identifier used in a display filter.

type
----
The type of value this field holds. The current field types are:

    FT_NONE, FT_BOOLEAN, FT_UINT8, FT_UINT16, FT_UINT24, FT_UINT32,
    FT_INT8, FT_INT16, FT_INT24, FT_INT32, FT_DOUBLE,
    FT_ABSOLUTE_TIME, FT_RELATIVE_TIME, FT_STRING, FT_ETHER, FT_BYTES,
    FT_IPv4, FT_IPv6, FT_IPXNET

Some of these field types are still not handled in the display filter
routines, but the most common ones are. The FT_UINT* types all
represent unsigned integers; the number on the end represents how many
bits are used to represent the number.
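
To make the meaning of the number at the end of the FT_UINT* names
concrete, here is a minimal sketch. The enum and both functions are
hypothetical stand-ins invented for this illustration; they are not
part of proto.c, and the real enum ftenum may look different.

```c
#include <assert.h>

/* Hypothetical stand-ins for a few of the field types listed above.
 * The real enum ftenum is defined in the Ethereal sources; these names
 * and values are assumptions made only for this sketch. */
enum sketch_ftenum {
    SKETCH_FT_NONE,
    SKETCH_FT_UINT8,
    SKETCH_FT_UINT16,
    SKETCH_FT_UINT24,
    SKETCH_FT_UINT32,
    SKETCH_FT_STRING
};

/* Width in bits implied by an unsigned integer field type,
 * or 0 for types that are not unsigned integers. */
int sketch_uint_bits(enum sketch_ftenum type)
{
    switch (type) {
    case SKETCH_FT_UINT8:  return 8;
    case SKETCH_FT_UINT16: return 16;
    case SKETCH_FT_UINT24: return 24;
    case SKETCH_FT_UINT32: return 32;
    default:               return 0;
    }
}

/* Largest value a field of the given unsigned integer type can hold. */
unsigned long sketch_uint_max(enum sketch_ftenum type)
{
    int bits = sketch_uint_bits(type);

    assert(bits > 0);   /* only meaningful for the UINT types */
    return (bits == 32) ? 0xFFFFFFFFUL : ((1UL << bits) - 1UL);
}
```

The same width is what lets an integer type double as the width of a
parent bitfield, so no separate width needs to be supplied for integers.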

display
-------
The display field has a couple of overloaded uses. This is unfortunate,
but since we're using C as an application programming language, this
sometimes makes for cleaner programs. Right now I still think that
overloading this variable was okay.

For integer fields (FT_UINT*), this variable represents the base in
which you would like the value displayed. The acceptable bases are:

    BASE_DEC, BASE_HEX, BASE_OCT, BASE_BIN

For FT_BOOLEAN fields that are also bitfields, 'display' is used to tell
the proto_tree how wide the parent bitfield is. With integers this is
not needed since the type of integer itself (FT_UINT8, FT_UINT16,
FT_UINT24, FT_UINT32) tells the proto_tree how wide the parent bitfield
is.

Additionally, BASE_NONE is used for 'display' as a NULL-value. That is,
for non-integers and non-bitfield FT_BOOLEANs, you'll want to use
BASE_NONE in the 'display' field.

It is possible that in the future we will record the endianness of
integers. If so, it is likely that we'll use a bitmask on the display
field so that integers would be represented as BEND|BASE_DEC or
LEND|BASE_HEX. But that has not happened yet.

strings
-------
Some integer fields need labels to represent the true value of a field.
A value_string structure is a way to map values to strings.

    typedef struct _value_string {
        guint32 value;
        gchar   *strptr;
    } value_string;

For FT_UINT* fields, the 'strings' field is a pointer to an array of
such value_string structs. (Note: before Ethereal 0.7.6, we had separate
field types like FT_VALS_UINT8 which denoted the use of value_strings.
Now, the non-NULLness of the pointer lets the proto_tree know that a
value_string is meant for this field).

FT_BOOLEANs have a default map of 0 = "False", 1 (or anything else) =
"True". Sometimes it is useful to change the labels for boolean values
(e.g., to "Yes"/"No", "Fast"/"Slow", etc.). For these mappings, a struct
called true_false_string is used. (This struct is new as of Ethereal
0.7.6).

    typedef struct true_false_string {
        char    *true_string;
        char    *false_string;
    } true_false_string;

Its two fields are pointers to the string representing truth, and the
string representing falsehood. For FT_BOOLEAN fields that need a
true_false_string struct, the 'strings' field is a pointer to that
struct.

bitmask
-------
If the field is not a bitfield, then bitmask should be set to 0. If it
is a bitfield, then the bitmask is the mask which will leave only the
bits needed to make the field when ANDed with a value. The proto_tree
routines will calculate 'bitshift' automatically from 'bitmask', by
finding the first set bit in the bitmask (counting from the least
significant bit).

blurb
-----
This is a string giving a sentence or two description of the field. It
is meant to provide a more detailed description of the field than the
name alone provides. This information will be used in the man page, and
in a future GUI display-filter creation tool. We might also add tooltips
to the labels in the GUI protocol tree, in which case the blurb would be
used as the tooltip text.

Field Registration
------------------
Field registration is handled by creating an instance of the
hf_register_info struct (or an array of such structs), and calling the
registration function along with the registration ID of the protocol
that is the parent of the fields. Here is a complete example:

    int proto_eg = -1;
    int hf_field_a = -1;
    int hf_field_b = -1;

    /* 'vs' is a value_string array defined elsewhere */
    static hf_register_info hf[] = {

        { &hf_field_a,
        { "Field A", "proto.field_a", FT_UINT8, BASE_HEX, NULL,
            0xf0, "Field A represents Apples" }},

        { &hf_field_b,
        { "Field B", "proto.field_b", FT_UINT16, BASE_DEC, VALS(vs),
            0x0, "Field B represents Bananas" }}
    };

    proto_eg = proto_register_protocol("Example Protocol", "proto");
    proto_register_field_array(proto_eg, hf, array_length(hf));

Be sure that your array of hf_register_info structs is declared
'static', since the proto_register_field_array() function does not
create a copy of the information in the array...
it uses that static copy of the information that the compiler created
inside your array. Here's the layout of the hf_register_info struct:

    typedef struct hf_register_info {
        int                 *p_id;  /* pointer to parent variable */
        header_field_info   hfinfo;
    } hf_register_info;

Also be sure to use the handy array_length() macro found in packet.h to
have the compiler compute the array length for you at compile time.

Adding Items and Values to the Protocol Tree
--------------------------------------------
A protocol item is added to an existing protocol tree with one of a
handful of proto_tree_add_item*() functions. Subtrees can be made with
the proto_item_add_subtree() function:

    item = proto_tree_add_item(....);
    new_tree = proto_item_add_subtree(item, tree_type);

There are now 4 functions that the programmer can use to add either
protocol or field labels to the proto_tree:

    proto_item*
    proto_tree_add_item(tree, id, start, length, value);

    proto_item*
    proto_tree_add_item_format(tree, id, start, length, value,
        format, ...);

    proto_item*
    proto_tree_add_item_hidden(tree, id, start, length, value);

    proto_item*
    proto_tree_add_text(tree, start, length, format, ...);

proto_tree_add_item()
---------------------
The first function, proto_tree_add_item(), is used when you wish to do
no special formatting. The item added to the GUI tree will contain the
name (as passed in the proto_register_*() function) and any value. If
your field does have a value, it is passed after the length variable
(notice the ellipsis in the actual function prototype; 'value' above
stands in for it).

Now that the proto_tree has detailed information about bitfield fields,
you can use proto_tree_add_item() with no extra processing to add
bitfield values to your tree. Here's an example. Take the Format
Identifier (FID) field in the Transmission Header (TH) portion of the
SNA protocol. The FID is the high nibble of the first byte of the TH.
The FID would be registered like this:

    name    = "Format Identifier"
    abbrev  = "sna.th.fid"
    type    = FT_UINT8
    display = BASE_HEX
    strings = sna_th_fid_vals
    bitmask = 0xf0

The bitmask contains the value which would leave only the FID if
bitwise-ANDed against the parent field, the first byte of the TH.

The code to add the FID to the tree would be:

    guint8 th_0 = pd[offset];
    proto_tree_add_item(bf_tree, hf_sna_th_fid, offset, 1, th_0);

Note: we do not do *any* manipulation of th_0 in order to get the FID
value. We just pass it to proto_tree_add_item(). The proto_tree already
has the information about bitmasking and bitshifting, so it does the
work of masking and shifting for us! This also means that you no longer
have to create value_string structs with the values bitshifted. The
value_string for FID looks like this, even though the FID value is
actually contained in the high nibble. (You might have expected the
values to be 0x0, 0x10, 0x20, etc.)

    /* Format Identifier */
    static const value_string sna_th_fid_vals[] = {
        { 0x0,  "SNA device <--> Non-SNA Device" },
        { 0x1,  "Subarea Node <--> Subarea Node" },
        { 0x2,  "Subarea Node <--> PU2" },
        { 0x3,  "Subarea Node or SNA host <--> Subarea Node" },
        { 0x4,  "?" },
        { 0x5,  "?" },
        { 0xf,  "Adjacent Subarea Nodes" },
        { 0,    NULL }
    };

The final implication of this is that display filters work the way you'd
naturally expect them to. You'd type "sna.th.fid == 0xf" to find
Adjacent Subarea Nodes. The user does not have to shift the value of the
FID to the high nibble of the byte ("sna.th.fid == 0xf0") as was
necessary before Ethereal 0.7.6.

proto_tree_add_item_format()
----------------------------
The second function, proto_tree_add_item_format(), is used when the
dissector routine wants complete control over how the field and value
will be represented on the GUI tree. The caller must include the name of
the protocol or field in the formatted text; it is not added
automatically as in proto_tree_add_item().
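
To make the masking and shifting from the FID example in the
proto_tree_add_item() section concrete, here is a minimal sketch of what
the proto_tree is described as doing with 'bitmask' and 'bitshift'. The
function names are hypothetical and the code is my own illustration
under that description, not the actual code in proto.c.

```c
#include <assert.h>
#include <stdint.h>

/* Position of the lowest set bit in the mask, i.e. how far a masked
 * value must be shifted right. Hypothetical helper for this sketch. */
int sketch_bitshift(uint32_t bitmask)
{
    int shift = 0;

    assert(bitmask != 0);   /* a bitfield must have at least one bit */
    while ((bitmask & 1u) == 0) {
        bitmask >>= 1;
        shift++;
    }
    return shift;
}

/* Value of a field taken from its parent integer. A bitmask of 0 means
 * "not a bitfield", so the parent value is returned unchanged. */
uint32_t sketch_field_value(uint32_t parent, uint32_t bitmask)
{
    if (bitmask == 0)
        return parent;
    return (parent & bitmask) >> sketch_bitshift(bitmask);
}
```

With th_0 equal to 0x2d and the FID bitmask of 0xf0, the sketch yields
0x2, which is the unshifted value that appears in sna_th_fid_vals and
the value a user would type in a display filter.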

proto_tree_add_item_hidden()
----------------------------
The third function is used to add fields and values to a tree, but not
show them on a GUI tree. The caller may want a value to be included in a
tree so that the packet can be filtered on this field, but the
representation of that field in the tree is not appropriate. An example
is the token-ring routing information field (RIF). The best way to show
the RIF in a GUI is by a sequence of ring and bridge numbers. Rings are
3-digit hex numbers, and bridges are single hex digits:

    RIF: 001-A-013-9-C0F-B-555

In the case of RIF, the programmer should use a field with no value and
use proto_tree_add_item_format() to build the above representation. The
programmer can then add the ring and bridge values, one-by-one, with
proto_tree_add_item_hidden() so that the user can then filter on or
search for a particular ring or bridge. Here's a skeleton of how the
programmer might code this.

    char *rif;
    rif = create_rif_string(...);

    proto_tree_add_item_format(tree, hf_tr_rif_label,..., "RIF: %s", rif);

    for(i = 0; i < num_rings; i++) {
        proto_tree_add_item_hidden(tree, hf_tr_rif_ring, ..., ring[i]);
    }
    for(i = 0; i < num_rings - 1; i++) {
        proto_tree_add_item_hidden(tree, hf_tr_rif_bridge, ..., bridge[i]);
    }

The logical tree has these items:

    hf_tr_rif_label, text="RIF: 001-A-013-9-C0F-B-555", value = NONE
    hf_tr_rif_ring,   hidden, value=0x001
    hf_tr_rif_bridge, hidden, value=0xA
    hf_tr_rif_ring,   hidden, value=0x013
    hf_tr_rif_bridge, hidden, value=0x9
    hf_tr_rif_ring,   hidden, value=0xC0F
    hf_tr_rif_bridge, hidden, value=0xB
    hf_tr_rif_ring,   hidden, value=0x555

GUI or print code will not display the hidden fields, but a display
filter or "packet grep" routine will still see the values. The following
filter is then possible:

    tr.rif_ring eq 0x013

proto_tree_add_text()
---------------------
The fourth function, proto_tree_add_text(), is used to add a label to
the GUI tree.
It will contain no value, so it is not searchable in the display filter
process. This function was needed in the transition from the old-style
proto_tree to this new-style proto_tree so that Ethereal would still
decode all protocols, even before every protocol and field could be
filtered on. Otherwise we would have had to cripple Ethereal's
functionality while we converted all the old-style proto_tree calls to
the new-style proto_tree calls.
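
Returning to the token-ring RIF example in the
proto_tree_add_item_hidden() section: the label text there (rings as
3-digit hex numbers, bridges as single hex digits, joined by dashes)
could be built along the following lines. This is only a sketch. The
function name, its parameters, and the 12-bit and 4-bit masks are my
assumptions for illustration; it is not the create_rif_string() routine
referred to in the skeleton.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write a label such as "001-A-013-9-C0F-B-555" into buf.
 * ring[] holds num_rings entries, bridge[] holds num_rings - 1 entries.
 * Returns the number of characters written, or -1 if buf is too small.
 * Hypothetical helper for this sketch only. */
int sketch_rif_string(char *buf, size_t buflen,
                      const unsigned *ring, const unsigned *bridge,
                      int num_rings)
{
    size_t used = 0;
    int i;

    assert(buf != NULL && buflen > 0 && num_rings > 0);

    for (i = 0; i < num_rings; i++) {
        int n;

        if (i < num_rings - 1) {
            /* every ring except the last is followed by a bridge */
            n = snprintf(buf + used, buflen - used, "%03X-%X-",
                         ring[i] & 0xfffu, bridge[i] & 0xfu);
        } else {
            n = snprintf(buf + used, buflen - used, "%03X",
                         ring[i] & 0xfffu);
        }
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;      /* output error or truncated output */
        used += (size_t)n;
    }
    return (int)used;
}
```

The resulting string would then be handed to
proto_tree_add_item_format() with a format like "RIF: %s", while the
individual ring and bridge numbers go into the tree as hidden items.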