diff options
author | Ulf Lamping <ulf.lamping@web.de> | 2006-10-16 02:37:06 +0000 |
---|---|---|
committer | Ulf Lamping <ulf.lamping@web.de> | 2006-10-16 02:37:06 +0000 |
commit | 262774ad51daeb313a8b5094b4ffa44962640715 (patch) | |
tree | 6f2d24f9ae8fa122ccf4141ff175348473baa23c /doc | |
parent | fd39e44fd0a981aed9fcab4a6aa8e8a2979bdfed (diff) |
split the display filter engine doc into it's own file
svn path=/trunk/; revision=19549
Diffstat (limited to 'doc')
-rw-r--r-- | doc/README.developer | 114 | ||||
-rw-r--r-- | doc/README.display_filter | 117 |
2 files changed, 119 insertions, 112 deletions
diff --git a/doc/README.developer b/doc/README.developer index c669d9137e..efd9a76709 100644 --- a/doc/README.developer +++ b/doc/README.developer @@ -3237,119 +3237,9 @@ a dissector. See wiretap/README.developer. -5. How the Display Filter Engine works. +5. Display Filter Engine. -code: -epan/dfilter/* - the display filter engine, including - scanner, parser, syntax-tree semantics checker, DFVM bytecode - generator, and DFVM engine. -epan/ftypes/* - the definitions of the various FT_* field types. -epan/proto.c - proto_tree-related routines - -5.1 Parsing text. - -The scanner/parser pair read the string representing the display filter -and convert it into a very simple syntax tree. The syntax tree is very -simple in that it is possible that many of the nodes contain unparsed -chunks of text from the display filter. - -5.1 Enhancing the syntax tree. - -The semantics of the simple syntax tree are checked to make sure that -the fields that are being compared are being compared to appropriate -values. For example, if a field is an integer, it can't be compared to -a string, unless a value_string has been defined for that field. - -During the process of checking the semantics, the simple syntax tree is -fleshed out and no longer contains nodes with unparsed information. The -syntax tree is no longer in its simple form, but in its complete form. - -5.2 Converting to DFVM bytecode. - -The syntax tree is analyzed to create a sequence of bytecodes in the -"DFVM" language. "DFVM" stands for Display Filter Virtual Machine. The -DFVM is similar in spirit, but not in definition, to the BPF VM that -libpcap uses to analyze packets. - -A virtual bytecode is created and used so that the actual process of -filtering packets will be fast. That is, it should be faster to process -a list of VM bytecodes than to attempt to filter packets directly from -the syntax tree. (heh... no measurement has been made to support this -supposition) - -5.3 Filtering. - -Once the DFVM bytecode has been produced, it's a simple matter of -running the DFVM engine against the proto_tree from the packet -dissection, using the DFVM bytecodes as instructions. If the DFVM -bytecode is known before packet dissection occurs, the -proto_tree-related code can be "primed" to store away pointers to -field_info structures that are interesting to the display filter. This -makes lookup of those field_info structures during the filtering process -faster. - -5.4 Display Filter Functions. - -You define a display filter function by adding an entry to -the df_functions table in epan/dfilter/dfunctions.c. The record struct -is defined in defunctions.h, and shown here: - -typedef struct { - char *name; - DFFuncType function; - ftenum_t retval_ftype; - guint min_nargs; - guint max_nargs; - DFSemCheckType semcheck_param_function; -} df_func_def_t; - -name - the name of the function; this is how the user will call your - function in the display filter language - -function - this is the run-time processing of your function. - -retval_ftype - what type of FT_* type does your function return? - -min_nargs - minimum number of arguments your function accepts -max_nargs - maximum number of arguments your function accepts - -semcheck_param_function - called during the semantic check of the - display filter string. - -DFFuncType function -------------------- -typedef gboolean (*DFFuncType)(GList *arg1list, GList *arg2list, GList **retval); - -The return value of your function is a gboolean; TRUE if processing went fine, -or FALSE if there was some sort of exception. - -For now, display filter functions can accept a maximum of 2 arguments. -The "arg1list" parameter is the GList for the first argument. The -'arg2list" parameter is the GList for the second argument. All arguments -to display filter functions are lists. This is because in the display -filter language a protocol field may have multiple instances. For example, -a field like "ip.addr" will exist more than once in a single frame. So -when the user invokes this display filter: - - somefunc(ip.addr) == TRUE - -even though "ip.addr" is a single argument, the "somefunc" function will -receive a GList of *all* the values of "ip.addr" in the frame. - -Similarly, the return value of the function needs to be a GList, since all -values in the display filter language are lists. The GList** retval argument -is passed to your function so you can set the pointer to your return value. - -DFSemCheckType --------------- -typedef void (*DFSemCheckType)(int param_num, stnode_t *st_node); - -For each parameter in the syntax tree, this function will be called. -"param_num" will indicate the number of the parameter, starting with 0. -The "stnode_t" is the syntax-tree node representing that parameter. -If everything is okay with the value of that stnode_t, your function -does nothing --- it merely returns. If something is wrong, however, -it should THROW a TypeError exception. +See README.display_filter. 6. The end diff --git a/doc/README.display_filter b/doc/README.display_filter new file mode 100644 index 0000000000..5f42f538d4 --- /dev/null +++ b/doc/README.display_filter @@ -0,0 +1,117 @@ +$Id$ + +XXX - move this file to epan? + +1. How the Display Filter Engine works. + +code: +epan/dfilter/* - the display filter engine, including + scanner, parser, syntax-tree semantics checker, DFVM bytecode + generator, and DFVM engine. +epan/ftypes/* - the definitions of the various FT_* field types. +epan/proto.c - proto_tree-related routines + +1.1 Parsing text. + +The scanner/parser pair read the string representing the display filter +and convert it into a very simple syntax tree. The syntax tree is very +simple in that it is possible that many of the nodes contain unparsed +chunks of text from the display filter. + +1.1 Enhancing the syntax tree. + +The semantics of the simple syntax tree are checked to make sure that +the fields that are being compared are being compared to appropriate +values. For example, if a field is an integer, it can't be compared to +a string, unless a value_string has been defined for that field. + +During the process of checking the semantics, the simple syntax tree is +fleshed out and no longer contains nodes with unparsed information. The +syntax tree is no longer in its simple form, but in its complete form. + +1.2 Converting to DFVM bytecode. + +The syntax tree is analyzed to create a sequence of bytecodes in the +"DFVM" language. "DFVM" stands for Display Filter Virtual Machine. The +DFVM is similar in spirit, but not in definition, to the BPF VM that +libpcap uses to analyze packets. + +A virtual bytecode is created and used so that the actual process of +filtering packets will be fast. That is, it should be faster to process +a list of VM bytecodes than to attempt to filter packets directly from +the syntax tree. (heh... no measurement has been made to support this +supposition) + +1.3 Filtering. + +Once the DFVM bytecode has been produced, it's a simple matter of +running the DFVM engine against the proto_tree from the packet +dissection, using the DFVM bytecodes as instructions. If the DFVM +bytecode is known before packet dissection occurs, the +proto_tree-related code can be "primed" to store away pointers to +field_info structures that are interesting to the display filter. This +makes lookup of those field_info structures during the filtering process +faster. + +1.4 Display Filter Functions. + +You define a display filter function by adding an entry to +the df_functions table in epan/dfilter/dfunctions.c. The record struct +is defined in defunctions.h, and shown here: + +typedef struct { + char *name; + DFFuncType function; + ftenum_t retval_ftype; + guint min_nargs; + guint max_nargs; + DFSemCheckType semcheck_param_function; +} df_func_def_t; + +name - the name of the function; this is how the user will call your + function in the display filter language + +function - this is the run-time processing of your function. + +retval_ftype - what type of FT_* type does your function return? + +min_nargs - minimum number of arguments your function accepts +max_nargs - maximum number of arguments your function accepts + +semcheck_param_function - called during the semantic check of the + display filter string. + +DFFuncType function +------------------- +typedef gboolean (*DFFuncType)(GList *arg1list, GList *arg2list, GList **retval); + +The return value of your function is a gboolean; TRUE if processing went fine, +or FALSE if there was some sort of exception. + +For now, display filter functions can accept a maximum of 2 arguments. +The "arg1list" parameter is the GList for the first argument. The +'arg2list" parameter is the GList for the second argument. All arguments +to display filter functions are lists. This is because in the display +filter language a protocol field may have multiple instances. For example, +a field like "ip.addr" will exist more than once in a single frame. So +when the user invokes this display filter: + + somefunc(ip.addr) == TRUE + +even though "ip.addr" is a single argument, the "somefunc" function will +receive a GList of *all* the values of "ip.addr" in the frame. + +Similarly, the return value of the function needs to be a GList, since all +values in the display filter language are lists. The GList** retval argument +is passed to your function so you can set the pointer to your return value. + +DFSemCheckType +-------------- +typedef void (*DFSemCheckType)(int param_num, stnode_t *st_node); + +For each parameter in the syntax tree, this function will be called. +"param_num" will indicate the number of the parameter, starting with 0. +The "stnode_t" is the syntax-tree node representing that parameter. +If everything is okay with the value of that stnode_t, your function +does nothing --- it merely returns. If something is wrong, however, +it should THROW a TypeError exception. |