path: root/doc/README.display_filter
diff options
authorUlf Lamping <ulf.lamping@web.de>2006-10-16 02:37:06 +0000
committerUlf Lamping <ulf.lamping@web.de>2006-10-16 02:37:06 +0000
commit262774ad51daeb313a8b5094b4ffa44962640715 (patch)
tree6f2d24f9ae8fa122ccf4141ff175348473baa23c /doc/README.display_filter
parentfd39e44fd0a981aed9fcab4a6aa8e8a2979bdfed (diff)
split the display filter engine doc into it's own file
svn path=/trunk/; revision=19549
Diffstat (limited to 'doc/README.display_filter')
1 files changed, 117 insertions, 0 deletions
diff --git a/doc/README.display_filter b/doc/README.display_filter
new file mode 100644
index 0000000..5f42f53
--- /dev/null
+++ b/doc/README.display_filter
@@ -0,0 +1,117 @@
+XXX - move this file to epan?
+1. How the Display Filter Engine works.
+epan/dfilter/* - the display filter engine, including
+ scanner, parser, syntax-tree semantics checker, DFVM bytecode
+ generator, and DFVM engine.
+epan/ftypes/* - the definitions of the various FT_* field types.
+epan/proto.c - proto_tree-related routines
+1.1 Parsing text.
+The scanner/parser pair read the string representing the display filter
+and convert it into a very simple syntax tree. The syntax tree is very
+simple in that it is possible that many of the nodes contain unparsed
+chunks of text from the display filter.
+1.1 Enhancing the syntax tree.
+The semantics of the simple syntax tree are checked to make sure that
+the fields that are being compared are being compared to appropriate
+values. For example, if a field is an integer, it can't be compared to
+a string, unless a value_string has been defined for that field.
+During the process of checking the semantics, the simple syntax tree is
+fleshed out and no longer contains nodes with unparsed information. The
+syntax tree is no longer in its simple form, but in its complete form.
+1.2 Converting to DFVM bytecode.
+The syntax tree is analyzed to create a sequence of bytecodes in the
+"DFVM" language. "DFVM" stands for Display Filter Virtual Machine. The
+DFVM is similar in spirit, but not in definition, to the BPF VM that
+libpcap uses to analyze packets.
+A virtual bytecode is created and used so that the actual process of
+filtering packets will be fast. That is, it should be faster to process
+a list of VM bytecodes than to attempt to filter packets directly from
+the syntax tree. (heh... no measurement has been made to support this
+1.3 Filtering.
+Once the DFVM bytecode has been produced, it's a simple matter of
+running the DFVM engine against the proto_tree from the packet
+dissection, using the DFVM bytecodes as instructions. If the DFVM
+bytecode is known before packet dissection occurs, the
+proto_tree-related code can be "primed" to store away pointers to
+field_info structures that are interesting to the display filter. This
+makes lookup of those field_info structures during the filtering process
+1.4 Display Filter Functions.
+You define a display filter function by adding an entry to
+the df_functions table in epan/dfilter/dfunctions.c. The record struct
+is defined in defunctions.h, and shown here:
+typedef struct {
+ char *name;
+ DFFuncType function;
+ ftenum_t retval_ftype;
+ guint min_nargs;
+ guint max_nargs;
+ DFSemCheckType semcheck_param_function;
+} df_func_def_t;
+name - the name of the function; this is how the user will call your
+ function in the display filter language
+function - this is the run-time processing of your function.
+retval_ftype - what type of FT_* type does your function return?
+min_nargs - minimum number of arguments your function accepts
+max_nargs - maximum number of arguments your function accepts
+semcheck_param_function - called during the semantic check of the
+ display filter string.
+DFFuncType function
+typedef gboolean (*DFFuncType)(GList *arg1list, GList *arg2list, GList **retval);
+The return value of your function is a gboolean; TRUE if processing went fine,
+or FALSE if there was some sort of exception.
+For now, display filter functions can accept a maximum of 2 arguments.
+The "arg1list" parameter is the GList for the first argument. The
+'arg2list" parameter is the GList for the second argument. All arguments
+to display filter functions are lists. This is because in the display
+filter language a protocol field may have multiple instances. For example,
+a field like "ip.addr" will exist more than once in a single frame. So
+when the user invokes this display filter:
+ somefunc(ip.addr) == TRUE
+even though "ip.addr" is a single argument, the "somefunc" function will
+receive a GList of *all* the values of "ip.addr" in the frame.
+Similarly, the return value of the function needs to be a GList, since all
+values in the display filter language are lists. The GList** retval argument
+is passed to your function so you can set the pointer to your return value.
+typedef void (*DFSemCheckType)(int param_num, stnode_t *st_node);
+For each parameter in the syntax tree, this function will be called.
+"param_num" will indicate the number of the parameter, starting with 0.
+The "stnode_t" is the syntax-tree node representing that parameter.
+If everything is okay with the value of that stnode_t, your function
+does nothing --- it merely returns. If something is wrong, however,
+it should THROW a TypeError exception.