path: root/doc/README.developer
author gram <gram@f5534014-38df-0310-8fa8-9805f1628bb7> 2003-07-25 03:44:05 +0000
committer gram <gram@f5534014-38df-0310-8fa8-9805f1628bb7> 2003-07-25 03:44:05 +0000
commit 18f7f8359efffc1360b64a616878a55e0a05e79e
tree c295c5d9f4e05517f4d56f17032183c996ab27df /doc/README.developer
parent df80562e6f1493cd68dbb444cc83039d6eb93873
Add to the fundamental types passed between the scanner and the parser.
Besides "STRING", there is now "UNPARSED_STRING", where the distinction is
that "STRING" was a double-quoted string and "UNPARSED_STRING" is just a
sequence of characters that the scanner didn't know how to scan/parse, so
it's up to the Ftype to parse it. This gives us more flexibility and prepares
the dfilter parsing engine for the upcoming addition of the "contains"
operator.

In the process of doing this, I also re-did the double-quoted string support
in the scanner, so that instead of the naively-simple support we used to
have, double-quoted strings now can have embedded double-quotes, embedded
octal sequences, and embedded hexadecimal sequences:

    "\""    embedded double-quote
    "\110"  embedded octal
    "\x48"  embedded hex

Enhance the dfilter unit test script to be able to run a single collection of
tests instead of having to run all of them all the time.

git-svn-id: http://anonsvn.wireshark.org/wireshark/trunk@8083 f5534014-38df-0310-8fa8-9805f1628bb7
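The three escape forms listed in the commit message can be illustrated with a small decoder. This is only a sketch of the idea, not the scanner code from this commit: the function name decode_quoted(), its signature, the limits of two hex digits and three octal digits, and the choice to reject every other escape are all assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: numeric value of a character already known to be a
 * hexadecimal digit. */
static int
hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return c - 'A' + 10;
}

/* Hypothetical helper, not the real scanner: decode the body of a
 * double-quoted string (surrounding quotes already removed) into out.
 * Returns the number of bytes written, or -1 on a malformed escape or
 * if out is too small. */
static int
decode_quoted(const char *in, char *out, size_t outlen)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);

    while (*in != '\0') {
        unsigned int value = 0;
        int digits = 0;

        if (*in != '\\') {
            value = (unsigned char)*in++;          /* ordinary character */
        } else {
            in++;                                  /* skip the backslash */
            if (*in == '"') {                      /* \"   embedded quote */
                value = (unsigned char)*in++;
            } else if (*in == 'x') {               /* \xHH embedded hex */
                in++;
                while (digits < 2 && isxdigit((unsigned char)*in)) {
                    value = value * 16 + (unsigned int)hex_digit_value(*in++);
                    digits++;
                }
                if (digits == 0) return -1;
            } else if (*in >= '0' && *in <= '7') { /* \OOO embedded octal */
                while (digits < 3 && *in >= '0' && *in <= '7') {
                    value = value * 8 + (unsigned int)(*in++ - '0');
                    digits++;
                }
                if (value > 255) return -1;
            } else {
                return -1;                         /* sketch rejects others */
            }
        }
        if (n + 1 >= outlen) return -1;            /* keep room for the NUL */
        out[n++] = (char)value;
    }
    out[n] = '\0';
    return (int)n;
}
```

With this sketch, both "\110" and "\x48" decode to the single byte 'H', and "\"" decodes to a double-quote character.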
Diffstat (limited to 'doc/README.developer')
-rw-r--r-- doc/README.developer 56
1 files changed, 52 insertions, 4 deletions
diff --git a/doc/README.developer b/doc/README.developer
index 82f2151ac5..17395b40d7 100644
--- a/doc/README.developer
+++ b/doc/README.developer
@@ -1,4 +1,4 @@
-$Id: README.developer,v 1.76 2003/07/07 22:59:54 guy Exp $
+$Id: README.developer,v 1.77 2003/07/25 03:43:59 gram Exp $
This file is a HOWTO for Ethereal developers. It describes how to start coding
a Ethereal protocol dissector and the use some of the important functions and
@@ -208,7 +208,7 @@ code inside
is needed only if you are using the "snprintf()" function.
-The "$Id: README.developer,v 1.76 2003/07/07 22:59:54 guy Exp $"
+The "$Id: README.developer,v 1.77 2003/07/25 03:43:59 gram Exp $"
in the comment will be updated by CVS when the file is
checked in; it will allow the RCS "ident" command to report which
version of the file is currently checked out.
@@ -218,7 +218,7 @@ version of the file is currently checked out.
* Routines for PROTONAME dissection
* Copyright 2000, YOUR_NAME <YOUR_EMAIL_ADDRESS>
*
- * $Id: README.developer,v 1.76 2003/07/07 22:59:54 guy Exp $
+ * $Id: README.developer,v 1.77 2003/07/25 03:43:59 gram Exp $
*
* Ethereal - Network traffic analyzer
* By Gerald Combs <gerald@ethereal.com>
@@ -2136,7 +2136,55 @@ a dissector.
4.0 Extending Wiretap.
-5.0 Adding new capabilities.
+5.0 How the Display Filter Engine works
+
+code:
+epan/dfilter/* - the display filter engine, including
+ scanner, parser, syntax-tree semantics checker, DFVM bytecode
+ generator, and DFVM engine.
+epan/ftypes/* - the definitions of the various FT_* field types.
+epan/proto.c - proto_tree-related routines
+
+5.1 Parsing text
+
+The scanner/parser pair reads the string representing the display filter and
+converts it into a very simple syntax tree. The syntax tree is simple in that
+many of its nodes may still contain unparsed chunks of text from the display
+filter.
+
+5.1 Enhancing the syntax tree.
+
+The semantics of the simple syntax tree are checked to make sure that the fields
+that are being compared are being compared to appropriate values. For example,
+if a field is an integer, it can't be compared to a string, unless a value_string
+has been defined for that field.
+
+During the process of checking the semantics, the simple syntax tree is fleshed out
+and no longer contains nodes with unparsed information. The syntax tree is no
+longer in its simple form, but in its complete form.
+
+5.2 Converting to DFVM bytecode
+
+The syntax tree is analyzed to create a sequence of bytecodes in the "DFVM" language.
+"DFVM" stands for Display Filter Virtual Machine. The DFVM is similar in spirit, but
+not in definition, to the BPF VM that libpcap uses to analyze packets.
+
+A virtual bytecode is created and used so that the actual process of filtering packets
+will be fast. That is, it should be faster to process a list of VM bytecodes than
+to attempt to filter packets directly from the syntax tree. (heh... no measurement
+has been made to support this supposition)
+
+5.3 Filtering
+
+Once the DFVM bytecode has been produced, it's a simple matter of running the
+DFVM engine against the proto_tree from the packet dissection, using the DFVM bytecodes
+as instructions. If the DFVM bytecode is known before packet dissection occurs,
+the proto_tree-related code can be "primed" to store away pointers to field_info
+structures that are interesting to the display filter. This makes lookup of those
+field_info structures during the filtering process faster.
+
+
+6.0 Adding new capabilities.
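The bytecode and filtering steps described in the new sections 5.2 and 5.3 can be pictured with a toy interpreter. This is a sketch under stated assumptions and does not reproduce the real DFVM: the opcode names, the toy_insn and toy_field structures, the register count, and the flat array standing in for a proto_tree are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for one field_info in a proto_tree (hypothetical layout). */
typedef struct {
    int  field_id;
    long value;
} toy_field;

/* Hypothetical opcodes; the real DFVM instruction set is different. */
typedef enum {
    TOY_READ_FIELD, /* reg[a] = first field with id b; filter fails if absent */
    TOY_PUT_CONST,  /* reg[a] = b */
    TOY_CMP_EQ,     /* result = (reg[a] == reg[b]) */
    TOY_NOT,        /* result = !result */
    TOY_RETURN      /* stop and report result */
} toy_opcode;

typedef struct {
    toy_opcode op;
    long       a;
    long       b;
} toy_insn;

#define TOY_NUM_REGS 4

/* Run the bytecode against one packet's fields.
 * Returns 1 if the packet passes the filter, 0 otherwise. */
static int
toy_run(const toy_insn *code, size_t ncode,
        const toy_field *fields, size_t nfields)
{
    long   reg[TOY_NUM_REGS] = {0};
    int    result = 0;
    size_t pc, i;

    for (pc = 0; pc < ncode; pc++) {
        const toy_insn *insn = &code[pc];

        switch (insn->op) {
        case TOY_READ_FIELD: {
            int found = 0;

            assert(insn->a >= 0 && insn->a < TOY_NUM_REGS);
            for (i = 0; i < nfields; i++) {
                if (fields[i].field_id == insn->b) {
                    reg[insn->a] = fields[i].value;
                    found = 1;
                    break;
                }
            }
            if (!found)
                return 0;       /* field not in this packet */
            break;
        }
        case TOY_PUT_CONST:
            assert(insn->a >= 0 && insn->a < TOY_NUM_REGS);
            reg[insn->a] = insn->b;
            break;
        case TOY_CMP_EQ:
            assert(insn->a >= 0 && insn->a < TOY_NUM_REGS);
            assert(insn->b >= 0 && insn->b < TOY_NUM_REGS);
            result = (reg[insn->a] == reg[insn->b]);
            break;
        case TOY_NOT:
            result = !result;
            break;
        case TOY_RETURN:
            return result;
        }
    }
    return result;
}
```

A filter such as "field 1 equals 80" would compile, in this toy scheme, to READ_FIELD, PUT_CONST, CMP_EQ, RETURN. Running a flat instruction list like this avoids re-walking the syntax tree for every packet, which is the speed argument section 5.2 makes (and notes is unmeasured).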