diff options
author | Bill Meier <wmeier@newsguy.com> | 2010-12-18 18:07:36 +0000 |
---|---|---|
committer | Bill Meier <wmeier@newsguy.com> | 2010-12-18 18:07:36 +0000 |
commit | 70ce24ded56a5c2fd50b38741e7dca7951c3b888 (patch) | |
tree | 757a02d29981d717ea3bfc9b2e1a9b605ae24583 | |
parent | cd5544679a11098ab7ebdfd74abc7588e0949767 (diff) |
Revert SVN #35213 based upon comments in the Wireshark-dev list.
http://www.wireshark.org/lists/wireshark-dev/201012/msg00206.html
svn path=/trunk/; revision=35219
-rw-r--r-- | docbook/wsdg_src/WSDG_chapter_build_intro.xml | 1086 | ||||
-rw-r--r-- | docbook/wsdg_src/WSDG_preface.xml | 80 |
2 files changed, 84 insertions, 1082 deletions
diff --git a/docbook/wsdg_src/WSDG_chapter_build_intro.xml b/docbook/wsdg_src/WSDG_chapter_build_intro.xml index 3f92ada4c9..93106cf0b1 100644 --- a/docbook/wsdg_src/WSDG_chapter_build_intro.xml +++ b/docbook/wsdg_src/WSDG_chapter_build_intro.xml @@ -3,1041 +3,67 @@ <chapter id="ChapterBuildIntro"> <title>Introduction</title> - + <section id="ChCodeOverview"> - <title>Source overview</title> - <para> - Wireshark consists of the following major parts: - <itemizedlist> - <listitem> - <para> - Packet dissection - in the /epan/dissector and /plugin/* directory - </para> - </listitem> - <listitem> - <para> - File I/O - using Wireshark’s own wiretap library - </para> - </listitem> - <listitem> - <para> - Capture - using the libpcap/winpcap library, in /wiretap - </para> - </listitem> - <listitem> - <para> - User interface - using the GTK+ (and corresponding) libraries - </para> - </listitem> - <listitem> - <para> - Help - using an external webbrowser and GTK text output - </para> - </listitem> - </itemizedlist> - Beside this, some other minor parts and additional helpers exist. - </para> + <title>Source overview</title> + <para> + Wireshark consists of the following major parts: + <itemizedlist> + <listitem><para> + Packet dissection - in the /epan/dissector and /plugin/* directory + </para></listitem> + <listitem><para> + File I/O - using Wireshark's own wiretap library + </para></listitem> + <listitem><para> + Capture - using the libpcap/winpcap library, in /wiretap + </para></listitem> + <listitem><para> + User interface - using the GTK+ (and corresponding) libraries + </para></listitem> + <listitem><para> + Help - using an external webbrowser and GTK text output + </para></listitem> + </itemizedlist> + Beside this, some other minor parts and additional helpers exist. + </para> + <para> + Currently there's no clean separation of the modules in the code. + However, as the development team switched from Concurrent Versions System + (CVS) to Subversion (SVN) some time ago, + directory cleanup is much easier now. So there's a chance that + the directory structure will become clean in the future. + </para> </section> <section id="ChCodeStyle"> - <title>Coding Styleguide</title> - <section id="ChCodeStylePortability"> - <title>Portability</title> - <simpara> - Wireshark runs on many platforms, and can be compiled with a number of - different compilers; here are some rules for writing code that will work - on multiple platforms. - </simpara> - <itemizedlist> - <listitem> - <simpara> - Don’t use C++ style comments (comments beginning with <literal>//</literal> and running - to the end of the line); Wireshark’s dissectors are written in C, and - thus run through C rather than C++ compilers, and not all C compilers - support C++ style comments (GCC does, but IBM’s C compiler for AIX, for - example, doesn’t do so by default). - </simpara> - </listitem> - <listitem> - <simpara> - In general, don’t use C99 features since some C compilers used to compile - Wireshark don’t support C99 (e.g., Microsoft C). - </simpara> - </listitem> - <listitem> - <simpara> - Don’t initialize variables in their declaration with non-constant - values. Not all compilers support this. E.g., don’t use - </simpara> - -<programlisting> guint32 i = somearray[2];</programlisting> - - <simpara>use</simpara> - -<programlisting> guint32 i; - i = somearray[2];</programlisting> - - <simpara>instead.</simpara> - </listitem> - <listitem> - <simpara> - Don’t use zero-length arrays; not all compilers support them. If an - array would have no members, just leave it out. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t declare variables in the middle of executable code; not all C - compilers support that. Variables should be declared outside a - function, or at the beginning of a function or compound statement. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t use anonymous unions; not all compilers support them. - Example: - </simpara> - -<programlisting>typedef struct foo { - guint32 foo; - union { - guint32 foo_l; - guint16 foo_s; - } u; /* have a name here */ -} foo_t;</programlisting> - - </listitem> - <listitem> - <simpara> - Don’t use <literal>uchar</literal>, <literal>u_char</literal>, - <literal>ushort</literal>, <literal>u_short</literal>, <literal>uint</literal>, - <literal>u_int</literal>, - <literal>ulong</literal>, <literal>u_long</literal> or - <literal>boolean</literal>; they aren’t defined on all platforms. - If you want an 8-bit unsigned quantity, use <literal>guint8</literal>; if you want an - 8-bit character value with the 8th bit not interpreted as a sign bit, - use <literal>guchar</literal>; if you want a 16-bit unsigned quantity, - use <literal>guint16</literal>; - if you want a 32-bit unsigned quantity, use <literal>guint32</literal>; and if you want - an <literal>int-sized</literal> unsigned quantity, - use <literal>guint</literal>; if you want a boolean, - use <literal>gboolean</literal>. Use <literal>%d</literal>, <literal>%u</literal>, - <literal>%x</literal>, and <literal>%o</literal> to print those types; - don’t use <literal>%ld</literal>, <literal>%lu</literal>, - <literal>%lx</literal>, or <literal>%lo</literal>, as longs are 64 bits long on - many platforms, but <literal>guint32</literal> is 32 bits long. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t use <literal>long</literal> to mean signed 32-bit integer, and don’t use - <literal>unsigned long</literal> to mean unsigned 32-bit integer; - <literal>long</literal>’s are 64 bits - long on many platforms. Use <literal>gint32</literal> for signed 32-bit integers and use - <literal>guint32</literal> for unsigned 32-bit integers. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t use <literal>long</literal> to mean signed 64-bit integer - and don’t use <literal>unsigned - long</literal> to mean unsigned 64-bit integer; A <literal>long</literal> is 32 bits long on - many other platforms. Don’t use <literal>long long</literal> or - <literal>unsigned long long</literal>, - either, as not all platforms support them; use <literal>gint64</literal> or - <literal>guint64</literal>, - which will be defined as the appropriate types for 64-bit signed and - unsigned integers. - </simpara> - <simpara> - On LLP64 data model systems (notably 64-bit Windows), <literal>int</literal> and - <literal>long</literal> - are 32 bits while <literal>size_t</literal> and <literal>ptrdiff_t</literal> are 64 bits. - This means that - the following will generate a compiler warning: - </simpara> - -<programlisting> int i; - i = strlen("hello, sailor"); /* Compiler warning */</programlisting> - <simpara> - Normally, you’d just make it a <literal>size_t</literal>. However, many GLib and Wireshark - functions won’t accept a <literal>size_t</literal> on LLP64: - </simpara> -<programlisting> size_t i; - - char greeting[] = "hello, sailor"; - guint byte_after_greet; - - i = strlen(greeting); - byte_after_greet = tvb_get_guint8(tvb, i); /* Compiler warning */</programlisting> - <simpara> - Try to use the appropriate data type when you can. When you can’t, you - will have to cast to a compatible data type, e.g., - </simpara> - -<programlisting> size_t i; - char greeting[] = "hello, sailor"; - guint byte_after_greet; - - i = strlen(greeting); - byte_after_greet = tvb_get_guint8(tvb, (gint) i); /* OK */</programlisting> - - <simpara>or</simpara> - -<programlisting> gint i; - char greeting[] = "hello, sailor"; - guint byte_after_greet; - - i = (gint) strlen(greeting); - byte_after_greet = tvb_get_guint8(tvb, i); /* OK */</programlisting> - - <simpara> - See <ulink url="http://www.unix.org/version2/whatsnew/lp64_wp.html"/> for more - information on the sizes of common types in different data models. - </simpara> - </listitem> - <listitem> - <simpara> - When printing or displaying the values of 64-bit integral data types, - don’t use <literal>%lld</literal>, <literal>%llu</literal>, - <literal>%llx</literal>, or <literal>%llo</literal> — not all platforms - support <literal>%ll</literal> for printing 64-bit integral data types. Instead, for - GLib routines, and routines that use them, such as all the routines in - Wireshark that take format arguments, use <literal>G_GINT64_MODIFIER</literal>, - for example: - </simpara> - -<programlisting> proto_tree_add_text(tree, tvb, offset, 8, - "Sequence Number: %" G_GINT64_MODIFIER "u", - sequence_number);</programlisting> - - </listitem> - <listitem> - <simpara> - When specifying an integral constant that doesn’t fit in 32 bits, don’t - use <literal>LL</literal> at the end of the constant — not all compilers use - <literal>LL</literal> for - that. Instead, put the constant in a call to the <literal>G_GINT64_CONSTANT()</literal> - macro, e.g., - </simpara> - -<programlisting> G_GINT64_CONSTANT(11644473600U)</programlisting> - - <simpara>rather than</simpara> - -<programlisting> 11644473600ULL</programlisting> - - </listitem> - <listitem> - <simpara> - Don’t assume that you can scan through a <literal>va_list</literal> - initialized by <literal>va_start()</literal> - more than once without closing it with <literal>va_end()</literal> and re-initalizing it with - <literal>va_start()</literal>. This applies even if you’re not scanning through it yourself, - but are calling a routine that scans through it, such as <literal>vfprintf()</literal> or - one of the routines in Wireshark that takes a format and a <literal>va_list</literal> as an - argument. You must do - </simpara> - -<programlisting> va_start(ap, format); - call_routine1(xxx, format, ap); - va_end(ap); - va_start(ap, format); - call_routine2(xxx, format, ap); - va_end(ap);</programlisting> - - <simpara>rather than</simpara> - -<programlisting> va_start(ap, format); - call_routine1(xxx, format, ap); - call_routine2(xxx, format, ap); - va_end(ap);</programlisting> - - </listitem> - <listitem> - <simpara> - Don’t use a label without a statement following it. For example, - something such as - </simpara> - -<programlisting> if (...) { - - ... - -done: - }</programlisting> - - <simpara>will not work with all compilers — you have to do</simpara> - -<programlisting> if (...) { - - ... - -done: - ; - }</programlisting> - - <simpara>with some statement, even if it’s a null statement, after the label.</simpara> - </listitem> - <listitem> - <simpara> - Don’t use <literal>bzero()</literal>, <literal>bcopy()</literal>, - or <literal>bcmp()</literal>; instead, use the ANSI C - routines - </simpara> - <itemizedlist> - <listitem> - <simpara> - <literal>memset()</literal> (with zero as the second argument, so that it sets - all the bytes to zero); - </simpara> - </listitem> - <listitem> - <simpara> - <literal>memcpy()</literal> or <literal>memmove()</literal> (note that the first and second - arguments to <literal>memcpy()</literal> are in the reverse order to the - arguments to <literal>bcopy()</literal>; note also that <literal>bcopy()</literal> is typically - guaranteed to work on overlapping memory regions, while - <literal>memcpy()</literal> isn’t, so if you may be copying from one region to a - region that overlaps it, use <literal>memmove()</literal>, - not <literal>memcpy()</literal> — but - <literal>memcpy()</literal> might be faster as a result of not guaranteeing - correct operation on overlapping memory regions); - </simpara> - </listitem> - <listitem> - <simpara> - <literal>memcmp()</literal> (note that <literal>memcmp()</literal> returns 0, 1, or -1, doing - an ordered comparison, rather than just returning 0 for equal - and 1 for not equal, as <literal>bcmp()</literal> does). - </simpara> - </listitem> - </itemizedlist> - <simpara> - Not all platforms necessarily have - <literal>bzero()</literal>/<literal>bcopy()</literal>/<literal>bcmp()</literal>, and - those that do might not declare them in the header file on which they’re - declared on your platform. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t use <literal>index()</literal> or <literal>rindex()</literal>; - instead, use the ANSI C equivalents, - <literal>strchr()</literal> and <literal>strrchr()</literal>. Not all platforms necessarily have - <literal>index()</literal> or <literal>rindex()</literal>, - and those that do might not declare them in the - header file on which they’re declared on your platform. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t fetch data from packets by getting a pointer to data in the packet - with <literal>tvb_get_ptr()</literal>, casting that pointer to a pointer to a structure, - and dereferencing that pointer. That pointer won’t necessarily be aligned - on the proper boundary, which can cause crashes on some platforms (even - if it doesn’t crash on an x86-based PC); furthermore, the data in a - packet is not necessarily in the byte order of the machine on which - Wireshark is running. Use the tvbuff routines to extract individual - items from the packet, or use <literal>proto_tree_add_item()</literal> and let it extract - the items for you. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t use structures that overlay packet data, or into which you copy - packet data; the C programming language does not guarantee any - particular alignment of fields within a structure, and even the - extensions that try to guarantee that are compiler-specific and not - necessarily supported by all compilers used to build Wireshark. Using - bitfields in those structures is even worse; the order of bitfields - is not guaranteed. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t use <literal>ntohs()</literal>, <literal>ntohl()</literal>, - <literal>htons()</literal>, or <literal>htonl()</literal>; the header - files required to define or declare them differ between platforms, and - you might be able to get away with not including the appropriate header - file on your platform but that might not work on other platforms. - Instead, use <literal>g_ntohs()</literal>, <literal>g_ntohl()</literal>, - <literal>g_htons()</literal>, and <literal>g_htonl()</literal>; - those are declared by <literal><glib.h></literal>, and you’ll - need to include that anyway, - as Wireshark header files that all dissectors must include use stuff from - <glib.h>. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t fetch a little-endian value using <literal>tvb_get_ntohs()</literal> or - <literal>tvb_get_ntohl()</literal> and then using <literal>g_ntohs()</literal>, - <literal>g_htons()</literal>, <literal>g_ntohl()</literal>, - or <literal>g_htonl()</literal> on the resulting value — the <literal>g_</literal> - routines in question - convert between network byte order (big-endian) and - <emphasis role="strong">host</emphasis> byte order, - not <emphasis role="strong">little-endian</emphasis> byte order; - not all machines on which Wireshark runs - are little-endian, even though PCs are. Fetch those values using - <literal>tvb_get_letohs()</literal> and <literal>tvb_get_letohl()</literal>. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t put a comma after the last element of an - <literal>enum</literal> — some compilers may - either warn about it (producing extra noise) or refuse to accept it. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t include <literal><unistd.h></literal> without protecting it with - </simpara> - -<programlisting>#ifdef HAVE_UNISTD_H - -... - -#endif</programlisting> - - <simpara> - and, if you’re including it to get routines such - as <literal>close()</literal>, - <literal>read()</literal>, and <literal>write()</literal> - declared, also <literal>#include <io.h></literal> if present: - </simpara> - -<programlisting>#ifdef HAVE_IO_H -#include <io.h> -#endif</programlisting> - - <simpara> - in order to declare the Windows C library routines - <literal>_close()</literal>, <literal>_read()</literal>, - and <literal>_write()</literal>. - </simpara> - <simpara>Your file must include <literal><glib.h></literal> — which - many of the Wireshark header files include, so you might not have - to include it explicitly — in order to get <literal>close()</literal>, - <literal>read()</literal>, <literal>write()</literal>, etc. mapped - to <literal>_close()</literal>, <literal>_read()</literal>, - <literal>_write()</literal>, etc.</simpara> - </listitem> - <listitem> - <simpara> - Do not use <literal>open()</literal>, <literal>rename()</literal>, - <literal>mkdir()</literal>, <literal>stat()</literal>, - <literal>unlink()</literal>, <literal>remove()</literal>, - <literal>fopen()</literal>, <literal>freopen()</literal> directly. - Instead, use <literal>ws_open()</literal>, <literal>ws_rename()</literal>, - <literal>ws_mkdir()</literal>, <literal>ws_stat()</literal>, - <literal>ws_unlink()</literal>, <literal>ws_remove()</literal>, - <literal>ws_fopen()</literal>, - <literal>ws_freopen()</literal>; these wrapper functions change - the path and file name from - UTF8 to UTF16 on Windows allowing the functions to work correctly when the - path or file name contain non-ASCII characters. - </simpara> - </listitem> - <listitem> - <simpara> - When opening a file with <literal>ws_fopen()</literal>, - <literal>ws_freopen()</literal>, or <literal>ws_fdopen()</literal>, if - the file contains ASCII text, use <literal>"r"</literal>, - <literal>"w"</literal>, <literal>"a"</literal>, - and so on as the open mode; - if it contains binary data, use <literal>"rb"</literal>, - <literal>"wb"</literal>, and so on. On - Windows, if a file is opened in a text mode, writing a byte with the - value of octal 12 (newline) to the file causes two bytes, one with the - value octal 15 (carriage return) and one with the value octal 12, to be - written to the file, and causes bytes with the value octal 15 to be - discarded when reading the file (to translate between C’s UNIX-style - lines that end with newline and Windows’ DEC-style lines that end with - carriage return/line feed). - </simpara> - <simpara> - In addition, that also means that when opening or creating a binary - file, you must use <literal>ws_open()</literal> - (with <literal>O_CREAT</literal> and possibly <literal>O_TRUNC</literal> if the - file is to be created if it doesn’t exist), - and OR in the <literal>O_BINARY</literal> flag. - That flag is not present on most, if not all, UNIX systems, so you must - also do - </simpara> - -<programlisting>#ifndef O_BINARY -#define O_BINARY 0 -#endif</programlisting> - - <simpara>to properly define it for UNIX (it’s not necessary on UNIX).</simpara> - </listitem> - <listitem> - <simpara> - Don’t use forward declarations of static arrays without a specified size - in a fashion such as this: - </simpara> - -<programlisting>static const value_string foo_vals[]; - -... - -static const value_string foo_vals[] = { - { 0, "Red" }, - { 1, "Green" }, - { 2, "Blue" }, - { 0, NULL } -};</programlisting> - - <simpara> - as some compilers will reject the first of those statements. Instead, - initialize the array at the point at which it’s first declared, so that - the size is known. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t put a comma after the last tuple of an initializer of an array. - </simpara> - </listitem> - <listitem> - <simpara> - For <literal>#define</literal> names and <literal>enum</literal> member names, - prefix the names with a tag so - as to avoid collisions with other names — this might be more of an issue - on Windows, as it appears to <literal>#define</literal> names such as - <literal>DELETE</literal> and - <literal>OPTIONAL</literal>. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t use the “numbered argument” feature that - many UNIX <literal>printf</literal>’s - implement, e.g., - </simpara> - -<programlisting> g_snprintf(add_string, 30, " - (%1$d) (0x%1$04x)", value);</programlisting> - - <simpara> - as not all UNIX <literal>printf</literal>’s implement it, - and Windows <literal>printf</literal> doesn’t appear - to implement it. Use something like - </simpara> - -<programlisting> g_snprintf(add_string, 30, " - (%d) (0x%04x)", value, value);</programlisting> - - <simpara>instead.</simpara> - </listitem> - <listitem> - <simpara> - Don’t use “variadic macros”, such as - </simpara> - -<programlisting>#define DBG(format, args...) fprintf(stderr, format, ## args)</programlisting> - - <simpara> - as not all C compilers support them. Use macros that take a fixed - number of arguments, such as - </simpara> - -<programlisting>#define DBG0(format) fprintf(stderr, format) -#define DBG1(format, arg1) fprintf(stderr, format, arg1) -#define DBG2(format, arg1, arg2) fprintf(stderr, format, arg1, arg2) - ...</programlisting> - - <simpara>or something such as</simpara> - -<programlisting>#define DBG(args) printf args</programlisting> - - </listitem> - <listitem> - <simpara> - Don’t use - </simpara> - -<programlisting> case N ... M:</programlisting> - - <simpara>as that’s not supported by all compilers.</simpara> - </listitem> - <listitem> - <simpara> - Use <literal>g_snprintf()</literal> instead of <literal>snprintf()</literal>. - <literal>snprintf()</literal> is not available on all platforms, - so it’s a good idea to use the - <literal>g_snprintf()</literal> function declared by <glib.h> instead. - </simpara> - </listitem> - <listitem> - <simpara> - Use <literal>mkstemp()</literal> instead of <literal>tmpnam()</literal>. - <literal>tmpnam()</literal> is insecure and should not be used - any more. Wireshark brings its - own <literal>mkstemp()</literal> implementation for use on platforms - that lack <literal>mkstemp()</literal>. - Note: <literal>mkstemp()</literal> does not accept <literal>NULL</literal> - as a parameter. - </simpara> - </listitem> - <listitem> - <simpara> - The pointer returned by a call to <literal>tvb_get_ptr()</literal> - is not guaranteed to be - aligned on any particular byte boundary; this means that you cannot - safely cast it to any data type other than a pointer to <literal>char</literal>, - <literal>unsigned char</literal>, <literal>guint8</literal>, - or other one-byte data types. You cannot, - for example, safely cast it to a pointer to a structure, and then access - the structure members directly; on some systems, unaligned accesses to - integral data types larger than 1 byte, and floating-point data types, - cause a trap, which will, at best, result in the OS slowly performing an - unaligned access for you, and will, on at least some platforms, cause - the program to be terminated. - </simpara> - </listitem> - <listitem> - <simpara> - Wireshark supports platforms with GLib 2.4[.x]/GTK+ 2.4[.x] or newer. - If a Glib/GTK+ mechanism is available only in Glib/GTK+ versions - newer than 2.4/2.4 then use <literal>#if GTK_CHECK_VERSION(...)</literal> - to conditionally - compile code using that mechanism. - </simpara> - </listitem> - <listitem> - <simpara> - When different code must be used on UN*X and Win32, use a - <literal>#if</literal> or <literal>#ifdef</literal> - that tests <literal>_WIN32</literal>, not <literal>WIN32</literal>. - Try to write code portably whenever - possible, however; note that there are some routines in Wireshark with - platform-dependent implementations and platform-independent APIs, such - as the routines in <filename>epan/filesystem.c</filename>, - allowing the code that calls it to - be written portably without <literal>#ifdefs</literal>. - </simpara> - </listitem> - </itemizedlist> - </section> - <section id="ChCodeStyleStringHandling"> - <title>String handling</title> - <itemizedlist> - <listitem> - <simpara> - Do not use functions such as <literal>strcat()</literal> or <literal>strcpy()</literal>. - A lot of work has been done to remove the existing calls to these functions and - we do not want any new callers of these functions. - </simpara> - <simpara> - Instead, use <literal>g_snprintf()</literal> since that - function will, if used correctly, prevent - buffer overflows for large strings. - </simpara> - </listitem> - <listitem> - <simpara> - When using a buffer to create a string, do not use a buffer stored on the stack. - I.e., do not use a buffer declared as - </simpara> - -<programlisting> char buffer[1024];</programlisting> - - <simpara> - instead, allocate a buffer dynamically using the string-specific - or plain <literal>emem</literal> - routines (see <filename>doc/README.malloc</filename>) such as - </simpara> - -<programlisting> emem_strbuf_t *strbuf; - strbuf = ep_strbuf_new_label(""); - ep_strbuf_append_printf(strbuf, ...</programlisting> - - <simpara>or</simpara> - -<programlisting> char *buffer=NULL; - ... - -#define MAX_BUFFER 1024 - - buffer=ep_alloc(MAX_BUFFER); - buffer[0]='\0'; - ... - g_snprintf(buffer, MAX_BUFFER, ...</programlisting> - - <simpara> - This avoids corrupting the stack in case there is a bug in your code - that accidentally writes beyond the end of the buffer. - </simpara> - </listitem> - <listitem> - <simpara> - If you write a routine that will create and return a pointer to a filled-in - string and if that buffer will not be further processed or appended to after - the routine returns (except being added to the proto tree), - do not preallocate the buffer to fill in and pass as a parameter. Instead, - pass a pointer to a pointer to the function and return a pointer to an - <literal>emem</literal> allocated buffer that will be automatically freed. - (see <filename>doc/README.malloc</filename>) - </simpara> - <simpara>I.e., do not write code such as</simpara> - -<programlisting>static void -foo_to_str(char *string, ... ) { - <fill in string> -} - ... - char buffer[1024]; - ... - foo_to_str(buffer, ... - proto_tree_add_text(... buffer ...</programlisting> - - <simpara>instead, write the code as</simpara> - -<programlisting>static void -foo_to_str(char **buffer, ...) { - -#define MAX_BUFFER x - - *buffer=ep_alloc(MAX_BUFFER); - <fill in *buffer> -} - ... - char *buffer; - ... - foo_to_str(&buffer, ... - proto_tree_add_text(... *buffer ...);</programlisting> - - </listitem> - <listitem> - <simpara> - Allocate buffers by using <literal>ep_alloc()</literal>. - These buffers are very fast and nice. They are all - automatically freed by calling <literal>g_free()</literal> when - the dissection of the current packet ends so you - don’t have to worry about doing so - explicitly in order to not leak memory. - Please read <filename>doc/README.malloc</filename>. - </simpara> - </listitem> - <listitem> - <simpara> - Don’t use non-ASCII characters in source files; not all compiler - environments will be using the same encoding for non-ASCII characters, - and at least one compiler (Microsoft’s Visual C) will, in environments - with double-byte character encodings, such as many Asian environments, - fail if it sees a byte sequence in a source file that doesn’t correspond - to a valid character. This causes source files using either an ISO - 8859/n single-byte character encoding or UTF-8 to fail to compile. Even - if the compiler doesn’t fail, there is no guarantee that the compiler, - or a developer’s text editor, will interpret the characters the way you - intend them to be interpreted. - </simpara> - </listitem> - </itemizedlist> - </section> - <section id="ChCodeStyleRobustness"> - <title>Robustness</title> - <simpara> - Wireshark is not guaranteed to read only network traces that contain - correctly-formed packets. Wireshark is commonly used to track down networking - problems, and the problems might be due to a buggy protocol implementation - sending out bad packets. - </simpara> - <simpara> - Therefore, protocol dissectors not only have to be able to handle - correctly-formed packets without, for example, crashing or looping - infinitely, they also have to be able to handle - <emphasis role="strong">incorrectly</emphasis>-formed - packets without crashing or looping infinitely. - </simpara> - <simpara> - Here are some suggestions for making dissectors more robust in the face - of incorrectly-formed packets: - </simpara> - <itemizedlist> - <listitem> - <simpara> - Do <emphasis role="strong">NOT</emphasis> use <literal>g_assert()</literal> - or <literal>g_assert_not_reached()</literal> in dissectors. - </simpara> - <simpara> - <emphasis role="strong">NO</emphasis> value in a packet’s data - should be considered “wrong” in the sense - that it’s a problem with the dissector if found; if it cannot do - anything else with a particular value from a packet’s data, the - dissector should put into the protocol tree an indication that the - value is invalid, and should return. The “expert” mechanism should be - used for that purpose. - </simpara> - <simpara> - If there is a case where you are checking not for an invalid data item - in the packet, but for a bug in the dissector (for example, an - assumption being made at a particular point in the code about the - internal state of the dissector), use the <literal>DISSECTOR_ASSERT()</literal> - macro for - that purpose; this will put into the protocol tree an indication that - the dissector has a bug in it, and will not crash the application. - </simpara> - </listitem> - <listitem> - <simpara> - If you are allocating a chunk of memory to contain data from a packet, - or to contain information derived from data in a packet, and the size of - the chunk of memory is derived from a size field in the packet, make - sure all the data is present in the packet before allocating the buffer. - Doing so means that: - </simpara> - <orderedlist numeration="arabic"> - <listitem> - <simpara> - Wireshark won’t leak that chunk of memory if an attempt to - fetch data not present in the packet throws an exception. - </simpara> - </listitem> - <listitem> - <simpara> - Wireshark won’t crash trying to allocate an absurdly-large chunk of - memory if the size field has a bogus large value. - </simpara> - </listitem> - </orderedlist> - <simpara> - If you’re fetching into such a chunk of memory a string from the buffer, - and the string has a specified size, you can use <literal>tvb_get_*_string()</literal>, - which will check whether the entire string is present before allocating - a buffer for the string, and will also put a trailing <literal>'\0'</literal> at the end of - the buffer. - </simpara> - <simpara> - If you’re fetching into such a chunk of memory a 2-byte Unicode string - from the buffer, and the string has a specified size, you can use - <literal>tvb_get_ephemeral_faked_unicode()</literal>, which will check whether the entire - string is present before allocating a buffer for the string, and will also - put a trailing <literal>'\0'</literal> at the end of the buffer. The resulting string will be - a sequence of single-byte characters; the only Unicode characters that - will be handled correctly are those in the ASCII range. (Wireshark’s - ability to handle non-ASCII strings is limited; it needs to be - improved). - </simpara> - <simpara> - If you’re fetching into such a chunk of memory a sequence of bytes from - the buffer, and the sequence has a specified size, you can use - <literal>tvb_memdup()</literal>, which will check whether the entire sequence is present - before allocating a buffer for it. - </simpara> - <simpara> - Otherwise, you can check whether the data is present by using - <literal>tvb_ensure_bytes_exist()</literal> or by getting a pointer to the data by using - <literal>tvb_get_ptr()</literal>, although note that there might be problems with using - the pointer from <literal>tvb_get_ptr()</literal> (see the item on this in the - Portability section above, and the next item below). - </simpara> - <simpara> - Note also that you should only fetch string data into a fixed-length - buffer if the code ensures that no more bytes than will fit into the - buffer are fetched (“the protocol ensures” isn’t good enough, as - protocol specifications can’t ensure only packets that conform to the - specification will be transmitted or that only packets for the protocol - in question will be interpreted as packets for that protocol by - Wireshark). If there’s no maximum length of string data to be fetched, - routines such as <literal>tvb_get_*_string()</literal> are safer, as they allocate a buffer - large enough to hold the string. (Note that some variants of this call - require you to free the string once you’re finished with it). - </simpara> - </listitem> - <listitem> - <simpara> - If you have gotten a pointer using <literal>tvb_get_ptr()</literal>, you must make sure - that you do not refer to any data past the length passed as the last - argument to <literal>tvb_get_ptr()</literal>; while the various - <literal>tvb_get...()</literal> routines - perform bounds checking and throw an exception if you refer to data not - available in the tvbuff, direct references through a pointer gotten from - <literal>tvb_get_ptr()</literal> do not do any bounds checking. - </simpara> - </listitem> - <listitem> - <simpara> - If you have a loop that dissects a sequence of items, each of which has - a length field, with the offset in the tvbuff advanced by the length of - the item, then, if the length field is the total length of the item, and - thus can be zero, you <emphasis role="strong">MUST</emphasis> - check for a zero-length item and abort the - loop if you see one. Otherwise, a zero-length item could cause the - dissector to loop infinitely. You should also check that the offset, - after having the length added to it, is greater than the offset before - the length was added to it, if the length field is greater than 24 bits - long, so that, if the length value is <emphasis role="strong">very</emphasis> - large and adding it to the - offset causes an overflow, that overflow is detected. - </simpara> - </listitem> - <listitem> - <simpara> - If you have a - </simpara> - -<programlisting> for (i = {start}; i < {end}; i++)</programlisting> - - <simpara> - loop, make sure that the type of the loop index variable is large enough - to hold the maximum {end} variable plus 1; Otherwise the loop index variable - can overflow before it ever reaches its maximum value. In - particular, be very careful when using <literal>gint8</literal>, - <literal>guint8</literal>, <literal>gint16</literal>, or <literal>guint16</literal> - variables as loop indices; you almost always want to use an - <literal>int</literal>/<literal>gint</literal> - or <literal>unsigned int</literal>/<literal>guint</literal> - as the loop index rather than a shorter type. - </simpara> - </listitem> - <listitem> - <simpara> - If you are fetching a length field from the buffer, corresponding to the - length of a portion of the packet, and subtracting from that length a - value corresponding to the length of, for example, a header in the - packet portion in question, <emphasis role="strong">ALWAYS</emphasis> - check that the value of the length - field is greater than or equal to the length you’re subtracting from it, - and report an error in the packet and stop dissecting the packet if it’s - less than the length you’re subtracting from it. Otherwise, the - resulting length value will be negative, which will either cause errors - in the dissector or routines called by the dissector, or, if the value - is interpreted as an unsigned integer, will cause the value to be - interpreted as a very large positive value. - </simpara> - </listitem> - <listitem> - <simpara> - Any tvbuff offset that is added to as processing is done on a packet - should be stored in a 32-bit variable, such as an <literal>int</literal>; if you store it - in an 8-bit or 16-bit variable, you run the risk of the variable - overflowing. - </simpara> - </listitem> - <listitem> - <simpara> - Use <literal>g_snprintf()</literal> instead of <literal>sprintf()</literal>. - Prevent yourself from using the <literal>sprintf()</literal> - function, as it does not test the - length of the given output buffer and might be writing into unintended memory - areas. This function is one of the main causes of security problems like buffer - exploits and many other bugs that are very hard to find. It’s much better to - use the <literal>g_snprintf()</literal> function declared by - <literal><glib.h></literal> instead. - </simpara> - </listitem> - <listitem> - <simpara> - You should test your dissector against incorrectly-formed packets. - </simpara> - <simpara> - This can be done using the randpkt and editcap utilities that come with the - Wireshark distribution. Testing using randpkt can be done by generating - output at the same layer as your protocol, and forcing Wireshark/TShark - to decode it as your protocol, e.g., if your protocol sits on top of UDP: - </simpara> - -<programlisting>randpkt -c 50000 -t dns randpkt.pcap -tshark -nVr randpkt.pcap -d udp.port==53,<myproto></programlisting> - - <simpara> - Testing using editcap can be done using preexisting capture files and the - “-E” flag, which introduces errors in a capture file. E.g., - </simpara> - -<programlisting>editcap -E 0.03 infile.pcap outfile.pcap -tshark -nVr outfile.pcap</programlisting> - - <simpara> - The script <literal>tools/fuzz-test.sh</literal> is available to help - automate these tests. - </simpara> - </listitem> - </itemizedlist> - </section> - <section id="ChCodeStyleNamingConvention"> - <title>Naming convention</title> - <simpara> - Wireshark uses the underscore_convention rather than the InterCapConvention for - function names, so new code should probably use underscores rather than - intercaps for functions and variable names. This is especially important if you - are writing code that will be called from outside your code. We are just - trying to keep things consistent for other developers. - </simpara> - </section> - <section id="ChCodeStyleWhiteSpaceConvention"> - <title>White space convention</title> - <simpara> - Please avoid using tab expansions different from 8 column widths, as not all - text editors in use by the developers support this. For a detailed - discussion of tabs, spaces, and indentation, - see <ulink url="http://www.jwz.org/doc/tabs-vs-spaces.html"/> - </simpara> - <simpara> - When creating a new file, you are free to choose an indentation logic. - Most of the files in Wireshark tend to use 2-space or 4-space - indentation. You are encouraged to write a short comment on the - indentation logic at the beginning (or end) of this new file. - The tabs-vs-spaces document above provides - examples of Emacs and vi modelines for this purpose. - </simpara> - <simpara> - See <ulink url="http://www.wireshark.org/tools/modelines.html">Editor Modeline Generator</ulink> - for a form which can be used to generate modeline blurbs to your taste that - you can copy and paste into the file that you’re editing. - </simpara> - <simpara> - When editing an existing file, try following the existing indentation - logic and even if it very tempting, never ever use a restyler/reindenter - utility on an existing file. If you run across wildly varying - indentation styles within the same file, it might be helpful to send a - note to wireshark-dev for guidance. - </simpara> - </section> - <section id="ChCodeStyleCompilerWarnings"> - <title>Compiler warnings</title> - <simpara> - You should write code that is free of compiler warnings. Such warnings will - often indicate questionable code and sometimes even real bugs, so it’s best - to avoid warnings at all. - </simpara> - <simpara> - The compiler flags in the Makefiles are set to “treat warnings as errors”, - so your code won’t even compile when warnings occur. - </simpara> - </section> - </section> + <title>Coding styleguides</title> + <para> + The coding styleguides for Wireshark can be found in the "Code style" + section of the file <filename>doc/README.developer</filename>. + </para> + </section> - <section id="ChCodeGLib"> - <title>The GLib library</title> - <para> - Glib is used as a basic platform abstraction library; it’s not related to - GUI things. - </para> - <para> - To quote the Glib documentation: - <blockquote> - <simpara> - <quote> - GLib is a general-purpose utility - library, which provides many useful - data types, macros, type conversions, string utilities, file utilities, - a main loop abstraction, and so on. It works on many UNIX-like platforms, - Windows, OS/2 and BeOS. GLib is released under the GNU Library General - Public License (GNU LGPL). - </quote> - </simpara> - </blockquote> - </para> - <para> - GLib contains lots of useful things for platform independent development. - See <ulink url="http://library.gnome.org/devel/glib/stable/"/> - for details about GLib. - </para> - </section> + <section id="ChCodeGLib"> + <title>The GLib library</title> + <para> + Glib is used as a basic platform abstraction library, it's not related to + GUI things. + </para> + <para> + To quote the Glib documentation: <quote>GLib is a general-purpose utility + library, which provides many useful + data types, macros, type conversions, string utilities, file utilities, + a main loop abstraction, and so on. It works on many UNIX-like platforms, + Windows, OS/2 and BeOS. GLib is released under the GNU Library General + Public License (GNU LGPL).</quote> + </para> + <para> + GLib contains lot's of useful things for platform independent development. + See <ulink url="http://developer.gnome.org/doc/API/2.0/glib/index.html"/> + for details about GLib. + </para> + </section> </chapter> <!-- End of WSDG Chapter Build Introduction --> diff --git a/docbook/wsdg_src/WSDG_preface.xml b/docbook/wsdg_src/WSDG_preface.xml index bd68748339..7c8cded6d8 100644 --- a/docbook/wsdg_src/WSDG_preface.xml +++ b/docbook/wsdg_src/WSDG_preface.xml @@ -5,22 +5,22 @@ <section id="PreForeword"> <title>Foreword</title> <para> - This book tries to give you a guide to start your own experiments into + This book tries to give you a guide to start your own experiments into the wonderful world of Wireshark development. </para> <para> - Developers who are new to Wireshark often have a hard time getting + Developers who are new to Wireshark often have a hard time getting their development environment up and running. This is especially true for Win32 developers, as a lot of the tools and methods used when building Wireshark are much more common in the UNIX world than on Win32. </para> <para> - The first part of this book will describe how to set up the environment + The first part of this book will describe how to set up the environment needed to develop Wireshark. </para> <para> - The second part of this book will describe how to change the Wireshark + The second part of this book will describe how to change the Wireshark source code. </para> <para> @@ -31,18 +31,18 @@ <section id="PreAudience"> <title>Who should read this document?</title> <para> - The intended audience of this book is anyone going into the development of + The intended audience of this book is anyone going into the development of Wireshark. </para> <para> - This book is not intended to explain the usage of Wireshark in general. - Please refer the - <ulink url="&WiresharkUsersGuidePage;">Wireshark User's Guide</ulink> + This book is not intended to explain the usage of Wireshark in general. + Please refer the + <ulink url="&WiresharkUsersGuidePage;">Wireshark User's Guide</ulink> about Wireshark usage. </para> <para> - By reading this book, you will learn how to develop Wireshark. It will - hopefully guide you around some common problems that frequently appear for + By reading this book, you will learn how to develop Wireshark. It will + hopefully guide you around some common problems that frequently appear for new (and sometimes even advanced) developers of Wireshark. </para> </section> @@ -50,7 +50,7 @@ <section id="PreAck"> <title>Acknowledgements</title> <para> - The authors would like to thank the whole Wireshark team for their + The authors would like to thank the whole Wireshark team for their assistance. In particular, the authors would like to thank: <itemizedlist> <listitem> @@ -67,7 +67,7 @@ </itemizedlist> </para> <para> - The authors would also like to thank the following people for their + The authors would also like to thank the following people for their helpful feedback on this document: <itemizedlist> <listitem> @@ -75,7 +75,7 @@ </para> </listitem> </itemizedlist> - And of course a big thank you to the many, many contributors of the + And of course a big thank you to the many, many contributors of the Wireshark development community! </para> </section> @@ -83,61 +83,37 @@ <section id="PreAbout"> <title>About this document</title> <para> - This book was developed by + This book was developed by <ulink url="mailto:&AuthorEmail;">Ulf Lamping</ulink>. </para> <para> It is written in DocBook/XML. </para> <para> - Portions of this book were originally contained in - the file <filename>doc/README.developer</filename> whose contributors - were listed as: - </para> - <literallayout class="monospaced"> - James Coe <jammer[AT]cin.net> - Gilbert Ramirez <gram[AT]alumni.rice.edu> - Jeff Foster <jfoste[AT]woodward.com> - Olivier Abad <oabad[AT]cybercable.fr> - Laurent Deniel <laurent.deniel[AT]free.fr> - Gerald Combs <gerald[AT]wireshark.org> - Guy Harris <guy[AT]alum.mit.edu> - Ulf Lamping <ulf.lamping[AT]web.de> - </literallayout> - <para> - <filename>README.developer</filename> was converted to DocBook/XML - and integrated into this book by - <literal>Bill Meier <wmeier[AT]newsguy.com></literal> - </para> - </section> - - <section id="PreNotation"> - <title>Notation</title> - <para> - You will find some specially marked parts in this book: + You will find some specially marked parts in this book: </para> <warning><title>This is a warning!</title> - <para> - You should pay attention to a warning, as otherwise data loss might occur. - </para> + <para> + You should pay attention to a warning, as otherwise data loss might occur. + </para> </warning> <note><title>This is a note!</title> - <para> - A note will point you to common mistakes and things that might not be - obvious. - </para> + <para> + A note will point you to common mistakes and things that might not be + obvious. + </para> </note> <tip><title>This is a tip!</title> - <para> - Tips will be helpful for your everyday work developing Wireshark. - </para> + <para> + Tips will be helpful for your everyday work developing Wireshark. + </para> </tip> </section> - + <section id="PreDownload"> <title>Where to get the latest copy of this document?</title> <para> - The latest copy of this documentation can always be found at: + The latest copy of this documentation can always be found at: <ulink url="&WiresharkDevsGuidePage;">&WiresharkDevsGuidePage;</ulink> in PDF (A4 and US letter), HTML (single and chunked) and CHM format. </para> @@ -146,7 +122,7 @@ <section id="PreFeedback"> <title>Providing feedback about this document</title> <para> - Should you have any feedback about this document, please send it + Should you have any feedback about this document, please send it to the authors through <ulink url="mailto:&WiresharkDevMailList;">&WiresharkDevMailList;</ulink>. </para> </section> |