path: root/epan/proto.c
Age | Commit message | Author | Files | Lines
2023-03-03 | epan: Do not try to add a bits item with negative bit length | John Thacker | 1 | -2/+9
A negative number of bits in a bit item isn't allowed. Treat it as a very large number (i.e., as unsigned), and throw a ReportedBoundsError. This was already happening in most cases, but not in the edge case of a number of bits between -1 and -7 (which was being rounded up to 0 octets and passed our length checks.) Fix #18877
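The edge case above can be reproduced in isolation. The following is a minimal sketch, not the code in proto.c: the function names are hypothetical, and it assumes the octet count was derived by adding 7 and shifting right by 3, which matches the described rounding of -1 to -7 up to 0 octets.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Octets needed to hold no_of_bits, using signed arithmetic.
 * For counts in [-7, -1] the sum no_of_bits + 7 lies in [0, 6],
 * so the result is 0 and a naive length check would pass. */
static int octets_signed(int no_of_bits)
{
    /* Keep the sum non-negative and free of overflow. */
    assert(no_of_bits >= -7 && no_of_bits <= INT_MAX - 7);
    return (no_of_bits + 7) >> 3;
}

/* Hypothetical check: reinterpret the count as unsigned before comparing it
 * with the bits available, so a negative count becomes a very large value
 * and is rejected. */
static bool bit_length_ok(int no_of_bits, uint32_t bits_available)
{
    return (uint32_t)no_of_bits <= bits_available;
}
```

With these helpers, `octets_signed(-1)` yields 0 while `bit_length_ok(-1, 64)` is false, which is the behaviour the commit message asks for.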
2023-01-16 | epan: FT_FRAMENUM strings are special | John Thacker | 1 | -1/+1
When creating the text of a custom column, don't call hf_try_val_to_str, etc. on FT_FRAMENUM fields that have hfinfo->strings. It refers to the ft_framenum_type. Prevents crashing on custom columns of FT_FRAMENUM fields of a type different than FT_FRAMENUM_NONE.
2023-01-13 | proto(.c): Fix Argument with 'nonnull' attribute passed null | Alexis La Goutte | 1 | -21/+22
2023-01-12 | CMake: Reverse debug macros | João Valverde | 1 | -3/+0
Originally WS_DISABLE_DEBUG was chosen to be similar to G_DISABLE_ASSERT and NDEBUG. However generator expressions are essential for modern CMake but the syntax is weird and having to use negations makes it ten-fold worse. Remove the negation. Instead of changing the CMake variable reverse the macro definition for WS_DISABLE_DEBUG. The $<CONFIG:cgs> generator expression with multiple config arguments requires CMake >= 3.19 so we can't use that yet for a further syntactical simplification.
2023-01-03 | proto: Fix validity test for proto names | João Valverde | 1 | -6/+6
We want at least one letter. Because protocol names can contain dots and hyphens testing for !isdigit is not enough to make it dissimilar to decimal numeric expressions.
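A minimal sketch of the "at least one letter" rule, assuming ASCII names; `name_has_letter` is a hypothetical helper, not the function used in proto.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the name contains at least one alphabetic character.
 * A name such as "1.2-3" has dots and hyphens but no letter, so it could
 * be mistaken for a decimal numeric expression and is rejected. */
static bool name_has_letter(const char *name)
{
    assert(name != NULL);
    for (const char *p = name; *p != '\0'; p++) {
        if (isalpha((unsigned char)*p))
            return true;
    }
    return false;
}
```

Testing only the first character with `!isdigit` would accept "-1" or ".5", which is why the whole name is scanned here.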
2022-12-27 | WIP: Check types for _add_bits_ functions, and ensure no mask | Martin Mathieson | 1 | -0/+2
2022-12-26 | epan: Allow FT_IPv4, FT_IPv6 custom columns to be resolved or not. | John Thacker | 1 | -6/+9
Similar to commit dbb9fe2a37c7fcf6281, proto_item_fill_display_label now uses address_to_display for FT_IPv4, FT_IPv6, and FT_FCWWN, the other three address types that double as field types and which have optional name resolution. Add these to the list of types that, if present in a custom column, has the GUI enable the checkbox to switch between "resolved" (names) and not (values). This allows adding custom columns with these field types with both resolved and non resolved text. Note that the appropriate Name Resolution preference settings must be enabled for the type as well.
2022-12-17 | epan: Allow FT_ETHER custom columns to be resolved or not | John Thacker | 1 | -2/+3
Have proto_item_fill_display_label (which is used for custom columns resolved type and packet diagrams) use address_to_display for FT_ETHER. This is resolved when name resolution for MAC Addresses is enabled. Add FT_ETHER to the list of types that, if present in a custom column, has the GUI enable the checkbox to switch between "resolved" and "unresolved" text. This allows FT_ETHER custom columns to be displayed as either resolved addresses or unresolved. (Note that to be displayed as resolved, the column resolved option must be checked and the name resolution preference enabled.) Fix #18665
2022-12-16 | proto: Custom column concatenation and truncation | John Thacker | 1 | -171/+57
Fix some issues regarding custom columns near the maximum size: Fix a case where, near the column limit, a comma was not added to separate a value but the first character of the next field was, resulting in an invalid field. Create the "result" and the "expr" (resolved and unresolved) separately to address an issue where, for multifield custom columns of different types, the "result" might be truncated without "expr" necessarily being so. This created problems when concatenating the end of the result to the expr for certain types later. Avoid passing a NULL to snprintf for integer columns of BASE_NONE of unexpected value. Indicate when the custom column has been truncated, since after commit e449b560c02d363603224a7 this string value is no longer used to create the filter and is for display only. Also use the label truncation function so that truncation is on UTF-8 boundaries. Fix #17618
2022-12-03 | wmem: Remove strbuf max size parameter | João Valverde | 1 | -1/+1
This parameter was introduced as a safeguard for bugs that generate an unbounded string but its utility for that purpose is doubtful and the way it is being used creates problems with invalid truncation of UTF-8 strings. Rename wmem_strbuf_sized_new() with a better name.
2022-11-18 | Fix some cppcheck issues | Martin Mathieson | 1 | -1/+1
2022-11-17 | CMake: Move clang warnings | João Valverde | 1 | -10/+10
Move clang warnings to normal set. Let the CMake compatibility check control the warning. Fix or work-around -Wunreachable warnings in the code.
2022-11-08 | tshark: update man to explain why some fields are skipped in elastic-mapping. | Dario Lombardo | 1 | -0/+1
2022-11-02 | epan: Simplify construct_match_selected_string | John Thacker | 1 | -91/+19
Since fvalue_to_string_repr does take the field base as a parameter and that affects the representation, an existing comment is no longer true, and we can get rid of a large amount of duplicative special handling for integer-based types.
2022-11-02 | epan: Properly generate filter expressions for custom columns | John Thacker | 1 | -0/+102
Properly generate filter expressions for custom columns by using proto_construct_match_selected_string on each value and then joining them together later instead of trying to split the column expression value. This ensures that escaping is done properly for display filter strings, that commas internal to field values are not confused with commas between occurrences, that for multifield columns we can distinguish which field each value matches, etc. It's not entirely clear whether AND or OR logic is appropriate for multiple occurrences; currently OR is used. Bump glib requirement to 2.54 for g_ptr_array_find_with_equal_func (this doesn't drop support for any major distribution that already meets our other library requirements, like Qt.) Fix #18001.
2022-10-26 | ftypes: Do not sanitize strings for UTF-8 errors | João Valverde | 1 | -1/+2
The ftype itself is encoding agnostic. In the case of literal display filter strings it is possible and legal to contain invalid UTF-8. Maybe it shouldn't be but that requires a user-friendly diagnostic message, not silently sanitizing the string as is done currently (only a debug message is printed in that case). Do the debug checks in proto_tree_set_string() instead. That still detects dissector code that might need fixing, which was the purpose for this check. Improve documentation and add admonition for proto_tree_add_string(). Ping #18521.
2022-10-26 | Rename ws_label_strcat() to ws_label_strcpy() | João Valverde | 1 | -13/+13
The semantics of ws_label_strcat() are closer to g_strlcpy() so rename the function to reflect that.
2022-10-26 | S7Comm: Fix invalid UTF-8 value string chars | João Valverde | 1 | -2/+1
Fixes #18533.
2022-10-20 | epan/proto: Replace format_text() | João Valverde | 1 | -57/+38
The proto.h APIs expect valid UTF-8 so replace uses of format_text() with a label copy function that just does formatting and does not check for encoding errors. Avoid multiple levels of temporary string allocations. Make sure the copy does not truncate a multibyte character and produce invalid strings. Add debug checks for UTF-8 encoding errors instead. We escape C0 and C1 control codes (because control codes) and ASCII whitespace (and bell). Overall the goal is to be more efficient and optimized and help detect misuse of APIs by passing invalid UTF-8. Add a unit test for ws_label_strcat.
2022-10-19 | epan: centralize SDNV processing along other similar varint types | Brian Sipos | 1 | -6/+6
This avoids having general-purpose decoding happening in non-DLL-exported functions defined in a dissector for #18478, and removes unused functions and avoids duplicate decoding. This also removes unnecessary early exit conditions for #18145. Unit test cases for varint decoding are added to verify this.
2022-09-28 | epan: Add BASE_STR_WSP and use it | João Valverde | 1 | -15/+23
This field display type formats the representation string of FT_STRING by replacing every whitespace character with ' '. Instead of "A line end\n" it will output "A line end ". This allows cleaner code using proto_tree_add_item() and avoids the problematic pattern proto_tree_add_string(..., tvb_format_text_wsp(...)); because we only want to affect the way the string value is displayed, not the actual field value stored.
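The display-only substitution can be illustrated as follows. This is a sketch under stated assumptions: `display_copy_wsp` is a hypothetical helper, it handles only ASCII whitespace as classified by `isspace()` in the C locale, and it leaves the source string (the stored field value) untouched.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst for display, replacing each whitespace character
 * (newline, tab, and so on) with a plain space. The copy is truncated
 * to fit dst_size, including the terminating NUL. */
static void display_copy_wsp(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t i = 0;
    for (; src[i] != '\0' && i + 1 < dst_size; i++)
        dst[i] = isspace((unsigned char)src[i]) ? ' ' : src[i];
    dst[i] = '\0';
}
```

Calling it on "A line end\n" produces "A line end " in the destination buffer while the original string still ends in a newline.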
2022-09-27 | Add some UTF-8 debug checks with a compile time flag | João Valverde | 1 | -11/+1
Some older dissectors that predate Unicode and parse text protocols are prone to generate invalid UTF-8 strings. This is a bug and can have safety implications. For example passing invalid UTF-8 to proto_tree_add_string() is a common bug. There are safeguards in format_text() but this should not be relied on as a general solution to the problem. For one, as the name implies, it is only used with representation of a field value, which is not the same as the value itself of an FT_STRING field. Issue #18317 shows another reason why. For now this compile flag only enables extra checks for string ftypes, which covers a subset of proto.h APIs including proto_tree_append_string(). Later it should be extended to other interfaces. This is also not expected to be disabled for release builds because there are still many dissectors that do not correctly handle strings. More work is needed to 1) identify them and 2) fix them. Ping #18317
2022-09-23 | epan: Prevent crash when asserting on unvalidated UTF-8 strings | John Thacker | 1 | -3/+6
If UTF-8 validation fails, set the fvalue to a sanitized value so that calls later to retrieve it don't dereference a null pointer and crash. We could, especially for a release, disable the assertion and just sanitize bad strings. Related to #18363
2022-09-21 | proto: Validate add_string values as UTF-8 | John Thacker | 1 | -0/+8
When a dissector directly adds a string value through proto_tree_add_string[_format_value], validate that it is UTF-8 so that only valid UTF-8 strings are used internally, and written to output (whether text, JSON, or XML.) (We were treating it as a UTF-8 string anyway, but not validating it.) If the string passed in is not UTF-8, that's a dissector bug. Dissectors that use API functions like tvb_get_string_enc will always produce valid UTF-8, but some do their own processing. Fix #18317
2022-09-11 | proto: Ensure that representation strings are printable, valid UTF-8 | John Thacker | 1 | -17/+57
The proto_item_XXX_text() routines and proto_tree_add_XXX_format[_value] functions allow dissectors to alter the representation string for a protocol tree item with data that may come from arbitrary packet data. These values are displayed by tshark or wireshark, so they should be made into printable, valid UTF-8. This means that dissectors no longer need to call format_text before using those functions (though, if they want to produce some other kind of printable string, such as with format_text_wsp, they still can.) Also, mark when appending and prepending text truncates a string that was not previously truncated (except for a small number of cases where it is difficult to determine if it was truncated before.) Part of #18317
2022-09-08 | proto: Fix truncation of UTF-8 strings. | John Thacker | 1 | -1/+5
It is correct to pass in the memory address immediately past the end of our buffer, as g_utf8_prev_char() does not dereference it until after decrementing it once, and we want to find the final UTF-8 character start. Starting one byte earlier truncates the string more than necessary. This effectively reverts 4b6224a67326dc72e428e8f6606b8ae10059c0bd which noted that Coverity flagged this as a memory access error, although it is not. This is possibly because it was written as &label_str[ITEM_LABEL_LENGTH]. All versions of the ISO C standard starting with C99 have indicated (6.5.3.2) that in such a case "neither the & operator nor the unary * that is implied by the [] is evaluated and the result is as if the & operator were removed and the [] operator were changed to a + operator" and (6.5.6) that referring to the memory address one past the last element of an array object "shall not produce an overflow" and is not undefined (so long as it is not dereferenced.) However, Coverity may not have been aware of this, so rewrite the expression using the + operator in the hopes of avoiding false positive Coverity errors.
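The argument about the one-past-the-end pointer can be checked with a small stand-alone sketch. `utf8_prev_char` below is a simplified stand-in written for this example (it is not GLib's g_utf8_prev_char()), and `truncate_on_char_boundary` is a hypothetical helper; the buffer is assumed to be completely full with no terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Step back from p to the start of the previous UTF-8 character.
 * p is decremented before it is dereferenced, so p may legally point
 * one past the last element of the array. */
static const char *utf8_prev_char(const char *start, const char *p)
{
    assert(start != NULL && p > start);
    do {
        p--;
    } while (p > start && ((unsigned char)*p & 0xC0) == 0x80);
    return p;
}

/* Terminate a full buffer at the start of its final character, so that
 * no multibyte sequence is cut in half. Written with + rather than
 * &buf[buf_size], as the commit message suggests. */
static void truncate_on_char_boundary(char *buf, size_t buf_size)
{
    const char *last = utf8_prev_char(buf, buf + buf_size);
    buf[last - buf] = '\0';
}
```

Starting the search at `buf + buf_size - 1` instead would skip a complete final character as well, which is the over-truncation the commit describes.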
2022-08-16 | Increase number of preallocated fields. | Anders Broman | 1 | -1/+1
2022-08-08 | Streamline hfinfo retrieval in proto_tree_add_* functions | Jaap Keuter | 1 | -22/+22
Instead of a function call, instantiate the PROTO_REGISTRAR_GET_NTH macro directly, which contains the subsequent DISSECTOR_ASSERT macro to test the result anyway.
2022-08-02 | epan: Refactor floating point display types | João Valverde | 1 | -60/+80
Remove the redundant BASE_FLOAT field display type. The name BASE_FLOAT is meaningless and the value aliased to BASE_NONE. Require BASE_NONE instead of BASE_FLOAT (corresponding to the printf() %g format). Add new float display types using BASE_DEC, BASE_HEX and BASE_EXP corresponding to %f, %a and %e respectively. Add support for BASE_CUSTOM with floats.
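The mapping from display type to printf() conversion can be sketched as below. The enum and function names are hypothetical placeholders chosen for this example; only the four conversions (%g, %f, %a, %e) come from the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for BASE_NONE, BASE_DEC, BASE_HEX and BASE_EXP. */
enum float_display { DISPLAY_NONE, DISPLAY_DEC, DISPLAY_HEX, DISPLAY_EXP };

/* Format v into buf using the conversion associated with the display type. */
static void format_float(char *buf, size_t size, enum float_display d, double v)
{
    int n = -1;
    assert(buf != NULL && size > 0);
    switch (d) {
    case DISPLAY_NONE: n = snprintf(buf, size, "%g", v); break;
    case DISPLAY_DEC:  n = snprintf(buf, size, "%f", v); break;
    case DISPLAY_HEX:  n = snprintf(buf, size, "%a", v); break;
    case DISPLAY_EXP:  n = snprintf(buf, size, "%e", v); break;
    }
    assert(n >= 0);
    (void)n;
}
```

For the value 1.5 this gives "1.5" for %g, "1.500000" for %f and "1.500000e+00" for %e; the exact %a text can vary between C libraries.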
2022-07-15 | proto: fix proto_tree_add_bitmask_list_ret_uint64 to always return a value. | Guy Harris | 1 | -1/+3
A "proto_tree_add..._ret_..." routine *must* return the value through the pointer, even if no protocol tree is being built, as there's no guarantee that a protocol tree will be built under all circumstances (for example, if the dissection is only being done to generate the column values, no column is a custom column, there are no coloring rules, etc., so that none of the named field values are of interest, and the protocol tree isn't going to be displayed, no protocol tree will be built). Fixes #18203.
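The ordering that this rule implies (decode and store the value first, then decide whether to build tree items) might look like the following sketch. `struct tree` and `add_uint_ret` are hypothetical stand-ins, not the proto.h API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a protocol tree. */
struct tree { int item_count; };

/* Decode a 16-bit big-endian value, report it through retval, and add an
 * item only when a tree is actually being built. */
static void add_uint_ret(struct tree *tree, const uint8_t *data, uint64_t *retval)
{
    assert(data != NULL);
    uint64_t value = ((uint64_t)data[0] << 8) | data[1];

    if (retval != NULL)
        *retval = value;   /* always report the value ... */

    if (tree == NULL)
        return;            /* ... even when no tree is being built */

    tree->item_count++;
}
```

Returning early on a NULL tree before the assignment is the kind of bug the commit fixes: callers that use the returned value to drive further dissection would then read an uninitialized variable.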
2022-07-14 | pfcp: change to utilize proto_tree_add_bitmask_list | Joakim Karlsson | 1 | -0/+14
2022-07-12 | epan: ws_debug log for heuristic that claims frame (len != 0) | Chuck Craft | 1 | -0/+23
It's possible for a dissector to claim a frame without adding to the tree or being added to frame.protocols (see !6669) Log a debug message showing the pinfo layers and the dissector that claimed the tvb (frame/packet).
2022-07-08 | epan: Copy multifield custom column undecoded values correctly | John Thacker | 1 | -3/+13
When writing a custom column, some field types can't have a resolved value, and just copy the label from the expression to the value. Only copy information from the most recent field when doing so, so that with multifield custom columns the entire unresolved value doesn't get overwritten with the resolved value (if some fields have resolved values and some don't.) This also reduces copying from O(N^2) to O(N). Fixes the display "unresolved" value for multifield custom columns that are a mix of field types.
2022-07-05 | epan: Fix return value of proto_strlcpy when not enough room | John Thacker | 1 | -2/+6
proto_strlcpy in normal situations returns the number of bytes copied (because the return value of g_strlcpy is strlen of the source buffer). It can copy no more than dest_size - 1, because dest_size is the size of the buffer, including the null terminator. (https://docs.gtk.org/glib/func.strlcpy.html) Returning dest_size can cause offsets to get off by one and reach the end of the buffer, and can cause subsequent calls to have buffer overflows. (See #16905 for an example in the comments.)
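A sketch of the corrected return-value contract, written without GLib so it is self-contained. `copy_bounded` is a hypothetical function for this example and is not the actual proto_strlcpy implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dest, whose capacity dest_size includes the terminating
 * NUL, and return the number of bytes actually written, excluding the
 * terminator. The result is therefore at most dest_size - 1. */
static size_t copy_bounded(char *dest, const char *src, size_t dest_size)
{
    assert(dest != NULL && src != NULL);
    if (dest_size == 0)
        return 0;
    size_t n = strlen(src);
    if (n > dest_size - 1)
        n = dest_size - 1;   /* never report more than was copied */
    memcpy(dest, src, n);
    dest[n] = '\0';
    return n;
}
```

A caller that advances an offset by the return value then always lands on the terminator, never one byte past the end of the buffer.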
2022-07-05 | Properly free range strings, ext strings, custom base | David Perry | 1 | -8/+30
2022-07-02 | dfilter: Remove unparsed syntax type and RHS literal bias | João Valverde | 1 | -0/+26
This removes unparsed name resolution during the semantic check because it feels like a hack to work around limitations in the language syntax, that should be solved at the lexical level instead. We were interpreting unparsed differently on the LHS and RHS. Now an unparsed value is always a field if it matches a registered field name (this matches the implementation in 3.6 and before). This requires tightening a bit the allowed filter names for protocols to avoid some common and potentially weird conflicting cases. Incidentally this extends set grammar to accept all entities. That is experimental and may be reverted in the future.
2022-06-20 | ftypes: Make accessor functions type safe | João Valverde | 1 | -35/+36
2022-05-23 | dfilter: Fix protocol slices with negative indexes | João Valverde | 1 | -5/+10
Field infos have a length property that was not stored with the field value so when using a negative index the end was computed from the captured length of the frame tvbuff, leading to incorrect results. The documentation in wireshark-filter(5) describes how this was supposed to work but as far as I can tell it never worked properly. We now store the length and use that (when it is different from -1) to locate the end of the protocol data in the tvbuff. An extra wrinkle is that sometimes the length is set after the field value is created. This is the most common case as the majority of protocols have a variable length and dissection generally proceeds with a TVB subset from the current layer (with offset zero) through all remaining layers to the end of the captured length. For that reason we must use an expedient to allow changing the protocol length of an existing protocol fvalue, whenever proto_item_set_len() is called. Fixes #17772.
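To see why the stored protocol length matters, consider a sketch that resolves a slice index against a given length. `resolve_slice_index` is a hypothetical helper, not the dfilter code; it only illustrates that a negative index must count back from the end of the protocol data rather than from the end of the captured frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Resolve a slice index against the protocol length. Negative indexes
 * count back from the end of the protocol data. Returns false when the
 * index falls outside the data. */
static bool resolve_slice_index(long index, size_t proto_len, size_t *offset)
{
    assert(offset != NULL);
    if (index >= 0) {
        if ((size_t)index >= proto_len)
            return false;
        *offset = (size_t)index;
        return true;
    }
    /* Distance from the end; computed so that negating cannot overflow. */
    size_t back = (size_t)(-(index + 1)) + 1;
    if (back > proto_len)
        return false;
    *offset = proto_len - back;
    return true;
}
```

For a 20-byte protocol header inside a 60-byte frame, index -2 resolves to offset 18 when the protocol length is used, but would resolve to 58 if the frame length were used instead.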
2022-05-15 | proto: Handle BASE_SPECIAL_VALS in add_bitmask_ title | John Thacker | 1 | -27/+51
Respect BASE_SPECIAL_VALS when adding to the title item of an item added with the proto_tree_add_bitmask* functions. Note that the documentation for the BMT_NO_INT flag has always said that "only boolean flags are added to the title" and that no integer based items are added, but the actual behavior has been to add integer items with custom format functions and value strings.
2022-05-14 | proto: Fix display of BASE_UNIT_STRING for 64 bit fields in bitmask | John Thacker | 1 | -4/+12
When integer fields are displayed in the bitmask header item in proto_tree_add_bitmask_tree and hf->strings is set, only the string from the value_string is used, not the integer value, to save space. However, that means that BASE_UNIT_STRING fields have to be treated differently from all the other fields with hf->strings set. If not, then only the units are displayed instead of the number with the units. Fields based on 32 bit integers were already being handled correctly. Use that same logic for fields based on 64 bit integers. (See commit 24d991dab493d249167e91 for something similar.)
2022-05-14 | proto: Fix reversed test for signed ints with unit strings | John Thacker | 1 | -1/+1
In proto_item_add_bitmask_tree, on the signed integer path, the test for if the display uses a unit string is clearly reversed, calling it only if BASE_UNIT_STRING is unset. Use the correct test from the unsigned integer path.
2022-05-13 | proto: Add support for BASE_SPECIAL_VALS to fields with bitmasks | John Thacker | 1 | -10/+40
Add support for BASE_SPECIAL_VALS to fill_label_bitfield[64], for fields with a nonzero bitmask, using the same logic as fill_label_number[64]. There's at least one dissector (packet-ipmi-se.c) that was trying to use this already, but silently had no effect.
2022-05-12 | dfilter: Add support for universal quantifiers | João Valverde | 1 | -0/+2
Adds the keywords "any" and "all" to implement the quantification to any existing relational operator.
Filter:
  all tcp.port in {100, 2000..3000}
Syntax tree:
  0 ALL TEST_IN:
    1 FIELD(tcp.port)
    1 SET(#2):
      2 FVALUE(100 <FT_UINT16>)
      2 FVALUE(2000 <FT_UINT16>) .. FVALUE(3000 <FT_UINT16>)
Instructions:
  00000 READ_TREE tcp.port -> reg#0
  00001 IF_FALSE_GOTO 5
  00002 ALL_EQ reg#0 === 100 <FT_UINT16>
  00003 IF_TRUE_GOTO 5
  00004 ALL_IN_RANGE reg#0 in { 2000 <FT_UINT16> .. 3000 <FT_UINT16> }
  00005 RETURN
2022-04-26 | epan: Add more bookkeeping for layers | João Valverde | 1 | -0/+3
Packet info already contains the notion of layer depth for the current protocol, among all the protocols in the frame. This adds an extra layer number for the protocols that are the same as the current one. Obviously this will only go above one if the protocol is repeated in the stack, such as with IP tunneling. Adds extra logic to track numbers for each protocol in the frame and update them when calling a dissector. The total layer number and protocol layer number are stored in the field info structure so they can be used after dissection, namely by display filters.
2022-04-20 | epan: Add comments about _get_parent, _set_len and faked items | John Thacker | 1 | -0/+24
If we're faking items, then proto_[item|tree]_get_parent[_nth] return the parent of the faked item, which may not be what we want. We have no way of knowing if the logical item meant was the faked item itself or one of its children that share the same proto_item when faked. Thus we don't know if we should return the proto_item itself or its parent when called on a possibly faked item. Most of the time we will be adding new items to what we return here, which means not faking items that could be faked (since we might be returning the root node, which doesn't have a field_info), hurting performance (see #8069). It can also have some unusual effects on the protocol hierarchy stats, particularly if we change things so that non-visible items can change their length, which has a similar issue. (#17877)
2022-04-14 | epan: add ENC_TIME_USECS timestamp encoding | Chuck Craft | 1 | -0/+20
Needed to format timestamp in #18038 - packet-cql.c Mirrors changes made in !1924 - Add ENC_TIME_NSECS timestamp encoding Documentation in README.dissector, proto.c, proto.h - could use refresh in a different merge request.
2022-03-29 | dfilter: Refactor macro tree references | João Valverde | 1 | -26/+0
This replaces the current macro reference system with a completely different implementation. Instead of a macro a reference is a syntax element. A reference is a constant that can be filled in the dfilter code after compilation from an existing protocol tree. It is best understood as a field value that can be read from a fixed tree that is not the frame being filtered. Usually this fixed tree is the currently selected frame when the filter is applied. This allows comparing fields in the filtered frame with fields in the selected frame. Because the field reference syntax uses the same sigil notation as a macro we have to use a heuristic to distinguish them: if the name has a dot it is a field reference, otherwise it is a macro name. The reference is syntactically validated at compile time. There are two main advantages to this implementation (and a couple of minor ones): The protocol tree for each selected frame is only walked if we have a display filter and if the display filter uses references. Also only the actual reference values are copied, instead of loading the entire tree into a hash table (in textual form even). The other advantage is that the reference is tested like a protocol field against all the values in the selected frame (if there is more than one). Currently the reference fields are not "primed" during dissection, so the entire tree is walked to find a particular reference (this is similar to the previous implementation). If the display filter contains a valid reference and the reference is not loaded at the time the filter is run the result is the same as a non existing field for a regular READ_TREE instruction. Fixes #17599.
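The dot heuristic described above is simple enough to state as code. This is a sketch only: `is_field_reference` is a hypothetical name, and the real lexer presumably performs additional validation of the name.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Classify the name that follows the shared sigil: a name containing a
 * dot is treated as a field reference, anything else as a macro name. */
static bool is_field_reference(const char *name)
{
    assert(name != NULL);
    return strchr(name, '.') != NULL;
}
```

Under this rule "ip.src" is a field reference and "my_macro" is a macro name; a consequence is that a macro whose name contains a dot cannot be distinguished from a field reference.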
2022-03-28 | dfilter: Add ftypes pseudofields | João Valverde | 1 | -0/+4
This adds a _ws.ftypes namespace with protocol fields with all the existing field types. Currently this is only useful to debug the display filter compiler, without having to find a real protocol field with the desired type. Later it may find other uses.
2022-03-25 | proto: Fix comment on NTP Era 1 Epoch | John Thacker | 1 | -1/+1
NTP Era 1 begins on 7 February 2036, 06:28:16 UTC, exactly when the 64 bit fixed point timestamp rolls over. See RFC 4330/5905 (and the correct comments later in get_time_value). Fix the comment where the constant is defined (the value is already correct, however.)
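The date can be checked with a little arithmetic. The sketch below assumes the standard offset of 2208988800 seconds between the NTP epoch (1 January 1900) and the Unix epoch (1 January 1970); the macro and function names are made up for this example and are not the constants defined in proto.c.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds between 1900-01-01 (NTP epoch) and 1970-01-01 (Unix epoch). */
#define NTP_UNIX_OFFSET_SECS UINT64_C(2208988800)

/* Unix time at which the given NTP era begins. Each era spans 2^32
 * seconds, the range of the 32-bit seconds part of the timestamp. */
static uint64_t ntp_era_start_unix(unsigned era)
{
    assert(era >= 1);   /* Era 0 starts before the Unix epoch. */
    return ((uint64_t)era << 32) - NTP_UNIX_OFFSET_SECS;
}
```

For era 1 this gives 4294967296 - 2208988800 = 2085978496, which is 7 February 2036, 06:28:16 UTC.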
2022-03-14 | elastic: fix mapping with recent es versions. | Dario Lombardo | 1 | -27/+26