path: root/epan/dfilter
Age | Commit message | Author | Files | Lines
2023-01-03 | dfilter: Remove semcheck arithmetic commute argument | João Valverde | 1 | -8/+4
No one is using this so I'd like to explore other options first to handle constants in arithmetic expressions that lack type information. Reverts 3ddb017a88797f520cda45961819c7084a0a5b29.
2023-01-02 | dfilter: Tweak representation for length-1 byte array | João Valverde | 1 | -1/+1
Make dfilter byte representation always use ':' for consistency. Make 1 byte be represented as "XX:" with the colon suffix to make it unambiguous that it is a byte and not another type, like a protocol.
The difference can be seen in the following programs. In the "Before" representation it is not at all obvious that the second "fc" value is a literal bytes value and not the value of the protocol "fc", although it can be inferred from the lack of a READ_TREE instruction. In the "After" representation we know that "fc:" must be bytes and not a protocol.
Note that a leading colon is a syntactical expedient to say "this value with any type is a literal value and not a protocol field." A terminating colon is just a part of the dfilter literal bytes syntax.
Before:
    Filter: fc == :fc

    Syntax tree:
     0 TEST_ANY_EQ:
       1 FIELD(fc <FT_PROTOCOL>)
       1 FVALUE(fc <FT_PROTOCOL>)

    Instructions:
    00000 READ_TREE fc <FT_PROTOCOL> -> reg#0
    00001 IF_FALSE_GOTO 3
    00002 ANY_EQ reg#0 == fc <FT_PROTOCOL>
After:
    Filter: fc == :fc

    Syntax tree:
     0 TEST_ANY_EQ:
       1 FIELD(fc <FT_PROTOCOL>)
       1 FVALUE(fc: <FT_PROTOCOL>)

    Instructions:
    00000 READ_TREE fc <FT_PROTOCOL> -> reg#0
    00001 IF_FALSE_GOTO 3
    00002 ANY_EQ reg#0 == fc: <FT_PROTOCOL>
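The representation rule in this commit can be sketched in a few lines. This is only an illustration in Python, not Wireshark's C code; the function name `bytes_repr` is hypothetical.

```python
def bytes_repr(data: bytes) -> str:
    """Render bytes as colon-separated lowercase hex.

    A single byte gets a trailing colon ("fc:") so that it cannot be
    mistaken for a name such as the protocol "fc".
    """
    text = ":".join(f"{byte:02x}" for byte in data)
    if len(data) == 1:
        text += ":"
    return text
```

With this rule a one-byte value is always visually distinct from an identifier, while longer values already contain a colon.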
2023-01-02 | dfilter: Improve debug format | João Valverde | 2 | -8/+9
2023-01-02 | dfilter: Replace global variable | João Valverde | 3 | -16/+14
2023-01-02 | dfilter: Minor flex clean up | João Valverde | 2 | -34/+9
Replace flex prefix to improve readability. Remove two no-longer-needed workarounds to suppress warnings.
2023-01-01 | Lemon: Update code and remove cruft | João Valverde | 1 | -3/+0
Remove some unused historical files. Aggressively disable warnings to keep the lemon source pristine and avoid the maintenance burden for lemon itself. Lemon has its own lax policy for warnings that doesn't match our own and they won't accept external patches to remove the warnings, so just ignore them. Lemon is just executed to generate code for the Wireshark build and the minor code issues it has have no influence at runtime. For lemon generated code we selectively disable some linting warnings. Remove patches for lemon and lempar, they are no longer required with these changes to silence warnings.
2022-12-30 | dfilter: Reject constant expressions | João Valverde | 3 | -10/+43
Constant logical expressions are tautologies and almost certainly user error. Reject them as invalid. Most of them were already rejected with insufficient type information but some corner cases were still valid.
Before:
    Filter: ${frame.number} == 3

    Syntax tree:
     0 TEST_ANY_EQ:
       1 REFERENCE(frame.number <FT_UINT32>)
       1 FVALUE(3 <FT_UINT32>)

    Instructions:
    00000 READ_REFERENCE ${frame.number <FT_UINT32>} -> reg#0
    00001 IF_FALSE_GOTO 3
    00002 ANY_EQ reg#0 == 3 <FT_UINT32>
    00003 RETURN
After:
    Filter: ${frame.number} == 3
    dftest: Constant expression is invalid.
        ${frame.number} == 3
        ^~~~~~~~~~~~~~~~~~~~
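A minimal sketch of such a check, in Python rather than the project's C. The node classes and function names are hypothetical, and treating a `${...}` reference as constant is an assumption inferred from the example in this commit, where a reference compared to a literal is rejected.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Field:
    name: str


@dataclass
class Reference:
    name: str


@dataclass
class Literal:
    text: str


@dataclass
class Relation:
    op: str
    lhs: "Node"
    rhs: "Node"


Node = Union[Field, Reference, Literal, Relation]


def is_constant(node: Node) -> bool:
    """True if no protocol field is read anywhere under this node."""
    if isinstance(node, Field):
        return False
    if isinstance(node, Relation):
        return is_constant(node.lhs) and is_constant(node.rhs)
    # Assumption: references and literals do not vary per packet.
    return True


def check_not_constant(relation: Relation) -> None:
    """Reject a relation whose result cannot depend on the packet."""
    if is_constant(relation):
        raise ValueError("Constant expression is invalid.")
```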
2022-12-30 | dfilter: Remove commute argument from semantic check | João Valverde | 1 | -32/+88
Take a more conservative, less flexible, maybe more elegant, approach to type inference for now.
2022-12-30 | dfilter: Add a check_nonzero() function | João Valverde | 2 | -26/+83
Small refactoring with no functional difference.
2022-12-30 | dftest: Add debug command-line options | João Valverde | 3 | -8/+26
2022-12-29 | dfilter: Add compilation warning for ambiguous syntax | João Valverde | 5 | -18/+69
    $ dfilter 'frame contains fc'
    Filter: frame contains fc
    Warning: Interpreting "fc" as "Fibre Channel". Consider writing :fc or .fc.
    (...)
2022-12-29 | dfilter: Refactor error location for expressions | João Valverde | 4 | -79/+93
Underline the whole expression for errors, not just the token. Implement it for all expressions.
2022-12-29 | dfilter: Replace unparsed lexical type and simplify grammar | João Valverde | 3 | -84/+102
Remove unparsed lexical type and replace it with identifier and constant. This separation is still necessary to differentiate names (fields and functions) from literals that look like names, but doing it at the lexical level has some advantages. The main advantage is a much cleaner and simpler grammar, because we only have a single token type for field names, without any loss of generality (the same name is valid for fields and function names, for example). The CONSTANT token type must be distinct from literal in order to provide errors for function rules.
2022-12-29 | dfilter: Rename grammar rules | João Valverde | 1 | -11/+11
2022-12-28 | dfilter: Improve error location for functions | João Valverde | 1 | -2/+10
Underline the whole expression if the error is for the function.
Before:
    Filter: frame.number == abs(1, 2)
    dftest: Function abs can only accept 1 arguments.
        frame.number == abs(1, 2)
                        ^~~
After:
    Filter: frame.number == abs(1, 2)
    dftest: Function abs can only accept 1 arguments.
        frame.number == abs(1, 2)
                        ^~~~~~~~~
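The caret-and-tilde marker used in these messages can be produced from a start column and a length. The helper below is a hypothetical Python sketch of that formatting, not the dftest source.

```python
def underline(start: int, length: int) -> str:
    """Build a marker line: spaces up to `start`, a caret, then tildes."""
    if start < 0 or length < 1:
        raise ValueError("start must be >= 0 and length >= 1")
    return " " * start + "^" + "~" * (length - 1)


def locate(filter_text: str, fragment: str) -> tuple[int, int]:
    """Return (start, length) of the first occurrence of `fragment`."""
    start = filter_text.index(fragment)
    return start, len(fragment)
```

Marking only the function name corresponds to `underline(*locate(text, "abs"))`; marking the whole call corresponds to `underline(*locate(text, "abs(1, 2)"))`.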
2022-12-27 | dfilter: Allow compatible types to be compared in min/max | João Valverde | 3 | -3/+6
2022-12-27 | dfilter: Do not jump when generating function arguments | João Valverde | 1 | -9/+9
Instead of "jumping" with length zero to the next sequential instruction skip generating the no-op jump instruction entirely.
2022-12-27 | dfilter: Preserve function argument order when printing | João Valverde | 1 | -7/+18
Instead of printing back to front (from the top of the stack), print them front to back as a user would type them.
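A small Python sketch of the difference, assuming arguments are pushed in the order they are written so that the last argument ends up on top of the stack. The function names are hypothetical.

```python
def format_call(name: str, stack: list[str], nargs: int) -> str:
    """Print the top `nargs` stack entries in the order they were pushed."""
    args = stack[len(stack) - nargs:]
    return f"{name}({', '.join(args)})"


def format_call_from_top(name: str, stack: list[str], nargs: int) -> str:
    """Print the same entries starting from the top of the stack."""
    args = stack[len(stack) - nargs:][::-1]
    return f"{name}({', '.join(args)})"
```

For the filter `max(1,_ws.ftypes.int8)`, the front-to-back form prints the literal first, while the top-of-stack form prints `reg#1` first, as in the older CALL_FUNCTION dump further down this log.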
2022-12-27 | dfilter: Allow constants as the first or only argument to min/max | João Valverde | 4 | -36/+63
The strategy here is to delay resolving literals to values until we have looked at the entire argument list. Also we will try to commute the relation in a comparison if we do not have a type for the return value of the function, like any other constant.
Before:
    Filter: max(1,_ws.ftypes.int8) == 1
    dftest: Argument '1' is not valid for max()
        max(1,_ws.ftypes.int8) == 1
            ^
After:
    Filter: max(1,_ws.ftypes.int8) == 1

    Syntax tree:
     0 TEST_ANY_EQ:
       1 FUNCTION(max#2):
         2 FVALUE(1 <FT_INT8>)
         2 FIELD(_ws.ftypes.int8 <FT_INT8>)
       1 FVALUE(1 <FT_INT8>)

    Instructions:
    00000 STACK_PUSH 1 <FT_INT8>
    00001 READ_TREE _ws.ftypes.int8 <FT_INT8> -> reg#1
    00002 IF_FALSE_GOTO 3
    00003 STACK_PUSH reg#1
    00004 CALL_FUNCTION max(reg#1, 1 <FT_INT8>) -> reg#0
    00005 STACK_POP 2
    00006 IF_FALSE_GOTO 8
    00007 ANY_EQ reg#0 == 1 <FT_INT8>
    00008 RETURN
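The "look at the whole argument list first" strategy might be sketched as a two-pass function. This Python sketch is an illustration under assumptions: arguments are modelled as `(text, ftype)` pairs with `ftype` set to `None` for an untyped literal, and the name `resolve_arguments` is hypothetical.

```python
from typing import List, Optional, Tuple

Argument = Tuple[str, Optional[str]]  # (text, field type or None for a literal)


def resolve_arguments(args: List[Argument]) -> List[Argument]:
    """Give every untyped literal the type of the first typed argument.

    Pass 1 scans the whole list for type information; pass 2 resolves the
    literals. If no argument is typed, the list is returned unresolved so
    the caller can try to obtain a type elsewhere.
    """
    ftype = next((t for _, t in args if t is not None), None)
    if ftype is None:
        return list(args)
    return [(text, t if t is not None else ftype) for text, t in args]
```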
2022-12-27 | dfilter: Fix crash with min/max literal argument | João Valverde | 1 | -8/+7
    Filter: max(1,_ws.ftypes.int8) == 1
     ** (dftest:64938) 01:43:25.950180 [DFilter ERROR] epan/dfilter/sttype-field.c:117 -- sttype_field_ftenum(): Magic num is 0x5cf30031, but should be 0xfc2002cf
2022-12-26 | dfilter: Fix crash with a constant arithmetic expression | João Valverde | 2 | -3/+6
2022-12-26 | dfilter: Allow arithmetic expression to commute | João Valverde | 3 | -101/+175
Allow an arithmetic expression like 1 + some.field. If we cannot assign a type to the LHS commute the terms and try again.
Before:
    Filter: _ws.ftypes.int32 + 1 == 10

    Syntax tree:
     0 TEST_ANY_EQ:
       1 OP_ADD:
         2 FIELD(_ws.ftypes.int32 <FT_INT32>)
         2 FVALUE(1 <FT_INT32>)
       1 FVALUE(10 <FT_INT32>)

    Instructions:
    00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
    00001 IF_FALSE_GOTO 4
    00002 ADD reg#0 + 1 <FT_INT32> -> reg#1
    00003 ANY_EQ reg#1 == 10 <FT_INT32>
    00004 RETURN

    Filter: 1 + _ws.ftypes.int32 == 10
    dftest: Constant arithmetic expression on the LHS is invalid.
        1 + _ws.ftypes.int32 == 10
        ^
After:
    Filter: _ws.ftypes.int32 + 1 == 10

    Syntax tree:
     0 TEST_ANY_EQ:
       1 OP_ADD:
         2 FIELD(_ws.ftypes.int32 <FT_INT32>)
         2 FVALUE(1 <FT_INT32>)
       1 FVALUE(10 <FT_INT32>)

    Instructions:
    00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
    00001 IF_FALSE_GOTO 4
    00002 ADD reg#0 + 1 <FT_INT32> -> reg#1
    00003 ANY_EQ reg#1 == 10 <FT_INT32>
    00004 RETURN

    Filter: 1 + _ws.ftypes.int32 == 10

    Syntax tree:
     0 TEST_ANY_EQ:
       1 OP_ADD:
         2 FVALUE(1 <FT_INT32>)
         2 FIELD(_ws.ftypes.int32 <FT_INT32>)
       1 FVALUE(10 <FT_INT32>)

    Instructions:
    00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
    00001 IF_FALSE_GOTO 4
    00002 ADD 1 <FT_INT32> + reg#0 -> reg#1
    00003 ANY_EQ reg#1 == 10 <FT_INT32>
    00004 RETURN
2022-12-26 | dfilter: Fix an assertion macro | João Valverde | 1 | -1/+1
2022-12-26 | dfilter: Fix grammar memory leak | João Valverde | 1 | -0/+4
2022-12-26 | dfilter: Allow comparison relation to commute | João Valverde | 1 | -5/+22
Comparison relations should be allowed to commute but they cannot because we need type information to resolve literals to fvalues. For that reason an expression like "1 == some.field" is invalid. Solve that by commuting the relation if the first try did not succeed in assigning a type to the LHS. After the second try give up; that means we have a relation with constants on both sides and that is not semantically valid.
Other relations like "matches" and "contains" are not symmetric and should not commute anyway.
Before:
    Filter: _ws.ftypes.int32 == 10

    Syntax tree:
     0 TEST_ANY_EQ:
       1 FIELD(_ws.ftypes.int32 <FT_INT32>)
       1 FVALUE(10 <FT_INT32>)

    Instructions:
    00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
    00001 IF_FALSE_GOTO 3
    00002 ANY_EQ reg#0 == 10 <FT_INT32>
    00003 RETURN

    Filter: 10 == _ws.ftypes.int32
    dftest: Left side of "==" expression must be a field or function, not 10.
        10 == _ws.ftypes.int32
        ^~
After:
    Filter: _ws.ftypes.int32 == 10

    Syntax tree:
     0 TEST_ANY_EQ:
       1 FIELD(_ws.ftypes.int32 <FT_INT32>)
       1 FVALUE(10 <FT_INT32>)

    Instructions:
    00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
    00001 IF_FALSE_GOTO 3
    00002 ANY_EQ reg#0 == 10 <FT_INT32>
    00003 RETURN

    Filter: 10 == _ws.ftypes.int32

    Syntax tree:
     0 TEST_ANY_EQ:
       1 FVALUE(10 <FT_INT32>)
       1 FIELD(_ws.ftypes.int32 <FT_INT32>)

    Instructions:
    00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
    00001 IF_FALSE_GOTO 3
    00002 ANY_EQ 10 <FT_INT32> == reg#0
    00003 RETURN
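The retry-once logic described in this commit can be sketched as follows. This is a Python illustration, not the semcheck code: operands are modelled as `(text, ftype)` pairs with `ftype` set to `None` for an untyped literal, and `check_relation` and its error text are hypothetical.

```python
from typing import Optional, Tuple

Operand = Tuple[str, Optional[str]]  # (text, field type or None for a literal)

# Per the commit message, these relations are not symmetric.
NON_COMMUTING = {"matches", "contains"}


def check_relation(op: str, lhs: Operand, rhs: Operand,
                   commuted: bool = False) -> Tuple[Operand, Operand]:
    """Type both operands, swapping them once if the LHS has no type."""
    ltext, ltype = lhs
    if ltype is None:
        if commuted or op in NON_COMMUTING:
            raise ValueError(f"cannot assign a type to either side of {op!r}")
        # Second try with the operands swapped, then restore the order.
        new_rhs, new_lhs = check_relation(op, rhs, lhs, commuted=True)
        return new_lhs, new_rhs
    rtext, rtype = rhs
    return (ltext, ltype), (rtext, rtype if rtype is not None else ltype)
```

The operand order of the result matches the order the user wrote, as in the "After" dump where the FVALUE still precedes the FIELD.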
2022-12-26 | dfilter: Allow the first DFVM argument to be an fvalue | João Valverde | 1 | -28/+48
Do not assert that arg1 must be a register, allow passing constants as the first argument to allow the arguments to commute freely.
2022-12-26 | dfilter: Change two scanner patterns to camel case | João Valverde | 1 | -5/+5
2022-12-26 | dfilter: Minor fixups | João Valverde | 2 | -1/+7
2022-12-26 | dfilter: Improve error location for parenthesized expressions | João Valverde | 3 | -2/+18
2022-12-24 | dfilter: Reformat grammar code | João Valverde | 1 | -14/+49
Use a consistent style for grammar rules. Remove a comment that is too generic. The current code should conform to how Python operates and does not need additional error checking.
2022-12-24 | dfilter: Clean up scanner code | João Valverde | 1 | -11/+10
Clean up some issues flagged by a linter. Remove hyphen from pattern names and remove an unused start condition.
2022-12-23 | dfilter: Improve error location for expressions | João Valverde | 4 | -2/+40
Try to underline the whole expression instead of the token.
2022-12-23 | dfilter: Refactor error location tracking | João Valverde | 11 | -93/+112
Remove duplicate location struct by adding a new header. Pass around a structure instead of a pointer.
2022-12-22 | dfilter: Add support for negation of arithmetic expressions | João Valverde | 3 | -4/+4
2022-12-22 | dfilter: Improve arithmetic error messages | João Valverde | 1 | -6/+29
2022-12-21 | dfilter: Check if type supports unary minus | João Valverde | 1 | -0/+4
Fix crash for types that do not support unary minus. Fixes #18750.
2022-12-03 | wmem: Remove strbuf max size parameter | João Valverde | 1 | -1/+1
This parameter was introduced as a safeguard for bugs that generate an unbounded string but its utility for that purpose is doubtful and the way it is being used creates problems with invalid truncation of UTF-8 strings. Rename wmem_strbuf_sized_new() with a better name.
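The truncation problem mentioned here is a general property of cutting UTF-8 at a byte count. The Python sketch below shows the mechanism only; it is not the wmem code, and both function names are hypothetical.

```python
def truncate_naive(text: str, max_bytes: int) -> bytes:
    """Cut the UTF-8 encoding at a byte count, possibly mid-character."""
    return text.encode("utf-8")[:max_bytes]


def truncate_on_boundary(text: str, max_bytes: int) -> bytes:
    """Cut at a byte count, then drop any incomplete trailing sequence."""
    clipped = text.encode("utf-8")[:max_bytes]
    return clipped.decode("utf-8", errors="ignore").encode("utf-8")
```

For example, "café" encodes to five bytes; cutting it at four bytes leaves a lone lead byte that is no longer valid UTF-8, whereas the boundary-aware variant returns just "caf".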
2022-12-01 | Qt: Check field autocomplete for syntactical validity | João Valverde | 2 | -1/+2
Currently the autocompletion engine always suggests a protocol field completion, even in places where it isn't syntactically valid. Fix that by compiling the preamble to the token under the cursor and checking the returned error. If it is DF_ERROR_UNEXPECTED_END that indicates a field or literal value was expected. Otherwise a field replacement is not valid in this position. Fixes #12811.
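The decision rule described above might look like the sketch below. Only the name DF_ERROR_UNEXPECTED_END comes from the commit message; the Python functions, the string used for the error code, and `toy_compile` are hypothetical stand-ins and do not reflect the Qt or libwireshark API.

```python
from typing import Callable, Optional

DF_ERROR_UNEXPECTED_END = "DF_ERROR_UNEXPECTED_END"


def field_completion_allowed(filter_text: str, token_start: int,
                             compile_filter: Callable[[str], Optional[str]]) -> bool:
    """Compile the text before the token under the cursor and inspect the error.

    A field is a valid completion only if the compiler ran out of input
    while expecting a field or literal value.
    """
    preamble = filter_text[:token_start]
    return compile_filter(preamble) == DF_ERROR_UNEXPECTED_END


def toy_compile(text: str) -> Optional[str]:
    """Stand-in compiler: report an unexpected end after a trailing operator."""
    operators = ("==", "!=", "&&", "||")
    if text.strip().endswith(operators):
        return DF_ERROR_UNEXPECTED_END
    return None  # assumption for this toy: anything else compiles
```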
2022-11-30 | dfilter: Replace compile booleans arguments with a bit flag | João Valverde | 2 | -12/+18
2022-11-30 | dfilter: Add optimization flag | João Valverde | 4 | -4/+9
When we are just testing code to see if it compiles performing optimizations is wasteful. Add an option to disable them.
2022-11-30 | dfilter: Always set error pointer in case of failure | João Valverde | 1 | -0/+1
2022-11-28 | dfilter: Return an error object instead of string | João Valverde | 7 | -95/+170
Return a struct containing error information. This simplifies the interface to more easily provide richer diagnostics in the future. Add an error code besides a human-readable error string to allow checking programmatically for errors in a robust manner. Currently there is only a generic error code; the set of codes is expected to grow in the future. Move error location information to the struct. Change callers and implementation to use the new interface.
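As an illustration of the shape described here (a code, a message, and a location in one object), consider the Python sketch below. The field and type names are hypothetical and are not taken from the C struct.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorCode(Enum):
    GENERIC = 1  # the commit mentions only a generic code so far


@dataclass(frozen=True)
class FilterError:
    code: ErrorCode      # for programmatic checks
    message: str         # human-readable text
    loc_start: int       # column where the offending expression starts
    loc_length: int      # length of the offending expression


def is_generic(err: FilterError) -> bool:
    """Callers test the code instead of matching on the message text."""
    return err.code is ErrorCode.GENERIC
```

Checking `err.code` stays stable even if the wording of `err.message` changes, which is the robustness argument made in the commit message.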
2022-11-20 | Add macros to control lemon diagnostics | João Valverde | 2 | -6/+7
Rename flex macros using parentheses (mostly a style issue):
    DIAG_OFF_FLEX -> DIAG_OFF_FLEX()
    DIAG_ON_FLEX -> DIAG_ON_FLEX()
Use the same kind of construct with lemon generated code using DIAG_OFF_LEMON() and DIAG_ON_LEMON(). Use %include and %code directives to enforce the desired order with generated code in the middle in between pragmas.
Fix a clang-specific pragma to use DIAG_OFF_CLANG().
    DIAG_OFF(unreachable-code) -> DIAG_OFF_CLANG(unreachable-code)
Apparently GCC is ignoring the -Wunreachable flag, that's why it did not trigger an unknown pragma warning. From [1]:
    The -Wunreachable-code has been removed, because it was unstable: it relied on the optimizer, and so different versions of gcc would warn about different code. The compiler still accepts and ignores the command line option so that existing Makefiles are not broken. In some future release the option will be removed entirely. - Ian
[1] https://gcc.gnu.org/legacy-ml/gcc-help/2011-05/msg00360.html
2022-11-17 | Disable another -Wunreachable lemon warning | João Valverde | 1 | -0/+3
2022-11-07 | dfilter: treat carriage returns as whitespace | Peter Wu | 1 | -1/+1
Fixes #18595
2022-10-31 | dfilter: Improve representation of raw field references | João Valverde | 3 | -13/+30
Instead of using the abstract type "<RAW>", which might be confusing, show FT_BYTES, but display the representation with the "@" operator, so it's not even more confusing in error messages why a field might flip-flop types. Refactor the field tostr() function and some other clean ups.
Before:
```
Filter: _ws.ftypes.string ==${@frame.len}
dftest: _ws.ftypes.string and frame.len <RAW> are not of compatible types.
    _ws.ftypes.string ==${@frame.len}
                           ^~~~~~~~~
```
After:
```
Filter: _ws.ftypes.string ==${@frame.len}
dftest: _ws.ftypes.string <FT_STRING> and @frame.len <FT_BYTES> are not of compatible types.
    _ws.ftypes.string ==${@frame.len}
                           ^~~~~~~~~
```
2022-10-31 | dfilter: Add support for raw addressing with references | João Valverde | 7 | -16/+80
Extends raw addressing syntax to work with references. The syntax is
    @field1 == ${@field2}
This requires replicating the logic to load field references, but using raw values instead. We use separate hash tables for that, namely "references" vs "raw_references".
2022-10-31 | dfilter: Add support for raw (bytes) addressing mode | João Valverde | 10 | -26/+160
This adds new syntax to read a field from the tree as bytes, instead of the actual type. This is a useful extension, for example, to match malformed strings that contain Unicode replacement characters. In this case it is not possible to match the raw value of the malformed string field. This extension fills this need and is generic enough that it should be useful in many other situations.
The syntax used is to prefix the field name with "@". The following artificial example tests if the HTTP user agent contains a particular invalid UTF-8 sequence:
    @http.user_agent == "Mozill\xAA"
Where simply using "http.user_agent" won't work because the invalid byte sequence will have been replaced with U+FFFD.
Considering the following programs:
    $ dftest '_ws.ftypes.string == "ABC"'
    Filter: _ws.ftypes.string == "ABC"

    Syntax tree:
     0 TEST_ANY_EQ:
       1 FIELD(_ws.ftypes.string <FT_STRING>)
       1 FVALUE("ABC" <FT_STRING>)

    Instructions:
    00000 READ_TREE _ws.ftypes.string <FT_STRING> -> reg#0
    00001 IF_FALSE_GOTO 3
    00002 ANY_EQ reg#0 == "ABC" <FT_STRING>
    00003 RETURN

    $ dftest '@_ws.ftypes.string == "ABC"'
    Filter: @_ws.ftypes.string == "ABC"

    Syntax tree:
     0 TEST_ANY_EQ:
       1 FIELD(_ws.ftypes.string <RAW>)
       1 FVALUE(41:42:43 <FT_BYTES>)

    Instructions:
    00000 READ_TREE @_ws.ftypes.string <FT_BYTES> -> reg#0
    00001 IF_FALSE_GOTO 3
    00002 ANY_EQ reg#0 == 41:42:43 <FT_BYTES>
    00003 RETURN
In the second case the field has a "raw" type, which equates directly to FT_BYTES, and the field value is read from the protocol raw data.
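Why the string value cannot match the invalid sequence can be shown with ordinary UTF-8 decoding. The Python sketch below illustrates the effect described in this commit; it does not use any Wireshark API, and the function names are hypothetical.

```python
def string_view(raw: bytes) -> str:
    """Decode as UTF-8, replacing invalid sequences with U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def raw_view(raw: bytes) -> str:
    """Show the untouched bytes as colon-separated hex."""
    return ":".join(f"{byte:02x}" for byte in raw)


if __name__ == "__main__":
    user_agent = b"Mozill\xAA"        # 0xAA alone is not valid UTF-8
    print(string_view(user_agent))     # the invalid byte is gone from the string
    print(raw_view(user_agent))        # the raw view still contains "aa"
```

Once the byte has been replaced, re-encoding the string no longer yields the original data, so only a comparison against the raw bytes can find it.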
2022-10-31 | dfilter: Pass a value by reference | João Valverde | 1 | -6/+5
The lifetime of the reference is longer than the runtime so avoid an unnecessary fvalue dup.
2022-10-31 | dfilter: Remove unused data structure | João Valverde | 2 | -9/+0