Age | Commit message (Collapse) | Author | Files | Lines |
|
No one is using this so I'd like to explore other
options first to handle constants in arithmetic
expressions that lack type information.
Reverts 3ddb017a88797f520cda45961819c7084a0a5b29.
|
|
Make dfilter byte representation always use ':' for consistency.
Represent a single byte as "XX:" with the colon suffix to
make it unambiguous that it is a byte and not another type,
like a protocol.
The difference can be seen in the following programs. In the
"before" representation it is not obvious at all that the second
"fc" value is a literal bytes value and not the value of the
protocol "fc", although it can be inferred from the lack of
a READ_TREE instruction. In the "after" representation we know
that "fc:" must be bytes and not a protocol.
Note that a leading colon is a syntactical expedient to say
"this value with any type is a literal value and not a protocol
field." A terminating colon is just a part of the dfilter
literal bytes syntax.
Before:
Filter: fc == :fc
Syntax tree:
0 TEST_ANY_EQ:
1 FIELD(fc <FT_PROTOCOL>)
1 FVALUE(fc <FT_PROTOCOL>)
Instructions:
00000 READ_TREE fc <FT_PROTOCOL> -> reg#0
00001 IF_FALSE_GOTO 3
00002 ANY_EQ reg#0 == fc <FT_PROTOCOL>
After:
Filter: fc == :fc
Syntax tree:
0 TEST_ANY_EQ:
1 FIELD(fc <FT_PROTOCOL>)
1 FVALUE(fc: <FT_PROTOCOL>)
Instructions:
00000 READ_TREE fc <FT_PROTOCOL> -> reg#0
00001 IF_FALSE_GOTO 3
00002 ANY_EQ reg#0 == fc: <FT_PROTOCOL>
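The representation rule described in this message can be sketched in a few lines of Python. This is only an illustration of the rule, not the Wireshark code (which is C); the function name `bytes_repr` is hypothetical.

```python
def bytes_repr(data: bytes) -> str:
    """Format bytes as colon-separated hex, with a trailing colon for one byte."""
    text = ":".join(f"{b:02x}" for b in data)
    # A lone byte such as "fc" could be mistaken for a protocol name,
    # so it gets a colon suffix and becomes "fc:".
    if len(data) == 1:
        text += ":"
    return text


print(bytes_repr(b"\xfc"))  # single byte, colon suffix
print(bytes_repr(b"ABC"))   # several bytes, colons only as separators
```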
|
|
|
|
|
|
Replace flex prefix to improve readability.
Remove two no-longer-needed workarounds to suppress warnings.
|
|
Remove some unused historical files.
Aggressively disable warnings to keep the lemon source
pristine and avoid the maintenance burden for lemon itself.
Lemon has its own lax policy for warnings that doesn't match our
own and they won't accept external patches to remove the
warnings, so just ignore them. Lemon is just executed to generate
code for the Wireshark build and the minor code issues it has
have no influence at runtime.
For lemon generated code we selectively disable some linting
warnings.
Remove patches for lemon and lempar, they are no longer required
with these changes to silence warnings.
|
|
Constant logical expressions are tautologies and almost certainly
user error. Reject them as invalid.
Most of them were already rejected with insufficient type information
but some corner cases were still valid.
Before:
Filter: ${frame.number} == 3
Syntax tree:
0 TEST_ANY_EQ:
1 REFERENCE(frame.number <FT_UINT32>)
1 FVALUE(3 <FT_UINT32>)
Instructions:
00000 READ_REFERENCE ${frame.number <FT_UINT32>} -> reg#0
00001 IF_FALSE_GOTO 3
00002 ANY_EQ reg#0 == 3 <FT_UINT32>
00003 RETURN
After:
Filter: ${frame.number} == 3
dftest: Constant expression is invalid.
${frame.number} == 3
^~~~~~~~~~~~~~~~~~~~
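A minimal sketch of such a check, assuming that both literals and ${...} references count as constants while a filter runs (that reading is taken from the example above). The node classes and `check_comparison` are hypothetical names, not the dfilter API.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str


@dataclass
class Literal:
    text: str


@dataclass
class Reference:
    name: str


def is_constant(node) -> bool:
    # Assumption: only a protocol field varies from packet to packet.
    return isinstance(node, (Literal, Reference))


def check_comparison(lhs, rhs) -> None:
    """Reject a comparison whose two sides are both constant."""
    if is_constant(lhs) and is_constant(rhs):
        raise ValueError("Constant expression is invalid.")
```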
|
|
Take a more conservative, less flexible, maybe more elegant,
approach to type inference for now.
|
|
Small refactoring with no functional difference.
|
|
|
|
$ dfilter 'frame contains fc'
Filter: frame contains fc
Warning: Interpreting "fc" as "Fibre Channel". Consider writing :fc or .fc.
(...)
|
|
Underline the whole expression for errors, not just the token.
Implement it for all expressions.
|
|
Remove the unparsed lexical type and replace it with identifier
and constant. This separation is still necessary to differentiate
names (fields and functions) from literals that look like names,
but doing it at the lexical level has some advantages.
The main advantage is a much cleaner and simpler grammar,
because we only have a single token type for field names, without
any loss of generality (the same name is valid for fields and
function names, for example).
The CONSTANT token type needs to be distinct from literal
to provide errors for function rules.
|
|
|
|
Underline the whole expression if the error is for the function.
Before:
Filter: frame.number == abs(1, 2)
dftest: Function abs can only accept 1 arguments.
frame.number == abs(1, 2)
^~~
After:
Filter: frame.number == abs(1, 2)
dftest: Function abs can only accept 1 arguments.
frame.number == abs(1, 2)
^~~~~~~~~
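The marker line shown above can be produced from a start column and a span length. The following is a small sketch under that assumption; `underline` is a hypothetical helper, not a dftest function.

```python
def underline(start: int, length: int) -> str:
    """Build a marker line: spaces up to `start`, then '^' followed by '~'."""
    if length < 1:
        raise ValueError("span must cover at least one character")
    return " " * start + "^" + "~" * (length - 1)


filter_text = "frame.number == abs(1, 2)"
start = filter_text.index("abs(")   # column where the function call starts
length = len(filter_text) - start   # the call runs to the end of the filter
print(filter_text)
print(underline(start, length))
```

Passing the length of the token `abs` instead of the whole call gives the shorter "before" marker.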
|
|
|
|
Instead of "jumping" with length zero to the next sequential
instruction, skip generating the no-op jump instruction entirely.
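As a rough sketch of the idea, assuming jumps are modelled as relative skips (the bytecode shown elsewhere in this log uses absolute targets such as IF_FALSE_GOTO 3); `emit_jump` and the tuple encoding are hypothetical.

```python
def emit_jump(code: list, skip: int) -> None:
    """Append a relative jump that skips `skip` of the following instructions."""
    if skip == 0:
        # Jumping to the next sequential instruction does nothing,
        # so no instruction is generated at all.
        return
    code.append(("JUMP", skip))


program = []
emit_jump(program, 0)   # nothing emitted
emit_jump(program, 2)   # a real jump
print(program)
```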
|
|
Instead of printing back to front (from the top of the stack),
print them front to back as a user would type them.
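A small sketch of the difference, assuming the arguments sit on a list used as a stack with the last element on top; `format_call` is a hypothetical name.

```python
def format_call(name: str, stack: list, nargs: int) -> str:
    """Print the top `nargs` stack entries in the order they were pushed."""
    args = stack[len(stack) - nargs:]  # bottom to top, the order a user typed
    return f"{name}({', '.join(args)})"


stack = ["1 <FT_INT8>", "reg#1"]      # pushed left to right
print(format_call("max", stack, 2))
```

Iterating from the top of the stack instead would give the reversed argument list.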
|
|
The strategy here is to delay resolving literals to values until
we have looked at the entire argument list.
Also we will try to commute the relation in a comparison if
we do not have a type for the return value of the function,
like any other constant.
Before:
Filter: max(1,_ws.ftypes.int8) == 1
dftest: Argument '1' is not valid for max()
max(1,_ws.ftypes.int8) == 1
^
After:
Filter: max(1,_ws.ftypes.int8) == 1
Syntax tree:
0 TEST_ANY_EQ:
1 FUNCTION(max#2):
2 FVALUE(1 <FT_INT8>)
2 FIELD(_ws.ftypes.int8 <FT_INT8>)
1 FVALUE(1 <FT_INT8>)
Instructions:
00000 STACK_PUSH 1 <FT_INT8>
00001 READ_TREE _ws.ftypes.int8 <FT_INT8> -> reg#1
00002 IF_FALSE_GOTO 3
00003 STACK_PUSH reg#1
00004 CALL_FUNCTION max(reg#1, 1 <FT_INT8>) -> reg#0
00005 STACK_POP 2
00006 IF_FALSE_GOTO 8
00007 ANY_EQ reg#0 == 1 <FT_INT8>
00008 RETURN
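The strategy can be sketched as two passes over the argument list. This is an illustration under simplifying assumptions (arguments are plain tuples, the first typed argument decides the type); `resolve_args` is a hypothetical name and not the semantic checker's interface.

```python
def resolve_args(args):
    """Each argument is ("field", name, ftype) or ("literal", text)."""
    # First pass: look at the entire list for an argument that carries a type.
    common = None
    for arg in args:
        if arg[0] == "field":
            common = arg[2]
            break
    if common is None:
        return None  # no type yet; the caller may commute the comparison
    # Second pass: only now resolve the literals to values of that type.
    resolved = []
    for arg in args:
        if arg[0] == "literal":
            resolved.append(("fvalue", arg[1], common))
        else:
            resolved.append(arg)
    return resolved
```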
|
|
Filter: max(1,_ws.ftypes.int8) == 1
** (dftest:64938) 01:43:25.950180 [DFilter ERROR] epan/dfilter/sttype-field.c:117 -- sttype_field_ftenum(): Magic num is 0x5cf30031, but should be 0xfc2002cf
|
|
|
|
Allow an arithmetic expression like 1 + some.field. If we
cannot assign a type to the LHS commute the terms and
try again.
Before:
Filter: _ws.ftypes.int32 + 1 == 10
Syntax tree:
0 TEST_ANY_EQ:
1 OP_ADD:
2 FIELD(_ws.ftypes.int32 <FT_INT32>)
2 FVALUE(1 <FT_INT32>)
1 FVALUE(10 <FT_INT32>)
Instructions:
00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
00001 IF_FALSE_GOTO 4
00002 ADD reg#0 + 1 <FT_INT32> -> reg#1
00003 ANY_EQ reg#1 == 10 <FT_INT32>
00004 RETURN
Filter: 1 + _ws.ftypes.int32 == 10
dftest: Constant arithmetic expression on the LHS is invalid.
1 + _ws.ftypes.int32 == 10
^
After:
Filter: _ws.ftypes.int32 + 1 == 10
Syntax tree:
0 TEST_ANY_EQ:
1 OP_ADD:
2 FIELD(_ws.ftypes.int32 <FT_INT32>)
2 FVALUE(1 <FT_INT32>)
1 FVALUE(10 <FT_INT32>)
Instructions:
00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
00001 IF_FALSE_GOTO 4
00002 ADD reg#0 + 1 <FT_INT32> -> reg#1
00003 ANY_EQ reg#1 == 10 <FT_INT32>
00004 RETURN
Filter: 1 + _ws.ftypes.int32 == 10
Syntax tree:
0 TEST_ANY_EQ:
1 OP_ADD:
2 FVALUE(1 <FT_INT32>)
2 FIELD(_ws.ftypes.int32 <FT_INT32>)
1 FVALUE(10 <FT_INT32>)
Instructions:
00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
00001 IF_FALSE_GOTO 4
00002 ADD 1 <FT_INT32> + reg#0 -> reg#1
00003 ANY_EQ reg#1 == 10 <FT_INT32>
00004 RETURN
|
|
|
|
|
|
Comparison relations should be allowed to commute but they cannot
because we need type information to resolve literals to fvalues. For
that reason an expression like "1 == some.field" is invalid. Solve
that by commuting the relation if the first try did not succeed in
assigning a type to the LHS.
After the second try, give up: that means we have a relation with
constants on both sides and that is not semantically valid.
Other relations like "matches" and "contains" are not symmetric and
should not commute anyway.
Before:
Filter: _ws.ftypes.int32 == 10
Syntax tree:
0 TEST_ANY_EQ:
1 FIELD(_ws.ftypes.int32 <FT_INT32>)
1 FVALUE(10 <FT_INT32>)
Instructions:
00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
00001 IF_FALSE_GOTO 3
00002 ANY_EQ reg#0 == 10 <FT_INT32>
00003 RETURN
Filter: 10 == _ws.ftypes.int32
dftest: Left side of "==" expression must be a field or function, not 10.
10 == _ws.ftypes.int32
^~
After:
Filter: _ws.ftypes.int32 == 10
Syntax tree:
0 TEST_ANY_EQ:
1 FIELD(_ws.ftypes.int32 <FT_INT32>)
1 FVALUE(10 <FT_INT32>)
Instructions:
00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
00001 IF_FALSE_GOTO 3
00002 ANY_EQ reg#0 == 10 <FT_INT32>
00003 RETURN
Filter: 10 == _ws.ftypes.int32
Syntax tree:
0 TEST_ANY_EQ:
1 FVALUE(10 <FT_INT32>)
1 FIELD(_ws.ftypes.int32 <FT_INT32>)
Instructions:
00000 READ_TREE _ws.ftypes.int32 <FT_INT32> -> reg#0
00001 IF_FALSE_GOTO 3
00002 ANY_EQ 10 <FT_INT32> == reg#0
00003 RETURN
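A minimal sketch of the two tries, assuming operands are plain tuples and leaving out the type compatibility check between two fields; `check_equality` and `type_operand` are hypothetical names.

```python
def type_operand(operand, ftype):
    """Resolve a literal to a typed value; leave fields untouched."""
    if operand[0] == "literal":
        return ("fvalue", operand[1], ftype)
    return operand


def check_equality(lhs, rhs):
    """Each operand is ("field", name, ftype) or ("literal", text)."""
    # First try takes the type from the LHS; the second try commutes
    # the relation and takes it from the RHS.
    for typed_side in (lhs, rhs):
        if typed_side[0] == "field":
            ftype = typed_side[2]
            return type_operand(lhs, ftype), type_operand(rhs, ftype)
    # Both tries failed: constants on both sides.
    raise ValueError("relation has constants on both sides")
```

The operands keep their original order in the result, which matches the "after" syntax tree where FVALUE(10) stays on the left.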
|
|
Do not assert that arg1 must be a register, allow passing constants
as the first argument to allow the arguments to commute freely.
|
|
|
|
|
|
|
|
Use a consistent style for grammar rules.
Remove a comment that is too generic. The current code should
conform to how Python operates and does not need additional error
checking.
|
|
Clean up some issues flagged by a linter.
Remove hyphen from pattern names and remove an unused start condition.
|
|
Try to underline the whole expression instead of the
token.
|
|
Remove duplicate location struct by adding a new header.
Pass around a structure instead of a pointer.
|
|
|
|
|
|
Fix crash for types that do not support unary minus.
Fixes #18750.
|
|
This parameter was introduced as a safeguard for bugs
that generate an unbounded string but its utility for
that purpose is doubtful and the way it is being used
creates problems with invalid truncation of UTF-8
strings.
Rename wmem_strbuf_sized_new() to a better name.
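The truncation problem is generic to any byte limit applied to UTF-8 text. The sketch below shows it in Python; it does not use or model the wmem API, and both helper names are hypothetical.

```python
def truncate_bytes(text: str, max_bytes: int) -> bytes:
    """Cut the UTF-8 encoding at a byte count, ignoring character boundaries."""
    return text.encode("utf-8")[:max_bytes]


def is_valid_utf8(data: bytes) -> bool:
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "ï" takes two bytes in UTF-8, so a limit of 3 bytes cuts it in half.
print(is_valid_utf8(truncate_bytes("naïve", 2)))
print(is_valid_utf8(truncate_bytes("naïve", 3)))
```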
|
|
Currently the autocompletion engine always suggests a protocol
field completion, even in places where it isn't syntactically
valid.
Fix that by compiling the preamble to the token under the cursor
and checking the returned error. If it is DF_ERROR_UNEXPECTED_END
that indicates a field or literal value was expected. Otherwise
a field replacement is not valid in this position.
Fixes #12811.
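The decision can be sketched with a toy stand-in for the compiler. Only DF_ERROR_UNEXPECTED_END comes from the message above; `toy_compile`, its alternating operand/operator grammar, the operator set and DF_ERROR_GENERIC as a name are assumptions made for illustration.

```python
OPERATORS = {"==", "!=", "<", ">", "<=", ">=", "contains", "matches", "and", "or"}

DF_ERROR_UNEXPECTED_END = "DF_ERROR_UNEXPECTED_END"
DF_ERROR_GENERIC = "DF_ERROR_GENERIC"  # hypothetical name


def toy_compile(preamble: str):
    """Return None if the preamble is complete, otherwise an error code."""
    expect_operand = True
    for token in preamble.split():
        is_operator = token in OPERATORS
        if is_operator == expect_operand:
            # Operator where an operand was expected, or the reverse.
            return DF_ERROR_GENERIC
        expect_operand = not expect_operand
    # Input ended while an operand (field or literal) was still expected.
    return DF_ERROR_UNEXPECTED_END if expect_operand else None


def field_completion_allowed(preamble: str) -> bool:
    """Suggest a field only if the text before the cursor ends expecting one."""
    return toy_compile(preamble) == DF_ERROR_UNEXPECTED_END
```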
|
|
|
|
When we are just testing code to see if it compiles performing
optimizations is wasteful. Add an option to disable them.
|
|
|
|
Return a struct containing error information. This simplifies
the interface to more easily provide richer diagnostics in the future.
Add an error code besides a human-readable error string to allow
checking programmatically for errors in a robust manner. Currently
there is only a generic error code; more are expected to be added
in the future.
Move error location information to the struct. Change callers and
implementation to use the new interface.
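The shape of such an interface might look like the sketch below. All names are hypothetical, and the toy rule that rejects a blank filter exists only to produce an error; it is not how Wireshark treats empty filters.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ErrorCode(Enum):
    GENERIC = 1  # hypothetical name for the single generic code


@dataclass
class FilterError:
    code: ErrorCode
    message: str
    loc_start: int   # column where the offending span starts
    loc_length: int  # number of characters in the span


def compile_filter(text: str) -> Optional[FilterError]:
    """Toy compiler: returns None on success, an error struct otherwise."""
    if not text.strip():
        return FilterError(ErrorCode.GENERIC, "Empty filter.", 0, 0)
    return None


err = compile_filter(" ")
if err is not None and err.code is ErrorCode.GENERIC:
    print(err.message)  # callers test the code, not the message text
```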
|
|
Rename flex macros using parentheses (mostly a style issue):
DIAG_OFF_FLEX -> DIAG_OFF_FLEX()
DIAG_ON_FLEX -> DIAG_ON_FLEX()
Use the same kind of construct with lemon generated code using
DIAG_OFF_LEMON() and DIAG_ON_LEMON(). Use %include and %code
directives to enforce the desired order, with the generated code
placed between the pragmas.
Fix a clang-specific pragma to use DIAG_OFF_CLANG().
DIAG_OFF(unreachable-code) -> DIAG_OFF_CLANG(unreachable-code).
Apparently GCC is ignoring the -Wunreachable-code flag, which is why
it did not trigger an unknown pragma warning. From [1]:
The -Wunreachable-code has been removed, because it was unstable: it
relied on the optimizer, and so different versions of gcc would warn
about different code. The compiler still accepts and ignores the
command line option so that existing Makefiles are not broken. In some
future release the option will be removed entirely. - Ian
[1] https://gcc.gnu.org/legacy-ml/gcc-help/2011-05/msg00360.html
|
|
|
|
Fixes #18595
|
|
Instead of using the abstract type "<RAW>", which might be confusing,
show FT_BYTES, but display the representation with the "@" operator,
so it's not even more confusing in error messages why a field might
flip-flop types.
Refactor the field tostr() function and some other clean ups.
Before:
```
Filter: _ws.ftypes.string ==${@frame.len}
dftest: _ws.ftypes.string and frame.len <RAW> are not of compatible types.
_ws.ftypes.string ==${@frame.len}
^~~~~~~~~
```
After:
```
Filter: _ws.ftypes.string ==${@frame.len}
dftest: _ws.ftypes.string <FT_STRING> and @frame.len <FT_BYTES> are not of compatible types.
_ws.ftypes.string ==${@frame.len}
^~~~~~~~~
```
|
|
Extend the raw addressing syntax to work with references. The syntax
is
@field1 == ${@field2}
This requires replicating the logic to load field references, but
using raw values instead. We use separate hash tables for that,
namely "references" vs "raw_references".
|
|
This adds new syntax to read a field from the tree as bytes, instead
of the actual type. This is a useful extension for example to match
malformed strings that contain Unicode replacement characters. In
this case it is not possible to match the raw value of the malformed
string field. This extension fills this need and is generic enough
that it should be useful in many other situations.
The syntax used is to prefix the field name with "@". The following
artificial example tests if the HTTP user agent contains a particular
invalid UTF-8 sequence:
@http.user_agent == "Mozill\xAA"
Where simply using "http.user_agent" won't work because the invalid byte
sequence will have been replaced with U+FFFD.
Consider the following programs:
$ dftest '_ws.ftypes.string == "ABC"'
Filter: _ws.ftypes.string == "ABC"
Syntax tree:
0 TEST_ANY_EQ:
1 FIELD(_ws.ftypes.string <FT_STRING>)
1 FVALUE("ABC" <FT_STRING>)
Instructions:
00000 READ_TREE _ws.ftypes.string <FT_STRING> -> reg#0
00001 IF_FALSE_GOTO 3
00002 ANY_EQ reg#0 == "ABC" <FT_STRING>
00003 RETURN
$ dftest '@_ws.ftypes.string == "ABC"'
Filter: @_ws.ftypes.string == "ABC"
Syntax tree:
0 TEST_ANY_EQ:
1 FIELD(_ws.ftypes.string <RAW>)
1 FVALUE(41:42:43 <FT_BYTES>)
Instructions:
00000 READ_TREE @_ws.ftypes.string <FT_BYTES> -> reg#0
00001 IF_FALSE_GOTO 3
00002 ANY_EQ reg#0 == 41:42:43 <FT_BYTES>
00003 RETURN
In the second case the field has a "raw" type, that equates directly to
FT_BYTES, and the field value is read from the protocol raw data.
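Why the string value cannot be matched while the raw bytes can is easy to show outside Wireshark. The sketch below uses Python's own UTF-8 decoder with replacement as an analogy for the sanitized string field; the helper names are hypothetical.

```python
def string_value(raw: bytes) -> str:
    """Decode like a sanitized string: invalid bytes become U+FFFD."""
    return raw.decode("utf-8", errors="replace")


def raw_repr(raw: bytes) -> str:
    """Colon-separated hex, like the FT_BYTES value in the second program."""
    return raw.hex(":")


first = b"Mozill\xaa"
second = b"Mozill\xbb"
# Different invalid bytes collapse to the same replacement character,
# so the decoded strings are equal while the raw bytes are not.
print(string_value(first) == string_value(second))
print(first == second)
print(raw_repr(b"ABC"))
```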
|
|
The lifetime of the reference is longer than the runtime so avoid
an unnecessary fvalue dup.
|
|
|