|
|
|
Add field expression functions to convert unsigned integer
and char fields to hex or decimal. (BASE_OCT is currently handled
somewhat differently, presumably because it
can't be used in filters, so leave that commented out until
it is handled as a display representation.)
Currently string() always converts unsigned integers to their
decimal representation, so it is the same as dec(), but
string() might use the native base in the future.
These can be used in columns thanks to the fix for #15990
Fix #5308
|
|
Allow matching against 64-bit extended value strings the same
way as other value strings.
The IAX2 sample capture on the Wiki is a good test of this. Previously
the matches operator would never match, and comparison operators were not
allowed.
Before:
$ ./run/dftest -s 'iax2.voice.codec == "GSM compression"'
Filter:
iax2.voice.codec == "GSM compression"
Error: "GSM compression" cannot be found among the possible values for iax2.voice.codec.
iax2.voice.codec == "GSM compression"
^~~~~~~~~~~~~~~~~
After:
$ ./run/dftest -s 'iax2.voice.codec == "GSM compression"'
Filter:
iax2.voice.codec == "GSM compression"
Syntax tree:
0 TEST_ANY_EQ:
1 FIELD(iax2.voice.codec <FT_UINT64>)
1 FVALUE(2 <FT_UINT64>)
Instructions:
0000 READ_TREE iax2.voice.codec -> R0
0001 IF_FALSE_GOTO 3
0002 ANY_EQ R0 == 2
0003 RETURN
|
|
Use directory-level suppressions where needed.
|
|
Add an initial Clang-Tidy configuration file which checks for recursion
and various clang analyzer issues.
Run Clang-Tidy in the "Clang + Code Checks" merge request job.
Add NOLINT suppressions where needed in wsutil, epan, and lemon.
|
|
The matches operator implicitly converts non-stringlike fields
that have value strings to their value string. (This is
not the same as the string representation of the number, which
applying the string function first would produce, but that is usually
less useful and performs worse than numeric comparisons.)
However, FT_FRAMENUM fields have a hfinfo->strings that does not hold
the strings used for conversion; it is overloaded with the special
ft_framenum_type_t, so don't convert.
This prevents a segmentation fault with
expressions like 'gtp.response_in ~ "test"'
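The conversion rule described above can be sketched roughly as follows. This is a minimal Python model, not the C code in epan/dfilter; the function name, the set of string-like types, and the way `strings` is represented are all assumptions made for illustration.

```python
# Illustrative subset of string-like field types (assumption).
STRING_LIKE = {"FT_STRING", "FT_STRINGZ"}

def converts_to_value_string(ftype, strings):
    """Return True if 'matches' should compare against the value string.

    ftype   -- field type name, e.g. "FT_UINT64"
    strings -- stand-in for hfinfo->strings (None when unset)
    """
    if ftype in STRING_LIKE:
        return False  # already a string, nothing to convert
    if ftype == "FT_FRAMENUM":
        # 'strings' is overloaded with a frame-number type tag here,
        # so treating it as a value-string table would be invalid.
        return False
    return strings is not None
```

With this model, a field such as iax2.voice.codec (a numeric field with value strings) is converted, while gtp.response_in (FT_FRAMENUM) is left alone.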
|
|
|
|
In many grammatical contexts fields are only tested for existence
instead of loading the values into a register, because that's all
that is needed to determine if a filter passes or not. Add a
dfilter option to load the field values from the tree and return
them when a field (including field at a certain protocol layer) is
the root of the filter syntax tree.
This is useful for columns, especially for parsing columns defined
with the layer operator, but it can't completely replace the current
custom column handling because we don't yet return exactly which
hfinfo was present, if more than one has the same abbreviation, and
it's possible for fields with the same abbreviation to have different
strings, and hence different "resolved" values.
$ ./run/dftest -s "@ip.proto#1"
Filter:
@ip.proto#1
Syntax tree:
0 FIELD(@ip.proto#[1:1] <FT_BYTES>)
Instructions:
0000 CHECK_EXISTS_R ip.proto#[1:1]
0001 RETURN
$ ./run/dftest -s "@ip.proto#1" --return-vals
Filter:
@ip.proto#1
Syntax tree:
0 FIELD(@ip.proto#[1:1] <FT_BYTES>)
Instructions:
0000 READ_TREE_R @ip.proto#[1:1] -> R0
0001 NO_OP
0002 RETURN R0
Related to #18588
|
|
When generating DFVM code, tell the return function which
register has the final set of fvalues for filters that are
functions, arithmetic, or slices (that is, that compare one
or more fvalues to see if they are all zero.) Make sure
that these functions return an empty ptr array, unlike
tests that return a null ptr array.
For fields, we could return the fvalues, but currently we
don't bother loading the fvalues into registers, since a display
filter that consists of just a field only tests for existence, so the
generated code would have to change. It's also a little more complicated
because there can be multiple fields that have different types
(sometimes not commensurable, which is an error noted by some of
the checks.) The logic in custom columns handles the field cases
currently.
|
|
min and max need to handle null arguments where the GPtrArray
is null, generated when there have been other opcodes between
the field loading and the function. (They are ignored, not
treated as zero, so they don't change the minimum.)
Prevents crashes with filters where a field does not exist in the tree:
min(tcp.srcport * 10, tcp.dstport * 10) == 800
min(len(tcp.payload), len(udp.payload)) == 153
min(len(tcp.payload[2:]) + 2, len(udp.payload[2:]) + 2) == 153
where a register is pushed even though its GPtrArray was never created:
./run/dftest 'min(len(tcp.payload), len(udp.payload))'
Filter:
min(len(tcp.payload), len(udp.payload))
Instructions:
0000 READ_TREE tcp.payload -> R1
0001 IF_FALSE_GOTO 3
0002 LENGTH R1 -> R2
0003 STACK_PUSH R2
0004 READ_TREE udp.payload -> R3
0005 IF_FALSE_GOTO 7
0006 LENGTH R3 -> R4
0007 STACK_PUSH R4
0008 CALL_FUNCTION min(R2, R4) -> R0
0009 STACK_POP [2]
0010 IF_FALSE_GOTO 12
0011 NOT_ALL_ZERO R0
0012 RETURN
Related to fcb6bb576388e8a8ef4b657d794a80f008a99ff7
(Prior to that commit, this worked because a NULL pointer is a
valid, empty GSList.)
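The null handling described above can be modelled as follows. This is a Python sketch under stated assumptions, not the actual C implementation: each argument stands in for a register holding either a list of values or None (a GPtrArray that was never created), and `df_min` is a hypothetical name.

```python
def df_min(*args):
    """Minimum over all values in all non-null argument lists.

    Each argument models a register: a list of values, or None when
    the register was never populated (for example because the field
    was absent from the tree and the load was skipped).
    """
    best = None
    for values in args:
        if values is None:
            continue  # ignored, not treated as zero
        for v in values:
            if best is None or v < best:
                best = v
    # An empty list models "no result" when every argument was null.
    return [] if best is None else [best]
```

For a TCP packet, `min(len(tcp.payload), len(udp.payload))` then corresponds to something like `df_min([153], None)`, which yields `[153]` rather than crashing or reporting zero.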
|
|
When trying to check if syntax in a filter that starts with
"${" is a macro or a field reference, use strpbrk to find the
first of '#' (layer) or '}' (closing the macro or field reference
expression.) Using strchr twice in a row causes incorrect behavior
in a long filter that has a '#' located later past the '}',
referring to a layer of a different field.
Also test for ';' and ':' and return if the string has those before
the other two characters.
Those two characters are illegal in fields but indicate that it is
a macro, as they separate macros from their arguments. Skip the other
processing as unnecessary.
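A rough Python model of the scan described above follows. `first_of` is a stand-in for C's strpbrk(), and `classify` and its return labels are hypothetical names chosen for illustration; the real code differs in detail.

```python
def first_of(text, chars, start=0):
    """Index of the first character of `text` (from `start`) that is
    in `chars`, or -1 if there is none. Stand-in for strpbrk()."""
    for i in range(start, len(text)):
        if text[i] in chars:
            return i
    return -1

def classify(filter_text):
    """Classify the '${...}' expression at the start of filter_text."""
    if not filter_text.startswith("${"):
        raise ValueError("expected text starting with '${'")
    i = first_of(filter_text, "#};:", 2)
    if i == -1:
        return "unterminated"
    if filter_text[i] in ";:":
        return "macro"  # these separators are illegal in field names
    if filter_text[i] == "#":
        return "reference with layer"
    return "macro or reference"  # '}' came first; needs a name lookup
```

In a filter such as `${ip.src} == 10.0.0.1 && tcp.port#2 == 80`, searching for '#' on its own finds the one in `tcp.port#2`, well past the closing brace. Looking for the first of several characters in a single pass stops at the '}' instead.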
|
|
If a null argument is given to a macro, print an error saying that
is disallowed instead of substituting the null argument (i.e., an
unquoted empty string) into the macro.
The latter almost certainly still produces a grammatical error, but it
will be something mysterious that depends on the macro definition like
"==" was unexpected in this context
instead of a useful error string.
For macros that take strings as arguments, substituting a null has
never worked either; "" has always been required.
As a special case, accept a single empty argument as meaning
"a macro with 0 arguments" instead of how it is currently treated,
a "macro with 1 null argument", i.e. $mymacro() for the new
function-like syntax and ${mymacro:} for the original syntax.
See 7d87367e22119be4b67822ebe14671a3f56a7f61
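The argument rules above can be sketched like this. It is a minimal Python model; `check_macro_args` and the error text are illustrative assumptions, not the actual function or message.

```python
def check_macro_args(args):
    """Validate a parsed macro argument list.

    `args` is the list of raw argument strings found between the
    separators. Returns the arguments to substitute, or raises
    ValueError for a null (unquoted empty) argument.
    """
    if args == [""]:
        # "$mymacro()" or "${mymacro:}": treated as zero arguments.
        return []
    for position, arg in enumerate(args, start=1):
        if arg == "":
            raise ValueError(f"null argument {position} is not allowed")
    return args
```

A quoted empty string (`""`) is two characters long, so it passes the check and is substituted as usual.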
|
|
Instead of requiring ${macro:arg1;...;argN}, allow the format
${macro;arg1;...;argN}.
The semicolon isn't used anywhere else, it's simple to support,
and already used in the macro syntax. It's easier to remember
if all the separators in a macro are the same.
The colon is allowed in literals, which is why it's not used
between the arguments in the macro argument list, and allowing
it after the name makes the grammar more complicated, including
tokenizing when having pop-ups of potential field matches in
the display filter line edit (#19499.)
Update the documentation for this. Also edit the documentation
for macro syntax in a few places where it implies that whitespace
in macro arguments would be ignored; in fact, it's significant.
|
|
We haven't allowed anything other than alphanumerics or `_`
in macro names since at least 2007
(commit 8e849698a32263ec7291fac437620ddd7cbdb8a8)
so the better error message if a `-` is included is to say
that it's an invalid character, not that a macro with that
name doesn't exist.
We can also stop parsing at that point, which is more efficient.
That it was allowed at this point was a legacy of when field
references were handled using the macro code instead of separate
lexical elements, pre commit 260942e17041a79cb2ca27b6e016867da3c60735.
(#17599)
Just use the same macro for valid characters in a macro name
in the two syntax forms. It's unlikely that we'll start allowing
`.` in macro names, and if we do, we'd have to revisit the
checking for the $macro(args,...) syntax as well.
|
|
|
|
Manual revert of commit 0e82c6b4b8ed18ef1878446dd26d6345be2d2c2b.
Fixes #19493.
|
|
We should avoid matching .. (DOTDOT) anywhere. Revert that change.
Keep the other HyphenBytes change.
|
|
Make sure NN-MM is parsed as subtraction.
Avoid the use of hyphen with bytes unless preceded with a colon
(the prefix for literals).
Before:
Filter:
tcp.port == 64-63
Error: "64-63" cannot be converted to Unsigned integer (16 bits).
tcp.port == 64-63
^~~~~
After:
Filter:
tcp.port == 64-63
Instructions:
0000 READ_TREE tcp.port -> R0
0001 IF_FALSE_GOTO 3
0002 ANY_EQ R0 == 1
0003 RETURN
|
|
|
|
Unparsed tokens are tokens that can either be fields
or literal values like byte arrays or something weirder.
Some cases are troublesome, like two letter tokens
being a protocol name or a byte (fc is Fibre Channel or 0xFC),
or hypothetically aa.bb.cc being a byte array
{ 0xaa, 0xbb, 0xcc } or the "bb.cc" field of the "aa"
protocol, and so on.
This semantic difference obviously matters when parsing
an expression and providing helpful error messages to users.
I have now made several attempts at resolving unparsed tokens
into field/not field at the lexical level and still provide
good error messages and there are always limitations
and weird corner cases. Assigning a semantic type to
such ambiguous tokens requires more context. Originally this
was implemented by checking for registered field values
in the scanner but that is one of the possible solutions that
does not produce good results in practice IMO.
Accept that we will never fully fix this without backward
incompatible grammar changes and commit to resolving unparsed
types during the semantic check phase and maybe having a convoluted
lemon grammar with lots of ugly UNPARSED special cases.
|
|
|
|
Add an alternative macro notation as $mymacro(a,b,c,d). For me
this notation is more natural, I have difficulty remembering how
to use macros with ${mymacro:a;b;c} and it makes the filter
expression harder to understand.
For convenience and to simplify the code we also allow
curly braces to open/close macro argument lists and the semicolon
as an argument separator for the new syntax.
This added flexibility may be reevaluated and dropped later if it
turns out to be undesirable for some reason.
|
|
|
|
|
|
Remove the UAT macro usage. The UAT API is nifty for dissectors
but clunky for everything else.
This allows using a hash table to store macros, which is the natural
data structure for the use case (and faster).
It also allows using the existing filter GUI dialog, adapted for
display filter macros. The difference isn't huge but it's better
and less limited than the more generic UAT dialog, with room for
improvement. Changing the UAT dialog for filter specific
use cases is difficult.
The config file is renamed to "dmacros" and uses the same format
as "dfilter", which is more amenable to hand-editing and more
forgiving than the UAT storage format.
There is some logic to convert the "dfilter_macros" UAT config
file to a "dmacros" filter config file, for backward-compatibility.
The conversion is only done if there is no existing "dmacros" file
in the profile folder.
|
|
Avoid running the name check twice.
|
|
|
|
Update epan to not initialize static proto values to -1.
|
|
|
|
Fixes #19466.
|
|
If a branch instruction does not branch, i.e., it jumps
to the next instruction, replace it with a no-op for
a slight performance optimization and decluttering
of the bytecode.
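The optimization can be illustrated with a small Python model. Instructions are represented as (opcode, operand) tuples, which is an assumption made for this sketch and not the DFVM data structure.

```python
def remove_useless_jumps(program):
    """Replace branches that target the next instruction with NO_OP.

    `program` is a list of (opcode, operand) tuples. For a branch
    opcode the operand is the index of the target instruction.
    """
    branch_ops = {"IF_FALSE_GOTO"}  # illustrative; other branches may exist
    optimized = []
    for pc, (opcode, operand) in enumerate(program):
        if opcode in branch_ops and operand == pc + 1:
            # Falling through and jumping reach the same instruction.
            optimized.append(("NO_OP", None))
        else:
            optimized.append((opcode, operand))
    return optimized
```

Replacing the instruction in place, rather than deleting it, keeps every other jump target valid because instruction indices do not shift.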
|
|
When the jumps_ptr is NULL a nested function call
results in a NULL pointer dereference. We could add
a NULL check but removing the jump in commit e85f8d4cf19
was a mistake, because the jump is not always a no-op,
so add it back.
Fixes e85f8d4cf193721cd95daa15363054cd4d12a0f3.
|
|
This reverts commit acdee884307a2df8add5124fa651aaa056ffaf7d.
The use of a default case in the switch statement prevents useful compiler
diagnostics, unlike the Clang Analyzer warning it purports to
fix.
|
|
Print the function argument type if it is not a register.
Use square brackets with DFVM_STACK_POP to make clear
the argument is a count intrinsic to the instruction, not a
numeric field value.
Example:
Filter:
min(2, _ws.ftypes.int32, 1000)
Instructions:
0000 STACK_PUSH 2 <FT_INT32>
0001 READ_TREE _ws.ftypes.int32 <FT_INT32> -> R1
0002 STACK_PUSH R1
0003 STACK_PUSH 1000 <FT_INT32>
0004 CALL_FUNCTION min(2 <FT_INT32>, R1, 1000 <FT_INT32>) <***> -> R0
0005 STACK_POP [3]
0006 IF_FALSE_GOTO 8
0007 NOT_ALL_ZERO R0
0008 RETURN
|
|
|
|
The UAT update_cb can *only* be used to validate an entry in ways
that cannot be determined by checking the UAT fields individually.
The update_cb is called on the copy of the record that is placed in
the UAT's user_data, not on the records that are in raw_data. When
the UAT is saved, the records in user_data are destroyed, and valid
records (as determined by running update_cb before) from raw_data
are newly copied into user_data. This *doesn't* cause the update_cb
to be run again, since the records were validated before. For this
reason, the update_cb should probably take a const void* not a void*.
This meant that Display Filter Macros were not parsed into their
parts and argument positions when the UAT was subsequently saved,
only on first load.
The only callback that is guaranteed to run whenever the data has
changed is the post_update_cb. Assign one, and call the former
macro_update (renamed and with slightly different signature; note that
it always returned true and never had an error before) on all the
macros then.
The comment about macros_init() adding a separate post_update_cb has not
been valid ever since GTK+ Wireshark went away, as Qt Wireshark never
added that. The post_update_cb placed by the GTK+ macros_init was designed
to avoid a crash that is now avoided in Qt Wireshark by registering the
UAT with the UAT_AFFECTS_FIELDS flag (see a3806fc69b9ee53ac6e4a52f679e)
Fix #11946
|
|
Before:
Filter:
frame[:2] == $@frame[:2]
Error: Range is not supported for entity @frame <FT_BYTES>
frame[:2] == $@frame[:2]
^~~~~~~
After:
Filter:
frame[:2] == $@frame[:2]
Instructions:
0000 READ_TREE frame -> R0
0001 IF_FALSE_GOTO 7
0002 SLICE R0[0:2] -> R1
0003 READ_REFERENCE ${@frame} -> R2
0004 IF_FALSE_GOTO 7
0005 SLICE R2[0:2] -> R3
0006 ANY_EQ R1 == R3
0007 RETURN
|
|
Allow references without braces, for a less cluttered syntax:
Filter:
frame.number > $frame.number
Instructions:
0000 READ_TREE frame.number -> R0
0001 IF_FALSE_GOTO 5
0002 READ_REFERENCE ${frame.number} -> R1
0003 IF_FALSE_GOTO 5
0004 ANY_GT R0 > R1
0005 RETURN
The original syntax of ${reference} came from macros but the
braces don't add much. In any case they are still allowed.
|
|
|
|
|
|
Install headers required to build display filter plugins.
Refactoring and optimizing system headers is an ongoing effort.
|
|
Allow writing display filter plugins in C. Plugins can
register one or more display filter functions.
This should lower the barrier for implementing and sharing
new display filter extensions.
An example plugin will be provided in a follow-up commit.
TODO: Put some work into refactoring display filter headers.
Right now some plugin-related APIs are implemented in dfilter-int.h,
which we'd rather not install to the system.
|
|
For the benefit of plugins.
|
|
|
|
|
|
Allow functions to be tested for "existence". This is in fact
not an existence test but a truthiness test for the function
return value.
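One way to picture the truthiness test is the following Python sketch. It assumes that a function result is a list of values (or None when nothing was produced) and that the test passes when at least one value is nonzero. The name mirrors the NOT_ALL_ZERO opcode, but the exact semantics here are an assumption.

```python
def not_all_zero(values):
    """Truthiness of a function result.

    `values` models the returned set of values: a list, or None when
    no result was produced. An empty or null result is falsy.
    """
    if values is None:
        return False
    return any(v != 0 for v in values)
```

Under this model a filter consisting only of `len(tcp.payload)` passes for packets with a non-empty TCP payload and fails for packets with an empty payload or without TCP.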
|
|
|
|
Simplify the interface to declare the function. This also makes
it easier to infer the function return type from a compiled
filter.
Print the function return type when dumping the compiled filter
or <***> if the type is unknown from the function name.
|
|
It's more compact than "bitwise_and" and inspired by C.
|
|
|