author: kpfleming <kpfleming@f38db490-d61c-443f-a65b-d21fe96a405b> 2005-05-16 00:35:38 +0000
committer: kpfleming <kpfleming@f38db490-d61c-443f-a65b-d21fe96a405b> 2005-05-16 00:35:38 +0000
commit: 5e9ff3009ec0f9ca310ac94fb7a8ebf7cd2db571
tree: d149948a4b2a5d510a60e6c3ddc8d04b6f57f8df /doc
parent: cd6784bf2bab3dc3c32f9a3ca92b179dc62d8cef
add upgraded expression parser (bug #2058)
git-svn-id: http://svn.digium.com/svn/asterisk/trunk@5691 f38db490-d61c-443f-a65b-d21fe96a405b
Diffstat (limited to 'doc')
-rwxr-xr-x doc/README.variables 207
1 files changed, 200 insertions, 7 deletions
diff --git a/doc/README.variables b/doc/README.variables
index 05955bdba..ff5313f31 100755
--- a/doc/README.variables
+++ b/doc/README.variables
@@ -1,5 +1,6 @@
+----------------------------
Asterisk dial plan variables
----------------------------
+----------------------------
There are two levels of parameter evaluation done in the Asterisk
dial plan in extensions.conf.
@@ -12,6 +13,15 @@ Asterisk has user-defined variables and standard variables set
by various modules in Asterisk. These standard variables are
listed at the end of this document.
+NOTE: During the Asterisk build process, the versions of bison and
+flex available on your system are probed. If flex is at version
+2.5.31 or later, the build uses flex to generate a "pure" (re-entrant)
+tokenizer for expressions. If bison is newer than version 1.85, the
+build uses a bison grammar to generate a pure (re-entrant) parser for
+$[] expressions.
+Notes specific to the flex parser are marked with "**" at the beginning
+of the line.
+
___________________________
PARAMETER QUOTING:
---------------------------
@@ -123,6 +133,10 @@ considered as an expression and it is evaluated. Evaluation works similar to
evaluation.
Note: The arguments and operands of the expression MUST BE separated
by at least one space.
+** Using the Flex generated tokenizer, this is no longer the case. Spaces
+** are only required where they would separate tokens that would normally
+** be merged into a single token. Using the new tokenizer, spaces can be
+** used freely.
For example, after the sequence:
@@ -132,6 +146,11 @@ exten => 1,2,Set(koko=$[2 * ${lala}])
the value of variable koko is "6".
+** Using the new Flex generated tokenizer, the expressions above are still
+** legal, but so are the following:
+** exten => 1,1,Set(lala=$[1+2])
+** exten => 1,2,Set(koko=$[2* ${lala}])
+
And, further:
exten => 1,1,Set(lala=$[1+2]);
@@ -141,15 +160,19 @@ token "1+2" are not numbers, it will be evaluated as the string "1+2". Again,
please do not forget, that this is a very simple parsing engine, and it
uses a space (at least one), to separate "tokens".
+** Please note that spaces are not required to separate tokens if you have
+** Flex version 2.5.31 or higher on your system.
+
and, further:
exten => 1,1,Set,"lala=$[ 1 + 2 ]";
will parse as intended. Extra spaces are ignored.
-___________________________
-SPACES INSIDE VARIABLE
----------------------------
+
+______________________________
+SPACES INSIDE VARIABLE VALUES
+------------------------------
If the variable being evaluated contains spaces, there can be problems.
For these cases, double quotes around text that may contain spaces
@@ -173,7 +196,7 @@ DELOREAN MOTORS : Privacy Manager
and will result in syntax errors, because token DELOREAN is immediately
followed by token MOTORS and the expression parser will not know how to
-evaluate this expression.
+evaluate this expression, because it does not match its grammar.
_____________________
OPERATORS
@@ -204,6 +227,14 @@ with equal precedence are grouped within { } symbols.
Return the results of multiplication, integer division, or
remainder of integer-valued arguments.
+** - expr1
+** Return the result of subtracting expr1 from 0.
+**
+** ! expr1
+** Return the result of a logical complement of expr1.
+** In other words, if expr1 is null, 0, an empty string,
+** or the string "0", return a 1. Otherwise, return a "0". (only with flex >= 2.5.31)
+
expr1 : expr2
The `:' operator matches expr1 against expr2, which must be a
regular expression. The regular expression is anchored to the
@@ -216,11 +247,70 @@ with equal precedence are grouped within { } symbols.
the pattern contains a regular expression subexpression the null
string is returned; otherwise 0.
+ Normally, the double quotes wrapping a string are left as part
+ of the string. This is disastrous to the : operator. Therefore,
+ before the regex match is made, beginning and ending double quote
+ characters are stripped from both the pattern and the string.
+
+** expr1 =~ expr2
+** Exactly the same as the ':' operator, except that the match is
+** not anchored to the beginning of the string. Pardon any similarity
+** to seemingly similar operators in other programming languages!
+** (only if flex >= 2.5.31)
+
+
+
Parentheses are used for grouping in the usual manner.
-The parser must be parsed with bison (bison is REQUIRED - yacc cannot
-produce pure parsers, which are reentrant)
+Operator precedence is applied as one would expect in any of the C
+or C derived languages.
+
+The parser must be generated with bison (bison is REQUIRED - yacc cannot
+produce pure parsers, which are reentrant). Likewise, the scanner is
+generated with flex if flex is at 2.5.31 or greater; re-entrant scanners
+were not available before that version.
+
+
+
+Examples
+** "One Thousand Five Hundred" =~ "(T[^ ]+)"
+** returns: Thousand
+
+** "One Thousand Five Hundred" =~ "T[^ ]+"
+** returns: 8
+
+ "One Thousand Five Hundred" : "T[^ ]+"
+ returns: 0
+
+ "8015551212" : "(...)"
+ returns: 801
+
+ "3075551212":"...(...)"
+ returns: 555
+
+** ! "One Thousand Five Hundred" =~ "T[^ ]+"
+** returns: 0 (because it applies to the string, which is non-null, which it turns to "0",
+ and then looks for the pattern in the "0", and doesn't find it)
+
+** !( "One Thousand Five Hundred" : "T[^ ]+" )
+** returns: 1 (because the string doesn't start with a word starting with T, so the
+ match evals to 0, and the ! operator inverts it to 1 ).
+
+ 2 + 8 / 2
+ returns 6. (because of operator precedence; the division is done first, then the addition).
+
+** 2+8/2
+** returns 6. Spaces aren't necessary.
+
+**(2+8)/2
+** returns 5, of course.
+
+Of course, all of the above examples use constants, but they would work the same if any
+of the numeric or string constants were replaced with a variable reference, such as
+${CALLERIDNUM}.
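As a rough illustration of the ':' and '=~' semantics described above, the following Python sketch reproduces the constant-only examples. It is not Asterisk's implementation: the function names (strip_quotes, regex_op, colon_op, match_op) are hypothetical, and Python's re module stands in for whatever regex library Asterisk actually uses, so behavior may differ for patterns beyond these simple ones. The return-value rules are inferred from the README text and its examples: the first subexpression if the pattern has one, otherwise the number of characters matched; on failure, the null string or 0.

```python
import re


def strip_quotes(text):
    # Assumption: one wrapping pair of double quotes is removed, as the
    # README says quotes are stripped from both the string and the pattern.
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return text


def regex_op(string, pattern, anchored):
    # Hypothetical helper shared by both operators.
    string = strip_quotes(string)
    regex = re.compile(strip_quotes(pattern))
    # ':' anchors the match at the start of the string; '=~' does not.
    found = regex.match(string) if anchored else regex.search(string)
    has_group = regex.groups > 0
    if found is None:
        return "" if has_group else 0
    if has_group:
        return found.group(1) or ""
    return len(found.group(0))


def colon_op(string, pattern):
    # Sketch of: string : pattern
    return regex_op(string, pattern, anchored=True)


def match_op(string, pattern):
    # Sketch of: string =~ pattern
    return regex_op(string, pattern, anchored=False)


print(match_op('"One Thousand Five Hundred"', '"(T[^ ]+)"'))  # Thousand
print(match_op('"One Thousand Five Hundred"', '"T[^ ]+"'))    # 8
print(colon_op('"One Thousand Five Hundred"', '"T[^ ]+"'))    # 0
print(colon_op('"8015551212"', '"(...)"'))                    # 801
print(colon_op('"3075551212"', '"...(...)"'))                 # 555
```

The only difference between the two wrappers is the anchored flag, which mirrors the README's statement that '=~' is exactly the same as ':' except for anchoring.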
+
+
___________________________
CONDITIONALS
---------------------------
@@ -277,6 +367,26 @@ going to be somewhere between the last '^' on the second line, and the
'^' on the third line. That's right, in the example above, there are two
'&' chars, separated by a space, and this is a definite no-no!
+** WITH FLEX >= 2.5.31, this has changed slightly. The line showing the
+** part of the expression that was successfully parsed has been dropped,
+** and the parse error is explained in a somewhat cryptic format in the log.
+**
+** The same line in extensions.conf as above, will now generate an error
+** message in /var/log/asterisk/messages that looks like this:
+**
+** Jul 15 21:27:49 WARNING[1251240752]: ast_yyerror(): syntax error: parse error, unexpected TOK_AND, expecting TOK_MINUS or TOK_LP or TOKEN; Input:
+** "3072312154" = "3071234567" & & "Steves Extension" : "Privacy Manager"
+** ^
+**
+** The log line tells you that a syntax error was encountered. It now
+** also tells you (in grand standard bison format) that it hit an "AND" (&)
+** token unexpectedly, and that it was hoping for a MINUS (-), LP (left parenthesis),
+** or a plain token (a string or number).
+**
+** As before, the next line shows the evaluated expression, and the line after
+** that, the position of the parser in the expression when it became confused,
+** marked with the "^" character.
+
___________________________
NULL STRINGS
@@ -306,6 +416,89 @@ whatever language you desire, be it Perl, C, C++, Cobol, RPG, Java,
Snobol, PL/I, Scheme, Common Lisp, Shell scripts, Tcl, Forth, Modula,
Pascal, APL, assembler, etc.
+----------------------------
+INCOMPATIBILITIES
+----------------------------
+
+The asterisk expression parser has undergone some evolution. It is hoped
+that the changes will be viewed as positive.
+
+The "original" expression parser had a simple, hand-written scanner, and
+a simple bison grammar. This was upgraded to a more involved bison grammar,
+and a hand-written scanner upgraded to allow extra spaces, and to generate
+better error diagnostics. This upgrade required bison 1.85, and a part of the user
+community felt the pain of having to upgrade their bison version.
+
+The next upgrade included new bison and flex input files, and the makefile
+was upgraded to detect the current versions of both flex and bison, conditionally
+compiling and linking the new files if the versions of flex and bison would
+allow it.
+
+If you have not touched your extensions.conf files in a year or so, the
+above upgrades may cause you some heartburn in certain circumstances, as
+several changes have been made, and these will affect asterisk's behavior on
+legacy extensions.conf constructs. The changes have been engineered
+to minimize these conflicts, but there are bound to be problems.
+
+The following list gives some (and most likely, not all) of the areas
+of possible concern with "legacy" extensions.conf files:
+
+1. Tokens separated by space(s).
+ Previously, tokens were separated by spaces. Thus, ' 1 + 1 ' would evaluate
+ to the value '2', but '1+1' would evaluate to the string '1+1'. If this
+ behavior was depended on, then the expression evaluation will break. '1+1'
+ will now evaluate to '2', and something is not going to work right.
+ To keep such strings from being evaluated, simply wrap them in double
+ quotes: ' "1+1" '
+
+2. The colon operator. In versions previous to double quoting, the
+ colon operator takes the right hand string, and using it as a
+ regex pattern, looks for it in the left hand string. It is given
+ an implicit ^ operator at the beginning, meaning the pattern
+ will match only at the beginning of the left hand string.
+ If the pattern or the matching string had double quotes around
+ them, these could get in the way of the pattern match. Now,
+ the wrapping double quotes are stripped from both the pattern
+ and the left hand string before applying the pattern. This
+ was done because it was recognized that the new way of
+ scanning the expression doesn't use spaces to separate tokens,
+ and the average regex expression is full of operators that
+ the scanner will recognize as expression operators. Thus, unless
+ the pattern is wrapped in double quotes, there will be trouble.
+ For instance, ${VAR1} : (Who|What*)+
+ may have worked before, but unless you wrap the pattern
+ in double quotes now, look out for trouble! This is better:
+ "${VAR1}" : "(Who|What*)+"
+ and should work as before.
+
+3. Variables and Double Quotes
+ Before these changes, if a variable's value contained one or more double
+ quotes, it was no reason for concern. It is now!
+
+4. LE, GE, NE operators removed. The code supported these operators,
+ but they were not documented. The symbolic operators, <=, >=, and !=
+ should be used instead.
+
+**5. flex 2.5.31 or greater and bison 1.875 or greater should be used.
+** Earlier versions of flex do not generate 'pure', or reentrant, C
+** scanners. Earlier versions of bison do not support the location
+** tracking mechanism.
+
+** http://ftp.gnu.org/gnu/bison/bison-1.875.tar.bz2
+** http://prdownloads.sourceforge.net/lex/flex-2.5.31.tar.bz2?download
+** or http://lex.sourceforge.net/
+
+**6. Added the unary '-' operator. So you can write 3+ -4 and get -1.
+
+**7. Added the unary '!' operator, which is a logical complement.
+** Basically, if the string or number is null, empty, or '0',
+** a '1' is returned. Otherwise a '0' is returned.
+
+**8. Added the '=~' operator, in case someone is looking for a
+** match anywhere in the string. The only difference from ':' is that the
+** match doesn't have to be anchored to the beginning of the string.
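The two unary operators added in items 6 and 7 can be sketched in Python as follows. This is only an illustration of the rules stated in this README, not code from the Asterisk source: the names logical_not and negate are hypothetical, and None is used here to stand in for a null value.

```python
def logical_not(value):
    # Sketch of unary '!': return 1 if the operand is null, 0, an empty
    # string, or the string "0"; otherwise return 0.
    if value is None or value == "" or value == "0" or value == 0:
        return 1
    return 0


def negate(value):
    # Sketch of unary '-': the result of subtracting the operand from 0.
    return 0 - int(value)


print(logical_not("One Thousand Five Hundred"))  # 0
print(logical_not("0"))                          # 1
print(3 + negate(4))                             # -1
```

The last line corresponds to the expression 3+ -4 from item 6.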
+
+
---------------------------------------------------------
Asterisk standard channel variables
---------------------------------------------------------