aboutsummaryrefslogtreecommitdiffstats
path: root/include/osmocom/core/fsm.h
AgeCommit message (Collapse)AuthorFilesLines
2021-02-10WIP: Tracingdaniel/tracingDaniel Willmann1-18/+46
Change-Id: I78945ab2bde7c93e9461dc446809f7cbd6493100
2020-12-21fsm: Add osmo_fsm_inst_broadcast_children()Harald Welte1-0/+11
This is a helper function to broadcast an event to all of the siblings of a specified FSM instance. Change-Id: I2ce398741a8672d7b7c4058d056f46e2fe7353c1
2019-11-24fsm.h: add missing include of logging.hNeels Hofmeyr1-0/+1
Change-Id: I783bf0eb40b674fb6a77f7673563fdf156975f5a
2019-10-29add osmo_fsm_set_dealloc_ctx(), to help with use-after-freeNeels Hofmeyr1-0/+1
This is a simpler and more general solution to the problem so far solved by osmo_fsm_term_safely(true). This extends use-after-free fixes to arbitrary functions, not only FSM instances during termination. The aim is to defer talloc_free() until back in the main loop. Rationale: I discovered an osmo-msc use-after-free crash from an invalid message, caused by this pattern: void event_action() { osmo_fsm_inst_dispatch(foo, FOO_EVENT, NULL); osmo_fsm_inst_dispatch(bar, BAR_EVENT, NULL); } Usually, FOO_EVENT takes successful action, and afterwards we also notify bar. However, in this particular case, FOO_EVENT caused failure, and the immediate error handling directly terminated and deallocated bar. In such a case, dispatching BAR_EVENT causes a use-after-free; this constituted a DoS vector just from sending messages that cause *any* failure during the first event dispatch. Instead, when this is enabled, we do not deallocate 'foo' until event_action() has returned back to the main loop. Test: duplicate fsm_dealloc_test.c using this, and print the number of items deallocated in each test loop, to ensure the feature works. We also verify that the deallocation safety works simply by fsm_dealloc_test.c not crashing. We should probably follow up by refusing event dispatch and state transitions for FSM instances that are terminating or already terminated: see I0adc13a1a998e953b6c850efa2761350dd07e03a. Change-Id: Ief4dba9ea587c9b4aea69993e965fbb20fb80e78
2019-06-05fsm: Allow millisecond granularity in osmo_fsm built-in timerHarald Welte1-0/+15
So far, the public API of osmo_fsm only allowed integral seconds as timeout. Let's change that to milli-seconds in order to cover more use cases. This introduces * osmo_fsm_inst_state_chg_ms() * osmo_fsm_inst_state_chg_keep_or_start_timer_ms() Which both work exactly like their previous counterparts without the _ms suffix - the only difference being that the timeout parameter is specified in milli-seconds, not in seconds. The value range for an unsigned long in milli-seconds even on a 32bit platform extends to about 48 days. This patch also removes the documentation notice about limiting the maximum value to 0x7fffffff due to time_t signed-ness. We don't use time_t but unsigned long. Change-Id: I35b330e460e80bb67376c77e997e464439ac5397
2019-04-12add identifier sanitation for setting FSM instance idsNeels Hofmeyr1-0/+1
We often compose FSM instance IDs from context information, for example placing an MSISDN string or IP:port information in the FSM instance id, using osmo_fsm_inst_update_id_f(). This fails if any characters are contained that don't pass osmo_identifier_valid(). Hence it is the task of the caller to make sure only characters allowed in an FSM id are applied. Provide API to trivially allow this by replacing illegal chars: - osmo_identifier_sanitize_buf(), with access to the same set of illegal characters defined in utils.c, - osmo_fsm_inst_update_id_f_sanitize() implicitly replaces non-identifier chars. This makes it easy to add strings like '192.168.0.1:2342' or '+4987654321' to an FSM instance id, without adding string mangling to each place that sets an id; e.g. replacing with '-' to yield '192-168-0-1:2342' or '-4987654321'. Change-Id: Ia40a6f3b2243c95fe428a080b938e11d8ab771a7
2019-04-11fsm: support graceful osmo_fsm_inst_term() cascadesNeels Hofmeyr1-0/+1
Add global flag osmo_fsm_term_safely() -- if set to true, enable the following behavior: Detect osmo_fsm_inst_term() occuring within osmo_fsm_inst_term(): - collect deallocations until the outermost osmo_fsm_inst_term() is done. - call osmo_fsm_inst_free() *after* dispatching the parent event. If a struct osmo_fsm_inst enters osmo_fsm_inst_term() while another is already within osmo_fsm_inst_term(), do not directly deallocate it, but talloc-reparent it to a separate talloc context, to be deallocated with the outermost FSM inst. The effect is that all osmo_fsm_inst freed within an osmo_fsm_inst_term() cascade will stay allocated until all osmo_fsm_inst_term() are complete and all of them will be deallocated at the same time. Mark the deferred deallocation state as __thread in an attempt to make cascaded deallocation handling threadsafe. Keep the enable/disable flag separate, so that it is global and not per-thread. The feature is showcased by fsm_dealloc_test.c: with this feature, all of those wild deallocation scenarios succeed. Make fsm_dealloc_test a normal regression test in testsuite.at. Rationale: It is difficult to gracefully handle deallocations of groups of FSM instances that reference each other. As soon as one child dispatching a cleanup event causes its parent to deallocate before fsm.c was ready for it, deallocation will hit a use-after-free. Before this patch, by using parent_term events and distinct "terminating" FSM states, parent/child FSMs can be taught to wait for all children to deallocate before deallocating the parent. But as soon as a non-child / non-parent FSM instance is involved, or actually any other cleanup() action that triggers parent FSMs or parent talloc contexts to become unused, it is near impossible to think of all possible deallocation events ricocheting, and to avoid running into freeing FSM instances that were still in the middle of osmo_fsm_inst_term(), or FSM instances to enter osmo_fsm_inst_term() more than once. This patch makes deallocation of "all possible" setups of complex cross referencing FSM instances easy to handle correctly, without running into use-after-free or double free situations, and, notably, without changing calling code. Change-Id: I8eda67540a1cd444491beb7856b9fcd0a3143b18
2019-04-11fsm: add flag to ensure osmo_fsm_inst_term() happens only onceNeels Hofmeyr1-0/+2
To prevent re-entering osmo_fsm_inst_term() twice for the same osmo_fsm_inst, add flag osmo_fsm_inst.proc.terminating. osmo_fsm_inst_term() sets this to true, or exits if it already is true. Update fsm_dealloc_test.err for illustration. It is not relevant for unit testing yet, just showing the difference. Change-Id: I0c02d76a86f90c49e0eae2f85db64704c96a7674
2019-03-07fsm: add osmo_fsm_inst_state_chg_keep_or_start_timer()Neels Hofmeyr1-0/+15
During FSM design for osmo-msc, I noticed that the current behavior that keep_timer=true doesn't guarantee a running timer can make FSM design a bit complex, especially when using osmo_tdef for timeout definitions. A desirable keep_timer=true behavior is one that keeps the previous timer running, but starts a timer if no timer is running yet. The simplest example is: a given state repeatedly transitions back to itself, but wants to set a timeout only on first entering, avoiding to restart the timeout on re-entering. Another example is a repeated transition between two or more states, where the first time we enter this group a timeout should start, but it should not restart from scratch on every transition. When using osmo_tdef timeout definitions for this, so far separate meaningless states have to be introduced that merely set a fixed timeout. To simplify, add osmo_fsm_inst_state_chg_keep_or_start_timer(), and use this in osmo_tdef_fsm_inst_state_chg() when both keep_timer == true *and* T != 0. In tdef_test.ok, the changes show that on first entering state L, the previous T=1 is now kept with a large remaining timeout. When entering state L from O, where no timer was running, this time L's T123 is started. Change-Id: Id647511a4b18e0c4de0e66fb1f35dc9adb9177db
2019-03-06represent negative T-timers as Osmocom-specific X-timersNeels Hofmeyr1-0/+3
fi->T values are int, i.e. can be negative. Do not log them as unsigned, but define a distinct timer class "Xnnnn" for negative T values: i.e. for T == -1, print "Timeout of X1" instead of "Timeout of T4294967295". The negative T timer number space is useful to distinguish freely invented timers from proper 3GPP defined T numbers. So far I was using numbers like T993210 or T9999 for invented T, but X1, X2 etc. is a better solution. This way we can make sure to not accidentally define an invented timer number that actually collides with a proper 3GPP specified timer number that the author was not aware of at the time of writing. Add OSMO_T_FMT and OSMO_T_FMT_ARGS() macros as standardized timer number print format. Use that in fsm.c, tdef_vty.c, and adjust vty tests accordingly. Mention the two timer classes in various API docs and VTY online-docs. Change-Id: I3a59457623da9309fbbda235fe18fadd1636bff6
2019-02-26log: fsm: allow logging the timeout on state changeNeels Hofmeyr1-0/+1
Add a flag that adds timeout info to osmo_fsm_inst state change logging. To not affect unit testing, make this an opt-in feature that is disabled by default -- mostly because osmo_fsm_inst_state_chg_keep_timer() will produce non-deterministic logging depending on timing (logs remaining time). Unit tests that don't verify log output and those that use fake time may also enable this feature. Do so in fsm_test.c. The idea is that in due course we will add osmo_fsm_log_timeouts(true) calls to all of our production applications' main() initialization. Change-Id: I089b81021a1a4ada1205261470da032b82d57872
2019-01-29osmo_fsm_state_name: make robust against NULL fiNeels Hofmeyr1-1/+1
Change-Id: I61d4f7dfada2763948f330745ac886405d889a12
2019-01-03add LOGPFSMSL(), LOGPFSMSLSRC()Neels Hofmeyr1-6/+30
Using an FSM instace's logging context is very useful. Sometimes it makes sense to log something on a different logging category than the FSM definition's default category. For example, an MSC conn has aspects concerning MM, CC, RR, MGCP, ..., and currently all of those log on DMM. This came up in I358cfbaf0f44f25148e8b9bafcb9257b1952b35a, where I want to log an MGCP event using a ran_conn context, and used the conn->fi->id. That of course omits context like the current conn FSM state... I remember at least one other place where I recently added logging using some fi->id as context, so it might turn out useful in various places. Change-Id: I11b182a03f5ecb6df7cd8f260757d3626c8e945d
2019-01-03LOGPFSM*: guard against fi == NULLNeels Hofmeyr1-4/+4
The LOGPFSM macros are in such wide use that they should guard against a NULL fi pointer. In case of NULL, default to subsys = DLGLOBAL, loglevel = LOGL_ERROR and state = "fi=NULL". Change-Id: I9eaf8b7e2cf1e450ae626cb2fc928862008f6233
2019-01-03add API doc for LOGPFSM* macrosNeels Hofmeyr1-0/+31
Change-Id: I3bf6500889aa58195f50a726dec0876c0c2baec3
2018-12-23define LOGPFSM fmt only once, in LOGPFSMLSRCNeels Hofmeyr1-7/+4
Instead of duplicating the fmt and args in LOGPFSML and LOGPFSMLSRC, rather make LOGPFSML invoke LOGPFSMLSRC with __FILE__ and __LINE__. This is a cosmetic preparation for more tweaks coming up. Change-Id: I2f23c57ebfdb5355919c06ac5ded7732e3b17a97
2018-08-20use __FILE__, not __BASE_FILE__Neels Hofmeyr1-5/+5
The intention was to use the file's basename, but __BASE_FILE__ means "the root file that is being parsed and contains #include statements". If we had a function using __BASE_FILE__ and that was defined in an #included file, __BASE_FILE__ would indicate the first file where the #include is, and not the file where the function is defined. __BASE_FILE__ works for us because we don't ever include function definitions that log something, so __BASE_FILE__ always coincides with __FILE__ for our logging; but still __BASE_FILE__ is semantically the wrong constant. Related: OS#2740 Change-Id: Ibc1d3746f1876ac42d6b1faf0e5f83bd2283cdcc
2018-06-08fsm: Change semantics of LOGPFSML() log-levelHarald Welte1-1/+2
The general idea about each osmo_fsm_instance having a separate log-level was to be able to selectively increase/show/enable logging for some FSM instances (e.g. of a particular subscriber) while maintaining normal logging verbosity for all other instances of the same FSM. The introduction of LOGPFSML() in Change-Id If295fdabb3f31a0fd9490d1e0df57794c75ae547 broke that idea, as it would use a compile-time log level, irrespective of the osmo_fsm_inst.log_level setting of the given instance. Let's combine the two: Use the explicit level stated at LOGPFSML(), _unless_ this instance has a higher log_level configured. This way, all FSMs should normally be created with osmo_fsm_inst.log_level == LOGL_DEBUG. At that point LOGPFSM() statements would be rendered at debug level, typically below the threshold of most logging configurations. Code that has explicit higher log levels like LOGPFSML(fi, LOGL_ERROR) would always be printed, as it is an error message. And if we now increase the osmo_fsm_inst.log_level, then even the normal LOGPFSM() statements would suddenly be logged at that higher level, selectively increasing log verbosity - like originally intended. Change-Id: I1820f04d0c6f5d5ff08eb95b8c0e88764534491a
2018-05-31add osmo_fsm_inst_state_chg_keep_timer()Neels Hofmeyr1-0/+15
Change-Id: I3c0e53b846b2208bd201ace99777f2286ea39ae8
2018-04-09add osmo_fsm_inst_update_id_f()Neels Hofmeyr1-0/+1
In the osmo-msc, I would like to set the subscr conn FSM identifier by a string format, to include the type of Complete Layer 3 that is taking place. I could each time talloc a string and free it again. This API is more convenient. From osmo_fsm_inst_update_id(), call osmo_fsm_inst_update_id_f() with "%s" (or pass NULL). Put the name updating into separate static update_name() function to clarify. Adjust the error message for erratic ID: don't say "allocate", it might be from an update. Adjust test expectation. Change-Id: I76743a7642f2449fd33350691ac8ebbf4400371d
2018-02-19fsm: allow graceful exit on FSM terminationPhilipp Maier1-0/+2
The function _osmo_fsm_inst_term() terminates all child FSMs befor it calls fi->fsm_cleanup(). This prevents the cleanup callback to perform last actions on the child FSMs (e.g. osmo_fsm_inst_unlink_parent()). - Since moving the cleanup callack to the beginning of the function would alter the termination behavior and possibly cause malfunction in already existing implementation that use OSMO fsm, a new optional callback that is called immediately at the beginning of the terminatopn process is added. Change-Id: I0fdda9fe994753f975a658c0f3fb3615949cc8bb Closes: OS#2915
2018-02-15fsm: Add a function to change the FSM instance ID laterDaniel Willmann1-0/+2
Sometimes we want to create an FSM instance before we know its name. In that case we should be able to update the id later. Change-Id: Ic216e5b11d4440f8e106a297714f4f06c1152945
2018-01-24fsm: add functions for unlinking and changing parentsPhilipp Maier1-0/+4
At the moment it is not possible to unlink a child from from its parent, nor is it possible to assign a new parent to a child FSM. - osmo_fsm_inst_unlink_parent(): Make it possible to unlink childs from a parent. - osmo_fsm_inst_change_parent(): Make it possible to change the parent of a child. Change-Id: I6d18cbd4ada903cf3720b3ad2a89fc643085beef
2017-06-23doxygen: unify use of \file across the boardNeels Hofmeyr1-5/+5
Considering the various styles and implications found in the sources, edit scores of files to follow the same API doc guidelines around the doxygen grouping and the \file tag. Many files now show a short description in the generated API doc that was so far only available as C comment. The guidelines and reasoning behind it is documented at https://osmocom.org/projects/cellular-infrastructure/wiki/Guidelines_for_API_documentation In some instances, remove file comments and add to the corresponding group instead, to be shared among several files (e.g. bitvec). Change-Id: Ifa70e77e90462b5eb2b0457c70fd25275910c72b
2017-06-23doxygen: enable AUTOBRIEF, drop \briefNeels Hofmeyr1-45/+45
Especially for short descriptions, it is annoying to have to type \brief for every single API doc. Drop all \brief and enable the AUTOBRIEF feature of doxygen, which always takes the first sentence of an API doc as the brief description. Change-Id: I11a8a821b065a128108641a2a63fb5a2b1916e87
2017-04-16osmo_fsm: Lookup functions to find FSM Instance by name or IDHarald Welte1-0/+4
Introduce two lookup helper functions to resolve a fsm_instance based on the FSM and name or ID. Also, add related test cases. Change-Id: I707f3ed2795c28a924e64adc612d378c21baa815
2017-03-01fsm: convenience: add inline osmo_fsm_inst_state_name()Neels Hofmeyr1-0/+4
Change-Id: If9a6ecc4d6e2beaf716569e9a6053d73488e860b
2017-01-07Add osmo_fsm_find_by_name() and avoid registering FSM with same nameHarald Welte1-0/+1
This addresses a FIXME in the fsm.c code: osmo_fsm_register() should fail in case a FSM with the given name already exists. Change-Id: I5fd882939859c79581eba70c14cbafd64560b583
2016-12-24fsm: factor out osmo_fsm_inst_term_children() from osmo_fsm_inst_term()Neels Hofmeyr1-0/+13
osmo_fsm_inst_term() has code for safe child removal, publish that part as osmo_fsm_inst_term_children(); also use from osmo_fsm_inst_term(). As with osmo_fsm_inst_term(), add osmo_fsm_inst_term_children() macro to pass the caller's source file and line to new _osmo_fsm_inst_term_children(). Rationale: in openbsc's VLR, I want to discard child FSMs when certain events are handled. I could keep a pointer to each one, or simply iterate all children, making the code a lot simpler in some places. (Unfortunately, the patch may be displayed subobtimally. This really only moves the children-loop to a new function, replaces it with a call to _osmo_fsm_inst_term_children(fi, OSMO_FSM_TERM_PARENT, NULL, file, line) and drops two local iterator variables. No other code changes are made, even though the diff may show large removal + addition chunks) Change-Id: I8dac1206259cbd251660f793ad023aaa1dc705a2
2016-12-24fsm: move LOGPFSMSRC and LOGPFSMLSRC to .hNeels Hofmeyr1-0/+13
LOGPFSM and LOGPFSML are in the header file, put the *SRC variants also there so users of the osmo_fsm_inst API may conveniently create own functions that log the caller's source file and line. Very useful if many action functions call the same event dispatching function, like foo_fsm_done(), and one needs to know which of the callers to debug. Change-Id: I39447b1d15237b28f88d8c5f08d82c764679dc80
2016-12-20fsm api doc: fix typo in doxygen marker '\breif'Neels Hofmeyr1-1/+1
Change-Id: I5c57e35b29d50cb409becada6b9b120ce5210ae0
2016-12-15add value strings for enum osmo_fsm_term_cause and use for loggingNeels Hofmeyr1-0/+7
Change-Id: Iaf63d3cadb0d46bf454e3314ebb439240cafd834
2016-12-14fsm: log caller's source for events and state changes, not fsm.c linesNeels Hofmeyr1-6/+35
When looking at log output, it is not interesting to see that a state transition's petty details are implemented in fsm.c. Rather log the *caller's* source file and line that caused an event, state change and cascading events. To that end, introduce LOGPSRC() absorbing the guts of LOGP(), to be able to explicitly pass the source file and line information. Prepend an underscore to the function names of osmo_fsm_inst_state_chg(), osmo_fsm_inst_dispatch() and osmo_fsm_inst_term(), and add file and line arguments to them. Provide the previous names as macros that insert the caller's __BASE_FILE__ and __LINE__ constants for the new arguments. Hence no calling code needs to be changed. In fsm.c, add LOGPFSMSRC to call LOGPSRC, and add LOGPFSMLSRC, and use them in above _osmo_fsm_inst_* functions. In addition, in _osmo_fsm_inst_term(), pass the caller's source file and line on to nested event dispatches, so showing where a cascade originated from. Change-Id: Iae72aba7bbf99e19dd584ccabea5867210650dcd
2016-12-14fsm: add LOGPFSML to pass explicit logging levelNeels Hofmeyr1-2/+5
Provide one central LOGPFSML to print FSM information, take the FSM logging subsystem from the FSM instance but use an explicitly provided log level instead of the FSM's default level. Use to replace some, essentially, duplications of the LOGPFSM macro. In effect, the fsm_test's expected error changes, since the previous code dup for logging events used round braces to indicate the fi's state, while the central macro uses curly braces. Change-Id: If295fdabb3f31a0fd9490d1e0df57794c75ae547
2016-12-09doc: fsm timer_cb: explain return valueNeels Hofmeyr1-1/+2
Change-Id: Ic6fbe95737862ed8b8de78058989c8b2ae330006
2016-11-02Add osmo_fsm_unregister() to headerMax1-1/+1
Previously function was defined but not exposed so there were a way to register FSM but no way to unregister it. Change-Id: I2e749d896009784b77d6d5952fcc38e1c131db2b
2016-06-27fsm: Introduce default time-out handlingHarald Welte1-1/+3
If a FSM doesn't specify any timer_cb, simply terminate the FSM by default on time-out. This is a reasonable default for most cases, and avoids copy+pasting a one-line timer_cb function in every FSM. Also, even if there is a timer_cb, let it have a return value to decide if the core should terminate after return from timer_cb or not. Change-Id: I0461a9593bfb729c82b7d1d1cf9f30b1079d0212
2016-06-16Add Finite State Machine abstraction codeHarald Welte1-0/+135
This code is supposed to formalize some of the state machine handling in Osmocom code. Change-Id: I0b0965a912598c1f6b84042a99fea9d522642466 Reviewed-on: https://gerrit.osmocom.org/163 Tested-by: Jenkins Builder Reviewed-by: Harald Welte <laforge@gnumonks.org>