author | Pau Espin Pedrol <pespin@sysmocom.de> | 2019-05-24 19:58:20 +0200 |
---|---|---|
committer | pespin <pespin@sysmocom.de> | 2019-06-11 14:28:17 +0000 |
commit | 6a305feb0f7bdcae9d0552e5d2bca9c48ec2e63f (patch) | |
tree | 207ee632688a948568c6b31867b95baea1cafc3c /doc | |
parent | bde55afd29fc9aae10eb11f6515821afa39b772d (diff) |
Add VTY commands to set error ctr thresholds
osmo-trx will validate over time that those thresholds are not reached.
If they are reached, osmo-trx will die. As a result, osmo-bts-trx will
notice and will end up notifying the BSC about it (for instance because
it will also restart its process).
For instance:
"""
ctr-error-threshold rx_drop_events 2 per-minute
ctr-error-threshold rx_underruns 10 per-second
"""
In the cases above, osmo-trx will die if rate_ctr rx_drop_events goes
higher than 2 per minute, and it will die too if rx_underruns goes
higher than 10 per second.
Change-Id: I4bcf44dbf064e2e86dfc3b8a2ad18fea76fbd51a
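
The rule described in the message above can be sketched in Python as follows. This is only an illustration of the semantics, not the osmo-trx implementation: the names `Threshold`, `first_reached` and the `inclusive` flag are hypothetical, and the per-interval counts are assumed to be supplied by the caller. The commit message says "higher than", while the manual text added by this commit says "or more", so the comparison is left selectable rather than asserted.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Threshold:
    """One configured threshold (hypothetical container, not an osmo-trx type)."""
    event: str      # e.g. "rx_drop_events"
    value: int      # configured threshold value
    interval: str   # e.g. "per-minute"


def first_reached(
    thresholds: Iterable[Threshold],
    counts: Dict[Tuple[str, str], int],
    inclusive: bool = False,
) -> Optional[Threshold]:
    """Return the first threshold reached, or None if all are within limits.

    `counts` maps (event, interval) to the number of events seen in the most
    recent interval of that length. With inclusive=False a threshold is
    reached only when the count is strictly higher than the value (assumption
    taken from the commit message wording).
    """
    for t in thresholds:
        seen = counts.get((t.event, t.interval), 0)
        if seen > t.value or (inclusive and seen == t.value):
            return t
    return None
```

With the two thresholds from the example, a count of 3 for `("rx_drop_events", "per-minute")` returns the first threshold, at which point the process would be expected to exit.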
Diffstat (limited to 'doc')
-rw-r--r-- | doc/manuals/chapters/counters.adoc | 59 |
-rw-r--r-- | doc/manuals/chapters/counters_generated.adoc | 18 |
-rw-r--r-- | doc/manuals/vty/trx_vty_reference.xml | 31 |
3 files changed, 104 insertions, 4 deletions
diff --git a/doc/manuals/chapters/counters.adoc b/doc/manuals/chapters/counters.adoc
index 7fbb10c..79d1962 100644
--- a/doc/manuals/chapters/counters.adoc
+++ b/doc/manuals/chapters/counters.adoc
@@ -2,3 +2,62 @@
 == Counters
 
 include::./counters_generated.adoc[]
+
+=== Rate Counter Configurable Error Thresholds
+
+Some rate counters such as overruns, underruns and dropped packets indicate
+events that can really harm correct operation of the BTS served by OsmoTRX,
+specially if they happen frequently. OsmoTRX is in most cases (depending on
+maturity of device driver) prepared to dodge the temporary failure and keep
+running and providing service.
+
+Still, it is sometimes important for this kind of events to not go unnoticed by
+the operator, since they may indicate issues regarding the set up that may
+require operator intervention to fix it.
+
+For instance, frequent dropped packets could indicate SDR HW/FW/power errors, or
+a faulty connection against the host running OsmoTRX.
+
+They can also indicate issues on the host running OsmoTRX itself: For instance,
+OsmoTRX may not be running under a high enough priority (hence other processes
+eventually battling for resources with it), or that simply the HW running
+OsmoTRX is not powerful enough to accomplish all work in a timely fashion all
+the time.
+
+As a result, OsmoTRX can be configured to exit the process upon certain
+conditions being met, in order to let osmoBTS notice something is wrong and thus
+announcing issues through alarms to the network, where the operator can then
+investigate the issue by looking at OsmoTRX logs.
+
+These conditions are configured by means of introducing rate counter thresholds
+in the VTY. The OsmoTRX user can provide those threshold commands either in the
+VTY cfg file read by OsmoTRX process during startup, or by adding/removing them
+dynamically through the VTY interactive console.
+
+Each threshold cmd states an event (a rate counter type), a value and an time
+interval (a second, a minute, an hour or a day). A threshold will be reached
+(and OsmoTRX stopped) if its value grows bigger than the configured threshold
+value over the configured time interval. This is the syntax used to manage rate
+counter thresholds:
+
+`(no) ctr-error-threshold <EVENT> <VALUE> <INTERVAL>`
+
+If several rate counter thresholds are set, then all of them are checked over
+time and the first one reached will stop OsmoTRX.
+
+.Example: rate counter threshold configuration (VTY .cfg file)
+----
+trx
+ ctr-error-threshold rx_drop_events 2 per-minute <1>
+ ctr-error-threshold rx_drop_samples 800 per-second <2>
+----
+<1> Stop OsmoTRX if dropped event (any amount of samples) during Rx was detected 2 times or more during a minute.
+<2> Stop OsmoTRX if 800 or more samples were detected during Rx to be dropped by the HW during a second.
+
+.Example: rate counter threshold configuration (VTY interactive)
+----
+OsmoTRX(config-trx)# ctr-error-threshold tx_underruns 3 per-hour <1>
+OsmoTRX(config-trx)# no ctr-error-threshold tx_underruns 3 per-hour <2>
+----
+<1> Stop OsmoTRX if 3 or more underruns were detected during Tx over the last hour
+<2> Remove previously set rate counter threshold
diff --git a/doc/manuals/chapters/counters_generated.adoc b/doc/manuals/chapters/counters_generated.adoc
index b40dc37..6955b18 100644
--- a/doc/manuals/chapters/counters_generated.adoc
+++ b/doc/manuals/chapters/counters_generated.adoc
@@ -1,7 +1,17 @@
 // autogenerated by show asciidoc counters
-These counters and their description based on OsmoTRX 0.2.0.61-408f (OsmoTRX).
+These counters and their description based on OsmoTRX 1.0.0.43-3f7c0 (OsmoTRX).
+
+=== Rate Counters
 
 // generating tables for rate_ctr_group
-// generating tables for osmo_stat_items
-// generating tables for osmo_counters
-// there are no ungrouped osmo_counters
+// rate_ctr_group table osmo-trx statistics
+.trx:chan - osmo-trx statistics
+[options="header"]
+|===
+| Name | Reference | Description
+| device:rx_underruns | <<trx:chan_device:rx_underruns>> | Number of Rx underruns
+| device:rx_overruns | <<trx:chan_device:rx_overruns>> | Number of Rx overruns
+| device:tx_underruns | <<trx:chan_device:tx_underruns>> | Number of Tx underruns
+| device:rx_drop_events | <<trx:chan_device:rx_drop_events>> | Number of times Rx samples were dropped by HW
+| device:rx_drop_samples | <<trx:chan_device:rx_drop_samples>> | Number of Rx samples dropped by HW
+|===
diff --git a/doc/manuals/vty/trx_vty_reference.xml b/doc/manuals/vty/trx_vty_reference.xml
index d6cd15d..e448a46 100644
--- a/doc/manuals/vty/trx_vty_reference.xml
+++ b/doc/manuals/vty/trx_vty_reference.xml
@@ -1253,6 +1253,37 @@
         <param name='dummy' doc='Dummy method' />
       </params>
     </command>
+    <command id='ctr-error-threshold (rx_underruns|rx_overruns|tx_underruns|rx_drop_events|rx_drop_samples) <0-65535> (per-second|per-minute|per-hour|per-day)'>
+      <params>
+        <param name='ctr-error-threshold' doc='Threshold rate for error counter' />
+        <param name='rx_underruns' doc='Set threshold value for rate_ctr device:rx_underruns' />
+        <param name='rx_overruns' doc='Set threshold value for rate_ctr device:rx_overruns' />
+        <param name='tx_underruns' doc='Set threshold value for rate_ctr device:tx_underruns' />
+        <param name='rx_drop_events' doc='Set threshold value for rate_ctr device:rx_drop_events' />
+        <param name='rx_drop_samples' doc='Set threshold value for rate_ctr device:rx_drop_samples' />
+        <param name='<0-65535>' doc='Value to set for threshold' />
+        <param name='per-second' doc='Threshold value sampled per-second' />
+        <param name='per-minute' doc='Threshold value sampled per-minute' />
+        <param name='per-hour' doc='Threshold value sampled per-hour' />
+        <param name='per-day' doc='Threshold value sampled per-day' />
+      </params>
+    </command>
+    <command id='no ctr-error-threshold (rx_underruns|rx_overruns|tx_underruns|rx_drop_events|rx_drop_samples) <0-65535> (per-second|per-minute|per-hour|per-day)'>
+      <params>
+        <param name='no' doc='Negate a command or set its defaults' />
+        <param name='ctr-error-threshold' doc='Threshold rate for error counter' />
+        <param name='rx_underruns' doc='Set threshold value for rate_ctr device:rx_underruns' />
+        <param name='rx_overruns' doc='Set threshold value for rate_ctr device:rx_overruns' />
+        <param name='tx_underruns' doc='Set threshold value for rate_ctr device:tx_underruns' />
+        <param name='rx_drop_events' doc='Set threshold value for rate_ctr device:rx_drop_events' />
+        <param name='rx_drop_samples' doc='Set threshold value for rate_ctr device:rx_drop_samples' />
+        <param name='<0-65535>' doc='Value to set for threshold' />
+        <param name='per-second' doc='Threshold value sampled per-second' />
+        <param name='per-minute' doc='Threshold value sampled per-minute' />
+        <param name='per-hour' doc='Threshold value sampled per-hour' />
+        <param name='per-day' doc='Threshold value sampled per-day' />
+      </params>
+    </command>
     <command id='chan <0-100>'>
       <params>
         <param name='chan' doc='Select a channel to configure' />
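
The command grammar in the VTY reference above can be checked against a config line with a short Python sketch. It is only an illustration for linting a .cfg file outside of OsmoTRX: the function name `parse_threshold_cmd` and its return shape are hypothetical, and the real VTY parser (which, for example, may accept abbreviated tokens) is not modelled here. The event names, the 0-65535 range and the interval keywords are copied from the command definition in this diff.

```python
import re

# Allowed tokens, copied from the <command id='ctr-error-threshold ...'> entry.
EVENTS = ("rx_underruns", "rx_overruns", "tx_underruns",
          "rx_drop_events", "rx_drop_samples")
INTERVALS = ("per-second", "per-minute", "per-hour", "per-day")

_CMD_RE = re.compile(
    r"^\s*(?P<no>no\s+)?ctr-error-threshold\s+"
    r"(?P<event>\S+)\s+(?P<value>\d+)\s+(?P<interval>\S+)\s*$"
)


def parse_threshold_cmd(line):
    """Parse one threshold line into (is_removal, event, value, interval).

    Raises ValueError if the line does not follow the documented syntax.
    """
    m = _CMD_RE.match(line)
    if m is None:
        raise ValueError("not a ctr-error-threshold command: %r" % line)
    event, value, interval = m["event"], int(m["value"]), m["interval"]
    if event not in EVENTS:
        raise ValueError("unknown event: %s" % event)
    if not 0 <= value <= 65535:
        raise ValueError("value out of range 0-65535: %d" % value)
    if interval not in INTERVALS:
        raise ValueError("unknown interval: %s" % interval)
    return (m["no"] is not None, event, value, interval)
```

A line such as ` ctr-error-threshold rx_drop_events 2 per-minute` parses to a set operation, the same line prefixed with `no` parses to a removal, and an interval written as plain `second` is rejected.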