author: Pau Espin Pedrol <pespin@sysmocom.de> 2019-05-24 19:58:20 +0200
committer: pespin <pespin@sysmocom.de> 2019-06-11 14:28:17 +0000
commit: 6a305feb0f7bdcae9d0552e5d2bca9c48ec2e63f
tree: 207ee632688a948568c6b31867b95baea1cafc3c /doc
parent: bde55afd29fc9aae10eb11f6515821afa39b772d
Add VTY commands to set error ctr thresholds

osmo-trx will validate over time that those thresholds are not reached. If
they are reached, osmo-trx will die. As a result, osmo-bts-trx will notice and
will end up notifying the BSC about it (for instance because it will also
restart its process).

For instance:
"""
ctr-error-threshold rx_drop_events 2 per-minute
ctr-error-threshold rx_underruns 10 per-second
"""

In the cases above, osmo-trx will die if rate_ctr rx_drop_events goes higher
than 2 per minute, and it will also die if rx_underruns goes higher than 10
per second.

Change-Id: I4bcf44dbf064e2e86dfc3b8a2ad18fea76fbd51a
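
The threshold semantics described in this commit message can be sketched as follows. This is only an illustration in Python, not osmo-trx code: the names `Threshold`, `parse_threshold` and `first_reached` are hypothetical, the event and interval names are taken from the VTY reference added by this commit, and the comparison is assumed to be "strictly higher than", following the wording of the commit message. The manual's callouts read "N or more", so the actual behaviour at exactly the threshold value may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Event and interval names as listed in the VTY reference of this commit.
EVENTS = ("rx_underruns", "rx_overruns", "tx_underruns",
          "rx_drop_events", "rx_drop_samples")
INTERVALS = ("per-second", "per-minute", "per-hour", "per-day")


@dataclass(frozen=True)
class Threshold:
    """One configured threshold (hypothetical container, not an osmo-trx type)."""
    event: str
    value: int
    interval: str


def parse_threshold(line: str) -> Threshold:
    """Parse 'ctr-error-threshold <EVENT> <VALUE> <INTERVAL>' (hypothetical helper)."""
    parts = line.split()
    if len(parts) != 4 or parts[0] != "ctr-error-threshold":
        raise ValueError(f"not a ctr-error-threshold command: {line!r}")
    _, event, value, interval = parts
    if event not in EVENTS:
        raise ValueError(f"unknown event: {event!r}")
    if interval not in INTERVALS:
        raise ValueError(f"unknown interval: {interval!r}")
    # The VTY reference documents the value range as <0-65535>.
    if not (value.isascii() and value.isdigit()) or int(value) > 65535:
        raise ValueError(f"value out of range 0-65535: {value!r}")
    return Threshold(event, int(value), interval)


def first_reached(thresholds: List[Threshold],
                  rates: Dict[Tuple[str, str], int]) -> Optional[Threshold]:
    """Return the first threshold whose observed rate is higher than its value.

    `rates` maps (event, interval) to the count observed over that interval.
    The strict '>' comparison is an assumption (see the lead-in above).
    """
    for thr in thresholds:
        if rates.get((thr.event, thr.interval), 0) > thr.value:
            return thr
    return None


thresholds = [
    parse_threshold("ctr-error-threshold rx_drop_events 2 per-minute"),
    parse_threshold("ctr-error-threshold rx_underruns 10 per-second"),
]
# Made-up observations, for illustration only.
rates = {("rx_drop_events", "per-minute"): 3, ("rx_underruns", "per-second"): 4}
hit = first_reached(thresholds, rates)
if hit is not None:
    print(f"threshold reached: {hit.event} {hit.value} {hit.interval}")
```

In this sketch every configured threshold is checked in order and the first one reached is reported, which mirrors the "first one reached will stop OsmoTRX" rule stated in the manual text added by this commit.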
Diffstat (limited to 'doc')
-rw-r--r-- doc/manuals/chapters/counters.adoc 59
-rw-r--r-- doc/manuals/chapters/counters_generated.adoc 18
-rw-r--r-- doc/manuals/vty/trx_vty_reference.xml 31
3 files changed, 104 insertions, 4 deletions
diff --git a/doc/manuals/chapters/counters.adoc b/doc/manuals/chapters/counters.adoc
index 7fbb10c..79d1962 100644
--- a/doc/manuals/chapters/counters.adoc
+++ b/doc/manuals/chapters/counters.adoc
@@ -2,3 +2,62 @@
== Counters
include::./counters_generated.adoc[]
+
+=== Rate Counter Configurable Error Thresholds
+
+Some rate counters, such as overruns, underruns and dropped packets, indicate
+events that can seriously harm correct operation of the BTS served by OsmoTRX,
+especially if they happen frequently. In most cases (depending on the maturity
+of the device driver), OsmoTRX can ride out a temporary failure and keep
+running and providing service.
+
+Still, it is sometimes important that these kinds of events do not go unnoticed
+by the operator, since they may indicate issues with the setup that require
+operator intervention to fix.
+
+For instance, frequent dropped packets could indicate SDR HW/FW/power errors, or
+a faulty connection to the host running OsmoTRX.
+
+They can also indicate issues on the host running OsmoTRX itself: for instance,
+OsmoTRX may not be running at a high enough priority (so other processes end
+up competing with it for resources), or the HW running OsmoTRX may simply not
+be powerful enough to accomplish all work in a timely fashion all the
+time.
+
+As a result, OsmoTRX can be configured to exit the process when certain
+conditions are met, so that OsmoBTS notices something is wrong and announces
+the issue through alarms to the network, where the operator can then
+investigate by looking at the OsmoTRX logs.
+
+These conditions are configured by setting rate counter thresholds
+in the VTY. The OsmoTRX user can provide those threshold commands either in the
+VTY cfg file read by the OsmoTRX process during startup, or by adding/removing
+them dynamically through the VTY interactive console.
+
+Each threshold command states an event (a rate counter type), a value and a
+time interval (a second, a minute, an hour or a day). A threshold is reached
+(and OsmoTRX stopped) if the counter's value grows bigger than the configured
+threshold value over the configured time interval. This is the syntax used to
+manage rate counter thresholds:
+
+`(no) ctr-error-threshold <EVENT> <VALUE> <INTERVAL>`
+
+If several rate counter thresholds are set, then all of them are checked over
+time and the first one reached will stop OsmoTRX.
+
+.Example: rate counter threshold configuration (VTY .cfg file)
+----
+trx
+ ctr-error-threshold rx_drop_events 2 per-minute <1>
+ ctr-error-threshold rx_drop_samples 800 per-second <2>
+----
+<1> Stop OsmoTRX if a drop event (of any number of samples) was detected during Rx 2 or more times within a minute.
+<2> Stop OsmoTRX if 800 or more Rx samples were detected as dropped by the HW within a second.
+
+.Example: rate counter threshold configuration (VTY interactive)
+----
+OsmoTRX(config-trx)# ctr-error-threshold tx_underruns 3 per-hour <1>
+OsmoTRX(config-trx)# no ctr-error-threshold tx_underruns 3 per-hour <2>
+----
+<1> Stop OsmoTRX if 3 or more underruns were detected during Tx over the last hour.
+<2> Remove the previously set rate counter threshold.
diff --git a/doc/manuals/chapters/counters_generated.adoc b/doc/manuals/chapters/counters_generated.adoc
index b40dc37..6955b18 100644
--- a/doc/manuals/chapters/counters_generated.adoc
+++ b/doc/manuals/chapters/counters_generated.adoc
@@ -1,7 +1,17 @@
// autogenerated by show asciidoc counters
-These counters and their description based on OsmoTRX 0.2.0.61-408f (OsmoTRX).
+These counters and their description based on OsmoTRX 1.0.0.43-3f7c0 (OsmoTRX).
+
+=== Rate Counters
// generating tables for rate_ctr_group
-// generating tables for osmo_stat_items
-// generating tables for osmo_counters
-// there are no ungrouped osmo_counters
+// rate_ctr_group table osmo-trx statistics
+.trx:chan - osmo-trx statistics
+[options="header"]
+|===
+| Name | Reference | Description
+| device:rx_underruns | <<trx:chan_device:rx_underruns>> | Number of Rx underruns
+| device:rx_overruns | <<trx:chan_device:rx_overruns>> | Number of Rx overruns
+| device:tx_underruns | <<trx:chan_device:tx_underruns>> | Number of Tx underruns
+| device:rx_drop_events | <<trx:chan_device:rx_drop_events>> | Number of times Rx samples were dropped by HW
+| device:rx_drop_samples | <<trx:chan_device:rx_drop_samples>> | Number of Rx samples dropped by HW
+|===
diff --git a/doc/manuals/vty/trx_vty_reference.xml b/doc/manuals/vty/trx_vty_reference.xml
index d6cd15d..e448a46 100644
--- a/doc/manuals/vty/trx_vty_reference.xml
+++ b/doc/manuals/vty/trx_vty_reference.xml
@@ -1253,6 +1253,37 @@
<param name='dummy' doc='Dummy method' />
</params>
</command>
+ <command id='ctr-error-threshold (rx_underruns|rx_overruns|tx_underruns|rx_drop_events|rx_drop_samples) &lt;0-65535&gt; (per-second|per-minute|per-hour|per-day)'>
+ <params>
+ <param name='ctr-error-threshold' doc='Threshold rate for error counter' />
+ <param name='rx_underruns' doc='Set threshold value for rate_ctr device:rx_underruns' />
+ <param name='rx_overruns' doc='Set threshold value for rate_ctr device:rx_overruns' />
+ <param name='tx_underruns' doc='Set threshold value for rate_ctr device:tx_underruns' />
+ <param name='rx_drop_events' doc='Set threshold value for rate_ctr device:rx_drop_events' />
+ <param name='rx_drop_samples' doc='Set threshold value for rate_ctr device:rx_drop_samples' />
+ <param name='&lt;0-65535&gt;' doc='Value to set for threshold' />
+ <param name='per-second' doc='Threshold value sampled per-second' />
+ <param name='per-minute' doc='Threshold value sampled per-minute' />
+ <param name='per-hour' doc='Threshold value sampled per-hour' />
+ <param name='per-day' doc='Threshold value sampled per-day' />
+ </params>
+ </command>
+ <command id='no ctr-error-threshold (rx_underruns|rx_overruns|tx_underruns|rx_drop_events|rx_drop_samples) &lt;0-65535&gt; (per-second|per-minute|per-hour|per-day)'>
+ <params>
+ <param name='no' doc='Negate a command or set its defaults' />
+ <param name='ctr-error-threshold' doc='Threshold rate for error counter' />
+ <param name='rx_underruns' doc='Set threshold value for rate_ctr device:rx_underruns' />
+ <param name='rx_overruns' doc='Set threshold value for rate_ctr device:rx_overruns' />
+ <param name='tx_underruns' doc='Set threshold value for rate_ctr device:tx_underruns' />
+ <param name='rx_drop_events' doc='Set threshold value for rate_ctr device:rx_drop_events' />
+ <param name='rx_drop_samples' doc='Set threshold value for rate_ctr device:rx_drop_samples' />
+ <param name='&lt;0-65535&gt;' doc='Value to set for threshold' />
+ <param name='per-second' doc='Threshold value sampled per-second' />
+ <param name='per-minute' doc='Threshold value sampled per-minute' />
+ <param name='per-hour' doc='Threshold value sampled per-hour' />
+ <param name='per-day' doc='Threshold value sampled per-day' />
+ </params>
+ </command>
<command id='chan &lt;0-100&gt;'>
<params>
<param name='chan' doc='Select a channel to configure' />