
Credentials and Access
SSH Access
Other Access
- Required credentials are in the "GÉANT Dashboard v3" LastPass folder
Problems
New alarms are not appearing in the GUI
Possible Cause: Traps not being processed by RabbitMQ
Analysis
- Open one or more of the following RabbitMQ management consoles. (Credentials are in the "GÉANT Dashboard v3" LastPass folder)
- Scroll down to the "Nodes" section
- The table should contain 3 rows, and all status icons should be green. (A red bar currently shows a deprecated node; it will be removed when possible.) The expected node names are:
- rabbit@prod-noc-alarms01
- rabbit@prod-noc-alarms02
- rabbit@prod-noc-alarms03
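The same check can be scripted against the RabbitMQ management HTTP API, whose `/api/nodes` endpoint returns one JSON object per cluster node with `name` and `running` fields. The following is a minimal sketch, not part of the documented procedure: the function names are illustrative, and the base URL, user, and password are placeholders to be taken from the LastPass folder mentioned above.

```python
import base64
import json
import urllib.request

# Node names listed in this runbook.
EXPECTED_NODES = {
    "rabbit@prod-noc-alarms01",
    "rabbit@prod-noc-alarms02",
    "rabbit@prod-noc-alarms03",
}


def check_nodes(nodes, expected=EXPECTED_NODES):
    """Return (missing, not_running) for a parsed /api/nodes response.

    Nodes that are not in `expected` (for example the deprecated node)
    are ignored.
    """
    seen = {node["name"]: bool(node.get("running", False)) for node in nodes}
    missing = sorted(expected - set(seen))
    not_running = sorted(
        name for name in expected & set(seen) if not seen[name]
    )
    return missing, not_running


def fetch_nodes(base_url, user, password):
    """Fetch the node list from a management console (placeholder arguments)."""
    request = urllib.request.Request(f"{base_url}/api/nodes")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling `check_nodes(fetch_nodes(base_url, user, password))` with the address of one of the management consoles returns two lists; if both are empty, all three expected nodes are present and running from that console's point of view.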
Solution
- If one of the 3 nodes is failing or missing from the list, log in to the failing server via SSH and restart the RabbitMQ service:
- After a minute or two, the management consoles should show that the cluster is restored.
Solution #2
- If all 3 nodes appear in the list, but the reported state of the nodes differs between their respective administration GUIs
Possible Cause: Collectors have stopped working
Analysis
- Open this Correlation status dashboard
- Scroll down to the "Collectors" panel
- Check that the graph shows a nonzero rate of traps being processed
Solution
- On each of the following servers:
- prod-noc-alarms01.geant.org
- prod-noc-alarms02.geant.org
- prod-noc-alarms03.geant.org
- Log in via SSH and execute the following command:
Possible Cause: Correlators have stopped working
Analysis
- Open this Correlation status dashboard
- Scroll down to the "Collectors" panel
- Check that the graph shows the leader collector processing a non-zero rate of traps. The current leader can be identified by the FORWARDER with state 2 in the "Raft States" panel.
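The leader check above can be expressed as a small piece of logic. This is a sketch under stated assumptions: the dashboard's data source is not described here, so the two dictionaries (Raft state per FORWARDER, and trap rate per collector, both keyed by the same hypothetical instance name) stand in for values read off the "Raft States" and "Collectors" panels.

```python
# Per this runbook, the leader is the FORWARDER reporting Raft state 2.
LEADER_STATE = 2


def find_leaders(raft_states):
    """Return the names whose Raft state marks them as leader."""
    return sorted(
        name for name, state in raft_states.items() if state == LEADER_STATE
    )


def leader_is_processing(raft_states, trap_rates):
    """True only if exactly one leader exists and its trap rate is above zero.

    Assumes (hypothetically) that `trap_rates` uses the same keys as
    `raft_states`; a leader with no recorded rate counts as not processing.
    """
    leaders = find_leaders(raft_states)
    if len(leaders) != 1:
        return False
    return trap_rates.get(leaders[0], 0.0) > 0.0
```

Treating zero or multiple leaders as a failure is a design choice of this sketch: in either case the panel does not identify a single instance whose rate can be checked.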
Solution
- On each of the following servers:
- prod-noc-alarms01.geant.org
- prod-noc-alarms02.geant.org
- prod-noc-alarms03.geant.org
- Log in via SSH and execute the following command:
In case production operation isn't restored quickly ...
If the production environment cannot be recovered quickly, temporarily use the UAT environment instead. The UAT environment continually processes the same traps as production and uses the same IMS instance, so it should be usable while production is being restored. To access the UAT environment GUI, navigate to one of the following:
