
There's a "rule of monitors" that SCOM users may be aware of: if an alert was generated by a monitor (as opposed to a rule that generates alerts), do not close it; let it close itself. If you do close the alert, you won't be notified of the issue again until the monitor returns to a healthy state and then goes back into an unhealthy state.
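To make the rule concrete, here is a minimal toy model in Python. It is not the SCOM API; `ToyMonitor`, `evaluate` and `close_alert_manually` are hypothetical names, and the only assumption is the behaviour described above: an alert is raised on a transition into an unhealthy state, not on every evaluation.

```python
from dataclasses import dataclass

HEALTHY, CRITICAL = "healthy", "critical"


@dataclass
class ToyMonitor:
    """Toy model of a state-driven monitor (hypothetical, not the SCOM API)."""
    state: str = HEALTHY
    open_alert: bool = False
    alerts_raised: int = 0

    def evaluate(self, observed: str) -> None:
        # Alerts follow state *changes*; an unchanged state does nothing.
        if observed != self.state:
            if observed == CRITICAL:
                self.open_alert = True
                self.alerts_raised += 1
            else:
                # On recovery the monitor closes its own alert.
                self.open_alert = False
            self.state = observed

    def close_alert_manually(self) -> None:
        # Closing the alert leaves the monitor's state untouched.
        self.open_alert = False


monitor = ToyMonitor()
monitor.evaluate(CRITICAL)       # disk fills up: first alert
monitor.close_alert_manually()   # someone closes it by hand
monitor.evaluate(CRITICAL)       # still full, no state change, no new alert
silent = (monitor.state, monitor.open_alert, monitor.alerts_raised)

monitor.evaluate(HEALTHY)        # problem goes away
monitor.evaluate(CRITICAL)       # a fresh transition raises a second alert
print(silent, monitor.alerts_raised)
```

The `silent` tuple captures the problem case: the monitor is still critical, but no alert is open and no new one has been raised.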

If someone closes an alert for disk space (or if it occurs during maintenance, or it's old and gets auto-closed), we don't find out that the server still has space issues until the machine has problems. I'd like a way to regenerate alerts for monitors that are in an unhealthy state.

So far I've looked at a PowerShell solution called GreenMachine, but it doesn't seem to work very well and is very slow.

What solutions have people found to this problem?

reconbot

2 Answers


An easier solution, though with a higher impact: put the object generating the critical health state into maintenance mode for 15 minutes. The health state will change to "not monitored" and will be re-evaluated once the object leaves maintenance mode.

This way you will regenerate an alert. Bear in mind, though, that it will have the same effect for every rule and monitor running on that object.
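Why this works, and why it affects everything on the object, can be sketched with a small toy model in Python. This is not SCOM code; `ToyMonitor` and `cycle_maintenance` are hypothetical names, and the model assumes only what is described above: maintenance mode moves each monitor to "not monitored", and leaving it triggers a re-evaluation.

```python
HEALTHY, CRITICAL, NOT_MONITORED = "healthy", "critical", "not monitored"


class ToyMonitor:
    """Toy monitor that raises an alert when it enters the critical state."""

    def __init__(self, name):
        self.name = name
        self.state = HEALTHY
        self.alerts_raised = 0

    def evaluate(self, observed):
        # Only a change *into* critical raises an alert.
        if observed == CRITICAL and self.state != CRITICAL:
            self.alerts_raised += 1
        self.state = observed


def cycle_maintenance(monitors, observed_now):
    """Put every monitor on the object into maintenance, then re-evaluate."""
    for m in monitors:
        m.state = NOT_MONITORED
    for m in monitors:
        m.evaluate(observed_now[m.name])


disk = ToyMonitor("disk space")
cpu = ToyMonitor("cpu")

disk.evaluate(CRITICAL)   # first disk alert, later closed by hand
disk.evaluate(CRITICAL)   # still critical, so nothing new is raised
cpu.evaluate(CRITICAL)    # unrelated cpu alert on the same object

cycle_maintenance([disk, cpu], {"disk space": CRITICAL, "cpu": CRITICAL})
print(disk.alerts_raised, cpu.alerts_raised)
```

Both counters go up after the maintenance cycle: the disk alert is regenerated as intended, but so is the alert for the unrelated cpu monitor on the same object.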

David Biot
  • +1, Works perfectly and also doesn't need 15 minutes (5 are enough for the objects to go into "not monitored" state). – Massimo Oct 05 '12 at 12:04

You could hack together a solution with PowerShell that resets the monitoring data of a monitor, i.e. one that calls the ResetMonitoringState method of the monitoring object, as the GreenMachine link you supplied does. If the monitor is reset in Health Explorer or from a PowerShell script, and the previous alert was closed, a new alert will be generated when the monitor reruns and detects that there is still a Critical or Warning state.

While SCOM has alerts, it is state-driven, not alert-driven like MOM. Some converted management packs still behave in an alert-driven way. However, in a management pack designed for SCOM (such as the Windows platform ones), alerts almost always come from monitors (state observers).

This change ultimately means that how you monitor needs to be more focused on state, not alerts. State can be viewed in Health Explorer or in state views such as Windows Computers and many others. New state views can be created for specific cases. Also note that raising an alert for a warning or critical state is optional: it is set when a monitor is created and can be overridden.

My suggestion is that you don't look for a way to regenerate alerts, but instead change how monitoring is done in your organisation.

Bernie White
  • That's... presumptuous. While I agree with your base point that stateful monitoring is more useful than alert-driven monitoring, and I do understand that this is how SCOM works, this doesn't help solve my problem. The alerts generated when a monitor goes into a bad state are sent to a different system. The state of the monitor updates the alert, and when the monitor returns to a healthy state it will clear the alert. The zen of statefulness is preserved. My issue is with the integration, as sometimes the state is lost and I need to resend it. – reconbot May 17 '12 at 15:21
  • @wizard Ah ok, thanks. For clarification, you have a secondary system that "tickets?" alerts. When an alert is closed, it is closed in the secondary system, but you still have an outstanding issue you don't know about. You want to reopen alerts so that your secondary system can re-raise them. Is that correct? – Bernie White May 20 '12 at 21:39
  • Not exactly, but for the sake of argument we can go with that. The basic gist is correct. – reconbot May 21 '12 at 02:28