Ecosystem Spotlight: Event Enrichment HQ
The Event Enrichment Platform (EEP) helps Operations teams accelerate downtime resolution by eliminating noise and improving access to remediation information.
EEP integrates events from Zenoss, normalizes the events into a common event format, and provides Zenoss users with easy ways to classify, suppress, and enrich their alerts.
Events arriving in the middle of the night, whether via email or an escalation platform like PagerDuty, typically arrive without context. The person or team responsible for the alert source, the impact on the business, and the initial steps for triage are critical pieces of information that Operations alerts generally lack. In Zenoss, the embedded Python interpreter provides a powerful mechanism for suppressing and enriching events, but using it is non-trivial.
To suppress a Zenoss alert, begin by locating the relevant event class in the Event Class hierarchy, in this case /Perf/SNMP, and then write some Python code for it. As an example, let’s suppress the annoying “oid XXXXXX is bad” SNMP alerts:
    # Suppress SNMP events whose message reports an unreadable OID
    if evt.message.find("Error reading value for") >= 0:
        evt.eventState = 2  # 2 == suppressed
The same Python-fu can be used to enrich events in Zenoss with escalation and remediation information. Having that information embedded in the event can shave minutes off an outage, as operators can begin remediation efforts immediately upon receipt of the initial alert. The drawback is the same as before, as the code needed to generate the enrichment shows:
    import re

    # Use the device's display title when a device object is available
    if device:
        evt.device = device.titleOrId()

    # Enrich disk space alerts with escalation and remediation details
    match = re.search("Disk Space Threshold Alert", evt.summary)
    if match:
        evt.message = ('Disk Space Threshold Alert on ' + str(evt.device) +
                       '\nAlert the SysOps on-call team')
        evt.Escalation = 'Send this event to the on-call SysOps team'
        evt.Remediation = ('1) Log into ' + str(evt.device) +
                           '\n2) Confirm the disk space problem by issuing the df -h command.'
                           '\n3) If the problem is confirmed, then initiate the prune log file recipe')
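Transforms like the two above are easier to hand off when they can be exercised outside Zenoss first. The following is a minimal sketch of one way to do that, not a description of how Zenoss runs transforms. FakeEvent, suppress_bad_oid, and enrich_disk_alert are hypothetical names introduced here for illustration, and the stand-in event only mimics the attributes the examples touch.

```python
import re


class FakeEvent(object):
    """Hypothetical stand-in for the `evt` object a transform receives."""

    def __init__(self, summary, message="", device="", eventState=0):
        self.summary = summary
        self.message = message
        self.device = device
        self.eventState = eventState


def suppress_bad_oid(evt):
    """Same logic as the suppression example: flag unreadable-OID events."""
    if evt.message.find("Error reading value for") >= 0:
        evt.eventState = 2  # 2 == suppressed


def enrich_disk_alert(evt):
    """Same logic as the enrichment example, shortened to two fields."""
    if re.search("Disk Space Threshold Alert", evt.summary):
        evt.message = ("Disk Space Threshold Alert on " + str(evt.device) +
                       "\nAlert the SysOps on-call team")
        evt.Escalation = "Send this event to the on-call SysOps team"


# Exercise both functions against sample events
noisy = FakeEvent(summary="SNMP problem", message="Error reading value for an OID")
suppress_bad_oid(noisy)
print(noisy.eventState)

disk = FakeEvent(summary="Disk Space Threshold Alert", device="web01")
enrich_disk_alert(disk)
print(disk.message)
```

Wrapping the logic in functions keeps the body identical to what would be pasted into the transform, while letting someone without Zenoss access check the matching behavior.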
Unless you are relatively comfortable with both Python and Zenoss, you would need a DevOps resource to help write transforms like these. In our experience, the need for development resources means that this work is often left undone or done very slowly, resulting in excess Ops noise.
The Event Enrichment Platform is built to pull in event data from a wide variety of Ops systems including, of course, Zenoss. Integrating Zenoss with EEP takes all of about 30 seconds, as can be seen in our Zenoss Integration Guide. The only thing that you need to do to enable the integration is add your EEP API Token to the EEP ZenPack.
Now that you have Zenoss events flowing into EEP, let’s take a look at the work involved in the suppression and enrichment of events.
An event arrives and is placed in the unclassified EEP events queue.
First, click “New” in the Classification column for the event and create an appropriate classification. EEP will automatically create a specific match expression for you using the entire message field of the event. Replace the hostname with a wildcard to make this classification more generic.
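To make the effect of the wildcard concrete, here is a small Python sketch of the idea. It is an illustration only: this article does not describe EEP's actual match expression syntax, so the "*" wildcard, the sample messages, and the helper names generalize and matches are all assumptions.

```python
import re


def generalize(message, hostname, wildcard="*"):
    """Turn a specific message into a generic expression by replacing the hostname."""
    return message.replace(hostname, wildcard)


def matches(expression, message, wildcard="*"):
    """Return True if message fits expression, where the wildcard matches any text."""
    # Escape the literal parts so characters such as "." are matched exactly
    literal_parts = [re.escape(part) for part in expression.split(wildcard)]
    pattern = ".*".join(literal_parts)
    return re.fullmatch(pattern, message, flags=re.DOTALL) is not None


# Hypothetical message and hostname, used only for illustration
specific = "Host web01.example.com is down"
expression = generalize(specific, "web01.example.com")

print(expression)
print(matches(expression, "Host db02.example.com is down"))
print(matches(expression, "Host db02.example.com is up"))
```

The specific expression would only ever match events from the one host, while the generalized one covers the same alert from any host.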
To suppress a noisy event, click on the “Suppressed” check box and choose “Save”. All subsequent events matching this classification will be suppressed.
To enrich the event, use the same classification as before (with “Suppressed” unchecked), then create an associated Enrichment. In the example below, we provide triage steps for a host down event.
Once you save your Enrichment, it will be available for use in the Classification “Enrichment” drop-down. Choose your newly created Enrichment and click “Save”.
Congratulations, all subsequent events that match the Classification will be suppressed or enriched appropriately.
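The workflow above can be summarized in a short Python sketch. This is a conceptual model under stated assumptions rather than EEP's implementation: the Classification class, the process function, the status strings, the first-match-wins ordering, and the shell-style wildcard matching via fnmatch are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class Classification:
    """Hypothetical model of a classification and its optional enrichment."""
    name: str
    match_expression: str
    suppressed: bool = False
    enrichment: Optional[str] = None


def process(message: str, classifications: List[Classification]) -> dict:
    """Apply the first matching classification to an incoming event message."""
    for classification in classifications:
        if fnmatchcase(message, classification.match_expression):
            if classification.suppressed:
                return {"status": "suppressed", "classification": classification.name}
            return {
                "status": "classified",
                "classification": classification.name,
                "enrichment": classification.enrichment,
            }
    # No classification matched, so the event waits in the unclassified queue
    return {"status": "unclassified"}


rules = [
    Classification("Bad OID", "Error reading value for *", suppressed=True),
    Classification("Host down", "Host * is down",
                   enrichment="1) Ping the host\n2) Check the console"),
]

print(process("Error reading value for an OID", rules))
print(process("Host web01 is down", rules))
print(process("Fan speed warning", rules))
```

Each event ends up in one of three states: suppressed, classified with its enrichment attached, or still unclassified and waiting for someone to create a classification.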
To learn more about the Event Enrichment Platform, visit their website. There's a free trial available for members of the Zenoss Community.