Monitoring for Stolen CPU on Linux Servers
Note: This information is also available in the Knowledge Base article "How Do I Monitor for Stolen CPU Cycles on Linux Servers?" on the Zenoss Support portal.
If you are looking for an automated way to add “CPU steal time” to your overall Linux server monitoring regimen, Zenoss can help.
Go into the monitoring template for Linux servers and activate the %st counter. Then, if Zenoss detects high steal time, you can drill down directly in the Zenoss console to see which host a VM is running on and check that host's CPU utilization. With this information, you’ll be able to determine the best way to get your “stolen CPU” back.
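For context, the %st figure that tools such as top report can be derived from the aggregate cpu line of /proc/stat, where steal is the eighth counter. The following is a minimal sketch of that calculation, not part of Zenoss; the function names are hypothetical and chosen for illustration only.

```python
def parse_cpu_line(line):
    """Return (steal_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in fields[1:]]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice
    steal = values[7] if len(values) > 7 else 0
    # guest and guest_nice are already counted in user and nice,
    # so only the first eight fields are summed.
    total = sum(values[:8])
    return steal, total


def steal_percent(before, after):
    """Percentage of elapsed CPU time that was stolen between two samples."""
    steal_0, total_0 = parse_cpu_line(before)
    steal_1, total_1 = parse_cpu_line(after)
    elapsed = total_1 - total_0
    if elapsed <= 0:
        return 0.0
    return 100.0 * (steal_1 - steal_0) / elapsed


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (Linux only)."""
    with open(path) as stat_file:
        return stat_file.readline()
```

To try it on a Linux server, call read_cpu_line() twice a few seconds apart and pass both lines to steal_percent().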
To monitor for stolen CPU:
- Ensure that you have Net-SNMP 5.7 or higher installed on the Linux server where you want to monitor for stolen CPU.
- Edit the monitoring template used to monitor Linux servers to include the ssCpuRawSteal data source. For more information, see “Adding ssCpuRawSteal to the Monitoring Template”.
- Set up a graph that displays stolen CPU when you view the Linux server. For more information, see “Setting Up a Graph That Displays Stolen CPU”.
- Set up a threshold that alerts you when CPU is being stolen. For more information, see “Setting Up a Threshold for Alerting on Stolen CPU”.
For additional information about monitoring for stolen CPU, see the “Hey! Who Stole My CPU?” article, available on the Zenoss blog.
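Before changing the template, it can help to confirm that the Net-SNMP agent on the Linux server actually exposes the steal counter. The sketch below wraps the Net-SNMP snmpget command-line tool. It assumes SNMP v2c, a placeholder community string of public, and that snmpget is installed on the machine running the script; the helper names are hypothetical.

```python
import re
import subprocess

# UCD-SNMP-MIB::ssCpuRawSteal.0
STEAL_OID = "1.3.6.1.4.1.2021.11.64.0"


def build_snmpget_command(host, community="public"):
    """Build the snmpget argument list (SNMP v2c is an assumption)."""
    return ["snmpget", "-v2c", "-c", community, host, STEAL_OID]


def parse_counter(output):
    """Extract the counter from a line such as
    'UCD-SNMP-MIB::ssCpuRawSteal.0 = Counter32: 12345'.
    Returns None when no Counter32 value is present, for example when
    the agent answers 'No Such Object available on this agent at this OID'.
    """
    match = re.search(r"Counter32:\s*(\d+)", output)
    return int(match.group(1)) if match else None


def query_steal_counter(host, community="public"):
    """Run snmpget against the host and return the raw steal counter, or None."""
    result = subprocess.run(
        build_snmpget_command(host, community),
        capture_output=True,
        text=True,
        timeout=10,
    )
    return parse_counter(result.stdout)
```

If query_steal_counter() returns None for a server, the agent is probably too old or is not exposing the object, and the data source added in the next section will not collect anything.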
Adding ssCpuRawSteal to the Monitoring Template
To monitor for stolen CPU on Linux servers, first add ssCpuRawSteal as a data source to your Linux server monitoring template.
To add the ssCpuRawSteal data source to the monitoring template used for monitoring Linux servers:
- In the Zenoss Console, click the Advanced tab, and then click Monitoring Templates.
- In the left tree pane, under Device, click Server/Linux.
- Add a new data source for “CPU steal time” by completing the following steps:
  - In the Data Sources area, click the plus (+) sign.
  - In the Add Data Source dialog box, in the Name field, type ssCpuRawSteal.
  - In the Type field, specify SNMP, and then click Submit.
- After adding ssCpuRawSteal as a new data source, add the SNMP OID for ssCpuRawSteal by completing the following steps:
  - Select the ssCpuRawSteal data source, click the gear, or Edit, icon, and then click View and Edit Details.
  - In the OID field, type 1.3.6.1.4.1.2021.11.64.0.
  - Click Save.
- Edit the RRD Type for the data point by completing the following steps:
  - Under ssCpuRawSteal, select ssCpuRawSteal.ssCpuRawSteal, click the gear, or Edit, icon, and then click View and Edit Details.
  - In the RRD Type field, select DERIVE from the drop-down list. This calculates the rate at which CPU is being stolen as a percent.
  - Click Save.
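To see what the DERIVE setting does with the raw counter, the following sketch computes a rate from two samples. It is an illustration only, not the RRDtool or Zenoss implementation. According to the UCD-SNMP-MIB description, the raw CPU counters are measured in ticks (typically 1/100 of a second) and are cumulative across all CPUs, so the rate reads directly as a percentage only on a single-CPU server. The second helper, its name, and its default tick length are assumptions added to show how a multi-CPU value could be normalized.

```python
def derive_rate(prev_value, prev_time, curr_value, curr_time):
    """Rate of change per second between two counter samples.

    Like a DERIVE data source, this does not compensate for counter
    resets or wraps, so the result can be negative.
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        return None
    return (curr_value - prev_value) / elapsed


def steal_percent_per_cpu(ticks_per_second, cpu_count, ticks_per_cpu_second=100):
    """Normalize a ticks-per-second rate to a 0-100 scale.

    Assumes one tick is 1/100 of a second and that the counter is
    summed over cpu_count processors.
    """
    return 100.0 * ticks_per_second / (ticks_per_cpu_second * cpu_count)
```

For example, a counter that grows by 3000 ticks over a 300-second collection cycle gives a rate of 10 ticks per second, which corresponds to 10% steal on one CPU and 2.5% averaged over four CPUs.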
Setting Up a Graph That Displays Stolen CPU
Once you have added ssCpuRawSteal as a data source to your Linux server monitoring template, set up a graph that shows stolen CPU for the Linux server. To graph stolen CPU data:
- Under Graph Definitions, ensure that a CPU Utilization graph is defined.
- Under Graph Definitions, select the CPU Utilization graph, click the gear, or Edit, button, and then click Manage Graph Points.
- On the Manage Graph Points dialog box, click the plus (+) sign, and then click Data Point.
- In the Data Point field, select ssCpuRawSteal.ssCpuRawSteal, and then click Submit.
- Click Save.
Now you are monitoring for stolen CPU, and you also have a graph that displays stolen CPU when you view the Linux server.
Setting Up a Threshold for Alerting on Stolen CPU
After you have configured Zenoss to collect and graph stolen CPU data, your final step is to set a threshold so that you get an alert when stolen CPU on a Linux server goes above a certain value.
To generate an alert when stolen CPU passes a certain threshold:
- Under Thresholds, click the plus (+) sign.
- In the Name field, specify the name you want to use for the threshold. For example, type High Stolen CPU.
- In the Type field, select MinMaxThreshold from the drop-down list, and then click Add.
- Select the High Stolen CPU threshold you just created, and then click the gear, or Edit, button.
- On the Edit Threshold dialog box, under Data Points, ensure the ssCpuRawSteal_ssCpuRawSteal data point displays in the Selected column.
- In the Maximum Value field, specify a value. For example, if you want to receive an alert when more than 10% of the CPU is being stolen, type 10.
- In the Severity field, select a type for the event, such as Warning.
- In the Event Class field, specify the event class you want to use, such as /Perf/CPU.
- Click Save.
Now you are collecting stolen CPU information from your Linux server. You will receive an alert when the value goes above the maximum you set (10% in this example). You can also see trends over time when you view the Linux server.
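The threshold configured above can be pictured as a simple range check on each collected value. The sketch below is a hypothetical illustration of that idea, not Zenoss's MinMaxThreshold code; the function name and return strings are invented for the example.

```python
def check_min_max(value, minimum=None, maximum=None):
    """Return a short description of the violation, or None if the value is in range.

    A bound left as None is not checked. A value exactly equal to a
    bound is treated as in range.
    """
    if value is None:
        # No data was collected, so there is nothing to compare.
        return None
    if maximum is not None and value > maximum:
        return "above maximum"
    if minimum is not None and value < minimum:
        return "below minimum"
    return None


# With a Maximum Value of 10, a reading of 12.5 would raise an event,
# while readings of 10 or 3 would not.
for reading in (3, 10, 12.5):
    print(reading, check_min_max(reading, maximum=10))
```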