Per-Filesystem Thresholds

From Zenoss Wiki

Often you want to apply different thresholds to different filesystems. For example, it's OK for /usr to be 99% full because you mount it read-only, but you want to know if /var gets over 80% full. You can already do this filesystem by filesystem by creating many local copies of the FileSystem monitoring template, but that quickly becomes a headache to manage.

The following approach to filesystem thresholds allows you to use a cProperty (custom property) to set what the threshold should be on different mount points. Because it's a cProperty it also gives you the extra benefit of setting these per-filesystem thresholds globally, for a device class or for an individual device depending on your needs.

Here's how you set it up.

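Step 1: Create a cFileSystemThresholds custom property at the root that looks like this. The script in Step 2 expects a list of entries in the form prefix:percent, for example /var:80.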

CFileSystemThresholds.png

Step 2: Create getFileSystemThreshold "Script (Python)" at /zport/dmd/Devices that looks like this.

NOTE: The following involves navigating within ZMI, the Zope Management Interface, which is not recommended unless specifically instructed to do so.

a. Navigate to the following URL, substituting your Zenoss server name in place of 'ZENOSSSERVER': http://ZENOSSSERVER:8080/zport/dmd/Devices/manage

b. On the upper right, locate the drop-down list that's pre-populated with 'Accelerated HTTP Cache Manager'. Expand this list and select 'Script (Python)'

c. On the 'Add Python Script' page, in the 'Id' text box, paste the following: getFileSystemThreshold

d. Select the 'Add and Edit' button

e. Delete the existing contents of the text box, and paste in the contents from the 'copyable form' text below

f. Hit the 'Save Changes' button and then navigate back to the normal Zenoss UI


GetFileSystemThreshold.png

Again in copyable form:

Important: Every mount point starts with '/', so an entry with the prefix '/' matches every filesystem, and its value is returned for anything that did not match an earlier entry in the list. If you do not want this behavior, consider a more specific comparison such as 'if context.mount == prefix'.

thresholds = context.getZ('cFileSystemThresholds')
 
# If no thresholds are set, default to a billion percent.
if not hasattr(thresholds, 'append'):
    return 1e7
 
for threshold in thresholds:
    prefix, value = threshold.split(':')
    if context.mount.startswith(prefix):
        return float(value) * 0.01
 
# If no prefixes match, default to a billion percent.
return 1e7
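
To check the prefix-matching logic outside Zenoss, the script can be approximated as a plain function. This is only a minimal sketch: get_threshold and its two arguments are hypothetical stand-ins for context.mount and the cFileSystemThresholds value, and are not part of Zenoss.

```python
NO_THRESHOLD = 1e7  # same "never fires" multiplier the script returns


def get_threshold(mount, thresholds):
    """Return the fraction of the filesystem that may fill before alerting.

    mount and thresholds are hypothetical stand-ins for context.mount and
    the cFileSystemThresholds value (a list of 'prefix:percent' strings).
    """
    # Mirror the script's check: anything that is not list-like means
    # "no thresholds configured".
    if not hasattr(thresholds, 'append'):
        return NO_THRESHOLD

    for entry in thresholds:
        prefix, value = entry.split(':')
        if mount.startswith(prefix):
            # Convert a percentage such as '80' into a multiplier near 0.8.
            return float(value) * 0.01

    # No prefix matched.
    return NO_THRESHOLD


rules = ['/var:80', '/usr:99']
print(get_threshold('/var/log', rules))               # matches the /var entry
print(get_threshold('/home', rules) == NO_THRESHOLD)  # no entry matches /home
```

Because the match is a plain string prefix test, an entry such as /var also applies to /var/log and to any other mount point whose name begins with /var.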

Step 3: Change FileSystem threshold to look like this.

FileSystemThreshold.png

Again in copyable form:

here.getTotalBlocks() * here.getFileSystemThreshold()

Alternate Method

I didn't like having to tweak Zope, and I also wanted to have a multi-threshold setup for warning, error, and critical levels, so I came up with a slightly different method.

Create cFileSystemThresholds

As above, create a custom property cFileSystemThresholds and set the value to something like:

/:90:95:98

This sets the default thresholds for the warning, error, and critical levels.

Keep in mind that these values will be calculated after applying the zFileSystemSizeOffset property. If you leave that at 1, then you might want to adjust the global cFileSystemThresholds to be /:85:90:93 to account for the 5% reserved space in ext2-style filesystems.

Create or modify thresholds

Now adjust or create the thresholds in the FileSystem template to be like:

here.getTotalBlocks() * int(filter(lambda x:here.mount.startswith(x.split(':')[0]),here.getZ('cFileSystemThresholds'))[0].split(':')[1]) / 100

where the '[1]' can be replaced with '[2]' or '[3]': '[1]' selects the warning value, '[2]' the error value, and '[3]' the critical value.

This expression does much the same thing as the Zope script above, but as a single expression, so it can be placed directly in the threshold property. In brief: the filter function selects the entries in the cFileSystemThresholds property (which is a list) whose text before the first ':' is a prefix of the current filesystem's mount point. From that list we take the '[0]' entry, split it on ':', and pick the '[1]', '[2]', or '[3]' value. If no entry matches, the '[0]' lookup fails, so keep a catch-all '/' entry in the list.
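
The one-line expression can be hard to read, so here is an unrolled sketch of the same steps as a plain function. The name threshold_blocks and its parameters are hypothetical stand-ins for here.getTotalBlocks(), here.mount, and here.getZ('cFileSystemThresholds'). It is meant for experimenting outside Zenoss, not for pasting into the threshold field, and it divides by 100.0 (rather than 100, as the expression does) so that the result is the same under Python 2 and Python 3.

```python
def threshold_blocks(total_blocks, mount, thresholds, level=1):
    """Unrolled form of the threshold expression.

    level 1, 2, or 3 selects the warning, error, or critical percentage.
    All arguments are hypothetical stand-ins for values Zenoss supplies.
    """
    # Keep only the entries whose prefix (the text before the first ':')
    # is a prefix of the mount point. This is the filter(...) step.
    matches = [t for t in thresholds if mount.startswith(t.split(':')[0])]

    # Take the first match, as the expression does with [0]. Like the
    # expression, this raises IndexError when nothing matches.
    fields = matches[0].split(':')

    # fields[0] is the prefix; fields[1..3] are the three percentages.
    percent = int(fields[level])
    return total_blocks * percent / 100.0


rules = ['/home:80:85:90', '/:90:95:98']
print(threshold_blocks(1000, '/home/bob', rules, level=1))  # warning level
print(threshold_blocks(1000, '/opt', rules, level=3))       # critical level
```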

As above you can set the cFileSystemThresholds value on each node or section of the infrastructure tree to override the global values. For example, you could do:

/backup:95:98:99
/home:80:85:90
/:90:95:98

The first match in the list is used, so /backup/foo would use the /backup settings. List more specific prefixes before '/', because '/' matches every mount point.
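
A small sketch makes the effect of ordering concrete. The helper first_match is hypothetical and only imitates the first-match rule described above; it is not a Zenoss function.

```python
def first_match(mount, thresholds):
    """Return the first entry whose prefix matches mount, or None."""
    for entry in thresholds:
        if mount.startswith(entry.split(':')[0]):
            return entry
    return None


# Specific prefixes first, catch-all '/' last.
good_order = ['/backup:95:98:99', '/home:80:85:90', '/:90:95:98']
# Catch-all first: it shadows every entry after it.
bad_order = ['/:90:95:98', '/backup:95:98:99', '/home:80:85:90']

print(first_match('/backup/foo', good_order))  # → /backup:95:98:99
print(first_match('/backup/foo', bad_order))   # → /:90:95:98
```

Note that the comparison is a plain string prefix test, so a mount point such as /backup2 would also pick up the /backup entry.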