You have a "problem" device. This may manifest in different ways; here are just a few:
- Cannot see device main page in GUI
- Cannot re-add the device - it says it is already there
- POSKeyError messages in yellow flashes at the top of the GUI
- Run toolbox tools and they throw errors
- Attempt to install a ZenPack and get errors
You cannot select the device in the GUI and delete it - it errors in some way. The device is "half there".
You should always start by running the toolbox tools. These come as part of the standard build in later versions of Zenoss 5.x; for earlier 5.x releases and for previous versions, follow the link https://support.zenoss.com/hc/en-us/articles/203117595-How-To-Install-And-Use-the-zenoss-toolbox to get and install the tools. The suite has grown and improved over the years, so an older installation may not have all the utilities. Instructions for running the toolbox tools are now (October 2017) in Appendix A of the Upgrade Guide. There are four main tools, each of which can be run in either check mode or fix mode; always run in check mode first, which may take a long time but should not change anything in the databases. They should be run in the following order:
The "-f" flag for each tool requests the problems to be fixed. The commands can also take a -v10 flag for verbose logging. If you are working with Zenoss 5.x, these should be run from within the zope container. For many problems, the toolbox tools can now fix the issue.
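A check pass followed by a fix pass could be planned along the following lines. This is only a sketch: the tool names and their order (zodbscan, findposkeyerror, zenrelationscan, zencatalogscan) are an assumption taken from the zenoss toolbox documentation rather than from this page, and toolbox_commands is a hypothetical helper that only builds the command lines without running anything. Check each tool's help output on your version before relying on the flags.

```python
# Assumption: these are the four main toolbox tools, in the usual order.
# Verify against the toolbox installed on your own system.
TOOLBOX_ORDER = ["zodbscan", "findposkeyerror", "zenrelationscan", "zencatalogscan"]


def toolbox_commands(fix=False, verbose=False):
    """Return one argument list per tool, in the order they should be run.

    fix=False gives check mode; fix=True appends the -f flag described above.
    verbose=True appends -v10 for verbose logging.
    """
    commands = []
    for tool in TOOLBOX_ORDER:
        argv = [tool]
        if fix:
            argv.append("-f")
        if verbose:
            argv.append("-v10")
        commands.append(argv)
    return commands


# Print the check-mode commands first, then the fix-mode commands.
for argv in toolbox_commands(fix=False, verbose=True):
    print(" ".join(argv))
for argv in toolbox_commands(fix=True, verbose=True):
    print(" ".join(argv))
```

Printing the commands rather than executing them keeps the sketch harmless; on Zenoss 5.x the commands themselves would still need to be run inside the zope container.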
Using the Zope Management Interface (ZMI)
Zenoss is built using the Zope application development environment. The ZMI, strictly, is part of Zope rather than Zenoss, and lets sufficiently authorised users of the GUI explore the ZODB database. In practice, this means you need the Manager role for your Zenoss GUI user (not just ZenManager). Point your browser at the normal Zenoss GUI, with /zport/dmd/manage as the end of the URL, for example https://zen42.class.example.org/zport/dmd/manage.
By navigating down the device class path hierarchy at the left-hand side you can inspect the instances of a particular class. Note that each Zenoss device class has a "devices" (lower-case d) link to follow before you see the actual device instances. Once you have navigated to the device instance, use the Properties tab at the far right to see attributes of this device.
If you have got this far, then the issue is more likely to be with a component of the device so go back to the Contents tab for the device and try to explore some of the relationships; most of the components are found under the os relationship or the hw relationship. Fundamentally you are looking for clues as to where it breaks.
The ZMI is most unlikely to cause further breakage unless you actively select to save any changes. Use it as an investigative tool.
Hacking using zendmd
Occasionally, the internal Zope Database (ZODB) has got messed up such that the toolbox tools can't fix it. Once you are in this scenario, you must make sure that your system is backed up and, preferably, you are in a maintenance window for your organisation.
zendmd is a tool that allows you to access and manipulate the ZODB database (and potentially other databases). It should be run as the zenoss user and, for Zenoss 5, in the zope container. It has the power to do bad things as well as good, but it is a good investigative tool and can sometimes work magic as a fix. Assume that the problem device is zen42.class.example.org. First see if you can get to the device using find:
>>> d=find('zen42.class.example.org')
>>> d
<Device at /zport/dmd/Devices/Server/Linux/devices/zen42.class.example.org>
This shows that the device can be found in ZODB and that it is of object class Device (that's a Python object class, not a Zenoss device class, though they may look the same).
>>> print d.id, d.title, d.manageIp
zen42.class.example.org zen42.class.example.org 192.168.10.42
We can access various attributes of this object - id, title and manageIp - so at least some of it is there.
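When a device is only "half there", some attributes may raise an exception rather than return a value. One way to find where it breaks is to try each attribute in turn and record what happens. The sketch below is an illustration only: probe_attributes and FakeDevice are hypothetical names, and FakeDevice is a stand-in that lets the example run outside zendmd. The syntax is kept compatible with the Python 2 interpreter that zendmd uses.

```python
def probe_attributes(obj, names):
    """Try each named attribute; record its value or the error it raised."""
    results = {}
    for name in names:
        try:
            results[name] = ("ok", getattr(obj, name))
        except Exception as exc:
            # A damaged object may raise POSKeyError, AttributeError or similar.
            results[name] = ("error", "%s: %s" % (type(exc).__name__, exc))
    return results


class FakeDevice(object):
    """Hypothetical stand-in for a device object, not a Zenoss class."""
    id = "zen42.class.example.org"
    title = "zen42.class.example.org"
    manageIp = "192.168.10.42"

    @property
    def os(self):
        # Simulate a reference that cannot be loaded.
        raise KeyError("simulated broken reference")


names = ["id", "title", "manageIp", "os"]
report = probe_attributes(FakeDevice(), names)
for name in names:
    print("%s %s" % (name, report[name]))
```

Inside zendmd, the same function could be pasted in and called on the real object, for example probe_attributes(d, ['id', 'title', 'manageIp', 'os', 'hw']), to see which attributes still load.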
Any object in the ZODB database has to be connected into this hierarchical data structure. Typically, a component (like FileSystem) has a single (toOne) relationship with its owning device, whereas a device has a toMany relationship with FileSystem (there can be multiple instances of the FileSystem object, one for each file system). Basically, these are links that connect the objects together. Often, the issue is that one half of such a link gets broken somehow.
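The idea of a link with two halves can be shown with a toy model. This is not how Zenoss implements relationships; ToyDevice, ToyFileSystem, link and broken_links are hypothetical names invented for the illustration. The point is only that each side holds its own reference, so one side can be lost while the other survives.

```python
class ToyDevice(object):
    """Toy device: holds the toMany half of the link."""
    def __init__(self, name):
        self.name = name
        self.filesystems = []


class ToyFileSystem(object):
    """Toy component: holds the toOne half of the link."""
    def __init__(self, mount):
        self.mount = mount
        self.device = None


def link(device, fs):
    """Create both halves of the link."""
    device.filesystems.append(fs)
    fs.device = device


def broken_links(device, all_filesystems):
    """Describe every link where only one half exists."""
    problems = []
    for fs in device.filesystems:
        if fs.device is not device:
            problems.append("%s listed by device but does not point back" % fs.mount)
    for fs in all_filesystems:
        if fs.device is device and fs not in device.filesystems:
            problems.append("%s points at device but is not listed" % fs.mount)
    return problems


dev = ToyDevice("zen42.class.example.org")
root, boot = ToyFileSystem("/"), ToyFileSystem("/boot")
link(dev, root)
link(dev, boot)
boot.device = None  # simulate losing one half of the link
print(broken_links(dev, [root, boot]))
```

Here the device still lists /boot, but /boot no longer points back at the device, which is the kind of inconsistency that leaves an object half connected.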