Zenoss tuning

From Zenoss Wiki

Zope

Zope is the foundation of the object database and the web UI. Proper tuning of Zope is crucial for a positive user experience.

Key items to tune:

  • python-check-interval
  • zserver-threads
  • pool-size
  • cache-local-mb
  • cache-size
  • cache-module-name
  • cache-servers

Note: To tune these items, edit $ZENHOME/etc/zope.conf. Alternatively, use the web UI and navigate to Advanced --> Settings --> Daemons --> zopectl --> edit config.

python-check-interval

The python-check-interval sets how many bytecode instructions the Python interpreter executes between checks for thread switches and signals, so it affects threading. This value will periodically need to be adjusted for load.

To calculate the value, ssh to the master, then su to the zenoss user. In zendmd, execute the following:

import math; from test import pystone; int(math.ceil(sum(s[1] for s in (pystone.pystones() for i in range(3)))/150.0))

The returned value is your python-check-interval. Compute this number after Zenoss has settled from a startup or reconfiguration period; otherwise the value will be less accurate.

Recommended value: Calculate
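
The one-liner above averages three pystone runs and divides the result by 50. The sketch below spells out the same arithmetic for benchmark figures you have already collected. The function name and the sample rates are illustrative assumptions, not part of Zenoss.

```python
import math


def check_interval(pystone_rates):
    """Mean pystones-per-second divided by 50, rounded up.

    pystone_rates holds the pystones/second figure from each benchmark
    run (the second element of each pystone.pystones() result).
    """
    mean_rate = sum(pystone_rates) / float(len(pystone_rates))
    return int(math.ceil(mean_rate / 50.0))


# Hypothetical rates from three runs; real values depend on your hardware.
print(check_interval([50000.0, 52000.0, 51000.0]))  # prints 1020
```

With three runs this is identical to dividing the sum by 150, as the zendmd one-liner does.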

zserver-threads

The zserver-threads value represents the number of threads Zope will use to service web requests. Setting this value too high or too low can cause web UI problems.

Recommended value: 5

Note: Generally speaking, the value should not be greater than 5.

pool-size

The pool-size sets the number of connections that Zope holds open to the ZODB. Instances with fewer than 50 users should omit this option altogether. Systems with more than 50 users should set it to 10.

Recommended (fewer than 50 users) value: Comment out option

Recommended (more than 50 users) value: 10

Note: In v4.x, this option is no longer present in zope.conf. It can be added in the <zodb_db main> stanza.

cache-local-mb

The cache-local-mb value configures how much local caching each Zope instance will use in addition to the cache-servers (if set). This value varies with the size of the instance and the number of Zopes.

Recommended single Zope configuration values:

  • 1-500 devices: 1536
  • 501-1000 devices: 2560
  • 1001-2000 devices: 3072
  • 2001+ devices: 4096

Recommended dual Zope configuration values:

  • 1-500 devices: 768
  • 501-1000 devices: 1705
  • 1001-2000 devices: 2048
  • 2001+ devices: 3072

Note: See Multi-Zope for more information.
Note: Many Zope configurations will require special evaluation.

Warning: Don't overcommit your available memory!
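
The two lists above amount to a lookup by device count. A minimal sketch of that lookup follows; the function and constant names are made up for illustration, and the numbers are copied from the lists.

```python
# Recommended cache-local-mb values from the lists above, keyed by the
# upper bound of each device-count band. Names here are illustrative.
SINGLE_ZOPE = [(500, 1536), (1000, 2560), (2000, 3072)]
DUAL_ZOPE = [(500, 768), (1000, 1705), (2000, 2048)]
SINGLE_ZOPE_TOP = 4096  # 2001+ devices
DUAL_ZOPE_TOP = 3072    # 2001+ devices


def recommended_cache_local_mb(devices, dual_zope=False):
    """Return the recommended cache-local-mb for a device count."""
    if devices < 1:
        raise ValueError("device count must be at least 1")
    bands = DUAL_ZOPE if dual_zope else SINGLE_ZOPE
    for upper_bound, megabytes in bands:
        if devices <= upper_bound:
            return megabytes
    return DUAL_ZOPE_TOP if dual_zope else SINGLE_ZOPE_TOP


print(recommended_cache_local_mb(750))                   # prints 2560
print(recommended_cache_local_mb(750, dual_zope=True))   # prints 1705
```

Whatever value the lookup returns, check it against the memory actually free on the host before applying it.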

cache-size

The cache-size dictates how many objects each Zope should cache. Try to keep this value at 115% of the total number of objects in the global catalog.

To calculate the value, ssh to the master, then su to the zenoss user. In zendmd, execute the following:

len(dmd.global_catalog)

The returned value is the size of the global catalog in entries. Take 115% of this value for your cache-size.

Recommended value: Calculated 115% of global catalog length.

Note: Excess allocation offers no benefit.
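
As a worked example of the 115% rule, the sketch below computes a cache-size from a catalog length. The function name is an assumption for illustration; integer arithmetic is used so the result rounds up without floating-point surprises.

```python
def recommended_cache_size(catalog_length):
    """Return 115% of the global catalog length, rounded up."""
    return (catalog_length * 115 + 99) // 100


# 875,000 is the catalog length used in the zope.conf example on this page.
print(recommended_cache_size(875000))  # prints 1006250
```

Rounding the result to a convenient nearby figure, such as 1000000, is in keeping with the guideline.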

cache-module-name

The cache-module-name entry instructs zope to use a specific caching plugin. In all cases use memcache.

Recommended value: memcache

cache-servers

The cache-servers entry directs the caching plugin to a specific cache server. All but advanced Zenoss configurations should set this to 127.0.0.1:11211, the localhost memcached server.

Recommended value: 127.0.0.1:11211

Note: Some distributed installations run memcached on a dedicated box. In that case, point to that box's IP address and ensure port 11211 is open.

Zope.conf examples

File: $ZENHOME/etc/zope.conf

zodb_db stanza

<zodb_db main>
  mount-point /
  # Length of the global catalog is 875,000
  cache-size 1000000
  %import relstorage
  <relstorage>
    # Device count is 12,500
    cache-local-mb 4096
    # memcached running on the master, default memcached port 11211
    cache-servers 127.0.0.1:11211
    # Using the memcached plugin
    cache-module-name memcache
    keep-history false
    <mysql>
      host localhost
      port 3306
      db zodb
      user zenoss
      passwd zenoss
    </mysql>
  </relstorage>
</zodb_db>

Note: This is only an example of part of the zope.conf.

Zeneventserver

Zeneventserver is the core of the Zenoss v4.x event processing system.

Key items to tune:

  • Optimize flag
  • JDBC test on borrow workaround
  • JVM Memory Limit

Optimize flag

By default Zeneventserver is configured to issue an 'optimize table' against several key tables. Under certain circumstances the resulting table lock can cause the system to stop processing events. In theory, optimizing these tables yields only a minor performance improvement, since the tables in question are frequently partitioned off.

In $ZENHOME/etc/zeneventserver.conf, uncomment 'zep.database.optimize_minutes' and set it equal to 0.

Recommended value: 0

JDBC test on borrow workaround

By default Zeneventserver is configured not to validate the status of reused connections. Under some circumstances this can cause zeneventserver to lose connectivity with mysqld/zends.

In $ZENHOME/etc/zeneventserver.conf, uncomment 'zep.jdbc.pool.test_on_borrow' and set it equal to true.

Recommended value: true
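
Both zeneventserver.conf changes follow the same pattern: find the commented-out key and set it. The sketch below applies that pattern to configuration text held in a string. It assumes the file uses key=value lines that are commented out with a leading '#', which this page does not confirm; the helper name and the sample excerpt are hypothetical.

```python
import re


def set_option(conf_text, key, value):
    """Uncomment `key` if needed and set it to `value`.

    Assumes properties-style "key=value" lines, optionally commented out
    with a leading '#'. If the key is absent, the setting is appended.
    """
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE
    )
    new_line = "%s=%s" % (key, value)
    if pattern.search(conf_text):
        return pattern.sub(lambda match: new_line, conf_text, count=1)
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + new_line + "\n"


# Hypothetical excerpt; the real file's defaults and layout may differ.
sample = (
    "# event server settings\n"
    "#zep.database.optimize_minutes=720\n"
    "#zep.jdbc.pool.test_on_borrow=false\n"
)
sample = set_option(sample, "zep.database.optimize_minutes", "0")
sample = set_option(sample, "zep.jdbc.pool.test_on_borrow", "true")
print(sample)
```

Restart zeneventserver after changing the real file so the new values take effect.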

JVM Memory Limit

By default Zeneventserver will use all the memory in your system and perhaps even the memory in servers nearby (just kidding). In any case, it is a good idea to define a maximum heap size for the Java VM.

Edit the $ZENHOME/bin/zeneventserver script and add JVM_ARGS="$JVM_ARGS -Xmx768m" after the other JVM arguments. Save the file and restart zeneventserver.

Recommended (small environments) value: 768m

Recommended (medium environments) value: 1024m

Recommended (large environments) value: 4096m

Recommended (jumbo environments) value: 8192m

Zeneventd

Zeneventd is a daemon that consumes from the zep.rawevents RabbitMQ queue and populates the zep.zenevents queue. The main purpose of zeneventd is to execute transforms against events before they persist. In Zenoss Core, zeneventd is a single-threaded daemon without workers. More than one zeneventd may be spawned to combat high rawevent queues.

Key items to tune:

  • cachesize
  • workers

cachesize

This value instructs zeneventd how large its cache should be. Depending on the event rate, this value may need to be increased. As a guideline, use 20,000 for small instances, 75,000 for medium instances and 150,000 for larger instances.

Recommended (small instance) value: 20000

Recommended (medium instance) value: 75000

Recommended (large instance) value: 150000

workers

This value dictates how many workers Zeneventd will spawn. This option is available only if the EnterpriseCollectors ZenPack is installed.

Zenhub

Zenactiond

Zengomd

memcached

Memcached is used in Zenoss to help de-duplicate common caches between Zope instances and various daemons. Using memcached often results in a performance hit when compared to native caching, but saves large amounts of memory. The 'CACHESIZE' argument in /etc/sysconfig/memcached dictates how much memory, in megabytes, memcached may use.

Recommended value: 512

Note: Generally speaking, the value should not be greater than 512 unless you're running massively concurrent Zopes, hubs or zeneventd daemons.

rabbitmq-server

RabbitMQ is known to refuse to start if the hostname of the server is changed. You can prevent this with the following steps.

(Do not perform these steps if you are an Enterprise customer and are intending to use a High Availability setup.)

# echo 'NODENAME=rabbit@localhost' > /etc/rabbitmq/rabbitmq-env.conf


# service rabbitmq-server restart


# rabbitmqctl add_user zenoss zenoss
# rabbitmqctl add_vhost /zenoss
# rabbitmqctl set_permissions -p /zenoss zenoss '.*' '.*' '.*'


# service rabbitmq-server restart


# rm -f /var/log/rabbitmq/*OLDHOSTNAME*
# rm -rf /var/lib/rabbitmq/mnesia/*OLDHOSTNAME*


Zends

The ZenDS configuration below is for a 15,000-node Zenoss instance with a fairly high flow rate. Tune innodb_buffer_pool_size and innodb_buffer_pool_instances for your hardware. Avoid changing innodb_thread_concurrency and thread_pool_size.

[mysqld]
socket = /var/lib/zends/zends.sock
pid-file = /var/run/zends/zends.pid
basedir = /opt/zends
datadir = /opt/zends/data
port = 13306
user = zenoss
innodb_file_per_table
skip_external_locking

#
# Per the current Zenoss Resource Manager Install Guide,
# please size innodb_buffer_pool_size according to the following
# guidelines:
#
# Deployment Size       Value of innodb_buffer_pool_size
# --------------------  --------------------------------
#    1 to  250 devices   512M
#  250 to  500 devices   768M
#  500 to 1000 devices  1024M
# 1000 to 2000 devices  2048M
#
# If ZenDS is installed on a dedicated system, this can (and should) be set
# to as much as 75% of available memory on the system.
#
innodb_buffer_pool_size = 16G
# log file size should be 25% of buffer pool size
innodb_log_file_size = 512M
innodb_additional_mem_pool_size = 32M
innodb_log_buffer_size = 8M
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 2

# In previous releases of MySQL, this was recommended to be set to 2 times the
# number of CPUs, however the default and recommended option in 5.5 is to not
# set a bound on the thread pool size.
innodb_thread_concurrency = 0

# Setting this setting to 0 is recommended in virtualized environments. If
# running virtualized, it is recommended to uncomment the setting below when
# seeing database performance issues.
#innodb_spin_wait_delay = 0

# In large installs, there were a significant number of mutex waits on the 
# adaptive hash index, and this needed to be disabled.
#innodb_adaptive_hash_index = OFF

# Use the Barracuda file format which enables support for dynamic and 
# compressed row formats.
innodb_file_format = Barracuda

# Enable the thread pool plug-in - recommended on 5.5.16 and later.
plugin-load = thread_pool.so
thread_pool_size = 32

# Disable the query cache - it provides negligible performance improvements
# and leads to significant thread contention under load.
query_cache_size = 0
query_cache_type = OFF

max_allowed_packet = 64M
wait_timeout = 86400

# 1,000 is usually enough for moderately large deployments
max_connections = 1000

# Enable dedicated purge thread. (default is 0)
innodb_purge_threads = 1

# Introduce operation lag to allow purge operations. (default is 0)
innodb_max_purge_lag = 0

# Set buffer pool instances (cpu core count for physical machines, subtract one for VMs)
innodb_buffer_pool_instances = 16

[client]
socket = /var/lib/zends/zends.sock
user = zenoss

[mysql]
max_allowed_packet = 64M
prompt = "zends> "

[mysqldump]
max_allowed_packet = 64M
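
The sizing guidance in the comments of the configuration above can be sketched as two small helpers. The function names are illustrative assumptions. The device bands come from the comment table, the 75% figure from the dedicated-system note, and the instance count from the innodb_buffer_pool_instances comment. The table stops at 2000 devices, so the sketch returns None beyond that instead of guessing.

```python
# Device-count bands and innodb_buffer_pool_size values (in MB) taken from
# the comment table in the ZenDS configuration above.
BUFFER_POOL_BANDS = [(250, 512), (500, 768), (1000, 1024), (2000, 2048)]


def buffer_pool_mb(devices, dedicated_ram_mb=None):
    """Suggest innodb_buffer_pool_size in MB.

    On a dedicated ZenDS host, pass its RAM in MB to get the 75% upper
    limit instead. Returns None when the table does not cover the count.
    """
    if dedicated_ram_mb is not None:
        return dedicated_ram_mb * 75 // 100
    for upper_bound, megabytes in BUFFER_POOL_BANDS:
        if devices <= upper_bound:
            return megabytes
    return None


def buffer_pool_instances(cpu_cores, virtual=False):
    """CPU core count on physical machines, one fewer on VMs (minimum 1)."""
    return max(1, cpu_cores - 1) if virtual else cpu_cores


print(buffer_pool_mb(300))                             # prints 768
print(buffer_pool_mb(15000, dedicated_ram_mb=24576))   # prints 18432
print(buffer_pool_instances(16, virtual=True))         # prints 15
```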

If using Zenoss Analytics, be sure to include the following under [mysqld]:

open_files_limit=200000
event_scheduler=ON

# Need to tweak key_buffer_size, 25% of system memory for analytics
key_buffer_size = 64G

# This setting is for Innodb files, Analytics 4.x uses very few of these
innodb_buffer_pool_size = 8G

mysqld

The base config that comes with MySQL is not optimized for performance and therefore requires tweaking to obtain maximum performance with Zenoss.

The following config is a good base config to use, even for a smaller instance of Zenoss (/etc/my.cnf):

[mysqld]
skip_external_locking
innodb_buffer_pool_size = 512M
innodb_log_file_size = 64M
innodb_additional_mem_pool_size = 32M
innodb_log_buffer_size = 8M
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit = 2

# In MySQL 5.5+, set innodb_thread_concurrency to 0 instead of the value below
innodb_thread_concurrency = 8
innodb_file_format = Barracuda
innodb_purge_threads = 1

# a non-zero innodb_max_purge_lag can cause significant delays in query processing, set to 0 unless you are a mysql savant
innodb_max_purge_lag = 0

query_cache_size = 0
query_cache_type = OFF
max_allowed_packet = 64M
wait_timeout = 86400
interactive_timeout = 86400
max_connections = 500
max_user_connections = 500

[client]
user = zenoss

[mysql]
max_allowed_packet = 64M

[mysqldump]
max_allowed_packet = 64M

After adding the above options you must do the following (this must be done any time innodb_log_file_size changes):

# service mysql stop
# rm /var/lib/mysql/ib_logfile*
# service mysql start

If running a larger instance then you can increase the following settings (the values shown here are examples, adjust according to your needs):

innodb_buffer_pool_size = 4096M
innodb_log_file_size = 512M
max_user_connections = 1500

It is also highly recommended that you perform the tuning described in the Optimize flag and JDBC test on borrow workaround sections of this page.

zenSOS

Not yet complete.

Threading

You can thread many zen* services with additional workers if you find that the service in question is locking up, has many Missed_Runs entries in its log, and so on. This capability depends on your version; you can check for it by running "servicename genconf" and looking for a "workers" line. The table below lists which services can have workers added. To add them, put "workers #" and "devicesPerWorker #" in the config file for the service and restart it. This option is available only if the EnterpriseCollectors ZenPack is installed.

Service Name       Threading?
zencommand         YES
zendisc            NO
zeneventlog        YES
zenjmx             YES
zenmailtx          YES
zenmodeler         NO
zenperfsnmp        YES
zenperfsql         YES
zenping            YES
zenprocess         YES
zenrender          YES
zenrrdcached       NO
zenstatus          YES
zensyslog          YES
zentrap            YES
zenucsevents       YES
zenvcloud          YES
zenvmwareevents    YES
zenvmwaremodeler   NO
zenvmwareperf      YES
zenwebtx           YES
zenwin             YES
zenwinperf         YES
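
The "look for a workers line" check described above can be scripted. The sketch below scans text captured from "servicename genconf" for an option named workers. The exact layout of genconf output is an assumption here (one option per line, possibly commented out with '#'), and both sample strings are hypothetical.

```python
def supports_workers(genconf_output):
    """Return True if any line's first word, ignoring '#', is 'workers'."""
    for raw_line in genconf_output.splitlines():
        tokens = raw_line.lstrip("# \t").split()
        if tokens and tokens[0] == "workers":
            return True
    return False


# Hypothetical genconf excerpts; real output may be laid out differently.
with_workers = "#monitor localhost\n#workers 0\n#devicesPerWorker 25\n"
without_workers = "#monitor localhost\n#cycletime 60\n"

print(supports_workers(with_workers))     # prints True
print(supports_workers(without_workers))  # prints False
```

To use it against a live service, capture the command's output first, for example with subprocess.check_output(["zenperfsnmp", "genconf"]), and pass the decoded text to the function.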