Scaling and Tuning Hyperic Performance

Available only in vFabric Hyperic

About this page...

This page has information about tuning Hyperic Server for large deployments, including recommended values for server properties based on the number of platforms you will manage.

Sizing Profiles in vFabric Hyperic

In vFabric Hyperic 4.6.5, the settings described below are implemented by the Hyperic installer — server property values are set based on the sizing profile you select when installing the Hyperic Server. For information about sizing profiles, see About Sizing Profiles in vFabric Hyperic. Note that you can run the Hyperic installer to change the current sizing profile for the vFabric Hyperic Server, as described in Change vFabric Hyperic Server Sizing Profile.

Sizing Considerations

The number of platforms the Hyperic Server can manage depends on the hardware it runs on, the number of Hyperic Agents reporting to the server, the volume of metrics that are collected, and the size of the Hyperic database.

See Supported Configurations and System Requirements for Hyperic Server system requirements. Typically, a minimal system configuration will support 25 or more Hyperic Agents. On a high-performance platform, a properly configured Hyperic Server can support up to 1500 agents. A variety of Hyperic Server properties govern the system resources available to the server; their values should be set based on the number of platforms under management.

Server Configuration Settings for Scaling

The table below lists server properties that relate to Hyperic Server scaling. The values shown in the "Small", "Medium", and "Large" columns correspond to the values set for the corresponding sizing profiles in vFabric Hyperic 4.6.5: Small (fewer than 50 platforms), Medium (50-250 platforms), and Large (more than 250 platforms). In Hyperic HQ, these properties default to the values shown in the "Small" column.

Property                             Small     Medium    Large
server.jms.highmemory                350       1400      2400
server.jms.maxmemory                 400       1600      3600
server.database-minpoolsize          5         20        50
server.database-maxpoolsize          100       200       400
tomcat.maxthreads                    500       2000      4000
  (new in vFabric Hyperic 4.6.5)
tomcat.minsparethreads               50        100       200
  (new in vFabric Hyperic 4.6.5)
org.hyperic.lather.maxConns          475       1900      3800
  (set in ServerHome\hq-engine\hq-server\webapps\ROOT\WEB-INF\web.xml)

The server.java.opts property differs by profile as follows:

Small:
-Djava.awt.headless=true -XX:MaxPermSize=192m -Xmx512m -Xms512m -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC

Medium:
-Djava.awt.headless=true -XX:MaxPermSize=192m -Xmx4g -Xms4g -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC

Large:
-Djava.awt.headless=true -XX:MaxPermSize=192m -Xmn4g -Xmx8g -Xms8g -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:SurvivorRatio=12 -XX:+UseCompressedOops
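
For reference, the following is a minimal sketch of how the scaling-related properties for a Large deployment might appear in the Hyperic Server configuration file (assumed here to be ServerHome/conf/hq-server.conf, the default location). The values are the Large-profile values from the table above; your file will contain additional properties set by the installer.

# Excerpt from ServerHome/conf/hq-server.conf -- Large sizing profile values
server.jms.highmemory=2400
server.jms.maxmemory=3600
server.database-minpoolsize=50
server.database-maxpoolsize=400
server.java.opts=-Djava.awt.headless=true -XX:MaxPermSize=192m -Xmn4g -Xmx8g -Xms8g -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:SurvivorRatio=12 -XX:+UseCompressedOops
tomcat.maxthreads=4000
tomcat.minsparethreads=200

(org.hyperic.lather.maxConns is not set here; as noted in the table, it is configured in web.xml.)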

About Java Heap and Garbage Collection

Heap size startup options are set in the server.java.opts property. Note that how much you can increase the heap size depends on the amount of RAM on the Hyperic Server host. Given sufficient RAM, you could use these settings:

server.java.opts=-Djava.awt.headless=true -XX:MaxPermSize=192m -Xmx4096m -Xms4096m -XX:+UseConcMarkSweepGC -XX:+UseCompressedOops

Note: If you are running Hyperic Server on a 64-bit system with 4GB (4096 MB) or less memory, Hyperic recommends that you use a 32-bit JVM. If you use a 64-bit JVM, be sure to include -XX:+UseCompressedOops in the server.java.opts property.

About Hyperic Server Caches

Hyperic Server uses Ehcache for in-memory caching. Effective cache management is necessary for server stability and performance. Caching policies that define the cache size (maximum number of objects to cache) for each type are defined in server-n.n.n-EE\hq-engine\hq-server\webapps\ROOT\WEB-INF\classes\ehcache.xml. The cache size for a type depends on how dynamic that type is: how often it is likely to be updated. Given a fixed amount of memory, cache sizing in Hyperic tries to allocate cache according to these guidelines:

  • Relatively static types — Caches for types that are not frequently updated — for instance, Resource, Platform, Server, and Measurement — are sized to keep objects in memory for the lifetime of the HQ Server. An extremely low miss rate is desired. The default cache sizes (the maximum number of elements in cache) configured in ehcache.xml for inventory types are:

    • Platforms — 2,000

    • Servers — 50,000

    • Services — 100,000

      This sizing should be adequate for medium to large deployments.

  • Dynamic types — Caches for types that are frequently updated (for instance, Alert and Galert), and hence become stale sooner, are configured so that objects age out more quickly. A high hit:miss ratio is optimal for dynamic types; in larger environments, on the order of 2:1 to 4:1. The sketch following this list illustrates the difference between the two kinds of entries.
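
To make the contrast concrete, the following sketch shows what two entries in ehcache.xml might look like, one for a relatively static type and one for a dynamic type. The maxElementsInMemory values match the defaults noted above; the cache name for the static entry and the timeToIdleSeconds/timeToLiveSeconds values are illustrative only, so consult the ehcache.xml shipped with your server for the actual entries.

<!-- Relatively static type: objects stay cached for the life of the server;
     eviction occurs only when the size limit is reached -->
<cache name="Platform"
    maxElementsInMemory="2000"
    eternal="true"
    timeToIdleSeconds="0"
    timeToLiveSeconds="0"
    memoryStoreEvictionPolicy="LRU"/>

<!-- Dynamic type: objects age out after a period of inactivity or a maximum lifetime
     (time values shown here are illustrative) -->
<cache name="org.hyperic.hq.events.server.session.Alert"
    maxElementsInMemory="100000"
    eternal="false"
    timeToIdleSeconds="600"
    timeToLiveSeconds="3600"
    memoryStoreEvictionPolicy="LRU"/>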

Monitoring Hyperic Caches

You can monitor Hyperic caches on the HQ Health page, on the Cache tab, shown in the screenshot below. The following information is shown for each cache:

  • Size — The number of objects currently in the cache.

  • Hits — How many times a requested object was available in cache since last Hyperic Server restart.

  • Misses — How many times a requested object was not available in cache since last Hyperic Server restart.

  • Limit — The maximum number of objects the cache can contain.

  • Total Memory Usage — The amount of memory (in KB) currently consumed by all objects in the cache.

Note: You can view size, hits, and misses (but not the cache limit) by running the "ehCache Diagnostics" query on the Diagnostics tab. This data is also periodically written to server.log.

[Screenshot: the Cache tab on the HQ Health page]

Interpreting Cache Statistics

The values that indicate a well-tuned cache vary with the nature of the cache and with a host of deployment-specific factors. Key things to check include:

  • Has the cache limit been reached?

  • What is the hits:misses ratio?
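
One simple way to gauge the second point is to compute the hit ratio as Hits / (Hits + Misses). For example, the Measurement.findByTemplateForInstance cache in the table below has 6,766 hits and 25,772 misses, for a hit ratio of 6766 / (6766 + 25772), or roughly 21%.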

The table below lists statistics for several Hyperic caches and, in the "Comments" column, a possible interpretation of the data.

Cache: Agent.findByAgentToken
Size: 605    Hits: 26,010,977    Misses: 605    Limit: 5,000
This cache looks healthy. It contains relatively static objects. The cache has not filled up, and the number of misses equals the number of objects in the cache, so misses occurred only upon the first request of each object.

Cache: org.hyperic.hq.events.server.session.Alert
Size: 70,526    Hits: 48,049    Misses: 71,274    Limit: 100,000
This cache looks healthy. It contains a type that is likely to become stale relatively quickly, so aging out is appropriate. Although there are more misses than hits, the low number of objects in memory, compared to the cache limit, indicates a low level of server activity since the last restart.

Cache: org.hyperic.hq.events.server.session.AlertDefinition
Size: 66,287    Hits: 44,385    Misses: 66,340    Limit: 100,000
This cache looks healthy. It contains a relatively static type, so it is appropriate that the objects do not age out. The cache has not filled up, and the number of misses is very close to the number of objects in the cache, indicating that most misses occurred upon the first request of an object.

Cache: Measurement.findByTemplateForInstance
Size: 10,000    Hits: 6,766    Misses: 25,772    Limit: 10,000
This cache looks less healthy. It has reached its maximum size, and the hit ratio is around 20-25%. Ideally, the number of misses should peak at about the maximum size of the cache. Increasing the cache limit would probably improve Hyperic performance.

Note that the rule of thumb that misses should peak around the limit of the cache does not apply to the UpdateTimestampsCache and PermissionCache caches, which contain types that are invalidated frequently.

Configuring Caches

Caches you cannot change

There are two caches that you cannot reconfigure:

  • org.hibernate.cache.UpdateTimestampsCache is managed by Hibernate.

  • AvailabilityCache is managed by Hyperic Server.

To modify the size of a Hyperic cache, edit the associated element in server-n.n.n-EE\hq-engine\hq-server\webapps\ROOT\WEB-INF\classes\ehcache.xml. (In general, only cache sizes should need to be changed.) Each cache is defined with an entry like:

<cache name="DerivedMeasurement.findByTemplateForInstance"
    maxElementsInMemory="10000"
    eternal="true"
    timeToIdleSeconds="0"
    timeToLiveSeconds="0"
    memoryStoreEvictionPolicy="LRU"/>
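
For example, if monitoring showed that this cache had filled up and had a poor hit ratio (as in the Measurement.findByTemplateForInstance statistics above), one possible change would be to raise only the maxElementsInMemory attribute, leaving the other attributes as shipped. The value of 20000 below is illustrative, not a recommendation:

<cache name="DerivedMeasurement.findByTemplateForInstance"
    maxElementsInMemory="20000"
    eternal="true"
    timeToIdleSeconds="0"
    timeToLiveSeconds="0"
    memoryStoreEvictionPolicy="LRU"/>

ehcache.xml is read when the Hyperic Server starts, so restart the server after editing the file.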

You may need to iterate on the cache size to find the optimal setting.