Write an SNMP Plugin

SNMP is the standard protocol for monitoring network-attached devices. Several bundled HQ plug-ins use it, and the PDK makes it easy to write new ones.

In any HQ plug-in, there are two main concepts to understand:

First is the inventory model. Resource types define where things live in the hierarchy along with supported metrics, control actions, log message sources, etc., as well as the configuration properties used by each feature. For more information, see Resources, Resource Types and Inventory Types.

When implementing a custom SNMP plug-in for a network device, you typically define a platform type that collects the scalar variables that apply to the device, plus one or more service types that collect table data such as interfaces, power supplies, fans, etc.

Second is the metric template attribute, a string containing all the information required to collect a particular data point. In an SNMP plug-in, each metric corresponds to an SNMP OID. While we tend to use object names to gather the desired data points in the plug-ins, you can also use the numeric OID. Numeric OIDs have the added benefit that the MIB file does not need to be available everywhere the plug-in is used.

Implementing a new SNMP-based plug-in for HQ starts with locating the device vendor's MIB file(s) and choosing which OIDs you wish to collect as metrics in HQ.

Getting Started

We'll be implementing a plugin for NetScreen, an SSL VPN gateway. The first step is to verify basic connectivity to the device using the snmpwalk command:

$ snmpwalk -Os -v2c -c netscreen 10.2.0.140 system
sysDescr.0 = STRING: NetScreen-5GT version 5.0.0r11.1 (SN: 0064062006000809, Firewall+VPN)
sysObjectID.0 = OID: enterprises.3224.1.14
sysUpTimeInstance = Timeticks: (186554200) 21 days, 14:12:22.00
sysContact.0 = STRING: ops@hyperic.com
sysName.0 = STRING: ns5gt
sysLocation.0 = STRING: SF Office
sysServices.0 = INTEGER: 72
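
As a quick sanity check on the walk output, the sketch below splits one line into name, type and value, and converts the Timeticks reading to a duration. It is only an illustration, not part of the plug-in or the PDK; the helper names are hypothetical, and the sample line is copied from the output above. SNMP TimeTicks count hundredths of a second.

```python
import re
from datetime import timedelta

# Sample line copied from the snmpwalk output above.
LINE = "sysUpTimeInstance = Timeticks: (186554200) 21 days, 14:12:22.00"


def parse_walk_line(line):
    """Split 'name = TYPE: value' into (name, type, value), or return None."""
    match = re.match(r"^(\S+) = (\w+): (.*)$", line)
    return match.groups() if match else None


def timeticks_to_timedelta(value):
    """Convert a Timeticks value such as '(186554200) 21 days, ...' to a timedelta."""
    ticks = int(re.match(r"\((\d+)\)", value).group(1))
    # One tick is a hundredth of a second, i.e. 10 milliseconds.
    return timedelta(milliseconds=ticks * 10)


name, kind, value = parse_walk_line(LINE)
uptime = timeticks_to_timedelta(value)
print(name, kind, uptime)
```

The computed duration should agree with the human-readable part that snmpwalk prints after the raw tick count.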

Next, having downloaded and unarchived the MIB packages, we install the NetScreen MIB files in the appropriate location for our machine's SNMP installation:

$ sudo cp NS-SMI.mib NS-RES.mib NS-INTERFACE.mib /usr/share/snmp/mibs

Now, verify we can view OIDs defined in NS-RES.mib:

$ snmpwalk -Os -M /usr/share/snmp/mibs -m all -v2c -c netscreen 10.2.0.140 netscreenResource
nsResCpuAvg.0 = INTEGER: 2
nsResCpuLast1Min.0 = INTEGER: 2
nsResCpuLast5Min.0 = INTEGER: 2
nsResCpuLast15Min.0 = INTEGER: 2
nsResMemAllocate.0 = INTEGER: 47310400
nsResMemLeft.0 = INTEGER: 60884848
nsResMemFrag.0 = INTEGER: 2537
nsResSessAllocate.0 = INTEGER: 19
nsResSessMaxium.0 = INTEGER: 2000
nsResSessFailed.0 = INTEGER: 0

And the tabular OIDs defined in NS-INTERFACE.mib:

$ snmpwalk -Os -M /usr/share/snmp/mibs -m all -v2c -c netscreen 10.2.0.140 netscreenInterface | more
nsIfIndex.0 = INTEGER: 0
nsIfIndex.1 = INTEGER: 1
nsIfIndex.2 = INTEGER: 2
nsIfIndex.3 = INTEGER: 3
nsIfName.0 = STRING: "trust"
nsIfName.1 = STRING: "untrust"
nsIfName.2 = STRING: "serial"
nsIfName.3 = STRING: "vlan1"
...
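
To see how the tabular output maps onto rows, the following sketch groups the walk lines by the index that follows each object name. It is an illustration only (the function name is hypothetical, and the sample lines are copied from the truncated output above); each resulting row corresponds to one interface on the device.

```python
import re
from collections import defaultdict

# Sample lines copied from the (truncated) snmpwalk output above.
WALK = """nsIfIndex.0 = INTEGER: 0
nsIfIndex.1 = INTEGER: 1
nsIfIndex.2 = INTEGER: 2
nsIfIndex.3 = INTEGER: 3
nsIfName.0 = STRING: "trust"
nsIfName.1 = STRING: "untrust"
nsIfName.2 = STRING: "serial"
nsIfName.3 = STRING: "vlan1"
"""


def walk_to_rows(text):
    """Group 'column.index = TYPE: value' lines into {index: {column: value}}."""
    rows = defaultdict(dict)
    for line in text.splitlines():
        match = re.match(r"^(\w+)\.(\d+) = \w+: (.*)$", line)
        if match:  # lines that do not look like table cells are skipped
            column, index, value = match.groups()
            rows[int(index)][column] = value.strip('"')
    return dict(rows)


rows = walk_to_rows(WALK)
for index, columns in sorted(rows.items()):
    print(index, columns)
```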

Iteration 1 - A Very Basic Plug-in

Once the MIBs are sorted out, you can begin with a very simple plug-in that might look something like this (line numbers added for instructional purposes):

1  <plugin>
2    <property name="MIBDIR" value="/usr/share/snmp/mibs"/>
3
4    <property name="MIBS"
5              value="${MIBDIR}/NS-SMI.mib,${MIBDIR}/NS-RES.mib,${MIBDIR}/NS-INTERFACE.mib"/>
6
7    <platform name="NetScreen">
8
9      <config include="snmp"/>
10
11     <plugin type="measurement"
12             class="org.hyperic.hq.product.SNMPMeasurementPlugin"/>
13
14     <property name="template" value="${snmp.template}:${alias}"/>
15
16     <metric name="Availability"
17             template="${snmp.template},Avail=true:sysUpTime"
18             indicator="true"/>
19
20     <metric name="Uptime"
21             alias="sysUpTime"
22             category="AVAILABILITY"
23             units="jiffys"
24             defaultOn="true"
25             collectionType="static"/>
26
27   </platform>
28 </plugin>

Let's dissect this to better understand what is going on:

  1. The first and last lines enclose the plug-in contents within the tags <plugin> and </plugin>

  2. Line 2 defines where the MIB files can be found on the system that will be collecting the SNMP data from the NetScreen device

  3. Lines 4 and 5 define which specific MIBs our plug-in will use

  4. Line 7 begins the Platform definition, and provides the type name that will appear in HQ

  5. Line 9 specifies that we want to include the HQ default SNMP information and templates available in the Network Host and Network Device specifications

  6. Lines 11 and 12 specify that we are defining a measurement plug-in using the SNMPMeasurementPlugin class we've imported

  7. Line 14 declares the template we will use for the measurement data we collect

  8. Lines 16 through 18 define how the Availability metric will be collected. The name set for the metric is how it will show up in the HQ UI. Note also that we change the template to denote that Availability is true if we can get the sysUpTime OID data, and that we set this as an indicator value that is turned on (provides the green light / red light information for the Platform)

  9. Lines 20 through 25 define our Uptime metric. Note the metric alias, which is substituted for the template's ${alias} (line 14) as data is collected. We also specify the category, units, defaultOn, and collectionType values as per the Measurement Plugin documentation.

  10. Line 27 closes out the platform definition

This gets us going, but does not yet provide us with a lot of useful information about the platform. Before diving in to gather more information, let's take another look at line 21. Instead of using the alias parameter, we could also have defined that line like this:

template="${snmp.template}:sysUpTime"

This explicitly defines a template for this metric rather than relying on the alias value and the default measurement template we set.
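
The equivalence of the two forms can be pictured with a small substitution sketch. This is not HQ's implementation, only an illustration of ${name} expansion; the expand helper is hypothetical, and because the real value of ${snmp.template} comes from the included SNMP configuration and is not shown here, a placeholder string stands in for it.

```python
import re


def expand(template, props):
    """Replace each ${name} with props[name]; unknown names are left untouched."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: props.get(m.group(1), m.group(0)),
        template,
    )


# Placeholder only: the real value is supplied by the included SNMP config.
props = {"snmp.template": "<snmp.template>"}

# Default template from line 14, with the alias taken from the metric.
via_alias = expand("${snmp.template}:${alias}", {**props, "alias": "sysUpTime"})

# Explicit per-metric template, as in the alternative shown above.
explicit = expand("${snmp.template}:sysUpTime", props)

print(via_alias)
print(explicit)
```

Both calls produce the same string, which is why either form can be used for a given metric.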

Iteration 2 - Additional Platform Metrics

OK. Let's gather some more scalar Platform metrics that might prove interesting:

...

26    <metric name="Average CPU Utilization"
27            alias="nsResCpuAvg"
28            units="percent"/>
29
30    <metric name="Average CPU Utilization (Last 1 min)"
31            alias="nsResCpuLast1Min"
32            units="percent"/>
33
34    <metric name="Average CPU Utilization (Last 5 min)"
35            alias="nsResCpuLast5Min"
36            units="percent"/>
37
38    <metric name="Average CPU Utilization (Last 15 min)"
39            alias="nsResCpuLast15Min"
40            indicator="true"
41            units="percent"/>
42
43    <metric name="Memory Allocated"
44            alias="nsResMemAllocate"
45            units="B"/>
46
47    <metric name="Memory Left"
48            alias="nsResMemLeft"
49            indicator="true"
50            units="B"/>
51
52    <metric name="Memory Fragment"
53            alias="nsResMemFrag"
54            units="B"/>
55
56    <metric name="Sessions Allocated"
57            alias="nsResSessAllocate"/>
58
59    <metric name="Sessions Maximum"
60            alias="nsResSessMaxium"/>
61
62    <metric name="Sessions Failed"
63            alias="nsResSessFailed"
64            collectionType="trendsup"/>

...

Again, we provide a name value for how the metric will appear in HQ, use the alias to specify the OID name to be used with the template, and where necessary, specify units, whether or not this will be a default indicator, and the collectionType. This gets us good, basic system information for the platform.

Iteration 3 - Pulling in Network Interfaces as Platform Services

Now, we want to get information about the device's network interfaces. To do this, we must query the SNMP table data from the device and put it in proper context as Service definitions within HQ. We add the following to the plug-in:

...

67    <!-- index to get table data -->
68    <filter name="index"
69            value="snmpIndexName=${snmpIndexName},snmpIndexValue=%snmpIndexValue%"/>
70
71    <filter name="template"
72            value="${snmp.template}:${alias}:${index}"/>
73
74    <server>
75      <service name="Interface">
76        <config>
77          <option name="snmpIndexValue"
78                  description="Interface name"/>
79        </config>
80
81        <property name="snmpIndexName" value="nsIfName"/>
82
83        <metric name="Availability"
84                template="${snmp.template},Avail=true:nsIfStatus:${index}"
85                indicator="true"/>
86
87      </service>
88    </server>

...

Breaking the collection of this table data down:

  1. In lines 68 and 69 we define an index filter to correlate name and value pairs from the SNMP table data

  2. In lines 71 and 72 we define a new template that takes into account the OID and its associated index

  3. In line 74 we start a Server definition. In this case, the Server's only attributes are the Platform Services we are defining in lines 75 through 87: the network interfaces for the device

  4. In lines 75 through 81 the Service is given a name, and the individual interface name is derived by associating the snmpIndexValue with the nsIfName (through the snmpIndexName association) defined by the OID

  5. In lines 83 through 85, as we did at the Platform level, we define our Availability metric, defining availability as true if we can gather the nsIfStatus value for the interface, and setting it as a default indicator.

  6. In line 87 we close the Service definition with the </service> tag
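
The index association described in the steps above can be sketched as a two-step lookup: find the row whose snmpIndexName column holds the configured snmpIndexValue, then read the wanted column from that same row. This is a simplified model under that assumption, not the SNMPMeasurementPlugin code; the lookup function is hypothetical, and the table data is copied from the nsIfName and nsIfIndex walk shown earlier.

```python
# Column data from the earlier walk, keyed by the row index that
# follows each object name (for example nsIfName.1 -> row 1).
TABLE = {
    "nsIfName": {0: "trust", 1: "untrust", 2: "serial", 3: "vlan1"},
    "nsIfIndex": {0: 0, 1: 1, 2: 2, 3: 3},
}


def lookup(table, index_name, index_value, column):
    """Return table[column] for the row where table[index_name] == index_value."""
    for row, value in table[index_name].items():
        if value == index_value:
            return table[column][row]
    raise KeyError(f"no row with {index_name}={index_value!r}")


# snmpIndexName=nsIfName, snmpIndexValue=untrust, metric column nsIfIndex
print(lookup(TABLE, "nsIfName", "untrust", "nsIfIndex"))
```

Keying services by interface name rather than by row number means the configuration stays readable in the HQ UI, since users see "untrust" instead of a bare index.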

Iteration 4 - Collecting Network Interface Service Metrics

Collecting the metric data for each interface is very similar to what we did to collect the scalar data for the Platform. The difference is that it is contained within the Service definition. Here's what that looks like:

...

87        <!-- nsIfFlow* metrics -->
88        <metric name="Bytes Received"
89                alias="nsIfFlowInByte"
90                indicator="true"
91                collectionType="trendsup"
92                category="THROUGHPUT"
93                units="B"/>
94
95        <metric name="Bytes Sent"
96                alias="nsIfFlowOutByte"
97                indicator="true"
98                collectionType="trendsup"
99                category="THROUGHPUT"
100               units="B"/>
101
102       <metric name="Packets Received"
103               alias="nsIfFlowInPacket"
104               collectionType="trendsup"
105               category="THROUGHPUT"/>
106
107       <metric name="Packets Sent"
108               alias="nsIfFlowOutPacket"
109               collectionType="trendsup"
110               category="THROUGHPUT"/>
111
112       <!-- nsIfMon* metrics -->
113       <metric name="Auth Failures"
114               alias="nsIfMonAuthFail"
115               collectionType="trendsup"
116               category="AVAILABILITY"/>

...
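
The metrics above that use collectionType="trendsup" are cumulative counters, so their raw values are mostly useful as a change over time. The sketch below shows one way to turn two samples into a per-second rate. It is an illustration only and does not describe how HQ itself processes these values; the function name and the sample numbers are hypothetical.

```python
def rate_per_second(previous, current, interval_seconds):
    """Rate of change between two samples of a counter that only increases.

    Returns None when the counter went backwards (for example after a
    device restart), since no meaningful rate can be derived.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if current < previous:
        return None
    return (current - previous) / interval_seconds


# Hypothetical nsIfFlowInByte samples taken 60 seconds apart.
print(rate_per_second(1000, 4000, 60))
```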

Iteration 5 - Adding Auto-Discovery Components for the Platform

The final touch is to add the pieces necessary for auto-discovery to work. With these in place, inventory information for the Platform and any discoverable services that are defined is automatically pulled into HQ when you use the plug-in. The additions are:

...

7    <!-- for autoinventory plugin -->
8    <classpath>
9      <include name="pdk/plugins/netdevice-plugin.jar"/>
10   </classpath>

...

11     <properties>
12       <property name="sysContact"
13                 description="Contact Name"/>
14       <property name="sysName"
15                 description="Name"/>
16       <property name="sysLocation"
17                 description="Location"/>
18       <property name="Version"
19                 description="Version"/>
20     </properties>
21
22     <plugin type="autoinventory"
23             class="org.hyperic.hq.plugin.netdevice.NetworkDevicePlatformDetector"/>

...

94      <plugin type="autoinventory"
95              class="org.hyperic.hq.plugin.netdevice.NetworkDeviceDetector"/>

...

105       <plugin type="autoinventory"/>
106
107       <properties>
108         <property name="nsIfIp"
109                   description="IP Address"/>
110         <property name="nsIfNetmask"
111                   description="Netmask"/>
112         <property name="nsIfGateway"
113                   description="Gateway"/>
114       </properties>

...

  1. In lines 7 through 10, we import the netdevice-plugin to enable auto-discovery

  2. In lines 11 through 20, we add some inventory properties that will show up on the Inventory tab

  3. In lines 22 and 23, we call out the NetworkDevicePlatformDetector for auto-inventory of the Platform scalar values (enabled through the inclusion we did in lines 7 through 10)

  4. In lines 94 and 95, we call out the NetworkDeviceDetector for auto-inventory of the Platform table values (also enabled through the inclusion we did in lines 7 through 10)

  5. In lines 105 through 114, we ensure that the network data is incorporated into the Platform inventory as part of the auto-discovery process

The Final Product

The final plug-in in its entirety is here in netscreen-plugin.xml.

Additional SNMP Plugin Examples

These additional examples are available from http://git.springsource.org/hq/hq/trees/master/hq-plugin/examples/src/main/resources.

  • netscaler

  • zxtm

  • wxgoos

  • squid