From:	 announce-admin@opennms.org
To:	 announce@opennms.org
Subject: [OpenNMS-Announce] OpenNMS Update v2.16
Date:	 Tue, 17 Apr 2001 19:17:48 -0400 (EDT)

====================
   OpenNMS Update
====================
  Vol. 2, Issue 16
====================
   April 17, 2001
====================

   In this week's installment...
     * Project Status
          + 0.7.3 Released
          + Known Issues in 0.7.2 & 0.7.3
          + SNMP Data Collection Configuration
          + Coding Projects Underway
     * Upcoming Road Shows
     * Early Adopter Program Status
     * The Wish List


==============
Project Status
==============

0.7.3 Released:

     0.7.2 has had some problems (see next section for details) and
     we've fixed a number of them, with those fixes available in CVS.
     We've also changed our release calendar to introduce a new release,
     0.7.3, that will fix these issues.

     As I write this, Ben is uploading tar bundles. Please note: We are
     seeing a number of inconsistencies on tar-bundle installations.
     Your results (or at least our ability to help you with problems)
     may be better if you install on Red Hat. Debian and SuSE, while
     nice OSes, are not our strength.

     If you prefer these platforms and can help us figure out what the
     issues are, we'd certainly appreciate it.

     From the CHANGELOG, here's a list of feature/functionality updates
     in the 0.7.3 release:

     * Fix for RedHat 7.1 and JDK 1.3.0_02

     * Changes to the build process, including updates to paths and some
       files missing from dist and install

     * INSTALL.html now auto-generated again

     * Fixes for install.pl for Perl 5.6.1, including auto-parsing of the
       homeDir entry in server.xml

     * Some changes to compilation defaults

     * RPM Updates

     * Better parsing of external exec command lines (should fix earlier
       issues with notifications and automated actions)

     * Many logging updates, with move to Log4J

     * CAPSD changes, including better handling of constants, logging, 
       et al

     * Introduced URL support for discovery ranges (despite the fact that
       it has been in the UI forever...)

     * Updated node name and hostname handling

     * Upgrade to latest version of ONC RPC

     * More reports

     * Event database updated to include columns to support
       acknowledgement of events


Known Issues in 0.7.2 & 0.7.3:

     Fortunately, we've solved some of the major bugs in 0.7.2, so the
     list for 0.7.3 is decidedly shorter. And that, per Martha Stewart, is a
     good thing.

     The biggest issues with 0.7.2 were SNMP Traps not being written to
     the events table in the database and authentication problems in the
     Web UI. The SNMP Traps issue was fixed almost immediately after
     release and the code has been in CVS for a few days now. The Web
     Authentication stuff, well, let's just say that it is still less
     than fixed. But it's at least getting some attention...

     The Web Authentication issues carry over into 0.7.3. We've come to
     the conclusion that as elegant as our authentication system is
     within the Tomcat framework, its elegance is lost somewhere amidst
     its lack of functionality. So, we're abandoning it for 0.7.4, and
     Larry has promised a complete re-write of the authentication
     mechanism. The current one, when it works, is great. However, it
     appears that the "when it works" timeframe is dwindling.

     But that's OK, since Larry has been looking for something to do...


SNMP Data Collection Configuration:

     My main man Mike has put together a spiffy little tool with the
     combination of the SNMP Data Collector and the corresponding
     configuration files. Since I've been playing with the configs,
     expanding them here and there, for the past week or so, I thought
     I'd share some of my insights-- consider this a "What I Did Over
     Summer Vacation" report...

     The SNMP Data Collection subsystem is made up, at its most basic,
     of the SNMP Poller and the DataCollection.xml config file. The
     poller, like any other poller, is basically invoked by associating
     appropriate polling ranges with it and dragging the service into a
     package (all via the Administrator's UI). This config then ends up
     being reflected in the packages.xml file. Edit this one by hand if
     you like, but remember that you can get a fresh copy if you need it
     from CVS --
     http://www.opennms.org/cgi-bin/cvsweb.cgi/data/common/conf/packages.xml

     The DataCollection.xml file --

http://www.opennms.org/cgi-bin/cvsweb.cgi/data/common/conf/DataCollection.xml
     -- is where the cool stuff lives (at least from an admin/operations
     perspective).

     The DataCollection.xml file contains 3 main sections:

     * Database creation parms (database)

     * Group definitions (groups)

     * System definitions (systems)

     The database basically constitutes the parameters necessary to
     effectively build the underlying RRDTool databases where the
     performance data will be stored. It identifies the counters that we
     will force to exist, even if nothing else is defined for
     collection, and it exists with very little interaction (if any)
     with other sections of the file--but I'm getting ahead of myself.

     The groups section creates a mapping of a groupName to a list of
     SNMP OIDs (Object Identifiers) to collect for a particular device.
     Each entry in the list contains a minimum of 4 subelements:

     * oid -- The numeric object identifier to GET

     * instance -- The instance value for the OID (0 (zero) if it's
       actually a "leaf" in the MIB tree, or a keyword of "ifIndex" if we
       are after something in the ifTable. Note: There's even an example
       of a hard-coded non-zero integer in the file, but that's only
       because I wanted a cheesy, quick and dirty workaround for the Host
       Resources MIB...)

     * alias -- a text string to describe the OID. You can populate this
       with the OID's name from SMI (which we have done so far), but this
       value is only used by RRD and the Web UI for human-readability --
       there is no attempt to convert this via some compiled SMI
       somewhere (basically, because we don't yet have a MIB compiler...)

     * type -- a text string (case-insensitive) that describes the data
       type for RRD. Currently, acceptable values are Counter, Gauge,
       Integer, and TimeTicks. For the RRD geeks amongst us, Integer and
       TimeTicks map back to Gauge.
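
     To make the type handling concrete, here is a minimal Python
     sketch of one group entry and the type mapping described above.
     The OidEntry class and RRD_TYPE table are made up for illustration
     and are not taken from the OpenNMS source; the only behavior
     borrowed from the text is the case-insensitive match and the
     fallback of Integer and TimeTicks to Gauge.

```python
from dataclasses import dataclass

# Types accepted in a group entry, mapped to the RRD data source type they
# end up as. Integer and TimeTicks fall back to GAUGE, per the text above.
RRD_TYPE = {
    "counter": "COUNTER",
    "gauge": "GAUGE",
    "integer": "GAUGE",
    "timeticks": "GAUGE",
}


@dataclass
class OidEntry:
    """Hypothetical in-memory form of one group entry (not OpenNMS code)."""
    oid: str        # numeric object identifier to GET
    instance: str   # "0" for a leaf, "ifIndex" for ifTable columns
    alias: str      # human-readable label used by RRD and the Web UI
    type: str       # Counter, Gauge, Integer or TimeTicks (any case)

    def rrd_type(self) -> str:
        """Return the RRD data source type for this entry."""
        try:
            return RRD_TYPE[self.type.lower()]
        except KeyError:
            raise ValueError(f"unsupported type: {self.type!r}")


entry = OidEntry(".1.3.6.1.2.1.1.3", "0", "sysUpTime", "TimeTicks")
print(entry.rrd_type())  # GAUGE
```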

     So with that, we now have enough information to map a groupName to
     some set of OIDs to collect. Now we need a mechanism to decide
     which systems to collect data from, and which groupName to select
     to associate with that system. Thus, the systems section was born.
     And it was good.

     The systems section provides a mechanism for us to map our
     groupNames, defined in the groups section, to sysObjectIDs. Or in
     this case, specific sysObjectIDs or sysObjectID masks, either of
     which can be augmented with a list of IP addresses or IP address
     masks. How's that for befuddling?

     In layman's terms, when CAPSD figures out that a device is
     supporting SNMP, we pull the device's sysObjectID and cram it into
     the database. Then, when the scheduler establishes a list of
     devices that the SNMP Poller should poll, we grab the corresponding
     sysObjectID of the device. We then compare that sysObjectID to our
     systems section and figure out which groupNames we need to
     associate with that IP, and in turn, what OIDs go with those
     groupNames. But the masking and IP address list gives us some
     granularity that we didn't have before (Mike just built this in
     over the past few weeks--what a guy). Masking the sysObjectID
     allows us to specify the first few characters of a sysObjectID and
     only compare those digits with the leftmost characters of the
     sysObjectID we collected from the device. So...

     * .1.3.6.1.4.1.9.1.217 is valid as either a sysoid or a sysoidmask
       for a Cisco 2900 switch

     * .1.3.6.1.4.1.9. is valid as a sysoidmask for all Cisco devices

     * .1.3.6.1 is valid as a sysoidmask for anything that supports SNMP

     * .1.3.6.1.4.1.647. is valid as a sysoidmask for anything from
       Lexmark, but note the trailing dot. That's necessary to prevent it
       from matching 6470, 64780, or 64789012345678901234567890.
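
     The trailing-dot point is easy to check for yourself. This is a
     rough Python sketch that treats the mask as a plain leftmost
     substring match, as described above; matches_sysoid is a made-up
     name and both device sysObjectIDs are invented for the example.

```python
def matches_sysoid(device_sysoid: str, sysoidmask: str) -> bool:
    """True if the mask equals the leftmost characters of the sysObjectID."""
    return device_sysoid.startswith(sysoidmask)


lexmark = ".1.3.6.1.4.1.647.1.2"    # invented device under enterprise 647
other = ".1.3.6.1.4.1.6470.1.2"     # invented device under enterprise 6470

print(matches_sysoid(lexmark, ".1.3.6.1.4.1.647."))  # True
print(matches_sysoid(other, ".1.3.6.1.4.1.647."))    # False
print(matches_sysoid(other, ".1.3.6.1.4.1.647"))     # True: the accidental match
```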

     Now let's talk granular granularity, an important form of SNMP data
     collection strategery--augmenting the sysoid or sysoidmask with IP
     addresses. Just as you would surmise, you have the option of adding
     a tag under the sysoid or sysoidmask that will anticipate a list of
     IP addresses or address masks. This gives you the ability to say,
     collect these OIDs from any Cisco device in the 192.168.0.0/24 subnet
     with the following systems section entries:

     <sysoidmask>.1.3.6.1.4.1.9.</sysoidmask>
     <ipList>
       <ipAddrMask>192.168.0.</ipAddrMask>
     </ipList>

     Note that the ipAddrMask works just like the sysoidmask does, in
     matching a substring of the leftmost characters from the IP
     address. And also note that the ipAddr tag would work just like
     you'd think it would. And you can repeat multiple ipAddr or
     ipAddrMask tags to get the grouping you want.
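
     Put together, the match for one systems entry might look like the
     Python sketch below. The function names are made up, and one piece
     is an assumption rather than something stated above: an entry with
     no ipList is treated as matching any IP address.

```python
from typing import Sequence


def ip_allowed(ip: str,
               ip_addrs: Sequence[str] = (),
               ip_masks: Sequence[str] = ()) -> bool:
    """Check an IP against the ipAddr and ipAddrMask values of an entry."""
    if not ip_addrs and not ip_masks:
        return True  # assumption: no ipList means no IP restriction
    # ipAddr is an exact match; ipAddrMask is a leftmost-substring match.
    return ip in ip_addrs or any(ip.startswith(mask) for mask in ip_masks)


def system_matches(sysoid: str, ip: str, sysoidmask: str,
                   ip_addrs: Sequence[str] = (),
                   ip_masks: Sequence[str] = ()) -> bool:
    """True if both the sysObjectID mask and the IP list accept the device."""
    return sysoid.startswith(sysoidmask) and ip_allowed(ip, ip_addrs, ip_masks)


cisco_2900 = ".1.3.6.1.4.1.9.1.217"
print(system_matches(cisco_2900, "192.168.0.17", ".1.3.6.1.4.1.9.",
                     ip_masks=["192.168.0."]))   # True
print(system_matches(cisco_2900, "192.168.10.17", ".1.3.6.1.4.1.9.",
                     ip_masks=["192.168.0."]))   # False
```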

     Yes, the masking is not the most elegant solution, and yes, it can
     make configuration of networks subnetted other than /8, /16, or /24
     a pain, but the option was to include masking or only provide an
     ipAddr tag. Yes, you are correct--in that light, masking doesn't
     look quite so bad...
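
     If you're stuck with one of those odd-sized subnets, a few lines
     of Python can at least write the ipAddrMask values for you. This
     is only a sketch built on the standard ipaddress module; masks_for
     is a made-up helper, not part of OpenNMS, and it gives up on
     anything longer than /24, where you'd list individual ipAddr tags
     instead.

```python
import ipaddress
from typing import List


def masks_for(cidr: str) -> List[str]:
    """Expand an IPv4 CIDR block into leftmost-substring masks."""
    net = ipaddress.IPv4Network(cidr)
    if net.prefixlen > 24:
        raise ValueError("longer than /24: list individual ipAddr entries")
    # Round the prefix up to the next whole octet (at least one octet).
    octets = max(1, -(-net.prefixlen // 8))
    masks = []
    for sub in net.subnets(new_prefix=octets * 8):
        parts = str(sub.network_address).split(".")[:octets]
        masks.append(".".join(parts) + ".")  # keep the trailing dot
    return masks


print(masks_for("192.168.4.0/22"))
# ['192.168.4.', '192.168.5.', '192.168.6.', '192.168.7.']
```

     A /22 needs four ipAddrMask tags this way, and a /17 would need
     128, which is the pain being described.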

     So now we've created the mappings that allow us to say which
     sysObjectIDs I'm going to poll for what OIDs, with possible
     abilities to add IP-based granularity. So what happens when I push
     this into practice?

     Essentially, when the SNMP poller comes up, it's going to check the
     rules and ranges to determine what IP addresses it should be
     working with (all handled via the scheduler). The poller will be
     checking what OID to sysObjectID mappings are in place (as well as
     any ipAddr or ipAddrMask tags that might be present), and will begin
     polling for those data points appropriately. We will also be
     checking if an RRD for that interface exists. If it does, we'll
     prepare to write to it. If not, we'll create it.

     When the device is polled, success is an all-or-nothing deal. If we
     poll for 15 OIDs and have success on 14 of them and error out on
     the 15th, we're going to discard the data for that poll. Is that
     ideal? No. Is that functional? Pretty much. And when we collect all
     15 data points successfully, we jam them into an RRD under the
     covers. Note that we've had to tweak the RRD code a little bit to
     make it thread-safe for writing, but now we can go full-bore at
     this thing and gather more data, more effectively, and get it into
     an RRD in a full multi-threaded environment. Pretty cool? Yeah, we
     think so.
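
     The all-or-nothing rule can be sketched in a few lines of Python.
     This is an illustration of the rule only, not the poller's actual
     code: poll_all_or_nothing and the snmp_get callable are made-up
     names, and the fake agent below stands in for a real SNMP GET.

```python
from typing import Callable, Dict, Iterable, Optional


def poll_all_or_nothing(oids: Iterable[str],
                        snmp_get: Callable[[str], float]
                        ) -> Optional[Dict[str, float]]:
    """Return every value from one poll, or None if any single GET fails."""
    values: Dict[str, float] = {}
    for oid in oids:
        try:
            values[oid] = snmp_get(oid)
        except Exception:
            return None  # one failure discards the whole poll
    return values


# Stand-in for a device that only answers for sysUpTime.0.
fake_agent = {".1.3.6.1.2.1.1.3.0": 123456.0}


def fake_get(oid: str) -> float:
    return fake_agent[oid]  # raises KeyError for anything else


print(poll_all_or_nothing([".1.3.6.1.2.1.1.3.0"], fake_get))
print(poll_all_or_nothing([".1.3.6.1.2.1.1.3.0", ".1.3.6.1.2.1.1.99.0"],
                          fake_get))  # None: second GET failed
```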

     And the strengths of having RRD under the covers don't just stop at
     having a database there (database? OK, that's arguable, but it is a
     data store, anyway...). RRD, for those of you that aren't familiar,
     comes with internal functions to build graphs based on the data,
     and the data stores are built fully-defined, so you have no need to
     worry about over-running your drive space with ongoing data
     collection. The data will be stored as configured in the database
     section of the DataCollection.xml file, storing 8928 granular data
     points (by default, reflecting data collected every five minutes,
     or about one month) and 8784 roll-up data points reflecting an
     average of the data collected every 12 polls (by default, one
     year's worth of hourly data). Considering the number of data points
     we build into the RRD by default, the RRD's are going to take up
     about 8MB per interface you wish to collect on. This can be tweaked
     as necessary, but it will be closer to 50MB if you want a year's
     worth of the most granular data. That could add up a lot quicker
     than adding in 8MB/interface increments. Actually, about 6.25 times
     quicker.
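
     For anyone checking the math, the default retention numbers work
     out as follows. This Python snippet only redoes the arithmetic
     from the figures quoted above; the variable names are made up, and
     it says nothing about how RRDTool actually sizes its files.

```python
from datetime import timedelta

POLL_INTERVAL = timedelta(minutes=5)

# 8928 granular points, one per poll.
granular = 8928 * POLL_INTERVAL
# 8784 roll-up points, each averaging 12 polls (one hour).
rollup = 8784 * 12 * POLL_INTERVAL

print(granular.days)  # 31
print(rollup.days)    # 366

# Quoted sizes per interface: about 8MB by default, closer to 50MB for a
# full year of five-minute data.
print(50 / 8)         # 6.25
```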

     Anyway, that's how it works. Now for the killer question: Who's
     done the necessary research to know what data points should be
     collected by default from which devices? If you have and can share
     your findings, let us know. We'll be glad to let you extend our
     current stock collections in the DataCollection.xml file, and we'll
     give you credit too! Who could ask for anything more?

     Oh yeah, I forgot to tell you about thresholding. But that can wait
     'til next week.


Coding Projects Underway:

     * CDP/L2/Mapping -- The Pete Siemsen show has begun and comments are
       directed to the [disc] (discovery, not discuss) list.

     * Snort Integration -- Still no update? Is there life in Leman-land?

     * Solaris Port -- Ben's working on it. Harald's got the icmpd mojo
       workin'.

     * NT/2K Port -- No update. The concerns are still with some of our
       dependencies.

     * SNMP Poller/Data Collection -- Thresholding is in.

     * User Interfaces -- Working on event ownership and acknowledgement.
       It's working in CVS if you want to play with it. Note that you'll
       need the newest create.sql too...

     * New Pollers -- Finally, some interest in CORBA. Should be seeing
       some posts on the list soon...

     * Maji Prelim Work -- Rick is building Perl code that is
       successfully parsing MIB files. Check him out, in all his glory,
       on the "events" list.

     * Configuration -- Still no code. Status? Nada to report.

     * Discovery/CAPSD/Database Review -- Mike's got "coalescence" again
       while Sowmya readies for a big vacation. Who approved that?!?

     * Agent Technologies -- Craig's been pulled to other tasks for the
       past two weeks. Nothing new to report. Looks like agent
       availability is sliding to June.

     * Reporting -- Jacinta is hacking something up and Larry is XSLTing
       it to display in the Web UI. Looks promising, with PDFs right
       around the corner.

     * Notification -- Releasing new cleaned-up parsing code for invoking
       notification in 0.7.3

     * New Installer -- What? No one noticed?


===================
Upcoming Road Shows
===================

   In case you've never seen us live and in person, here's your chance:

     * May 5th - Twin Cities LUG, Minneapolis, MN
     * May 10th - Boulder LUG, Boulder, CO
     * June 1st - GNU/Linux BBQ!! Drink good beer and eat fire.
     * June 2nd - Northern Virginia LUG (NOVALUG), Alexandria, VA
     * June 13-14 - OpenView Forum 2001, New Orleans, LA
     * July 25 - O'Reilly Open Source Convention, San Diego, CA
     * August 28-30 - Linux World Expo, San Francisco, CA (BOOTH)

   For additional details on these appearances and others, check out the
   web site at http://www.opennms.org/sections/opennms/events


============================
Early Adopter Program Status
============================

   Effective last week, Jeff took over this column. Effective this week,
   I'll be filling in during his absence.

   Installations and upgrades appear to be going well in the cases where
   we are running on distributions that we run in-house. On others,
   including Debian and SuSE where we don't have RPMs, installations are
   somewhat more painful.

   Have distributions fragmented Linux without anyone noticing?

   Jeff's in Memphis this week doing an install. And it's Debian. And he
   hates me.


=============
The Wish List
=============

   And now, on with the list...

     * In the 0.7.x release (and CVS), check out the TODO file

     * Testing on notification

     * New Data Collection configs wanted for the DataCollection.xml

     * Build some event configurations, you slackers!

     * Any interest in more TCP pollers? Let us know (or better yet,
       build one yourself...)

     * LDAP/POP3/nmap Pollers

     * Documentation and development your game? How about a white paper
       on how to extend OpenNMS with custom pollers, custom configs,
       and/or your own scripts/code.

     * Any additional help we can get proving our documentation either
       right or wrong is appreciated. Thanks.

     * Got any creative applications for OpenNMS that we haven't
       considered? Let us know!

     * A Security analysis of OpenNMS?


=============
Afterthoughts
=============

   The Web UI Authentication is as big a thorn in our side as we have had
   to date (including the old installer problems). We're working on it,
   and Larry has many plans in store for 0.7.4 (which we are targeting,
   barring any unforeseen circumstances, for May 4th).

   It's April 17th. Raleigh, NC. And it's snowing. Go figure.

   Anybody out there using Broadslate DSL? Let me know at
   shaneo@opennms.org

   And are you familiar with the "watch" command? This little gem is
   going to make my do loop skills rusty...

Ciao for now -- enjoy 0.7.3!

Shane O.
========
Shane O'Donnell
OpenNMS.org
shaneo@opennms.org
==================
_______________________________________________
announce mailing list (announce@opennms.org)
To subscribe, unsubscribe, or change your list options, go to:
http://www.opennms.org/mailman/listinfo/announce