From: announce-admin@opennms.org
To: announce@opennms.org
Subject: [OpenNMS-Announce] OpenNMS Update v2.16
Date: Tue, 17 Apr 2001 19:17:48 -0400 (EDT)

====================
  OpenNMS Update
====================
 Vol. 2, Issue 16
====================
  April 17, 2001
====================

In this week's installment...

* Project Status
  + 0.7.3 Released
  + Known Issues in 0.7.2
  + SNMP Data Collection Configuration
  + Coding Projects Underway
* Upcoming Road Shows
* Early Adopter Program Status
* The Wish List

==============
Project Status
==============

0.7.3 Released:

0.7.2 has had some problems (see the next section for details), and we've fixed a number of them, with those fixes available in CVS. We've also changed our release calendar to introduce a new release, 0.7.3, that will fix these issues. As I write this, Ben is uploading tar bundles.

Please note: We are seeing a number of inconsistencies with tar-bundle installations. Your results may be better (or at least our ability to help you with problems will be) if you install on Red Hat. Debian and SuSE, while nice OSes, are not our strength. If you prefer these platforms and can help us figure out what the issues are, we'd certainly appreciate it.

From the CHANGELOG, here's a list of feature/functionality updates in the 0.7.3 release:

* Fix for Red Hat 7.1 and JDK 1.3.0_02
* Changes to the build process, including updates to paths and some files missing from dist and install
* INSTALL.html now auto-generated again
* Fixes to install.pl for Perl 5.6.1, including auto-parsing of the homeDir entry in server.xml
* Some changes to compilation defaults
* RPM updates
* Better parsing of external exec command lines (should fix earlier issues with notifications and automated actions)
* Many logging updates, with a move to Log4J
* CAPSD changes, including better handling of constants, logging, et al.
* Introduced URL support for discovery ranges (despite the fact that it has been in the UI forever...)
* Updated node name and hostname handling
* Upgrade to the latest version of ONC RPC
* More reports
* Event database updated to include columns to support acknowledgement of events

Known Issues in 0.7.2 & 0.7.3:

Fortunately, we've solved some of the major bugs in 0.7.2, so the 0.7.2 list is decidedly shorter. And that, per Martha Stewart, is a good thing.

The biggest issues with 0.7.2 were SNMP traps not being written to the events table in the database and authentication problems in the Web UI. The SNMP traps issue was fixed almost immediately after release, and the code has been in CVS for a few days now. The Web authentication stuff, well, let's just say that it is still less than fixed. But it's at least getting some attention...

The Web authentication issues carry over into 0.7.3. We've come to the conclusion that as elegant as our authentication system is within the Tomcat framework, its elegance is lost somewhere amidst its lack of functionality. So we're abandoning it for 0.7.4, and Larry has promised a complete rewrite of the authentication mechanism. The current one, when it works, is great. However, it appears that the "when it works" timeframe is dwindling. But that's OK, since Larry has been looking for something to do...

SNMP Data Collection Configuration:

My main man Mike has put together a spiffy little tool in the combination of the SNMP Data Collector and its corresponding configuration files. Since I've been playing with the configs, expanding them here and there, for the past week or so, I thought I'd share some of my insights -- consider this a "What I Did Over Summer Vacation" report...

The SNMP Data Collection subsystem is made up, at its most basic, of the SNMP Poller and the DataCollection.xml config file. The poller, like any other poller, is invoked by associating appropriate polling ranges with it and dragging the service into a package (all via the Administrator's UI).
This config then ends up being reflected in the packages.xml file. Edit this one by hand if you like, but remember that you can get a fresh copy from CVS if you need it: http://www.opennms.org/cgi-bin/cvsweb.cgi/data/common/conf/packages.xml

The DataCollection.xml file -- http://www.opennms.org/cgi-bin/cvsweb.cgi/data/common/conf/DataCollection.xml -- is where the cool stuff lives (at least from an admin/operations perspective). The DataCollection.xml file contains three main sections:

* Database creation parameters (database)
* Group definitions (groups)
* System definitions (systems)

The database section basically constitutes the parameters necessary to build the underlying RRDTool databases where the performance data will be stored. It identifies the counters that we will force to exist, even if nothing else is defined for collection, and it exists with very little interaction (if any) with the other sections of the file -- but I'm getting ahead of myself.

The groups section creates a mapping of a groupName to a list of SNMP OIDs (Object Identifiers) to collect for a particular device. Each entry in the list contains a minimum of four subelements:

* oid -- The numeric object identifier to GET.
* instance -- The instance value for the OID: 0 (zero) if it's actually a "leaf" in the MIB tree, or the keyword "ifIndex" if we are after something in the ifTable. (Note: There's even an example of a hard-coded non-zero integer in the file, but that's only because I wanted a cheesy, quick-and-dirty workaround for the Host Resources MIB...)
* alias -- A text string describing the OID. You can populate this with the OID's name from SMI (which we have done so far), but this value is used only by RRD and the Web UI for human readability -- there is no attempt to convert it via some compiled SMI somewhere (basically, because we don't yet have a MIB compiler...).
* type -- A text string (case-insensitive) that describes the data type for RRD.
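To make the four subelements concrete, here's a hypothetical sketch of what a groups-section entry might look like. The oid, instance, alias, and type values come from the description above; the enclosing element names and the "name" attribute are my own guesses, not necessarily the actual DataCollection.xml schema -- check the file in CVS for the real spelling:

```xml
<!-- Hypothetical sketch only; element/attribute names may differ
     from the actual DataCollection.xml schema. -->
<group name="mib2-interfaces">
  <mibObj>
    <oid>.1.3.6.1.2.1.2.2.1.10</oid>   <!-- ifInOctets, from the ifTable -->
    <instance>ifIndex</instance>       <!-- table entry, so keyword ifIndex -->
    <alias>ifInOctets</alias>          <!-- human-readable label for RRD/Web UI -->
    <type>Counter</type>               <!-- RRD data type -->
  </mibObj>
</group>
```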
Currently, acceptable values are Counter, Gauge, Integer, and TimeTicks. For the RRD geeks amongst us, Integer and TimeTicks map back to Gauge.

So with that, we now have enough information to map a groupName to some set of OIDs to collect. Now we need a mechanism to decide which systems to collect data from, and which groupName to associate with each system. Thus, the systems section was born. And it was good.

The systems section provides a mechanism for us to map our groupNames, defined in the groups section, to sysObjectIDs -- more precisely, to specific sysObjectIDs or sysObjectID masks, either of which can be augmented with a list of IP addresses or IP address masks. How's that for befuddling?

In layman's terms, when CAPSD figures out that a device supports SNMP, we pull the device's sysObjectID and cram it into the database. Then, when the scheduler establishes the list of devices that the SNMP Poller should poll, we grab the corresponding sysObjectID of each device. We then compare that sysObjectID to our systems section and figure out which groupNames we need to associate with that IP, and in turn, which OIDs go with those groupNames.

But the masking and IP address list give us some granularity that we didn't have before (Mike just built this in over the past few weeks -- what a guy). Masking the sysObjectID allows us to specify the first few characters of a sysObjectID and compare only those digits against the leftmost characters of the sysObjectID we collected from the device. So...

* .1.3.6.1.4.1.9.1.217 is valid as either a sysoid or a sysoidmask for a Cisco 2900 switch
* .1.3.6.1.4.1.9. is valid as a sysoidmask for all Cisco devices
* .1.3.6.1 is valid as a sysoidmask for anything that supports SNMP
* .1.3.6.1.4.1.647. is valid as a sysoidmask for anything from Lexmark -- but note the trailing dot. That's necessary to prevent it from matching 6470, 64780, or 64789012345678901234567890.
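The matching rule described above is just a leftmost-substring comparison, which a few lines of Python can illustrate (the function name is mine for illustration, not actual OpenNMS code):

```python
def sysoid_matches(device_sysoid, sysoidmask):
    """Leftmost-prefix match, as the systems section applies a sysoidmask.

    The trailing dot matters: ".1.3.6.1.4.1.647." matches only OIDs under
    the Lexmark enterprise arc, while ".1.3.6.1.4.1.647" would also match
    .1.3.6.1.4.1.6470 and friends.
    """
    return device_sysoid.startswith(sysoidmask)

# A Cisco 2900 switch matches the all-Cisco mask:
print(sysoid_matches(".1.3.6.1.4.1.9.1.217", ".1.3.6.1.4.1.9."))   # True
# Without the trailing dot, a mask can over-match:
print(sysoid_matches(".1.3.6.1.4.1.6470.1", ".1.3.6.1.4.1.647"))   # True (oops)
# With the trailing dot, it doesn't:
print(sysoid_matches(".1.3.6.1.4.1.6470.1", ".1.3.6.1.4.1.647."))  # False
```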
Now let's talk granular granularity, an important form of SNMP data collection strategery -- augmenting the sysoid or sysoidmask with IP addresses. Just as you would surmise, you have the option of adding a tag under the sysoid or sysoidmask that takes a list of IP addresses or address masks. This gives you the ability to say "collect these OIDs from any Cisco device in the 192.168.0.0/24 subnet" with the following systems section entries:

<sysoidmask>.1.3.6.1.4.1.9.</sysoidmask>
<ipList>
    <ipAddrMask>192.168.0.</ipAddrMask>
</ipList>

Note that the ipAddrMask works just like the sysoidmask does, matching a substring of the leftmost characters of the IP address. And also note that an ipAddr tag works just like you'd think it would. You can repeat multiple ipAddr or ipAddrMask tags to get the grouping you want.

Yes, the masking is not the most elegant solution, and yes, it can make configuration of networks subnetted on other than /8, /16, or /24 boundaries a pain, but the alternative was to skip masking and provide only an ipAddr tag. Yes, you are correct -- in that light, masking doesn't look quite so bad...

So now we've created the mappings that allow us to say which sysObjectIDs we're going to poll for which OIDs, with the option of adding IP-based granularity. So what happens when we push this into practice?

Essentially, when the SNMP poller comes up, it's going to check the rules and ranges to determine what IP addresses it should be working with (all handled via the scheduler). The poller will check which OID-to-sysObjectID mappings are in place (as well as any ipAddr or ipAddrMask tags that might be present), and will begin polling for those data points appropriately. We will also check whether an RRD for that interface exists. If it does, we'll prepare to write to it. If not, we'll create it.

When the device is polled, success is an all-or-nothing deal.
If we poll for 15 OIDs, succeed on 14 of them, and error out on the 15th, we're going to discard the data for that poll. Is that ideal? No. Is that functional? Pretty much. And when we collect all 15 data points successfully, we jam them into an RRD under the covers.

Note that we've had to tweak the RRD code a little bit to make it thread-safe for writing, but now we can go full-bore at this thing and gather more data, more effectively, and get it into an RRD in a fully multi-threaded environment. Pretty cool? Yeah, we think so.

And the strengths of having RRD under the covers don't stop at having a database there (database? OK, that's arguable, but it is a data store, anyway...). RRD, for those of you who aren't familiar with it, comes with internal functions to build graphs based on the data, and the data stores are built at a fixed, fully-defined size, so you have no need to worry about overrunning your drive space with ongoing data collection.

The data will be stored as configured in the database section of the DataCollection.xml file: by default, 8928 granular data points (data collected every five minutes, or about one month's worth) and 8784 roll-up data points, each an average of the data collected over 12 polls (by default, one year's worth of hourly data).

Considering the number of data points we build into the RRD by default, the RRDs are going to take up about 8MB per interface you wish to collect on. This can be tweaked as necessary, but it will be closer to 50MB per interface if you want a year's worth of the most granular data. That could add up a lot quicker than adding in 8MB/interface increments. Actually, about 6.25 times quicker.

Anyway, that's how it works. Now for the killer question: who's done the necessary research to know what data points should be collected by default from which devices? If you have and can share your findings, let us know.
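The default retention arithmetic above is easy to verify; a quick sketch (the 8MB and 50MB figures are the estimates quoted above, not derived here):

```python
# Default RRD retention, per the database section of DataCollection.xml:
granular_points = 8928   # one sample every 5 minutes
rollup_points = 8784     # one averaged point per 12 polls, i.e. hourly

# 8928 five-minute samples cover about one month:
days_of_granular = granular_points * 5 / (60 * 24)
print(days_of_granular)          # 31.0 days

# 8784 hourly roll-ups cover one (leap) year:
days_of_rollup = rollup_points / 24
print(days_of_rollup)            # 366.0 days

# 50MB for a year of 5-minute data vs. the 8MB default mix:
print(50 / 8)                    # 6.25 -- "about 6.25 times quicker"
```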
We'll be glad to let you extend our current stock collections in the DataCollection.xml file, and we'll give you credit, too! Who could ask for anything more? Oh yeah, I forgot to tell you about thresholding. But that can wait 'til next week.

Coding Projects Underway:

* CDP/L2/Mapping -- The Pete Siemsen show has begun, and comments are directed to the [disc] (discovery, not discuss) list.
* Snort Integration -- Still no update? Is there life in Leman-land?
* Solaris Port -- Ben's working on it. Harald's got the icmpd mojo workin'.
* NT/2K Port -- No update. The concerns are still with some of our dependencies.
* SNMP Poller/Data Collection -- Thresholding is in.
* User Interfaces -- Working on event ownership and acknowledgement. It's working in CVS if you want to play with it. Note that you'll need the newest create.sql, too...
* New Pollers -- Finally, some interest in CORBA. Should be seeing some posts on the list soon...
* Maji Prelim Work -- Rick is building Perl code that is successfully parsing MIB files. Check him out, in all his glory, on the "events" list.
* Configuration -- Still no code. Status? Nada to report.
* Discovery/CAPSD/Database Review -- Mike's got "coalescence" again while Sowmya readies for a big vacation. Who approved that?!?
* Agent Technologies -- Craig's been pulled to other tasks for the past two weeks. Nothing new to report. Looks like agent availability is sliding to June.
* Reporting -- Jacinta is hacking something up, and Larry is XSLTing it to display in the Web UI. Looks promising, with PDFs right around the corner.
* Notification -- Released new, cleaned-up parsing code for invoking notifications in 0.7.3.
* New Installer -- What? No one noticed?

===================
Upcoming Road Shows
===================

In case you've never seen us live and in person, here's your chance:

* May 5th - Twin Cities LUG, Minneapolis, MN
* May 10th - Boulder LUG, Boulder, CO
* June 1st - GNU/Linux BBQ!! Drink good beer and eat fire.
* June 2nd - Northern Virginia LUG (NOVALUG), Alexandria, VA
* June 13-14 - OpenView Forum 2001, New Orleans, LA
* July 25 - O'Reilly Open Source Convention, San Diego, CA
* August 28-30 - Linux World Expo, San Francisco, CA (BOOTH)

For additional details on these appearances and others, check out the web site at http://www.opennms.org/sections/opennms/events

============================
Early Adopter Program Status
============================

Effective last week, Jeff took over this column. Effective this week, I'll be filling in during his absence.

Installations and upgrades appear to be going well in the cases where we are running on distributions that we run in-house. On others, including Debian and SuSE, where we don't have RPMs, installations are somewhat more painful. Have distributions fragmented Linux without anyone noticing?

Jeff's in Memphis this week doing an install. And it's Debian. And he hates me.

=============
The Wish List
=============

And now, on with the list...

* In the 0.7.x release (and CVS), check out the TODO file
* Testing on notification
* New data collection configs wanted for DataCollection.xml
* Build some event configurations, you slackers!
* Any interest in more TCP pollers? Let us know (or better yet, build one yourself...)
* LDAP/POP3/nmap pollers
* Documentation and development your game? How about a white paper on how to extend OpenNMS with custom pollers, custom configs, and/or your own scripts/code?
* Any additional help we can get proving our documentation either right or wrong is appreciated. Thanks.
* Got any creative applications for OpenNMS that we haven't considered? Let us know!
* A security analysis of OpenNMS?

=============
Afterthoughts
=============

The Web UI authentication is as big a thorn in our side as we have had to date (including the old installer problems). We're working on it, and Larry has many plans in store for 0.7.4 (which we are targeting, barring any unforeseen circumstances, for May 4th).
It's April 17th. Raleigh, NC. And it's snowing. Go figure.

Anybody out there using Broadslate DSL? Let me know at shaneo@opennms.org

And are you familiar with the "watch" command? This little gem is going to make my do-loop skills rusty...

Ciao for now -- enjoy 0.7.3!

Shane O.
========
Shane O'Donnell
OpenNMS.org
shaneo@opennms.org
==================

_______________________________________________
announce mailing list (announce@opennms.org)
To subscribe, unsubscribe, or change your list options, go to:
http://www.opennms.org/mailman/listinfo/announce