Having worked for an IBM Business Partner for quite some years, I was used to deploying and configuring (and even teaching, for IBM) IBM Director as a monitoring solution for hardware, operating systems, SNMP devices and so on. Now that I work as a sysadmin, I have to maintain an IBM Director 5.20.3 setup that I installed and configured myself quite some time ago (as a consultant back then). But I didn't want to update to 6.2 because it simply kills the machine it runs on: it needs too much processor and too much memory, and just to give you an idea, it's a WebSphere/Java thing that you have to install now ... I wanted to go the open source way instead, but with something that can still monitor Linux, Windows, SNMP devices and IPMI devices (we have quite a few IBM servers and BladeCenters).

I tested Zabbix and immediately fell in love with it: the agent's memory footprint is really small (compared with the Java-based agent on the Director side) and the way you build Items and Triggers is really great. I deployed it in our environment but focused first on the OS/services side (as the 'other' monitoring solution was still there for the hardware layer). I then wanted to use the integrated IPMI features of Zabbix and started to poll data from our IBM servers ... until ... crash!

From zabbix_server.log:

2774:20101217:100001.893 IPMI Host [my.host.name]: first network error, wait for 15 seconds
2774:20101217:100002.894 Got signal [signal:11(SIGSEGV),reason:2,refaddr:0x34a3f52a38]. Crashing ...

Hmm, not good when the monitoring application itself crashes. I disabled all my IPMI checks and the server came back without any issue. I repeated those steps back and forth (enabling the IPMI checks, then disabling them again) to prove that the crash was really IPMI related, and it is. Browsing the Zabbix support website turned up some interesting answers, including this one (ZBX-2898) and certainly this one (ZBX-633). OK, so that confirms that the IPMI checks have to stay disabled for now, and let's wait for Zabbix 1.8.4 to appear ... In the meantime I'll write some scripts (item type External Check) that return values to Zabbix, which can then be used to create Triggers ... that's also one of the advantages of Zabbix: you can always write your own plugins/scripts to do the same thing :-)
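
To give an idea of what such an External Check script could look like, here is a minimal sketch in Python. It is not the script I actually run, and it rests on several assumptions: that ipmitool is installed on the Zabbix server, that its `sdr list` output has the usual `name | reading | status` shape, that the BMC accepts a (hypothetical) user called `monitor`, and that the password is exported in the `IPMI_PASSWORD` environment variable (read by ipmitool's `-E` option). The function names and the script name are made up for the example.

```python
#!/usr/bin/env python3
"""Sketch of a Zabbix External Check that reads one IPMI sensor via ipmitool."""
import re
import subprocess
import sys


def parse_sdr_reading(sdr_output, sensor_name):
    """Return the numeric reading for sensor_name, or None if unavailable.

    Assumes lines shaped like: 'Ambient Temp     | 22 degrees C      | ok'
    """
    for line in sdr_output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) >= 2 and fields[0] == sensor_name:
            # Keep only the leading number; 'no reading' yields no match.
            match = re.match(r"-?\d+(?:\.\d+)?", fields[1])
            if match:
                return float(match.group(0))
    return None


def read_sensor(host, sensor_name, user="monitor"):
    """Query the BMC of 'host' and return the sensor value (or None)."""
    # 'monitor' and the 'lan' interface are assumptions; adapt to your BMCs.
    command = ["ipmitool", "-I", "lan", "-H", host, "-U", user, "-E",
               "sdr", "list"]
    result = subprocess.run(command, capture_output=True, text=True,
                            timeout=10)
    if result.returncode != 0:
        return None
    return parse_sdr_reading(result.stdout, sensor_name)


if __name__ == "__main__":
    if len(sys.argv) == 3:
        value = read_sensor(sys.argv[1], sys.argv[2])
        if value is None:
            sys.exit(1)  # print nothing, so no bogus value is stored
        print(value)
    else:
        print("usage: ipmi_sensor.py <host> <sensor name>", file=sys.stderr)
```

The script only prints a bare number on stdout, so the item can be stored as a numeric value and used in a Trigger like any other item. How the host and sensor name reach the script depends on how the External Check item key is defined in your Zabbix version, so check the documentation for the release you run.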