Sunday, November 9, 2014

Why do I hate Nagios?

You either love or hate Nagios; there are no intermediate opinions once you have used this application. I am definitely on the uncomfortable side. I'm not a radical SCOM guy, and I like other free (as in freedom) monitoring systems such as Zabbix, Sensu and even PandoraFMS, but I really hate Nagios.
And if you want, there are a lot of very good SaaS alternatives too, such as New Relic, Datadog, Monits, etc.

Nagios reminds me of the good old times, when we only had a few machines in our data centers running three or four services, and they were easy to manage and monitor. Do you remember the Big Brother monitoring software? It was quite popular 15 years ago and it was very simple, because our projects were very simple compared with our projects nowadays.

One of the main problems of Nagios is that it needs a lot of dirty work in the config files (they are hard to parse, understand and modify). It doesn't provide a good web user interface by itself, so you need permissions to access the server (for example via ssh) and modify the config files. Nagios config files are boring and complex to manage, and you can waste a lot of time creating the perfect config file. Nagios still provides a user interface from 1999; it is slow and ugly and, remember, time means money. OK, it is true, there are Nagios distributions like OMD or Centreon with a better user interface, but they aren't agile enough yet. For example, there is a huge difference between Centreon and New Relic.
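To give an idea of what that config work looks like, here is a minimal sketch that generates Nagios host definitions from a list of machines instead of writing each one by hand. The `define host` block with the `use`, `host_name` and `address` directives follows the standard Nagios object syntax; the function names, the sample inventory and the `generic-host` template name are assumptions made for illustration, so adapt them to your own setup.

```python
def render_host(host_name: str, address: str, template: str = "generic-host") -> str:
    """Render one Nagios `define host` block as text.

    `template` is assumed to name a host template that already exists
    in your configuration (hypothetical default: "generic-host").
    """
    return (
        "define host {\n"
        f"    use        {template}\n"
        f"    host_name  {host_name}\n"
        f"    address    {address}\n"
        "}\n"
    )


def render_hosts(inventory) -> str:
    """Render every (host_name, address) pair, one block per host."""
    return "\n".join(render_host(name, addr) for name, addr in inventory)


if __name__ == "__main__":
    # Hypothetical inventory; in practice it could come from a CMDB or a CSV file.
    inventory = [("web01", "192.0.2.10"), ("db01", "192.0.2.20")]
    print(render_hosts(inventory))
```

You would still have to write the output to a file that Nagios loads and reload the daemon, which is exactly the kind of extra plumbing this post complains about.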

Nagios is hard to deploy, especially the agents. By default it doesn't provide any type of autodiscovery or automated agent deployment, so you have to use third-party tools like Puppet or Chef to achieve it. There are some Nagios distributions which improve deployment, like FAN, but even with these solutions, if you are in a mixed environment with Linux and Windows systems, for example, you can't deploy agents filtered by a WMI query or deploy agents to an OU. To do something like this you have to be a Nagios hacker and spend your time (and money, again) developing a system to do it.

Nagios doesn't provide charting or mapping by default, so you need third-party software like PNP4Nagios or NagVis, and they aren't part of Nagios at all. Other solutions provide this out of the box, and it helps you to be more proactive when you are monitoring systems or sending understandable dashboards to other (non-IT) departments in your company.

It's hard to do reporting and data mining in Nagios. By default, Nagios saves the monitoring activity in a plain text file, a format that is difficult to parse; you would probably need tons of regexps to do it well. You can use ELK to manage this log or use MySQL as the Nagios backend. Without these features it is difficult to produce fast reports for the executives or the auditors.
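As a rough illustration of the parsing work involved, here is a small sketch that reads alert lines from the Nagios log and counts service alerts by state. It assumes the usual `[epoch] TYPE: field;field;...` line layout, with the state as the third field of a `SERVICE ALERT` entry; the function names and the sample lines are made up for the example, and a real report would need to handle many more entry types.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# "[1415491200] SERVICE ALERT: web01;HTTP;CRITICAL;HARD;3;Connection refused"
LINE_RE = re.compile(r"^\[(\d+)\] ([^:]+): (.*)$")


def parse_line(line: str):
    """Split one log line into time, entry type and ';'-separated fields.

    Returns None for lines that do not match the assumed layout.
    """
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    epoch, kind, payload = match.groups()
    return {
        "time": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
        "type": kind,
        "fields": payload.split(";"),
    }


def count_service_states(lines) -> Counter:
    """Count SERVICE ALERT entries by state (assumed to be the third field)."""
    states = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["type"] == "SERVICE ALERT" and len(record["fields"]) >= 3:
            states[record["fields"][2]] += 1
    return states


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real installation.
    sample = [
        "[1415491200] SERVICE ALERT: web01;HTTP;CRITICAL;HARD;3;Connection refused",
        "[1415491260] SERVICE ALERT: web01;HTTP;OK;HARD;3;HTTP OK",
        "[1415491300] HOST ALERT: db01;DOWN;HARD;1;CRITICAL - Host Unreachable",
    ]
    print(count_service_states(sample))
```

Even this small report needs a regexp, knowledge of the field order and special cases per entry type, which is why shipping the log to ELK or using a database backend is usually the more practical route.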

Another point to evaluate is scalability, or distributed monitoring. It is very hard to implement with a default Nagios. Nagios is written in C and is very light and fast, so a single node can handle a lot of sensors, but horizontal scaling is complicated. There are some examples like this, but this piece of software was not developed to work in this way.

High availability is also hard to implement, and I haven't found real HA deployments. All the docs I have found are based on a monitor monitoring the monitoring system, something like this.

Nagios was created in 1999, and a lot of new things have appeared since then: new monitoring models, new software, cloud, etc. You can adapt it to today's monitoring requirements (not modern APM, AFAIK), but it just wasn't developed for this.

In a nutshell, Nagios is a product for Nagios hackers who love it because they love coding for it and they are really good at working with it. For them it is very flexible software and there are many ways to implement it, so they really can create amazing monitoring systems.

Nowadays there are better and easier alternatives for monitoring your systems, as I said at the beginning of the post, and sometimes easier, faster and newer means saving money and increasing productivity.

Finally, the best thing about Nagios is the community and the documentation; there are a lot of people running this software around the world. From here, thanks to all of them for their work!

1 comment:

  1. I could not agree more with your points on Nagios. It is hell for people who don't have time to waste becoming configuration file editing experts and add-on installation experts. By far the best SW I have used so far is PRTG, as it can be installed and monitoring hundreds of services in less than 20 minutes. Adding new items is a few clicks. However, it only runs on Windows. I have been searching for years for something which comes close to PRTG but for Linux.