nkinkade, December 15th, 2010

When I started at CC a number of years ago and began having to review Logwatch output on a daily basis, I tired quickly of the massive list of failed SSH login attempts in the log output. I care much less about who failed to login than who actually did log in. So the first thing I did was to reduce the verbosity of the SSH filters for Logwatch by creating the file /etc/logwatch/conf/services/sshd.conf, and added only “Detail = 0” to it. However, I still found it annoying to have thousands of failed login attempts on virtually all servers. Granted, I wasn’t really worried that anyone would get in by trying to brute-force a login. It was a more a matter of principle, and also a small bit that every failed login attempt uses some tiny amount of resources that could better be used for legitimate traffic. So I implemented connection rate limiting via Netfilter. However, that didn’t work for our then software engineer Asheesh, who generally has around 30 open terminals and as many SSH connections to remote hosts, and who was hitting the rate connection limit. So he started using the ControlMaster feature of SSH to get around this limitation. Some time later I removed the rules altogether with the idea that they weren’t doing anything useful, and were probably detrimental because the kernel was having to inspect a bunch of incoming packets and track connections. Also, at that same time Asheesh recommend that I use a program called fail2ban instead of tackling the issue with Netfilter. I didn’t like the idea. Something seemed hackish about inserting Netfilter rules via some daemon process that scrapes log files of various services. I also am an advocate of running as few services as possible on any given server; the less that runs, the less chance that something will fail in a service-impacting way. Then, the whole thing fell into the forgotten, until a few days ago.

A few days ago I was looking over the Logwatch output of our servers, as I do ever day, and was offended to find that on one server in particular there were nearly 30,000 failed SSH login attempts in a single day. Sure, in terms of network traffic and machine resources, it’s just a drop in the bucket, but it aggravated me. I revisited the idea of fail2ban and did a bit more research. I came to the conclusion that it was pretty stable and worked really well for most people. So I decided to install it on one server. I was really happy to find that it was as easy as apt-get install fail2ban. Done! On Debian, fail2ban works for SSH out-of-the-box, and I didn’t have to do a thing; just another testament to the awesomeness of package management in Debian. I was so impressed that I went ahead and installed it on all CC servers. It has been running nicely for about a week, and failed SSH login attempts are now reduced to a few dozen a day on each machine. Are the machines more secure? Probably not. But it’s just one of those things that makes a sysadmin happy.

One Response to “fail2ban”

  1. piggy says:

    Take a look at ossec for monitoring. Especially the ActiveResponse tool.