Debian

Upgrade to Debian Squeeze and Mediawiki woes

Nathan Kinkade, February 10th, 2011

Just a number of days ago Debian released Squeeze as the new stable version. I decided to test the upgrade on one or two of CC’s servers to see how it would go. The upgrade process was standard and went without problems, as one comes to expect with Debian. The trouble didn’t manifest until I noticed that one of our sites running on Mediawiki had apparently broken.

I narrowed the problem down to several extensions. Upgrading to Squeeze brought in a new version of PHP, taking it from 5.2.6 (in Lenny) to 5.3.3. PHP was emitting warnings in the Apache logs like:

Warning: Parameter 1 to somefunction() expected to be a reference, value given in /path/to/some/file.php on line ##

Looking at the PHP code in question didn’t immediately reveal the problem to me. I finally stumbled across PHP bug 50394. A specific comment on that bug revealed that the issues I was seeing were not a bug, necessarily, but the result of the way PHP 5.3.x handles a specific form of incorrect coding.

In summary, it turns out the problem is related to Mediawiki hooks and its use of the call_user_func_array() PHP built-in function. The function takes two arguments: a user function name, and an array of arguments. If the called function expects some of the arguments to be passed in by reference, then each element of the passed array must be explicitly marked as a reference. For example, this is correct:

function lol( &$var1, $var2 ) { /* do something */ }
$a = 'foo';
$b = 'bar';
$args = array( &$a, $b );  // note the & before $a, which lol() expects by reference
call_user_func_array('lol', $args);

However, you will get a PHP warning, and a subsequent failure of call_user_func_array(), if $args is defined like this (missing the & before $a):

$args = array( $a, $b );

Interestingly, the “correct” way of handling this case, where the callback function expects variables by reference, is itself a deprecated form of call-time pass-by-reference, and the call_user_func_array() documentation states this:

Referenced variables in param_arr are passed to the function by reference, regardless of whether the function expects the respective parameter to be passed by reference. This form of call-time pass by reference does not emit a deprecation notice, but it is nonetheless deprecated, and will most likely be removed in the next version of PHP.

As far as I can tell, this deprecated method is the only way to handle this, yet PHP may drop the functionality. Presumably another method will replace it before that happens, but the ambiguity at the moment leaves one wondering how to code properly for this without risking that the code will break in a future release of PHP. I suppose the only sure way is to make sure that your callback doesn’t need any referenced variables. I’d be happy for someone to point me to the right way to handle this, if for some reason my research simply failed to turn up the correct method.
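For what it’s worth, the workaround I applied amounts to dropping the reference from the callback’s signature. Here is a minimal sketch using a hypothetical MediaWiki-style hook handler (not code from any real extension); since objects in PHP 5 are passed by handle anyway, removing the & usually changes nothing for object arguments:

// Hypothetical hook handler. The original signature took &$article by
// reference, which triggers the PHP 5.3 warning when the hook is dispatched
// via call_user_func_array() with a plain array of arguments.
//
// Before (warns under PHP 5.3.x):  function onMyHook( &$article, $text ) { ... }
// After (no warning):
function onMyHook( $article, $text ) {
    // do something with $article and $text
    return true; // MediaWiki hooks return true to let processing continue
}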

I found this breakage in the following extensions, but presumably it exists in many more:

ReCAPTCHA
RecentActivityNotify
SpamBlacklist

The fix for the ReCAPTCHA extension was easy, since it’s published on the extension’s page. For the other extensions, I investigated the places where this problem was occurring and removed the references from the function definitions, but not before poking around a bit to make reasonably sure that the references weren’t fully necessary.

Lesson: use caution when doing any upgrade that moves you from PHP 5.2.x (or earlier) to 5.3.x (or later). Google searches reveal that this issue is rife not only in Mediawiki, but also in Joomla!, and presumably in any other CMS or framework that makes use of call_user_func_array().


fail2ban

Nathan Kinkade, December 15th, 2010

When I started at CC a number of years ago and began having to review Logwatch output on a daily basis, I quickly tired of the massive list of failed SSH login attempts in the log output. I care much less about who failed to log in than about who actually did log in. So the first thing I did was to reduce the verbosity of the SSH filters for Logwatch by creating the file /etc/logwatch/conf/services/sshd.conf and adding only “Detail = 0” to it. However, I still found it annoying to have thousands of failed login attempts on virtually all servers. Granted, I wasn’t really worried that anyone would get in by trying to brute-force a login. It was more a matter of principle, and also the small matter that every failed login attempt uses some tiny amount of resources that could better be used for legitimate traffic. So I implemented connection rate limiting via Netfilter. However, that didn’t work for our then software engineer Asheesh, who generally has around 30 open terminals and as many SSH connections to remote hosts, and who was hitting the connection rate limit. So he started using the ControlMaster feature of SSH to get around this limitation. Some time later I removed the rules altogether with the idea that they weren’t doing anything useful, and were probably detrimental because the kernel was having to inspect a bunch of incoming packets and track connections. Also, at that same time Asheesh recommended that I use a program called fail2ban instead of tackling the issue with Netfilter. I didn’t like the idea. Something seemed hackish about inserting Netfilter rules via some daemon process that scrapes the log files of various services. I am also an advocate of running as few services as possible on any given server; the less that runs, the less chance that something will fail in a service-impacting way. Then the whole thing was forgotten, until a few days ago.
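For the record, the Netfilter rate limiting I had been using was along these lines, built on the “recent” match module (the thresholds shown here are illustrative, not the exact values we ran):

# Track new SSH connections per source IP, and drop any source that opens
# too many new connections within a minute. Thresholds are examples only.
iptables -A INPUT -p tcp --dport 22 -m state --state NEW -m recent --set --name SSH
iptables -A INPUT -p tcp --dport 22 -m state --state NEW -m recent --update --seconds 60 --hitcount 5 --name SSH -j DROP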

A few days ago I was looking over the Logwatch output of our servers, as I do every day, and was offended to find that on one server in particular there were nearly 30,000 failed SSH login attempts in a single day. Sure, in terms of network traffic and machine resources, it’s just a drop in the bucket, but it aggravated me. I revisited the idea of fail2ban and did a bit more research. I came to the conclusion that it was pretty stable and worked really well for most people. So I decided to install it on one server. I was really happy to find that it was as easy as apt-get install fail2ban. Done! On Debian, fail2ban works for SSH out of the box, and I didn’t have to do a thing; just another testament to the awesomeness of package management in Debian. I was so impressed that I went ahead and installed it on all CC servers. It has been running nicely for about a week, and failed SSH login attempts are now reduced to a few dozen a day on each machine. Are the machines more secure? Probably not. But it’s just one of those things that makes a sysadmin happy.
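If you do want to tweak the defaults, the Debian package reads overrides from /etc/fail2ban/jail.local; something along these lines adjusts the SSH jail (the values here are illustrative, not necessarily what we run):

# /etc/fail2ban/jail.local -- local overrides to the stock jail.conf
[ssh]
enabled  = true
port     = ssh
filter   = sshd
logpath  = /var/log/auth.log
maxretry = 5
bantime  = 600   # seconds an offending source stays banned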


liblicense 0.8.1: The bugfixiest release ever

asheesh, December 25th, 2008

I’m greatly pleased to announce liblicense 0.8.1. Steren and Greg found a number of major issues (Greg found a consistent crasher on amd64, and Steren found a consistent crasher in the Python bindings). These issues, among some others, are fixed by the wondrous liblicense 0.8.1. I mentioned to Nathan Y. that liblicense is officially “no longer ghetto.”

The best way to enjoy liblicense is from our Ubuntu and Debian package repository, at http://mirrors.creativecommons.org/packages/. More information on what liblicense does is available on our wiki page about liblicense. You can also get it in fresh Fedora 11 packages. And the source tarball is available for download from sourceforge.net.

P.S. MERRY CHRISTMAS!

The full ChangeLog snippet goes like this:

liblicense 0.8.1 (2008-12-24):
* Cleanups in the test suite: test_predicate_rw’s path joiner finally works
* Tarball now includes data_empty.png
* Dynamic tests and static tests treat $HOME the same way
* Fix a major issue with requesting localized informational strings, namely that the first match would be returned rather than all matches (e.g., only the first license of a number of matching licenses). This fixes the Python bindings, which use localized strings.
* Add a cooked PDF example that actually works with exempi; explain why that is not a general solution (not all PDFs have XMP packets, and the XMP packet cannot be resized by libexempi)
* Add a test for writing license information to the XMP in a PNG
* Fix a typo in exempi.c
* Add basic support for storing LL_CREATOR in exempi.c
* In the case that the system locale is unset (therefore, is of value “C”), assume English
* Fix a bug with the TagLib module: some lists were not NULL-terminated
* Use calloc() instead of malloc()+memset() in read_license.c; this improves efficiency and closes a crasher on amd64
* Improve chooser_test.c so that it is not strict about the *order* in which the results come back, so long as they are the right licenses.
* To help diagnose possible xdg_mime errors, if we detect the hopeless application/octet-stream MIME type, fprintf a warning to stderr.
* Test that searching for unknown file types returns a NULL result rather than a segfault.


64 bit woes (almost) cleared up

Nathan Kinkade, August 2nd, 2008

As I mentioned in a recent post, we have upgraded our servers to 64 bit. All of them are now running the amd64 port of Debian. The first three servers were upgraded remotely, but we noticed that a few applications were constantly dying due to segmentation faults. There was some speculation that this was a strange consequence of the remote upgrade process, so we upgraded the 4th server by reprovisioning it with Server Beach as a 64 bit system, cleanly installed from scratch.

Well, it turned out that even the cleanly installed 64 bit system was having problems. So I installed the GNU Debugger, which I had never actually used before. I attached it to one of the processes that was having a problem, and what should immediately reveal itself but:

(gdb) c
Continuing.
[New Thread 1090525536 (LWP 16948)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 1082132832 (LWP 16865)]
0x00002aaaaacfcd91 in tidySetErrorSink () from /usr/lib/libtidy-0.99.so.0
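For anyone who hasn’t used it, attaching gdb to an already-running process is roughly this simple (the PID below is hypothetical):

gdb -p 12345    # attach to the misbehaving process by PID
(gdb) c         # continue execution and wait for the segfault to land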

Nathan Yergler made a few changes to cc.engine, the Zope-based application that was having the problem, to remove any dependencies on libtidy, and the segfaults ceased. We haven’t had the time to debug libtidy itself, but it would seem that there was some incompatibility between the version we had installed and a 64 bit system.

We are still having a problem with cgit segfaulting, and that is the next thing to look into … 1 down, 1 to go.


liblicense 0.8 (important) fixes RDF predicate error

asheesh, July 30th, 2008

Brown paper bag release: liblicense claims that the RDF predicate for a file’s license is http://creativecommons.org/ns#License rather than http://creativecommons.org/ns#license. Only the latter is correct.

Any code compiled with liblicense between 0.6 and 0.7.1 (inclusive) contains this mistake.

This time I have audited the library for other insanities like the one fixed here, and there are none. Great thanks to Nathan Yergler for spotting this. I took this chance to change ll_write() and ll_read() to *NOT* take NULL as a valid predicate; this makes the implementation simpler (and more correct).

Sadly, I have bumped the API and ABI numbers accordingly. It’s available in SourceForge at http://sf.net/projects/cctools, and will be uploaded to Debian and Fedora shortly (and will follow from Debian to Ubuntu).

I’m going to head to Argentina for a vacation and Debconf shortly, so there’ll be no activity from me on liblicense for a few weeks. I would love help with liblicense in the form of further unit tests. Let’s squash those bugs by just demonstrating all the cases the library should work in.


32 to 64 bit remotely

Nathan Kinkade, July 15th, 2008

A couple months ago I posted here about some of our experiences with Varnish Cache as an HTTP accelerator. By and large I have been very impressed with Varnish. We even found that it had the unexpected benefit of acting as a buffer in front of Apache, preventing Apache from getting overwhelmed with too many slow requests. Apache would get wedged once it had reached its MaxClients limit, whereas Varnish seems to happily queue up thousands of requests even if the backend (Apache) is going slowly.

However, after a while we started running into other problems with Varnish, and I found the probable answer in a bug report at the Varnish site. It turns out that Varnish was written with a 64 bit system in mind. That isn’t to say that it won’t work nicely on a 32 bit system, just that you had better not expect high server load, or else you’ll start running into resource limitations in a hurry. This left us with two options: move to 64 bit or ditch Varnish for something like Squid. Seeing as I was loath to do the latter, we decided to go 64 bit, which in any case is another logical step into the 21st century.

The problem was that our servers are co-located in data centers around the country. We didn’t want to hassle with reprovisioning all of them. Asheesh did the first remote conversion based on some outdated document he found on remotely converting from Red Hat Linux to Debian. It went well and we haven’t had a single problem on that converted machine since. Varnish loves 64 bit.

I have now converted two more machines, and this last time I documented the steps I took. I post them here for future reference and with the hope that it may help someone else. Note that these steps are somewhat specific to Debian Linux, but the concepts should be generally applicable to any UNIX-like system. There are no real instructions below, so you just have to infer the method from the steps. See the aforementioned article for more verbose, though dated, explanations. BE WARNED that if you make a mistake and don’t have some lovely rescue method then you may be forced to call your hosting company to salvage the wreckage:

  • [ssh server]
  • aptitude install linux-image-amd64
  • reboot
  • [ssh server]
  • sudo su -
  • aptitude install debootstrap # if not already installed
  • swapoff -a
  • sfdisk -l /dev/sda # to determine swap partition, /dev/sda5 in this case
  • mke2fs -j /dev/sda5
  • mount /dev/sda5 /mnt
  • cfdisk /dev/sda # set /dev/sda5 to type 83 (Linux)
  • debootstrap --arch amd64 etch /mnt http://http.us.debian.org/debian
  • mv /mnt/etc /mnt/etc.LOL
  • cp -a /etc /mnt/
  • mv /mnt/boot /mnt/boot.LOL
  • cp -a /boot /mnt/ # this is really just so that the dpkg post-install hooks don’t issue lots of warnings about things not being in /boot that it expects.
  • chroot /mnt
  • aptitude update
  • aptitude dist-upgrade
  • aptitude install locales
  • dpkg-reconfigure locales # optional (I selected All locales, default UTF-8)
  • aptitude install ssh sudo grub vim # and any other things you want
  • aptitude install linux-image-amd64
  • vi /etc/fstab # change /dev/sda5 to mount on / and comment out old swap entry
  • mkdir /home/nkinkade # just so I have a home, not necessary really
  • exit # get out of chroot
  • vi /boot/grub/menu.lst # change root= of default option from sda6 to sda5
  • reboot
  • [ssh server]
  • sudo su -
  • mount /dev/sda6 /mnt
  • chroot /mnt
  • dpkg --get-selections > ia32_dpkg_selections
  • exit
  • mv /home /home.LOL
  • cp -a /mnt/home /
  • mv /root /root.LOL
  • cp -a /mnt/root /
  • mkdir /mnt/ia32
  • mv /mnt/* /mnt/ia32
  • mv /mnt/.* /mnt/ia32
  • cp -a bin boot dev etc etc.LOL home initrd initrd.img lib lib64 media opt root sbin srv tmp usr var vmlinuz /mnt
  • mkdir /mnt/proc /mnt/sys
  • vi /mnt/etc/fstab # make /dev/sda6 be mounted on / again, leave swap commented out
  • vi /boot/grub/menu.lst # change the default boot option back to root=/dev/sda6
  • reboot
  • [ssh server]
  • sudo su -
  • mkswap /dev/sda5
  • vi /etc/fstab (uncomment swap line)
  • swapon -a
  • dpkg --set-selections < /ia32/ia32_dpkg_selections
  • apt-get dselect-upgrade # step through all the questions about changed /etc/files, etc.