Nginx as a front-end proxy cache for WordPress

The short version:

We put an nginx caching proxy server in front of our WordPress MU install and sped it up dramatically – in some cases a thousandfold. I’ve packaged this up as a plugin, along with installation instructions, here: WordPress Nginx proxy cache integrator.

The long version:

Here at blogs.law.harvard.edu, our WordPress MU install was struggling. We get a fair amount of traffic (650k+ visits/month); combine that with bots (good and bad) and we were in serious trouble. RSS feeds are expensive to create (and we serve many from some pretty prominent blogs), files are gatewayed through PHP (on WPMU), and letting PHP dynamically create each page meant we were VERY close to maxing out our capacity – which we frequently did, bringing our blogs to a crawl.

WordPress – as lovely as it is – needs some kind of caching system in place once you start to see even moderate levels of traffic. There are many, many high quality and well-maintained options for caching – however, none of them really made me happy, or fit my definition of the “holy grail” of how a web app cache should work.

In my mind, caching should:

  • be high performance (digg and slashdot proof),
  • be lightweight,
  • be structured to avoid invoking the heavy application frameworks it sits in front of. If you hit your app server (in this case, WordPress) – you’ve failed.
  • be as unobtrusive as possible: caching should be a completely separate layer that lives above your web apps,
  • have centralized and easily tweaked rules, and
  • be flexible enough to work for any type (or amount) of traffic.

So I decided to put a proxy in front of WordPress to statically cache as much as possible. ALL non-authenticated traffic is served directly from the nginx file cache, taking some requests (such as RSS feed generation) from 6 pages/second to 7000+ pages/second. Oof. Nginx also handles logging and gzipping, leaving the heavier backend Apaches to do what they do best: serve dynamic WordPress pages only when needed.
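For illustration only, a minimal nginx sketch of that kind of setup might look like the following. This is not the configuration shipped with the plugin: the cache path, the zone name wpcache, the backend address, the hostname, the cache lifetime, and the cookie pattern used to spot logged-in users are all assumptions made for the example.

```nginx
# These two blocks belong in the http {} context.
# Hypothetical cache location and zone name.
proxy_cache_path /var/cache/nginx/wp levels=1:2 keys_zone=wpcache:10m
                 max_size=1g inactive=60m;

upstream apache_backend {
    server 127.0.0.1:8080;          # assumed address of the Apache/WordPress backend
}

server {
    listen 80;
    server_name blogs.example.org;  # placeholder hostname

    gzip on;                        # let nginx do the compression

    location / {
        # Assumption: a wordpress_logged_in cookie marks an authenticated user.
        set $skip_cache 0;
        if ($http_cookie ~* "wordpress_logged_in") {
            set $skip_cache 1;
        }

        proxy_cache        wpcache;
        proxy_cache_key    "$scheme$host$request_uri";
        proxy_cache_valid  200 301 302 5m;   # assumed lifetime for cached responses
        proxy_cache_bypass $skip_cache;      # authenticated requests go to the backend
        proxy_no_cache     $skip_cache;      # and their responses are never stored

        proxy_set_header   Host $host;
        proxy_set_header   X-Real-IP $remote_addr;
        proxy_pass         http://apache_backend;
    }
}
```

Whether a given response actually gets cached also depends on the headers the backend sends, so treat this as a starting point and follow the plugin's own instructions for a real install.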

A frontend proxy also handles “lingering closes” – clients that fail to close a connection, or that take a long time to do so (say, because they’re on a slow connection). Taken to an extreme, lingering closes act as a “Slowloris” attack, and without a frontend proxy your heavy Apaches are left tied up. With a lightweight frontend proxy, you can handle more connections with less memory. Throw a cache in the mix and you can bypass the backend entirely, giving you absolutely SILLY scalability.

On nginx – it’s so efficient it’s scary. I’ve never seen it use more than 10 to 15 meg of RAM and a blip of CPU, even under our heaviest load. Our ganglia graphs don’t lie: we halved our memory requirements, doubled our outgoing network throughput and completely leveled out our load. We have had basically no problems since we set this up.

To make a long story short (too late), I packaged this up as a plugin along with detailed installation and configuration info. Check it out! Feedback appreciated: WordPress Nginx proxy cache integrator.

Convert a MySQL database from latin1 to utf8 the RIGHT way

You’ll see many blog posts around the interwebs stating that you can just dump a MySQL database via mysqldump – globally replace “latin1” (or some other character set) in the dump file – and then import that into a utf8 database and it’ll just work. This appears, however, to be WRONG. It does not force MySQL to convert the text; it only fools you into believing that your latin1 characters have been converted. You have to actually convert the text yourself, otherwise the columns will just be unconverted latin1 sitting in a utf8 table.

One way to do this is to convert the column in question to binary and back again – assuming your database/table is set to utf8, this will force MySQL to convert the character set correctly.

Another – better – way is to just use iconv to convert during the dump process. This will convert latin1 characters to utf8 properly.

mysqldump --add-drop-table database_to_correct | replace CHARSET=latin1 CHARSET=utf8 | iconv -f latin1 -t utf8 | mysql database_to_correct
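To see what the iconv stage of that pipeline does at the byte level, here is a small sketch that needs no database. It assumes a POSIX shell with printf, od, tr and iconv available; the sample word and the variable names are made up for the example, and it uses the canonical encoding names ISO-8859-1 and UTF-8 in place of the latin1 and utf8 aliases.

```shell
# 'café' encoded as latin1: the final 'é' is the single byte 0xE9 (octal 351).
latin1_hex=$(printf 'caf\351' | od -An -tx1 | tr -d ' \n')

# Treating those same bytes as UTF-8 without converting them fails validation.
# This is the "relabel only" situation: the label changes, the bytes do not.
if printf 'caf\351' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    relabel_valid=yes
else
    relabel_valid=no
fi

# Converting with iconv rewrites 'é' as the two-byte UTF-8 sequence 0xC3 0xA9.
utf8_hex=$(printf 'caf\351' | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')

echo "latin1 bytes:         $latin1_hex"
echo "valid UTF-8 as-is?    $relabel_valid"
echo "bytes after iconv:    $utf8_hex"
```

The point of the sketch is only that changing a character set label and converting the bytes are different operations. It is worth running the same kind of check on a few lines of your own dump before piping it back into MySQL.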

PLEASE correct me if I’m wrong – this seems like yet another MySQL idiosyncrasy that shouldn’t exist.