Sep 4, 2006

Proxying Gotchas (and tricks) with Nginx

Filed under: nginx 

As I continue my migration away from Pound and Lighttpd, I'm discovering dark corners of Nginx. Tonight I ran into a problem proxying from Nginx to a Lighttpd backend that's running Mailman and it turns out to be a gem.

Because I've got a lot of sites to migrate, I'm trying to minimize the changes I make at any given point. Many sites rely on custom Lighty configs and with those I'm simply swapping Nginx in for Pound and proxying to Lighty. Later I'll work on eliminating Lighty.

Anyway, I have a site that's using Mailman, and because Nginx doesn't have CGI support, I'm leaving Lighty in place to handle Mailman for the moment. I tried an Nginx config that looked something like this:

location ~ /(mailman|pipermail)/ {
    proxy_pass http://127.0.0.1:8000/;
    proxy_redirect off;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

This failed to load with the following error:

* Checking nginx' configuration ...
2006/09/04 02:36:33 [emerg] 4067#0: "proxy_pass" may not have URI part in location given by regular expression or inside the "if" statement in /etc/nginx/nginx.conf:7
2006/09/04 02:36:33 [emerg] 4067#0: the configuration file /etc/nginx/nginx.conf test failed
* failed, please correct errors above

This seemed strange. Why would a proxy not be allowed within a regex location?

"Fine", I thought. "I'll just eliminate the regex for now and figure it out later." So I changed the config to look like this:

location /mailman/ {
    proxy_pass http://127.0.0.1:8000/;
    proxy_redirect off;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

This was accepted, but I got a 404 whenever I tried to go to /mailman. Lighty is configured to listen on port 8000, and I could see from the Lighty log that the request was getting passed through. What was odd is that I could also see that the first part of the request path was getting stripped off. For instance, /mailman/subscribe would become just /subscribe and thus generate a 404.

Finally I realized I'd added a trailing slash to the proxy_pass directive in nginx.conf. That trailing slash is the "URI part" the earlier error message was complaining about. Removing it solved the problem. I then tried the regex location again, and this too now worked.

My config now looked like this:

location ~ /(mailman|pipermail)/ {
    proxy_pass http://127.0.0.1:8000;
    proxy_redirect off;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

So what's the gem? Well, apparently you can strip off the location part of the URL by appending the trailing slash to the proxy_pass directive. This is actually a pretty useful feature if you are proxying to a server such as TurboGears that isn't mounted at the document root. Rather than mucking about trying to get the server root set correctly (which I've never managed to do), you can simply have Nginx strip it off.
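To make the trailing-slash behaviour concrete, here is a minimal Python sketch that models it for a plain prefix location. It is only a mental model, not Nginx's actual implementation, and the function name upstream_uri is made up for illustration.

```python
from urllib.parse import urlsplit


def upstream_uri(location_prefix, proxy_pass, request_uri):
    """Model the URI a prefix location sends to the backend (illustrative only)."""
    backend_path = urlsplit(proxy_pass).path
    if not backend_path:
        # proxy_pass has no URI part: the request URI is passed through unchanged.
        return request_uri
    # proxy_pass has a URI part (even just "/"): the portion of the request
    # URI that matched the location is replaced by that URI part.
    return backend_path + request_uri[len(location_prefix):]


# With the trailing slash, the location prefix is stripped.
print(upstream_uri("/mailman/", "http://127.0.0.1:8000/", "/mailman/subscribe"))
# Without it, the backend sees the original path.
print(upstream_uri("/mailman/", "http://127.0.0.1:8000", "/mailman/subscribe"))
```

The first call yields /subscribe (the 404 case above) and the second yields /mailman/subscribe, which is what Lighty and Mailman expect.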

[Update]

I wanted to give a more concrete example of using this feature, but didn't have one at the time I originally wrote this entry. I'm moving to a new blog and ended up using this.

The new blogging software I'm using runs on top of Zope, which, unless you are a Zope user, can lead to all sorts of mysterious requirements. Well, one of these requirements is that the blog reside mounted at /users (well, not /users specifically, but not the root). I wanted the blog to be the front page. I could solve this in Zope, I know, but it would be a pain to figure out. Enter Nginx:

Solution 1: rewrite urls.

The obvious and traditional (à la Apache) solution would be to use rewrite rules, and this does, in fact, work:

server {
    listen 80;
    server_name blog.twisty-industries.com;
    location = / {
        rewrite (.*) /users last;
    }

    location / {
        proxy_pass http://127.0.0.3:8082;
        include /etc/nginx/proxy.conf;
    }
}

Is there a downside? Shrug. There is a slightly better way however:

Solution 2: proxy path:

server {
    listen 80;
    server_name blog.twisty-industries.com;
    location / {
        proxy_pass http://127.0.0.3:8082/users;
        include /etc/nginx/proxy.conf;
    }
}

I suspect this is a bit more efficient, both in lines of configuration and in request processing time. In short, I like it.





Aug 24, 2006

Goodbye Lighty. Hello Nginx.

Filed under: nginx lighttpd 

Lately I've gotten fed up with Lighttpd. There have been outstanding bugs that are so familiar they've acquired names. The project's lead, Jan Kneschke, seems more interested in schmoozing up to the Rails crowd than providing a decent product (which is ironic given that Mongrel aims to make Lighttpd irrelevant in the Rails world).

Anyway, a discussion today on the TurboGears list brought up an alternative.

Bob Ippolito replied to a discussion between myself and another person about whether to use Pound or Lighttpd as a reverse-proxy in front of TurboGears applications. I held that Pound is the correct solution as it is a proxy, whereas Lighttpd is a web server that can act as one. Further, I expressed my frustration regarding the state of Lighttpd development and the unmaintainability of its config files.

Bob offered up the following information:

One problem with Lighty is that it leaks memory like a sieve [1]. I audited it for a little bit and I gave up, it's a mess. I'd steer clear of it, it will quickly ruin your day if you throw a lot of traffic at it.

The only solution I know of that's extremely high performance that offers all of the features that you want is nginx [2], but its documentation is largely in Russian. I can't read Russian, but I was able to figure it out (the configuration language isn't Russian, neither is C source). I currently have nginx doing reverse proxy of over tens of millions of HTTP requests per day (thats a few hundred per second) on a single server. At peak load it uses about 15MB RAM and 10% CPU on my particular configuration (FreeBSD 6).

Under the same kind of load, apache falls over (after using 1000 or so processes and god knows how much RAM), pound falls over (too many threads, and using 400MB+ of RAM for all the thread stacks), and lighty leaks more than 20MB per hour (and uses more CPU, but not significantly more).

[1] http://trac.lighttpd.net/trac/ticket/758

[2] http://sysoev.ru/en/

I found this interesting, first off because I know that Bob has used Pound in the past and because, well, it's Bob. Also, someone on the #cherokee channel had suggested Nginx as an option, but since the docs were in Russian I was reluctant to commit to it.

Well after Bob's email, I started searching and it turns out that while the official docs are in Russian, there's a bit of English documentation on the web, and apparently some happy users as well:

Also of interest is that Nginx happens to live in Gentoo's portage.





Aug 29, 2006

Unix Domain Sockets between Nginx and TurboGears

Filed under: nginx turbogears 

I decided to check once again if there remained a bug with specifying Unix domain sockets in the TurboGears configuration (we're on our third configuration system, so it seemed reasonable to think a bug or two might have gotten shaken out at some point ;-)

And apparently it did (or perhaps the note I make at the end of this article is why I've gotten it to work this time and failed previously).

Nginx configuration:

upstream mycluster {
    server unix:/path/to/socket-1;
    server unix:/path/to/socket-2;
    # ...
}

server {
    # ...
    location / {
        proxy_pass http://mycluster;
    }
}

TurboGears configuration:

server.socket_port=""
server.socket_file="/path/to/socket-1"

Repeat the above configuration in separate config files for each backend you want.

Note that it's necessary to set server.socket_port="" or else someone (TurboGears), somewhere sets it to a default value of 8080. This wouldn't matter except it's how CherryPy determines whether or not we want a domain socket vs a TCP port.
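If you run more than a couple of backends, the "repeat for each backend" step could be scripted. The following Python sketch only renders the two settings shown above for a given socket path; the helper name backend_config is hypothetical, and the rest of each config file is left out.

```python
def backend_config(socket_path):
    """Render the two socket settings from this post for one backend."""
    # socket_port must be the empty string so CherryPy picks the domain socket.
    return 'server.socket_port=""\nserver.socket_file="%s"\n' % socket_path


# One snippet per backend listed in the Nginx upstream block.
for n in (1, 2):
    print(backend_config("/path/to/socket-%d" % n))
```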





Aug 16, 2008

Is Nginx a valid replacement for Apache?

Filed under: nginx apache 

I came across this post which, quite frankly, I consider to be complete and utter FUD. The author clearly at least tried Nginx or perused the documentation since he's got at least a passing familiarity with the software, but he seems to have missed the boat overall. I'm going to address his points one-by-one (at least as much as possible).

[Nginx] behaves in inexplicable ways for different browsers.

This is complete and utter FUD. Browsers behave in inexplicable and different ways given the same content, but all Nginx does is serve content. It doesn't create content, so this claim is patently ridiculous.

The primary documentation is in Russian.

More FUD. It is true that Igor (the author of Nginx) is Russian and writes his documentation in Russian (surprise!). However, every bit of that documentation has been translated to English (translations into several other languages are under way as well, though I cannot attest to their completeness). The English documentation has existed for over two years (and is linked to from the "primary" documentation), so there's little excuse to not be aware of it.

Nginx does not support .htaccess files.

Absolutely true. It's considered a feature. .htaccess files are certainly convenient, but they also introduce a serious performance penalty (if you enable them, Apache must check every directory on every page request to see if it has an .htaccess file and if it has changed since the last time it read it). One of Nginx' claims to fame is that it is much faster than Apache. One of the ways it achieves this is by not doing things that add lots of overhead to every request when there are faster ways of achieving the same thing. Further, .htaccess files are a potential attack vector for hackers, although this isn't the main reason Nginx doesn't support them.

Nginx requires you to have apache support tools lying around to do stuff.

Sort of true, but ultimately false. It's true that Nginx does not ship with its own version of htpasswd. It could (and such a program is trivial to write) but for whatever reason it doesn't (perhaps the assumption is that most people have Apache installed and so it would be redundant). But of course the web is full of such tools, so you don't really need to install Apache just for this one tool. Also, the author says "tools", implying that something other than htpasswd is missing, which is highly misleading. There's nothing else missing.

Nginx doesn't actually do anything beyond serve static HTML and binary assets... which is to say, it doesn't run php or perl or any of the other P's that you might find in the LAMP stack. What it does is take requests and proxies them to other servers that do know how to execute that code.

Once again, sort of true but highly misleading. The author suggests that all Nginx can do is proxy requests to another HTTP server. This is patently false. Nginx can use HTTP proxying or FastCGI to talk to backend applications. What it does not do is embed languages into the webserver or support CGI. There is a project to embed Python into Nginx (mod_wsgi) but it's not widely used as most Nginx users consider this separation of concerns a good thing. With mod_php or mod_perl, it can be a real pain to debug things like memory leaks because the programming language's interpreter runs within the Apache process. Apache's mod_language tools make for easy deployment but painful debugging. Also, unless you want to restart the entire web server anytime you make a change, you must use .htaccess files (whose drawbacks were outlined above). CGI isn't supported simply because, as Igor says "if you use CGI then you don't care about performance and you should just use Apache in that case". In any case, the difference between how Nginx handles dynamic content and how Apache does it boils down to how they pass the request along to another process (binary API vs HTTP/FastCGI).

The author sums up with this:

Finally, I am left with the question why? The ostensible reason is that it's faster and can therefore handle more requests. Even if we accept that as true (grumble, grumble), it only accomplishes that speed by passing the buck off to other servers. When you find a non-responsive site it's not because the static assets like images and HTML text are being served slowly... it's because the dynamic content generated by php/perl/python/ruby/whatever and the underly database from which the data is drawn cannot keep up. Nginx suffers that same failing... while requiring just as many resources because you now have to run so many different servers for each of the languages you want to code it.

Again, the author displays his complete and utter lack of understanding of a web stack and the difference between how Apache handles a request and how Nginx does it. Apache is a process-based server (threads being a type of process), whereas Nginx is asynchronous. Threaded programs can (but usually don't) perform as well as async programs if you are only measuring, say, requests per second under a small to average load. What they don't do nearly as well is called scaling. The reason threaded programs don't scale as well is because they typically launch a separate thread for each incoming request. Not only is there latency introduced as the thread is spawned (this can be addressed to some degree with thread pools, but that introduces other issues), but the worst part is that a thread consumes a significant amount of RAM. On 32 bit Linux, this amount is typically 2MB per thread (the default stack size). That's 2MB per concurrent request, even if you are serving a 2K page of static HTML or a small image. The thread can also allocate more RAM if it needs to, but let's assume it doesn't for simplicity.

Let's say your server has 1GB of RAM and we ignore the RAM used by the kernel, Apache itself, PHP/Perl/Python/etc and all the other system processes. How many concurrent requests can you handle? Pretty easy: 1024MB/2MB = 512. Now for most sites, that's a pretty decent amount of traffic. Of course, we've consumed all the memory to do this, so if we get 513 requests we're now forced into swap. In reality, due to the other software running on the system, we'd certainly be forced into swap much, much sooner and we'd have much less available RAM to start with. Realistically, a 1GB Apache server could probably only handle around 100-200 simultaneous requests (much depends on the size of the response). Note that this is not the same as requests per second: with sub-second response times, 100-200 requests per second might amount to only 10-20 simultaneous requests.
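The arithmetic in the last two paragraphs can be written out as a short Python sketch. The 2MB stack size and 1GB of RAM come from the text; the 100 ms response time is an assumption picked only to illustrate "sub-second", and the function names are made up.

```python
def max_threads(ram_mb, stack_mb=2):
    """Upper bound on concurrent threads if RAM held nothing but thread stacks."""
    return ram_mb // stack_mb


def concurrent_requests(requests_per_second, response_time_ms):
    """Average number of requests in flight (rate multiplied by time per request)."""
    return requests_per_second * response_time_ms / 1000


# 1GB of RAM at 2MB per thread.
print(max_threads(1024))

# 100-200 requests per second at an assumed 100 ms per response.
print(concurrent_requests(100, 100), concurrent_requests(200, 100))
```

Under those assumptions the first line gives 512 threads, and the second shows that 100-200 requests per second keep only about 10-20 requests in flight at once.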

The author's claim that site slowness isn't caused by static files is probably true for 90% of the web. Of course for sites that serve large files this is absolutely false. The longer a request takes to process, the more likely it is we'll start seeing a higher number of concurrent requests (since new requests come in faster than we can respond to them). This is why sites like youtube.com and torrentspy.com have switched to Nginx for serving static content. A single Nginx server can easily handle as many concurrent connections as imposed by the operating system limits (rather than memory limits). You don't need to be youtube.com to bring an Apache server to its knees if you serve many files that are more than a few kilobytes each. A single Nginx server can easily replace a dozen similarly configured Apache servers for handling static content.

So what if you aren't serving big files? What difference does this make to you? Well, as a result of its asynchronous approach, Nginx also has the benefit of utilizing far less CPU. It doesn't take much traffic to drive Apache up into the 70-100% CPU utilization range. If you can manage to drive Nginx past 20% CPU then you don't need to read this, as you certainly already have a team of scalability experts who can tell you the pitfalls of threading.

The claim that Nginx is faster because it "hands off" processing to some other server is simply ludicrous. Apache does the same thing. The only difference is the protocol they use to communicate with that other process. Apache uses a binary API to pass information to PHP, Perl, Python, etc. Nginx uses HTTP or FastCGI. The processing of dynamic content is still done by the other process, not Apache or Nginx. Further, because Nginx uses less CPU and RAM, there are more resources available for that other process to use to get its job done. Frankly, this assertion left me flabbergasted. It's actually really difficult for me to believe that someone could have such a poor understanding of such a simple software stack.

Finally, I'm going to address the author's grudging allowance that Nginx might be faster ("grumble, grumble") for static content by simply asserting that Nginx stomps an absolute mudhole in Apache when it comes to overall performance. We aren't discussing a few KB/s faster or a few requests/second faster, we are discussing hundreds to thousands of requests per second faster for static content (depending on your hardware), all while using a fraction of the CPU and RAM Apache does before it fails.

Anyway. I'm going to try to forget I read this pile of rubbish. If the author wants to believe that Apache is better, so be it. I just find it unfortunate that a certain percentage of clueless people will find the article informative.





Feb 22, 2009

Nginx 1, Apache 0

Filed under: nginx 

I spent the better part of the evening converting several sites belonging to a friend of mine from Apache to Nginx. Previously the front page of the biggest site (written in Joomla) took anywhere from 5 to 10 seconds to load. If request load got even moderately heavy, Apache would quickly exhaust the 256MB of memory available in the VPS and the site would become permanently unavailable as processes were killed by the kernel, requiring the VPS to be restarted.

http://wiki.nginx.org/images/0/0e/Absolut_nginx.jpg

After this happened three times over a period of a week, I decided to take some action. I tried the "easy method", tuning Apache by following guides I found that purported to make Apache suitable to a VPS, but nothing really helped. At best they made the site slower and delayed the inevitable crash. Finally I bit the bullet and converted them all to Nginx (I was reluctant to do so because I'm not terribly familiar with Joomla and there aren't many examples around for running Joomla on anything but Apache - and previous trials a few years ago with Nginx + Mambo turned out to be a huge pain).

At the end of it all, the Nginx configuration turned out to be remarkably simple, and page loads are down to under 3 seconds (the site is very image-heavy and Joomla isn't terribly efficient). And of course, Nginx's memory utilization never exceeds a few megabytes, regardless of load.

To be fair, I'm anything but an Apache expert, so I'm curious if there really is a solution to running something like Joomla in a VPS with 256MB of RAM under Apache. If you think you have the answer, please post it in the comments.





Mar 11, 2009

New Nginx Wiki Live

Filed under: nginx mediawiki 

I decided to release the new Nginx wiki early as it is working great and I'm tired of trying to manage spam in MoinMoin (must every action take a dozen clicks and page loads to complete?)

I shouldn't have been surprised, but MediaWiki is quite small and fast. The new wiki only uses around 96MB of the 256MB RAM available in the VPS (that includes all userspace processes: Nginx, PostgreSQL, PHP/MediaWiki, etc). That's a refreshing change from the hundreds of megabytes MoinMoin would suck down (although I blame much of that on Python 2.4).

Anyway, the new wiki is here. Enjoy!





Sep 25, 2006

#nginx logs now available via Arkivo

Filed under: nginx arkivo irc 

UPDATE: The original archive appears to be dead, so I've set up a new Arkivo log here.

Thanks to Splee and his TurboGears-powered Arkivo, the nginx IRC logs are now publicly available.

Visit the #nginx IRC channel at irc.freenode.net and view the archives here.





Sep 17, 2006

Nginx Wiki Going Full Speed

Filed under: nginx 

It appears the wiki is a great success. I contacted Igor Sysoev to let him know what we were doing and he's put a link on nginx.net so that people can find the wiki.

We've also had more people than ever (up to 8!) in the IRC channel, which isn't bad considering I hung out there by myself for a couple weeks prior to that.

Anyway, to date, most of the work on the wiki has been done by myself and Olle Jonsson, but we've had a few other contributors as well: technoweenie (of Mephisto fame) contributed some info on running Mongrel with Nginx and we've had at least one Russian speaker (Aleksandar Lazic, who wrote the original English draft documents we relied on so heavily) helping us get the translations right.

Overall, I think we can claim that we're around 50-60% done. Most of the documentation is translated, but we need to have reviews to validate that it's correct. We've put links back to the original Russian documentation at the bottom of each page to help people who are able to compare.

Next up, we'll need a fancy logo ;-) Hopefully Igor doesn't mind.

Don't forget: visit the Nginx wiki and help out!

German and Russian language sections have also been started, but they need work!





Sep 16, 2006

English Wiki for Nginx Now Online

Filed under: nginx 

I set up an Nginx wiki last night.

Feel free to create an account and add to the knowledge, fix errors, etc.

Also, join us on #nginx on irc.freenode.net!





Sep 4, 2006

SquirrelMail under Nginx

Filed under: nginx squirrelmail 

This assumes that SquirrelMail is installed at /var/www/public/webmail.

The Nginx configuration:

location ~ /webmail/.*\.php$ {
    root /var/www/public;
    fastcgi_pass 127.0.0.1:1025;
    fastcgi_index index.php;
    fastcgi_param SCRIPT_FILENAME /var/www/public$fastcgi_script_name;
    fastcgi_param SCRIPT_NAME $fastcgi_script_name;
    fastcgi_param QUERY_STRING $query_string;
    fastcgi_param REQUEST_METHOD $request_method;
    fastcgi_param CONTENT_TYPE $content_type;
    fastcgi_param CONTENT_LENGTH $content_length;
    fastcgi_param REQUEST_URI $request_uri;
    fastcgi_param DOCUMENT_URI $document_uri;
    fastcgi_param DOCUMENT_ROOT /var/www/public/webmail;
    fastcgi_param SERVER_PROTOCOL $server_protocol;
    fastcgi_param REMOTE_ADDR $remote_addr;
    fastcgi_param REMOTE_PORT $remote_port;
    fastcgi_param SERVER_ADDR $server_addr;
    fastcgi_param SERVER_PORT $server_port;
    fastcgi_param SERVER_NAME $server_name;
    fastcgi_param REDIRECT_STATUS 200;
}

location ~ /webmail {
    index index.php;
    root /var/www/public;
}

Next, you must run a PHP FastCGI instance. For this purpose I'm using spawn-fcgi from the Lighttpd distribution:

/usr/local/bin/spawn-fcgi -a 127.0.0.1 -p 1025 -u sqmail -g sqmail -f /usr/lib/php4/bin/php-cgi




Aug 28, 2006

Load Balancing for TurboGears using Nginx

Filed under: nginx turbogears 

I decided to test out Nginx's load-balancing capabilities. Turns out to be remarkably easy.

CherryPy is not a fast HTTP server (CP3 is supposedly at least twice as fast, but not released yet, and either way, TurboGears is currently tied to CP2). That's okay, but there are some things you can do to make sure that it doesn't become a bottleneck for your TurboGears website.

The first and obvious thing is to serve static files (images, javascript, css, etc) from a regular webserver. Lots of people use Apache or Lighttpd for this purpose. A few others (including myself) prefer to use Nginx.

This solves a lot of the performance issues right off the bat. However there's another technique that can help quite a bit as well: load balancing.

Nginx provides a simple and elegant way to load balance across two or more backend servers. Here's how you do it with TurboGears.

The TurboGears Setup

Edit your prod.cfg file and add/change the following lines to look like:

server.socket_host="127.0.0.1"
server.socket_port=8000

Next, copy your prod.cfg file to a file called prod2.cfg and edit it with the following:

server.socket_host="127.0.0.1"
server.socket_port=8001

Next, start two TurboGears instances:

./start-myproject.py prod.cfg&
./start-myproject.py prod2.cfg&

The Nginx Configuration

The next step is to edit your Nginx configuration and add the following:

upstream myproject {
    server 127.0.0.1:8000;
    server 127.0.0.1:8001;
}

server {
    listen 80;
    server_name www.domain.com;
    location / {
        proxy_pass http://myproject;
    }
}

Now restart Nginx.

Unbelievably, that's it. If you want to see the effects (and convince yourself it works), you can change your dev.cfg files in the same fashion (probably from two separate terminals) and click through your site like mad (or wget it) and watch as each TurboGears backend takes part of the load.

A couple things to note:

  • Sessions appear to work fine. I've used this technique with Identity and it doesn't appear to matter which backend the request goes to.
  • This probably won't work with SQLite. SQLite is not meant for concurrent access and running two TG servers against the same database is going to be a problem. Besides, if you are doing load balancing, why are you using a toy database anyway? ;-)
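To picture how requests get spread across the two backends, here is a small Python simulation. It assumes plain round-robin with equal weights, which is a simplification of what Nginx actually does (no weights, failures or retries are modelled), and the function name distribute is made up for illustration.

```python
from collections import Counter
from itertools import cycle

# The two backends from the upstream block above.
BACKENDS = ["127.0.0.1:8000", "127.0.0.1:8001"]


def distribute(n_requests, servers):
    """Hand out n_requests to servers in strict rotation and count them."""
    rotation = cycle(servers)
    return Counter(next(rotation) for _ in range(n_requests))


print(distribute(10, BACKENDS))
```

With an even number of requests, each backend ends up with exactly half of them, which matches what you see in the two TurboGears terminals when clicking through the site.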




Aug 15, 2006

Generating Static HTML from TurboGears (Part 2)

Filed under: turbogears nginx 

In my previous article, I showed that it's pretty easy to have TurboGears generate static HTML files. However, unless they get served somehow, it's pretty useless. In my quest to figure out Nginx, I decided to make this work.

The Nginx Configuration:

location / {
    default_type text/html;
    types { text/html html; }
    root /var/www/public_html/cache;

    if (-f $request_filename.html) {
        rewrite (.*) $request_uri.html last;
    }

    proxy_pass http://127.0.0.1:8000;
    proxy_redirect off;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
}

The TurboGears Code

You'll need to add a directory under project/static for storing the cached HTML files. I call mine "cache". I think it's a good name so you should too =)

Add the following to your project's controllers.py:

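import os

import cherrypy
from cherrypy.filters.basefilter import BaseFilter  # CherryPy 2.x filter base class

class StaticOutFilter ( BaseFilter ):
    def before_finalize ( self ):
        if cherrypy.response.status is None:
            # drop empty path segments, e.g. '/blog/post' -> [ 'blog', 'post' ]
            request = filter ( None, cherrypy.request.path.split ( '/' ) )
            # absolute path; must match the "root" directive in the Nginx configuration
            path = os.path.join ( '/var/www/public_html/cache/', *tuple ( request ) ) + '.html'
            try:
                os.makedirs ( os.path.dirname ( path ) )
            except ( OSError, IOError ):
                pass # should log this
            file ( path, 'w' ).writelines ( cherrypy.response.body )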

class Root ( controllers.RootController ):
    _cp_filters = [ StaticOutFilter ( ) ]
    # rest of root controller follows ...

At this point, whenever Nginx receives a request, it appends ".html" to the end of the URL and checks that it exists in the cache directory. If it does, it serves the static HTML file, if it doesn't it passes the request on to TurboGears which will generate the file.
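The lookup described above can be sketched in a few lines of Python. This is only a model of the intended decision, not something Nginx runs; it is written for current Python 3 rather than the Python 2 of the filter above, and the names cache_path and handled_by are made up for illustration.

```python
import posixpath

# Same directory as the "root" directive in the Nginx configuration.
CACHE_ROOT = "/var/www/public_html/cache"


def cache_path(request_path, root=CACHE_ROOT):
    """Map a request path such as /blog/post to its cached HTML file."""
    parts = [p for p in request_path.split("/") if p]
    return posixpath.join(root, *parts) + ".html"


def handled_by(request_path, exists):
    """Say who answers the request; `exists` is any callable that tests for a file."""
    if exists(cache_path(request_path)):
        return "nginx (static file)"
    return "turbogears (proxied)"


print(cache_path("/blog/2006/post"))
print(handled_by("/blog/2006/post", exists=lambda path: False))
```

Passing os.path.isfile as the exists argument would check the real cache directory; a stub is used here so the sketch runs anywhere.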

Now clearly this simple setup is rife with limitations:

  • Changes made in the TurboGears app won't be reflected unless you manually remove the relevant HTML file(s).
  • Even if you do remove the files, truly dynamic content (such as displaying a username) won't be reflected.

The first problem is fairly trivial to solve. Assuming you provide an interface for editing content on your TurboGears site, simply add some code that cleans out the relevant cache file.
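A minimal sketch of that cleanup code, assuming the same cache layout as the filter (one .html file per request path under /var/www/public_html/cache). It is written for current Python 3, and the function name invalidate is hypothetical; you would call it from whatever controller method saves an edit.

```python
import os


def invalidate(request_path, root="/var/www/public_html/cache"):
    """Remove the cached copy of one page; return True if a file was deleted."""
    parts = [p for p in request_path.split("/") if p]
    path = os.path.join(root, *parts) + ".html"
    try:
        os.remove(path)
        return True
    except OSError:
        # Nothing cached for this page (or it could not be removed).
        return False
```

For example, after saving changes to the page served at /blog/post, calling invalidate("/blog/post") makes the next request fall through to TurboGears, which regenerates the file.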

The second problem is much trickier. There are lots of ways to address it, depending upon your application's needs, but some ideas that come to mind are:

  • Add a decorator that indicates whether or not a page is cacheable and have the filter check this attribute prior to trying to cache it.
  • Use AJAX to manage dynamic items.
  • For the specific example given of the username (and user-customizable items), it might make sense to have a separate view of the site for authenticated users that bypasses Nginx and goes straight to TurboGears (i.e. myaccount.domain.com rather than www.domain.com). If most of your traffic consists of anonymous users then this might be ideal. Serve static content to anonymous users and dynamic content to authenticated users.
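The decorator idea from the first bullet might look roughly like this. It is a sketch in current Python 3 with made-up names (cacheable, should_cache); how the filter would get hold of the handler for the current request is not shown and would depend on CherryPy internals.

```python
def cacheable(func):
    """Mark a page handler as safe to write out as static HTML."""
    func.cacheable = True
    return func


def should_cache(handler):
    """What the filter would check before writing the response to disk."""
    return getattr(handler, "cacheable", False)


@cacheable
def about():
    return "<html>About us</html>"


def my_account():
    # Shows per-user data, so it is deliberately left unmarked.
    return "<html>Hello, user</html>"


print(should_cache(about), should_cache(my_account))
```

Defaulting to "not cacheable" means a forgotten decorator costs some performance rather than leaking one user's page to another.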

Anyway, this aspect is left as an exercise for the reader (or perhaps myself at a later date). If you have suggestions, feel free to make a comment.





Aug 26, 2006

Migrating from Lighttpd to Nginx

Filed under: lighttpd nginx pound 

I'm an enthusiast. For what? Apparently whatever looks really promising at the moment. That being said, upon hearing that Bob Ippolito has been successfully running Nginx as a high-traffic proxy, I decided to dive right in.

My current standard setup has been Pound proxy (which frankly I've been quite pleased with) in front of various web servers and services, the most common case being in front of Lighttpd and TurboGears. TurboGears doesn't require a separate web server, but CherryPy is suboptimal (to say the least) for serving static content and I host many sites that don't use TurboGears, so there you have it.

Anyway, I'm an enthusiast, but I'm also inclined to be my own crash-test dummy. I don't recommend things until I've at least taken them around the block a few times. So it was time to convert my own sites to Nginx and test the waters.

I started with develix.com as it's pretty lightweight and Nginx's job would consist mainly of serving static files and proxying to the TurboGears app develix.com runs on. Also the fact that develix.com is on its own IP address simplified the migration. I could leave Pound in place for the other IP addresses and only tinker with develix.com.

I installed Nginx, edited the sample configuration file that came with it, commented out the develix.com entries (quite a few actually, since there are many subdomains), restarted Pound, then fired up Nginx. After a couple minor missteps, develix was happily chugging away on Nginx. Too easy.

Next I decided to migrate a customer's dedicated server. I figured if the experiment were a failure, it would be easy to switch back to his original configuration (Pound proxying to several Lighttpd instances, all running Joomla, Mambo, Sitedepth and Coppermine, i.e. a buttload of crappy, overly complicated PHP code). Further, because this server is mostly about serving large images and videos (yes, it hosts a few of those sites), I figured it would be both a good test of Nginx and a possible performance benefit to my customer.

I notified him of what I wanted to do, and since we'd been having issues with Lighty on that box, he was happy to let me have a go at it.

Well, it wasn't as easy as I'd hoped. Most of the issues were minor, but there were many (remember, Joomla. With lots of modules). Most of these were just a matter of figuring out what the hell was going on, making some rewrite rules, etc. Typical stuff when setting up a website.

One issue however, almost became a showstopper: it seems Nginx doesn't do CGI.

That's right. Of all things. Who would have thought? I mean, after static content, CGI is the simplest thing on the face of the earth. Additionally, it hadn't even occurred to me that he used CGI (only two scripts as it turns out). Anyway, long story short, I ended up writing a small Perl wrapper for the Perl CGI he had that let it be served via FastCGI and then running that from Lighttpd's spawn-fcgi program. This took a surprisingly long time to figure out how to do (think long, painful hours). You'd think there'd be a ready-made Perl script for doing this, but if there is, it's buried under tons of mirrors of the Perl documentation. By the time I finished, the script was quite short, on the order of a dozen lines, but all that meant was I was averaging two lines of code per hour. I won't say that Perl sucks, but I'm sure thinking it.

Anyway, once that was resolved, there was a bit of minor cleanup with file permissions, yada yada, but so far so good.





Nov292007

Mapping proxied backends with Nginx

Filed under: nginx 

Nginx 0.6.20 introduced the ability to use variables with the proxy_pass directive. This seems like a rather small feature by itself, but it allows us to utilize the powerful map directive to better organize virtual hosting.

In our shared hosting configuration, we typically assign an entire internal IP address (127.0.0.1, 127.0.0.2, etc.) to each client. This allows clients to map multiple ports to various applications (web apps, databases, etc.) without too much concern about other users.

However, this can be difficult to keep in order since these addresses tend to get scattered across dozens of virtual host entries and include files in the Nginx configuration.

This is where map, combined with variables in the proxy_pass directive, comes in handy.

Here's a short example:

http {
    resolver 127.0.0.1; # dns server
    map_hash_max_size 4096; # needed if your hash table gets big

    map $host $internal_ip {
        hostnames;
        .develix.com            127.0.0.2;
        .twisty-industries.com  127.0.0.3;
        .codemongers.com        127.0.0.4;
    }

    server {
        server_name develix.com www.develix.com;
        listen 198.145.247.210:80;
        location / {
            proxy_pass http://$internal_ip:8000$request_uri;
            include /etc/nginx/proxy.conf;
        }
    }

    server {
        server_name twisty-industries.com www.twisty-industries.com;
        listen 198.145.247.210:80;
        location / {
            proxy_pass http://$internal_ip:8000$request_uri;
            include /etc/nginx/proxy.conf;
        }
    }

    server {
        server_name codemongers.com www.codemongers.com;
        listen 198.145.247.210:80;
        location / {
            proxy_pass http://$internal_ip:8000$request_uri;
            include /etc/nginx/proxy.conf;
        }
    }

    server {
        server_name wiki.codemongers.com;
        listen 198.145.247.210:80;
        location / {
            proxy_pass http://$internal_ip:8001$request_uri;
            include /etc/nginx/proxy.conf;
        }
    }
}
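
The included /etc/nginx/proxy.conf isn't shown here. As an assumption, it might hold the usual proxy headers, along the lines of the settings from the earlier Mailman post:

```nginx
# /etc/nginx/proxy.conf (assumed contents; the real file is not shown)
proxy_redirect off;
proxy_set_header Host $host;                  # pass the original Host to the backend
proxy_set_header X-Real-IP $remote_addr;      # client address as seen by Nginx
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
```

Keeping these in one include file means every location that proxies to a backend picks up the same headers.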

Note that while I could have put the port in the variable as well, this detracts from the flexibility, since each server section could conceivably utilize more than one port. It also tends to make the mapping less readable.
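
To illustrate the point about ports, here is a sketch of a single server section that reaches two applications on the same client IP. The /wiki/ location and its port are hypothetical, and this block would stand in for the codemongers.com entry above rather than sit alongside it:

```nginx
server {
    server_name codemongers.com www.codemongers.com;
    listen 198.145.247.210:80;

    # main site on the client's port 8000
    location / {
        proxy_pass http://$internal_ip:8000$request_uri;
        include /etc/nginx/proxy.conf;
    }

    # hypothetical second application on port 8001, same client IP
    location /wiki/ {
        proxy_pass http://$internal_ip:8001$request_uri;
        include /etc/nginx/proxy.conf;
    }
}
```

Had the port been baked into the mapped variable, the second location would have needed its own map.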

As you can see, the advantage of this approach is the ability to keep all the mappings in a single location, which is far easier to maintain.
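
Taking that a step further, the mappings could be moved into their own file, since the map block accepts include and default entries. This is only a sketch: the file name and the fallback address are assumptions, and I haven't checked whether include inside map is available as far back as 0.6.20.

```nginx
map $host $internal_ip {
    hostnames;
    default 127.0.0.1;                 # assumed fallback for unmapped hosts
    include /etc/nginx/backends.map;   # hypothetical file of "host ip;" lines
}

# /etc/nginx/backends.map would then contain entries such as:
#   .develix.com            127.0.0.2;
#   .twisty-industries.com  127.0.0.3;
#   .codemongers.com        127.0.0.4;
```

Without a default, an unmapped host leaves $internal_ip empty and the proxy_pass URL has no host to connect to.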

Caveat: as of 0.6.20, the fastcgi_pass and upstream directives do not support variables, although I expect this will change soon.





Apr52008

Nginx turns 1,000,000

Filed under: nginx 

Nginx broke the 1 million domain mark this month.





Aug152008

Still a few Nginx shirts left...

Filed under: nginx 

Get the shirt that makes Patrick Swayze cry.

/images/nginx-shirt.jpg

$20USD via PayPal to sales(at)develix.com. Be sure to include your shirt size and shipping info (overseas is fine).

If you'd like more information, you can email me directly cliff(at)develix.com.





Mar112009

New Nginx Wiki

Filed under: nginx 

I'm almost done with converting our former MoinMoin-based wiki to a MediaWiki instance.

Overall I'm pretty pleased. Page loads are much faster, deployment is much simpler, and the management seems (so far) to be much less cumbersome in general.

You can see the work-in-progress here. I'm still hacking on the theme a bit, but it's pretty close to what I want.







Copyright © 2007, Cliff Wells