Blog

  • Very high performance web servers

    Have you ever tried to get Apache to handle 10,000 concurrent connections? For example, you have a very busy website and you enable keepalive on your web server. Then you set the timeout to something high like 300 seconds for ridiculously slow clients (sounds crazy but I think that’s Apache’s default). All of a sudden when you run netstat it tells you that you have thousands of clients with established connections to your machine.

    Apache can’t handle 10,000 connections efficiently because it uses a one-thread-per-connection model (or if you’re using prefork then one process per connection).

    If you want to allow your clients to use keepalive on your very busy website you need to use a server that uses an event notification model. That means that you have a single thread or process that manages thousands of sockets or connections. The sockets don’t block the execution of the thread but instead sit quietly until something happens and then have a way of notifying the thread that something happened and it better come take a look.
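    As a rough illustration of that model (not how any particular web server is implemented), here is a minimal sketch using Python's standard selectors module. The function name run_event_loop and the use of socket pairs as stand-ins for idle keepalive clients are assumptions made for the example. One thread registers every socket, sleeps until the kernel says one of them is readable, and only then touches it.

```python
import selectors
import socket


def run_event_loop(connections, max_events):
    """Serve many sockets from one thread; return the number of events handled."""
    sel = selectors.DefaultSelector()  # uses epoll on Linux, kqueue on the BSDs
    for conn in connections:
        conn.setblocking(False)
        sel.register(conn, selectors.EVENT_READ)

    handled = 0
    while handled < max_events:
        # Sleep until the kernel reports that at least one socket is readable.
        ready = sel.select(timeout=1.0)
        if not ready:
            break  # nothing happened within the timeout
        for key, _mask in ready:
            data = key.fileobj.recv(4096)
            if data:
                key.fileobj.sendall(data.upper())  # stand-in for building a response
            else:
                sel.unregister(key.fileobj)  # peer closed the connection
            handled += 1
    sel.close()
    return handled


if __name__ == "__main__":
    # 100 socket pairs stand in for 100 mostly idle keepalive clients.
    pairs = [socket.socketpair() for _ in range(100)]
    servers = [s for s, _ in pairs]
    clients = [c for _, c in pairs]
    clients[7].sendall(b"hello")
    clients[42].sendall(b"world")
    print(run_event_loop(servers, max_events=2))
    print(clients[7].recv(4096), clients[42].recv(4096))
```

    The 98 idle sockets cost nothing but a registration each; the thread only does work for the two that actually had data.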

    Most of us use Linux these days – of course there are the BSD diehards but whatever. The Linux 2.6 kernel introduced something called epoll, an event notification system for applications that want to manage lots of file descriptors without blocking execution and be notified when something changes.

    lighttpd and nginx are two very fast web servers that use epoll and a non-blocking event notification model to manage thousands of connections with a single thread and just a few megs of RAM (RAM consumption is the real reason you can’t use Apache for high concurrency). You can also spawn more than one worker on both servers if you’d like them to use more than one processor or CPU core.

    I used to use lighttpd 1.4.x but its configuration really sucks because it’s so inflexible. I love nginx’s configuration because it’s very intuitive and very flexible. It also has some very cool modules including an experimental embedded Perl module. The performance I’m getting out of it is nothing short of spectacular. I run 8 worker processes and each process consumes about 7 megs of RAM with a few modules loaded.

    So my config looks like:

    request ==> nginx_with_keepalive --> apache/appserver_nokeepalive

    If you’d like to read more about server models for handling huge numbers of clients, check out Dan Kegel’s page on the so-called C10K problem, where he documents a few other event models for servers and gives a history lesson on event-driven IO.

    Also, if you’re planning on running a high traffic server with high concurrency you should probably optimize your IP stack – here are a few suggestions I made a while back on how to do that.

  • The irrelevance of Microsoft's search

    I put some cross-cluster traffic throttling in place yesterday using memcached – which rocks btw. In the last 12 hours I’ve blocked three sources – two were rogue crawlers from broadband ISPs. The other was MSN’s Live Search crawler, which is requesting more than 1 page per second sustained over 30 seconds. If it was Google I’d probably care, but Google has polite crawlers and, unlike Google, Live Search only sends me about 2% of my total search traffic.
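    For the curious, here is a rough sketch of what that kind of throttle can look like. It is not my production code: the FakeMemcache class is a hypothetical in-memory stand-in for a real memcached client (it only mimics the add and incr commands), and the key format, the is_blocked name and the exact limits are assumptions for illustration. The idea is a counter per client per 30 second window, shared by every machine that talks to the same cache.

```python
import time


class FakeMemcache:
    """In-memory stand-in for a memcached client (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}

    def add(self, key, value):
        # Like memcached's "add": store only if the key does not exist yet.
        # A real client would also pass an expiry so old windows age out.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def incr(self, key):
        # Like memcached's "incr": bump a counter and return the new value.
        self._data[key] += 1
        return self._data[key]


WINDOW_SECONDS = 30
MAX_REQUESTS = 30  # about 1 request per second, sustained over the window


def is_blocked(cache, client_ip, now=None):
    """Count this request and report whether the client exceeded the limit."""
    now = time.time() if now is None else now
    window = int(now // WINDOW_SECONDS)
    key = f"hits:{client_ip}:{window}"
    if cache.add(key, 1):
        count = 1  # first request from this client in this window
    else:
        count = cache.incr(key)
    return count > MAX_REQUESTS
```

    Because the counter lives in the cache rather than in any one web server's memory, every box in the cluster sees the same count for a given client.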

  • How to fix munin's netstat passive connections increasing constantly

    Another thing I googled until I was all googled out and couldn’t find an answer, so for future explorers who pass by here, here’s the fix…

    If you’re running munin and you suddenly notice the number of netstat passive connections is constantly increasing in a linear fashion, rest assured it’s not your server that’s busy beating itself into oblivion. It’s a munin bug that’s easily fixed.

    If you run netstat and get something like this:

    netstat -s|grep passive
    3339672 passive connection openings
    7574 passive connections rejected because of time stamp

    …then it’s the passive connections rejected that’s confusing munin.

    To fix this edit:

    /usr/share/munin/plugins/netstat

    and change the line

    netstat -s | awk '/active connections/ { print "active.value " $1 } /passive connection/ { print "passive.value " $1 } /failed connection/ { print "failed.value " $1 } /connection resets/ { print "resets.value " $1 } /connections established/ { print "established.value " $1 }'

    to

    netstat -s | awk '/active connections/ { print "active.value " $1 } /passive connection openings/ { print "passive.value " $1 } /failed connection/ { print "failed.value " $1 } /connection resets/ { print "resets.value " $1 } /connections established/ { print "established.value " $1 }'
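    If you want to see why the looser pattern misbehaves, here is a small Python sketch that applies both patterns to the two netstat lines shown earlier. The passive_values helper is made up for this demonstration; it just mimics what the awk rule does, which is to emit the first field of every matching line.

```python
import re

SAMPLE = """\
    3339672 passive connection openings
    7574 passive connections rejected because of time stamp
"""


def passive_values(netstat_output, pattern):
    """Return the first field of every line that matches the pattern."""
    return [line.split()[0]
            for line in netstat_output.splitlines()
            if re.search(pattern, line)]


if __name__ == "__main__":
    # Old pattern: matches both lines, so two passive values come out.
    print(passive_values(SAMPLE, r"passive connection"))
    # Fixed pattern: matches only the line we actually want.
    print(passive_values(SAMPLE, r"passive connection openings"))
```

    With the old pattern the plugin emits two passive.value lines instead of one; the tighter pattern leaves only the openings count.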

  • Sergio and Muse

    My good friend Sergio, who is an extremely accomplished musician, who morphed himself from a spectacular bassist into a spectacular drummer and who can put most lead guitarists to shame, once told me that Muse is the best rock band that has ever existed.

    Personally I don’t have the balls or the knowledge to make far reaching statements like that. And reading this I know you’re enumerating the thousands (millions?) of rock bands that have existed since African American slave communities sang their first question/answer folk songs and created the foundation for blues and then rock.

    But Serge is a smart guy and his opinion is not to be taken lightly. Go buy Muse – “Hysteria” and “Supermassive Black Hole” on iTunes and let me know what you think.

  • Why Free?

    A great article on Wired about the free web economy.

    Interesting quote:

    “Anything you can consistently convert to cash is a form of currency itself, and Google plays the role of central banker for these new economies.”

  • I'm so dumb

    Don’t ever leave a website that starts to get any kind of traffic on the joke that calls itself GoDaddy. As a registrar they’re not bad but their DNS tool is very broken.

    I won’t bore you with tales of my screaming match with a manager there at 2am when a simple A record IP address change caused my image server’s address to drop in and out of their DNS at random. Or how the crankier I got the more he called me sir. Or how his colleague explained that if I choose to use their DNS service I need to know intuitively that I can’t make more than one change a day or their zone file gets corrupted – and how it’s standard procedure that you call them to do a “zone file refresh”. Or how he explained that a record I hadn’t changed at all dropped off their servers and the reason was because it’s an “Internet Thing”.

    I moved over to dnsmadeeasy.com today and so far they rock. They’re the lowest cost host that offers Anycast on their servers which gives pretty good protection against DDoS attacks – something that took out dnspark a while back when I used to use them.

  • Why I'm so glad I didn't use Rails

    I’ve been uncool for some time now. In 2000, when Java was really beginning to kick ass, I grabbed a Java book and wrote some code. And I decided I was getting stuff done faster in Perl so I stuck with it. I felt like a dork who was playing with his Big Wheel while the other kids had graduated to Ducatis.

    But by and by I discovered that ModPerl kicks Java’s ass as far as performance goes and in fact loosely typed languages do rock. Not only that but anything I need has already been written and posted free on CPAN. And if you code in Java then Sun Microsystems and their friends will try to sell you stuff at every opportunity – it’s like going to the ball game where the stadium forces the beer vendors to charge $10 a beer even if they make entry free and you get to play on the field.

    Two years ago at Jobster, as the Java dev team was discovering the new and cool loosely typed but cleverly OO language called Ruby and its Rails framework, I went and grabbed a Ruby book and wrote some code. I didn’t like that it didn’t have CPAN and the server model seemed clunky and immature. So I stuck with ModPerl. Again I felt like the kid left in the dust while the others went and played with the big boys.

    Turns out the big boys don’t care about you or your business. Here’s a slide from David Hansson, Rails creator:

    This is via Rob Conery’s blog which I found via Tony Wright’s blog. And here’s a quote from David:

    I’m not in this world to create Rails for you. I’m in this world to create Rails for me and if you happen to like that version of Rails that I’m creating for me, then you are going to have a great time.

    Read Rob’s full blog entry for a lot more insight on what is scary about Rails and its community.

    In Seattle last year I spent a lot of time networking in the startup community and meeting with many entrepreneurs. When we spoke about technology choices every single one of them was planning on using Rails. Eventually it became a silly question and the answer was brushed off in a “duh, like obviously I’m using Rails” fashion.

    This year on the Seattle Tech Startup list – about 2 weeks ago – there was a thread with many entrepreneurs complaining bitterly about Rails’ shortcomings.

    Startups are risky enough without adding Rails.

    Sure I wake up at night and wonder if I’m the guy who insists on using Cobol while everyone has moved on to Pascal.

    But then I get out of bed and read the recent posts in the ModPerl archives, I check on the progress of Perl6 and I log onto my servers and check mod_status and how many requests they’re serving without breaking a sweat and I realize that it takes more than a bunch of arrogant eurotrash developers to create an enthusiastic open source community churning out great products.

    It takes a lot of love for the product from the community and from its developers. It takes an inspirational leader like Larry Wall or Linus Torvalds and their lieutenants. And it takes time.

  • Great Movie and Lawrence

    If you’re looking for a great crime movie with an excellent script and brilliant actors at the top of their game, go rent “The Brave One”.

    There are so many great lines in this movie, but I think the D.H. Lawrence quote got me: “The essential American soul is hard, isolate, stoic – and a killer.”

    Lawrence was an Englishman – and this should be taken in the context of his time: the late 1800s to 1930. But I think the instinct survives today. Of course I’m a foreigner too so what the hell do I know.

    Speaking of great mainstream movies that steal from great literature, here’s another that you’ve heard a thousand times:

    From “Self Pity” by Lawrence:

    I never saw a wild thing sorry for itself.
    A bird will fall frozen dead from a bough
    without ever having felt sorry for itself.

    When I see a disabled dog in Marymoor park playing like it doesn’t matter I think of this poem … and sometimes when I’ve pulled a 12 hour day and got another 5 to go.

    The ability of poetry and writing to inspire is quite amazing. Which is why I’m so glad the writers’ strike is now resolved and the architects of our culture got a raise.

  • Buffett

    CNBC had a 1 hour segment this evening with a collection of Buffett interviews. My favorite quote: “On Wall St we get the innovators then the imitators and then the swarming incompetents.”

    Buffett is a master at distilling an entire thesis into a sentence.

  • Slow lighttpd on Ubuntu 7.10 Gutsy Server with 200+ hits/sec?

    Aaaah, you say. Finally, after many a Google search, I found someone who understands my pain. I know you’re in a rush and I can’t stand people who love the sound of their typing either, so here’s how you fix this little problem.

    If you have a brand new super fast server and a high traffic website (200+ requests per second) and you install lighttpd and it performs like a dog, try the following:

    Add this to your /etc/sysctl.conf file:

    net.ipv4.tcp_fin_timeout = 1
    net.ipv4.tcp_tw_recycle = 1

    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    net.ipv4.tcp_no_metrics_save = 1
    net.ipv4.tcp_moderate_rcvbuf = 1
    net.core.wmem_default = 16777216

    net.core.rmem_max = 16777216
    net.core.rmem_default = 16777216
    net.core.netdev_max_backlog = 262144
    net.core.somaxconn = 262144

    net.ipv4.tcp_syncookies = 1
    net.ipv4.tcp_max_orphans = 262144
    net.ipv4.tcp_max_syn_backlog = 262144
    net.ipv4.tcp_synack_retries = 2
    net.ipv4.tcp_syn_retries = 2

    #Only enable these if you’re dumb enough to have netfilter connection tracking enabled
    #net.ipv4.netfilter.ip_conntrack_max = 1048576
    #net.nf_conntrack_max = 1048576

    Then run

    sysctl -p
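    If you want to double check that the kernel actually took the new values, here is a small Python sketch that compares a sysctl.conf-style file against what is live under /proc/sys. The function names are made up for this example, and it assumes the usual mapping of a key like net.core.somaxconn to the file /proc/sys/net/core/somaxconn. Keys that a given kernel doesn’t have show up with an actual value of None; for instance net.ipv4.tcp_tw_recycle no longer exists as of Linux 4.12.

```python
from pathlib import Path


def parse_sysctl_conf(text):
    """Parse sysctl.conf-style text into a {key: value} dict, skipping comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Collapse runs of whitespace so "4096  87380" compares equal to "4096 87380".
        settings[key.strip()] = " ".join(value.split())
    return settings


def proc_path(key):
    """Map a sysctl key such as net.core.somaxconn to its /proc/sys file."""
    return Path("/proc/sys") / key.replace(".", "/")


def mismatches(settings):
    """Return {key: (wanted, actual)} for settings the kernel is not using."""
    result = {}
    for key, wanted in settings.items():
        try:
            actual = " ".join(proc_path(key).read_text().split())
        except OSError:
            actual = None  # key not present on this kernel, or not readable
        if actual != wanted:
            result[key] = (wanted, actual)
    return result


if __name__ == "__main__":
    try:
        conf_text = Path("/etc/sysctl.conf").read_text()
    except OSError:
        conf_text = ""
    for key, (wanted, actual) in mismatches(parse_sysctl_conf(conf_text)).items():
        print(f"{key}: wanted {wanted!r}, kernel has {actual!r}")
```

    No output means every uncommented setting in the file matches the running kernel.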

    Also make darn sure you don’t have netfilter’s conntrack modules enabled in the kernel. If you’re using shorewall on your lighttpd box this will probably be enabled. You can check if conntrack is enabled by checking if the file /proc/net/nf_conntrack exists. Also run lsmod and you’ll see a ton of modules starting with nf_conntrack_.
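    The same two checks can be scripted. This is only a sketch: the function names are made up, and it reads /proc/modules (the file lsmod gets its list from) instead of shelling out to lsmod.

```python
from pathlib import Path


def conntrack_modules(proc_modules_text):
    """Return loaded module names that start with nf_conntrack."""
    names = [line.split()[0]
             for line in proc_modules_text.splitlines()
             if line.strip()]
    return [name for name in names if name.startswith("nf_conntrack")]


def conntrack_enabled(proc_root="/proc"):
    """True if <proc_root>/net/nf_conntrack exists or an nf_conntrack module is loaded."""
    root = Path(proc_root)
    if (root / "net" / "nf_conntrack").exists():
        return True
    try:
        return bool(conntrack_modules((root / "modules").read_text()))
    except OSError:
        return False  # no readable module list, e.g. not running on Linux


if __name__ == "__main__":
    print("conntrack enabled:", conntrack_enabled())
```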

    To get rid of conntrack if it’s enabled I would avoid rmmodding them – rather remove the app that enabled it and reboot the box just to keep things sane.

    If you must insist on using conntrack then uncomment the last two lines in the sysctl.conf sample above.

    Google the individual params above and you’ll find a ton of explanation on each.