Code


Code 27 May 2008 12:07 am

I thought I’d share a little moment I had recently. We rolled out a new version of Feedjit a few days ago. Nothing changed on the user interface - so no new user features. It was mostly performance enhancements on the back-end servers.

The new code was the result of many weeks of research and testing and several weeks of implementation. When we launched this weekend I was cautiously optimistic when I saw the load drop on the servers. And now that it’s a few days later I’m staring at our monitoring graphs with a huge smile on my face.

This is a big win for us because if we double performance, the number of servers we have to buy halves. And for a small company supporting a huge number of users that’s a very good thing.

Here’s the data from one of our busiest servers. The server has a quad core CPU in it so a load average of 4 is 100% busy. As you can see we were pushing things a little on this particular box. It’s fairly obvious where we rolled out the new code…

Most of the performance gain is from faster disk access code which means that the CPU spends almost no time waiting for the disk to do something. As you can see below the IO wait time has dropped to virtually zero. Disk is the slowest component in a server and is usually the bottleneck for any applications that store and retrieve data, so this is a really big win for us.


I wish I could go into more detail about our application and some specific numbers. In a few weeks I’ll hopefully be able to share more.

Mark.

Code 23 Mar 2008 12:43 pm

I put some cross-cluster traffic throttling in place yesterday using memcached - which rocks btw. In the last 12 hours I’ve blocked three sources - two were rogue crawlers from broadband ISPs. The other was MSN’s Live Search crawler, which was requesting more than 1 page per second sustained over 30 seconds. If it was Google I’d probably care, but Google has polite crawlers and, unlike Google, Live Search only sends me about 2% of my total search traffic.
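For anyone curious what that kind of throttle might look like, here’s a minimal sketch in Python. It is not the code running on my servers: the class name, the fixed-window approach and the 30-requests-per-30-seconds limit are all assumptions for illustration, and a plain dict stands in for memcached so the example runs on its own.

```python
import time


class FixedWindowThrottle:
    """Counts requests per source in fixed time windows (hypothetical sketch).

    A plain dict stands in for memcached here. With a real cache the counter
    would be a key with an expiry; in this sketch old windows are never
    cleaned up.
    """

    def __init__(self, max_requests=30, window_seconds=30, clock=time.time):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self.counters = {}  # (source, window index) -> request count

    def allow(self, source):
        """Record one request and return True if the source is within its limit."""
        window_index = int(self.clock() // self.window_seconds)
        key = (source, window_index)
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.max_requests


if __name__ == "__main__":
    throttle = FixedWindowThrottle()
    decisions = [throttle.allow("203.0.113.7") for _ in range(31)]
    print(decisions.count(True), "allowed,", decisions.count(False), "blocked")
```

A fixed window is the simplest thing that could work. A source that straddles two windows can briefly exceed the limit, which is probably fine for catching a crawler that hammers you for minutes on end.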

Code 15 Mar 2008 01:02 am

Another thing I googled until I was all googled out and couldn’t find an answer, so for future explorers who pass by here, here’s the fix…

If you’re running munin and you suddenly notice the number of netstat passive connections is constantly increasing in a linear fashion, rest assured it’s not your server that’s busy beating itself into oblivion. It’s a munin bug that’s easily fixed.

If you run netstat and get something like this:

netstat -s|grep passive
3339672 passive connection openings
7574 passive connections rejected because of time stamp

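…then it’s the “passive connections rejected” line that’s confusing munin. The plugin’s /passive connection/ pattern matches both lines, so it reports two passive values instead of one.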

To fix this edit:

/usr/share/munin/plugins/netstat

and change the line

netstat -s | awk '/active connections/ { print "active.value " $1 } /passive connection/ { print "passive.value " $1 } /failed connection/ { print "failed.value " $1 } /connection resets/ { print "resets.value " $1 } /connections established/ { print "established.value " $1 }'

to

netstat -s | awk '/active connections/ { print "active.value " $1 } /passive connection openings/ { print "passive.value " $1 } /failed connection/ { print "failed.value " $1 } /connection resets/ { print "resets.value " $1 } /connections established/ { print "established.value " $1 }'
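If you want to convince yourself why the longer pattern helps, here’s a small Python sketch that mimics what the awk patterns do with the two netstat lines shown above. The helper name is made up for the example; it only imitates awk’s match-and-print-$1 behaviour and is not part of munin.

```python
import re

# The two sample lines from the netstat output shown above.
netstat_lines = [
    "3339672 passive connection openings",
    "7574 passive connections rejected because of time stamp",
]


def passive_values(pattern, lines):
    """Return the first field of every line matching pattern, like awk's $1."""
    return [line.split()[0] for line in lines if re.search(pattern, line)]


# Old pattern: matches both lines, so passive.value would be printed twice.
old_values = passive_values(r"passive connection", netstat_lines)

# New pattern: matches only the "openings" line.
new_values = passive_values(r"passive connection openings", netstat_lines)

print("old pattern:", old_values)
print("new pattern:", new_values)
```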

Code 11 Mar 2008 01:36 pm

I’ve been uncool for some time now. In 2000 when Java was really beginning to kick ass I grabbed a Java book and wrote some code. I decided I was getting stuff done faster in Perl so I stuck with it. I felt like a dork who was playing with his Big Wheel while the other kids had graduated to Ducatis.

But by and by I discovered that ModPerl kicks Java’s ass as far as performance goes and in fact loosely typed languages do rock. Not only that but anything I need has already been written and posted free in CPAN. And if you code in Java then Sun Microsystems and their friends will try to sell you stuff at every opportunity - it’s like going to the ball game where the stadium forces the beer vendors to charge $10 a beer even if they make entry free and you get to play on the field.

2 years ago at Jobster, as the Java dev team was discovering the new and cool loosely typed but cleverly OO language called Ruby and its Rails framework, I went and grabbed a Ruby book and wrote some code. I didn’t like that it didn’t have CPAN and the server model seemed clunky and immature. So I stuck with ModPerl. Again I felt like the kid left in the dust while the others went and played with the big boys.

Turns out the big boys don’t care about you or your business. Here’s a slide from David Hansson, Rails creator:

This is via Rob Conery’s blog which I found via Tony Wright’s blog. And here’s a quote from David:

I’m not in this world to create Rails for you. I’m in this world to create Rails for me and if you happen to like that version of Rails that I’m creating for me, than you are going to have a great time.

Read Rob’s full blog entry for a lot more insight on what is scary about Rails and its community.

In Seattle last year I spent a lot of time networking in the startup community and meeting with many entrepreneurs. When we spoke about technology choices every single one of them was planning on using Rails. Eventually it became a silly question and the answer was brushed off in a “duh, like obviously I’m using Rails” fashion.

This year on the Seattle Tech Startup list - about 2 weeks ago - there was a thread with many entrepreneurs complaining bitterly about Rails’ shortcomings.

Startups are risky enough without adding Rails.

Sure I wake up at night and wonder if I’m the guy who insists on using Cobol while everyone has moved on to Pascal.

But then I get out of bed and read the recent posts in the ModPerl archives, I check on the progress of Perl6 and I log onto my servers and check mod_status and how many requests they’re serving without breaking a sweat and I realize that it takes more than a bunch of arrogant eurotrash developers to create an enthusiastic open source community churning out great products.

It takes a lot of love for the product from the community and from its developers. It takes an inspirational leader like Larry Wall or Linus Torvalds and their lieutenants. And it takes time.

Code 06 Mar 2008 12:16 am

From the nginx docs under Known Problems:

3. Nginx may laugh at your Perl code and hit on your girlfriend.

Code 25 Dec 2007 09:36 pm

Google recently fixed a glaring vulnerability in Gmail that allowed an attacker to forward copies of all or some of your email to themselves by adding a filter to your Gmail account. But not before someone lost their domain name to an attacker who then proceeded to try to sell it back to them for cash.

The Gmail bug was a cross-site request forgery (CSRF) exploit. The attack is incredibly simple. If a user is authenticated to a website, an attacker simply gets that user to load a URL that causes the user to effectively take some sort of action on that website. So by clicking a link in an email or on a website, or by simply loading up a malicious web page that contains an image URL with the correct query string parameters, an attacker can get an unsuspecting user to “do something” on a website they’re a member of.

Wikipedia has a good summary of CSRF and I recommend you read it if you haven’t already. Avoiding CSRF vulnerabilities in your web apps is easy: In all forms that require a user to be authenticated, simply reauthenticate them using some user-specific transient data. You could, for example, include a user’s session ID in a hidden form field and, when the user submits the form, check that the session ID in the form POST matches the session ID in the user’s cookie.

If your session IDs change every time a user authenticates to your website, it effectively defeats this attack. For extra security you may want to either encrypt the session ID in the form’s hidden field, or set the hidden field’s value to an MD5 hash of the real session ID.
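Here’s a rough sketch of that hidden-field check in Python. The function names and the server-side secret key are hypothetical. Note one deliberate swap: instead of the plain MD5 hash mentioned above, this sketch derives the token with HMAC-SHA256 keyed by a secret the server keeps to itself. That’s my substitution for the example, not a description of how any particular site does it.

```python
import hashlib
import hmac
import secrets


def csrf_token(session_id: str, secret_key: bytes) -> str:
    """Derive the hidden-field value from the session ID (HMAC-SHA256, hex)."""
    return hmac.new(secret_key, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid_submission(form_token: str, cookie_session_id: str, secret_key: bytes) -> bool:
    """Check that the token posted with the form matches the session in the cookie."""
    expected = csrf_token(cookie_session_id, secret_key)
    # Compare as bytes in constant time; bytes also avoids a TypeError
    # if an attacker posts non-ASCII junk in the hidden field.
    return hmac.compare_digest(form_token.encode("utf-8"), expected.encode("utf-8"))


if __name__ == "__main__":
    secret_key = secrets.token_bytes(32)   # assumed to live only on the server
    session_id = secrets.token_hex(16)     # stands in for the user's real session ID
    token = csrf_token(session_id, secret_key)

    print(is_valid_submission(token, session_id, secret_key))
    print(is_valid_submission("forged-value", session_id, secret_key))
```

The point is the same as in the text: a forged request from another site can ride along on the user’s cookie, but it can’t supply a hidden-field value that matches that user’s session.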

The Google CSRF required a form POST which was only slightly more complex for an attacker to implement. But many CSRF attacks don’t require a POST and parameters can therefore appear in a URL query string. The effect of this is that your website can be exploited by one of your users simply loading an image on a malicious web page or in a malicious email.

Innovation and Code 23 Dec 2007 11:36 am

A Microsoft quote from an NY Times article I’ve already cited has been bugging the crap out of me. It bugged me when I first blogged about this article and it bugged me as I wandered around B&N last night doing the last of my Xmas shopping. I wound up in the management section and picked up a book on the top 10 mistakes leaders make. Staring at me as I flipped open chapter 5 was confirmation that I wasn’t nuts.

Here’s the quote that bugged me:

“I’m happy that by hiring a bunch of old hands, who have been through these wars for 10 or 20 years, we at least have a nucleus of people who kind of know what’s possible and what isn’t,”

I’ve lost count of how many times as a software developer I’ve sat down and said “I wonder if this is possible?”. When I created WorkZoo I wondered if it was possible to aggregate all the world’s jobs into a single database - and I got pretty darn close. When I created Geojoey I wondered if it was possible to have a rich pure Ajax application with a client-side MVC model - and it was. When I created LineBuzz I wondered if it was possible to post inline comments on arbitrary text on any web page - yes it’s possible. When I created Feedjit I wondered if it was possible to scale to serve real-time traffic data in a widget. We’re serving almost 100 million real-time widgets per month now.

I started coding on an Apple IIe and later moved to IBM PCs so in my youth Apple and Microsoft were symbols of innovation and I wanted to innovate the way they did. Apple’s still doing a great job, but it breaks my heart to see MS floundering like a fish out of water in the new world of broadband, browser standards, open source and dynamic web applications.

Come on guys. Get it together already!! Fire those know-it-alls, hire some new blood and pretend for a moment that the past doesn’t matter and that anything is possible.

Code 01 Oct 2007 04:28 pm

Phil Bogle wrote recently about an awesome image resizing algorithm. I found out via A Welsh View what happened to it. It’s been launched as a website called RSizr.com and is also available as a GIMP plugin called Liquid Rescale. It’s really really cool to see this amazing algo take the open source route.

It’s an incredibly smart algo - I tried it on a Google Analytics graph and it shrunk the graph without breaking the line while maintaining the text scale.

It’d be awesome to see this as a feature in ImageMagick so we can put more web front-ends on it.

Code 24 Sep 2007 02:35 pm

I have this little 64 bit dual core Opteron that I’m busy torturing with way more traffic than its creators intended. I sometimes edit code on my live servers - only when I’m sure it’s not going to break anything and only when I’m wide-awake and fully caffeinated. Today I tried to edit a file on a live box. In the time that it took ViM to delete the file and rewrite it to disk during the save operation (about 1/50th of a second), the webserver threw out 20 messages saying file not found.

Note to self: No more editing-code-on-live-cowboy-crap.
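For the record, one common way to avoid that file-not-found window (if you absolutely must change a file in place) is to write the new contents to a temporary file in the same directory and then rename it over the original. Here’s a minimal Python sketch of the idea. The function name is made up, and it assumes a POSIX filesystem where a rename within one directory is atomic.

```python
import os
import tempfile


def atomic_write(path: str, text: str) -> None:
    """Replace the file at path so readers see either the old or the new version.

    Sketch only: the temp file gets mkstemp's default permissions (owner-only),
    so real code would probably copy the original file's mode as well.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp_file:
            tmp_file.write(text)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())
        # os.replace swaps the new file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The original path never goes missing with this approach, so a webserver reading it mid-save gets the old file rather than an error.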

Code 16 Aug 2007 01:11 am

I started work on this at 4pm and it’s now 2am. It’s called FEEDJIT and it’s a little experiment. If you like it go ahead and install it. A few minutes after I post this it should be in the sidebar of this blog.
Mark.


Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.