Category: Code

  • Poem: When I Heard the Learn'd Astronomer

    This is a wonderful poem by Walt Whitman in which he explores how the formalization of science and nature robs them of their mystery and wonder. If you’re a programmer who has done any time at a university, you’ll recognize Whitman’s sentiment.

    It first appeared in the “By the Roadside” section of the standard 1892 edition of Leaves of Grass.

    When I heard the learn’d astronomer;
    When the proofs, the figures, were ranged in columns before me;
    When I was shown the charts and the diagrams, to add, divide, and
    measure them;
    When I, sitting, heard the astronomer, where he lectured with much
    applause in the lecture-room,
    How soon, unaccountable, I became tired and sick;
    Till rising and gliding out, I wander’d off by myself,
    In the mystical moist night-air, and from time to time,
    Look’d up in perfect silence at the stars.

  • What in the world…

    Someone just arrived at my blog by Googling:

    WHAT IN THE WORLD IS GOING TO HAPPEN TO THESE PEOPLE IN THE US WITH THEIR SOCIAL SECURITY CHECKS

    A reminder of the hard problems Google’s engineers are working on.

  • Every national curriculum should require web programming for graduation

    Every primary and high school curriculum should include a mandatory web programming course the same way it includes math and a first language.

    When’s the last time you used Pythagoras’ theorem? How about Euclid’s proof of the infinitude of primes? Both of these are popular in high school math curriculums.

    My sister is one of the best chefs in Cape Town. She writes about food and runs a restaurant review site. She doesn’t use Pythagoras that often, I’ll bet. But she recently asked me for shell access to the server her blog is hosted on so she could run “chmod 775 *” on a directory to fix a permissions issue. She’s also buying templates from Themeforest, and knowing PHP would help her customize them and fix a few bugs.

    Most non-programmers think of programming as a 3 to 4 year computer science degree, a course of advanced calculus thrown in as a prerequisite for graduation and the ability to write a basic compiler or operating system.

    Here’s the truth: Most programmers spend 99% of their careers writing very simple code that is not that different from English. Most of it is a knowledge of syntax rather than opaque math or implementing complex algorithms. Ask anyone. Most of them will tell you the last time they used calculus in programming was in school.

    It’s my strongly held belief that everyone should start learning basic programming at age 7, and the course should continue through to graduation from high school and be required the whole way. It should include the following:

    • PHP. It’s open source and the most popular programming language on Earth for many applications beyond just Web. My computer science grad friends are probably freaking out that I’ve chosen a loosely typed language that doesn’t require variable declaration and isn’t purer OO, but it’s the most popular language and it’s what real people actually use to get the job done.
    • JavaScript. It runs in every browser and now on many servers.
    • HTML. Obviously.
    • CSS. Obviously.
    • SQL using MySQL.
    If a country were to require all its students to graduate from high school with a working knowledge of the above, it would be vastly more competitive. That’s the goal of a national education curriculum.
  • Which revision/source/version control software to use

    I got a question in the comments of my previous post re this, so I’m going to weigh in real quick:

    I’ve used CVS, Subversion (SVN) and Git and dabbled with a few commercial products.

    Use “git”. Here’s why:

    • If it’s not already the most popular, it will be soon.
    • It is used for the Linux Kernel and was designed by Linus Torvalds, creator of Linux. If it can handle Linux’s source and a distributed team that size, your project will do just fine.
    • It’s incredibly fast, which is important if you have lots of source and larger files.
    • It’s very robust. This was one of the original design considerations.
    • It’s designed to work well with a distributed team.
    • It’s extremely well supported and many complementary open source and proprietary products are available for git. Check out GitHub for example.
    • It specifically fixes flaws in previous revision control systems like CVS so there are many learnings built in that make it better than older systems.
    • If you plan to collaborate on an open source project, you’re probably going to be using Git anyway.
    I still have some of my legacy projects on Subversion purely because my deployment system is built on Subversion. But everything new I do is on Git, both open and closed source.
  • Which programming language should I learn?

    I’ve been asked this question twice in the last 2 weeks by people wanting to write their first Web application. So I’m going to answer it here for anyone else interested:

    If you want to write Web applications you need to learn the following languages: JavaScript, PHP, HTML, CSS and SQL. It sounds like a lot, but it really is not. You can learn enough of each of these languages to write a basic Web application within a week. Trust me. It’s easy!

    PHP is the guts of your Web application. It is the language that runs on your web server. It is also the only language where you have a choice about learning it or learning another language. You must learn HTML, CSS and JavaScript, and 99% of web programmers learn SQL to talk to a database. But there are many other languages to choose from that can do the same thing that PHP does.

    However, if you are starting out writing web applications, PHP is the first server language you should learn and here is why:

    1. PHP is used by a huge number of websites, both big and small. Most of Facebook is written in PHP. Wikipedia, powered by MediaWiki, is written in PHP.
    2. WordPress, the world’s most popular open source blog platform, is written in PHP. If you know PHP you can change it any way you like or even contribute to the community. WordPress is used by eBay, Yahoo, Digg, The Wall Street Journal, TechCrunch, TMZ, Mashable and of course the whole of WordPress.com is powered by WordPress written in PHP.
    3. Most of the world’s best content management systems are written in PHP.
    4. The PHP community is massive and supportive, unlike the Ruby on Rails community for example.
    5. 99% of web programmers can understand PHP, even though some (like Perl developers) don’t realize it.
    6. PHP is a mature language which means the bugs have all been ironed out and it runs fast!
    7. If you Google a question you have about PHP, you have a much higher likelihood of finding an answer than any other server programming language.
    8. Don’t learn Perl because even though it’s a mature, fast and popular language, it’s harder to learn than PHP.
    9. Don’t learn Java because Java is better suited to launching spacecraft and running systems that control oil rigs or banking software than Web applications. It is strongly typed, which means that you need to write more lines of code to get the same thing done. It’s also harder to learn because it’s a purer object oriented language. It is also owned by Oracle, which means it’s a commercial language, and that means Oracle will continually be trying to sell you stuff by making things seem harder than they are and claiming they have the answer to the problem they created in your mind.
    10. Don’t learn .NET because it’s also a commercial language and pretty much everything made by Microsoft either will cost you money or will break a lot.
    11. Don’t learn Ruby because the guys who run the community are total a-holes who will insult you for asking beginner questions. Ruby is also way less popular than PHP or Perl even though it’s used to power Twitter. It’s also the reason Twitter is down so often.
    12. Don’t learn Brainf*ck, Cobol, D, Erlang, Fortran, Go, Haskell, Lisp, OCaml, Python or Smalltalk because these are languages that people tell you they know to show off. Some of them have specific advantages like parallelism, being a pure object oriented language or being compact. But they are not for you if you are starting out. In fact, the combination of PHP and JavaScript will give you 99.99999% of what all these languages offer.
    You also need to learn two presentation languages: HTML and CSS. They are actually part of each other because HTML is not too useful without CSS and vice versa.

    HTML tells the browser the structure and content of a page. e.g. Put a form after a paragraph and have one field for email and one for full name.

    CSS tells the browser how to make that page look, e.g. which fonts to use, what size they should be, what colors, how wide or tall things on the page should be, how thick borders should be, how much padding to use and how wide to make margins.

    Then you also need to learn a data storage language called SQL, which lets you talk to a database to store things like visitor names, email addresses and so on. For example, using SQL you can tell a database to store an email address and full name by saying: INSERT INTO visitors (name, email) VALUES ('Mark Maunder', 'mark@example.com');

    There are other ways to store data and a popular terrorist movement calling itself NoSQL has formed in the last 4 years and they spend their time sowing fear and doubt about SQL and confusing beginners like you. The reality is that 99% of web applications use SQL and continue to use SQL. It works, it’s fast, it’s easy to learn and everyone understands it. It’s used by WordPress, Wikipedia, Facebook and everyone else who counts, whether they like it or not. Just learn SQL!!

    I also recommend you use MySQL to store your web application’s data (even though it’s owned by Oracle) because it’s the most popular open source database out there. PHP applications use MySQL more than any other database engine on the web.
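
    If you want to see that INSERT statement actually run, here is a minimal sketch. It is written in Python using its built-in sqlite3 module purely as a stand-in so it runs without a MySQL server. The layout of the visitors table and the helper names are assumptions for illustration, and in a real web application the same SQL would be sent to MySQL from your PHP code.

```python
import sqlite3


def store_visitor(conn, name, email):
    """Store one visitor using a parameterized INSERT (placeholders avoid SQL injection)."""
    conn.execute("INSERT INTO visitors (name, email) VALUES (?, ?)", (name, email))
    conn.commit()


def list_visitors(conn):
    """Ask the database a question: return every stored (name, email) pair."""
    return conn.execute("SELECT name, email FROM visitors ORDER BY name").fetchall()


# An in-memory database stands in for MySQL; the table layout is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visitors (name TEXT, email TEXT)")

store_visitor(conn, "Mark Maunder", "mark@example.com")
rows = list_visitors(conn)
print(rows)
```

    The INSERT stores data and returns nothing useful, while the SELECT asks a question and gets rows back.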

    To summarize, so far you need to learn:

    • JavaScript (a programming language that runs inside your visitor’s browser)
    • PHP (a programming language that runs on the server)
    • HTML (a presentation language that tells the browser the structure of a page)
    • CSS (a presentation language that tells the browser how to make a page look once it knows the structure)
    • SQL (a data access language that lets you store and retrieve data from a database)
    Each of these languages runs or executes in a certain place or environment:
    • JavaScript runs inside the browser of someone who has visited your website. What’s cool about this is that it uses your visitor’s CPU and memory instead of the resources on your server.
    • PHP runs on your own web server. Most websites run PHP inside a kind of “container” or application server: Apache Web Server with something called mod_php installed. Apache handles all the web server stuff like receiving the request for the document and making sure it’s formatted correctly. It then passes the request to mod_php, which executes your PHP code. Your program sends its response back to Apache, which sends it back to your visitor.
    • HTML is interpreted by a visitor’s browser and tells the browser how to structure the page as it loads.
    • CSS is also interpreted by a visitor’s browser and tells the browser how to make the HTML look.
    • SQL is a language that you use inside your PHP application to talk to a database like MySQL. You will actually write SQL in your PHP code but it will be sent to the database engine which is where it is interpreted and executed. The database then sends your PHP code whatever it asked for, if anything. (sometimes you’re just inserting data and not asking a question)
    As you progress you will get familiar with the platforms you run each of these languages on. They include:
    • Linux is the operating system you will run on your server. Everything else on your server runs on top of Linux. Linux lets your web application talk to the server’s hardware.
    • Apache Web Server running mod_php. This is the application server you will use to run your PHP code. It will receive the web requests, forward them to your PHP code, and receive the response which is forwarded to your visitor.
    • MySQL database engine. You will talk to MySQL using SQL which is written inside your PHP code.

    One last note to help you in your language decision making. It’s important that you understand there are a few phenomena that may confuse you in your language research:

    The first is that some software developers have little life beyond writing software and have large egos. One of the few things they have to impress you with is their own intelligence. They will try to make programming sound harder than it actually is. It’s not hard. It’s easy.

    Secondly, an arrogant programmer may regale you with a list of programming languages to choose from and tell you that he or she knows them all. They may make the choice sound complicated. It’s not. They’re just showing off. Choose PHP and the set of tools listed above and you’ll be fine.

    Third, remember that there is always something new and shiny coming out that will get a lot of attention and is advertised to “change the way we…” or will “make everything you know about programming irrelevant”. Ignore the noise and stay focused on the basics. Until a new language, operating system, application or piece of hardware has been around for a while (usually at least 5 years), it’s going to be full of bugs, run slow, break often and it will be hard to get help by Googling because few people are using it and have had the problem you’re having.

    Lastly, many companies like Google and Facebook spend a lot of time and energy trying to attract the best software engineers in the world. Google associated themselves with NASA purely for this reason, even though they’re in completely different businesses. To draw attention to themselves as thought leaders in software they talk a lot about languages like Erlang, Haskell and so on. The reality is that their bread and butter languages are pretty ordinary – languages like C++ and PHP. So don’t get confused when you see Facebook talking about using Erlang for real-time chat. They’re just showing off. Their bread and butter is PHP, HTML, CSS, SQL and JavaScript, like most of the rest of the Web.

    Who am I and how dare I express an opinion on this? I’ve been programming web applications since 2 years after the Web was invented. I’m the CEO and CTO of a company whose web apps are seen by over 200 million unique people every month. I also own the company. I’ve seen languages and platforms come and go including Netscape Commerce Server, Java Applets, Visual Basic, XML, NetWare, Windows NT, Microsoft IIS, thin clients, network computers, etc.

    The Web is Simple. Programming is Easy. Now go have fun!!

     

  • What a successful release looks like

    I thought I’d share a little moment I had recently. We rolled out a new version of Feedjit a few days ago. Nothing changed on the user interface – so no new user features. It was mostly performance enhancements on the back-end servers.

    The new code was the result of many weeks of research and testing and several weeks of implementation. When we launched this weekend I was cautiously optimistic when I saw the load drop on the servers. And now that it’s a few days later I’m staring at our monitoring graphs with a huge smile on my face.

    This is a big win for us because if we double performance then the number of servers we have to buy halves. And for a small company supporting a huge number of users that’s a very good thing.

    Here’s the data from one of our busiest servers. The server has a quad core CPU in it so a load average of 4 is 100% busy. As you can see we were pushing things a little on this particular box. It’s fairly obvious where we rolled out the new code…
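
    The arithmetic behind those two claims is simple enough to sketch. This is only an illustration: the function names are made up, and the load of 6 and the fleet of 10 servers are hypothetical numbers, not our real ones. Keep in mind that on Linux the load average also counts processes waiting on disk, so it is a rough guide to how busy a box is rather than a pure CPU figure.

```python
import math


def cpu_saturation(load_average, cores):
    """Return load as a fraction of capacity: 1.0 means every core is busy."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


def servers_after_speedup(servers_now, speedup):
    """Estimate servers needed once the code handles `speedup` times the work per box."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return math.ceil(servers_now / speedup)


full = cpu_saturation(4.0, cores=4)         # load of 4 on a quad core box
overloaded = cpu_saturation(6.0, cores=4)   # hypothetical: pushing things a little
halved = servers_after_speedup(10, 2.0)     # hypothetical fleet of 10, doubled performance

print(f"{full:.0%} busy, {overloaded:.0%} busy, {halved} servers")
```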

    Most of the performance gain is from faster disk access code which means that the CPU spends almost no time waiting for the disk to do something. As you can see below the IO wait time has dropped to virtually zero. Disk is the slowest component in a server and is usually the bottleneck for any applications that store and retrieve data, so this is a really big win for us.


    I wish I could go into more detail about our application and some specific numbers. In a few weeks I’ll hopefully be able to share more.

    Mark.

  • The irrelevance of Microsoft's search

    I put some cross-cluster traffic throttling in place yesterday using memcached – which rocks btw. In the last 12 hours I’ve blocked three sources – two were rogue crawlers from broadband ISPs. The other was MSN’s Live Search crawler, which was requesting more than 1 page per second sustained over 30 seconds. If it was Google I’d probably care, but Google has polite crawlers and, unlike Google, Live Search only sends me about 2% of my total search traffic.
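
    For the curious, here is roughly what that kind of throttle can look like. This is a sketch of one common approach (a fixed-window counter per client), not the code I actually run. A plain Python dict stands in for memcached, and the class name, the IP addresses and the injectable clock are all assumptions for illustration. The default limit of 30 requests per 30 seconds matches the “more than 1 page per second sustained over 30 seconds” rule above.

```python
import time


class WindowThrottle:
    """Fixed-window request counter. A dict stands in for memcached here."""

    def __init__(self, max_requests=30, window_seconds=30, clock=time.time):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock      # injectable so the example can fake time
        self.counters = {}      # key -> request count; memcached would expire old keys

    def allow(self, client_ip):
        """Count one request and return False once the client is over the limit."""
        window = int(self.clock() // self.window_seconds)
        key = f"{client_ip}:{window}"
        self.counters[key] = self.counters.get(key, 0) + 1
        return self.counters[key] <= self.max_requests


# Simulate a crawler sending 31 requests inside a single 30 second window.
now = [0.0]
throttle = WindowThrottle(clock=lambda: now[0])
results = [throttle.allow("203.0.113.7") for _ in range(31)]
print(results.count(True), "allowed, last request allowed:", results[-1])

# A new window starts 30 seconds later, so the same client is allowed again.
now[0] = 30.0
allowed_again = throttle.allow("203.0.113.7")
print(allowed_again)
```

    With a real memcached the counter would be shared by every server in the cluster and each key would carry an expiry time, which is what makes the throttle cross-cluster.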

  • How to fix munin's netstat passive connections increasing constantly

    Another thing I googled until I was all googled out and couldn’t find an answer, so for future explorers who pass by here, here’s the fix…

    If you’re running munin and you suddenly notice the number of netstat passive connections is constantly increasing in a linear fashion, rest assured it’s not your server that’s busy beating itself into oblivion. It’s a munin bug that’s easily fixed.

    If you run netstat and get something like this:

    netstat -s|grep passive
    3339672 passive connection openings
    7574 passive connections rejected because of time stamp

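    …then it’s the “passive connections rejected” line that’s confusing munin. The plugin’s /passive connection/ pattern matches both lines, so it reports two passive values instead of one.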

    To fix this edit:

    /usr/share/munin/plugins/netstat

    and change the line

    netstat -s | awk '/active connections/ { print "active.value " $1 } /passive connection/ { print "passive.value " $1 } /failed connection/ { print "failed.value " $1 } /connection resets/ { print "resets.value " $1 } /connections established/ { print "established.value " $1 }'

    to

    netstat -s | awk '/active connections/ { print "active.value " $1 } /passive connection openings/ { print "passive.value " $1 } /failed connection/ { print "failed.value " $1 } /connection resets/ { print "resets.value " $1 } /connections established/ { print "established.value " $1 }'
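
    If you want to convince yourself why the narrower pattern helps, here is a small sketch. It is not the munin plugin itself: it is a few lines of Python that mimic what the awk rule does, run against the two netstat lines shown above. The function name is made up for illustration.

```python
import re

# The two lines from the netstat output shown earlier in this post.
SAMPLE = """\
    3339672 passive connection openings
    7574 passive connections rejected because of time stamp
"""


def passive_values(netstat_output, pattern):
    """Mimic awk's `/pattern/ { print "passive.value " $1 }` for each input line."""
    values = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and re.search(pattern, line):
            values.append("passive.value " + fields[0])
    return values


old = passive_values(SAMPLE, r"passive connection")           # original pattern
new = passive_values(SAMPLE, r"passive connection openings")  # fixed pattern

print(old)  # two values: one per matching line
print(new)  # a single value, which is what munin expects
```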

  • Why I'm so glad I didn't use Rails

    I’ve been uncool for some time now. In 2000 when Java was really beginning to kick ass I grabbed a Java book and wrote some code. And I decided I was getting stuff done faster in Perl so I stuck with it. I felt like a dork who was playing with his Big Wheel while the other kids had graduated to Ducatis.

    But by and by I discovered that ModPerl kicks Java’s ass as far as performance goes and in fact loosely typed languages do rock. Not only that but anything I need has already been written and posted free in CPAN. And if you code in Java then Sun Microsystems and their friends will try to sell you stuff at every opportunity – it’s like going to the ball game where the stadium forces the beer vendors to charge $10 a beer even if they make entry free and you get to play on the field.

    2 years ago at Jobster, as the Java dev team was discovering the new and cool loosely typed but cleverly OO language called Ruby and its Rails framework, I went and grabbed a Ruby book and wrote some code. I didn’t like that it didn’t have CPAN and the server model seemed clunky and immature. So I stuck with ModPerl. Again I felt like the kid left in the dust while the others went and played with the big boys.

    Turns out the big boys don’t care about you or your business. Here’s a slide from David Hansson, Rails creator:

    This is via Rob Conery’s blog which I found via Tony Wright’s blog. And here’s a quote from David:

    I’m not in this world to create Rails for you. I’m in this world to create Rails for me and if you happen to like that version of Rails that I’m creating for me, then you are going to have a great time.

    Read Rob’s full blog entry for a lot more insight on what is scary about Rails and its community.

    In Seattle last year I spent a lot of time networking in the startup community and meeting with many entrepreneurs. When we spoke about technology choices every single one of them was planning on using Rails. Eventually it became a silly question and the answer was brushed off in a “duh, like obviously I’m using Rails” fashion.

    This year on the Seattle Tech Startup list – about 2 weeks ago – there was a thread with many entrepreneurs complaining bitterly about Rails’ shortcomings.

    Startups are risky enough without adding Rails.

    Sure I wake up at night and wonder if I’m the guy who insists on using Cobol while everyone has moved on to Pascal.

    But then I get out of bed and read the recent posts in the ModPerl archives, I check on the progress of Perl6 and I log onto my servers and check mod_status and how many requests they’re serving without breaking a sweat and I realize that it takes more than a bunch of arrogant eurotrash developers to create an enthusiastic open source community churning out great products.

    It takes a lot of love for the product from the community and from its developers. It takes an inspirational leader like Larry Wall or Linus Torvalds and their lieutenants. And it takes time.

  • QOTD

    From the nginx docs under Known Problems:

    3. Nginx may laugh at your Perl code and hit on your girlfriend.