Month: March 2010

  • A nuclear Google may be a very good thing

    Update: This was an April Fools’ joke courtesy of Arrington, and I got taken in big time. Ugh! Leaving the original post up as an ode to my naivete.

    52 years ago in August 1958 the United States was so confident in our ability to provide clean nuclear energy that we put one hundred and sixteen men in a tin can called the USS Nautilus and sent them along with a nuclear reactor under the North Pole. Many of the crew from that voyage remain alive and well today.

    What puzzles me is why the USA isn’t the undisputed world leader in nuclear power. Perhaps the title of undisputed leader in nuclear weaponry and leader in nuclear power are mutually exclusive. So it’s France who produce most of their power from nuclear reactors.

    Google getting into the business of nuclear power is the most exciting development in nuclear power in this country since the Nautilus. Google’s data centers are massive power consumers, but power transmission also consumes a lot of resources. It takes land and steel, and a lot of power is lost along the way.

    If you could put a nuclear reactor on a 320ft submarine more than 50 years ago, you can build a clean nuclear reactor in 2010 with enough power to supply a local data center and the local town it provides employment for. My hope is that this is Google’s nuclear vision.

  • Facebook.com overtakes Google.com as most visited USA domain.

    In a press release from HitWise published on CNN Money a few minutes ago, Facebook.com just overtook Google.com as the most visited domain in the USA. This is possibly the most significant milestone in Facebook’s history as a large company. Here’s why:

    Most of Google’s revenue comes from their Ad business. Half of it comes from their own properties and the other half from a distributed network of sites. (sounding familiar already I’m sure).

    There is a lot of noise around Google’s other apps and experiments, but from a business perspective that’s all it is. Noise. Google is a cash creation machine and the cash is created by the ad network both on and off-site. To give you some perspective, Gmail ranks a distant third among email providers with 37 million uniques vs Yahoo Mail’s 106 million uniques 5 months ago [comScore].

    So Google’s business is relatively simple. It’s the best search engine in the world and an ad network with themselves as their own biggest customer.

    Google built this business by first creating an incredibly hot property that gave its users an incentive to provide it with awesome targeting data. Then it built an ad network around the targeting data.

    The hardest part about building Google was to create the hot property (the search engine) that incentivises users to keep coming back and feeding it more targeting data. The next part of creating Google was a little easier because if they screwed it up the first time they get to try and try again until they succeed in building a money printing machine on the back of this hot-property-with-targeting-data that really does print money.

    Facebook have the hot property that keeps users coming back and feeding it targeting data. They really screwed up the Ad network the first time they had a crack at it with Beacon. But, predictably, their users forgave them and they’ll keep having another crack at it until their money printing machine is running at optimum efficiency.

    There are a few reasons I believe Facebook may be a bigger success than Google long term:

    1. They have better data in the form of individual demographics, interests and data inferred from the social graph.
    2. They already have a distributed network of sites in the form of Facebook Connect which has deeper integration than AdSense. That means Facebook gets more data about visitors to those sites than Google AdSense.

    Possible risks:

    1. Their management team doesn’t have Eric Schmidt. Eric spent years getting schooled by Microsoft when he ran Novell. So he has the hunger, the scar-tissue and battlefield awareness that you need to compete with juggernauts.
    2. The has-been factor. Facebook has had a surprisingly good run and has proven to me it has legs because I’m still going back after a few years. But let’s see if they can maintain that over the next decade.

  • How to limit website visitor bandwidth by country

    This technique is great if you have no customers from countryX but are being targeted by a DoS, unwanted crawlers, bots, scrapers and other baddies. Please don’t use this to discriminate against less profitable countries. The web should be open for all. Thanks.

    If you’re not already using Nginx, you should get it even if you already have a great web server. Put it in front and get it to act as a reverse proxy.

    First grab this perl script which you will use to convert Maxmind’s geo IP database into a format usable by Nginx.

    Then download Maxmind’s latest GeoLite country database in CSV format on this page.

    Then run:

    geo2nginx.pl < maxmind.csv > nginxGeo.txt

    Copy nginxGeo.txt into your nginx config directory.
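
    For readers who want to see roughly what that conversion step involves, here is a minimal Python sketch. It is not the perl script mentioned above. It assumes the CSV columns are start IP, end IP, two numeric columns, country code and country name, and the function name and the two sample rows are made up for illustration.

```python
import csv
import io
import ipaddress


def csv_to_nginx_geo(csv_text):
    """Turn GeoLite-style CSV rows into 'CIDR COUNTRY;' lines for an nginx geo block."""
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 5:
            continue  # skip blank or malformed rows
        first = ipaddress.IPv4Address(row[0])
        last = ipaddress.IPv4Address(row[1])
        country = row[4]  # assumed position of the two-letter country code
        # One start/end range may need several CIDR blocks to cover it exactly.
        for network in ipaddress.summarize_address_range(first, last):
            lines.append(f"{network} {country};")
    return lines


# Illustrative rows only, in the assumed column layout.
sample = (
    '"1.0.0.0","1.0.0.255","16777216","16777471","AU","Australia"\n'
    '"1.0.1.0","1.0.3.255","16777472","16778239","CN","China"\n'
)
geo_lines = csv_to_nginx_geo(sample)
print("\n".join(geo_lines))
```

    The second sample range is not a single CIDR block, so it comes out as two lines. That splitting is the main reason a conversion step is needed at all.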

    Then add the following text in the ‘http’ section of your nginx.conf file:

    geo $country {
        default no;
        include nginxGeo.txt;
    }

    Then add the following in the ‘server’ section of your nginx.conf file:

    if ($country ~ ^(?:US|CA|ES)$ ){
        set $limit_rate 10k;
    }
    if ($country ~ ^(?:BR|ZA)$ ){
        set $limit_rate 20k;
    }

    This limits anyone from the USA, Canada and Spain to a maximum of 10 kilobytes per second of bandwidth ($limit_rate is measured in bytes per second, not bits). It gives anyone from Brazil and South Africa 20 KB/s of bandwidth. Every other country gets the maximum.

    You could use an exclamation mark before the tilde (!~) to do the opposite. In other words, if you’re NOT from the US, Canada or Spain, you get 10 KB/s, although I strongly advise against this policy.

    Remember that $limit_rate only limits per connection, so the amount of bandwidth each visitor has is $limit_rate X number_of_connections. See below to limit connections.
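
    A quick worked example of that multiplication, as a small Python sketch. The function name and the figure of six parallel connections are assumptions for illustration. nginx reads a value like 10k as 10 * 1024 bytes per second.

```python
def max_visitor_rate(limit_rate_bytes, connections):
    """Upper bound on one visitor's total throughput, in bytes per second."""
    return limit_rate_bytes * connections


per_connection = 10 * 1024  # "10k" in the config, in bytes per second
parallel_connections = 6    # hypothetical number of connections one visitor opens

total = max_visitor_rate(per_connection, parallel_connections)
print(total)  # 61440 bytes per second, i.e. 60 KB/s
```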

    Another interesting directive is limit_rate_after. The documentation on this is very sparse, but it is size based rather than time based. The first chunk of a response (for example the first 500k) is sent at full bandwidth, and the limiting starts after that. Great for streaming sites I would think.
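
    nginx applies limit_rate_after as an amount of data already sent, so only the bytes past that threshold are throttled. The Python sketch below estimates what that means for one download. The function name and all the numbers are hypothetical, and it ignores network latency and buffering.

```python
def transfer_seconds(size_bytes, after_bytes, limit_rate, full_rate):
    """Estimate download time when only the bytes past after_bytes are throttled."""
    unthrottled = min(size_bytes, after_bytes)
    throttled = max(0, size_bytes - after_bytes)
    return unthrottled / full_rate + throttled / limit_rate


KIB = 1024
# Hypothetical case: 1 MiB file, first 512 KiB at full speed (1 MiB/s), then 10 KiB/s.
seconds = transfer_seconds(1024 * KIB, 512 * KIB, 10 * KIB, 1024 * KIB)
print(round(seconds, 1))  # 51.7
```

    A file smaller than the threshold is never throttled in this model, which is why the setting suits sites that want a fast start and a slower tail.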

    There are two other great modules in Nginx but neither of them works inside ‘if’ directives, which means you can’t use them to limit by country. They are the Limit Zone module, which lets you limit the number of concurrent connections, and the Limit Requests module, which lets you limit the number of requests over a period of time. The Limit Requests module also has a burst parameter which is very useful. Once again the documentation is sparse, but this comment from Igor (Nginx author) sheds some light on how bursting works.
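
    Bursting can be pictured as a leaky bucket: requests above the steady rate pile up in a bucket that drains over time, and a request is refused once the bucket holds more than the burst allowance. The Python sketch below is a simplified model of that idea, not nginx’s actual code. The class and method names are made up, and it only decides accept or reject, ignoring any delaying of queued requests.

```python
class LeakyBucket:
    """Simplified model of a request rate limit with a burst allowance."""

    def __init__(self, rate_per_second, burst):
        self.rate = rate_per_second
        self.burst = burst
        self.excess = 0.0  # requests currently held above the steady rate
        self.last = None   # time of the last accepted request

    def allow(self, now):
        if self.last is None:
            self.last = now
            return True  # the first request from a client always passes
        # The bucket drains at `rate` since the last accepted request,
        # then this request adds one unit; it never goes below zero.
        excess = max(0.0, self.excess - self.rate * (now - self.last) + 1.0)
        if excess > self.burst:
            return False  # over the burst allowance: reject, state unchanged
        self.excess = excess
        self.last = now
        return True


bucket = LeakyBucket(rate_per_second=1, burst=2)
# Five requests arriving at the same instant: one passes, two more fit in the burst.
results = [bucket.allow(0.0) for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

    One second later the bucket has drained by one request, so a new request is accepted again.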

    I’ve enabled all three features on our site. Bandwidth limiting by country, limiting concurrent connections and limiting requests over a time period. I serve around 20 to 40 million requests a day on a single nginx box and I haven’t noticed much performance degradation with the new config. It has quadrupled the size of each nginx process though to about 46M per process, but that’s still a lot smaller than most web server processes.

  • Great piece of writing from Rolling Stone

    Rolling Stone is running an article about the US pig farming industry. The opening paragraph is a beauty!

    Smithfield Foods, the largest and most profitable pork processor in the world, killed 27 million hogs last year. That’s a number worth considering. A slaughter-weight hog is fifty percent heavier than a person. The logistical challenge of processing that many pigs each year is roughly equivalent to butchering and boxing the entire human populations of New York, Los Angeles, Chicago, Houston, Philadelphia, Phoenix, San Antonio, San Diego, Dallas, San Jose, Detroit, Indianapolis, Jacksonville, San Francisco, Columbus, Austin, Memphis, Baltimore, Fort Worth, Charlotte, El Paso, Milwaukee, Seattle, Boston, Denver, Louisville, Washington, D.C., Nashville, Las Vegas, Portland, Oklahoma City and Tucson.

    The rest of the article is here.

    Smithfield’s response is here. If as they claim the article is false then I’m sure they’ll sue RS for libel. Somehow I don’t think we’re going to see that lawsuit though.

    Here’s a Google Maps satellite photo showing one of Smithfield’s ponds, via Hacker News.

  • Does your startup pass The Sleep Test

    Having coffee at 4am after an all-nighter a few days ago, my co-founder and wife and I came up with a rather obvious but interesting concept. I’ll call it The Sleep Test.

    Unless your business earns revenue while you are sleeping, it won’t scale.

    If you’re an I.T. consultant or lawyer selling your own time, you can’t scale.

    If you’re a brick-layer who employs other brick layers and also employs a sales person, driver, accountant and all the other business components so that your business runs while you’re not there, you CAN scale.

    If you’re a web developer who writes an application that earns ad revenue or that earns subscription money while you sleep, you CAN scale.

    Most businesses start off with a founder selling their time and with the maximum earnable revenue being tightly limited by the founder’s available time. The founder works themselves into a stupor and at some point they go through what is often a difficult transition where they “step back” from the business and employ others to take over their various jobs. Many businesses don’t make this transition and it is the subject of much discussion in MBA programs world-wide. The birth of Kinko’s is a great example of this evolution. Paul Orfalea is dyslexic and in the story of Kinko’s he mentions how this forced him to step back from the business and employ others.

    Many “Web businesses” or “Software businesses” need to employ a sales team or have components like fulfillment that don’t scale easily or cheaply. But if your business is a “Web App” that earns you money through advertising or through subscriptions and where the application is the business, it scales incredibly well.

    Web App businesses scale so well that if you “get it right”, they automatically pass the sleep test from day one and they pass the test without you having to employ additional staff.

    Two types of Web App that often pass the sleep test are:

    1. A service that attracts huge numbers of an attractive demographic that can earn you ad revenue or
    2. A service that is so valuable to a group of people that they will pay you for it, preferably on a recurring basis

    Your web app business must also:

    1. Not require additional staff time per customer
    2. Not require additional staff time per dollar earned
    3. Market itself. If its marketing is limited by your time, you won’t scale.
    4. Earn you substantially more money than your business costs to run.

    And that’s it. You need to build a web application that markets itself, earns more money than it burns and that is either wildly popular or wildly valuable.

    If you have a Web App that passes The Sleep Test, congratulations because you have just bypassed one of the most difficult stages of small business evolution and one of the most common points of failure that just about every other business type is forced to navigate.

    Final caveat: I’ve written this post discussing this concept in absolutes i.e. you either do or do not pass the sleep test. Of course in reality there is not a single web app business that does not need to employ more staff as their revenue and customer base grows. Google is a fine example of a business that is designed to avoid having to employ more people as revenue or customers grow and they employ over 20,000 people today. But this test is a useful way to measure and think about how efficiently your business will scale.