Blog

  • What I love about the HN community

    The smartest, most helpful people hang out there. Thanks for the awesome theme, Lucian!

     

     

     

     

  • Insulin may be a steroid masquerading as a hormone.

    Insulin
    Computer-generated image of six insulin molecules assembled in a hexamer.

    At the 1998 Winter Olympic Games in Nagano, a Russian medical officer asked the Olympic Committee whether the use of insulin was restricted to athletes who are insulin dependent diabetics. The incident drew attention to insulin and the IOC were swift to ban it as a performance enhancing drug.

    I recently posted a question on Quora asking what the best nutrition book is, and Nutrient Timing by Ivy and Portman came up. The book is excellent and has a huge amount of physiology data on how the human body makes and uses energy. The core concept is this:

    In the 45 minutes post exercise, your body has very high insulin sensitivity. This period is referred to as the Anabolic phase. By consuming a drink of protein and carbs in a 1:3 or 1:4 ratio, you can significantly boost your insulin level during this period. You can also prolong this period and increase recovery and growth by continuing to consume said drink 2 hours and again at 4 hours post exercise.

    Boosting your insulin levels post exercise reduces protein loss from muscles and improves protein retention. It also speeds recovery by replenishing glycogen and creatine stores. Ivy and Portman spend much of the book citing supporting research from many studies including the Marine Corps.

    The book also recommends taking an anti-oxidant post exercise to reduce muscle oxidation.

    According to Ivy and Portman and many other nutritionists, the best source of protein is whey protein isolate (links to the one I bought recently), which is rich in branched chain amino acids (BCAAs). The best source of carbs in their recipe is good old sucrose (table sugar).

    Dara Torres
    Dara Torres at the 2008 summer Olympics.

    I was chatting to my wife about the book and she mentioned that Dara Torres (three silvers in the previous summer Olympics and the oldest swimmer to ever be on the US Olympic team) drinks chocolate milk as her favorite recovery drink. Chocolate is rich in anti-oxidants, it contains sucrose and milk has some protein, but not enough to make the ratio 4:1. (The sucrose to protein ratio is probably more like 16:1). So I’m guessing that Dara adds a source of protein like whey protein isolate to the drink.

    I’m training this year for either a half or full Ironman next year and doing a half and full marathon this year to build up to it. I’m currently doing two 5 mile runs and one long run (currently 10 miles) each week. I also swim 2000m two to three times a week and I do the occasional core strength workout. As I built up to my current volume my energy level collapsed – both mental and physical. Once I started looking at my nutrition and using a post workout recovery nutrition plan I came back with a vengeance. Two weeks after starting the plan I ran the fastest 5 mile pace I’ve ever run and felt great afterwards.

    After doing further reading online I’ve modified my recipe to have a 1:1 ratio of protein to carbohydrates post workout. A 3:1 or 4:1 ratio of carbohydrates to protein seems to build a lot of muscle and my goal is to stay lean but recover fast.

    My current post workout nutrition plan is:

    • 1 whole raw egg, 56 grams whey protein (two scoops), two tablespoons of molasses (high in phosphorus), a tablespoon of brown sugar, two cups of skim milk, a heaped spoon of cocoa powder. Blend and drink two thirds.
    • Drink the remaining third 1.5 to two hours after workout.

     

  • Where's the Disruption from the Change in Startup Economics?

    It’s been a year-long break from blogging, and getting back to writing and getting so many new visitors this soon is cool. [Thanks HN!]

     

    This blog runs on the smallest available Linode 512 instance for $20/month. It runs several sites including family blogs and hobby sites. I run nginx on the front end and reverse proxy to 5 Apache children which saves me having to run roughly 100 Apache children to handle the brief spikes of around 20 hits per second I saw yesterday.

     

    Technologies like event-servers (Nginx, node.js, etc) and cheap and reliable virtualization may seem like old hat, but in 2005 Linode was charging $40/month for a 128Meg instance (it’s now $20/month for 512Megs, 88% cheaper per megabyte) and Nginx was only going to hit mainstream use two years later. In fact Nginx only hit version 1.0 last month.

    Five years ago many companies or bloggers would have used a physical box with 3.5 Gigabytes of memory to handle 100 apache instances and the database for this kind of traffic. About $300/month based on current pricing for physical dedicated servers from ServerBeach which hasn’t changed much since 2005.

    With the move from hardware and multiprocess servers to virtualization and event-servers, hosting costs have dropped to about 7% of what they were 5 years ago ($20/month versus roughly $300/month). A drop of 93% in a variable cost for any sector changes the economics in a way that usually causes disruption and innovation.
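
    The percentages in this post are simple ratios of the prices quoted above. Here is a minimal sketch of that arithmetic, using only the figures already mentioned ($40 for 128MB in 2005, $20 for 512MB now, and roughly $300/month for a dedicated box). The function name is my own.

```python
def percent_drop(old_cost, new_cost):
    """Return how much cheaper new_cost is than old_cost, as a percentage."""
    return (1 - new_cost / old_cost) * 100

# Linode price per megabyte of RAM: $40 for 128MB (2005) vs $20 for 512MB (now).
per_mb_2005 = 40 / 128
per_mb_now = 20 / 512
print(f"Per-MB price drop: {percent_drop(per_mb_2005, per_mb_now):.1f}%")

# Whole-setup monthly cost: ~$300 dedicated server vs a $20 Linode.
print(f"Monthly hosting drop: {percent_drop(300, 20):.1f}%")
```

    The first line reproduces the "88% cheaper" figure (87.5% before rounding), and the second shows how large the overall drop in monthly cost is.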

    So where is the disruption and innovation happening now that anyone can afford a million-hits-a-month server?

     

    Footnotes: An unstable version of Nginx was available in 2005/2006 and Lighttpd was also an alternative back then for reverse proxying. But it was for hardcore hackers who didn’t mind relatively unstable and bleeding-edge configurations. Mainstream configuration in 2005 was running big memory servers on dedicated machines with a huge number of Apache children. Sadly, much of the web is still run this way. I shudder to think of the environmental impact of all those front-end web boxes. I also don’t address the subject of Keep-Alive on Apache. Disabling Keep-Alive is a way to get a lot more bang for your hardware (specifically you need less memory because you run fewer Apache children) while sacrificing some browser performance. The norm in 2005 was to leave Keep-Alive enabled, but set to a short timeout. With Keep-Alive set to 15 seconds, my estimate of 100 Apache instances for 20 hits per second is probably way too optimistic. With Keep-Alive disabled you would barely handle 20 requests per second with 100 children when taking into account latency per request for slower connections. Bandwidth cost is also a consideration, but gzip and running compressed code, using CDN versions of libs like jQuery that someone else hosts and running a stripped down site with few images helps. [Think Craigslist] With a page size of 100K, Linode’s 400GB bandwidth allowance gives you 4,194,304 pageviews.
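
    Two of the estimates in the footnotes can be sketched in a few lines. The pageview figure assumes binary units (1GB = 1024 x 1024 KB), which is what reproduces the number quoted above. The worker estimate uses Little's law (concurrent workers = arrival rate x time each worker is tied up); the 15 second and 5 second hold times are illustrative assumptions of mine, not measurements.

```python
KB_PER_GB = 1024 * 1024  # binary units; this reproduces the quoted figure

def pageviews_for_allowance(allowance_gb, page_kb):
    """How many pages of page_kb kilobytes fit in a bandwidth allowance."""
    return allowance_gb * KB_PER_GB // page_kb

def workers_needed(requests_per_second, seconds_held_per_request):
    """Little's law: workers busy at once = arrival rate x hold time."""
    return requests_per_second * seconds_held_per_request

print(pageviews_for_allowance(400, 100))  # 400GB allowance, 100K pages
print(workers_needed(20, 15))  # worst case: each child held for a 15s Keep-Alive
print(workers_needed(20, 5))   # assumed 5s per request on slow connections
```

    Under these assumptions a 15 second Keep-Alive could tie up as many as 300 children at 20 hits per second, which is why 100 looks optimistic.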

     

  • Domain name search tools

    Clarence from Panabee pinged me a few minutes ago mentioning Panabee.com. I hadn’t heard of it and along with nxdom.com I’m going to add it to my toolkit to brainstorm available domain names.

    My attitude re names these days fluctuates between the-name-is-everything and back to sanity.

    A week ago I was obsessed with the domain name WordPrice.com which a friendly cybersquatter wanted to sell me for $700. I even contacted the owner of a very similar mark and kindly got the OK to use it for what I intended. Then backed off at the last minute because a) I refuse to support cybersquatting and b) names are more about creating a well loved and well remembered brand than pretty words.

    Keep in mind the relative strength of different types of trademarks when you’re thinking about future brands. Make sure you do a USPTO search and at some point spend $500 with a TM attorney to get your use of your new mark on record and start the trademark clock. I also tend to screenshot a few 100-result google searches for any new potentially strong mark I’m going to use. I date them and file them. [Once you’ve had your ass handed to you in a trademark lawsuit like I have, you get paranoid]

     

  • It's OK to make an extra $2k per month if you're a programmer. Here's how.

    This quote, which went viral 2 months ago and that Steinbeck probably never said, has stuck with me:

    “Socialism never took root in America because the poor see themselves not as an exploited proletariat but as temporarily embarrassed millionaires.” ~Maybe not Steinbeck, but it’s cool and it’s true.

    As temporarily embarrassed millionaire programmers I feel we sometimes don’t pursue projects that could be buying awesome toys every month, making up for that underwater mortgage or adding valuable incremental income. Projects in this space aren’t the next Facebook or Twitter so they don’t pass the knock-it-out-the-park test.

    There are so many ideas in this neglected space that have worked and continue to work. Here’s a start:

    1. Do a site:.gov search on Google for downloadable government data.
    2. Come up with a range of data that you can republish in directory form. Spend a good few hours doing this and create a healthy collection of options.
    3. You might try a site:.edu search too and see if universities have anything interesting.
    4. site:.ac.uk site:.ac.za – you get the idea.
    5. Experiment with Google’s Keyword Tool.
    6. Make sure you’re signed in.
    7. Click Traffic Estimator on the left.
    8. Enter keywords that describe the data sets you’ve come up with. Enter a few to get a good indication of each category or sector’s potential.
    9. Look at search volume to find sectors that are getting high search volumes.
    10. Look at CPC to find busy sectors that also have advertisers that are paying top dollar for clicks.
    11. Finally, look at the Competition column to get an idea of how many advertisers are competing in the sector.
    12. First prize is high search volume, high CPC, high competition. Sometimes you can’t have it all, but get as close as you can.
    13. Now that you’ve chosen a lucrative sector with lots of spendy advertisers and have government or academic data you can republish, figure out a way to generate thousands of pages of content out of that data and solve someone’s problem. The problem could be “Why can’t I find a good site about XYZ when I google for such-and-such.”
    14. Give the site a good solid SEO link structure with breadcrumbs and cross-linking. Emphasize relevant keywords with the correct html tags and avoid duplicate content. Make sure the site performance is wicked fast or you’ll get penalized. Nginx reverse-proxying Apache is always a good bet.
    15. Tell the right people about your site and tell them regularly via great blog entries, insightful tweets, and networking in your site’s category.
    16. Keep monitoring Googlebot crawl activity, how your site is being indexed and tweak it for 6 months until it’s all indexed, ranking and getting around 50K visits per month (1666 visits per day).
    17. That’s 150,000 page views per month at 3 pages per visit average.
    18. At a 1.6% CTR with $0.85 CPC from Adsense you’re earning $2040 per month.
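
    The arithmetic in the last three steps can be sketched as below, so you can plug in the numbers for your own sector. The inputs are the ones from the list (50K visits, 3 pages per visit, 1.6% CTR, 85 cents per click); the function names are my own.

```python
def monthly_pageviews(visits_per_month, pages_per_visit):
    """Total page views from visits and average pages per visit."""
    return visits_per_month * pages_per_visit

def adsense_revenue(pageviews, ctr, cpc_dollars):
    """Estimated revenue: page views x click-through rate x cost per click."""
    return pageviews * ctr * cpc_dollars

visits = 50_000
pageviews = monthly_pageviews(visits, 3)
revenue = adsense_revenue(pageviews, ctr=0.016, cpc_dollars=0.85)

print(f"Visits per day: {visits // 30}")
print(f"Page views per month: {pageviews}")
print(f"Estimated revenue per month: ${revenue:.2f}")
```

    Halving either the CTR or the CPC halves the result, so it is worth running a pessimistic case too before committing six months to a sector.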

    Update: To clarify, “competition” above refers to competition among advertisers paying for clicks in a sector. More competition is a good thing for publishers because it means higher CPC and more ad inventory i.e. a higher likelihood an ad will be available for a specific page with specific subject matter in your space. [Thanks Bill!]

    Update2: My very good mate Joe Heitzeberg runs MediaPiston which is a great way to connect with high quality authors of original content. If you do have a moderate budget and are looking for useful and unique content to get started, give Joe and his crew a shout! They have great authors and have really nailed the QA and feedback process with their platform.

  • SEO: Don't use private registration

    This one is short and sweet. A new domain recently wasn’t getting any SEO traffic after 2 months. As soon as the registration was made non-private, i.e. we removed the Domains By Proxy mask on who owns the domain, it started getting traffic and has been growing ever since.

    Correlation does not equal causation, but it does give me pause.

    While ICANN has made it clear that the whois database has one purpose only, Google publicly stated they became a registrar to “increase the quality of our search results”.

     

  • SEO: Google may treat blogs differently

    A hobby site I have has around 300,000 pages indexed and good pagerank. It gets a fair amount of SEO traffic which has been growing. The rate at which Google indexes the site has been steadily climbing and is now around 2 to 3 pages per second.

    I added a new page on the site that was linked to from most other pages about a week ago. The page had a query string variable called “ref”. The instant it went live, Googlebot went crazy indexing the page and considering every permutation of “ref” to be a different page, even though the page generated was identical every time. The page quickly appeared in Google’s index. I solved it by telling Googlebot to ignore “ref” through Webmaster Tools and temporarily disallowing indexing using robots.txt.
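
    To see why a crawler treats each “ref” value as a separate page, and what ignoring the parameter achieves, here is a minimal sketch. The URLs and the function name are hypothetical; this only illustrates the idea of collapsing URLs that differ by one query variable, not how Googlebot or Webmaster Tools implement it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def strip_param(url, name="ref"):
    """Return url with every occurrence of the query parameter `name` removed."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != name]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# Hypothetical links to the same page from different places on the site.
urls = [
    "http://example.com/page?ref=home",
    "http://example.com/page?ref=footer",
    "http://example.com/page?ref=sidebar",
]

print(len(set(urls)), "distinct URLs as crawled")
print(len({strip_param(u) for u in urls}), "distinct URL once ref is ignored")
```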

    A week later I added another new page. This time I used WordPress.org as a CMS and created a URL, let’s call it “/suburl/”, and published the new page as “/suburl/blog-entry-name.html”. Again I linked to it from every page on the site.

    Googlebot took a sniff at “/suburl/” and at “/suburl/?feed=rss2” and then a day later it grabbed “/suburl/author/authorname” but it never put the page in its search index and hasn’t visited since. The bot continues to crawl the rest of the site aggressively.

    Back in 2009, Matt Cutts (Google search quality team) mentioned that “WordPress takes care of 80-90% of (the mechanics of) Search Engine Optimization (SEO)”.

    A different interpretation is that “WordPress gives Google a machine readable platform with many heuristics that can be used to more accurately assess page quality”.

    One of those heuristics is age of the blog and number of blog entries. Creating a fresh blog on a fresh domain or subdomain and publishing a handful of affiliate targeted pages is a common splog (spam blog) tactic. So it’s possible that Google saw my one-page-blog and decided the page doesn’t get put in the index until the blog has credibility.

    So from now on when I have content to put online, I’m going to consider carefully whether I’m going to publish it using WordPress as a CMS with just a handful of blog entries, or if I’m going to hand-publish it (which has worked well for me so far).

    Let me know if your mileage varies.

  • What an Instant-Edu machine might do to Education

    The last two scifi novels I’ve read coincidentally both had a machine that can upload several years of education to your brain in a matter of hours. I was ruminating on what the effect would be on education if we invented the instant-edu machine today.

    Imagine you could instant-edu the Harvard Business School syllabus in a few hours. HBS’s 2010 revenue was $467 million. The 2011 MBA program has 937 students.  My HBS graduate friends tell me that it’s not about the education, it’s about the networking opportunities. So in the case of HBS, the instant-edu machine would not replace the experience, because really the HBS MBA program is quite possibly the most expensive and time consuming business networking program in the world.

    So how would HBS adapt to the instant-edu machine? They might revise the $102,000 tuition fees down slightly since all data contained in textbooks will simply be uploaded in a matter of hours.

    Since all documented parts of the syllabus will be instantly absorbed by all students, networking will be the core activity. But students won’t spend the time helping each other retain knowledge because it will already be retained. Instead they would focus on innovating using the knowledge they’ve gained. Throughout the 2 year period, they could innovate in different settings. One class might drop LSD and see if a new interpretation arises. Another might use debate to provoke innovative arguments or solutions.

    Or perhaps institutions like Harvard will disappear over time and we will revert to the 17th century Persian coffee house scene where thinkers are free to gather for the price of a cup of coffee and share and debate ideas and come up with new ones. Perhaps each coffee shop could have their own football team…

     

  • Back blogging

    After a year without feeling the need to hold forth on issues I know very little about, I’m back blogging. The spammers got hold of my blog and I deleted thousands of garbage comments that managed to get through my spam filter. If I accidentally deleted yours or you’re unable to post a comment because you’re flagged as a spammer, email me to fix it.

     

  • How to reliably limit the amount of bandwidth your room mate or bad office colleague uses

    Update: It seems I’ve created a monster. I’ve had my first two Google searchers arrive on this blog entry searching for “limit roomate downloading” and “netgear limit roomate”. Well after years of experimenting with QoS this is the best method I’ve found to do exactly that, so enjoy.

    For part of the year I’m on a rural wifi network that, on a good day, gives me 3 megabits per second download speed and 700kbps upload speed. I’ve tried multiple rural providers, had them rip out their equipment because of the packet loss (that means you Skybeam), I’ve shouted at Qwest to upgrade the local exchange so we can get DSL, but for now I’m completely and utterly stuck on a 3 megabits downlink using Mile High Internet.

    I have an occasional room-mate, my nephew, who downloads movies on iTunes and it uses about 1.5 to 3 megabits. I’ve tried configuring quality of service (QoS) on various routers including Netgear and Linksys/Cisco and the problem is that I need a zero latency connection for my SSH sessions to my servers. So while QoS might be great if everyone’s using non-realtime services like iTunes downloads and web browsing, when you are using SSH or a VoIP product like Skype, it really sucks when someone is hogging the bandwidth.

    The problem arises because of the way most streaming movie players download movies. They don’t just do it using a smooth 1 megabit stream. They’ll suck down as much as your connection allows, buffer it and then use very little bandwidth for a few seconds, and then hog the entire connection again. If you are using SSH and you hit a key, it takes a while for the router to say: “Oh, you wanted some bandwidth, ok fine let me put this guy on hold. There. Now what did you want from me again? Hey you still there? Oh you just wanted one real-time keystroke. And now you’re gone. OK I guess I’ll let the other guy with a lower priority hog the bandwidth again until you hit another keystroke.”

    So the trick, if you want to effectively deal with the movie downloading room-mate, is to limit the amount of bandwidth they can use. That way Netflix, iTunes, YouTube, Amazon Unbox or any other streaming service has to use a constant 1 megabit rather than bursting to 3 megabits and then dropping to zero – and you always have some bandwidth available without having to wait for the router to do its QoS thing.
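
    Here is a toy model of why a hard cap helps. It assumes a 3 megabit link and a made-up second-by-second demand pattern for a bursty player; it ignores queuing and TCP behaviour entirely and only shows how much bandwidth is left over in the worst second.

```python
LINK_KBIT = 3000  # the 3 megabit downlink described above

def free_bandwidth(demand_per_second, cap_kbit=None):
    """Bandwidth left over each second after the downloader takes its share."""
    free = []
    for demand in demand_per_second:
        used = min(demand, LINK_KBIT)
        if cap_kbit is not None:
            used = min(used, cap_kbit)
        free.append(LINK_KBIT - used)
    return free

# Hypothetical player: grabs the whole link, idles, then grabs it again.
bursty_demand = [3000, 3000, 0, 0, 0, 3000]

print("Worst-case headroom, no cap:", min(free_bandwidth(bursty_demand)), "kbit")
print("Worst-case headroom, 1 mbit cap:", min(free_bandwidth(bursty_demand, 1000)), "kbit")
```

    Without the cap there are seconds with nothing left for a keystroke; with the cap there is always headroom, whatever the player does.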

    Here’s how you do it.

    First install DD-WRT firmware on your router. I use a Netgear WNDR3300 router and after using various Linksys/Cisco routers I swear by this one. It has two built in radios so you can create two wireless networks, one on 2.4GHz and one on 5GHz. It’s also fast and works 100% reliably.

    Then look up your router on dd-wrt’s site and download DD-WRT for your router and install it. I use version “DD-WRT v24-sp2 (10/10/09) std – build 13064”. There are newer builds available, but when I wrote this this was the recommended version.

    Once you’re all set up and you have your basic wireless network with DD-WRT, make sure you disable QoS (it’s disabled by default).

    Then configure SSH on DD-WRT. It’s a two step process. First you have to click the “Services” tab and enable SSHd. Then you have to click the Administration tab and enable SSH remote management.

    Only the paid version of DD-WRT supports per user bandwidth limits, but I’m going to show you how to do it free with a few shell commands. I actually tried to buy the paid version of DD-WRT to do this, but their site is confusing and I couldn’t get confirmation they actually support this feature. So perhaps the author can clarify in a comment.

    Because you’re going to enter shell commands, I recommend adding a public key for password-less authentication when you log in to DD-WRT. It’s on the same DD-WRT page where you enabled the SSHd.

    Tip: Remember that with DD-WRT, you have to “Save” any config changes you make and then “Apply settings”. Also DD-WRT gets confused sometimes when you make a lot of changes, so just reboot after saving and it’ll unconfuse itself.

    Now that you have SSHd set up, remote ssh login enabled and hopefully your public ssh keys all set up, here’s what you do.

    SSH to your router IP address:

    ssh root@192.168.1.1

    Enter password.

    Type “ifconfig” and check which interface your router has configured as your internal default gateway. The IP address is often 192.168.1.1. The interface is usually “br0”.

    Let’s assume it’s br0.

    Enter the following command which clears all traffic control settings on interface br0:

    tc qdisc del dev br0 root

    Then enter the following:


    tc qdisc add dev br0 root handle 1: cbq \
    avpkt 1000 bandwidth 2mbit

    tc class add dev br0 parent 1: classid 1:1 cbq \
    rate 700kbit allot 1500 prio 5 bounded isolated

    tc filter add dev br0 parent 1: protocol ip \
    prio 16 u32 match ip dst 192.168.1.133 flowid 1:1

    tc filter add dev br0 parent 1: protocol ip \
    prio 16 u32 match ip src 192.168.1.133 flowid 1:1

    These commands will rate limit the IP address 192.168.1.133 to 700 kilobits per second.
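
    If you want to limit a different address or rate, the same commands can be generated rather than edited by hand. This is a minimal sketch that only builds and prints the command strings shown above; it does not connect to the router. The function name and its defaults are my own, and the 2mbit link figure is simply carried over from the commands above.

```python
import ipaddress

def tc_rate_limit_commands(ip, rate_kbit, dev="br0", link="2mbit"):
    """Build the tc commands that cap `ip` at `rate_kbit` kilobits per second."""
    ipaddress.IPv4Address(ip)  # raises ValueError if ip is not a valid IPv4 address
    rate_kbit = int(rate_kbit)
    if rate_kbit <= 0:
        raise ValueError("rate must be a positive number of kilobits")
    return [
        # Clear any existing traffic control settings on the interface.
        f"tc qdisc del dev {dev} root",
        # Root CBQ queueing discipline for the interface.
        f"tc qdisc add dev {dev} root handle 1: cbq avpkt 1000 bandwidth {link}",
        # A bounded, isolated class that cannot borrow extra bandwidth.
        f"tc class add dev {dev} parent 1: classid 1:1 cbq "
        f"rate {rate_kbit}kbit allot 1500 prio 5 bounded isolated",
        # Send traffic to and from the address into that class.
        f"tc filter add dev {dev} parent 1: protocol ip prio 16 u32 "
        f"match ip dst {ip} flowid 1:1",
        f"tc filter add dev {dev} parent 1: protocol ip prio 16 u32 "
        f"match ip src {ip} flowid 1:1",
    ]

for command in tc_rate_limit_commands("192.168.1.133", 700):
    print(command)
```

    Paste the printed lines into your SSH session on the router. Validating the address first means a typo fails on your machine instead of producing a confusing tc error on the router.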

    If you’ve set up automatic authentication and you’re running OS X, here’s a perl script that will do all this for you:

    #!/usr/bin/perl
    # Usage: ratelimit.pl <ip> <rate in kbit>
    use strict;
    use warnings;

    my $ip = $ARGV[0] // '';
    my $rate = $ARGV[1] // '';

    # Require a dotted-quad IP address and a whole-number rate.
    $ip =~ m/^\d+\.\d+\.\d+\.\d+$/ &&
    $rate =~ m/^\d+$/ ||
    die "Usage: ratelimit.pl <ip> <rate in kbit>\n";

    $rate = $rate . 'kbit';

    # Clear any existing traffic control settings on br0.
    print `ssh root\@192.168.1.1 "tc qdisc del dev br0 root"`;

    # Add the CBQ qdisc, the bounded class and the dst/src filters for $ip.
    print `ssh root\@192.168.1.1 "tc qdisc add dev br0 root handle 1: cbq avpkt 1000 bandwidth 2mbit ; tc class add dev br0 parent 1: classid 1:1 cbq rate $rate allot 1500 prio 5 bounded isolated ; tc filter add dev br0 parent 1: protocol ip prio 16 u32 match ip dst $ip flowid 1:1 ; tc filter add dev br0 parent 1: protocol ip prio 16 u32 match ip src $ip flowid 1:1"`;

    You’ll see a few responses from DD-WRT when you run the script and might see an error about a missing file, but that’s just because the script tried to delete a rule on interface br0 that might not have existed when it started.

    These rules put a hard limit on how much bandwidth an IP address can use. What you’ll find is that even if you rate limit your room mate to 1 megabit, as long as you have 500 kbit all to yourself, your SSH sessions will have absolutely no latency, Skype will not stutter, and life will be good again. I’ve tried many different configurations with various QoS products and have never achieved results as good as I’ve gotten with these rules.

    Notes: I’ve configured the rules on the internal interface even though most QoS rules are generally configured on an external interface because it’s the only thing that really really seems to work. The Cisco engineers among you may disagree, but go try it yourself before you comment. I’m using the Linux ‘tc’ command and the man page is here.

    PS: If you are looking for a great router to install DD-WRT on, try the Cisco-Linksys E3200. It has a ton of RAM and its 500 MHz CPU is actually faster than the one in the more expensive E4200, which only runs at 480 MHz. It is also the cheapest Gigabit Ethernet E series router that Cisco-Linksys offers. Here are the Cisco-Linksys E3200’s full specs on DD-WRT’s site. The E3200 is fully DD-WRT compatible but if you are lazy and don’t want to mess with DD-WRT, check out the E3200’s built in QoS (Quality of Service) in this video.