Blog

  • What the Web Sockets Protocol means for web startups

    Ian Hickson’s latest draft of the Web Sockets Protocol (WSP) is up for your reading pleasure. It got me thinking about the tangible benefits the protocol is going to offer over the long polling that my company and others have been using for our real-time products.

    The protocol works as follows:

Your browser accesses a web page and loads, let’s say, a Javascript application. The Javascript application then decides it needs a constant flow of data to and from its web server, so it sends an HTTP request that looks like this:

    GET /demo HTTP/1.1
    Upgrade: WebSocket
    Connection: Upgrade
    Host: example.com
    Origin: http://example.com
    WebSocket-Protocol: sample

    The server responds with an HTTP response that looks like this:

    HTTP/1.1 101 Web Socket Protocol Handshake
    Upgrade: WebSocket
    Connection: Upgrade
    WebSocket-Origin: http://example.com
    WebSocket-Location: ws://example.com/demo
    WebSocket-Protocol: sample

    Now data can flow between the browser and server without having to send HTTP headers until the connection is broken down again.
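For illustration, here’s a minimal Python sketch that builds the client half of that handshake by hand (the header names are the draft’s; in a real browser the API does all of this for you, and the host and protocol names here are just the example values from above):

```python
def wsp_handshake_request(host, path, origin, protocol):
    """Build the draft Web Socket handshake request shown above."""
    return (
        "GET " + path + " HTTP/1.1\r\n"
        "Upgrade: WebSocket\r\n"
        "Connection: Upgrade\r\n"
        "Host: " + host + "\r\n"
        "Origin: " + origin + "\r\n"
        "WebSocket-Protocol: " + protocol + "\r\n"
        "\r\n"  # blank line ends the header block
    )

print(wsp_handshake_request("example.com", "/demo",
                            "http://example.com", "sample"))
```

After the server sends back its 101 response over the same TCP connection, both sides simply keep the socket open and switch to the framing described below.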

    Remember that at this point, the connection has been established on top of a standard TCP connection. The TCP protocol provides a reliable delivery mechanism so the WSP doesn’t have to worry about that. It can just send or receive data and rest assured the very best attempt will be made to deliver it – and if delivery fails it means the connection has broken and WSP will be notified accordingly. WSP is not limited to any frame size because TCP takes care of that by negotiating an MSS (maximum segment size) when it establishes the connection. WSP is just riding on top of TCP and can shove as much data in each frame as it likes and TCP will take care of breaking that up into packets that will fit on the network.

The WSP sends data using very lightweight frames. There are two ways the frames can be structured. The first frame type starts with a 0x00 byte (a zero byte), contains UTF-8 text, and ends with a 0xFF byte.

The second WSP frame type starts with a byte that ranges from 0x80 to 0xFF, meaning the byte has the high bit (the left-most binary bit) set to 1. It is followed by a series of length bytes that all have the high bit set, whose seven rightmost bits carry the data length, then a final length byte that doesn’t have the high bit set. The data follows and is exactly the length specified. This second frame type is presumably for binary data and is designed to provide some future-proofing.
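A rough Python sketch of both frame types as the draft describes them (my own encoding for illustration, not normative):

```python
def frame_text(payload):
    # First frame type: 0x00, the UTF-8 text, then 0xFF.
    return b"\x00" + payload.encode("utf-8") + b"\xff"

def frame_binary(frame_type, payload):
    # Second frame type: a type byte with the high bit set (0x80-0xFF),
    # then length bytes carrying 7 bits each (high bit set on all but
    # the last), then the payload itself.
    assert 0x80 <= frame_type <= 0xFF
    length = len(payload)
    groups = []
    while True:
        groups.insert(0, length & 0x7F)  # most significant 7-bit group first
        length >>= 7
        if length == 0:
            break
    length_bytes = bytes(0x80 | g for g in groups[:-1]) + bytes([groups[-1]])
    return bytes([frame_type]) + length_bytes + payload
```

Note that the text frame carries exactly two bytes of packaging (the 0x00 and the 0xFF), which is the overhead figure used in the bandwidth comparison further down.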

If you’re still with me, here’s what this all means. Let’s say you have a web application that has a real-time component. Perhaps it’s a chat application, perhaps it’s Google Wave, perhaps it’s something like my Feedjit Live that is hopefully showing a lot of visitors arriving here in real-time. Let’s say you have 100,000 people using your application concurrently.

    The application has been built to be as efficient as possible using the current HTTP specification. So your browser connects and the server holds the connection open and doesn’t send the response until there is data available. That’s called long-polling and it avoids the old situation of your browser reconnecting every few seconds and getting told there’s no data yet along with a full load of HTTP headers moving back and forward.

Let’s assume that every 10 seconds the server or client has some new data to send to the other. Each time, a full set of client and server headers is exchanged. They look like this:

    GET / HTTP/1.1
    User-Agent: ...some long user agent string...
    Host: markmaunder.com
    Accept: */*
    
    HTTP/1.1 200 OK
    Date: Sun, 25 Oct 2009 17:32:19 GMT
    Server: Apache
    X-Powered-By: PHP/5.2.3
    X-Pingback: https://markmaunder.com/xmlrpc.php
    Connection: close
    Transfer-Encoding: chunked
    Content-Type: text/html; charset=UTF-8

    That’s 373 bytes of data. Some simple math tells us that 100,000 people generating 373 bytes of data every 10 seconds gives us a network throughput of 29,840,000 bits per second or roughly 30 Megabits per second.

    That’s 30 Mbps just for HTTP headers.

    With the WSP every frame only has 2 bytes of packaging. 100,000 people X 2 bytes = 200,000 bytes per 10 seconds or 160 Kilobits per second.
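The arithmetic, spelled out as a quick Python check of the numbers above:

```python
USERS = 100_000
INTERVAL_S = 10          # one message every 10 seconds

def overhead_bps(bytes_per_message):
    # Network throughput consumed by per-message packaging alone.
    return USERS * bytes_per_message * 8 / INTERVAL_S

long_polling = overhead_bps(373)  # full HTTP header exchange
web_sockets  = overhead_bps(2)    # 0x00 ... 0xFF framing

print(long_polling)  # 29840000.0 bits/s, roughly 30 Mbps
print(web_sockets)   # 160000.0 bits/s, i.e. 160 Kbps
```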

So WSP takes 30 Mbps down to 160 Kbps for 100,000 concurrent users of your application. And that’s what Hickson and the WSP team are trying to do for us.

Google would be the single biggest winner if the WSP became standard in browsers and browser APIs like Javascript. Google’s goal is to turn the browser into an operating system and give their applications the ability to run on any machine that has a browser. Operating systems have two advantages over browsers: They have direct access to the network and they have local file system storage. If you solve the network problem you also solve the storage problem because you can store files over the network.

Hickson is also working on the HTML 5 specification for Google, but the current date the recommendation is expected to be ratified is 2022. WSP is also going to take time to be ratified and then incorporated into Javascript (and other) APIs. But it is so strategically important for Google that I expect to see it in Chrome and in Google’s proprietary web servers in the near future.

  • SSL Network problem follow-up

It’s now exactly a week since I blogged about my SSL issues over our network. To summarize: when fetching documents on the web via HTTPS from my servers, the connection would just hang halfway through until it timed out. I had confirmed that it wasn’t the infamous PMTU ICMP issue that is common if you’re fetching documents via HTTPS from a misconfigured web server. It was being caused by inbound HTTPS data packets getting dropped; when the retransmit occurred, the retransmitted packets would get dropped too – exactly the same packet, every time.

Last night we solved it. We’ve been working with Cisco for the last week and have been through several of their engineers with no progress. I was seeing packets arriving on my provider’s switch (we have a great working relationship and share a lot of data like sniffer logs) – but the packet was not arriving on my switch. We had isolated it to the layer 2 infrastructure.

Last night we decided to throw a hail-mary and my provider changed the switch module my two HSRP uplinks were connected to from one 24 port module to another. And holy crap, it fixed the problem. We then reconfigured routes and everything else so that the only thing that had changed was the 24 port module. And it was still fixed.

This is the strangest thing I’ve seen, and the Cisco guys we were working with echoed that. It’s extremely rare for layer 2 infrastructure, which is fairly brain-dead, to cause errors that single out packets belonging to one higher-level protocol like HTTPS. These devices examine the layer-2 header with the MAC address and either forward the entire packet or not. The one thing we did notice is that the packets getting dropped were the last data packet in a PDU (protocol data unit) and were therefore slightly shorter – by about 100 bytes – than the other packets in the PDU, which were stuffed full of data.

    But we’ve exorcised the network ghosts and data is flowing smoothly again.

  • How to integrate PHP, Perl and other languages on Apache

    I have this module that a great group of guys in Malaysia have put together. But their language of choice is PHP and mine is Perl. I need to modify it slightly to integrate it. For example, I need to add my own session code so that their code knows if my user is logged in or not and who they are.

    I started writing PHP but quickly started duplicating code I’d already written in Perl. Fetch the session from the database, de-serialize the session data, that sort of thing. I also ran into issues trying to recreate my Perl decryption routines in PHP. [I use non-mainstream ciphers]

    Then I found ways to run Perl inside PHP and vice-versa. But I quickly realized that’s a very bad idea. Not only are you creating a new Perl or PHP interpreter for every request, but you’re still duplicating code, and you’re using a lot more memory to run interpreters in addition to what mod_php and mod_perl already run.

    Eventually I settled on creating a very lightweight wrapper function in PHP called doPerl. It looks like this:

    $associativeArrayResult = doPerl(functionName, associativeArrayWithParameters);

    function doPerl($func, $arrayData) {
        $ch = curl_init();
        $ip = '127.0.0.1';  // the web service only accepts local connections
        $postData = array(
            'json' => json_encode($arrayData),
            'auth' => 'myPassword',
        );
        curl_setopt($ch, CURLOPT_POST, TRUE);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
        curl_setopt($ch, CURLOPT_URL, "http://" . $ip . "/webService/" . $func . "/");
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        $output = curl_exec($ch);
        curl_close($ch);
        return json_decode($output, TRUE);
    }

    On the other side I have a very fast mod_perl handler that only allows connections from 127.0.0.1 (the local machine). I deserialise the incoming JSON data using Perl’s JSON::from_json(). I use eval() to execute the function name that is, as you can see above, part of the URL. I reserialize the result using Perl’s JSON::to_json($result) and send it back to the PHP app as the HTTP response body.

    This is very very fast because all PHP and Perl code that executes is already in memory under mod_perl or mod_php. The only overhead is the connection creation, sending of packet data across the network connection and connection breakdown. Some of this is handled by your server’s hardware. [And of course the serialization/deserialization of the JSON data on both ends.]
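The server side boils down to a tiny dispatcher: function name plus JSON-encoded arguments in, JSON-encoded result out. Here’s the shape of it sketched in Python rather than mod_perl (the function names and the auth check are illustrative, not the actual handler):

```python
import json

EXPOSED = {}

def expose(func):
    # Whitelist the functions callable over the wire instead of
    # eval()ing arbitrary names from the URL.
    EXPOSED[func.__name__] = func
    return func

@expose
def hello(name="world"):
    return {"greeting": "hello " + name}

def dispatch(func_name, json_payload, auth, password="myPassword"):
    # Check the shared-secret auth parameter, decode the arguments,
    # call the named function, and re-encode its result as JSON.
    if auth != password:
        raise PermissionError("bad auth token")
    args = json.loads(json_payload)
    result = EXPOSED[func_name](**args)
    return json.dumps(result)
```

A whitelist like EXPOSED is one way to avoid the arbitrary-code-execution risk that comes with eval()ing a function name taken from the request.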

    The connection creation is a three way handshake, but because there’s no latency on the link it’s almost instantaneous. The transferring of data is faster than a network because the MTU on your lo interface (the 127.0.0.1 interface) is 16436 bytes instead of the normal 1500 bytes. That means the entire request or response fits inside a single packet. And connection termination is again just two packets from each side and because of the zero latency it’s super fast.

    I use JSON because it’s less bulky than XML and on average it’s faster to parse across all languages. Both PHP and Perl’s JSON routines are ridiculously fast.

    My final implementation on the PHP side is a set of wrapper classes that use the doPerl() function to do their work. Inside the classes I use caching liberally, either in instance variables, or if the data needs to persist across requests I use PHP’s excellent APC cache to store the data in shared memory.

    Update: On request I’ve posted the Perl web service handler for this here. The Perl code allows you to send parameters via POST in two ways: either a query parameter called ‘json’ containing escaped JSON that will get deserialized and passed to your function, or regular POST-style name/value pairs that will be sent to your function as a hashref. I’ve included one test function called hello() in the code. Please note this web service module lets you execute arbitrary Perl code in the module’s namespace and doesn’t filter out double colons, so really you can just do whatever the hell you want. That’s why I’ve included two very simple security mechanisms that I strongly recommend you don’t remove: it only allows requests from localhost, and you must include an ‘auth’ POST parameter containing a password (currently set to ‘password’). You’re going to have to implement the MM::Util::getIP() routine to make this work, and it’s really just a one-liner:

    sub getIP {
        my $r = shift @_;
        # Use X-Forwarded-For if a proxy set it, otherwise the
        # remote host on the connection.
        return $r->headers_in->{'X-Forwarded-For'}
            ? $r->headers_in->{'X-Forwarded-For'}
            : $r->connection->get_remote_host();
    }
  • Routers treat HTTPS and HTTP traffic differently

    OSI Network Model

    Well the title says it all. Internet routers live at Layer 3 [the Network Layer] of the OSI model which I’ve included to the left. HTTP and HTTPS live at Layer 7 (Application layer) of the OSI model, although some may argue HTTPS lives at Layer 6.

    So how is it that Layer 3 devices like routers treat HTTPS traffic differently?

    Because HTTPS servers set the DF or Do Not Fragment IP flag on packets and regular HTTP servers do not.

    This matters because HTTP and HTTPS usually transfer a lot of data. That means that the packets are usually quite large and are often the maximum allowed size.

    So if a server sends out a very big HTTP packet and it goes through a route on the network that does not allow packets that size, then the router in question simply breaks the packet up.

    But if a server sends out a big HTTPS packet and it hits a route that doesn’t allow packets that size, the routers on that route can’t break the packet up. So they drop the packet and send back an ICMP message telling the machine that sent the big packet to adjust its MTU (maximum transmission unit) size and resend the packet. This is called Path MTU Discovery.
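A toy Python model of that router decision (illustrative only; real IP fragmentation also accounts for per-fragment header overhead and 8-byte offset alignment):

```python
ICMP_FRAG_NEEDED = ("destination-unreachable", 3, 4)  # ICMP type 3, code 4

def route_packet(size, next_hop_mtu, df_set):
    # What a router does when a packet reaches a hop with a smaller MTU.
    if size <= next_hop_mtu:
        return ("forward", [size])
    if not df_set:
        # Typical HTTP case: fragment the packet and forward the pieces.
        fragments = []
        while size > 0:
            fragments.append(min(next_hop_mtu, size))
            size -= next_hop_mtu
        return ("forward", fragments)
    # HTTPS case (DF set): drop it and tell the sender to shrink its MTU.
    return ("drop", ICMP_FRAG_NEEDED)
```

If that ICMP message never makes it back to the sender, the connection stalls, which is exactly the failure mode described next.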

    This can create some interesting problems that don’t exist with plain HTTP. For example, if your ops team has gotten a little overzealous with security and decided to filter out all ICMP traffic, your web server won’t receive any of those ICMP messages I’ve described above telling it to break up its packets and resend them. So the large packets typically sent partway through an HTTPS transfer will simply be dropped, and visitors whose network paths need packets broken into smaller pieces will see half-loaded pages from the secure part of your site.

    If you have the problem I’ve described above there are two solutions: If you’re a webmaster, make sure your web server can receive ICMP messages [You need to allow ICMP type 3, code 4: “Fragmentation needed and DF bit set”]. If you’re a web surfer (client) and are trying to access a secure site that has ICMP disabled, adjust your network card’s MTU to be smaller than the default (usually the default is 1500 for ethernet).

    But the bottom line is that if everything else is working fine and you are having a problem sending or receiving HTTPS traffic, know that the big difference with HTTPS traffic over regular web traffic is that the packets can’t be broken up.

  • China's influence in Africa

As an African American, or rather, an American African (I’m white and African born), I hear a constant flow of stories about China’s increasing influence in Africa. They’ve clearly taken a long term view on Africa, perhaps motivated by their projected energy and natural resources needs. If you subscribe to the US view that free trade is good, then this is a good thing. [You can’t have it both ways folks!]

    Whether or not you think it’s good for the continent, the data is surprising:

    • The China National Petroleum Corporation (CNPC) is the single largest shareholder (40 percent) in the Greater Nile Petroleum Operating Company, which controls Sudan’s oil fields and has invested $3 billion in refinery and pipeline con­struction in Sudan since 1999. Sudan now supplies 7% of China’s total oil.
    • In March 2004, Beijing extended a $2 billion loan to Angola in exchange for a contract to supply 10,000 barrels of crude oil per day.
    • In July 2005, PetroChina concluded an $800 million deal with the Nigerian National Petro­leum Corporation to purchase 30,000 barrels of oil per day for one year.
    • In January 2006, China National Offshore Oil Corporation (CNOOC), after failing to acquire American-owned Unocal, purchased a 45 per­cent stake in a Nigerian offshore oil and gas field for $2.27 billion and promised to invest an addi­tional $2.25 billion in field development.
    • In April 2003, approximately 175 People’s Liber­ation Army (PLA) soldiers and a 42-man medical team were deployed to the Democratic Republic of Congo on a peacekeeping mission.
    • In December 2003, 550 peacekeeping troops, equipped with nearly 200 military vehicles and water-supply trucks, were sent to Liberia.
    • China has also deployed about 4,000 PLA troops to southern Sudan to guard an oil pipeline and reaffirmed its intention to strengthen military collaboration and exchanges with Ethi­opia, Liberia, Nigeria, and Sudan.
  • The DOW 10K priced as opportunity cost

    Economists love the concept of opportunity cost because it gives you the real long-term value of an investment or purchase in relative terms – which is really the only way to calculate value. On Wednesday the DOW hit 10,000 again. The US financial press did their part to ring the bell while the banking community celebrated the boost in perceived value and the increased likelihood that the public would buy their wares.

    Fox News, like clockwork, has given former asshole president Bush credit for the recovery. (Skip to 3:00 in the video) “He took the bold moves and look where we are today..”.

    John Authers in the Financial Times is almost embarrassed on Thursday as he delivers the news of what a DOW 10K means in real, opportunity-cost terms. If you invested in the DOW in 1999:

    • Relative to emerging markets you’ve lost 80% of your money.
    • Relative to gold you’ve lost 75% of your money.
    • And even in dollar terms corrected for inflation (using the CPI) you’ve lost around 23% of your money.


  • Using and understanding the world-wide city database data

    One of the most popular pages on this blog is a post I wrote two years ago titled “World wide cities database and other free geospatial data“. There are still few people out there who realize that not only can you get a free world-wide cities database from the National Geospatial-Intelligence Agency in the US, but they have around 4 million other points around the world that even include things like undersea features, palm groves, vineyards and a lot more.

    I got an email from Jamil today asking about how to interpret the data in the NGIA’s database. You can find the data he’s referring to at the NGIA’s site. Each record has a feature classification and a feature designation code. You can see the schema (but without what the codes are) here. For some reason I couldn’t find the actual classifications and designations on the site.

    I did find them posted here. The information may be included in the NGIA’s download files – I haven’t checked.

  • SSL Timeouts and layer 3 infrastructure

    I’ve spent the last 5 days agonizing over a very hard problem on my network. Using curl, LWP::UserAgent, openssl, wget or any other SSL client, I’d see connections either timeout or hang halfway through the transfer. Everything else works fine including secure protocols like SSH and TLS. In fact inbound SSL connections work great too. It’s just when I connect to an external SSL host that it hiccups.

    If you remember your OSI model, SSL is well above layer 3 (IP addresses and routers) and layer 2 (LAN traffic routed via MAC addresses). So the last place I planned to look was the network infrastructure.

    I eliminated specific clients by trying others, and I eliminated the OS by spinning up virtual machines running other versions of Linux. I eliminated my physical hardware by reproducing it on a non-Dell server and having one of the ops guys repro it on his OS X MacBook.

    And just to prove it was the network, which is all that was left, I set up a VPN from one of my machines that tunnelled all traffic over the VPN to a machine on an external network that acted as the router, thereby encapsulating the layer 2 and 3 traffic in a layer 4 and 5 VPN. And the problem went away. So I knew it was the network.

    Tonight, just a few minutes ago, my colo provider took down my local router and I gracefully failed over to the redundant router, and lo and behold the problem has gone away.

    I still don’t know what it is, but what I do know is that a big chunk of layer 3 infrastructure has been changed and it’s fixed a layer 5 problem. What’s weird is that TCP connections (which is what SSL rides on top of) have delivery confirmation. So if the problem was packet loss, TCP would just request the packet again. So it’s something else and something that only affects SSL – and only connections bound from my servers out to the Internet.

    The reason I’m posting this is because during the hours I spent Googling this issue this week (and finding nothing) I saw a lot of complaints about SSL timeouts and no solutions. So if you’re getting timeouts like this, check your underlying infrastructure and you might just be surprised. To verify that it’s a network problem, set up a VPN using PPTP. Set up NAT on the external Linux machine that is your VPN server. Then disable the default gateway on the machine having the issue (the VPN client) and verify that all traffic is routing via your VPN. Then try to reproduce the SSL timeout; if it doesn’t occur, it’s probably your layer 2 or 3 infrastructure.