Ian Hickson’s latest draft of the Web Sockets Protocol (WSP) is up for your reading pleasure. It got me thinking about the tangible benefits the protocol is going to offer over the long polling that my company and others have been using for our real-time products.
The protocol works as follows:
GET /demo HTTP/1.1 Upgrade: WebSocket Connection: Upgrade Host: example.com Origin: http://example.com WebSocket-Protocol: sample
The server responds with an HTTP response that looks like this:HTTP/1.1 101 Web Socket Protocol Handshake Upgrade: WebSocket Connection: Upgrade WebSocket-Origin: http://example.com WebSocket-Location: ws://example.com/demo WebSocket-Protocol: sample
Now data can flow between the browser and server without having to send HTTP headers until the connection is broken down again.
Remember that at this point, the connection has been established on top of a standard TCP connection. The TCP protocol provides a reliable delivery mechanism so the WSP doesn’t have to worry about that. It can just send or receive data and rest assured the very best attempt will be made to deliver it – and if delivery fails it means the connection has broken and WSP will be notified accordingly. WSP is not limited to any frame size because TCP takes care of that by negotiating an MSS (maximum segment size) when it establishes the connection. WSP is just riding on top of TCP and can shove as much data in each frame as it likes and TCP will take care of breaking that up into packets that will fit on the network.
The WSP sends data using very lightweight frames. There are two ways the frames can be structured. The first frame type starts with a 0×00 byte (zero byte), consists of UTF-8 text and ends with a 0xFF byte with the UTF-8 text in between.
The second WSP frame type starts with a byte that ranges from 0×80 to 0xFF, meaning the byte has the high-bit (or left-most binary bit) set to 1. Then there is a series of bytes that all have the high-bit set and the 7 right most bits define the data length. Then there’s a final byte that doesn’t have the high-bit set and the data follows and is the length specified. This second WSP frame type is presumably for binary data and is designed to provide some future proofing.
If you’re still with me, here’s what this all means. Lets say you have a web application that has a real-time component. Perhaps it’s a chat application, perhaps it’s Google Wave, perhaps it’s something like my Feedjit Live that is hopefully showing a lot of visitors arriving here in real-time. Lets say you have 100,000 people using your application concurrently.
The application has been built to be as efficient as possible using the current HTTP specification. So your browser connects and the server holds the connection open and doesn’t send the response until there is data available. That’s called long-polling and it avoids the old situation of your browser reconnecting every few seconds and getting told there’s no data yet along with a full load of HTTP headers moving back and forward.
Lets assume that every 10 seconds the server or client has some new data they need to send to each other. Each time a full set of client and server headers are exchanged. They look like this:GET / HTTP/1.1 User-Agent: ...some long user agent string... Host: markmaunder.com Accept: */* HTTP/1.1 200 OK Date: Sun, 25 Oct 2009 17:32:19 GMT Server: Apache X-Powered-By: PHP/5.2.3 X-Pingback: http://markmaunder.com/xmlrpc.php Connection: close Transfer-Encoding: chunked Content-Type: text/html; charset=UTF-8
That’s 373 bytes of data. Some simple math tells us that 100,000 people generating 373 bytes of data every 10 seconds gives us a network throughput of 29,840,000 bits per second or roughly 30 Megabits per second.
That’s 30 Mbps just for HTTP headers.
With the WSP every frame only has 2 bytes of packaging. 100,000 people X 2 bytes = 200,000 bytes per 10 seconds or 160 Kilobits per second.
So WSP takes 30 Mbps down to 160 Kbps for 100,000 concurrent users of your application. And that’s what Hickson and the WSP team and trying to do for us.