Blog

  • What the Web Sockets Protocol means for web startups

    Ian Hickson’s latest draft of the Web Sockets Protocol (WSP) is up for your reading pleasure. It got me thinking about the tangible benefits the protocol is going to offer over the long polling that my company and others have been using for our real-time products.

    The protocol works as follows:

Your browser accesses a web page and loads, let’s say, a Javascript application. The Javascript application then decides it needs a constant flow of data to and from its web server, so it sends an HTTP request that looks like this:

    GET /demo HTTP/1.1
    Upgrade: WebSocket
    Connection: Upgrade
    Host: example.com
    Origin: http://example.com
    WebSocket-Protocol: sample

    The server responds with an HTTP response that looks like this:

    HTTP/1.1 101 Web Socket Protocol Handshake
    Upgrade: WebSocket
    Connection: Upgrade
    WebSocket-Origin: http://example.com
    WebSocket-Location: ws://example.com/demo
    WebSocket-Protocol: sample

    Now data can flow between the browser and server without having to send HTTP headers until the connection is broken down again.
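For illustration, here’s a minimal Python sketch that builds the client half of that handshake by hand (the header names are the draft’s; in a real browser the API does all of this for you, and the host and protocol names here are just the example values from above):

```python
def wsp_handshake_request(host, path, origin, protocol):
    """Build the draft Web Socket handshake request shown above."""
    return (
        "GET " + path + " HTTP/1.1\r\n"
        "Upgrade: WebSocket\r\n"
        "Connection: Upgrade\r\n"
        "Host: " + host + "\r\n"
        "Origin: " + origin + "\r\n"
        "WebSocket-Protocol: " + protocol + "\r\n"
        "\r\n"  # blank line ends the header block
    )

print(wsp_handshake_request("example.com", "/demo",
                            "http://example.com", "sample"))
```

After the server sends back its 101 response over the same TCP connection, both sides simply keep the socket open and switch to the framing described below.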

    Remember that at this point, the connection has been established on top of a standard TCP connection. The TCP protocol provides a reliable delivery mechanism so the WSP doesn’t have to worry about that. It can just send or receive data and rest assured the very best attempt will be made to deliver it – and if delivery fails it means the connection has broken and WSP will be notified accordingly. WSP is not limited to any frame size because TCP takes care of that by negotiating an MSS (maximum segment size) when it establishes the connection. WSP is just riding on top of TCP and can shove as much data in each frame as it likes and TCP will take care of breaking that up into packets that will fit on the network.

The WSP sends data using very lightweight frames. There are two ways the frames can be structured. The first frame type starts with a 0x00 byte (a zero byte), contains UTF-8 text, and ends with a 0xFF byte.

The second WSP frame type starts with a byte that ranges from 0x80 to 0xFF, meaning the byte has the high bit (the left-most binary bit) set to 1. It is followed by a series of length bytes that all have the high bit set, whose seven rightmost bits carry the data length, then a final length byte that doesn’t have the high bit set. The data follows and is exactly the length specified. This second frame type is presumably for binary data and is designed to provide some future-proofing.
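A rough Python sketch of both frame types as the draft describes them (my own encoding for illustration, not normative):

```python
def frame_text(payload):
    # First frame type: 0x00, the UTF-8 text, then 0xFF.
    return b"\x00" + payload.encode("utf-8") + b"\xff"

def frame_binary(frame_type, payload):
    # Second frame type: a type byte with the high bit set (0x80-0xFF),
    # then length bytes carrying 7 bits each (high bit set on all but
    # the last), then the payload itself.
    assert 0x80 <= frame_type <= 0xFF
    length = len(payload)
    groups = []
    while True:
        groups.insert(0, length & 0x7F)  # most significant 7-bit group first
        length >>= 7
        if length == 0:
            break
    length_bytes = bytes(0x80 | g for g in groups[:-1]) + bytes([groups[-1]])
    return bytes([frame_type]) + length_bytes + payload
```

Note that the text frame carries exactly two bytes of packaging (the 0x00 and the 0xFF), which is the overhead figure used in the bandwidth comparison further down.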

If you’re still with me, here’s what this all means. Let’s say you have a web application that has a real-time component. Perhaps it’s a chat application, perhaps it’s Google Wave, perhaps it’s something like my Feedjit Live that is hopefully showing a lot of visitors arriving here in real-time. Let’s say you have 100,000 people using your application concurrently.

    The application has been built to be as efficient as possible using the current HTTP specification. So your browser connects and the server holds the connection open and doesn’t send the response until there is data available. That’s called long-polling and it avoids the old situation of your browser reconnecting every few seconds and getting told there’s no data yet along with a full load of HTTP headers moving back and forward.

Let’s assume that every 10 seconds the server or client has some new data to send to the other. Each time, a full set of client and server headers is exchanged. They look like this:

    GET / HTTP/1.1
    User-Agent: ...some long user agent string...
    Host: markmaunder.com
    Accept: */*
    
    HTTP/1.1 200 OK
    Date: Sun, 25 Oct 2009 17:32:19 GMT
    Server: Apache
    X-Powered-By: PHP/5.2.3
    X-Pingback: https://markmaunder.com/xmlrpc.php
    Connection: close
    Transfer-Encoding: chunked
    Content-Type: text/html; charset=UTF-8

    That’s 373 bytes of data. Some simple math tells us that 100,000 people generating 373 bytes of data every 10 seconds gives us a network throughput of 29,840,000 bits per second or roughly 30 Megabits per second.

    That’s 30 Mbps just for HTTP headers.

    With the WSP every frame only has 2 bytes of packaging. 100,000 people X 2 bytes = 200,000 bytes per 10 seconds or 160 Kilobits per second.
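The arithmetic, spelled out as a quick Python check of the numbers above:

```python
USERS = 100_000
INTERVAL_S = 10          # one message every 10 seconds

def overhead_bps(bytes_per_message):
    # Network throughput consumed by per-message packaging alone.
    return USERS * bytes_per_message * 8 / INTERVAL_S

long_polling = overhead_bps(373)  # full HTTP header exchange
web_sockets  = overhead_bps(2)    # 0x00 ... 0xFF framing

print(long_polling)  # 29840000.0 bits/s, roughly 30 Mbps
print(web_sockets)   # 160000.0 bits/s, i.e. 160 Kbps
```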

So WSP takes 30 Mbps down to 160 Kbps for 100,000 concurrent users of your application. And that’s what Hickson and the WSP team are trying to do for us.

Google would be the single biggest winner if the WSP became standard in browsers and browser APIs like Javascript. Google’s goal is to turn the browser into an operating system and give their applications the ability to run on any machine that has a browser. Operating systems have two advantages over browsers: They have direct access to the network and they have local file system storage. If you solve the network problem you also solve the storage problem because you can store files over the network.

Hickson is also working on the HTML 5 specification for Google, but the current date the recommendation is expected to be ratified is 2022. WSP is also going to take time to be ratified and then incorporated into Javascript (and other) APIs. But it is so strategically important for Google that I expect to see it in Chrome and in Google’s proprietary web servers in the near future.

  • SSL Network problem follow-up

It’s now exactly a week since I blogged about my SSL issues over our network. To summarize: when fetching documents on the web via HTTPS from my servers, the connection would just hang halfway through until it timed out. I had confirmed that it wasn’t the infamous PMTU ICMP issue that is common if you’re fetching documents via HTTPS from a misconfigured web server. It was being caused by inbound HTTPS data packets getting dropped; when the retransmit occurred, the retransmitted packets would get dropped too – exactly the same packet, every time.

Last night we solved it. We’ve been working with Cisco for the last week and have been through several of their engineers with no progress. I was seeing packets arriving on my provider’s switch (we have a great working relationship and share a lot of data like sniffer logs) – but the packet was not arriving on my switch. We had isolated it to the layer 2 infrastructure.

Last night we decided to throw a hail-mary and my provider changed the switch module my two HSRP uplinks were connected to from one 24 port module to another. And holy crap, it fixed the problem. We then reconfigured routes and everything else so that the only thing that had changed was the 24 port module. And it was still fixed.

This is the strangest thing I’ve seen, and the Cisco guys we were working with echoed that. It’s extremely rare for layer 2 infrastructure, which is fairly brain-dead, to cause errors that single out packets belonging to one higher-level protocol like HTTPS. These devices examine the layer-2 header with the MAC address and either forward the entire packet or not. The one thing we did notice is that the packets getting dropped were the last data packet in a PDU (protocol data unit) and were therefore slightly shorter – by about 100 bytes – than the other packets in the PDU, which were stuffed full of data.

    But we’ve exorcised the network ghosts and data is flowing smoothly again.

  • How to integrate PHP, Perl and other languages on Apache

    I have this module that a great group of guys in Malaysia have put together. But their language of choice is PHP and mine is Perl. I need to modify it slightly to integrate it. For example, I need to add my own session code so that their code knows if my user is logged in or not and who they are.

    I started writing PHP but quickly started duplicating code I’d already written in Perl. Fetch the session from the database, de-serialize the session data, that sort of thing. I also ran into issues trying to recreate my Perl decryption routines in PHP. [I use non-mainstream ciphers]

    Then I found ways to run Perl inside PHP and vice-versa. But I quickly realized that’s a very bad idea. Not only are you creating a new Perl or PHP interpreter for every request, but you’re still duplicating code, and you’re using a lot more memory to run interpreters in addition to what mod_php and mod_perl already run.

    Eventually I settled on creating a very lightweight wrapper function in PHP called doPerl. It looks like this:

    $associativeArrayResult = doPerl(functionName, associativeArrayWithParameters);

    function doPerl($func, $arrayData) {
        $ch = curl_init();
        $ip = '127.0.0.1';  // the web service only accepts local connections
        $postData = array(
            'json' => json_encode($arrayData),
            'auth' => 'myPassword',
        );
        curl_setopt($ch, CURLOPT_POST, TRUE);
        curl_setopt($ch, CURLOPT_POSTFIELDS, $postData);
        curl_setopt($ch, CURLOPT_URL, "http://" . $ip . "/webService/" . $func . "/");
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        $output = curl_exec($ch);
        curl_close($ch);
        return json_decode($output, TRUE);
    }

    On the other side I have a very fast mod_perl handler that only allows connections from 127.0.0.1 (the local machine). I deserialise the incoming JSON data using Perl’s JSON::from_json(). I use eval() to execute the function name that is, as you can see above, part of the URL. I reserialize the result using Perl’s JSON::to_json($result) and send it back to the PHP app as the HTTP response body.

    This is very very fast because all PHP and Perl code that executes is already in memory under mod_perl or mod_php. The only overhead is the connection creation, sending of packet data across the network connection and connection breakdown. Some of this is handled by your server’s hardware. [And of course the serialization/deserialization of the JSON data on both ends.]
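The server side boils down to a tiny dispatcher: function name plus JSON-encoded arguments in, JSON-encoded result out. Here’s the shape of it sketched in Python rather than mod_perl (the function names and the auth check are illustrative, not the actual handler):

```python
import json

EXPOSED = {}

def expose(func):
    # Whitelist the functions callable over the wire instead of
    # eval()ing arbitrary names from the URL.
    EXPOSED[func.__name__] = func
    return func

@expose
def hello(name="world"):
    return {"greeting": "hello " + name}

def dispatch(func_name, json_payload, auth, password="myPassword"):
    # Check the shared-secret auth parameter, decode the arguments,
    # call the named function, and re-encode its result as JSON.
    if auth != password:
        raise PermissionError("bad auth token")
    args = json.loads(json_payload)
    result = EXPOSED[func_name](**args)
    return json.dumps(result)
```

A whitelist like EXPOSED is one way to avoid the arbitrary-code-execution risk that comes with eval()ing a function name taken from the request.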

    The connection creation is a three way handshake, but because there’s no latency on the link it’s almost instantaneous. The transferring of data is faster than a network because the MTU on your lo interface (the 127.0.0.1 interface) is 16436 bytes instead of the normal 1500 bytes. That means the entire request or response fits inside a single packet. And connection termination is again just two packets from each side and because of the zero latency it’s super fast.

    I use JSON because it’s less bulky than XML and on average it’s faster to parse across all languages. Both PHP and Perl’s JSON routines are ridiculously fast.

    My final implementation on the PHP side is a set of wrapper classes that use the doPerl() function to do their work. Inside the classes I use caching liberally, either in instance variables, or if the data needs to persist across requests I use PHP’s excellent APC cache to store the data in shared memory.

    Update: On request I’ve posted the Perl web service handler for this here. The Perl code allows you to send parameters via POST in two ways: either a query parameter called ‘json’ containing escaped JSON that will get deserialized and passed to your function, or regular POST-style name/value pairs that will be sent to your function as a hashref. I’ve included one test function called hello() in the code. Please note this web service module lets you execute arbitrary Perl code in the module’s namespace and doesn’t filter out double colons, so really you can just do whatever the hell you want. That’s why I’ve included two very simple security mechanisms that I strongly recommend you don’t remove: it only allows requests from localhost, and you must include an ‘auth’ POST parameter containing a password (currently set to ‘password’). You’re going to have to implement the MM::Util::getIP() routine to make this work, and it’s really just a one-liner:

    sub getIP {
        my $r = shift @_;
        # Use X-Forwarded-For if a proxy set it, otherwise the
        # remote host on the connection.
        return $r->headers_in->{'X-Forwarded-For'}
            ? $r->headers_in->{'X-Forwarded-For'}
            : $r->connection->get_remote_host();
    }
  • Routers treat HTTPS and HTTP traffic differently

    OSI Network Model

    Well the title says it all. Internet routers live at Layer 3 [the Network Layer] of the OSI model which I’ve included to the left. HTTP and HTTPS live at Layer 7 (Application layer) of the OSI model, although some may argue HTTPS lives at Layer 6.

    So how is it that Layer 3 devices like routers treat HTTPS traffic differently?

    Because HTTPS servers set the DF or Do Not Fragment IP flag on packets and regular HTTP servers do not.

    This matters because HTTP and HTTPS usually transfer a lot of data. That means that the packets are usually quite large and are often the maximum allowed size.

    So if a server sends out a very big HTTP packet and it goes through a route on the network that does not allow packets that size, then the router in question simply breaks the packet up.

    But if a server sends out a big HTTPS packet and it hits a route that doesn’t allow packets that size, the routers on that route can’t break the packet up. So they drop the packet and send back an ICMP message telling the machine that sent the big packet to adjust its MTU (maximum transmission unit) size and resend the packet. This is called Path MTU Discovery.
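A toy Python model of that router decision (illustrative only; real IP fragmentation also accounts for per-fragment header overhead and 8-byte offset alignment):

```python
ICMP_FRAG_NEEDED = ("destination-unreachable", 3, 4)  # ICMP type 3, code 4

def route_packet(size, next_hop_mtu, df_set):
    # What a router does when a packet reaches a hop with a smaller MTU.
    if size <= next_hop_mtu:
        return ("forward", [size])
    if not df_set:
        # Typical HTTP case: fragment the packet and forward the pieces.
        fragments = []
        while size > 0:
            fragments.append(min(next_hop_mtu, size))
            size -= next_hop_mtu
        return ("forward", fragments)
    # HTTPS case (DF set): drop it and tell the sender to shrink its MTU.
    return ("drop", ICMP_FRAG_NEEDED)
```

If that ICMP message never makes it back to the sender, the connection stalls, which is exactly the failure mode described next.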

    This can create some interesting problems that don’t exist with plain HTTP. For example, if your ops team has gotten a little overzealous with security and decided to filter out all ICMP traffic, your web server won’t receive any of those ICMP messages I’ve described above telling it to break up its packets and resend them. So the large packets typically sent partway through an HTTPS transfer will simply be dropped, and visitors whose network paths need packets broken into smaller pieces will see half-loaded pages from the secure part of your site.

    If you have the problem I’ve described above there are two solutions: If you’re a webmaster, make sure your web server can receive ICMP messages [You need to allow ICMP type 3, code 4: “Fragmentation needed and DF bit set”]. If you’re a web surfer (client) and are trying to access a secure site that has ICMP disabled, adjust your network card’s MTU to be smaller than the default (usually the default is 1500 for ethernet).

    But the bottom line is that if everything else is working fine and you are having a problem sending or receiving HTTPS traffic, know that the big difference with HTTPS traffic over regular web traffic is that the packets can’t be broken up.

  • China's influence in Africa

As an African American, or rather, an American African (I’m white and African born), I hear a constant flow of stories about China’s increasing influence in Africa. They’ve clearly taken a long term view on Africa, perhaps motivated by their projected energy and natural resources needs. If you subscribe to the US view that free trade is good, then this is a good thing. [You can’t have it both ways folks!]

    Whether or not you think it’s good for the continent, the data is surprising:

    • The China National Petroleum Corporation (CNPC) is the single largest shareholder (40 percent) in the Greater Nile Petroleum Operating Company, which controls Sudan’s oil fields and has invested $3 billion in refinery and pipeline con­struction in Sudan since 1999. Sudan now supplies 7% of China’s total oil.
    • In March 2004, Beijing extended a $2 billion loan to Angola in exchange for a contract to supply 10,000 barrels of crude oil per day.
    • In July 2005, PetroChina concluded an $800 million deal with the Nigerian National Petro­leum Corporation to purchase 30,000 barrels of oil per day for one year.
    • In January 2006, China National Offshore Oil Corporation (CNOOC), after failing to acquire American-owned Unocal, purchased a 45 per­cent stake in a Nigerian offshore oil and gas field for $2.27 billion and promised to invest an addi­tional $2.25 billion in field development.
    • In April 2003, approximately 175 People’s Liber­ation Army (PLA) soldiers and a 42-man medical team were deployed to the Democratic Republic of Congo on a peacekeeping mission.
    • In December 2003, 550 peacekeeping troops, equipped with nearly 200 military vehicles and water-supply trucks, were sent to Liberia.
    • China has also deployed about 4,000 PLA troops to southern Sudan to guard an oil pipeline and reaffirmed its intention to strengthen military collaboration and exchanges with Ethi­opia, Liberia, Nigeria, and Sudan.
  • The DOW 10K priced as opportunity cost

    Economists love the concept of opportunity cost because it gives you the real long-term value of an investment or purchase in relative terms – which is really the only way to calculate value. On Wednesday the DOW hit 10,000 again. The US financial press did their part to ring the bell while the banking community celebrated the boost in perceived value and the increased likelihood that the public would buy their wares.

    Fox News, like clockwork, has given former asshole president Bush credit for the recovery. (Skip to 3:00 in the video) “He took the bold moves and look where we are today..”.

    John Authers in the Financial Times is almost embarrassed on Thursday as he delivers the news of what a DOW 10K means in real, opportunity-cost terms. If you invested in the DOW in 1999:

    • Relative to emerging markets you’ve lost 80% of your money.
    • Relative to gold you’ve lost 75% of your money.
    • And even in dollar terms corrected for inflation (using the CPI) you’ve lost around 23% of your money.


  • Using and understanding the world-wide city database data

    One of the most popular pages on this blog is a post I wrote two years ago titled “World wide cities database and other free geospatial data“. There are still few people out there who realize that not only can you get a free world-wide cities database from the National Geospatial-Intelligence Agency in the US, but they have around 4 million other points around the world that even include things like undersea features, palm groves, vineyards and a lot more.

    I got an email from Jamil today asking about how to interpret the data in the NGIA’s database. You can find the data he’s referring to at the NGIA’s site. Each record has a feature classification and a feature designation code. You can see the schema (but without what the codes are) here. For some reason I couldn’t find the actual classifications and designations on the site.

    I did find them posted here. The information may be included in the NGIA’s download files – I haven’t checked.

  • SSL Timeouts and layer 3 infrastructure

    I’ve spent the last 5 days agonizing over a very hard problem on my network. Using curl, LWP::UserAgent, openssl, wget or any other SSL client, I’d see connections either timeout or hang halfway through the transfer. Everything else works fine including secure protocols like SSH and TLS. In fact inbound SSL connections work great too. It’s just when I connect to an external SSL host that it hiccups.

    If you remember your OSI model, SSL is well above layer 3 (IP addresses and routers) and layer 2 (LAN traffic routed via MAC addresses). So the last place I planned to look was the network infrastructure.

    I eliminated specific clients by trying others, and I eliminated the OS by spinning up virtual machines running other versions of Linux. I eliminated my physical hardware by reproducing it on a non-Dell server and having one of the ops guys repro it on his OS X MacBook.

    And just to prove it was the network, which is all that was left, I set up a VPN from one of my machines that tunnelled all traffic over the VPN to a machine on an external network that acted as the router, thereby encapsulating the layer 2 and 3 traffic in a layer 4 and 5 VPN. And the problem went away. So I knew it was the network.

    Tonight, just a few minutes ago, my colo provider took down my local router and I gracefully failed over to the redundant router, and lo and behold the problem has gone away.

    I still don’t know what it is, but what I do know is that a big chunk of layer 3 infrastructure has been changed and it’s fixed a layer 5 problem. What’s weird is that TCP connections (which is what SSL rides on top of) have delivery confirmation. So if the problem was packet loss, TCP would just request the packet again. So it’s something else and something that only affects SSL – and only connections bound from my servers out to the Internet.

    The reason I’m posting this is because during the hours I spent Googling this issue this week (and finding nothing) I saw a lot of complaints about SSL timeouts and no solutions. So if you’re getting timeouts like this, check your underlying infrastructure and you might just be surprised. To verify that it’s a network problem, set up a VPN using PPTP. Set up NAT on the external Linux machine that is your VPN server. Then disable the default gateway on the machine having the issue (the VPN client) and verify that all traffic is routing via your VPN. Then try to reproduce the SSL timeout; if it doesn’t occur, it’s probably your layer 2 or 3 infrastructure.