I put some cross-cluster traffic throttling in place yesterday using memcached – which rocks, btw. In the last 12 hours I’ve blocked three sources. Two were rogue crawlers from broadband ISPs. The other was MSN’s Live Search crawler, which was requesting more than 1 page per second, sustained over 30 seconds. If it were Google I’d probably care, but Google’s crawlers are polite, and unlike Google, Live Search only sends me about 2% of my total search traffic.
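For anyone curious what this kind of throttle can look like, here is a rough sketch in Python. It is not my production code: the key names, the one-hour block time, and the `FakeMemcache` class (an in-memory stand-in that imitates memcached's `add`/`incr`/`get` with expiry, so the example runs without a server) are all assumptions made for illustration. The idea is a fixed-window counter per source IP, and because every web server talks to the same cache, the count covers the whole cluster.

```python
import time


class FakeMemcache:
    """Hypothetical in-memory stand-in for a memcached client (add/incr/get)."""

    def __init__(self, clock=time.time):
        self._data = {}      # key -> [value, expiry timestamp]
        self._clock = clock  # injectable so the window can be simulated

    def _live(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] <= self._clock():
            del self._data[key]  # expired, behave as if it was never set
            return None
        return entry

    def add(self, key, value, expire):
        # Like memcached "add": only stores if the key does not exist yet.
        if self._live(key) is not None:
            return False
        self._data[key] = [value, self._clock() + expire]
        return True

    def incr(self, key, delta=1):
        # Like memcached "incr": fails on a missing key, keeps the old expiry.
        entry = self._live(key)
        if entry is None:
            return None
        entry[0] += delta
        return entry[0]

    def get(self, key):
        entry = self._live(key)
        return None if entry is None else entry[0]


WINDOW_SECONDS = 30   # sustained over 30 seconds...
MAX_REQUESTS = 30     # ...at more than 1 page per second
BLOCK_SECONDS = 3600  # assumed block duration, purely illustrative


def allow_request(cache, ip):
    """Return True if the request should be served, False if throttled."""
    if cache.get("block:" + ip) is not None:
        return False
    key = "hits:" + ip
    if cache.add(key, 1, WINDOW_SECONDS):
        return True  # first hit in a fresh window
    count = cache.incr(key)
    if count is None:
        # Counter expired between add and incr; start a new window.
        cache.add(key, 1, WINDOW_SECONDS)
        return True
    if count > MAX_REQUESTS:
        cache.add("block:" + ip, 1, BLOCK_SECONDS)
        return False
    return True
```

With a real memcached client you would swap `FakeMemcache` for the client object and call `allow_request` at the top of each page request, returning an error page when it comes back False.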