PHP array() is a little scary

Push 100,000 elements onto a PHP array(), where each element is a four-element associative array (a hash in Perl speak). Here’s the data being pushed:

array(
  'owner' => 100,
  'host' => 'www.example.com.co.uk',
  'path' => '/this/is/an/example/path.html',
  'hostkey' => '1111'
)

The memory grows by over 80 megabytes.

Pushing takes no more than a second or two, but shifting off the first 1000 elements takes over 17 seconds on my machine.
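
A rough sketch of where the shifting cost may come from: the PHP manual notes that array_shift() renumbers the remaining integer keys, so the cost of each call can scale with the length of the array. The snippet below compares that with simply advancing a read index. The sizes ($count, $shifts) and variable names are my own choices for a quick run, not the figures from the test above, and timings will vary with PHP version and machine.

```php
<?php
// Sketch only: sizes are reduced so it runs quickly.
$record = array(
    'owner' => 100,
    'host' => 'www.example.com.co.uk',
    'path' => '/this/is/an/example/path.html',
    'hostkey' => '1111',
);

$count = 20000;  // hypothetical; the post pushes 100,000
$shifts = 1000;

$queue = array();
for ($i = 0; $i < $count; $i++) {
    $queue[] = $record;
}

// Variant A: array_shift() removes the first element and renumbers the rest.
$a = $queue;
$start = microtime(true);
for ($i = 0; $i < $shifts; $i++) {
    $first = array_shift($a);
}
$shiftTime = microtime(true) - $start;

// Variant B: leave the keys alone and advance a read index instead.
$b = $queue;
$head = 0;
$start = microtime(true);
for ($i = 0; $i < $shifts; $i++) {
    $first = $b[$head];
    unset($b[$head]);  // release the slot; the other keys are not renumbered
    $head++;
}
$indexTime = microtime(true) - $start;

printf("array_shift: %.4f s, read index: %.4f s\n", $shiftTime, $indexTime);
```

Variant B keeps the original keys, so the element that was at index 99999 is still at index 99999 after shifting, which may or may not be what the calling code expects.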

Now take that same data and create a basic FIFO class that has push() and shift() methods. Use pack() and unpack() to store the data in one long string. Total time to push 100,000 elements and shift the first 1000 is around 1 second. Total memory is 7 megabytes, which is less than 10% of what PHP’s built-in array() consumes.

PHP’s SplFixedArray class, which is advertised as mainly having a speed advantage, doesn’t fare much better. With a fixed array of 100,000 elements, loading and unloading the same associative array(), memory grows by 75 megs, but it is very fast at half a second. Just for fun I pushed 100,000 elements onto an SplFixedArray, each of which is simply the values of the test associative array concatenated into a string, and it still weighs in at 13 megabytes.
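
The concatenated-string variant might look roughly like the sketch below. It assumes the values are joined with implode() and no delimiter, which the description above does not specify, and it measures with memory_get_usage() rather than the peak figure used in the test script further down. The number it prints depends on the PHP version and build.

```php
<?php
$record = array(
    'owner' => 100,
    'host' => 'www.example.com.co.uk',
    'path' => '/this/is/an/example/path.html',
    'hostkey' => '1111',
);
$count = 100000;

$before = memory_get_usage();
$fixed = new SplFixedArray($count);
for ($i = 0; $i < $count; $i++) {
    // Assumption: the four values are joined with no separator.
    $fixed[$i] = implode('', $record);
}
$grew = memory_get_usage() - $before;

echo "Slots: " . $fixed->getSize() . "\n";
echo "Grew: " . $grew . " bytes\n";
```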

Here’s the FIFO class:

class wfArray {
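        private $data = "";
        private $shiftPtr = 0;
        private $keys = array(); //declared explicitly; dynamic properties are deprecated in newer PHP versions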
        public function __construct($keys){
                $this->keys = $keys;
        }
        public function push($val){ //associative array with keys that match those given to constructor
                foreach($this->keys as $key){
                        $this->data .= pack('N', strlen($val[$key])) . $val[$key];
                }
        }
        public function shift(){
                $arr = array();
                if($this->shiftPtr >= strlen($this->data)){ return null; } //nothing left to read
                foreach($this->keys as $key){
                        $len = unpack('N', substr($this->data, $this->shiftPtr, 4));
                        $len = $len[1];
                        $arr[$key] = substr($this->data, $this->shiftPtr + 4, $len);
                        $this->shiftPtr += 4 + $len;
                }
                return $arr;
        }
}
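
One thing to note about the class above: shift() only advances a pointer, so bytes that have already been read stay in the string. A possible remedy, sketched here with a hypothetical compactBuffer() helper that is not part of wfArray, is to drop the consumed prefix from time to time. The buffer is built with the same length-prefixed layout that push() uses.

```php
<?php
// Hypothetical helper, not part of wfArray: drop bytes that were already read.
function compactBuffer(string &$data, int &$readPtr): void {
    if ($readPtr > 0) {
        $data = (string) substr($data, $readPtr);
        $readPtr = 0;
    }
}

// Build a small buffer: a 4-byte big-endian length followed by each value.
$data = '';
foreach (array('first', 'second', 'third') as $value) {
    $data .= pack('N', strlen($value)) . $value;
}

// Read one field the way shift() does, advancing the pointer.
$readPtr = 0;
$len = unpack('N', substr($data, $readPtr, 4))[1];
$field = substr($data, $readPtr + 4, $len);
$readPtr += 4 + $len;

$sizeBefore = strlen($data);
compactBuffer($data, $readPtr);
echo $field . ": " . $sizeBefore . " -> " . strlen($data) . " bytes\n";
```

Compacting copies the remaining string, so it probably only makes sense occasionally, for example once the read pointer has passed half of the buffer.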

Here’s the test script using the FIFO class with the array() tests commented out.

require_once('wfArray.php');
error_reporting(E_ALL);
$p1 = memory_get_peak_usage();
$stime = microtime(true);
//$arr = array();
$arr = new wfArray(array('owner', 'host', 'path', 'hostkey'));
for($i = 0; $i < 100000; $i++){
        //array_push($arr, array(
        $arr->push(array(
                'owner' => 100,
                'host' => 'www.example.com.co.uk',
                'path' => '/this/is/an/example/path.html',
                'hostkey' => '1111'
                ));
        if($i % 1000 == 0){ echo $i . "\n"; }
}
$i = 0;
while($elem = $arr->shift()){
//while($elem = array_shift($arr)){
        $i++;
        if(! ($elem['owner'] == 100 && $elem['host'] == 'www.example.com.co.uk' && $elem['path'] == '/this/is/an/example/path.html' && $elem['hostkey'] == '1111')){
                die("Problem");
        }
        if($i >= 1000){ break; } //stop after checking the first 1000 elements
}
echo "\nTotal time: " . sprintf('%.3f', microtime(true) - $stime) . "\n";
$p2 = memory_get_peak_usage();
echo "Grew: " . ($p2 - $p1) . "\n";

16 thoughts on “PHP array() is a little scary”

  1. Yes, it is faster, but you would lose all the advantages of PHP arrays because you are actually storing the data as a string.

    What about this scenario:
    shift $array, then get the value of $array[99999]?

    I think the result would be the same as with a normal PHP array operation.

    I never use array_shift, because it can be avoided with index manipulation when accessing array elements,
    which is faster and consumes less memory.

  2. A bit scary indeed. But I don’t see anyone putting 100,000 items in an array really either.

    • What structure should then be used in PHP for medium-size (100k-1m) datasets? This isn’t large data; processing a list of 100.000 numbers was a reasonably common task even in 1990.

  3. It’s “fare much better”.

    This class has a very limited set of use cases… yeah, if you take most of the features out of a structure with a lot of features, it will probably be leaner! Can’t think of many cases where I could actually use it, and we use arrays quite a lot.
