PHP array() is a little scary

Push 100,000 elements onto a PHP array() where each element is a four-element associative array (a hash in Perl speak). Here’s the data being pushed:

array(
  'owner' => 100,
  'host' => 'www.example.com.co.uk',
  'path' => '/this/is/an/example/path.html',
  'hostkey' => '1111'
)

The memory grows by over 80 megabytes.

Pushing takes no more than a second or two, but shifting off the first 1000 elements takes over 17 seconds on my machine.
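A likely reason for the slow shifts is that array_shift() renumbers the remaining numeric keys on every call, so each shift does work proportional to the array's length. Here is a minimal sketch of one way around that, reading from the front with an index instead of removing elements. The function names takeWithCursor and takeWithShift are made up for this illustration, the code assumes PHP 7 or later and a list-style array keyed 0..n-1, and timings will vary with the PHP version.

```php
<?php
// Hypothetical helper: read the first $count items by advancing an index.
// The array is never modified, so nothing has to be re-indexed.
function takeWithCursor(array $items, int $count): array {
    $taken = array();
    $head = 0; // position of the next unread item
    $total = count($items);
    while ($head < $total && $head < $count) {
        $taken[] = $items[$head];
        $head++;
    }
    return $taken;
}

// The same result using array_shift(), for comparison.
function takeWithShift(array $items, int $count): array {
    $taken = array();
    for ($i = 0; $i < $count && count($items) > 0; $i++) {
        $taken[] = array_shift($items);
    }
    return $taken;
}

// Small demo set; the post uses 100,000 rows.
$rows = array();
for ($i = 0; $i < 5000; $i++) {
    $rows[] = array('owner' => $i, 'hostkey' => '1111');
}

$start = microtime(true);
$viaCursor = takeWithCursor($rows, 1000);
$cursorTime = microtime(true) - $start;

$start = microtime(true);
$viaShift = takeWithShift($rows, 1000);
$shiftTime = microtime(true) - $start;

echo "cursor: " . sprintf('%.4f', $cursorTime) . "s, array_shift: " . sprintf('%.4f', $shiftTime) . "s\n";
```

The trade-off is that the cursor version never releases memory for items it has already read, so it suits a queue that is drained once rather than one that lives for a long time.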

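Now take that same data and create a basic FIFO class that has push() and shift() methods. Use pack() and unpack() to store the data in one long string. Total time to push 100,000 elements and shift the first 1000 is around 1 second. Total memory is 7 megabytes, less than 10% of what PHP’s internal array() consumes. That figure matches the arithmetic: each record holds 57 bytes of values plus four 4-byte length prefixes, 73 bytes in all, or about 7.3 megabytes for 100,000 records.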
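To make the storage format concrete, here is a small sketch of the length-prefix idea on its own: each value is written as a 4-byte big-endian length (pack format 'N') followed by its raw bytes, and is read back by unpacking the length and slicing that many bytes. The helper names encodeField and decodeField are hypothetical and not part of the FIFO class; the code assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: prefix a value with its length as a 32-bit big-endian integer.
function encodeField(string $value): string {
    return pack('N', strlen($value)) . $value;
}

// Hypothetical helper: read one field starting at $offset.
// Returns the value and the offset of the field that follows it.
function decodeField(string $buffer, int $offset): array {
    $header = unpack('Nlen', substr($buffer, $offset, 4));
    $len = $header['len'];
    return array(substr($buffer, $offset + 4, $len), $offset + 4 + $len);
}

// Two fields stored back to back in one string.
$buffer = encodeField('www.example.com.co.uk') . encodeField('1111');

list($host, $next) = decodeField($buffer, 0);
list($hostkey, $next) = decodeField($buffer, $next);

echo $host . " " . $hostkey . "\n";
```

Because every field carries its own length, values can contain any bytes, including whatever character you might otherwise have used as a delimiter.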

PHP’s SplFixedArray class, which is advertised mainly as having a speed advantage, doesn’t fare much better. With a fixed array of 100,000 elements, loading and unloading the same associative array() grows memory by 75 megs, though it is very fast at half a second. Just for fun I pushed 100,000 elements onto an SplFixedArray, each one simply the values of the test associative array concatenated into a string, and it still weighs in at 13 megabytes.
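The concatenated-string experiment might look something like the sketch below. The post doesn’t say how the values were joined, so the tab delimiter and the names FIELD_SEP, joinRow, splitRow and fillFixed are assumptions made for illustration. It assumes PHP 7 or later and values that never contain a tab.

```php
<?php
// Assumed delimiter; the original experiment may have joined the values differently.
const FIELD_SEP = "\t";

// Hypothetical helper: flatten one row's values into a single string.
function joinRow(array $row): string {
    return implode(FIELD_SEP, $row);
}

// Hypothetical helper: rebuild the associative row from its flattened form.
// All values come back as strings.
function splitRow(string $joined, array $keys): array {
    return array_combine($keys, explode(FIELD_SEP, $joined));
}

// Fill a fixed-size array with $count flattened copies of $row.
function fillFixed(int $count, array $row): SplFixedArray {
    $fixed = new SplFixedArray($count);
    for ($i = 0; $i < $count; $i++) {
        $fixed[$i] = joinRow($row);
    }
    return $fixed;
}

$row = array(
    'owner' => 100,
    'host' => 'www.example.com.co.uk',
    'path' => '/this/is/an/example/path.html',
    'hostkey' => '1111'
);

$before = memory_get_usage();
$fixed = fillFixed(100000, $row);
echo "Grew: " . (memory_get_usage() - $before) . "\n";

$first = splitRow($fixed[0], array_keys($row));
echo $first['host'] . "\n";
```

Each slot still holds a separate PHP string with its own per-value overhead, which is one plausible reason this lands well above the packed single-string figure.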

Here’s the FIFO class:

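class wfArray {
        private $data = "";     //all records packed back to back in one string
        private $shiftPtr = 0;  //byte offset of the next unread record
        private $keys;          //field names, in the order they are stored
        public function __construct($keys){
                $this->keys = $keys;
        }
        public function push($val){ //associative array with keys that match those given to constructor
                foreach($this->keys as $key){
                        $field = (string)$val[$key];
                        //each field is a 4-byte big-endian length followed by the raw bytes
                        $this->data .= pack('N', strlen($field)) . $field;
                }
        }
        public function shift(){
                $arr = array();
                //the buffer is never trimmed, so compare the read pointer against its length
                if($this->shiftPtr >= strlen($this->data)){ return null; }
                foreach($this->keys as $key){
                        $len = unpack('N', substr($this->data, $this->shiftPtr, 4));
                        $len = $len[1];
                        $arr[$key] = substr($this->data, $this->shiftPtr + 4, $len);
                        $this->shiftPtr += 4 + $len;
                }
                return $arr;
        }
}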

Here’s the test script using the FIFO class with the array() tests commented out.

require_once('wfArray.php');
error_reporting(E_ALL);
$p1 = memory_get_peak_usage();
$stime = microtime(true);
//$arr = array();
$arr = new wfArray(array('owner', 'host', 'path', 'hostkey'));
for($i = 0; $i < 100000; $i++){
        //array_push($arr, array(
        $arr->push(array(
                'owner' => 100,
                'host' => 'www.example.com.co.uk',
                'path' => '/this/is/an/example/path.html',
                'hostkey' => '1111'
                ));
        if($i % 1000 == 0){ echo $i . "\n"; }
}
$i = 0;
while($elem = $arr->shift()){
//while($elem = array_shift($arr)){
        $i++;
        if($i > 1000){ break; }
        if(! ($elem['owner'] == 100 && $elem['host'] == 'www.example.com.co.uk' && $elem['path'] == '/this/is/an/example/path.html' && $elem['hostkey'] == '1111')){
                die("Problem");
        }
}
echo "\nTotal time: " . sprintf('%.3f', microtime(true) - $stime) . "\n";
$p2 = memory_get_peak_usage();
echo "Grew: " . ($p2 - $p1) . "\n";

Comments

8 responses to “PHP array() is a little scary”

  1. tszming

    Have you tried SplQueue?

    See this: https://bugs.php.net/bug.php?id=18829

  2. Hmm

    It’s “fare much better”.

    This class has a very limited set of use cases… yeah, if you take most of the features out of a structure with a lot of features, it will probably be leaner! Can’t think of many cases where I could actually use it, and we use arrays quite a lot.

  3. aleks

    A bit scary indeed. But I don’t see anyone putting 100,000 items in an array really either.

    1. PeterisP

      What structure should then be used in PHP for medium-size (100k-1m) datasets? This isn’t large data; processing a list of 100,000 numbers was a reasonably common task even in 1990.

  4. moe_zhank

    Yes, it’s faster, but you lose all of the PHP array advantages because you are actually storing the data as a string.

    What about this scenario:
    shift $array, then get the value of $array[99999]?

    I think the result would be the same as a normal PHP array operation.

    I never use array_shift, because it can be replaced by index manipulation when accessing array elements,
    which is faster and consumes less memory.

  5. Ray

    This is great, but what if we want to retrieve data when using this pack technique?
