Hello,
I am using PHP 5.0.4 with Apache 2 on WinXP Pro. However, this
behavior appears to be fundamental and should not be affected by
platform.
It would seem that there is some kind of bug in the process that
creates a reference to an array when that reference is being assigned
to one of the array's own elements. If the array is already
referenced, PHP just assigns the existing reference and avoids the
problem.
As I was reading the documentation about references, I noticed the
warning about "complex arrays". Here is what it says:
----- begin PHP docs -----
Complex arrays are sometimes rather copied than referenced. Thus
following example will not work as expected. Example 21-3. References
with complex arrays
<?php
$top = array(
'A' => array(),
'B' => array(
'B_b' => array(),
),
);
$top['A']['parent'] = &$top;
$top['B']['parent'] = &$top;
$top['B']['B_b']['data'] = 'test';
print_r($top['A']['parent']['B']['B_b']); // array()
?>
----- end PHP docs -----
So I decided to play around with this, and sure enough I got the same
results. But after making a few changes, I discovered that the
behavior is caused by recursively assigning the reference, and not due
to the array structure itself. Consider this example code:
<?php
$top = array( 'X' => 1, 'Y' => 1, 'Z' => 0 );
$top['X'] = &$top;
$top['Y'] = &$top;
$top['Z'] = 4;
echo "X = ", var_dump( $top['X']['Z'] );
echo "Y = ", var_dump( $top['Y']['Z'] );
echo "Z = ", var_dump( $top['Z'] );
?>
The results show:
X = int(0)
Y = int(4)
Z = int(4)
The result for X still has the original value. This would seem to
agree with the original statement in the PHP docs. But if you reverse
the order of the reference assignments like this:
$top['Y'] = &$top;
$top['X'] = &$top;
$top['Z'] = 4;
Then, the result for Y has the original value.
X = int(4)
Y = int(0)
Z = int(4)
Furthermore, if you assign some arbitrary variable a reference to $top
before doing anything else...
$z = &$top;
The output appears to be "correct" now:
X = int(4)
Y = int(4)
Z = int(4)
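As a minimal sketch, the complete script with that extra line in place
would look something like this. The name $z is arbitrary and is never
used again; the assumption is that it only needs to exist so that $top
is already a reference before the recursive assignments happen.

```php
<?php
$top = array( 'X' => 1, 'Y' => 1, 'Z' => 0 );

// Bind a throwaway variable first, so $top is already a reference.
$z = &$top;

// The recursive assignments now reuse that existing reference.
$top['X'] = &$top;
$top['Y'] = &$top;
$top['Z'] = 4;

echo "X = ", var_dump( $top['X']['Z'] );
echo "Y = ", var_dump( $top['Y']['Z'] );
echo "Z = ", var_dump( $top['Z'] );
?>
```

With $z bound first, the order of the X and Y assignments should no
longer matter.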
Looking at the var_dump of the actual resulting array struct reveals
that the element receiving the reference holds the contents of the
array as they were *at that point in time*, and follows later changes
only up until the next recursive reference assignment. At that point
$top branches off, leaving the earlier element pointing to the
previous instance. So it would appear that it is actually a fractal
array structure :) Each member that gets another ref to $top adds
another fractal generation to the var_dump.
But nonetheless, assigning a reference to a separate variable stops the
branching for all subsequent recursive reference assignments. So it
appears that assigning a reference to an element within itself like
that does not correctly create the reference, unless the reference was
already created.
Another way to demonstrate this is to compare the output of these two
scripts:
<?php
$top = array( 'W' => 1, 'X' => 1, 'Y' => 1, 'Z' => 0 );
$top['WW'] = 'wwwww'; $top['W'] = &$top; unset ( $top['WW'] );
$top['XX'] = 'xxxxx'; $top['X'] = &$top; unset ( $top['XX'] );
$top['YY'] = 'yyyyy'; $top['Y'] = &$top; unset ( $top['YY'] );
$top['Z'] = 4;
var_dump( $top );
?>
<?php
$top_t = array( 'W' => 1, 'X' => 1, 'Y' => 1, 'Z' => 0 );
$top_t['WW'] = 'wwwww'; $top_w = $top_t; $top_w['W'] = &$top_w; unset ( $top_w['WW'] );
$top_w['XX'] = 'xxxxx'; $top_x = $top_w; $top_x['X'] = &$top_x; unset ( $top_x['XX'] );
$top_x['YY'] = 'yyyyy'; $top_y = $top_x; $top_y['Y'] = &$top_y; unset ( $top_y['YY'] );
var_dump( $top_y );
?>
Basically, the output is the same, except that in the first one, the
dump of $top['Y']['Y'] is shortened to "*RECURSION*" but not in the
second.
Now compare the output of these two scripts:
<?php
for ( $i = 0; $i < 10; $i ++ ) $top[$i] = &$top;
$top['A'] = 'AAAAA'; var_dump( $top );
?>
<?php
$z = &$top;
for ( $i = 0; $i < 10; $i ++ ) $top[$i] = &$top;
$top['A'] = 'AAAAA'; var_dump( $top );
?>
The only difference is the reference assignment at the beginning of the
second one. But the results differ vastly. The first outputs over
157457 lines, while the second only outputs about 2554 lines.
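A rough sketch of one way to reproduce such counts without scrolling
through the output is to capture the dump in an output buffer and count
the newlines. The helper name dump_line_count is hypothetical (it is
not a PHP built-in), and the exact numbers will likely depend on the
PHP version, so none are claimed here.

```php
<?php
// Hypothetical helper: count the lines that var_dump() prints for a value.
// The argument is taken by reference so the array is not copied on the call.
function dump_line_count( &$value ) {
    ob_start();                                    // capture instead of printing
    var_dump( $value );
    return substr_count( ob_get_clean(), "\n" );   // one "\n" per output line
}

$z = &$top;   // remove this line to measure the first script instead
for ( $i = 0; $i < 10; $i ++ ) $top[$i] = &$top;
$top['A'] = 'AAAAA';

echo dump_line_count( $top ), " lines\n";
?>
```

Running it once with and once without the $z line gives the two figures
to compare.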
What I would like to know is whether this is actually a bug or whether
this behavior is intentional. I do not have much knowledge of how the Zend
Engine creates the references. But I find it at least somewhat
dangerous that it can have that "fractal" effect so easily. And since
it can be avoided by simply setting a reference to a separate variable,
I am convinced that something is broken (or at least VERY confusing).
The design of the language does not seem to justify this statement:
"make a direct reference assignment to the array before adding
recursive references inside it, or else the structure will branch with
each assignment"
So... any thoughts?
Thanks,
Kaptain524