Please help!
This MIGHT even be a bug in PHP!
I'll provide version numbers and site-specific information (browser, OS,
and kernel versions) if others cannot reproduce this problem.
I'm running into some PHP behavior that I do not understand in PHP 5.1.2.
I need to parse the HTML from the following carefully constructed URI:
http://crenner.smugmug.com/homepage/...allery/1960121
The problem is that when PHP downloads the HTML using file_get_contents,
or any other method of opening a remote file in PHP that I have tried,
it gives me the wrong page!
This URI is supposed to yield the HTML from the page at
http://crenner.smugmug.com/gallery/1960121 , but with the "allthumbs"
version of the page, selectable from the dropdown box at the top of the
page.
The correct page is downloaded in IE, SeaMonkey, and in wget!
But when downloading in PHP, I get the HTML from the page at
http://crenner.smugmug.com/gallery/1960121 , but with the "smugmug
small" version of the page, selectable from the dropdown box at the top
of the page.
Please note that the templatechange.mg page is merely a server-side
script that takes the arguments passed to it (TemplateID and origin),
and redirects the browser to the correct version of the page at
"origin", based on the "TemplateID".
Here is how to reproduce the problem:
* Download the page with wget so that you have a copy of the correct
results:
--commandline start here--
wget "http://crenner.smugmug.com/homepage/templatechange.mg?TemplateID=7&origin=http://crenner.smugmug.com/gallery/1960121" -O correct.html
--commandline end here--
* Download the same page with PHP 5.1.2:
--file incorrect.php start here--
<?php
print(file_get_contents("http://crenner.smugmug.com/homepage/templatechange.mg?TemplateID=7&origin=http://crenner.smugmug.com/gallery/1960121"));
?>
--file incorrect.php end here--
--commandline start here--
php incorrect.php > incorrect.html
--commandline end here--
* You should now have two very different HTML files (correct.html and
incorrect.html), even though both were downloaded using the same URI!
* Open correct.html in a web browser. You will see a thumbnails-only
("allthumbs") version of a smugmug.com picture gallery.
* Open incorrect.html in a web browser. You will see a paginated
version of the same smugmug.com picture gallery ("smugmug small"), with
a larger image on the right.
I know that I could make a workaround by having my PHP scripts call wget
instead of using intrinsic functions to download the HTML. This is not
practical for me for a number of reasons, including code portability and
streamlining.
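For reference, a PHP-only alternative to shelling out to wget might look
like the sketch below. It rests on an unconfirmed assumption: that
templatechange.mg sets a cookie in its 302 response, and that PHP's HTTP
wrapper does not send that cookie back when it follows the redirect. The
helper names (cookie_pairs, redirect_target, fetch_with_cookies) are made
up for illustration, the Location header is assumed to be an absolute URL,
and the follow_location context option requires PHP 5.3.4 or later, so this
would not run on 5.1.2 as is.

```php
<?php
// Sketch only: assumes the difference is caused by a cookie that the
// redirecting script sets and that PHP's HTTP wrapper does not resend.
// The function names here are hypothetical helpers, not PHP built-ins.

// Collect "name=value" pairs from Set-Cookie response header lines.
function cookie_pairs($headers)
{
    $pairs = array();
    foreach ($headers as $line) {
        if (stripos($line, 'Set-Cookie:') === 0) {
            // Keep only the part before the first ";" (drops path, expires, ...).
            $value = trim(substr($line, strlen('Set-Cookie:')));
            $parts = explode(';', $value);
            $pairs[] = trim($parts[0]);
        }
    }
    return $pairs;
}

// Return the value of the last Location header, or null if there is none.
function redirect_target($headers)
{
    $target = null;
    foreach ($headers as $line) {
        if (stripos($line, 'Location:') === 0) {
            $target = trim(substr($line, strlen('Location:')));
        }
    }
    return $target;
}

// Fetch $url without automatic redirects, then request the redirect
// target again with the cookies from the first response attached.
function fetch_with_cookies($url)
{
    // follow_location => 0 stops the wrapper from following the 302 itself.
    $first = stream_context_create(array('http' => array(
        'follow_location' => 0,
    )));
    $body = file_get_contents($url, false, $first);
    // $http_response_header is filled in by the HTTP wrapper in this scope.
    $headers = isset($http_response_header) ? $http_response_header : array();

    $target = redirect_target($headers);
    if ($target === null) {
        return $body; // no redirect was issued
    }

    $options = array();
    $pairs = cookie_pairs($headers);
    if (count($pairs) > 0) {
        $options['header'] = 'Cookie: ' . implode('; ', $pairs) . "\r\n";
    }
    $second = stream_context_create(array('http' => $options));
    return file_get_contents($target, false, $second);
}
```

Calling fetch_with_cookies() with the templatechange.mg URI from the
reproduction steps above would perform both requests. Whether that yields
the "allthumbs" page depends on the cookie assumption actually holding.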
Can anyone help me with this? I know that templatechange.mg uses a
302 to redirect the browser, based on the output I get from wget. I
also know that the redirect is happening in PHP (even if it is happening
incorrectly), because I'm not getting the output of templatechange.mg
itself, but a different version of the gallery.
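One way to look at the redirect from inside PHP is to group the response
header lines by hop and compare them with what wget reports. The sketch
below is only an illustration: summarize_hops is a hypothetical helper,
not a PHP built-in, and it assumes it is given a flat list of header lines
such as the one returned by get_headers() or left in $http_response_header
after a request, where (as far as I know) the headers of every response in
the redirect chain appear one after another.

```php
<?php
// Sketch: group a flat list of response header lines into one entry per
// HTTP response, so each hop of a redirect chain can be inspected.
// summarize_hops() is a hypothetical helper, not a PHP built-in.
function summarize_hops($headers)
{
    $hops = array();
    foreach ($headers as $line) {
        if (strpos($line, 'HTTP/') === 0) {
            // A status line starts a new response.
            $hops[] = array(
                'status'   => $line,
                'location' => null,
                'cookies'  => array(),
            );
            continue;
        }
        if (count($hops) === 0) {
            continue; // ignore anything before the first status line
        }
        $last = count($hops) - 1;
        if (stripos($line, 'Location:') === 0) {
            $hops[$last]['location'] = trim(substr($line, strlen('Location:')));
        } elseif (stripos($line, 'Set-Cookie:') === 0) {
            $hops[$last]['cookies'][] = trim(substr($line, strlen('Set-Cookie:')));
        }
    }
    return $hops;
}
```

Passing the result of get_headers() for the templatechange.mg URI to
summarize_hops() and printing it with print_r() would show the status
line, redirect target, and any cookies for each hop. If the 302 hop lists
a cookie, that would be a hint (not proof) that the final page depends on
it.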
This is driving me crazy. I can find no logical reason why PHP would
yield different results for the same URI than I get from three other
clients (SeaMonkey, IE, and wget).
I have also attached the results pages and the PHP script (correct.html,
incorrect.html, and incorrect.php) in php_download_strangeness.tar.bz2
(a bzip2-compressed tar archive).
- Chuck Renner