While trying to learn the ins and outs of the PHP cURL library, I
decided to write a PHP script that posts a form on the Chicago Board
Options Exchange (CBOE) web site, which returns an ASCII text file. CBOE
appears to keep your form query data in cookies, so this seemed like
a good use of cURL.
Well, my script works just fine when run from the commandline.
When accessed from my browser, it returns an empty string where the
data should be. I have no idea why. Here is my script. Below it I
will add further comments.
========== begin script optionchain.php ==========
<?php
// cookie and error log path - SET THIS BEFORE TESTING
define('TMPFILEPATH', "/mf/home/unicorn/shell/tmp");
/*
* This script gets option chain data in a comma-delimited text file
* from the Chicago Board Options Exchange web site www.cboe.com.
*
* Example for Microsoft (MSFT) stock options:
*
* URL syntax: http://example.com/optionchain.php?ticker=MSFT
*
* Commandline: % php optionchain.php MSFT
*/
$tickersymbol = isset($_GET['ticker']) ? $_GET['ticker']
: $_SERVER['argv'][1]; // get argument from commandline if no $_GET
$ch = curl_init(); // initialize curl
curl_setopt($ch , CURLOPT_VERBOSE , true); // verbose errors
$er = fopen(TMPFILEPATH.'/curl_err.txt', 'w'); // error log file
curl_setopt($ch, CURLOPT_STDERR, $er); // log the errors
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, TMPFILEPATH.'/cboe_cookie.txt');
curl_setopt($ch, CURLOPT_REFERER,
'http://www.cboe.com/delayedQuote/QuoteTableDownload.aspx');
curl_setopt($ch, CURLOPT_URL,
'http://www.cboe.com/delayedQuote/QuoteTableDownload.aspx');
curl_setopt($ch, CURLOPT_USERAGENT,
'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)');
// "log in" to web site by accessing first page (sets cookie)
$discard = curl_exec($ch);
// now post the form. The result will come after setting more
// cookies and receiving a redirect to a different URL,
// http://www.cboe.com/delayedQuote/QuoteData.dat
// which returns data using the query data contained in a cookie.
// AFTER that we get redirected back to the original page,
// so limit redirects to 1
curl_setopt($ch, CURLOPT_MAXREDIRS, 1);
curl_setopt($ch, CURLOPT_POST, true); // enable HTTP POST
curl_setopt($ch, CURLOPT_POSTFIELDS,
'__EVENTTARGET='
.'&__EVENTARGUMENT='
.'&__VIEWSTATE='.urlencode('dDwtODQ5MjIyNjc7Oz5rmegY+4O27l7uWcpGd4iU+1RpAA==')
.'&ucHeader:ucCBOEHeaderLinks:ucCBOEHeaderSearch:searchtext='
.'&ucHeader:ucCBOEHeaderLinks:ucCBOEHeaderSearch:Button1=Search'
// urlencode the ticker so stray characters cannot break the POST body
.'&ucQuoteTableDownloadCtl:txtTicker='.urlencode($tickersymbol)
.'&ucQuoteTableDownloadCtl:cmdSubmit=Download');
// Get data (RETURNS NULL FROM BROWSER, WORKS FROM COMMANDLINE ??)
$content = curl_exec($ch);
// Close resources
curl_close ($ch);
fclose($er);
// display result
print "<pre>Data:\n{$content}\nEnd</pre>\n";
?>
========== end script optionchain.php ==========
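Since the browser run produces nothing where the data should be, one thing worth checking is whether curl_exec() is failing outright rather than returning an empty body. The following is only a rough sketch: the helper name exec_with_diagnostics is made up for illustration and is not part of the script above, and it assumes the handle already has CURLOPT_RETURNTRANSFER set.

```php
<?php
// Hypothetical helper (not in the original script): run the handle and
// collect what cURL reports about the outcome.
// Assumes CURLOPT_RETURNTRANSFER is set, so curl_exec() returns the
// response body as a string on success and false on failure.
function exec_with_diagnostics($ch): array
{
    $body = curl_exec($ch);

    return [
        'ok'        => $body !== false,
        'body'      => $body,
        'errno'     => curl_errno($ch),   // 0 when no error occurred
        'error'     => curl_error($ch),   // human-readable error text
        'http_code' => curl_getinfo($ch, CURLINFO_HTTP_CODE),
        'final_url' => curl_getinfo($ch, CURLINFO_EFFECTIVE_URL),
    ];
}
```

In the script, the second curl_exec($ch) call could be swapped for exec_with_diagnostics($ch) and the resulting array dumped with print_r(), which would show whether the commandline and browser runs differ in error code, HTTP status, or final URL.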
Now, my two files set in CURLOPT_COOKIEJAR and CURLOPT_STDERR are
world-writable, so there shouldn't be a problem there. Both files
contain information after running the script from the commandline.
The correct data is returned; commandline execution works fine.
However, after running the script from the browser, the cookiejar
is empty, and the error logfile has information suggesting that
cookies weren't dealt with in any way. I suspect this might be
why the browser is returning a null result, but why it would work
differently from the browser, I don't know. Any thoughts?
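One way to test the permissions theory is to ask PHP, under both the commandline and the web server, which user it is running as and whether it can write to the temp directory. This is a minimal sketch under assumptions: check_tmp_dir is a hypothetical name, and the user lookup only works where the posix extension is available.

```php
<?php
// Hypothetical helper (not part of the original script): report whether
// the current PHP process can create files in $dir, and who it runs as.
function check_tmp_dir(string $dir): array
{
    $report = [
        'sapi'     => PHP_SAPI,           // "cli" on the commandline
        'is_dir'   => is_dir($dir),
        'writable' => is_writable($dir),  // directory itself, not just the files in it
        'user'     => null,               // stays null without the posix extension
    ];

    if (function_exists('posix_geteuid')) {
        $info = posix_getpwuid(posix_geteuid());
        $report['user'] = $info ? $info['name'] : null;
    }

    return $report;
}

// Same directory as TMPFILEPATH in the script.
print_r(check_tmp_dir('/mf/home/unicorn/shell/tmp'));
```

Running this once from the shell and once through the browser should show whether the two environments see the directory the same way. World-writable files do not help if the web server user cannot traverse the parent directories.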
-A