
parsing website with a "searching...." page

P: n/a
I'm trying to parse a table on a webpage to pull down some data I
need. The page is generated from information entered into a form. When
you submit the form, the site displays a "Searching..." page, then
refreshes and displays the table I want. I have code that fetches the
page using cURL, but the data it returns is the "Searching..." page
and not the table that I want. Below is the code I have so far.
Thanks in advance for any help.

<?php

$url = "http://www.website.com";

// Form fields to submit.
$post_data = array();
$post_data['postvar1'] = "val1";
$post_data['postvar2'] = "val2";

// URL-encode the fields into a "k=v&k=v" request body.
$post_body = http_build_query($post_data);

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_body);
curl_setopt($ch, CURLOPT_HEADER, 0);          // keep response headers out of $result
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the page instead of printing it
$result = curl_exec($ch);
curl_close($ch);

// Split the response into lines.
$result = explode("\n", $result);

?>

Apr 6 '07 #1
3 Replies


P: n/a
Hi,

The page is probably using JavaScript to produce that effect.
Inspect the contents of $result to see whether this is the case.
Look for divs that start out hidden and are made visible when the page
loads (by a script at the end of the page, or by an onLoad event).
The data you want might very well already be inside $result.

If not, tell us more about WHAT $result contained.
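
A rough way to run that inspection, assuming you still have the raw HTML string from curl_exec() (before the explode() call). The describe_page() helper and its array keys are made up for this sketch; DOMDocument is PHP's bundled DOM extension. It only counts divs hidden through an inline style attribute, so anything hidden by a stylesheet or by a script would be missed.

```php
<?php
// Hypothetical helper (not from the thread): summarise what a fetched page
// contains, to see which mechanism produces the "Searching..." step.
function describe_page(string $html): array
{
    $summary = [
        'meta_refresh' => [],  // content="..." of any <meta http-equiv="refresh">
        'frames'       => [],  // src of every <frame> and <iframe>
        'hidden_divs'  => 0,   // divs hidden via an inline style attribute
        'scripts'      => 0,   // number of <script> elements
    ];
    if (trim($html) === '') {
        return $summary;       // nothing to parse
    }

    $doc = new DOMDocument();
    libxml_use_internal_errors(true);  // real-world HTML is rarely valid
    $doc->loadHTML($html);
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        if (strtolower($meta->getAttribute('http-equiv')) === 'refresh') {
            $summary['meta_refresh'][] = $meta->getAttribute('content');
        }
    }
    foreach (['frame', 'iframe'] as $tag) {
        foreach ($doc->getElementsByTagName($tag) as $frame) {
            $summary['frames'][] = $frame->getAttribute('src');
        }
    }
    foreach ($doc->getElementsByTagName('div') as $div) {
        $style = strtolower(str_replace(' ', '', $div->getAttribute('style')));
        if (strpos($style, 'display:none') !== false
            || strpos($style, 'visibility:hidden') !== false) {
            $summary['hidden_divs']++;
        }
    }
    $summary['scripts'] = $doc->getElementsByTagName('script')->length;

    return $summary;
}
```

If 'meta_refresh' or 'frames' comes back non-empty, the table most likely lives in a second document that has to be requested separately; a non-zero 'hidden_divs' suggests the data may already be in the response.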

Regards,
Erwin Moller
Apr 6 '07 #2

P: n/a
Here's basically what was returned by $result:

<!-- Vignette V6 Fri Apr 06 09:56:47 2007 -->
<script language="javascript">

function changePassword() {
    window.open('https://www.secure-hallmark.xxxxx.com/ldap/em_change_password.asp',
        'ChangePassword', 'height=250,width=400,screenX=100,screenY=100');
}

function forgotPassword() {
    window.open('/auth/forgotpasswordform/0,2290,,00.html',
        'ForgotPassword', 'height=250,width=400,screenX=100,screenY=100');
}
</script><html><head><title>xxxxxxx</title>
<!--INCLUDE VIRTUAL="/ast/functions.asp" -->

<script language="Javascript">
//Code only to exsist if ONTCred Cookie is missing
//Check every 15 sec for ONTCred and reload page if it exsists. Only once though
var intRunOnce = 0;
function fctCheckONTcred() {
    var allcookies = "";
    allcookies = document.cookie;
    var vgnONTpos = allcookies.indexOf("ONTCred=");
    if (vgnONTpos != -1) {
        if (intRunOnce == 0) {
            topFrame.location.reload();
            intRunOnce = 1;
        }
    } else {
        setTimeout('fctCheckONTcred()', 15000);
    }
}
</script>

</head>
<frameset rows="125,*" frameborder="NO" border="0" noresize framespacing="0">

<frame name="topFrame" target="_top" onload="setTimeout('fctCheckONTcred()', 15000);" scrolling="NO" noresize src="xxxxx.html" marginheight="0" marginwidth="0">

<frame name="mainFrame" src="https://xxxxxxx.com/webapp/wcs/stores/servlet/RemoteAdvancedSearchView?langId=-1&storeId=500201&catalogId=500201&manufacturer=bav99&qasFilter=FALSE" noresize marginheight="0" marginwidth="0">
</frameset>
</html>
<!--WEBSIDESTORY CODE HBX1.4 (SSL)-->
<!-- EM SSL - 25/Apr/2006 14:40 -->
<!--COPYRIGHT 1997-2005 WEBSIDESTORY,INC. ALL RIGHTS RESERVED.
U.S.PATENT No.6,393,479B1 & 6,766,370. INFO:http://websidestory.com/
privacy-->
<script language="javascript">

function hbxStrip(a)
{
a = a.split("|").join("");
a = a.split("&").join("");
a = a.split("'").join("");
a = a.split("#").join("");
a = a.split("$").join("");
a = a.split("%").join("");
a = a.split("^").join("");
a = a.split("*").join("");
a = a.split(":").join("");
a = a.split("!").join("");
a = a.split("<").join("");
a = a.split(">").join("");
a = a.split("~").join("");
a = a.split(";").join("");
a = a.split(" ").join("+");
return a;
}

var _hbEC=0,_hbE=new Array;function _hbEvent(a,b){b=_hbE[_hbEC++]=new
Object();b._N=a;b._C=0;return b;}
var hbx=_hbEvent("pv");hbx.vpc="HBX0140s";hbx.gn="ehg-memecinc.hitbox.com";

//BEGIN EDITABLE SECTION
//CONFIGURATION VARIABLES
hbx.acct="DM551006HAMC;DM560223K8CS";//EM/GLOBAL
hbx.pn=hbxStrip("xxxxxxxxx");//PAGE NAME(S)
hbx.mlc="CONTENT+CATEGORY";//MULTI-LEVEL CONTENT CATEGORY
hbx.pndef="title";//DEFAULT PAGE NAME
hbx.ctdef="full";//DEFAULT CONTENT CATEGORY

//OPTIONAL PAGE VARIABLES
//ACTION SETTINGS
hbx.fv="";//FORM VALIDATION MINIMUM ELEMENTS OR SUBMIT FUNCTION NAME
hbx.lt="auto";//LINK TRACKING
hbx.dlf="n";//DOWNLOAD FILTER
hbx.dft="n";//DOWNLOAD FILE NAMING
hbx.elf="n";//EXIT LINK FILTER

//SEGMENTS AND FUNNELS
hbx.seg="";//VISITOR SEGMENTATION
hbx.fnl="";//FUNNELS

//CAMPAIGNS
hbx.cmp="";//CAMPAIGN ID
hbx.cmpn="";//CAMPAIGN ID IN QUERY
hbx.dcmp="";//DYNAMIC CAMPAIGN ID
hbx.dcmpn="";//DYNAMIC CAMPAIGN ID IN QUERY
hbx.dcmpe="";//DYNAMIC CAMPAIGN EXPIRATION
hbx.dcmpre="";//DYNAMIC CAMPAIGN RESPONSE EXPIRATION
hbx.hra="";//RESPONSE ATTRIBUTE
hbx.hqsr="";//RESPONSE ATTRIBUTE IN REFERRAL QUERY
hbx.hqsp="";//RESPONSE ATTRIBUTE IN QUERY
hbx.hlt="";//LEAD TRACKING
hbx.hla="";//LEAD ATTRIBUTE
hbx.gp="";//CAMPAIGN GOAL
hbx.gpn="";//CAMPAIGN GOAL IN QUERY
hbx.hcn="";//CONVERSION ATTRIBUTE
hbx.hcv="";//CONVERSION VALUE
hbx.cp="null";//LEGACY CAMPAIGN
hbx.cpd="";//CAMPAIGN DOMAIN

//CUSTOM VARIABLES
hbx.ci="";//CUSTOMER ID
hbx.hc1="";//CUSTOM 1
hbx.hc2="";//CUSTOM 2
hbx.hc3="";//CUSTOM 3
hbx.hc4="";//CUSTOM 4
hbx.hrf="";//CUSTOM REFERRER
hbx.pec="";//ERROR CODES

//INSERT CUSTOM EVENTS

//END EDITABLE SECTION
</script>

<script language="javascript1.1">
//hbx.js,HBX1.5,COPYRIGHT 1997-2005 WEBSIDESTORY,INC. ALL RIGHTS RESERVED. U.S.PATENT No.6,393,479B1 & 6,766,370. INFO:http://websidestory.com/privacy
var _vjs="HBX0150.01s";
var
_dl=".exe,.zip,.wav,.wmv,.mp3,.mov,.mpg,.avi,.doc,.pdf,.xls,.ppt,.gz";
function _NA(a){return new Array(a?a:0)}function _NO(){return new
Object()}
var
_mn=_hbq="",_hbA=_NA(),_hud="undefined",_lv=_NO(), _ec=_if=_ll=_hec=_hfs=_hfc=_fvf=_ic=_pC=_fc=_pv=0, _hbi=new
Image(),_hbin=_NA(),_pA=_NA();
_lv.id=_lv.pos=_lv.l="";_hbE=_D("hbE")?_hbE:"";_hbEC=_D("hbEC")?_hbEC:
0;var _ex="expires=Wed, 1 Jan 2020 00:00:00
GMT",_lvm=150,_lidt="lid",_lpost="lpos";
function _D(v){return(typeof eval("window._"+v)!=_hud)?
eval("window._"+v):""}function _DD(v){return(typeof v!=_hud)?1:0}
function _A(v,c){return escape((_D("lc")=="y"&&_DD(c))?_TL(v):v)}
function _B(){return 0}function _GP(){return
location.protocol=="https:"?"https://":"http://"}
function _IC(a,b,c){return a.charAt(b)==c?1:0}function _II(a,b,c)
{return a.indexOf(b,c?c:0)}function _IL(a){return a!=_hud?a.length:0}
function _IF(a,b,c){return a.lastIndexOf(b,c?c:_IL(a))}function
_IP(a,b){return a.split(b)}
function _IS(a,b,c){return b>_IL(a)?"":a.substring(b,c!=null?
c:_IL(a))}
function _RP(a,b,c,d){d=_II(a,b);if(d>-1){a=_RP(_IS(a,0,d)+","+_IS(a,d
+_IL(b),_IL(a)),b,c)}return a}
function _TL(a){return a.toLowerCase()}function _TS(a){return
a.toString()}function _TV(){_hbSend()}function _SV(a,b,c)
{_hbSet(a,b,c)}
function _VS(a,b){eval("_"+a+"='"+b+"'")}
function _VC(a,b,c,d){b=_IP(a,",");for(c=0;c<_IL(b);c++)
{d=_IP(b[c],"|");_VS(d[0],(_D(d[0]))?_D(d[0]):d[1]?d[1]:"")}}
function _VL(a,b){for(a=0;a<_hbEC;a++){_pv=_hbE[a];if(_pv._N=="pv")
{for(b in _pv){if(_EE(b)){_VS(b,_pv[b])}}}}
_VC("pn|PUT+PAGE+NAME+HERE,mlc|CONTENT+CATEGORY,elf|n,dlf|n,dft|
n,pndef|title,ctdef|full,cp|null,hcn|")}_VL();
function _ER(a,b,c){_hbi.src=_GP()+_gn+"/HG?hc="+_mn+"&hb="+_A(_acct)
+"&hec=1&vjs="+_vjs+"&vpc=ERR&ec=1&err="+((typeof a=="string")?_A(a
+"-"+c):"Unknown")}
function _EE(a){return(a!="_N"&&a!="_C")?1:0}_EV(window,"error",_ER);
function _hbSend(c,a,i){a="";_hec++;for(i in _hbA)if(typeof _hbA[i]!
="function")a+="&"+i+"="+_hbA[i];_Q(_hbq+"&hec="+_hec+a
+_hbSendEV());_hbA=_NA()}
function _hbSet(a,b,c,d,e){d=_II(_hbq,"&"+a+"=");if(d>-1)
{e=_II(_hbq,"&",d+1);e=e>d?e:_IL(_hbq);if(a=="n"|| a=="vcon")
{_hbq=_IS(_hbq,0,d)+"&"+a+"="+b+
_IS(_hbq,e);_hec=-1;if(a=="n"){_pn=b}else{_mlc=b}}else{_hbq=_IS(_hbq ,
0,d)+_IS(_hbq,e)}}if((a!="n")&&(a!="vcon"))_hbA[a]=(c==0)?b:_A(b)}
function _hbRedirect(a,b,c,d,e,f,g)
{_SV("n",a);_SV("vcon",b);if(_DD(d)&&_IL(d)>0){d=_IC(d,0,"&")?_IS(d,
1,_IL(d)):d;e=_IP(d,"&");for(f=0;f<_IL(e);
f++){g=_IP(e[f],"=");_SV(g[0],g[1])}}_TV();if(c!=""){_SV("hec",
0);setTimeout("location.href='"+c+"'",500)}}
function _hbSendEV(a,b,c,d,e,f,x,i){a='',c='',e=_IL(_hbE);for(b=0;b<e;b
++){c=_hbE[b];for(var d in c){if(_EE(d)&&c[d].match){x=c[d].match(/\
[\]/g);
if(x!=null&&_IL(x)>c._C)c._C=_IL(x)}}for(d in c){if(_EE(d)&&c[d].match)
{x=c[d].match(/\[\]/g);x=(x==null)?0:_IL(x);for(i=x;i<c._C;i++)c[d]
+="[]"}}}
for(b=0;b<e;b++){c=_hbE[b];for(f=b+1;f<e;f++){if(_hbE[f]!
=null&&c._N==_hbE[f]._N){for(d in c){if(_EE(d)&&_hbE[f]!=null)c[d]
+="[]"+_hbE[f][d];
_hbE[f][d]=""}}}for(d in c){if(_EE(d)&&c._N!=""&&c._N!="pv"){a
+="&"+c._N+"."+d+"="+_RP(_A(c[d]),"%5B
%5D",",")}}}_hbE=_NA();_hbEC=0;return a}
function _hbM(a,b,c,d)
{_SV('n',a);_SV('vcon',b);if(_IL(c)>0)_SV(c,d);_TV ()}
function _hbPageView(p,m){_hec=-1;_hbM(p,m,"")}function _hbExitLink(n)
{_hbM(_pn,_mlc,"el",n)}function _hbDownload(n){_hbM(_pn,_mlc,"fn",n)}
function _hbVisitorSeg(n,p,m){_SV("n",p);_SV("vcon",m);_SV( "seg",n,
1);_TV()}function _hbCampaign(n,p,m){_hbM(p,m,"cmp",n)}
function _hbFunnel(n,p,m){_hbM(p,m,"fnl",n)}function _hbGoalPage(n,p,m)
{_hbM(p,m,"gp",n)}
function _hbLink(a,b,c){_SV("lid",a);if(_DD(b))_SV("lpos",b );_TV()}
function _LE(a,b,c,d,e,f,g,h,i,j,k,l){b="([0-9A-Za-z\\-]*\
\.)",c=location.hostname,d=a.href,h='',i='';eval(" __f=/"+b+"*"+b
+"/");if(_DD(__f)){__f.exec(c);
j=(_DD(_elf))?_elf:"";if(j!="n"){if(_II(j,"!")>-1){h=_IS(j,
0,_II(j,"!"));i=_IS(j,_II(j,"!")
+1,_IL(j))}else{h=j}}k=0;if(_DD(_elf)&&_elf!="n"){
if(_IL(i)){l=_IP(i,",");for(g=0;g<_IL(l);g+
+)if(_II(d,l[g])>-1)return}if(_IL(h)){l=_IP(h,",");for(g=0;g<_IL(h); g+
+)if(_II(d,l[g])>-1)k=1}}
if(_II(a.hostname,RegExp.$2)<0||k){ e=_IL(d)-1;return _IC(d,e,'/')?
_IS(d,0,e):d}}}
function _LD(a,b,c,d,e,f){b=a.pathname,d='',e='';b=_IS(b,_IF(b,"/")
+1,_IL(b));c=(_DD(_dlf))?_dlf:"";if(c!="n"){if(_II (c,"!")>-1){d=","+
_IS(c,0,_II(c,"!"));e=","+_IS(c,_II(c,"!")
+1,_IL(c))}else{d=","+c}}f=_II(b,"?");b=(f>-1)?_IS(b,
0,f):b;if(_IF(b,".")>-1){f=_IS(b,_IF(b,"."),_IL(b));
if(_II(_dl+d,f)>-1&&_II(e,f)<0){var dl=b;if(_DD(_dft))
{if(_dft=="y"&&a.name){dl=a.name}else if(_dft=="full")
{dl=a.pathname}}return dl}}}
function _LP(a,b,c){for(c=0;c<_IL(a);c++){if(b==0)
{if(_IL(_lv.l)<_lvm)_LV(a[c]);else break}else
if(b==1)_EV(a[c],'mousedown',_LT)}}
function _LV(a,b,c){b=_LN(a);c=b[0]+b[1];if(_IL(c)){_lv.id+=_A(b[0])
+",";_lv.pos+=_A(b[1])+",";_lv.l+=c}}
function _LN(a,b,c,d){b=a.href;b+=a.name?
a.name:"";c=_LVP(b,_lidt);d=_LVP(b,_lpost);return[c,d]}
function _LT(e){if((e.which&&e.which==1)||(e.button&&e.button==1)){var
a=document.all?window.event.srcElement:this;for(var i=0;i<4;i++)
{if(a.tagName&&
_TL(a.tagName)!="a"&&_TL(a.tagName)!="area"){a=a.parentElement}}var
b=_LN(a),c='',d='';a.lid=b[0];a.lpos=b[1];if(_D("lt")&&_lt!="manual")
{if((a.tagName&&
_TL(a.tagName)=="area")){if(!_IL(a.lid)){if(a.parentNode)
{if(a.parentNode.name)a.lid=a.parentNode.name;else
a.lid=a.parentNode.id}}if(!_IL(a.lpos))
a.lpos=a.coords}else{if(_IL(a.lid)<1)a.lid=_LS(a.text?
a.text:a.innerText?a.innerText:"");if(!_IL(a.lid)||
_II(_TL(a.lid),"<img")>-1)a.lid=_LI(a)}}
if(!_IL(a.lpos)&&_D("lt")=="auto_pos"&&a.tagName&& _TL(a.tagName)!
="area"){c=document.links;for(d=0;d<_IL(c);d++){if (a==c[d]){a.lpos=d
+1;break}}}
var _f=0,j='',k='',l=(a.protocol)?_TL(a.protocol):"";
if(l&&l!="mailto:"&&l!="javascript:")
{j=_LE(a),k=_LD(a);if(_DD(k))a.fn=k;else if(_DD(j))a.el=j}
if(_D("lt")&&_IC(_lt,0,"n")!=1&&_DD(a.lid)&&_IL(a. lid)>0)
{_SV("lid",a.lid);if(_DD(a.lpos))_SV("lpos",a.lpos );_f=1}if(_DD(a.fn))
{_SV("fn",a.fn);_f=2}
else if(_DD(a.el)){_SV("el",a.el);_f=1}if(_f>0){_TV()}} }
function _LVP(a,b,c,d,e){c=_II(a,"&"+b+"=");c=c<0?_II(a,"?" +b
+"="):c;if(c>-1){d=_II(a,'&',c+_IL(b)+2);e=_IS(a,c+_IL(b)+2,d>-1?
d:_IL(a));
if(!_ec){if(!(_II(e,"//")==0))return e}else return e}return ""}
function _LI(a){var
b=""+a.innerHTML,bu=_TL(b),i=_II(bu,"<img");if(bu&&i>-1){eval("__f=/
src\s*=\s*['\"]?([^'\" ]+)['\"]?/i");__f.exec(b);
if(RegExp.$1)b=RegExp.$1}return b}
function _LSP(a,b,c,d){d=_IP(a,b);return d.join(c)}
function _LS(a,b,c,d,e,f,g){c=_D("lim")?_lim:100;b=(_IL(a)> c)?_A(_IS(a,
0,c)):_A(a);b=_LSP(b,"%0A","%20");b=_LSP(b,"%0D","%20");b=_LSP(b,"%09","%20");
c=_IP(b,"%20");d=_NA();e=0;for(f=0;f<_IL(c);f++)
{g=_RP(c[f],"%20","");if(_IL(g)>0){d[e++]=g}}b=d.join("%20");return
unescape(b)}
function _EM(a,b,c,d)
{a=_D("fv");b=_II(a,";"),c=parseInt(a);d=3;if(_TL( a)=="n")
{d=999;_fv=""}else if(b>-1){d=_IS(a,0,b);_fv=_IS(a,b+1,_IL(a))}
else if(c>0){d=c;_fv=""}return d}
function _FF(e){var a=(_bnN)?this:_EVO(e);_hlf=(a.lf)?a.lf:""}
function _FU(e){if(_hfs==0&&_IL(_hlf)>0&&_fa==1){_hfs=1;if( _hfc)
{_SV("sf","1")}else if(_IL(_hlf)>0)
{_SV("lf",_hlf)}_TV();_hlf="",_hfs=0,_hfc=0}}
function _FO(e){var
a=true;if(_DD(this._FS))eval("try{a=this._FS()}catch(e){}");if(a!
=false)_hfc=1;return a}
function _FA(a,b,c,d,e,f,g,h,i,ff,fv,s){b=a.forms;ff=new
Object();f=_EM();for(c=0;c<_IL(b);c++)
{ff=b[c],d=0,s=0,e=ff.elements,fv=eval(_D("fv"));
if(_DD(fv)&&_TL(_TS(fv))!="n"&&fv!=""&&typeof fv=="function"){_fv=new
Function("if("+_fv+"())
{_fvf=0;_hfc=1}");_EV(ff,"submit",_fv),_fvf=1,_fa= 1}
g=ff.name?ff.name:"forms["+c+"]";for(h=0;h<_IL(e);h++)
{if(e[h].type&&"hiddenbuttonsubmitimagereset".indexOf(e[h].type)<0&&d+
+>=f)break}if(d>=f){_fa=1;
for(h=0;h<_IL(e);h++)
{i=e[h];if(i.type&&"hiddenbuttonsubmitimagereset".indexOf (i.type)<0)
{i.lf=g+".";i.lf+=(i.name&&i.name!="")?i.name:"elements["+h+"]";
_EV(i,"focus",_FF)}}ff._FS=null;ff._FS=ff.onsubmit ;if(_DD(ff._FS)&&ff._FS!
=null){ff.onsubmit=_FO}else if(!(_bnN&&_bv<5)&&_hM&&!(_bnI&&!_I5))
{if((!_bnI)||
(_II(navigator.userAgent,"Opera")>-1))
{ff.onsubmit=_FO}else{_EV(ff,"submit",_FO);
eval("try{document.forms["+c+"]._FS=document.forms["+c
+"].submit;document.forms["+c+"].submit=_FO;throw ''}catch(E){}")}}}}}
function _GR(a,b,c,d){if(!_D("hrf"))return a;if(_II(_hrf,"http",
0)>-1)return _hrf;b=window.location.search;b=_IL(b)>1?_IS(b,
1,_IL(b)):"";
c=_II(b,_hrf+"=");if(c>-1){ d=_II(b,"&",c+1);d=d>c?d:_IL(b);b=_IS(b,c
+_IL(_hrf)+1,d)}return(b!=_hud&&_IL(b)>0)?b:a}
function _PO(a,b,c,d,e,f,g){d=location,e=d.pathname,f=_IS(e ,_IF(e,"/")
+1),g=document.title;if(a&&b==c){return(_pndef=="title"&&g!=""&&g!=d&&
!(_bnN&&_II(g,"http")>0))?g:f?f:_pndef}else{return b==c?(e==""||
e=="/")?"/":_IS(e,(_ctdef!="full")?
_IF(e,"/",_IF(e,"/")-2):_II(e,"/"),_IF(e,"/"))
:(b=="/")?b:((_II(b,"/")?"/":"")+(_IF(b,"/")==_IL(b)-1?_IS(b,
0,_IL(b)-1):b))}}
function _PP(a,b,c,d){return ""+(c>-1?_PO(b,_IS(a,0,c),d)
+";"+_PP(_IS(a,c+1),b,_II(_IS(a,c+1),";")):_PO(b,a ,d))}
_mlc=_PP(_mlc,0,_II( _mlc,";"),"CONTENT+CATEGORY");_pn=_PP(_pn,
1,_II(_pn,";"),"PUT+PAGE+NAME+HERE");
function _NN(a){return _D(a)!="none"}if(_NN("lt")){_LP(document.links,
0)}
function _E(a){var b="";var d=_IP(a,",");for(var c=0;c<_IL(d);c++)b
+="&"+d[c]+"="+_A(_D(d[c]));return b}
function _F(a,b){return(!_II(a,"?"+b+"="))?0:_II(a,"&"+b+"=")}function
_G(a,b,c,d){var e=_F(a,b);if(d&&e<0&&top&&window!=top){e=_F(_tls,b );
if(e>-1)a=_tls};return(e>-1)?_IS(a,e+2+_IL(b),(_II(a,"&",e+1)>-1)?
_II(a,"&",e+1):_IL(a)):c}
function _H(a,b,c){if(!a)a=c;if(_I5||_N6)
{eval("try{_vv=_G(location.search,'"+a+"','"+b+"',1)}"+__c
+"{}")}else{_vv=_G(location.search,a,b,1)}return unescape(_vv)}
function _I(a,b,c,d){__f=_IS(a,_II(a,"?"));if(b){if(_I5||_N6)
{eval("try{_hra=_G(__f,_hqsr,_hra,0)}"+__c
+"{}")}else{_hra=_G(__f,_hqsr,_hra,0)}};
if(c&&!_hra){if(_I5||_N6){eval("try{_hra=_G(location.search,_hqsp,_hra,
1)}"+__c+"{}")}else{_hra=_G(location.search,_hqsp, _hra,1)}};
if(d&&!_hra)_hra=d;return _hra}function _J(a,b,c,d)
{c=_II(a,"CP=");d=_II(a,b,c+3);return(c<0)?"null": _IS(a,c+3,(d<0)?
_IL(a):d)}
var
__r=".referrer",_rf=_A(eval("document"+__r)),_et=0 ,_oe=0,_we=0,_ar="",_hM=(!
(_II(navigator.userAgent,"Mac")>-1)),_tls="";
_bv=parseInt(navigator.appVersion);_bv=(_bv>99)?(_bv/100):_bv;var
__f,_hrat=_D("hra"),_hra="",__c="catch(_e)",_hbi=new
Image(),_fa=0,_hlfs=0,_hoc=0,
_hlf='',_ce='',_ln='',_pl='',_bn=navigator.appName ,_bn=(_II(_bn,"Microsoft")?
_bn:"MSIE"),_bnN=(_bn=="Netscape"),_bnI=(_bn=="MSIE"),
_hck="*; path=/; "+(_D("cpd")&&_D("cpd")!=""?(" domain=."+_D("cpd")+";
"):"")
+_ex,_N6=(_bnN&&_bv>4),_I5=false,_ss="na",_sc="na" ,_sv=11,_cy="u",_hp="u",
_tp=_D("ptc");if(_bn=="MSIE"){var
_nua=navigator.userAgent,_is=_II(_nua,_bn),_if=_II (_nua,".",_is);if(_if>_is)_I5=_nua.substring(_is
+5,_if)>=5}
if(_N6||_I5)eval("try{_tls=top.location.search}catch(_e){}")
function _PV()
{_dcmpe=_H(_D("dcmpe"),_D("dcmpe"),"DCMPE");_dcmpre=_H(_D("dcmpre"),_D("dcmpre"),"DCMPRE");_vv="";_cmp=_H(_D("cmpn"),_D("cmp"),"CMP");
_gp=_H(_D("gpn"),_D("gp"),"GP");_dcmp=_H(_D("dcmpn"),_D("dcmp"),"DCMP");if(_II(_cmp,"SFS-")>-1)
{document.cookie="HBCMP="+_cmp+"; path=/;"+
(_D("cpd")&&_D("cpd")!=""?(" domain=."+_D("cpd")+"; "):"")
+_ex}if(_bnI&&_bv>3)_ln=navigator.userLanguage;
if(_bnN){if(_bv>3)_ln=navigator.language;if(_bv>2) for(var
i=0;i<_IL(navigator.plugins);i++)_pl+=navigator.plugins[i].name
+":"};_cp=_D("cp");
if(location.search&&_TL(_cp)=="null")_cp=_J(location.search,"&");if(_II(document.cookie,"CP=")>-1)
{
_ce="y";_hd=_J(document.cookie,"*");if(_TL(_hd)!="null"&&_cp=="null")
{_cp=_hd}else{document.cookie="CP="+_cp
+_hck}}else{document.cookie="CP="+_cp+_hck;
_ce=(_II(document.cookie,"CP=")>-1)?"y":"n"};if(window.screen)
{_sv=12;_ss=screen.width+"*"+screen.height;_sc=_bnI?
screen.colorDepth:screen.pixelDepth;
if(_sc==_hud)_sc="na"};_ra=_NA();if(_ra.toSource||
(_bnI&&_ra.shift))_sv=13;if(_I5&&_hM)
{if(_II(""+navigator.appMinorVersion,"Privacy")>-1)_ce="p";
if(document.body&&document.body.addBehavior)
{document.body.addBehavior("#default#homePage");_hp=document.body.isHomePage(location.href)?"y":"n";
document.body.addBehavior("#default#clientCaps");_cy=document.body.connectionType}};var
_hcc=(_DD(_hcn))?_D("hcv"):"";if(!_D("gn"))_gn="ehg.hitbox.com";
if(_D("ct")&&!_D("mlc"))_mlc=_ct;_ar=_GP()+_gn+"/HG?hc="+_mn
+"&hb="+_A(_acct)+"&cd=1&hv=6&n="+_A(_pn,1)+"&con=&vcon="+_A(_mlc,
1)+"&tt="+_D("lt")+
"&ja="+(navigator.javaEnabled()?"y":"n")+"&dt="+(new Date()).getHours()
+"&zo="+(new Date()).getTimezoneOffset()
+"&lm="+Date.parse(document.lastModified)
+(_tp?("&pt="+_tp):"")+_E((_bnN?"bn,":"")
+"ce,ss,sc,sv,cy,hp,ln,vpc,vjs,hec,pec,cmp,gp,dcmp,dcmpe,dcmpre,cp,fnl")
+"&seg="+_D("seg")+"&epg="+_D("epg")+
"&cv="+_A(_hcc)+"&gn="+_A(_D("hcn"))+"&ld="+_A(_D( "hlt"))
+"&la="+_A(_D("hla"))+"&c1="+_A(_D("hc1"))+"&c2="+ _A(_D("hc2"))
+"&c3="+_A(_D("hc3"))+"&c4="+
_A(_D("hc4"))+"&customerid="+_A(_D("ci")?_ci:_D("cid"))
+"&lv.id="+_lv.id+"&lv.pos="+_lv.pos+"&ttt="+_lidt +","+_lpost;
if(_I5||_N6){eval("try{_rf=_A(top.document"+__r+") +''}"+__c
+"{_rf=_A(document"+__r+")+''}")}
else{if(top.document&&_IL(parent.frames)>1)
{_rf=_A(eval("document"+__r))+""}else if(top.document)
{_rf=_A(eval("top.document"+__r))+""}}if((_rf==_hud)||
(_rf==""))_rf="bookmark";_rf=unescape(_rf);_rf=_GR (_rf);_hra=_I(_rf,_D("hqsr"),_D("hqsp"),_hrat);_ar
+="&ra="+_A(_hra)+"&rf="+_A(_IS(_rf,0,500))+
"&pl="+_A(_pl)+_hbSendEV();if(_D("onlyMedia")!="y" )_hbi.src=_ar
+"&hid="+Math.random();_hbq=_IS(_ar,
0,_II(_ar,"&hec"));_hbE=_NA()}_PV();
function _Q(a){var b="";b=new Image();b.src=a+"&hid="+Math.random()}
function __X(a){if(_ec==0){_ec=1;a=document;if(_NN("lt")||_NN("dlf")||
_NN("elf")){_LP(a.links,1)}if(_NN("fv"))_FA(a)}}
function _EV(a,b,c){if(a.addEventListener)
{a.addEventListener(b,c,false)}else if(a.attachEvent)
{a.attachEvent("on"+b,c)}}
function _EVO(e){return document.all?window.event.srcElement:this}
_EV(window,"load",__X);_EV(window,"unload",_FU);eval('setTimeout("__X()",
3000)');
</script>
<!--END WEBSIDESTORY CODE-->

Apr 6 '07 #3

P: n/a
Aaron wrote:
Here's basically what was returned by $result:
<snipped lots of code>

Erwin is correct. That page uses a LOT of JavaScript, plus it's using
frames. This one is going to be very tough to scrape - you'll need to
work out what the JavaScript does and emulate it with cURL to get the page.
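
One possible starting point, assuming the table is served by the document loaded into mainFrame (the posted frameset points that way, but that is a guess) and that the site tracks the session with cookies, such as the ONTCred cookie the page's script checks for. resolve_url(), frame_urls() and fetch_with_cookies() are hypothetical helpers written for this sketch, not cURL functions, and the URL resolver is deliberately simplified.

```php
<?php
// Hypothetical helpers (not from the thread).

// Resolve a frame's src against the URL of the page that contained it.
// Simplified: handles absolute http(s) URLs, root-relative paths and
// paths relative to the base document's directory, nothing else.
function resolve_url(string $base, string $src): string
{
    if (preg_match('#^https?://#i', $src)) {
        return $src;
    }
    $p = parse_url($base);
    $origin = $p['scheme'] . '://' . $p['host']
            . (isset($p['port']) ? ':' . $p['port'] : '');
    if (strpos($src, '/') === 0) {
        return $origin . $src;
    }
    $path = $p['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);  // keep the trailing slash
    return $origin . $dir . $src;
}

// Map each <frame name="..."> in the page to the absolute URL of its src.
function frame_urls(string $html, string $base): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);  // tolerate invalid markup
    $doc->loadHTML($html);
    libxml_clear_errors();

    $urls = [];
    foreach ($doc->getElementsByTagName('frame') as $frame) {
        $urls[$frame->getAttribute('name')] =
            resolve_url($base, $frame->getAttribute('src'));
    }
    return $urls;
}

// Fetch a URL while keeping cookies in $cookieFile between requests.
// Pass $post to send a form submission instead of a plain GET.
function fetch_with_cookies(string $url, string $cookieFile, ?array $post = null)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);   // where cookies are saved
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile);  // where cookies are read from
    if ($post !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post));
    }
    // The cookie jar is written once the handle is freed, i.e. when this
    // function returns and $ch goes out of scope.
    return curl_exec($ch);
}

// Intended use (needs network access, so left commented out):
// $jar   = tempnam(sys_get_temp_dir(), 'cj');
// $page  = fetch_with_cookies($url, $jar, ['postvar1' => 'val1', 'postvar2' => 'val2']);
// $urls  = frame_urls($page, $url);
// $table = fetch_with_cookies($urls['mainFrame'], $jar);
```

If the second request still returns a "Searching..." document, the frame page itself probably waits on script-driven state, and plain cURL may not be enough.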

But it may also be that the webmaster implemented this in part to keep
anyone from scraping the screen. Most sites do not like this.

You'd be better off contacting the owner and seeing if there is another
way to get the information.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Apr 6 '07 #4
