parsing website with a "searching...." page

I'm trying to parse a table on a webpage to pull down some data I
need. The page is generated from information entered into a form. When
you submit the form, it displays a "Searching..." page, then
refreshes and displays the table I want. I have code that grabs
the page using cURL, but when I look at the data it contains
the "Searching..." page and not the table that I want. Below is the
code I have so far. Thanks in advance for any help.

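<?php

$url = "http://www.website.com";

// Form fields to submit.
$post_data = array();
$post_data['postvar1'] = "val1";
$post_data['postvar2'] = "val2";

// http_build_query() URL-encodes every key and value and joins them with "&".
// (The original loop concatenated raw values; utf8_encode() is only relevant
// for non-ASCII ISO-8859-1 input and is deprecated since PHP 8.2.)
$post_data = http_build_query($post_data);

$ch = curl_init();
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_data);
$result = curl_exec($ch);
if ($result === false) {
    // curl_exec() returns false when the request fails.
    die("cURL error: " . curl_error($ch));
}
curl_close($ch);

// Split the response body into lines.
$result = explode("\n", $result);

?>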

Apr 6 '07 #1
Aaron wrote:
<snipped quote of the original post>
Hi,

The page is probably using JavaScript to produce that effect.
Inspect the contents of $result to see if this is the case.
Look for divs that are initially hidden and are made visible when the page
loads (at the end of a script, or in an onLoad event).
The data you want might very well already be inside $result.

If not, give more information about what $result contained.
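
As a rough illustration of that inspection, here is a minimal sketch. It assumes the fetched page is available as one string. The function name findPageHints() and the sample markup are made up for this example and are not from the site in question. It only covers three common causes (a meta refresh, frames, and inline-hidden elements), so treat it as a starting point rather than a complete diagnosis.

```php
<?php
// Hypothetical helper (not from the thread): collect clues in a fetched page
// that may explain why the expected table is missing.
function findPageHints(string $html): array
{
    $doc = new DOMDocument();
    // Real-world markup is rarely valid; keep libxml parse warnings quiet.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $hints = ['meta_refresh' => [], 'frames' => [], 'hidden' => 0];

    // <meta http-equiv="refresh" content="3; url=..."> makes a browser reload.
    foreach ($xpath->query('//meta[@http-equiv]') as $meta) {
        if (strcasecmp($meta->getAttribute('http-equiv'), 'refresh') === 0) {
            $hints['meta_refresh'][] = $meta->getAttribute('content');
        }
    }

    // <frame> and <iframe> load their real content from another URL.
    foreach ($xpath->query('//frame[@src] | //iframe[@src]') as $frame) {
        $hints['frames'][] = $frame->getAttribute('src');
    }

    // Elements hidden with an inline style may already hold the data.
    $hiddenPattern = '/display\s*:\s*none|visibility\s*:\s*hidden/i';
    foreach ($xpath->query('//*[@style]') as $node) {
        if (preg_match($hiddenPattern, $node->getAttribute('style'))) {
            $hints['hidden']++;
        }
    }

    return $hints;
}

// Made-up sample page: a meta refresh plus a hidden div that holds a table.
$sample = '<html><head><meta http-equiv="refresh" content="3; url=results.php"></head>'
        . '<body><div style="display:none"><table><tr><td>42</td></tr></table></div>'
        . '</body></html>';
print_r(findPageHints($sample));
```

With the code from the original post, the string to pass in would be the return value of curl_exec() before it is split with explode(), or implode("\n", $result) afterwards. A non-empty 'frames' or 'meta_refresh' entry suggests the table lives at a second URL that has to be requested separately.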

Regards,
Erwin Moller
Apr 6 '07 #2
On Apr 6, 10:55 am, Erwin Moller wrote:
<snipped quoted text>

Here's basically what was returned by $result:

<!-- Vignette V6 Fri Apr 06 09:56:47 2007 -->
<script language="javascript">

function changePassword() {
    window.open('https://www.secure-hallmark.xxxxx.com/ldap/em_change_password.asp', 'ChangePassword', 'height=250,width=400,screenX=100,screenY=100');
}

function forgotPassword() {
    window.open('/auth/forgotpasswordform/0,2290,,00.html', 'ForgotPassword', 'height=250,width=400,screenX=100,screenY=100');
}
</script><html><head><title>xxxxxxx</title>
<!--INCLUDE VIRTUAL="/ast/functions.asp" -->

<script language="Javascript">
//Code only to exsist if ONTCred Cookie is missing
//Check every 15 sec for ONTCred and reload page if it exsists. Only once though
var intRunOnce = 0;
function fctCheckONTcred() {
    var allcookies = "";
    allcookies = document.cookie;
    var vgnONTpos = allcookies.indexOf("ONTCred=");
    if (vgnONTpos != -1){
        if (intRunOnce == 0) {
            topFrame.location.reload();
            intRunOnce = 1;
        }
    }else{
        setTimeout('fctCheckONTcred()', 15000);
    }
}
</script>

</head>
<frameset rows="125,*" frameborder="NO" border="0" noresize framespacing="0">

<frame name="topFrame" target="_top" onload="setTimeout('fctCheckONTcred()', 15000);" scrolling="NO" noresize src="xxxxx.html" marginheight="0" marginwidth="0">

<frame name="mainFrame" src="https://xxxxxxx.com/webapp/wcs/stores/servlet/RemoteAdvancedSearchView?langId=-1&storeId=500201&catalogId=500201&manufacturer=bav99&qasFilter=FALSE" noresize marginheight="0" marginwidth="0">
</frameset>
</html>
<!--WEBSIDESTORY CODE HBX1.4 (SSL)-->
<!-- EM SSL - 25/Apr/2006 14:40 -->
<!--COPYRIGHT 1997-2005 WEBSIDESTORY,INC. ALL RIGHTS RESERVED.
U.S.PATENT No.6,393,479B1 & 6,766,370. INFO:http://websidestory.com/
privacy-->
<snipped: two WebSideStory HBX analytics script blocks (the hbx configuration variables and the minified hbx.js tracking library); neither contains the search results>
<!--END WEBSIDESTORY CODE-->

Apr 6 '07 #3
Aaron wrote:
Here's basically what was returned by $result
<snipped lots of code>

Erwin is correct. That's using a LOT of JavaScript, plus it's using
frames. This one is going to be very tough to scrape: you'll need to
work out what the JavaScript does and emulate it with cURL to get the page.
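
One possible starting point for that emulation is sketched below, under several assumptions. The helper names frameSrc() and fetchWithCookies() are invented for this example. The frame's src is assumed to be an absolute URL, as it is for mainFrame in the posted output. The site is assumed to rely on cookies, since the posted script checks for an ONTCred cookie. Whether this is enough for this particular site is not known.

```php
<?php
// Hypothetical helper: return the src of the <frame> with the given name,
// or null when the page has no such frame.
function frameSrc(string $html, string $frameName): ?string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // tolerate invalid markup
    $doc->loadHTML($html);
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('frame') as $frame) {
        if ($frame->getAttribute('name') === $frameName) {
            return $frame->getAttribute('src');
        }
    }
    return null;
}

// Hypothetical helper: GET a URL while sharing cookies between requests,
// the way a browser would across the frameset and its frames.
function fetchWithCookies(string $url, string $cookieFile): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);    // follow HTTP redirects
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // send stored cookies
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // store new cookies
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    // The cookie jar is written when the handle is released at function exit.
    return $body;
}

// Usage sketch (placeholder URL and cookie path, not executed here):
// $page = fetchWithCookies('http://www.website.com', '/tmp/cookies.txt');
// $src  = frameSrc($page, 'mainFrame');
// $tablePage = ($src !== null) ? fetchWithCookies($src, '/tmp/cookies.txt') : null;
```

The initial form POST from the original code would need the same two cookie options so that the session carries over to the frame request. A relative src would first have to be resolved against the page URL, and any content the site builds purely in JavaScript would still be out of reach for cURL.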

But it may also be that the webmaster implemented this in part to keep
anyone from scraping the screen. Most sites do not like this.

You'd be better off contacting the owner and seeing if there is another
way to get the information.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
js*******@attglobal.net
==================
Apr 6 '07 #4
