possible issue with mechanize/python parsing

hi...

it appears that i'm running into a possible problem with
mechanize/Browser/python regarding the "select_form" method. i've tried each
of the following calls and get the error listed below:

br.select_form(nr = 1)
br.select_form(name="foo")
br.select_form(name=foo)

here's a short test app, as well as the html to be placed in a test data
file (axess.dat).

everything is straightforward...

any thoughts/comments/ideas would be helpful. i have the latest mechanize
from the svn repos.

thanks

-bruce

the error i get is:
Traceback (most recent call last):
  File "./axess.py", line 127, in ?
    br.select_form(name = "main")
  File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 354, in select_form
mechanize._mechanize.BrowserStateError: not viewing HTML
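
A possible explanation, offered as an assumption rather than something confirmed in this thread: the "not viewing HTML" message suggests the Browser decided the response was not HTML before it ever looked for a form. Since the page is loaded from a file:// URL ending in .dat, one thing worth checking is what content type that extension maps to. Below is a minimal sketch using only the standard library's mimetypes module. The helper name guessed_as_html is hypothetical, and the sketch does not reproduce mechanize's own internal check.

```python
import mimetypes


def guessed_as_html(path):
    """Return True if the file extension alone maps to text/html.

    Hypothetical helper for illustration only; mechanize's internal
    logic for deciding whether a response is HTML may differ.
    """
    content_type, _encoding = mimetypes.guess_type(path)
    return content_type == "text/html"


if __name__ == "__main__":
    for name in ("axess.dat", "axess.html"):
        # Show the extension-based guess for each candidate filename.
        print("%s -> %s" % (name, mimetypes.guess_type(name)[0]))
```

If the .dat name is not reported as text/html, renaming the test file to axess.html (and updating the file:// URL to match) may be worth trying before suspecting select_form itself.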

test code
-------------------------------
#!/usr/bin/env python
# test python script (Python 2)
import sys

import libxml2dom
import mechanize
from mechanize import Browser

########################
#
# Parsing App Information
########################


# datafile (opened for later output; not used in this test)
tfile = open("stanford.dat", "w+")

if __name__ == "__main__":
    # main app
    #----------------------------
    # start trying to get the stanford pages
    cj = mechanize.CookieJar()
    br = Browser()

    br.open("file:///home/test/axess.dat")

    print "foo"

    # use a particular cookiejar
    br.set_cookiejar(cj)
    # Log information about HTTP redirects and Refreshes.
    ##br.set_debug_redirects(True)
    # Log HTTP response bodies (ie. the HTML, most of the time).

    #WARNING!!!!!! using this will apparently
    #kill the Browser instance!!!
    #br.set_debug_responses(True)
    # Print HTTP headers.
    # br.set_debug_http(True)
    # br.set_handle_redirect(True)
    # br.set_handle_referer(True)

    response = br.response()  # this is a copy of response

    #get the option/semester name
    snamepath = "/html/body[@class='PSPAGE']/form[2]/table/tr[7]/td[3]/select/@name"

    #get the form name
    fnamepath = "/html/body[@class='PSPAGE']/form[2]/attribute::name"

    # read the body once and reuse it
    s = response.read()
    print s
    #we now have the semester page...
    d = libxml2dom.parseString(s, html=1)

    #get option name
    sem_optname = d.xpath(snamepath)
    sem_optname = sem_optname[0].nodeValue

    print "sem = ", sem_optname

    ff = d.xpath(fnamepath)
    fname = ff[0].nodeValue
    print "fname = ", fname

    # this is the call that raises BrowserStateError
    br.select_form(name="main")

    print "ssssss"
    sys.exit()

data file
-------------------------------
<html lang='en'>
<!-- Copyright (c) 2000-2005 PeopleSoft, Inc. All Rights Reserved. -->
<! IE/6.0/WINNT; ToolsRel=8.22.05; Page=CLASS_SRCH_ENTRY;
Component=CLASS_SEARCH; Menu=SA_LEARNER_SERVICES; -->
<HEAD>
<script language='JavaScript'>
var totalTimeoutMilliseconds = 1200000;
var warningTimeoutMilliseconds = 1080000;
var timeOutURL = 'https://axess.stanford.edu/psp/psp_prd/?cmd=expire&languageCd=ENG';
var timeoutWarningPageURL = 'https://psweb.stanford.edu/servlets/iclientservlet/a2k_prd/?ICType=Script&ICScriptProgramName=WEBLIB_TIMEOUT.PT_TIMEOUTWARNING.FieldFormula.IScript_TIMEOUTWARNING';
</script>
<TITLE>View Schedule of Classes</TITLE>
<LINK REL=STYLESHEET TYPE='TEXT/CSS'
HREF='/servlets/cs/a2k_prd/cache/PSSTYLEDEF_STF_ENG_1.css'>
<SCRIPT language='JavaScript'>
var baseKey_main = "";
var altKey_main = "05678\xbc\xbe\xbf\xde";
var ctrlKey_main = "JK";
var bTabOverTB_main = false;
var bTabOverPg_main = false;
var bTabOverNonPS_main = false;
</SCRIPT>
<SCRIPT LANGUAGE='javascript'
SRC='/servlets/cs/a2k_prd/cache/PT_SCRIPTIE600_ENG_main_1.js'>
</SCRIPT>
<SCRIPT LANGUAGE='JavaScript'>
document.domain = "stanford.edu";
</SCRIPT>
<SCRIPT LANGUAGE='javascript'
SRC='/servlets/cs/a2k_prd/cache/PT_PAGESCRIPT_ENG_main_1.js'>
</SCRIPT>
<SCRIPT LANGUAGE='javascript'
SRC='/servlets/cs/a2k_prd/cache/PT_SAVEWARNINGSCRIPT_ENG_main_1.js'>
</SCRIPT>
<SCRIPT LANGUAGE='javascript'
SRC='/servlets/cs/a2k_prd/cache/PT_ISCROSSDOMAIN_ENG_main_1.js'>
</SCRIPT>
<SCRIPT LANGUAGE='JavaScript'>
function submitAction_main(form, name)
{
form.ICAction.value=name;
form.ICXPos.value=getScrollX();
form.ICYPos.value=getScrollY();
form.submit();
}
</SCRIPT>
<SCRIPT LANGUAGE='javascript'
SRC='/servlets/cs/a2k_prd/cache/PT_EDITSCRIPT_ENG_main_1.js'>
</SCRIPT>
</HEAD>
<BODY CLASS='PSPAGE' onLoad="
setFocus_main('STF_AX_CLAS_DRV_STRM',-1);
setEventHandlers_main('ICFirstAnchor_main', 'ICLastAnchor_main', false);
setupTimeout();
setKeyEventHandler_main();
"
onunload=""
>
<a name='ICFirstAnchor_main'></a>
<FORM NAME='HelpURL' METHOD=POST Action=""><INPUT TYPE=hidden NAME=ICHelpUrl VALUE="http://psrepos.stanford.edu:9070/PSOL/htmldoc/f1search.htm?ContextID=CLASS_SRCH_ENTRY&LangCD=ENG"></FORM>
<table cols='2' width='100%' cellpadding='0' cellspacing='0' hspace='0'
vspace='0'>
<tr>
<td width='90%'></td><td width='10%' nowrap='nowrap' align='right'><a href="http://psrepos.stanford.edu:9070/PSOL/htmldoc/f1search.htm?ContextID=CLASS_SRCH_ENTRY&LangCD=ENG" target='help' accesskey='9' tabindex='1' class='PSHYPERLINK'>Help</a></td></tr>
</table>
<br />
<FORM NAME='main' METHOD=POST Action="/servlets/iclientservlet/a2k_prd/?ICType=Panel&Menu=SA_LEARNER_SERVICES&Market=GBL&PanelGroupName=CLASS_SEARCH" autocomplete=off>
<INPUT TYPE=hidden NAME=ICType VALUE=Panel>
<INPUT TYPE=hidden NAME=ICElementNum VALUE="0">
<INPUT TYPE=hidden NAME=ICStateNum VALUE="2">
<INPUT TYPE=hidden NAME=ICAction VALUE=None>
<INPUT TYPE=hidden NAME=ICXPos VALUE=0>
<INPUT TYPE=hidden NAME=ICYPos VALUE=0>
<INPUT TYPE=hidden NAME=ICFocus VALUE="">
<input type=hidden name=ICChanged value='-1' />
<TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0 COLS=9 WIDTH=553>
<TR>
<TD WIDTH=8 HEIGHT=8></TD>
<TD WIDTH=4></TD>
<TD WIDTH=8></TD>
<TD WIDTH=4></TD>
<TD WIDTH=1></TD>
<TD WIDTH=95></TD>
<TD WIDTH=36></TD>
<TD WIDTH=148></TD>
<TD WIDTH=249></TD>
</TR>
<TR>
<TD HEIGHT=24></TD>
<TD COLSPAN=8 VALIGN=TOP ALIGN=LEFT>
<LABEL FOR='DERIVED_AS_LBL_TITLE_NAME_PART' CLASS='PATRANSACTIONTITLE'
>Class Search</LABEL>
</TD>
</TR>
<TR>
<TD HEIGHT=29 COLSPAN=2></TD>
<TD COLSPAN=7 VALIGN=TOP ALIGN=LEFT>
<LABEL FOR='DERIVED_AS_LBL_CLASS_LBL' CLASS='PAPAGETITLE' >Select
Term</LABEL>
</TD>
</TR>
<TR>
<TD HEIGHT=20 COLSPAN=5></TD>
<TD COLSPAN=4 VALIGN=TOP ALIGN=LEFT>
<DIV CLASS='PAPAGEINSTRUCTIONS' >&nbsp;</DIV>
</TD>
</TR>
<TR>
<TD HEIGHT=20 COLSPAN=5></TD>
<TD COLSPAN=4 VALIGN=TOP ALIGN=LEFT>
<DIV CLASS='PAPAGEINSTRUCTIONS' >Select the term you wish to search, and
then click Basic Search,</DIV>
</TD>
</TR>
<TR>
<TD HEIGHT=43 COLSPAN=5></TD>
<TD COLSPAN=4 VALIGN=TOP ALIGN=LEFT>
<DIV CLASS='PAPAGEINSTRUCTIONS' >Advanced Search, or Independent Study
Search to continue.</DIV>
</TD>
</TR>
<TR>
<TD HEIGHT=55 COLSPAN=4></TD>
<TD COLSPAN=2 VALIGN=TOP ALIGN=LEFT>
<LABEL FOR='STF_AX_CLAS_DRV_STRM' CLASS='PSDROPDOWNLABEL' >*Select a
Term:</LABEL>
</TD>
<TD COLSPAN=3 VALIGN=TOP ALIGN=LEFT>
<SELECT NAME='STF_AX_CLAS_DRV_STRM' ID='STF_AX_CLAS_DRV_STRM' SIZE=1
TABINDEX=15 CLASS='PSDROPDOWNLIST' STYLE="width:207px; " >
<OPTION VALUE="1068">(1068)&nbsp; 2005-2006 Summer
<OPTION VALUE="1066">(1066)&nbsp; 2005-2006 Spring
<OPTION VALUE="1064">(1064)&nbsp; 2005-2006 Winter
<OPTION VALUE="1062">(1062)&nbsp; 2005-2006 Autumn
<OPTION VALUE="" SELECTED>
</SELECT>
</TD>
</TR>
<TR>
<TD HEIGHT=122 COLSPAN=3></TD>
<TD COLSPAN=4 NOWRAP VALIGN=TOP ALIGN=LEFT>
<INPUT TYPE=BUTTON NAME='CLASS_SRCH_BASIC' ID='CLASS_SRCH_BASIC' TABINDEX=17
VALUE="Basic Search" CLASS='PSPUSHBUTTON' STYLE="width:124px; "
ONCLICK="submitAction_main(this.form,this.name);">
</TD>
<TD NOWRAP VALIGN=TOP ALIGN=LEFT>
<INPUT TYPE=BUTTON NAME='CLASS_SRCH_ADV' ID='CLASS_SRCH_ADV' TABINDEX=18
VALUE="Advanced Search" CLASS='PSPUSHBUTTON' STYLE="width:124px; "
ONCLICK="submitAction_main(this.form,this.name);">
</TD>
<TD NOWRAP VALIGN=TOP ALIGN=LEFT>
<INPUT TYPE=BUTTON NAME='CLASS_SRCH_ADV1' ID='CLASS_SRCH_ADV1' TABINDEX=16
VALUE="Independent Study Search" CLASS='PSPUSHBUTTON' STYLE="width:152px; "
ONCLICK="submitAction_main(this.form,this.name);">
</TD>
</TR>
</TABLE>
</FORM>
<a name='ICLastAnchor_main'></a>
</BODY>
</HTML>
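
To rule out the form name itself as the culprit, the forms in the data file above can be listed independently of mechanize. The sketch below uses only the standard library's HTMLParser. FormNameCollector and list_form_names are hypothetical names introduced for illustration, and the sample string is a cut-down stand-in for axess.dat with the form actions left empty.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class FormNameCollector(HTMLParser):
    """Collect the name attribute of every <form> start tag, in document order."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.form_names = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <FORM NAME=...> matches.
        if tag == "form":
            self.form_names.append(dict(attrs).get("name"))


def list_form_names(html_text):
    """Return the list of form names found in html_text (None for unnamed forms)."""
    parser = FormNameCollector()
    parser.feed(html_text)
    parser.close()
    return parser.form_names


if __name__ == "__main__":
    # Cut-down stand-in for the data file; only the two form tags are kept.
    sample = (
        "<html><body>"
        "<FORM NAME='HelpURL' METHOD=POST Action=\"\"></FORM>"
        "<FORM NAME='main' METHOD=POST Action=\"\"></FORM>"
        "</body></html>"
    )
    print(list_form_names(sample))
```

If the real data file also reports HelpURL followed by main, then name="main" does exist and, assuming select_form counts forms from zero, nr=1 points at the same form. That would suggest the error comes from how the page is being loaded rather than from the form lookup.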

Jul 10 '06 #1