possible issue with mechanize/python parsing

hi...

it appears that i'm running into a possible problem with
mechanize/browser/python regarding the "select_form" method. i've tried each
of the following and get the error listed below:

br.select_form(nr=1)
br.select_form(name="foo")
br.select_form(name=foo)

here's a short test app, as well as the html to be placed in a test data
file...

everything is straightforward...

any thoughts/comments/ideas would be helpful. i have the latest mechanize
from the svn repos.

thanks

-bruce
the error i get is:
Traceback (most recent call last):
  File "./axess.py", line 127, in ?
    br.select_form(name = "main")
  File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 354, in select_form
mechanize._mechanize.BrowserStateError: not viewing HTML
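
A possible lead, not confirmed anywhere in this thread: the test app opens the page through a file: URL whose name ends in .dat, and for local files a content type is usually guessed from the file extension alone. The sketch below uses only the standard library's mimetypes module to show how that guess depends on the extension. That mechanize bases its "viewing HTML" check on such a guess is an assumption here, and axess.html is a hypothetical renamed copy of the data file.

```python
import mimetypes


def guessed_type(filename):
    """Return the MIME type the standard library guesses from the name alone."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime


# "axess.dat" is the name used in the test app; "axess.html" is a
# hypothetical renamed copy, used only for comparison.
html_guess = guessed_type("axess.html")
dat_guess = guessed_type("axess.dat")

print("axess.html -> %s" % html_guess)
print("axess.dat  -> %s" % dat_guess)
```

If the two guesses differ on your machine, renaming the data file to end in .html and opening that copy is a cheap experiment to run before digging into the form-selection code.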

test code
-------------------------------
#! /usr/bin/env python
#test python script
import re
import libxml2dom
import urllib
import urllib2
import sys, string
#import numarray
import httplib
from mechanize import Browser
import mechanize

########################
#
# Parsing App Information
########################


# datafile
tfile = open("stanford.dat", 'wr+')

cj = mechanize.CookieJar()
br = Browser()
if __name__ == "__main__":
    # main app
    #----------------------------
    # start trying to get the stanford pages
    cj = mechanize.CookieJar()
    br = Browser()

    fh = open('axess.dat')
    s = fh.read()
    fh.close()
    br.open("file:///home/test/axess.dat")

    print "foo"

    # particular cookiejar)
    br.set_cookiejar(cj)
    # Log information about HTTP redirects and Refreshes.
    ##br.set_debug_redirects(True)
    # Log HTTP response bodies (ie. the HTML, most of the time).

    #WARNING!!!!!! using this will apparently
    #kill the Browser instance!!!
    #br.set_debug_responses(True)
    # Print HTTP headers.
    # br.set_debug_http(True)
    # br.set_handle_redirect(True)
    # br.set_handle_referer(True)

    response = br.response() # this is a copy of response

    #get the option/semester name
    snamepath = "/html/body[@class='PSPAGE']/form[2]/table/tr[7]/td[3]/select/@name"

    #get the form name
    fnamepath = "/html/body[@class='PSPAGE']/form[2]/attribute::name"

    s = response.read()
    print response.read()
    print s
    #we now have the semester page...
    d = libxml2dom.parseString(s, html=1)

    #get option name
    sem_optname = d.xpath(snamepath)
    sem_optname = sem_optname[0].nodeValue

    print "sem = ",sem_optname

    ff = d.xpath(fnamepath)
    fname = ff[0].nodeValue
    print "fname = ",fname
    br.select_form(name = "main")

    print "ssssss"
    sys.exit()
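
A side note on the test app: it calls response.read() twice in a row. On an ordinary file-like object the first call consumes the stream, so the second returns an empty string unless the object is rewound first. The sketch below uses io.BytesIO as a stand-in. Whether the mechanize response object behaves the same way is an assumption, and the HTML body is made up for illustration.

```python
import io

# Made-up body standing in for the contents of the data file.
body = b"<html><body><form name='main'></form></body></html>"
response = io.BytesIO(body)  # stand-in for a file-like response object

first = response.read()   # consumes the whole stream
second = response.read()  # nothing is left, so this is empty

response.seek(0)          # rewind before reading again
third = response.read()

print("first read:  %d bytes" % len(first))
print("second read: %d bytes" % len(second))
print("after seek:  %d bytes" % len(third))
```

Reading once into a variable and reusing it, as the script already does with s, avoids depending on how the response object handles repeated reads.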

data file
-------------------------------
<html lang='en'>
<!-- Copyright (c) 2000-2005 PeopleSoft, Inc. All Rights Reserved. -->
<! IE/6.0/WINNT; ToolsRel=8.22.05; Page=CLASS_SRCH_ENTRY;
Component=CLASS_SEARCH; Menu=SA_LEARNER_SERVICES; -->
<HEAD>
<script language='JavaScript'>
var totalTimeoutMilliseconds = 1200000;
var warningTimeoutMilliseconds = 1080000;
var timeOutURL =
'https://axess.stanford.edu/psp/psp_prd/?cmd=expire&languageCd=ENG';
var timeoutWarningPageURL =
'https://psweb.stanford.edu/servlets/iclientservlet/a2k_prd/?ICType=Script&ICScriptProgramName=WEBLIB_TIMEOUT.PT_TIMEOUTWARNING.FieldFormula.IScript_TIMEOUTWARNING';
</script>
<TITLE>View Schedule of Classes</TITLE>
<LINK REL=STYLESHEET TYPE='TEXT/CSS'
HREF='/servlets/cs/a2k_prd/cache/PSSTYLEDEF_STF_ENG_1.css'>
<SCRIPT language='JavaScript'>
var baseKey_main = "";
var altKey_main = "05678\xbc\xbe\xbf\xde";
var ctrlKey_main = "JK";
var bTabOverTB_main = false;
var bTabOverPg_main = false;
var bTabOverNonPS_main = false;
</SCRIPT>
<SCRIPT LANGUAGE='javascript'
SRC='/servlets/cs/a2k_prd/cache/PT_SCRIPTIE600_ENG_main_1.js'>
</SCRIPT>
<SCRIPT LANGUAGE='JavaScript'>
document.domain = "stanford.edu";
</SCRIPT>
<SCRIPT LANGUAGE='javascript'
SRC='/servlets/cs/a2k_prd/cache/PT_PAGESCRIPT_ENG_main_1.js'>
</SCRIPT>
<SCRIPT LANGUAGE='javascript'
SRC='/servlets/cs/a2k_prd/cache/PT_SAVEWARNINGSCRIPT_ENG_main_1.js'>
</SCRIPT>
<SCRIPT LANGUAGE='javascript'
SRC='/servlets/cs/a2k_prd/cache/PT_ISCROSSDOMAIN_ENG_main_1.js'>
</SCRIPT>
<SCRIPT LANGUAGE='JavaScript'>
function submitAction_main(form, name)
{
form.ICAction.value=name;
form.ICXPos.value=getScrollX();
form.ICYPos.value=getScrollY();
form.submit();
}
</SCRIPT>
<SCRIPT LANGUAGE='javascript'
SRC='/servlets/cs/a2k_prd/cache/PT_EDITSCRIPT_ENG_main_1.js'>
</SCRIPT>
</HEAD>
<BODY CLASS='PSPAGE' onLoad="
setFocus_main('STF_AX_CLAS_DRV_STRM',-1);
setEventHandlers_main('ICFirstAnchor_main', 'ICLastAnchor_main', false);
setupTimeout();
setKeyEventHandler_main();
"
onunload=""
>
<a name='ICFirstAnchor_main'></a>
<FORM NAME='HelpURL' METHOD=POST Action=""><INPUT TYPE=hidden NAME=ICHelpUrl
VALUE="http://psrepos.stanford.edu:9070/PSOL/htmldoc/f1search.htm?ContextID=CLASS_SRCH_ENTRY&LangCD=ENG" ></FORM>
<table cols='2' width='100%' cellpadding='0' cellspacing='0' hspace='0'
vspace='0'>
<tr>
<td width='90%'></td><td width='10%' nowrap='nowrap' align='right'><a
href="http://psrepos.stanford.edu:9070/PSOL/htmldoc/f1search.htm?ContextID=CLASS_SRCH_ENTRY&LangCD=ENG" target='help' accesskey='9' tabindex='1'
class='PSHYPERLINK'>Help</a></td></tr>
</table>
<br />
<FORM NAME='main' METHOD=POST
Action="/servlets/iclientservlet/a2k_prd/?ICType=Panel&Menu=SA_LEARNER_SERVICES&Market=GBL&PanelGroupName=CLASS_SEARCH" autocomplete=off>
<INPUT TYPE=hidden NAME=ICType VALUE=Panel>
<INPUT TYPE=hidden NAME=ICElementNum VALUE="0">
<INPUT TYPE=hidden NAME=ICStateNum VALUE="2">
<INPUT TYPE=hidden NAME=ICAction VALUE=None>
<INPUT TYPE=hidden NAME=ICXPos VALUE=0>
<INPUT TYPE=hidden NAME=ICYPos VALUE=0>
<INPUT TYPE=hidden NAME=ICFocus VALUE="">
<input type=hidden name=ICChanged value='-1' />
<TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0 COLS=9 WIDTH=553>
<TR>
<TD WIDTH=8 HEIGHT=8></TD>
<TD WIDTH=4></TD>
<TD WIDTH=8></TD>
<TD WIDTH=4></TD>
<TD WIDTH=1></TD>
<TD WIDTH=95></TD>
<TD WIDTH=36></TD>
<TD WIDTH=148></TD>
<TD WIDTH=249></TD>
</TR>
<TR>
<TD HEIGHT=24></TD>
<TD COLSPAN=8 VALIGN=TOP ALIGN=LEFT>
<LABEL FOR='DERIVED_AS_LBL_TITLE_NAME_PART' CLASS='PATRANSACTIONTITLE'
>Class Search</LABEL>
</TD>
</TR>
<TR>
<TD HEIGHT=29 COLSPAN=2></TD>
<TD COLSPAN=7 VALIGN=TOP ALIGN=LEFT>
<LABEL FOR='DERIVED_AS_LBL_CLASS_LBL' CLASS='PAPAGETITLE' >Select
Term</LABEL>
</TD>
</TR>
<TR>
<TD HEIGHT=20 COLSPAN=5></TD>
<TD COLSPAN=4 VALIGN=TOP ALIGN=LEFT>
<DIV CLASS='PAPAGEINSTRUCTIONS' >&nbsp;</DIV>
</TD>
</TR>
<TR>
<TD HEIGHT=20 COLSPAN=5></TD>
<TD COLSPAN=4 VALIGN=TOP ALIGN=LEFT>
<DIV CLASS='PAPAGEINSTRUCTIONS' >Select the term you wish to search, and
then click Basic Search,</DIV>
</TD>
</TR>
<TR>
<TD HEIGHT=43 COLSPAN=5></TD>
<TD COLSPAN=4 VALIGN=TOP ALIGN=LEFT>
<DIV CLASS='PAPAGEINSTRUCTIONS' >Advanced Search, or Independent Study
Search to continue.</DIV>
</TD>
</TR>
<TR>
<TD HEIGHT=55 COLSPAN=4></TD>
<TD COLSPAN=2 VALIGN=TOP ALIGN=LEFT>
<LABEL FOR='STF_AX_CLAS_DRV_STRM' CLASS='PSDROPDOWNLABEL' >*Select a
Term:</LABEL>
</TD>
<TD COLSPAN=3 VALIGN=TOP ALIGN=LEFT>
<SELECT NAME='STF_AX_CLAS_DRV_STRM' ID='STF_AX_CLAS_DRV_STRM' SIZE=1
TABINDEX=15 CLASS='PSDROPDOWNLIST' STYLE="width:207px; " >
<OPTION VALUE="1068">(1068)&nbsp; 2005-2006 Summer
<OPTION VALUE="1066">(1066)&nbsp; 2005-2006 Spring
<OPTION VALUE="1064">(1064)&nbsp; 2005-2006 Winter
<OPTION VALUE="1062">(1062)&nbsp; 2005-2006 Autumn
<OPTION VALUE="" SELECTED>
</SELECT>
</TD>
</TR>
<TR>
<TD HEIGHT=122 COLSPAN=3></TD>
<TD COLSPAN=4 NOWRAP VALIGN=TOP ALIGN=LEFT>
<INPUT TYPE=BUTTON NAME='CLASS_SRCH_BASIC' ID='CLASS_SRCH_BASIC' TABINDEX=17
VALUE="Basic Search" CLASS='PSPUSHBUTTON' STYLE="width:124px; "
ONCLICK="submitAction_main(this.form,this.name);">
</TD>
<TD NOWRAP VALIGN=TOP ALIGN=LEFT>
<INPUT TYPE=BUTTON NAME='CLASS_SRCH_ADV' ID='CLASS_SRCH_ADV' TABINDEX=18
VALUE="Advanced Search" CLASS='PSPUSHBUTTON' STYLE="width:124px; "
ONCLICK="submitAction_main(this.form,this.name);">
</TD>
<TD NOWRAP VALIGN=TOP ALIGN=LEFT>
<INPUT TYPE=BUTTON NAME='CLASS_SRCH_ADV1' ID='CLASS_SRCH_ADV1' TABINDEX=16
VALUE="Independent Study Search" CLASS='PSPUSHBUTTON' STYLE="width:152px; "
ONCLICK="submitAction_main(this.form,this.name);">
</TD>
</TR>
</TABLE>
</FORM>
<a name='ICLastAnchor_main'></a>
</BODY>
</HTML>

Jul 10 '06 #1