hi...
update to an ongoing issue i've been having regarding html/Browser and
selecting forms.
i've created a basic test app, and created a stripped down page of html. the
html has a single form.
i get the following error:
fname = main        <<<< the app can find the form name from the XPath...
Traceback (most recent call last):
  File "./axess.py", line 90, in ?
    br.select_form(name = "main")        <<<<< app is dying!!!
  File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 354, in select_form
mechanize._mechanize.BrowserStateError: not viewing HTML
any thoughts/ideas/comments will be useful!!
thanks
-bruce
test code
---------------------------
import re
import libxml2dom
import urllib
import urllib2
import sys, string
#import numarray
import httplib
from mechanize import Browser, RobustFactory
import mechanize
from BeautifulSoup import *

########################
#
# Parsing App Information
########################

# datafile ('w+' opens for read/write; 'wr+' is not a documented mode)
tfile = open("stanford.dat", 'w+')

cj = mechanize.CookieJar()
br = Browser()

if __name__ == "__main__":
    # main app
    #----------------------------
    # start trying to get the stanford pages
    cj = mechanize.CookieJar()
    # br = Browser(factory=RobustFactory())
    br = Browser()

    fh = open('axess1.dat')
    s = fh.read()
    fh.close()

    br.open("file:///home/test/axess1.dat")
    # br.open(s)
    print "foo"

    # attach the cookie jar to the browser
    br.set_cookiejar(cj)

    response = br.response()  # this is a copy of response
    fnamepath = "/html/body[@class='PSPAGE']/form[1]/attribute::name"
    s = response.read()
    print s  # print the saved body; a second response.read() would return ''
    d = libxml2dom.parseString(s, html=1)
    ff = d.xpath(fnamepath)
    fname = ff[0].nodeValue
    print "fname = ",fname

    br.select_form(name = "main")  # this is the call that raises BrowserStateError
    print "ssssss"
    sys.exit()
test html
---------------------------
<html lang='en'>
<head>
<title>View Schedule of Classes</title>
</head>
<body class='PSPAGE' >
<br>
<form name="main" method="post"
action="/servlets/iclientservlet/a2k_prd/?ICType=Panel&Menu=SA_LEARNER_SERVICES&Market=GBL&PanelGroupName=CLASS_SEARCH"
autocomplete="off" id="main">
</form>
</body>
</html>
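One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the page is opened through a file: URL ending in .dat, and a browser library that decides whether it is "viewing HTML" from the content type may not treat such a file as HTML. The sketch below uses only the standard library's mimetypes module to show what type is guessed from each extension. guessed_type is a hypothetical helper, the .html path is an illustrative renamed copy of the test file, and whether mechanize relies on this kind of guess for file: URLs is assumed here, not verified.

```python
import mimetypes


def guessed_type(path):
    """Return the MIME type guessed from the file extension alone (or None)."""
    return mimetypes.guess_type(path)[0]


# Compare the test file as named in the post with a hypothetical .html copy.
for path in ("/home/test/axess1.dat", "/home/test/axess1.html"):
    print("%s -> %s" % (path, guessed_type(path)))
```

If the .dat path does not come back as text/html, renaming the test file to axess1.html and opening that URL instead would be a cheap experiment before digging further into select_form.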
hi john...
this is in regards to the web/parsing/factory/beautifulsoup....
to reiterate, i have python 2.4, mechanize, browser, beautifulsoup installed.
i have the latest mech from svn.
i'm getting the same err as reported by john t. the code/err follows.. (i
can resend the test html if you need)
any thoughts/pointers/etc would be helpful...
thanks
-bruce
test code
---------------------------
#! /usr/bin/env python
#test python script
import re
import libxml2dom
import urllib
import urllib2
import sys, string
#import numarray
import httplib
from mechanize import Browser, RobustFactory
import mechanize
import BeautifulSoup

########################
#
# Parsing App Information
########################

# datafile ('w+' opens for read/write; 'wr+' is not a documented mode)
tfile = open("stanford.dat", 'w+')

cj = mechanize.CookieJar()
br = Browser()

if __name__ == "__main__":
    # main app
    #----------------------------
    # start trying to get the stanford pages
    cj = mechanize.CookieJar()
    br = Browser(factory=RobustFactory())

    fh = open('axess.dat')
    s = fh.read()
    fh.close()

    br.open("file:///home/test/axess.dat")