python mechanize/libxml2dom question

hi...

i've got the following situation, with the following test url:
"http://schedule.psu.edu/soc/fall/Alloz/a-c/acctg.html#".

i can generate a list of the tables i want for the courses on the page.
however, when i build the xpath query and run it from within python, i'm
missing something. if a parent xpath query returns a list of result nodes,
how can i then take each individual node and run a further, relative xpath
query from it to get more information?

i tried using the following chunk of code with no luck.

import sys
import libxml2dom

# s is the html from the course file
d = libxml2dom.parseString(s, html=1)

# at this point, we should have a valid "d" representation
print "sdddd=", s

aa = libxml2dom.toString(d)
print "hereeeeee \n\n\n"
print "aa", aa
#sys.exit()

# **** course names

cpath = '//table[position()>0]/descendant::td[position()=2][@width="85%"]/../td[1]/font/a[2]/text()'

cpath_ = d.xpath(cpath)

print "len=", len(cpath_)
if len(cpath_) > 0:

    for cnode in cpath_:
        # get the coursename info
        cname = cnode.toString()
        print "cnode=", cnode
        print "cname=", cname
        rr = "./../../../../../../following-sibling::table//tr[position()>1]"

        # pass the relative expression to the node's own xpath() so that
        # it is evaluated with this node as the context
        rows = cnode.xpath(rr)
        print "rrlen=", len(rows)
        print rows[0].toString()
        sys.exit()

i'm assuming that there's a libxml2node method that will do what i need that
i'm missing...

pointers/comments would be helpful here...

thanks!
Sep 2 '08 #1
bruce wrote:
i've got the following situation, with the following test url:
"http://schedule.psu.edu/soc/fall/Alloz/a-c/acctg.html#".

i can generate a list of the tables i want for the courses on the page.
however, when i build the xpath query and run it from within python, i'm
missing something. if a parent xpath query returns a list of result nodes,
how can i then take each individual node and run a further, relative xpath
query from it to get more information?
[code example stripped]

You should really use lxml. It has callable XPath objects that feel like
Python functions, and its Element objects have a getparent() method that gets
you to the parent of the node. Plus, text strings that you get back from an
XPath evaluation also have a getparent() method that returns the Element
object that holds the text. I think that's what you were looking for.

Stefan
Sep 2 '08 #2
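
For anyone landing on this thread later, here is a minimal sketch of the lxml approach Stefan describes, assuming lxml is installed. The HTML in PAGE is a made-up stand-in for the course page (the real markup will differ), and the names find_course_names, find_section_rows and courses_with_sections are hypothetical helpers, not part of lxml. It compiles two XPath expressions into callable objects, uses getparent() on the text result to get back to the element that holds it, and then evaluates the second expression relative to that element.

```python
from lxml import etree, html

# Hypothetical stand-in for the course page: a heading table with the
# course name, followed by a table listing that course's sections.
PAGE = """
<html><body>
<table><tr>
  <td><font><a name="c1"></a><a href="#">ACCTG 211</a></font></td>
  <td width="85%">Financial Accounting</td>
</tr></table>
<table>
  <tr><th>Section</th><th>Time</th></tr>
  <tr><td>001</td><td>MWF 9:05</td></tr>
  <tr><td>002</td><td>TR 1:00</td></tr>
</table>
</body></html>
"""

# Compiled XPath objects are callable: pass the context node as the argument.
find_course_names = etree.XPath('//td[@width="85%"]/../td[1]/font/a[2]/text()')
find_section_rows = etree.XPath(
    'ancestor::table[1]/following-sibling::table[1]//tr[position()>1]'
)


def courses_with_sections(doc):
    """Return a list of (course name, list of section rows) pairs."""
    result = []
    for name in find_course_names(doc):
        # Text results from XPath remember the element that holds them.
        anchor = name.getparent()
        # Evaluate the second expression relative to that element.
        rows = find_section_rows(anchor)
        cells = [[td.text for td in row.findall("td")] for row in rows]
        result.append((str(name), cells))
    return result


doc = html.fromstring(PAGE)
for course, sections in courses_with_sections(doc):
    print(course, sections)
```

The relative expression here uses ancestor::table[1] rather than a chain of "../" steps, which seemed less fragile for this toy markup; whether it fits the real page depends on how its tables are nested.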
