
parsing "&A" in a string..

Hi.

A pretty simple question, I'm guessing.

I have a text/html string that looks like:
....(A&E)

The issue is that when I parse it using xpath/node/toString,
I get the following:

....(A&E;).

Note the added semicolon ";". I've tried to use the encoding function of
toString, with no luck.
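
To illustrate the difference between a node's serialized form and its text value, here is a minimal sketch. It uses the standard library's xml.dom.minidom as a stand-in, which is an assumption: it is not the parser used in the post, and its toxml() is only analogous to the toString() call above. The sample string is hypothetical too.

```python
from xml.dom.minidom import parseString

# Stand-in document (hypothetical). The ampersand is written as &amp;
# so that the XML parser accepts it.
doc = parseString("<a>Emergency (A&amp;E)</a>")
text_node = doc.documentElement.firstChild

serialized = text_node.toxml()   # markup form: special characters are escaped again
value = text_node.nodeValue      # decoded character data, no entity syntax

print(repr(serialized))
print(repr(value))
```

If the parser in the post behaves similarly, reading nodeValue rather than the serialized string may be enough to avoid the entity syntax, though that is untested here.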

The test chunk of code I'm using is:
..
..
..
dpath="//div/ul[@id='leftNavListing']/li[position()>0]//a/text()"
ldepts_=d.xpath(dpath)
if len(ldepts_)>0:
    for ldept in ldepts_:
        dept=ldept.nodeValue
        print "dept =",ldept.toString()
        # locate the opening and closing parentheses
        start=re.search(r"\(",dept).span()
        end=re.search(r"\)",dept).span()
        print start,end
        print dept[start[0]],dept[end[0]]
        # keep only the text between the parentheses
        dept=dept[start[1]:end[0]]
        print dept
..
..
..
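
The two searches and the slice in the loop above could probably be collapsed into one pattern with a capture group. A rough sketch, where text_in_parens is a made-up helper name and the sample strings are hypothetical:

```python
import re

def text_in_parens(s):
    """Return the text inside the first (...) pair in s, or None if there is none."""
    m = re.search(r"\(([^)]*)\)", s)
    return m.group(1) if m else None

print(text_in_parens("Emergency (A&E)"))
print(text_in_parens("no parentheses here"))
```

Unlike calling .span() directly on the result of re.search, this returns None instead of raising when a string has no parentheses.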

So, any thoughts/pointers as to how I can remove the ";" would be helpful.
I'm assuming there is a way to enforce a given encoding that would
remove the ";" issue.
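
One possible cause (an assumption, not confirmed anywhere in this post) is that the bare "&E" in the source is read as the start of an entity reference, so the serializer closes it with ";". If that is what is happening, it is an escaping matter rather than an encoding one, and escaping bare ampersands before parsing might help. A minimal sketch, where escape_bare_ampersands is a hypothetical helper:

```python
import re

# An "&" that does not already begin something shaped like an entity
# (&name; or &#number;) is treated as a literal ampersand.
BARE_AMP = re.compile(r"&(?!#?\w+;)")

def escape_bare_ampersands(markup):
    """Rewrite literal ampersands as &amp; and leave existing entities alone."""
    return BARE_AMP.sub("&amp;", markup)

print(escape_bare_ampersands("(A&E)"))
print(escape_bare_ampersands("(A&amp;E)"))
```

This is a heuristic: text such as "Q&A; more" looks like an entity to the pattern and would be left unescaped.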

thanks

Aug 31 '08 #1