I have walked a directory and written the following XML document.
One of the folders had an "&" character in its name, so I replaced it with "&amp;".
#------------------test1.xml
<Directory>
  <dirname>C:\Documents and Settings\Administrator\Desktop\1\bye w&amp;y </dirname>
  <file>
    <name>def.txt</name>
    <time>200607130417</time>
  </file>
</Directory>
<Directory>
  <dirname>C:\Documents and Settings\Administrator\Desktop\1\hii wx</dirname>
  <file>
    <name>abc.txt</name>
    <time>200607130415</time>
  </file>
</Directory>
Now, in my Python code, I want to parse this document and print the directory
name.
###----------handler------------filename---handler.py
from xml.sax.handler import ContentHandler
class oldHandler(ContentHandler):
    def __init__(self):
        self.dn = 0
    def startElement(self, name, attrs):
        if name=='dirname':
            self.dn=1
    def characters(self,str):
        if self.dn:
            print str
    def endElement(self, name):
        if name == 'dirname':
            self.dn=0
#---------------------------------------------------------------------
#main code--- fname----art.py
import sys
from xml.sax import make_parser
from handlers import oldHandler
ch = oldHandler()
saxparser = make_parser()
saxparser.setContentHandler(ch)
saxparser.parse(sys.argv[1])
#-----------------------------------------------------------------------------
I run the code as: $ python art.py test1.xml
I am getting this output:
C:\Documents and Settings\Administrator\Desktop\1\bye w
&
y
C:\Documents and Settings\Administrator\Desktop\1\hii wx
whereas I need output that looks like this:
C:\Documents and Settings\Administrator\Desktop\1\bye w&y
C:\Documents and Settings\Administrator\Desktop\1\hii wx
Can someone tell me the solution for this?
Kirt wrote:
[...]
    def characters(self,str):
        if self.dn:
            print str
The problem is here. "print" adds a newline. Don't use print, just append the
characters (to a string or list) until the endElement callback is called.
[...]
How do I append characters to a string?
Actually, my entire handler code is:
class oldHandler(ContentHandler):
    def __init__(self):
        self.fn = 0
        self.dn = 0
        self.i=[]
        self.x=""
        self.y=""
        self.z=""
        self.t=0
        self.xx=''
    def startElement(self, name, attrs):
        if name=='dirname':
            self.dn=1
        if name=='name':
            self.fn=1
        if name=='time':
            self.t=1
    def characters(self,str):
        if self.dn:
            self.x=str
        if self.fn:
            self.y=str
        if self.t:
            self.z=str
            ss= self.x+'/'+self.y+','+self.z+ '\r \n'
            self.i.append(ss)
    def endElement(self, name):
        if name == 'dirname':
            self.dn=0
        if name=='name':
            self.fn=0
        if name=='time':
            self.t=0
    def endDocument(self):
        f=open('old.txt', 'w')
        self.i.sort
        f.writelines(self.i)
        f.close
So my old.txt now looks like this:
y+def.txt,200607130417
C:\Documents and Settings\Administrator\Desktop\1\hii wx\abc.txt,200607130415
But I want the output to be:
C:\Documents and Settings\Administrator\Desktop\1\bye w&y\def.txt,200607130417
C:\Documents and Settings\Administrator\Desktop\1\hii wx\abc.txt,200607130415
Kirt wrote:
How do I append characters to a string?
I think the normal approach is to store an empty string (or list) in an
attribute in startElement(), append to it in characters() and use the result
in endElement().
    def startElement(self, ...):
        self.chars = ''
    def characters(self, s):
        self.chars += s
    def endElement(self, ...):
        value = self.chars
Or use a list and do this:
    def endElement(self, ...):
        value = ''.join(self.char_list)
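For reference, here is a minimal runnable sketch of that buffering approach applied to the dirname/name/time elements from the question. It is only an illustration under a few assumptions: it targets Python 3 (the code in this thread is Python 2), the class name BufferingHandler and the sample data are made up for the example, and the <root> wrapper is assumed so that the sample has a single root element.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Hypothetical sample; <root> is an assumed wrapper element.
SAMPLE = rb"""<root>
<Directory>
<dirname>C:\Documents and Settings\Administrator\Desktop\1\bye w&amp;y </dirname>
<file>
<name>def.txt</name>
<time>200607130417</time>
</file>
</Directory>
</root>"""


class BufferingHandler(ContentHandler):
    """Builds 'dirname\\name,time' records, buffering text per element."""

    def __init__(self):
        super().__init__()
        self.chunks = None      # text pieces of the element being read, or None
        self.dirname = ''
        self.filename = ''
        self.records = []

    def startElement(self, name, attrs):
        if name in ('dirname', 'name', 'time'):
            self.chunks = []    # start a fresh buffer for this element

    def characters(self, content):
        # May be called several times for one text node, so only collect here.
        if self.chunks is not None:
            self.chunks.append(content)

    def endElement(self, name):
        if self.chunks is None:
            return              # element we do not track (file, Directory, root)
        text = ''.join(self.chunks).strip()
        self.chunks = None
        if name == 'dirname':
            self.dirname = text
        elif name == 'name':
            self.filename = text
        elif name == 'time':
            self.records.append('%s\\%s,%s' % (self.dirname, self.filename, text))


handler = BufferingHandler()
xml.sax.parseString(SAMPLE, handler)
for record in handler.records:
    print(record)
```

The text is joined only in endElement(), so it no longer matters how the parser splits the content around the "&amp;" entity.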
Maybe you should consider switching to iterparse() of ElementTree or lxml.
Should be a bit easier to use than SAX ...
http://effbot.org/zone/element-iterparse.htm
http://codespeak.net/svn/lxml/trunk/doc/api.txt
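As a rough illustration of the iterparse() route, the sketch below uses xml.etree.ElementTree from the Python 3 standard library (an assumption; the thread predates it being bundled under that name). The helper directory_names and the sample document, including its <root> wrapper, are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample; <root> is an assumed wrapper element.
SAMPLE = rb"""<root>
<Directory>
<dirname>C:\Documents and Settings\Administrator\Desktop\1\bye w&amp;y </dirname>
</Directory>
<Directory>
<dirname>C:\Documents and Settings\Administrator\Desktop\1\hii wx</dirname>
</Directory>
</root>"""


def directory_names(source):
    """Yield the stripped text of every <dirname> element in source."""
    for event, elem in ET.iterparse(source, events=('end',)):
        if elem.tag == 'dirname':
            # elem.text already holds the complete text, entities resolved.
            yield (elem.text or '').strip()
        elif elem.tag == 'Directory':
            elem.clear()    # free the finished subtree to keep memory low


names = list(directory_names(io.BytesIO(SAMPLE)))
for name in names:
    print(name)
```

With iterparse() the element text arrives in one piece at the "end" event, so no manual buffering is needed.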
Stefan
Thanks, Stefan, your approach worked.
Kirt wrote:
[...]
A SAX parser may deliver a single text node through any number of calls to
the characters method, so you need to accumulate everything you receive in
characters and output the text only when you get a notification other than
characters.
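A small sketch can make this visible by recording each piece that characters() receives. It assumes Python 3 and the standard xml.sax module; ChunkRecorder is a made-up name, and the number of pieces depends on the parser, so the code does not rely on it.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class ChunkRecorder(ContentHandler):
    """Records every piece of text reported inside <dirname>."""

    def __init__(self):
        super().__init__()
        self.inside = False
        self.chunks = []

    def startElement(self, name, attrs):
        if name == 'dirname':
            self.inside = True

    def characters(self, content):
        if self.inside:
            self.chunks.append(content)   # one entry per characters() call

    def endElement(self, name):
        if name == 'dirname':
            self.inside = False


recorder = ChunkRecorder()
xml.sax.parseString(b'<dirname>bye w&amp;y</dirname>', recorder)
print(recorder.chunks)             # the split is up to the parser
print(''.join(recorder.chunks))    # joining the pieces restores the full text
```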
Best Regards,
George
---------------------------------------------------------------------
George Cristian Bina
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com
Kirt wrote:
[...]
Note that any good SAX tutorial will demonstrate how to buffer the
characters() events, if you don't feel like reinventing the solution
yourself.
Kirt wrote:
I have walked a directory and written the following XML document.
One of the folders had an "&" character in its name, so I replaced it with "&amp;".
#------------------test1.xml
<Directory>
<dirname>C:\Documents and Settings\Administrator\Desktop\1\bye w&amp;y </dirname>
[...]
#-----------------------------------------------------------------------------
I run the code as: $ python art.py test1.xml
I am getting this output:
C:\Documents and Settings\Administrator\Desktop\1\bye w
&
y
C:\Documents and Settings\Administrator\Desktop\1\hii wx
whereas I need output that looks like this:
C:\Documents and Settings\Administrator\Desktop\1\bye w&y
C:\Documents and Settings\Administrator\Desktop\1\hii wx
This SAX filter is another way of doing it: http://aspn.activestate.com/ASPN/Coo.../Recipe/265881
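The general idea of a text-merging SAX filter can be sketched as follows. This is not the code from the linked recipe, only an illustration built on xml.sax.saxutils.XMLFilterBase under the assumption of Python 3; TextCoalescer, DirnameCollector and the sample document are hypothetical, and only startElement/endElement are treated as flush points.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class TextCoalescer(XMLFilterBase):
    """Merges adjacent characters() events into one before forwarding."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._pending = []

    def _flush(self):
        if self._pending:
            super().characters(''.join(self._pending))
            self._pending = []

    def characters(self, content):
        self._pending.append(content)     # hold text back until a tag arrives

    def startElement(self, name, attrs):
        self._flush()
        super().startElement(name, attrs)

    def endElement(self, name):
        self._flush()
        super().endElement(name)


class DirnameCollector(ContentHandler):
    """Downstream handler that can now trust one characters() call per text."""

    def __init__(self):
        super().__init__()
        self.in_dirname = False
        self.names = []

    def startElement(self, name, attrs):
        self.in_dirname = (name == 'dirname')

    def characters(self, content):
        if self.in_dirname:
            self.names.append(content)

    def endElement(self, name):
        self.in_dirname = False


# Hypothetical sample with an assumed <root> wrapper.
SAMPLE = b'<root><dirname>bye w&amp;y</dirname><dirname>hii wx</dirname></root>'

collector = DirnameCollector()
coalescer = TextCoalescer(xml.sax.make_parser())
coalescer.setContentHandler(collector)
coalescer.parse(io.BytesIO(SAMPLE))
print(collector.names)
```

The advantage of a filter is that the original handler can stay simple, because the buffering lives in one reusable place in front of it.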
Stefan