xml.dom - reading from a file

Is the way to use DOM for an xml file as follows:
1) Read the file into a string
2) Call xml.dom.minidom.parseString(string)

Jul 18 '05 #1
sashan wrote:
Is the way to use DOM for an xml file as follows:
1) Read the file into a string
2) Call xml.dom.minidom.parseString(string)


It's one way, but xml.dom.minidom.parse(f) is generally better. f can
be a filename OR a file object open for reading.
Alex
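
For readers comparing the two routes, here is a minimal sketch in current Python 3 of the three call forms mentioned above: parse() with a filename, parse() with an open file object, and parseString() with the text itself. The sample XML, the temporary file, and the item_texts helper are assumptions made up for the illustration; only minidom.parse and minidom.parseString come from the thread.

```python
import os
import tempfile
from xml.dom import minidom

# Hypothetical sample document, used only for this illustration.
XML_TEXT = "<root><item>one</item><item>two</item></root>"

# Write the sample to a temporary file so there is something to parse by name.
with tempfile.NamedTemporaryFile(
    "w", suffix=".xml", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(XML_TEXT)
    path = tmp.name

try:
    # 1) parse() given a filename
    doc_from_name = minidom.parse(path)

    # 2) parse() given a file object open for reading
    with open(path, "rb") as fh:
        doc_from_file = minidom.parse(fh)

    # 3) parseString() given the document text
    doc_from_string = minidom.parseString(XML_TEXT)
finally:
    os.remove(path)


def item_texts(doc):
    """Return the text of every <item> element (helper for this example)."""
    return [node.firstChild.data for node in doc.getElementsByTagName("item")]
```

All three documents should contain the same elements; parse() simply saves the step of reading the file into a string yourself.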

Jul 18 '05 #2
On Mon, 17 Nov 2003 09:54:37 GMT, Alex Martelli <al***@aleax.it> wrote:
sashan wrote:
Is the way to use DOM for an xml file as follows:
1) Read the file into a string
2) Call xml.dom.minidom.parseString(string)


It's one way, but xml.dom.minidom.parse(f) is generally better. f can
be a filename OR a file object open for reading.

That reminds me ...

Is there a BDFL pronouncement or dev consensus on implementation of accepting
either filename or file-object?

E.g., should one
    assert type(filename) is str
or
    assert isinstance(filename, str)
or
    ??

and is the file object alternative

    assert isinstance(filename, file)  # too restrictive IMO
or
    assert hasattr(filename, 'read') and callable(filename.read)  # what about next?
or
    ??

I guess the generic idea is that filename-when-it-is-a-file-object will be bound to
something that produces a sequence of strings, so shouldn't an iterator/generator
be acceptable as well? (E.g., I expect generator expressions will be handy for
test inputs etc.)

So should one look for a next method?

And, given a generic source of string chunks (must they be str instances or could they
be generator chunks recursively?) is there a blessed efficient wrapper function that will
convert the str chunk stream to an object that can fake a file instance more completely
(e.g., for readline etc.)?
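
As far as I know there is no single blessed wrapper for this, but in current Python 3 the io module gets fairly close. The following is only a sketch under that assumption: ChunkReader and as_text_file are hypothetical names, and the chunks are assumed to be plain str objects (not nested generators). It feeds the chunks through io.RawIOBase so that io.BufferedReader and io.TextIOWrapper can supply readline() and friends.

```python
import io


class ChunkReader(io.RawIOBase):
    """Raw byte stream backed by an iterator of bytes chunks (hypothetical helper)."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self._leftover = b""

    def readable(self):
        return True

    def readinto(self, buffer):
        # Pull chunks until we have some data, skipping empty chunks.
        while not self._leftover:
            try:
                self._leftover = next(self._chunks)
            except StopIteration:
                return 0  # end of stream
        n = min(len(buffer), len(self._leftover))
        buffer[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def as_text_file(str_chunks, encoding="utf-8"):
    """Wrap an iterable of str chunks in a read-only, file-like text stream."""
    byte_chunks = (chunk.encode(encoding) for chunk in str_chunks)
    raw = ChunkReader(byte_chunks)
    return io.TextIOWrapper(io.BufferedReader(raw), encoding=encoding)
```

Usage would look like f = as_text_file(["line one\nli", "ne two\n"]) followed by f.readline(). For small test inputs, io.StringIO("".join(chunks)) is simpler, at the cost of materializing everything up front.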

In fact, why not a standard function to convert this kind of either-or argument into
a file instance proxy? Then policy and behavior could be standardized, and people
wouldn't be wondering and re-inventing wheel variants.

Let's see...
>>> vars(file).keys()
['softspace', 'encoding', 'xreadlines', 'readlines', 'flush', 'close', 'seek', '__init__',
 'newlines', '__setattr__', '__new__', 'readinto', 'next', 'write', 'closed', 'tell', 'mode',
 'isatty', 'truncate', 'read', '__getattribute__', '__iter__', 'readline', 'fileno',
 'writelines', 'name', '__doc__', '__delattr__', '__repr__']

Well, probably not all of that (except maybe for nice error messages). Maybe the wrapping function
should accept keyword arguments for 'strict' vs 'warn', and perhaps an optional callback as an
alternative to raising an exception?

Oh, and what about when the arg is already a standard file instance? Should e.g., mode be
overridable when that is feasible?

In summary, the proposed goal is to make usage a no-brainer by providing a standard wrapping function
for file-or-filename args.
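
To make the proposal concrete, here is one possible shape for such a wrapper, written as a sketch in current Python 3 rather than as any standard-library function. The name opened, its parameters, and the policy it applies (close only what it opened itself, accept anything with a read attribute) are all assumptions for the illustration.

```python
import contextlib
import os


@contextlib.contextmanager
def opened(source, mode="r", encoding=None):
    """Yield a readable file object for a path or an already-open file.

    Hypothetical helper: files opened here are closed on exit; file
    objects passed in by the caller are left open for the caller.
    """
    if isinstance(source, (str, bytes, os.PathLike)):
        # Treat it as a filename and manage the file's lifetime ourselves.
        with open(source, mode, encoding=encoding) as fh:
            yield fh
    elif hasattr(source, "read"):
        # Treat it as a file-like object; the mode argument is ignored here.
        yield source
    else:
        raise TypeError(
            "expected a path or an object with read(), got %r" % type(source)
        )
```

A function taking a file-or-filename argument could then start with "with opened(arg) as fh:" and work with fh in both cases. Whether the mode of an existing file object should be overridable is left open here, as it is in the post.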

Regards,
Bengt Richter
Jul 18 '05 #3
In article <bp**********@216.39.172.122>, Bengt Richter wrote:
On Mon, 17 Nov 2003 09:54:37 GMT, Alex Martelli <al***@aleax.it> wrote:
sashan wrote:
Is the way to use DOM for an xml file as follows:
1) Read the file into a string
2) Call xml.dom.minidom.parseString(string)


It's one way, but xml.dom.minidom.parse(f) is generally better. f can
be a filename OR a file object open for reading.

That reminds me ...

Is there a BDFL pronouncement or dev consensus on implementation of accepting
either filename or file-object?

[snip]

The standard in such cases is usually the "leap before you look"
idiom, I should think, using try/except and catching signature-related
exceptions. In this case you might try to call read() and revert to
opening the file if there is no read method.
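
A minimal sketch of that idiom in current Python 3 might look like the following. The function name read_all is made up for the example, and the try block is kept narrow (only the attribute lookup) so that an AttributeError raised inside a real read() call is not mistaken for "this is a filename".

```python
def read_all(source):
    """Return the full contents of a file-like object or of the file at a path."""
    try:
        read = source.read  # works for anything file-like
    except AttributeError:
        # No read method: fall back to treating source as a filename.
        with open(source) as fh:
            return fh.read()
    return read()
```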

--
Magnus Lie Hetland "In this house we obey the laws of
http://hetland.org thermodynamics!" Homer Simpson
Jul 18 '05 #4
On Tue, 18 Nov 2003 13:58:47 +0000 (UTC), Magnus Lie Hetland wrote:
In article <bp**********@216.39.172.122>, Bengt Richter wrote:
On Mon, 17 Nov 2003 09:54:37 GMT, Alex Martelli <al***@aleax.it> wrote:
sashan wrote:

Is the way to use DOM for an xml file as follows:
1) Read the file into a string
2) Call xml.dom.minidom.parseString(string)

It's one way, but xml.dom.minidom.parse(f) is generally better. f can
be a filename OR a file object open for reading.

That reminds me ...

Is there a BDFL pronouncement or dev consensus on implementation of
accepting either filename or file-object?

[snip]

The standard in such cases is usually the "leap before you look"
idiom, I should think, using try/except and catching signature-related
exceptions. In this case you might try to call read() and revert to
opening the file if there is no read method.


Another ordering would be to try turning the parameter into a file first,
without trying to read from it:

def my_function( f ) :
    try :
        f = file(f, "r")
    except TypeError : pass

    f.read()

If 'f' is a valid path, then you'll have an open file. If it is
already a file you'll get a type error ("coercing to Unicode: need
string or buffer, file found"). If it is neither, then the .read()
will fail.
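
The file() builtin used above no longer exists in Python 3, so for anyone trying this today, here is a sketch of the same ordering using open(), which also raises TypeError when handed a file object. The name read_text and the opened_here bookkeeping (so that a file opened inside the function gets closed) are additions for the example, not part of the original post. One caveat under this approach: open() also accepts integer file descriptors, so an int argument would be opened rather than rejected.

```python
def read_text(f):
    """Read everything from a path or from an already-open file object."""
    opened_here = False
    try:
        f = open(f, "r")  # succeeds if f is a valid path
        opened_here = True
    except TypeError:
        pass  # not a path; assume f is already a file-like object

    try:
        # Raises AttributeError if f is neither a path nor file-like.
        return f.read()
    finally:
        if opened_here:
            f.close()
```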

--
Q: What is the difference between open-source and commercial software?
A: If you have a problem with commercial software you can call a phone
number and they will tell you it might be solved in a future version.
For open-source software there isn't a phone number to call, but you
get the solution within a day.

www: http://dman13.dyndns.org/~dman/ jabber: dm**@dman13.dyndns.org
Jul 18 '05 #5
