
December 28th, 2006, 07:05 PM
xml bug?
I am using the standard xml library to build another library able to
read, and maybe write, XMP files.
Then an xml library bug popped up:
xml.dom.minidom was unable to parse an XML file that came from an
example provided by an official organization (http://www.iptc.org/IPTC4XMP).
The parsed file was somewhat hairy, but I have been able to reproduce
the bug with a simplified version, which goes:
<?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
<x:xmpmeta xmlns:x='adobe:ns:meta/' x:xmptk='XMP toolkit 3.0-28,
framework 1.6'>
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
xmlns:iX='http://ns.adobe.com/iX/1.0/'>
<rdf:Description rdf:about='uuid:f5b64178-9394-11d9-bb8e-a67e6693b6e9'
xmlns:xmpPLUS='XMP Photographic Licensing Universal System (xmpPLUS,
http://ns.adobe.com/xap/1.0/PLUS/)'>
<xmpPLUS:CreditLineReq>False</xmpPLUS:CreditLineReq>
<xmpPLUS:ReuseAllowed>False</xmpPLUS:ReuseAllowed>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
<?xpacket end='w'?>
The offending part is the one that goes: xmpPLUS='....'
It triggers an exception: ValueError: too many values to unpack,
in _parse_ns_name. Some debugging showed an obvious mistake
in the scanning of the name argument, which goes beyond the closing
" ' ".
I'm aware I don't give enough material here to allow a full understanding
of the bug. But this is not the place for that, and that's not my point.
Now my points are:
- How do I spot the version of a given library? There is a __version__
  attribute of the module, is that it?
- How do I access the bug list of a given library? Maybe this one is known
  and about to be fixed, in which case it would be useless to report it.
- How do I report bugs in a standard lib?
- I tried to copy the lib somewhere and put it BEFORE the official lib in
  "the path" (that is: sys.path), but the stack shown by the traceback
  still shows the original files being used. Is there a special
  mechanism bypassing the sys.path search for standard libs? (I may
  be wrong on this, it seems hard to believe...)
- Does someone know a good tool to validate an XML file?
btw, my code:
    from nxml.dom import minidom
    ....
    class whatever:
        def __init__(self, inStream):
            xmldoc = minidom.parse(inStream)
Thanks for any help...
December 28th, 2006, 07:25 PM
Re: xml bug?
"Imbaud Pierre" <pierre.imbaud@laposte.net> wrote in message
news:4594130c$0$318$426a74cc@news.free.fr...
> Now my points are:
> - How do I spot the version of a given library? There is a __version__
>   attribute of the module, is that it?
Yes, the module maintainer should be incrementing this version for each new
release, so it should correspond to the actual revision of the code.
> - How do I access the bug list of a given library? Maybe this one is known
>   and about to be fixed, in which case it would be useless to report it.
Not exactly sure, but this is probably a good place to start: http://docs.python.org/modindex.html
> - How do I report bugs in a standard lib?
I found this link: http://sourceforge.net/tracker/?grou...70&atid=105470
by looking under the "help" item at www.python.org (an excellent starting
place for all sorts of things).
> - I tried to copy the lib somewhere and put it BEFORE the official lib in
>   "the path" (that is: sys.path), but the stack shown by the traceback
>   still shows the original files being used. Is there a special
>   mechanism bypassing the sys.path search for standard libs? (I may
>   be wrong on this, it seems hard to believe...)
My understanding is that sys.path is searched in order. The first entry is
usually the empty string, interpreted to mean the current directory. If you
modify sys.path to put the directory containing your modified code in front
of where the standard library is found, your code should be the one used.
Is that not the case?
> - Does someone know a good tool to validate an XML file?
Typing "XML validator" into Google returns a bunch. I think I would start
with the one at w3.org: http://validator.w3.org/
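For a quick local check, the expat parser that ships with Python can at least report whether a file is well-formed. The following is a minimal sketch, not a full validator: it does not check a document against a DTD or schema, and the function name `check_well_formed` is made up for illustration.

```python
import xml.parsers.expat


def check_well_formed(text):
    """Return None if text is well-formed XML, else a short error description."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True: this is the final chunk of input
    except xml.parsers.expat.ExpatError as exc:
        # ExpatError carries the position of the first problem it found.
        return "line %d, column %d: %s" % (exc.lineno, exc.offset, exc)
    return None


good = check_well_formed("<a><b/></a>")
bad = check_well_formed("<a><b></a>")  # <b> is never closed
```

Here `good` is None and `bad` holds a message pointing at the mismatched tag. Namespace rules are not checked in this mode, so a file can pass this test and still be rejected by a namespace-aware parser.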
December 28th, 2006, 09:55 PM
Re: xml bug?
Erik Johnson wrote:
> "Imbaud Pierre" <pierre.imbaud@laposte.net> wrote in message
> news:4594130c$0$318$426a74cc@news.free.fr...
>> Now my points are:
>> - How do I spot the version of a given library? There is a __version__
>>   attribute of the module, is that it?
> Yes, the module maintainer should be incrementing this version for each new
> release, so it should correspond to the actual revision of the code.
>> - How do I access the bug list of a given library? Maybe this one is known
>>   and about to be fixed, in which case it would be useless to report it.
> Not exactly sure, but this is probably a good place to start: http://docs.python.org/modindex.html
This indexes the modules, not the bug list!
But python.org was the right entry point: it sent me to the bug
tracker: http://sourceforge.net/tracker/?grou...70&atid=105470
It's a bit short on explanations... And I found unsolved issues,
3 years old!
>> - How do I report bugs in a standard lib?
> I found this link: http://sourceforge.net/tracker/?grou...70&atid=105470
> by looking under the "help" item at www.python.org (an excellent starting
> place for all sorts of things).
Right! Same place to fetch and to submit. Fair.
>> - I tried to copy the lib somewhere and put it BEFORE the official lib in
>>   "the path" (that is: sys.path), but the stack shown by the traceback
>>   still shows the original files being used. Is there a special
>>   mechanism bypassing the sys.path search for standard libs? (I may
>>   be wrong on this, it seems hard to believe...)
> My understanding is that sys.path is searched in order. The first entry is
> usually the empty string, interpreted to mean the current directory. If you
> modify sys.path to put the directory containing your modified code in front
> of where the standard library is found, your code should be the one used.
> Is that not the case?
I put it in front, as for the unix PATH...
>> - Does someone know a good tool to validate an XML file?
> Typing "XML validator" into Google returns a bunch. I think I would start
> with the one at w3.org: http://validator.w3.org/
I'll try this. Thanks a lot, my friend!
December 29th, 2006, 01:05 AM
Re: xml bug?
At Thursday 28/12/2006 15:58, Imbaud Pierre wrote:
> The offending part is the one that goes: xmpPLUS='....'
> it triggers an exception: ValueError: too many values to unpack,
> in _parse_ns_name. Some debugging showed an obvious mistake
> in the scanning of the name argument, that goes beyond the closing
> " ' ".
>
> Now my points are:
> - how do I spot the version of a given library? There is a __version__
>   attribute of the module, is that it?
Usually, yes. But it's not required at all, and it may have another
name. Look at the offending module.
> - I tried to copy the lib somewhere, put it BEFORE the official lib in
>   "the path" (that is: sys.path), the stack shown by the traceback
>   still shows the original files being used. Is there a special
>   mechanism bypassing the sys.path search, for standard libs? (I may
>   be wrong on this, it seems hard to believe...)
When the module is inside a package, as in this case, it's a bit
harder. The code says `import xml.dom.modulename`, not `import
modulename`. So even if you put modulename.py earlier in the path, it
won't be found.
Some alternatives:
- Modify the library in place. It's the easiest way if you don't
  redistribute your code.
- Same as above but using an installer (checking version numbers, of course).
- "Monkey patching". That is, in a new module of your own, imported
  early in your application, write the corrected version of the offending method:
      def _parse_ns_name(...):
          ...doing the right thing...
      from xml.dom import modulename
      modulename._parse_ns_name = _parse_ns_name
  (maybe checking version numbers too)
--
Gabriel Genellina
Softlab SRL
December 30th, 2006, 12:55 AM
Re: xml bug?
Imbaud Pierre wrote:
> - How do I spot the version of a given library? There is a __version__
>   attribute of the module, is that it?
Contrary to what others have said: for modules included in the standard
library (and if using these modules, rather than using PyXML), you
should use sys.version_info to identify a version.
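Both approaches mentioned in the thread can be tried side by side. This is a small sketch; `describe_versions` is a hypothetical helper name, and the `getattr` default is there because a module is not required to define `__version__`.

```python
import sys
import xml.dom.minidom


def describe_versions():
    """Collect the two kinds of version information discussed above."""
    return {
        # Version of the interpreter, and therefore of its standard library.
        "python": tuple(sys.version_info[:3]),
        # Module-level version string, or None when the module has none.
        "minidom": getattr(xml.dom.minidom, "__version__", None),
    }


versions = describe_versions()
```

For a standard library module, the `"python"` entry is the value worth quoting in a bug report.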
> - How do I access the bug list of a given library? Maybe this one is known
>   and about to be fixed, in which case it would be useless to report it.
Others have already pointed you to SF.
> - How do I report bugs in a standard lib?
Likewise.
> - I tried to copy the lib somewhere and put it BEFORE the official lib in
>   "the path" (that is: sys.path), but the stack shown by the traceback
>   still shows the original files being used. Is there a special
>   mechanism bypassing the sys.path search for standard libs? (I may
>   be wrong on this, it seems hard to believe...)
Which lib? "minidom.py"? Well, you are likely importing
"xml.dom.minidom", not "minidom". So adding another minidom.py
to a directory in sys.path won't help.
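To see which files are actually in use, the interpreter can be asked directly. This is a sketch using only standard attributes; the variable names are illustrative.

```python
import xml
import xml.dom.minidom

# sys.path is consulted to find the top-level package "xml" ...
package_dirs = list(xml.__path__)
# ... and submodules such as xml.dom.minidom are then looked up inside the
# package's own directories, not on sys.path again.
minidom_file = xml.dom.minidom.__file__
```

If `minidom_file` still points into the standard library after `sys.path` was changed, the modified copy was not picked up.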
Regards,
Martin
December 30th, 2006, 12:55 AM
Re: xml bug?
Imbaud Pierre wrote:
> And I found unsolved issues, 3 years old!
That's true, and likely to grow. Contributions are welcome!
Regards,
Martin
December 30th, 2006, 10:35 PM
Re: xml bug?
Martin v. Löwis wrote:
> Imbaud Pierre wrote:
>> - How do I spot the version of a given library? There is a __version__
>>   attribute of the module, is that it?
> Contrary to what others have said: for modules included in the standard
> library (and if using these modules, rather than using PyXML), you
> should use sys.version_info to identify a version.
>> - How do I access the bug list of a given library? Maybe this one is known
>>   and about to be fixed, in which case it would be useless to report it.
> Others have already pointed you to SF.
>> - How do I report bugs in a standard lib?
> Likewise.
>> - I tried to copy the lib somewhere and put it BEFORE the official lib in
>>   "the path" (that is: sys.path), but the stack shown by the traceback
>>   still shows the original files being used. Is there a special
>>   mechanism bypassing the sys.path search for standard libs? (I may
>>   be wrong on this, it seems hard to believe...)
> Which lib? "minidom.py"? Well, you are likely importing
> "xml.dom.minidom", not "minidom". So adding another minidom.py
> to a directory in sys.path won't help.
> Regards,
> Martin
I did import xml!
Maybe my mistake came from copying the whole tree from the standard
lib, including the .pyc and .pyo files... maybe the .pyc files contained
references to the previous sources?
I got rid of these, reloaded ALL the modules, then exited and re-entered
the interpreter (ipython, btw...), and it eventually picked up the new
modules...
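The behaviour that finally worked (a copy of the whole package placed first on sys.path, in a fresh interpreter) can be tried on throwaway packages. Everything below is invented for illustration, including the package name `shadow_demo_pkg`; two temporary directories stand in for the standard library and the modified copy.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, marker):
    """Create root/shadow_demo_pkg/mod.py defining WHO = marker."""
    pkg_dir = os.path.join(root, "shadow_demo_pkg")
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, "mod.py"), "w") as handle:
        handle.write("WHO = %r\n" % (marker,))


official_dir = tempfile.mkdtemp()  # stands in for the standard library
patched_dir = tempfile.mkdtemp()   # stands in for the modified copy
make_package(official_dir, "official")
make_package(patched_dir, "patched")

sys.path.insert(0, official_dir)
sys.path.insert(0, patched_dir)    # searched first from now on
importlib.invalidate_caches()

import shadow_demo_pkg.mod  # first import: resolved in sys.path order

who = shadow_demo_pkg.mod.WHO
```

The copy in `patched_dir` wins because the whole package was copied and it was first on the path before the first import. A package that is already imported stays cached in `sys.modules`, so restarting the interpreter is the simplest way to make a path change take effect.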
Btw, I pushed debugging further: the bug seems to stem from C code,
hence nothing easy to fix... I'll indeed submit a bug.
Thanks for your help! I obviously screamed for help before being
helpless, apologies...
January 8th, 2007, 04:15 PM
closed issue
I submitted a bug to SourceForge. I was told (pretty fast) that the file
I dealt with was the buggy part. I then submitted a bug to the file's
author, who agreed and fixed it. End of the story.
All I could complain about, with the xml.dom library, is how obscure
the exception context was: I did violate SOME XML rule; ideally the
exception should show the rule and the faulty piece of data. But I
know this has a cost, both in runtime and in developer time.
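The thread never names the rule that was violated. One plausible reading, offered here as an assumption, is that the xmlns:xmpPLUS value is a descriptive sentence containing spaces rather than a plain URI, which minidom's namespace handling cannot cope with. The sketch below uses a made-up, reduced input (`SAMPLE`) and a hypothetical helper (`try_parse`). The exception type can differ between Python and expat versions, so both are caught.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical reduced input: the namespace "URI" contains spaces,
# like the xmlns:xmpPLUS value in the example file.
SAMPLE = (
    "<p:doc xmlns:p='Some Licensing System (http://example.org/ns/)'>"
    "<p:flag>False</p:flag></p:doc>"
)
FIXED = "<p:doc xmlns:p='http://example.org/ns/'><p:flag>False</p:flag></p:doc>"


def try_parse(text):
    """Return (True, None) on success, else (False, 'ErrorType: message')."""
    try:
        minidom.parseString(text)
    except (ValueError, ExpatError) as exc:
        return False, "%s: %s" % (type(exc).__name__, exc)
    return True, None


broken_result = try_parse(SAMPLE)
fixed_result = try_parse(FIXED)
```

With the spaces removed from the namespace value, the same document parses, which fits the outcome reported above: the file was corrected rather than the library.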