Expat 1.95.8 fails on XML with newline

P: n/a
I saw something similar on the sourceforge bugs list but it was from
2001 so I assume it's fixed by now.

O/S: WinXP SP2 and WinCE. Expat lib linked in VC++ 6 SP6.

I have the following XML (simplified for discussion purposes). The
braces mark where the XML starts and ends.

{

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
</soapenv:Body>
</soapenv:Envelope>}

NOTICE the newline(0A) at the beginning of the file. Now I use the
following C++ code to read from the XML file:

do
{
    size_t len = fread(buf, 1, sizeof(buf), xmlfile);
    done = len < sizeof(buf);
    if (XML_Parse(parser, buf, len, done) == XML_STATUS_ERROR)
    {
        return ReturnLua(State, 1, "Error while parsing.");
    }
} while (!done);

This works mint on all other XML files, but not with that one. This is
how the XML file is returned to me by a SOAP server. What happens is
that on the first pass through the loop, XML_Parse doesn't even call
the handler functions previously set; it instantly returns
XML_STATUS_ERROR, and the rest is history.

I would like to know if the error is in my file reading code or an
Expat bug. If it is the latter, is there a patch or quick fix? And if
it is my code, then what could I do to strip the initial newline(s)?

Thanks in advance!

Jeff Lambert
Jul 20 '05 #1
4 Replies


P: n/a
Jeff Lambert wrote:
{

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<soapenv:Body>
</soapenv:Body>
</soapenv:Envelope>}

NOTICE the newline(0A) at the beginning of the file.


This is not allowed. Nothing is allowed prior to the XML declaration.
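
As for the original question about stripping the initial newline(s): if the server cannot be fixed, one client-side workaround is to drop leading whitespace before the first bytes reach XML_Parse. The following is only a minimal sketch under stated assumptions. The names LeadingWhitespaceSkipper, skip and is_xml_space are hypothetical helpers, not part of Expat, and the code assumes the input is UTF-8 (or another ASCII-compatible encoding) with no byte order mark.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Expat): remembers whether we are still
// inside the run of whitespace that precedes the XML declaration.
struct LeadingWhitespaceSkipper {
    bool at_start = true;

    // XML 1.0 whitespace characters: space, tab, CR, LF.
    static bool is_xml_space(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    // Returns the number of bytes at the front of buf that should be
    // dropped. Once a non-whitespace byte has been seen, always returns 0,
    // so whitespace later in the document is left untouched.
    std::size_t skip(const char* buf, std::size_t len) {
        assert(buf != nullptr || len == 0);
        if (!at_start) return 0;
        std::size_t i = 0;
        while (i < len && is_xml_space(buf[i])) ++i;
        if (i < len) at_start = false;  // first real byte found
        return i;
    }
};

// Intended use inside the read loop from the question (assumption: same
// variables as in the original post; XML_Parse takes an int length):
//
//   LeadingWhitespaceSkipper skipper;
//   do {
//       size_t len = fread(buf, 1, sizeof(buf), xmlfile);
//       done = len < sizeof(buf);
//       size_t off = skipper.skip(buf, len);
//       if (XML_Parse(parser, buf + off, (int)(len - off), done)
//               == XML_STATUS_ERROR) { /* report error */ }
//   } while (!done);
```

The state flag is there because the leading whitespace could, in principle, fill the whole first chunk; in that case the next chunk is trimmed as well. Note that this only hides the problem: the response is still not well-formed as sent, so fixing it at the SOAP server is the cleaner solution.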
--
Johannes Koch
In te domine speravi; non confundar in aeternum.
(Te Deum, 4th cent.)
Jul 20 '05 #2

P: n/a

NOTICE the newline(0A) at the beginning of the file. Now I use the
following C++ code to read from the XML file:

That makes the file not well-formed, so any parser should reject it.
Harsh but fair :-)

David
Jul 20 '05 #3

P: n/a
Jeff Lambert wrote:
I would like to know if the error is in my file reading code or an
Expat bug. If it is the latter, is there a patch or quick fix? And if
it is my code, then what could I do to strip the initial newline(s)?


The given sample XML file (with whitespace before the xml declaration)
is processed without errors in my build of XMLgawk with Expat 1.95.8 on
Windows XP. You can find the sources at:

http://home.vrweb.de/~juergen.kahrs/gawk/XML/

You can download the source of "XMLpuller" and look at how it uses Expat.

NOTE: I work in Windows, but I've tested the input file with both Unix
(LF) and Windows (CR+LF) newlines.
--
To reply by e-mail, please remove the extra dot
in the given address: m.collado -> mcollado

Jul 20 '05 #4

P: n/a
Please disregard my previous (cancelled) post about correct processing
in Windows. I was wrong.

My apologies.
--
To reply by e-mail, please remove the extra dot
in the given address: m.collado -> mcollado

Jul 20 '05 #5
