"Invalid hexadecimal character reference" error when parsing an XML file with a SAX processor

Hi everyone,

I have created a simple SAX parser for a very simple XML file.

When I run the code that follows I get this error:

"Invalid hexadecimal character reference"

The strange thing is that if I change the chunk size of the data I send
to the parser, the line at which the error is reported changes.

I ran one more test with the chunk size set equal to the file size,
and I get the same error at the end of the file.

The same XML file processed with another language doesn't raise any
error.

I use PHP 5.2.3 on an AMP stack (AppServ Open Project 2.5.9 for Windows)
on a Windows Vista PC.

The code I have used follows:

public function create_parser($filename)
{
    $this->fp = fopen($filename, 'r');
    $this->fsize = filesize($filename);
    $this->parser = xml_parser_create();
    xml_set_element_handler($this->parser,
        'Parser::start_element', 'Parser::end_element');
    xml_set_character_data_handler($this->parser, 'Parser::char_data');
}

public function parse()
{
    //$blockSize = 4*1024;
    $blockSize = $this->fsize; // test: send the whole file as a single chunk
    echo 'File length: '.$this->fsize;
    while ($data = fread($this->fp, $blockSize))
    {
        //$data = str_replace('\n','',$data);
        if (!xml_parse($this->parser, $data, feof($this->fp)))
        {
            echo 'Parser error: ('.xml_get_current_byte_index($this->parser).') \''
                .xml_error_string($this->parser).'\' at line '
                .xml_get_current_line_number($this->parser).' at col '
                .xml_get_current_column_number($this->parser);
            return false;
        }
    }
    return true;
}
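
A note on the error-reporting call in parse(): the PHP manual documents xml_error_string() as taking a numeric error code, the value returned by xml_get_error_code(), rather than the parser itself, so the message printed by that call may not be the parser's actual error. The following is a minimal sketch, not a diagnosis of the problem. parse_in_chunks(), $good and $bad are hypothetical names introduced only for illustration, and the sample strings are assumed stand-ins for the real file. It feeds a string to the parser in fixed-size chunks and reports the first error using the code from xml_get_error_code().

```php
<?php
// Hypothetical helper (not from the original post): parse an XML string in
// fixed-size chunks and return details of the first error, or null on success.
function parse_in_chunks($xml, $chunkSize)
{
    $parser = xml_parser_create();
    $length = strlen($xml);
    for ($offset = 0; $offset < $length; $offset += $chunkSize) {
        $chunk   = substr($xml, $offset, $chunkSize);
        // Tell the parser when it is receiving the last chunk.
        $isFinal = ($offset + $chunkSize >= $length);
        if (!xml_parse($parser, $chunk, $isFinal)) {
            // xml_error_string() expects the numeric code from xml_get_error_code().
            $code = xml_get_error_code($parser);
            return array(
                'code'    => $code,
                'message' => xml_error_string($code),
                'line'    => xml_get_current_line_number($parser),
                'byte'    => xml_get_current_byte_index($parser),
            );
        }
    }
    return null;
}

// Assumed sample input: one well-formed document and one with an unclosed tag.
$good = '<dblp><author>Rafiul Ahad</author></dblp>';
$bad  = '<dblp><author>Rafiul Ahad</dblp>';

foreach (array(4, 16, 1024) as $size) {
    var_dump(parse_in_chunks($good, $size) === null);
    $error = parse_in_chunks($bad, $size);
    echo $size, ': ', $error['message'], ' (code ', $error['code'], ")\n";
}
```

Running it with several chunk sizes shows whether the reported code and message stay the same when only the chunk size changes.
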
A piece of the XML follows:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE dblp SYSTEM "dblp.dtd">
<dblp>
<incollection mdate="2002-01-03"
key="books/acm/kim95/AnnevelinkACFHK95">
<author>
Jurgen Annevelink
</author>
<author>
Rafiul Ahad
</author>
<author>
Amelia Carlson
</author>
<author>
Daniel H. Fishman
</author>
<author>
Michael L. Heytens
</author>
<author>

.....
The Industrial Information Technology Handbook
</booktitle>
<url>
db/books/collections/IITHandbook2005.html#SeyfarthK05
</url>
</incollection>
</dblp>

Aug 5 '08 #1