Question about using SHIFT-JIS encoding with libxml2

Hi,

I am using libxml2 for xml parsing. When the client application sends
data to libxml2 in UTF-8 format, it works fine.

But I have a scenario in which the client application sends data to
libxml2 parser in SHIFT-JIS format.

The following error is thrown by libxml2:

"Parsing error in results: Input is not proper UTF-8, indicate
encoding !"

In libxml2 documentation at http://www.xmlsoft.org/encoding.html I
read that libxml2 can support any encoding by calling the
xmlSwitchEncoding() routine.
What do I have to do to make libxml2 support SHIFT-JIS format? I want
to continue supporting UTF-8 also.
Thanks,
Saumya

Apr 10 '07 #1
sa************@gmail.com wrote:
> But, I have a scenario in which the client application sends data to
> libxml2 parser in SHIFT-JIS format.
>
> The following error is thrown by libxml2 -
>
> "Parsing error in results: Input is not proper UTF-8, indicate
> encoding !"

Does the XML contain an XML declaration indicating the encoding e.g.
<?xml version="1.0" encoding="SHIFT-JIS"?>
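
To make the question concrete, here is a minimal C sketch that pulls the encoding name out of an XML declaration in a raw byte buffer. The helper `declared_encoding` is hypothetical and not part of libxml2; it assumes the declaration sits at the very start of the buffer and that the encoding is ASCII-compatible for those bytes, which holds for both UTF-8 and Shift_JIS. Note that the IANA-registered name is `Shift_JIS`; `SHIFT-JIS` is a spelling that some converters accept as an alias, so it may be worth checking which name the sender really emits.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a libxml2 function: copy the value of the
 * "encoding" pseudo-attribute of a leading XML declaration into out.
 * Returns 1 if a value was found, 0 otherwise (no declaration, no
 * encoding pseudo-attribute, or out too small for the name). */
int declared_encoding(const char *buf, size_t len, char *out, size_t outcap)
{
    static const char key[] = "encoding";
    const size_t keylen = sizeof key - 1;
    size_t end = 0, i;

    if (outcap == 0 || len < 5 || memcmp(buf, "<?xml", 5) != 0)
        return 0;

    /* The declaration ends at the first "?>". */
    while (end + 1 < len && !(buf[end] == '?' && buf[end + 1] == '>'))
        end++;
    if (end + 1 >= len)
        return 0;

    for (i = 5; i + keylen < end; i++) {
        size_t p, n = 0;
        char quote;

        if (memcmp(buf + i, key, keylen) != 0)
            continue;

        /* Expect: optional space, '=', optional space, quoted value. */
        p = i + keylen;
        while (p < end && (buf[p] == ' ' || buf[p] == '\t'))
            p++;
        if (p >= end || buf[p] != '=')
            return 0;
        p++;
        while (p < end && (buf[p] == ' ' || buf[p] == '\t'))
            p++;
        if (p >= end || (buf[p] != '"' && buf[p] != '\''))
            return 0;
        quote = buf[p++];

        while (p < end && buf[p] != quote && n + 1 < outcap)
            out[n++] = buf[p++];
        if (p >= end || buf[p] != quote)
            return 0;
        out[n] = '\0';
        return 1;
    }
    return 0;
}
```

Logging the result of such a check on the bytes the client actually sends would show whether the declaration survives transport, or whether it is stripped or rewritten before the parser sees it.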

--

Martin Honnen
http://JavaScript.FAQTs.com/
Apr 10 '07 #2
> Does the XML contain an XML declaration indicating the encoding e.g.
> <?xml version="1.0" encoding="SHIFT-JIS"?>

Yes, it does. I thought that should be enough to tell the libxml2
parser that the encoding format is SHIFT-JIS.
Does libxml2 support SHIFT-JIS encoding ? I want to keep the support
for UTF-8 intact too. Is it possible?
Does libxml2 convert SHIFT-JIS to UTF-8 internally if it is supplied
in XML declaration as above?

Thanks,
Saumya

On Apr 10, 7:20 pm, Martin Honnen <ma*******@yahoo.de> wrote:
> sa************@gmail.com wrote:
>> But, I have a scenario in which the client application sends data to
>> libxml2 parser in SHIFT-JIS format.
>>
>> The following error is thrown by libxml2 -
>>
>> "Parsing error in results: Input is not proper UTF-8, indicate
>> encoding !"
>
> Does the XML contain an XML declaration indicating the encoding e.g.
> <?xml version="1.0" encoding="SHIFT-JIS"?>
>
> --
> Martin Honnen
> http://JavaScript.FAQTs.com/

Apr 11 '07 #3
On Tue, 10 Apr 2007 22:13:25 -0700, sa************@gmail.com scripst:
> Yes, it does. I thought that should be enough to tell the libxml2
> parser that the encoding format is SHIFT-JIS. Does libxml2 support
> SHIFT-JIS encoding ? I want to keep the support for UTF-8 intact too. Is
> it possible? Does libxml2 convert SHIFT-JIS to UTF-8 internally if it is
> supplied in XML declaration as above?

This looks promising (and yes, do read both referenced tutorials)
http://xmlsoft.org/encoding.html

Matej
Apr 11 '07 #4
sa************@gmail.com wrote:
> Does libxml2 support SHIFT-JIS encoding ?

I don't know offhand. Find its documentation?

> Does libxml2 convert SHIFT-JIS to UTF-8 internally if it is supplied
> in XML declaration as above?

Most Java-based XML processors actually convert to UTF-16 internally,
since that's a native character representation in Java. I don't know
what libxml2 is using, but I would expect they're doing something
similar -- convert to some standardized internal form, process that,
then convert back. Some tools have tried to avoid the double conversion
when data is being passed straight through, but recognizing and taking
advantage of that optimization is not easy.
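
The "convert to a standard internal form" step can be sketched in C with POSIX iconv. This is only an illustration of the idea, not libxml2's own code: `convert_bytes` is a hypothetical helper, and the encoding names your iconv accepts (for example `SHIFT_JIS`) depend on the platform.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, not a libxml2 function: convert srclen bytes of
 * src from encoding `from` to encoding `to`, writing at most dstcap
 * bytes to dst and storing the number of bytes produced in *dstlen.
 * Returns 0 on success, -1 on failure (unknown encoding name, invalid
 * or truncated input, or output buffer too small). */
int convert_bytes(const char *to, const char *from,
                  const char *src, size_t srclen,
                  char *dst, size_t dstcap, size_t *dstlen)
{
    iconv_t cd = iconv_open(to, from);
    char *in = (char *)src;   /* iconv takes char **; input is only read */
    char *out = dst;
    size_t inleft = srclen;
    size_t outleft = dstcap;
    int rc = 0;

    if (cd == (iconv_t)-1)
        return -1;            /* this iconv does not know the pair */

    if (iconv(cd, &in, &inleft, &out, &outleft) == (size_t)-1)
        rc = -1;
    /* Flush any pending shift state (matters for stateful encodings). */
    else if (iconv(cd, NULL, NULL, &out, &outleft) == (size_t)-1)
        rc = -1;

    iconv_close(cd);
    if (rc == 0)
        *dstlen = dstcap - outleft;
    return rc;
}
```

With an iconv that knows Shift_JIS, calling `convert_bytes("UTF-8", "SHIFT_JIS", "\x93\xFA", 2, ...)` should turn the two Shift_JIS bytes for U+65E5 into the three UTF-8 bytes E6 97 A5. Swapping the first two arguments gives the "convert back" direction on output.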

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Apr 11 '07 #5
On Apr 11, 7:13 am, "saumya.agar...@gmail.com"
<saumya.agar...@gmail.com> wrote:
> Does libxml2 support SHIFT-JIS encoding ? I want to keep the support
> for UTF-8 intact too. Is it possible?

For what it's worth, the source code contains the following (in
version 2.6.27):

    case XML_CHAR_ENCODING_2022_JP:
        __xmlErrEncoding(ctxt, XML_ERR_UNSUPPORTED_ENCODING,
                         "encoding not supported %s\n",
                         BAD_CAST "ISO-2022-JP", NULL);
        break;
    case XML_CHAR_ENCODING_SHIFT_JIS:
        __xmlErrEncoding(ctxt, XML_ERR_UNSUPPORTED_ENCODING,
                         "encoding not supported %s\n",
                         BAD_CAST "Shift_JIS", NULL);
        break;
    case XML_CHAR_ENCODING_EUC_JP:
        __xmlErrEncoding(ctxt, XML_ERR_UNSUPPORTED_ENCODING,
                         "encoding not supported %s\n",
                         BAD_CAST "EUC-JP", NULL);
        break;

Apr 12 '07 #6
"Arndt Jonasson" <ar************@gmail.com> writes:
> For what it's worth, the source code contains the following (in
> version 2.6.27):

However, according to the webpage (the link I sent to this
thread), libxml2 can use iconv and all of its supported codepages
(i.e., whatever you have ever dreamed about).
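
If iconv is what provides the conversion, then whether Shift_JIS works likely depends on the iconv your libxml2 was built against. A quick way to probe that on a given machine is to ask iconv directly. The helper name `iconv_can_decode` is made up for this sketch, and it only tests the system iconv, not libxml2 itself (which must also have been compiled with iconv support).

```c
#include <iconv.h>

/* Hypothetical helper: returns 1 if the system iconv can convert from
 * the named encoding to UTF-8, 0 if it does not recognize the name. */
int iconv_can_decode(const char *name)
{
    iconv_t cd = iconv_open("UTF-8", name);

    if (cd == (iconv_t)-1)
        return 0;
    iconv_close(cd);
    return 1;
}
```

Trying the spellings `"Shift_JIS"`, `"SHIFT_JIS"` and `"SHIFT-JIS"` in turn shows which names the local iconv accepts, which in turn suggests what the XML declaration should say. UTF-8 input is unaffected, since it needs no conversion.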

Matej
Apr 12 '07 #7
