Question about using SHIFT-JIS encoding with libxml2

Hi,

I am using libxml2 for XML parsing. When the client application sends
data to libxml2 in UTF-8 format, it works fine.

But I have a scenario in which the client application sends data to
the libxml2 parser in SHIFT-JIS format.

The following error is thrown by libxml2 -

"Parsing error in results: Input is not proper UTF-8, indicate
encoding !"

In the libxml2 documentation at http://www.xmlsoft.org/encoding.html I
read that libxml2 can support any encoding by calling the
xmlSwitchEncoding() routine.

What do I have to do to make libxml2 support SHIFT-JIS format? I want
to continue supporting UTF-8 also.

Thanks,
Saumya

Apr 10 '07 #1
sa************@gmail.com wrote:
But I have a scenario in which the client application sends data to
the libxml2 parser in SHIFT-JIS format.

The following error is thrown by libxml2 -

"Parsing error in results: Input is not proper UTF-8, indicate
encoding !"

Does the XML contain an XML declaration indicating the encoding, e.g.
<?xml version="1.0" encoding="SHIFT-JIS"?>
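
One quick check on the sending side is whether the bytes handed to the parser really begin with such a declaration. The following is only a minimal C sketch: declared_encoding() and skip_ws() are hypothetical helpers, not libxml2 functions, and the code assumes ASCII-compatible input (UTF-8 or Shift_JIS, for instance) with no byte-order mark in front.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Advance past spaces, tabs and line breaks, but never beyond `end`. */
size_t skip_ws(const char *buf, size_t pos, size_t end)
{
    while (pos < end && (buf[pos] == ' ' || buf[pos] == '\t' ||
                         buf[pos] == '\r' || buf[pos] == '\n'))
        pos++;
    return pos;
}

/*
 * Hypothetical helper: copy the value of the encoding pseudo-attribute
 * of a leading XML declaration into `out` (NUL-terminated).
 * Returns 1 on success, 0 if there is no declaration, no encoding
 * pseudo-attribute, or `out` is too small.
 */
int declared_encoding(const char *buf, size_t len, char *out, size_t outsz)
{
    assert(buf != NULL && out != NULL && outsz > 0);
    out[0] = '\0';

    /* The declaration must be the very first thing in the document. */
    if (len < 6 || memcmp(buf, "<?xml", 5) != 0)
        return 0;
    if (skip_ws(buf, 5, len) == 5)
        return 0;               /* e.g. "<?xml-stylesheet", not a declaration */

    /* Locate the "?>" that closes the declaration. */
    size_t end = 5;
    while (end + 1 < len && !(buf[end] == '?' && buf[end + 1] == '>'))
        end++;
    if (end + 1 >= len)
        return 0;               /* unterminated declaration */

    static const char key[] = "encoding";
    const size_t klen = sizeof key - 1;

    for (size_t i = 5; i + klen <= end; i++) {
        if (memcmp(buf + i, key, klen) != 0)
            continue;
        size_t j = skip_ws(buf, i + klen, end);
        if (j >= end || buf[j] != '=')
            return 0;
        j = skip_ws(buf, j + 1, end);
        if (j >= end || (buf[j] != '"' && buf[j] != '\''))
            return 0;
        const char quote = buf[j++];
        const size_t start = j;
        while (j < end && buf[j] != quote)
            j++;
        if (j >= end)
            return 0;           /* closing quote missing */
        const size_t n = j - start;
        if (n == 0 || n >= outsz)
            return 0;
        memcpy(out, buf + start, n);
        out[n] = '\0';
        return 1;
    }
    return 0;
}
```

Calling declared_encoding(data, size, name, sizeof name) just before the parse call and logging `name` would show whether the declaration survives whatever the client application does to the data first.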

--

Martin Honnen
http://JavaScript.FAQTs.com/
Apr 10 '07 #2
Does the XML contain an XML declaration indicating the encoding, e.g.
<?xml version="1.0" encoding="SHIFT-JIS"?>

Yes, it does. I thought that should be enough to tell the libxml2
parser that the encoding format is SHIFT-JIS.
Does libxml2 support SHIFT-JIS encoding? I want to keep the support
for UTF-8 intact too. Is it possible?
Does libxml2 convert SHIFT-JIS to UTF-8 internally if it is supplied
in the XML declaration as above?

Thanks,
Saumya

Apr 11 '07 #3
On Tue, 10 Apr 2007 22:13:25 -0700, sa************@gmail.com wrote:
Yes, it does. I thought that should be enough to tell the libxml2
parser that the encoding format is SHIFT-JIS. Does libxml2 support
SHIFT-JIS encoding? I want to keep the support for UTF-8 intact too. Is
it possible? Does libxml2 convert SHIFT-JIS to UTF-8 internally if it is
supplied in the XML declaration as above?

This looks promising (and yes, do read both referenced tutorials):
http://xmlsoft.org/encoding.html

Matej
Apr 11 '07 #4
sa************@gmail.com wrote:
Does libxml2 support SHIFT-JIS encoding?

I don't know offhand. Find its documentation?

Does libxml2 convert SHIFT-JIS to UTF-8 internally if it is supplied
in the XML declaration as above?

Most Java-based XML processors actually convert to UTF-16 internally,
since that's a native character representation in Java. I don't know
what libxml2 is using, but I would expect they're doing something
similar -- convert to some standardized internal form, process that,
then convert back. Some tools have tried to avoid the double conversion
when data is being passed straight through, but recognizing and taking
advantage of that optimization is not easy.
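
For what it's worth, the libxml2 encoding page linked in this thread describes UTF-8 as the library's internal encoding, so the conversion in question would be Shift_JIS to UTF-8 on input. That step can be sketched on its own with POSIX iconv. This is an illustration under assumptions, not libxml2's actual code: to_utf8() is a hypothetical helper, and whether the charset name "SHIFT_JIS" is accepted depends on the local iconv implementation.

```c
#include <assert.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: convert `len` bytes of `src`, encoded as `from`,
 * into a NUL-terminated UTF-8 string. Returns a malloc'd buffer that the
 * caller must free, or NULL if the charset is unknown, the input is
 * malformed, or memory runs out.
 */
char *to_utf8(const char *from, const char *src, size_t len)
{
    assert(from != NULL && src != NULL);

    iconv_t cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)-1)
        return NULL;                /* this iconv does not know `from` */

    /* Assumed bound: 4 output bytes per input byte. That is ample for
     * Shift_JIS, whose 1- or 2-byte characters need at most 3 bytes. */
    size_t cap = len * 4 + 1;
    char *out = malloc(cap);
    if (out == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *inp = (char *)src;        /* iconv() takes a non-const pointer */
    size_t inleft = len;
    char *outp = out;
    size_t outleft = cap - 1;       /* keep one byte for the terminator */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);   /* flush any state */
    iconv_close(cd);

    if (rc == (size_t)-1) {
        free(out);
        return NULL;
    }
    *outp = '\0';
    return out;
}
```

A call such as to_utf8("SHIFT_JIS", data, size) returns NULL when the local iconv lacks the charset, which is a quick way to see whether the platform can do the conversion at all. Note that text converted this way still carries its original XML declaration, so the declared encoding would no longer match the bytes.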

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Apr 11 '07 #5
On Apr 11, 7:13 am, "saumya.agar...@gmail.com"
<saumya.agar...@gmail.com> wrote:
Does libxml2 support SHIFT-JIS encoding? I want to keep the support
for UTF-8 intact too. Is it possible?

For what it's worth, the source code contains the following (in
version 2.6.27):

    case XML_CHAR_ENCODING_2022_JP:
        __xmlErrEncoding(ctxt, XML_ERR_UNSUPPORTED_ENCODING,
                         "encoding not supported %s\n",
                         BAD_CAST "ISO-2022-JP", NULL);
        break;
    case XML_CHAR_ENCODING_SHIFT_JIS:
        __xmlErrEncoding(ctxt, XML_ERR_UNSUPPORTED_ENCODING,
                         "encoding not supported %s\n",
                         BAD_CAST "Shift_JIS", NULL);
        break;
    case XML_CHAR_ENCODING_EUC_JP:
        __xmlErrEncoding(ctxt, XML_ERR_UNSUPPORTED_ENCODING,
                         "encoding not supported %s\n",
                         BAD_CAST "EUC-JP", NULL);
        break;

Apr 12 '07 #6
"Arndt Jonasson" <ar************@gmail.com> writes:
For what it's worth, the source code contains the following (in
version 2.6.27):

However, according to the webpage (the link to which I sent to this
thread), libxml can use iconv and all its supported codepages
(i.e., whatever you have ever dreamed about).

Matej
Apr 12 '07 #7
