Hello,
do you know a good program to test which character encoding
is used in a file?
I use iconv, but it can only convert from one character encoding to
another. The problem is that I have some files, and the way I get them
does not guarantee that the encoding they claim to use is the one
they actually use.
Thanks for discussing this subject with me.
P.S. I don't think that trying every possible encoding with
iconv is a good solution.
YGUEL wrote:
> Hello, do you know a good program to test which character encoding is used in a file? I use iconv, but it can only convert from one character encoding to another. The problem is that I have some files, and the way I get them does not guarantee that the encoding they claim to use is the one they actually use.
> Thanks for discussing this subject with me.
> P.S. I don't think that trying every possible encoding with iconv is a good solution.
I have seen Appendix F of the XML 1.0 specification, but does any code
exist that implements it?
"YGUEL" <ma**********@libertysurf.fr> wrote in message
news:53**************************@posting.google.com...
> Hello, do you know a good program to test which character encoding is used in a file?
Conforming XML parsers do this up to a certain point (the ones that implement
Appendix F of the XML 1.0 specification, as you mentioned).
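To give an idea of what such a parser does, here is a minimal Python sketch of the kind of detection Appendix F describes: look for a byte-order mark, then for the byte pattern of the leading "<?", then for the encoding pseudo-attribute in the XML declaration. The function name sniff_xml_encoding and the tables are made up for illustration, EBCDIC and the unusual UCS-4 byte orders are left out, and the result is only what the file claims, not proof of what it really contains.

```python
import codecs
import re

# Byte-order marks, checked longest first so that the UTF-32 LE mark
# (FF FE 00 00) is not mistaken for the UTF-16 LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]

# Byte patterns of the first character(s) "<" / "<?" when there is no BOM.
_PATTERNS = [
    (b"\x00\x00\x00<", "UTF-32BE"),
    (b"<\x00\x00\x00", "UTF-32LE"),
    (b"\x00<\x00?", "UTF-16BE"),
    (b"<\x00?\x00", "UTF-16LE"),
]

# encoding="..." inside an XML declaration written in an ASCII-compatible encoding.
_DECL = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._\-]*)["']"""
)


def sniff_xml_encoding(data: bytes) -> str:
    """Return the encoding an XML document claims to use (hypothetical helper)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    for prefix, name in _PATTERNS:
        if data.startswith(prefix):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    # No BOM and no declaration: XML 1.0 defaults to UTF-8.
    return "UTF-8"
```

In practice you would pass it the first hundred bytes or so of the file, for example sniff_xml_encoding(open(path, "rb").read(200)), where path is whatever file you want to check.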
> I use iconv, but it can only convert from one character encoding to another. The problem is that I have some files, and the way I get them does not guarantee that the encoding they claim to use is the one they actually use.
The problem here is that there is no idiot-proof way to do this.
Suppose we have this kind of document, for example:
<?xml version="1.0" encoding="ISO-8859-1"?>
<doc>*</doc>
where * would be the copyright sign, for example (byte xA9 in ISO-8859-1).
BUT suppose that, despite ISO-8859-1 being specified, the document was
actually saved in UTF-8, so that * is stored as the two
bytes xC2 xA9. Now if you load that file with an XML parser and write the
result out as UTF-8, you get xC3 x82 xC2 xA9 (the first two bytes are xC2
converted to UTF-8, and the last two bytes are xA9 converted to UTF-8).
Since bytes xC2 and xA9 are both perfectly legal Latin-1 characters, how
would you detect that the file was saved in the wrong encoding?
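The byte values in this example can be reproduced with a few lines of Python. The helper looks_like_mislabeled_utf8 at the end is a hypothetical name for a common heuristic, not a real detector: it only says that data containing non-ASCII bytes also happens to decode cleanly as UTF-8, which is unlikely (but not impossible) for genuine Latin-1 text.

```python
# U+00A9 COPYRIGHT SIGN, the character used in the example above.
copyright_sign = "\u00a9"

latin1_bytes = copyright_sign.encode("iso-8859-1")  # one byte: A9
utf8_bytes = copyright_sign.encode("utf-8")         # two bytes: C2 A9

# The file was saved as UTF-8, but the parser trusts the declaration and
# decodes it as ISO-8859-1. Every byte is valid Latin-1, so no error occurs.
misread = utf8_bytes.decode("iso-8859-1")           # two characters: U+00C2 U+00A9

# Writing the misread text back out as UTF-8 gives the four bytes C3 82 C2 A9.
reencoded = misread.encode("utf-8")


def looks_like_mislabeled_utf8(data: bytes) -> bool:
    """Heuristic (assumption, not proof): non-ASCII data that is also valid UTF-8."""
    if data.isascii():
        return False  # pure ASCII is the same in both encodings, nothing to tell
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not valid UTF-8, so the Latin-1 label may well be right
    return True


print(latin1_bytes.hex(), utf8_bytes.hex(), reencoded.hex())
print(looks_like_mislabeled_utf8(utf8_bytes))
```

A file that really contains the Latin-1 characters U+00C2 followed by U+00A9 would pass the same check, so this can only raise a suspicion that someone then has to confirm.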
> Thanks for discussing this subject with me.
> P.S. I don't think that trying every possible encoding with iconv is a good solution.
If you're dealing with XML, an XML declaration with encoding="whatever"
specified would only be recognized by an XML parser, not by iconv.
There might be some solutions available that I'm not aware of, though; try Google.
with respect,
Toni Uusitalo