Saving XML as UTF-8?

How do I load and save a UTF-8 document in XML in ASP/VBS?
Well, the loading* is not the problem actually -- the file is in UTF-8,
and understood correctly -- but once saved, the UTF-8 is replaced by
what seems to be iso-8859-1 (which Flash doesn't understand, but that's
another problem). Any help greatly appreciated.
* Something like this...
set xDoc = server.createObject("Msxml2.DOMDocument")
xDoc.async = false
xDoc.load sPath
Jul 22 '05 #1


Philipp Lenssen wrote:
How do I load and save a UTF-8 document in XML in ASP/VBS?

Well, the loading* is not the problem actually -- the file is in UTF-8,
and understood correctly -- but once saved, the UTF-8 is replaced by
what seems to be iso-8859-1
* Something like this...
set xDoc = server.createObject("Msxml2.DOMDocument")
xDoc.async = false
xDoc.load sPath


I am pretty sure if you then use
xDoc.save Server.MapPath(filename)
later then the encoding is preserved.
Are you by chance saving by writing xDoc.xml with the FileSystemObject?

The MSXML 4 docs say about the save method:

"Character encoding is based on the encoding attribute in the XML
declaration, such as <?xml version="1.0" encoding="windows-1252"?>. When
no encoding attribute is specified, the default setting is UTF-8."

which supports my view that the encoding the document has when being
loaded is preserved when saving.
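
To see which encoding a saved file actually declares, one can read the XML declaration straight from the bytes. The following is a minimal Python sketch run outside ASP, not MSXML code. The function name declared_encoding is hypothetical. The UTF-8 default mirrors the quoted documentation, and the sketch assumes an ASCII-compatible encoding (it will not work on UTF-16 files).

```python
import re

# Matches the encoding pseudo-attribute of an XML declaration at the very
# start of the data. Assumption: the file uses an ASCII-compatible encoding.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def declared_encoding(data: bytes, default: str = "UTF-8") -> str:
    """Return the encoding named in the XML declaration, or `default`."""
    # Skip an optional UTF-8 byte order mark before looking for "<?xml".
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    match = _DECL.match(data)
    return match.group(1).decode("ascii") if match else default
```

Called on the raw bytes of the original and of the saved file, this shows whether the declaration changed between load and save.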


--

Martin Honnen
http://JavaScript.FAQTs.com/
Jul 22 '05 #2
Martin Honnen wrote:
Philipp Lenssen wrote:
How do I load and save a UTF-8 document in XML in ASP/VBS?

I am pretty sure if you then use
xDoc.save Server.MapPath(filename)
later then the encoding is preserved.
Are you by chance saving by writing xDoc.xml with the
FileSystemObject?


Thanks so far Martin, this is my save method:

xDoc.save server.mapPath(sPath)

So no, I'm not using the FSO...
Any idea what's happening?

--
Google Blogoscoped
http://blog.outer-court.com
Jul 22 '05 #3


Philipp Lenssen wrote:

Philipp Lenssen wrote:

How do I load and save a UTF-8 document in XML in ASP/VBS?
this is my save method:

xDoc.save server.mapPath(sPath)


You say the file is saved as iso-8859-1, does MSXML really save it with
that encoding and put a
<?xml version="1.0" encoding="iso-8859-1"?>
in there, or why do you think that MSXML saves as iso-8859-1?

--

Martin Honnen
http://JavaScript.FAQTs.com/
Jul 22 '05 #4
Martin Honnen wrote:
Philipp Lenssen wrote:

Philipp Lenssen wrote:
> How do I load and save a UTF-8 document in XML in ASP/VBS?
>

this is my save method:

xDoc.save server.mapPath(sPath)


You say the file is saved as iso-8859-1, does MSXML really save it
with that encoding and put a <?xml version="1.0"
encoding="iso-8859-1"?> in there, or why do you think that MSXML
saves as iso-8859-1?


Let me put it this way. I use my own Netpadd editor, which doesn't
support UTF-8. I know because whenever I open UTF-8, I see this "i>?"
as first character. So when I want to open UTF-8, I use Notepad.
The files however that *were* UTF-8 when I put them in this tool which
I'm programming (a simple text translation tool), they are coming out
"fine" for my non-UTF-8 Netpadd once they are saved. So they lost their
"UTF-8ness" without me saying so in ASP!
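
One possible explanation, which the thread does not confirm, is that only the byte order mark went missing and the file is still UTF-8. The Python sketch below imitates an editor without UTF-8 support by decoding the bytes as ISO-8859-1. The sample document and the helper name view_as_latin1 are made up for illustration.

```python
import codecs

def view_as_latin1(data: bytes) -> str:
    """Decode bytes the way an editor without UTF-8 support would show them."""
    return data.decode("latin-1")

# Hypothetical sample: the same UTF-8 document with and without a byte order mark.
without_bom = "<note>é</note>".encode("utf-8")
with_bom = codecs.BOM_UTF8 + without_bom

# With the mark, the view starts with three stray characters (bytes EF BB BF).
shown_with_bom = view_as_latin1(with_bom)

# Without the mark there is no leading debris, but the bytes are still UTF-8:
# the é is stored as two bytes and shows up as two characters.
shown_without_bom = view_as_latin1(without_bom)
```

A file that looks "fine" at the first character can therefore still be UTF-8. The non-ASCII characters further down are the better thing to look at.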

Thanks so far, and hope you have more hints!
--
Google Blogoscoped
http://blog.outer-court.com
Jul 22 '05 #5
UTF-8 does not by itself require special characters at the start of a file
(a byte order mark is optional). If the files are plain XML, the first
non-whitespace character should be "<". Unicode (UTF-16) files do have a
2-byte byte order mark at the beginning.
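
The marks at the start of a file can be checked directly. This is a small Python sketch, assuming the file's bytes have already been read. The name detect_bom is hypothetical. The constants come from Python's standard codecs module.

```python
import codecs

# UTF-32 is checked before UTF-16 because the UTF-32-LE mark (FF FE 00 00)
# begins with the UTF-16-LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32-LE"),
    (codecs.BOM_UTF32_BE, "UTF-32-BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16-LE"),
    (codecs.BOM_UTF16_BE, "UTF-16-BE"),
]

def detect_bom(data: bytes):
    """Return (encoding name, mark length in bytes), or (None, 0) if no mark."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

A result of (None, 0) does not rule out UTF-8. It only means the file starts without a mark.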

What operating system are you running on when you open files in Notepad? The
version of Notepad included with NT, Win2000, and WinXP Pro is capable of
saving files in ANSI, Unicode, or UTF-8.

How are you opening the files from the ASP script? If possible show the
simplest *working* code (just read and then write the file) that duplicates
the problem along with a sample XML file.
--
--Mark Schupp
Head of Development
Integrity eLearning
www.ielearning.com

Jul 22 '05 #6


Philipp Lenssen wrote:
Martin Honnen wrote:
You say the file is saved as iso-8859-1, does MSXML really save it
with that encoding and put a <?xml version="1.0"
encoding="iso-8859-1"?> in there, or why do you think that MSXML
saves as iso-8859-1?


Let me put it this way. I use my own Netpadd editor, which doesn't
support UTF-8. I know because whenever I open UTF-8, I see this "i>?"
as first character. So when I want to open UTF-8, I use Notepad.
The files however that *were* UTF-8 when I put them in this tool which
I'm programming (a simple text translation tool), they are coming out
"fine" for my non-UTF-8 Netpadd once they are saved. So they lost their
"UTF-8ness" without me saying so in ASP!


Frankly, using a tool that doesn't understand UTF-8 to check whether a
file is UTF-8 encoded doesn't sound like a reliable method. What you see
might simply be a byte order mark at the beginning of the file, and that
mark is optional in UTF-8.

I don't really know how to help with that. I would use an XML parser to
check whether the file is properly encoded; simply loading the file in
IE/Win should be enough to check that.

If you have the application online then post a URL (or better two, one
to the original, one to the saved XML), then someone here could check
whether what you get there is really UTF-8 or ISO-8859-1.
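
Without a URL to inspect, a strict decode gives a rough answer to the same question. The Python sketch below is a heuristic, not a proof. The function name classify and its three labels are hypothetical. A file containing only ASCII characters has identical bytes in both encodings, so it cannot be told apart.

```python
def classify(data: bytes) -> str:
    """Classify raw file bytes as 'ascii', 'utf-8', or 'not-utf-8'."""
    # Ignore an optional UTF-8 byte order mark.
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    # Pure ASCII looks the same in UTF-8 and ISO-8859-1.
    if all(byte < 0x80 for byte in data):
        return "ascii"
    try:
        data.decode("utf-8")  # strict by default: raises on invalid sequences
    except UnicodeDecodeError:
        # Bytes such as a lone 0xE9 suggest ISO-8859-1 or windows-1252.
        return "not-utf-8"
    return "utf-8"
```

Running it on the original and on the saved file shows whether the save step really changed the encoding.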

--

Martin Honnen
http://JavaScript.FAQTs.com/
Jul 22 '05 #7
Martin Honnen wrote:
Philipp Lenssen wrote:

If you have the application online then post a URL (or better two,
one to the original, one to the saved XML) then someone here could
check whether it is really UTF-8 or ISO-8859-1 what you get there.


It's already solved, IIRC I posted this here already.

--
Google Blogoscoped
http://blog.outer-court.com
Jul 22 '05 #8
