copy XML file -- three extra bytes???

Hi,

I am copying an xml file like so.

Dim xmlDoc As New XmlDocument
xmlDoc.Load("C:\Program Files\Templates\message.msg")
Console.WriteLine("Template loaded")
xmlDoc.Save("C:\Program Files\Templates\copy.xml")
Console.WriteLine("message saved")

The XML file copies and can be opened with IE; however, the XML
file that is produced cannot itself be copied using the method above.

The reason is that the produced XML file has three additional bytes at the
start of it (i.e. before the "<?xml" part).

My question is: does anybody know why this is, and how to get rid of the
three additional bytes at the start of the file?

many thanks in advance.

martin.


Nov 18 '05 #1
The file is being saved in a Unicode encoding. The three additional bytes
are a Unicode BOM (Byte Order Mark).
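
For reference, here is a minimal sketch of where those bytes come from, assuming VB 2005 or later. The module and function names (BomBytes, PreambleHex) are made up for illustration and are not part of the thread. Every .NET Encoding exposes its BOM through GetPreamble(), and for the standard UTF-8 encoding that preamble is exactly three bytes long.

```vb
Imports System
Imports System.Text

Module BomBytes
    ' Hypothetical helper: returns an encoding's preamble (BOM) as a hex string.
    ' An encoding with no preamble yields an empty string.
    Function PreambleHex(ByVal enc As Encoding) As String
        Return BitConverter.ToString(enc.GetPreamble()).Replace("-", " ")
    End Function
End Module
```

Calling PreambleHex(Encoding.UTF8) should give "EF BB BF", while PreambleHex(New UTF8Encoding(False)) should give an empty string, because that variant of UTF-8 is constructed without a BOM.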

You can probably solve the problem either by specifying the encoding in
the <?xml ...?> declaration, or by saving the file in a non-Unicode
encoding. (Encoding.Default, used here, is the system's ANSI code page on
the .NET Framework rather than strict ASCII.)

' Requires Imports System.IO
Dim stream As StreamWriter = Nothing
Try
    ' Encoding.Default has no preamble, so no BOM is written
    stream = New StreamWriter("C:\Program Files\Templates\copy.xml", _
                              False, System.Text.Encoding.Default)

    xmlDoc.Save(stream)
    Console.WriteLine("message saved")
Catch
    Console.WriteLine("Error saving file")
Finally
    If Not stream Is Nothing Then
        stream.Close()
    End If
End Try
--
mikeb
Nov 18 '05 #2
Clarification: the Unicode encoding that you're seeing is probably UTF-8.

In any case, I played around a little bit more with your sample code,
and I had to manually change the encoding specified in the input file to
be incorrect to get xmlDoc.Load() to throw an exception. In other words,
xmlDoc.Load() does not seem to mind the BOM header, unless the encoding
attribute in the <?xml ...?> tag is lying.

Can you post a very, very small XML file that causes the problem you're
seeing?
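
To confirm whether a given file really starts with the BOM, a small check along these lines may help. It is only a sketch, assuming VB 2005 or later for the Using statement; BomCheck and HasUtf8Bom are hypothetical names, not anything from the .NET library.

```vb
Imports System
Imports System.IO

Module BomCheck
    ' Hypothetical helper: True when the file begins with the UTF-8 BOM (EF BB BF).
    Function HasUtf8Bom(ByVal fileName As String) As Boolean
        Dim head(2) As Byte   ' three bytes, indices 0 to 2
        Dim count As Integer
        Using fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
            ' Read at most the first three bytes of the file
            count = fs.Read(head, 0, head.Length)
        End Using
        Return count = 3 AndAlso head(0) = &HEF AndAlso head(1) = &HBB AndAlso head(2) = &HBF
    End Function
End Module
```

For example, HasUtf8Bom("C:\Program Files\Templates\copy.xml") would report whether the copy produced by xmlDoc.Save carries the extra bytes.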
--
mikeb
Nov 18 '05 #3
You are correct.
The problem now becomes how to create an XML file with the line

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>

(with the encoding set to UTF-8)

using XmlDocument.Load,

or do I just have to revert to your ASCII method?

Many thanks for the help. I have included samples below that demonstrate my
problem.
The XML file is generated in code rather than attached as files to this message.
================
' Requires Imports System.Xml
Sub Main()
    Try
        Dim doc As New XmlDocument

        ' Declaration includes encoding="UTF-8": the saved file gets the three extra bytes
        doc.LoadXml("<?xml version=""1.0"" encoding=""UTF-8"" standalone=""yes""?>" & _
            "<Message version=""1.1"" id="""">" & _
            "<Attributes>" & _
            "<Priority></Priority>" & _
            "<DeleteAttaches></DeleteAttaches>" & _
            "</Attributes>" & _
            "</Message>")
        doc.Save("C:\Program Files\Templates\ThreeByteError.xml")
        Console.WriteLine("Saved the dodgy xml file")

        ' Declaration has no encoding attribute: the saved file has no extra bytes
        doc.LoadXml("<?xml version=""1.0"" standalone=""yes""?>" & _
            "<Message version=""1.1"" id="""">" & _
            "<Attributes>" & _
            "<Priority></Priority>" & _
            "<DeleteAttaches></DeleteAttaches>" & _
            "</Attributes>" & _
            "</Message>")
        doc.Save("C:\Program Files\Templates\NoThreeByteError.xml")
        Console.WriteLine("Saved the fine xml file")
    Catch ex As Exception
        Console.WriteLine("***ERROR***")
        Console.WriteLine(ex.Message)
    End Try

    Console.WriteLine("Press a key to close")
    Console.ReadLine()
End Sub

================

Now run the following at the command line to see the problem:

type "C:\Program Files\Templates\ThreeByteError.xml"

type "C:\Program Files\Templates\NoThreeByteError.xml"

fc "C:\Program Files\Templates\NoThreeByteError.xml" "C:\Program Files\Templates\ThreeByteError.xml"

Cheers,

martin.
Nov 18 '05 #4
Well, StreamWriter writes the preamble (BOM) of whichever encoding it is
given: Encoding.UTF8 has one, whereas Encoding.Default does not.

However, at least for XmlDocument.Load(), the BOM poses no problem on my
machine - it loads just fine.

If there's some other software that you need to load the XML document
into that does not handle the BOM, I suppose you have a few options:

- write the file using Encoding.Default.
- post-process the output file to remove the BOM
- upgrade the software that doesn't like the BOM to handle it properly
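
The post-processing option could look roughly like the sketch below, assuming .NET 2.0 or later for File.ReadAllBytes and a file small enough to hold in memory. BomStrip and StripUtf8Bom are illustrative names only.

```vb
Imports System
Imports System.IO

Module BomStrip
    ' Hypothetical helper: rewrites the file without its leading UTF-8 BOM (EF BB BF).
    ' Returns True when a BOM was found and removed, False when the file was left alone.
    Function StripUtf8Bom(ByVal fileName As String) As Boolean
        Dim data As Byte() = File.ReadAllBytes(fileName)
        If data.Length < 3 OrElse data(0) <> &HEF OrElse data(1) <> &HBB OrElse data(2) <> &HBF Then
            Return False
        End If
        ' Upper bound Length - 4 gives an array of Length - 3 elements
        Dim trimmed(data.Length - 4) As Byte
        Array.Copy(data, 3, trimmed, 0, trimmed.Length)
        File.WriteAllBytes(fileName, trimmed)
        Return True
    End Function
End Module
```

Running StripUtf8Bom on ThreeByteError.xml after doc.Save should leave a file that starts directly with "<?xml". The encoding named in the declaration is untouched, so the content is still valid UTF-8.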

I'm sure there are others, too.
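
One further option, not mentioned above, is to keep UTF-8 but ask for it without a BOM. This is a sketch assuming VB 2005 or later; SaveNoBom and SaveUtf8NoBom are made-up names. It relies on the UTF8Encoding(Boolean) constructor, where False means no byte order mark is emitted, and on XmlDocument.Save(TextWriter) taking its encoding from the writer.

```vb
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module SaveNoBom
    ' Hypothetical helper: saves the document as UTF-8 without the leading BOM.
    Sub SaveUtf8NoBom(ByVal doc As XmlDocument, ByVal fileName As String)
        ' UTF8Encoding(False) has an empty preamble, so StreamWriter writes no BOM
        Dim utf8NoBom As New UTF8Encoding(False)
        Using writer As New StreamWriter(fileName, False, utf8NoBom)
            ' The writer's encoding decides what the XML declaration reports
            doc.Save(writer)
        End Using
    End Sub
End Module
```

Unlike Encoding.Default, this keeps every character representable, so it is probably the safer choice when the document may contain text outside the local code page. The declaration in the saved file should still name UTF-8, though the exact spelling of the encoding name comes from the writer rather than from the original string.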

--
mikeb
Nov 18 '05 #5
