
BUG in StreamWriter

Guest
 
Posts: n/a
#1: Nov 15 '05
Hi,

When constructing a StreamWriter like this:
FileStream f = new FileStream(..);
StreamWriter s = new StreamWriter(f);

and then writing out the letters åäö, they come out as garbage.

BUT

if we construct the StreamWriter as follows:
FileStream f = new FileStream(..);
StreamWriter s = new StreamWriter(f, System.Text.Encoding.Default);

it's OK. So why is the default not the actual DEFAULT, as it says on the ctor?

It seems to me either the ctor is wrong or the name .Default is misleading.

Thanks.


Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#2: Nov 15 '05

re: BUG in StreamWriter


<discussion@discussion.microsoft.com> wrote:
>
> When constructing a StreamWriter like this:
> FileStream f = new FileStream(..);
> StreamWriter s = new StreamWriter(f);
>
> and then writing out the letters åäö, they come out as garbage.
>
> BUT
>
> if we construct the StreamWriter as follows:
> FileStream f = new FileStream(..);
> StreamWriter s = new StreamWriter(f, System.Text.Encoding.Default);
>
> it's OK. So why is the default not the actual DEFAULT, as it says on the ctor?
>
> It seems to me either the ctor is wrong or the name .Default is misleading.

.Default is *slightly* misleading, although all the information is in
the documentation. The docs for new StreamWriter(Stream) say:

<quote>
This constructor creates a StreamWriter with UTF-8 encoding whose
GetPreamble method returns an empty byte array. The BaseStream property
is initialized using the stream parameter.
</quote>

However, the brief summary saying that it uses "the default" encoding
is misleading (I'll mail MS about it).

.Default means the default *platform* encoding - but pretty much
everything in .NET itself uses UTF-8 by default.
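
To make the difference concrete, here is a minimal sketch (not part of the original posts) that writes the same three letters through both constructors and dumps the bytes that reach the stream. It uses a MemoryStream rather than a FileStream so it touches no files, and it assumes ISO-8859-1 as a stand-in for a Western European ANSI code page, because Encoding.Default depends on the machine (and on newer .NET runtimes it is UTF-8 itself). The class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.Text;

static class EncodingBytesDemo
{
    // Writes text through a StreamWriter and returns the bytes that reached the stream.
    // Passing null for encoding selects the StreamWriter(Stream) constructor.
    public static byte[] Write(string text, Encoding encoding)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = encoding == null
                ? new StreamWriter(stream)
                : new StreamWriter(stream, encoding))
            {
                writer.Write(text);
            }
            // ToArray still works after the writer has closed the stream.
            return stream.ToArray();
        }
    }

    static void Main()
    {
        string text = "\u00E5\u00E4\u00F6"; // the letters åäö

        // Assumption: ISO-8859-1 stands in for the machine's ANSI code page.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        Console.WriteLine(BitConverter.ToString(Write(text, null)));   // C3-A5-C3-A4-C3-B6
        Console.WriteLine(BitConverter.ToString(Write(text, latin1))); // E5-E4-F6
    }
}
```

The constructor without an encoding produces two bytes per letter (UTF-8, no byte order mark), while the single-byte encoding produces one byte per letter. Both files contain the letters; they just need to be read back with the matching encoding.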

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Guest
 
Posts: n/a
#3: Nov 15 '05

re: BUG in StreamWriter


So UTF-8 can't handle umlaut characters then, it seems.


Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#4: Nov 15 '05

re: BUG in StreamWriter


<discussion@discussion.microsoft.com> wrote:
> So UTF-8 can't handle umlaut characters then, it seems.

Yes it can. It's just that whatever you were using to read the file
presumably wasn't aware that it was encoded in UTF-8.
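
As a rough illustration of what "not aware that it was encoded in UTF-8" looks like, the sketch below encodes åäö as UTF-8 and then decodes those bytes with a single-byte encoding, which is roughly what a viewer assuming an ANSI code page would do. ISO-8859-1 is used here as an assumed stand-in for such a code page, and the class and method names are hypothetical.

```csharp
using System;
using System.Text;

static class MojibakeDemo
{
    // Encodes text as UTF-8, then decodes the bytes with a different encoding,
    // mimicking a reader that guesses the wrong encoding.
    public static string DecodeUtf8BytesAs(string text, Encoding wrongEncoding)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return wrongEncoding.GetString(utf8Bytes);
    }

    static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        string misread = DecodeUtf8BytesAs("\u00E5\u00E4\u00F6", latin1);

        // Each letter became two characters: an A-tilde followed by a symbol.
        Console.WriteLine(misread.Length);                                     // 6
        Console.WriteLine(misread == "\u00C3\u00A5\u00C3\u00A4\u00C3\u00B6");  // True
    }
}
```

The bytes in the file are valid UTF-8 throughout; only the interpretation on the reading side is wrong.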

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Guest
 
Posts: n/a
#5: Nov 15 '05

re: BUG in StreamWriter


According to the Windows file system it says ASCII :D

I thought that was standard enough :D Because I used the same format all the
way through the code and the umlauts are OK, but when it's written (using the
default ctors) it's garbled. I wiped the file, changed it to construct the
StreamWriter with Encoding.Default, and it's saving the umlaut characters now. How come the
usual ctor with FileStream doesn't save umlaut chars, given that nowhere else
did I specify any form of encoding until this change to fix it?


Guest
 
Posts: n/a
#6: Nov 15 '05

re: BUG in StreamWriter


<?xml version="1.0" encoding="utf-8"?>

was even defined in the XML file that I got the string from. It's even stored
in the String type correctly; it only goes wrong when writing to the file.

Normal calls specified WITHOUT encoding parameters did NOT save the umlaut
chars.


Guest
 
Posts: n/a
#7: Nov 15 '05

re: BUG in StreamWriter


Opening the text file in Notepad and selecting Save As shows it's ANSI, not
UTF-8. How come the file created when appending is not stored as
UTF-8 then, as that's supposed to be the default you state?

That would cause the mismatch if the file create is creating as ANSI and all
methods default to UTF-8.






Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#8: Nov 15 '05

re: BUG in StreamWriter


<discussion@discussion.microsoft.com> wrote:
> According to the Windows file system it says ASCII :D

What do you mean by "according to the Windows file system"?

> I thought that was standard enough :D

ASCII doesn't have any characters with accents.

> Because I used the same format all the
> way through the code and the umlauts are OK, but when it's written (using the
> default ctors) it's garbled. I wiped the file, changed it to construct the
> StreamWriter with Encoding.Default, and it's saving the umlaut characters now. How come the
> usual ctor with FileStream doesn't save umlaut chars, given that nowhere else
> did I specify any form of encoding until this change to fix it?

It *does* save umlaut characters, it's just that what you're using to
read the file isn't recognising that it's UTF-8. You later say:

> Opening the text file in Notepad and selecting Save As shows it's ANSI,
> not UTF-8

That's just notepad being confused.

UTF-8 works fine, the framework works fine - but some of your tools may
not be doing what you want them to.

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information about encodings.

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Guest
 
Posts: n/a
#9: Nov 15 '05

re: BUG in StreamWriter


You're right, because Notepad isn't standard at all for reading text files.
Nobody in their right mind uses it or Wintail etc. to view logs. No no, not
at all :D

It's fine when I specify Encoding.Default on the StreamWriter, yet it's NOT when I
don't specify ANY encoding anywhere in the app.


Frans Bouma
Guest
 
Posts: n/a
#10: Nov 15 '05

re: BUG in StreamWriter


Jon Skeet [C# MVP] <skeet@pobox.com> wrote in
news:MPG.1a521cf4105f61a8989cae@msnews.microsoft.com:
> <discussion@discussion.microsoft.com> wrote:
>> Because I used the same format all the
>> way through the code and the umlauts are OK, but when it's written (using the
>> default ctors) it's garbled. I wiped the file, changed it to construct
>> the StreamWriter with Encoding.Default, and it's saving the umlaut characters now.
>> How come the usual ctor with FileStream doesn't save umlaut chars, given that
>> nowhere else did I specify any form of encoding until this change to
>> fix it?
>
> It *does* save umlaut characters, it's just that what you're using to
> read the file isn't recognising that it's UTF-8. You later say:

The byte specification in the actual raw data misses the UTF-8
specification when you use Default. I was bitten by the same thing. I had
to explicitly state Encoding.Unicode. When I used Encoding.Default, it
should have worked according to the docs, but it didn't. It did save things like
Scandinavian characters to the file, but it couldn't read them back
correctly, even if I stated UTF-8 as the encoding or whatever in the XML
header. So I think he's right.

>> Opening the text file in Notepad and selecting Save As shows it's ANSI,
>> not UTF-8
>
> That's just notepad being confused.
> UTF-8 works fine, the framework works fine - but some of your tools may
> not be doing what you want them to.

If you specify Encoding.Unicode, it will work; if you specify
Encoding.Default it will not in some cases. In both cases, the files do
NOT have an XML heading explaining the encoding. The actual encoding is in
the bytes in the file (and probably in a meta-data property in NTFS). That
specification is not read back or written correctly when you use
Default. I think that's the reason for his complaint and I have to admit,
he's right, I had exactly the same thing.

Frans

--
Get LLBLGen Pro, the new O/R mapper for .NET: http://www.llblgen.com
Anders Borum
Guest
 
Posts: n/a
#11: Nov 15 '05

re: BUG in StreamWriter


I've experienced similar problems too using the default encoding.

--
venlig hilsen / with regards
anders borum
--


Anders Borum
Guest
 
Posts: n/a
#12: Nov 15 '05

re: BUG in StreamWriter


And it works if you explicitly state UTF-8 as encoding?

--
venlig hilsen / with regards
anders borum
--


Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#13: Nov 15 '05

re: BUG in StreamWriter


<discussion@discussion.microsoft.com> wrote:
> You're right, because Notepad isn't standard at all for reading text files.
> Nobody in their right mind uses it or Wintail etc. to view logs. No no, not
> at all :D

That doesn't mean that notepad will automatically detect UTF-8 encoded
files. (I don't know whether or not it can cope with UTF-8 at all.)

> It's fine when I specify Encoding.Default on the StreamWriter, yet it's NOT when I
> don't specify ANY encoding anywhere in the app.

Yes, as you keep saying. That's because Encoding.Default is the default
ANSI encoding for the platform, but the default if you don't specify
any encoding is UTF-8, as I keep saying.

We seem to be going round and round here - which part are you not
understanding?

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#14: Nov 15 '05

re: BUG in StreamWriter


Frans Bouma <perseus.usenetNOSPAM@xs4all.nl> wrote:
> > It *does* save umlaut characters, it's just that what you're using to
> > read the file isn't recognising that it's UTF-8. You later say:
>
> The byte specification in the actual raw data misses the UTF-8
> specification when you use Default.

What do you mean by this, exactly?

> I was bitten by the same thing. I had
> to explicitly state Encoding.Unicode. When I used Encoding.Default, it
> should have worked according to the docs, but it didn't. It did save things like
> Scandinavian characters to the file, but it couldn't read them back
> correctly, even if I stated UTF-8 as the encoding or whatever in the XML
> header. So I think he's right.

I really don't think so - please provide a complete example stating
*exactly* what you expected, and what you got.

> >> Opening the text file in Notepad and selecting Save As shows it's ANSI,
> >> not UTF-8
> >
> > That's just notepad being confused.
> > UTF-8 works fine, the framework works fine - but some of your tools may
> > not be doing what you want them to.
>
> If you specify Encoding.Unicode, it will work; if you specify
> Encoding.Default it will not in some cases.

That's because notepad can cope with UCS-2 (Unicode) encoding but not
UTF-8.

> In both cases, the files do
> NOT have an XML heading explaining the encoding.

Notepad isn't going to look at the XML header anyway, of course. I
don't see what the XML header has to do with anything, here, to be
honest. What relevance do you think it has to how a file is opened in
notepad?

> The actual encoding is in
> the bytes in the file (and probably in a meta-data property in NTFS).

The encoding isn't "in" the bytes of the file - it's perfectly possible
to have a file which means two different things when considered as
being in two different encodings. How would it be in the meta-data
anyway? As far as the file system is concerned, it's just a stream of
bytes.

> That specification is not read back or written correctly when you use
> Default. I think that's the reason for his complaint and I have to admit,
> he's right, I had exactly the same thing.

I don't think he's right at all. When you say "Default" do you mean
"the default encoding if you don't specify one" or "Encoding.Default"?
I believe both work exactly as intended - but I suspect you're missing
something about the intention.

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#15: Nov 15 '05

re: BUG in StreamWriter


Anders Borum <na@na.na> wrote:
> I've experienced similar problems too using the default encoding.

What problems, exactly? People are being very woolly about what they're
seeing and how they're testing it.

To recap:

o If you don't specify an encoding, you'll get UTF-8
o If you specify Encoding.Default, you'll get the platform's default
ANSI encoding (e.g. Windows-1252)
o Notepad doesn't understand UTF-8 files, so if you open a UTF-8 file
in it you'll see garbage. This doesn't mean it's not a perfectly
valid UTF-8 file, it just means Notepad is pretty poor.

Now, given the above, what exactly do you think is wrong?
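
One detail that may help with tools that guess at encodings: going by the documentation quoted earlier in the thread, the constructor without an encoding writes UTF-8 with an empty preamble, that is, without a byte order mark. Asking explicitly for a UTF-8 encoding that emits its identifier makes StreamWriter put the three bytes EF BB BF at the start of a fresh stream, and some Windows tools use that marker to recognise UTF-8. Whether a particular viewer does so is an assumption you would need to check. A minimal sketch, with made-up class and method names:

```csharp
using System;
using System.IO;
using System.Text;

static class PreambleDemo
{
    // Writes text to a fresh in-memory stream with the given encoding
    // and returns everything that was written, including any preamble.
    public static byte[] Write(string text, Encoding encoding)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new StreamWriter(stream, encoding))
            {
                writer.Write(text);
            }
            return stream.ToArray();
        }
    }

    static void Main()
    {
        string text = "\u00E5"; // the letter å

        // true  = emit the UTF-8 identifier (byte order mark)
        // false = plain UTF-8 bytes only
        byte[] withBom = Write(text, new UTF8Encoding(true));
        byte[] withoutBom = Write(text, new UTF8Encoding(false));

        Console.WriteLine(BitConverter.ToString(withBom));    // EF-BB-BF-C3-A5
        Console.WriteLine(BitConverter.ToString(withoutBom)); // C3-A5
    }
}
```

Either file is valid UTF-8; the marker only gives a reader a hint about how to decode it.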

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Guest
 
Posts: n/a
#16: Nov 15 '05

re: BUG in StreamWriter


Originally I did NOT specify any encoding anywhere, and the umlaut åäö chars
were OK everywhere except on the file save.

When I specify Encoding.Default on the StreamWriter with a fresh file,
everything is OK. If .NET defaults to UTF-8 when I specify NO encoding, how
come it can't save the chars then?


Guest
 
Posts: n/a
#17: Nov 15 '05

re: BUG in StreamWriter


It affects wintail also, www.wintail.com


Guest
 
Posts: n/a
#18: Nov 15 '05

re: BUG in StreamWriter


It's not an XML file; the XML file is only used as the source of the input string. The actual
output that's being corrupted is a normal text file.

The program had NO reference to encoding (thereby using the default .NET
mechanism) and that was corrupting the output using StreamWriter with
FileStream. The solution to this was to construct the StreamWriter with
Encoding.Default, yet this was my actual issue: why is this called the default when
in fact it's not? It was confusing to me. And why can't the default .NET
mechanism (not specifying an encoding) handle umlaut chars correctly (if it's
UTF-8 as you say)?


Guest
 
Posts: n/a
#19: Nov 15 '05

re: BUG in StreamWriter


Actually, when specifying Encoding.Default, Wintail and Notepad correctly show
these characters.

It's the actual C# save that doesn't.



Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#20: Nov 15 '05

re: BUG in StreamWriter


<discussion@discussion.microsoft.com> wrote:
> Originally I did NOT specify any encoding anywhere, and the umlaut åäö chars
> were OK everywhere except on the file save.
>
> When I specify Encoding.Default on the StreamWriter with a fresh file,
> everything is OK. If .NET defaults to UTF-8 when I specify NO encoding, how
> come it can't save the chars then?

It can. It's just that the tool you're using to check for them can't
read them.

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#21: Nov 15 '05

re: BUG in StreamWriter


<discussion@discussion.microsoft.com> wrote:
> It affects wintail also, www.wintail.com

That doesn't mean .NET isn't writing it properly though...

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#22: Nov 15 '05

re: BUG in StreamWriter


<discussion@discussion.microsoft.com> wrote:
> It's not an XML file; the XML file is only used as the source of the input string. The actual
> output that's being corrupted is a normal text file.
>
> The program had NO reference to encoding (thereby using the default .NET
> mechanism) and that was corrupting the output using StreamWriter with
> FileStream. The solution to this was to construct the StreamWriter with
> Encoding.Default, yet this was my actual issue: why is this called the default when
> in fact it's not?

It's not the default for StreamWriter, it's the default encoding for
the Windows box you're running it on.

> It was confusing to me. And why can't the default .NET
> mechanism (not specifying an encoding) handle umlaut chars correctly (if it's
> UTF-8 as you say)?

It can. You just can't read it properly.

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Guest
 
Posts: n/a
#23: Nov 15 '05

re: BUG in StreamWriter


Internet explorer displays it as äåö


Guest
 
Posts: n/a
#24: Nov 15 '05

re: BUG in StreamWriter


OK, Notepad shows it fine, and so does the VS editor.

Wintail and INTERNET EXPLORER (which is surprising) do not.


Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#25: Nov 15 '05

re: BUG in StreamWriter


<discussion@discussion.microsoft.com> wrote:
> Actually, when specifying Encoding.Default, Wintail and Notepad correctly show
> these characters.

Yes, because they're assuming the Windows default encoding.

> It's the actual C# save that doesn't.

<sigh>

How many times do I need to explain it? C# is working fine - it's just
that your tools don't understand UTF-8. Find a text editor which lets
you pick a UTF-8 encoding, and load the file - you'll see the
characters just fine.
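
If no such editor is at hand, the same check can be sketched in code: write the text with the plain constructor, then read the file back while stating the encoding explicitly. This is only an illustration, not something from the original posts. It writes to a temporary file, uses ISO-8859-1 as an assumed stand-in for a tool that presumes an ANSI code page, and the class and method names are invented.

```csharp
using System;
using System.IO;
using System.Text;

static class RoundTripDemo
{
    // Writes text with StreamWriter(Stream), i.e. no encoding specified,
    // then reads the file back using the encoding the caller asks for.
    public static string WriteThenRead(string text, Encoding readEncoding)
    {
        string path = Path.GetTempFileName();
        try
        {
            using (var file = new FileStream(path, FileMode.Create))
            using (var writer = new StreamWriter(file))
            {
                writer.Write(text);
            }

            // false = do not sniff for a byte order mark; trust readEncoding.
            using (var reader = new StreamReader(path, readEncoding, false))
            {
                return reader.ReadToEnd();
            }
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        string text = "\u00E5\u00E4\u00F6"; // the letters åäö
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        Console.WriteLine(WriteThenRead(text, Encoding.UTF8) == text); // True
        Console.WriteLine(WriteThenRead(text, latin1) == text);        // False
    }
}
```

Reading as UTF-8 returns the original string, which suggests the write side is fine and the garbage comes from the reader's choice of encoding.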

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#26: Nov 15 '05

re: BUG in StreamWriter


<discussion@discussion.microsoft.com> wrote:
> Internet explorer displays it as äåö

Internet Explorer is probably assuming the Windows default encoding as
well.

How well do you actually understand encodings? You might like to read
http://www.pobox.com/~skeet/csharp/unicode.html for more information.

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Guest
 
Posts: n/a
#27: Nov 15 '05

re: BUG in StreamWriter


Nothing like plugging your own site.

<sigh> If you get tired of explaining, nobody forces you to reply to each and
every post out there; no need to step on others to make your ego bigger. I've
seen you post before; you do the same every time.


Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#28: Nov 15 '05

re: BUG in StreamWriter


<discussion@discussion.microsoft.com> wrote:
> Nothing like plugging your own site.

I wrote that page (and various others) to save me from explaining
things in detail repeatedly. It's not like I get money from them or
anything - they're just meant to be helpful.

> <sigh> If you get tired of explaining, nobody forces you to reply to each and
> every post out there; no need to step on others to make your ego bigger. I've
> seen you post before; you do the same every time.

I don't reply to each and every post out there, but it *is*
disconcerting when people clearly don't really read answers. This
thread is pointless - no-one's really saying what .NET is supposedly
doing wrong except in terms of what Notepad/Wintail etc can cope with.
I've explained what's going on numerous times now...

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Marco Martin
Guest
 
Posts: n/a
#29: Nov 15 '05

re: BUG in StreamWriter


<sigh>


Marco Martin
Guest
 
Posts: n/a
#30: Nov 15 '05

re: BUG in StreamWriter


Discussion,

if all you want to do is complain about the people who are here to help you,
perhaps you would feel more at home at this forum instead:
alt.complainers.bitch-n-moan

Others here use this forum for help, and it is invaluable for their jobs. I for
one find you quite distasteful, to say the least, and am asking nicely for you
to be, at the very least, respectful to the people who take the time to
answer your questions.

Marco.


Frans Bouma
Guest
 
Posts: n/a
#31: Nov 15 '05

re: BUG in StreamWriter


Jon Skeet [C# MVP] <skeet@pobox.com> wrote in
news:MPG.1a52373ac26beb46989cb0@msnews.microsoft.com:

> Frans Bouma <perseus.usenetNOSPAM@xs4all.nl> wrote:
>> > It *does* save umlaut characters, it's just that what you're using to
>> > read the file isn't recognising that it's UTF-8. You later say:
>>
>> The byte specification in the actual raw data misses UTF-8
>> specification when you use Default.
>
> What do you mean by this, exactly?

That I had the same XML data in two files, one written with
Encoding.Default and the other with Encoding.Unicode. Both looked the same
in Notepad, and neither had an encoding specification. However, one couldn't
be loaded due to an 'ae' character, while the other could be loaded (or
better: be serialized back). I found this very odd, because there was NO
encoding specifier in the XML, so the encoding has to be stored somewhere else.

>> I was bitten by the same thing. I had
>> to explicitly state Encoding.Unicode. WHen I used Encoding.Default, it
>> should work according to the docs, but it didn't. It did save stuff like
>> scandinavian characters away in the file, but it couldn't read it back
>> correctly, even if I stated UTF-8 as encoding or whatever in the xml
>> header. So I think he's right.
>
> I really don't think so - please provide a complete example stating
> *exactly* what you expected, and what you got.

write:
XmlTextWriter writer = new XmlTextWriter(
    Path.Combine(Application.StartupPath, ApplicationConstants.PreferencesFilename),
    System.Text.Encoding.Unicode);

try
{
    writer.WriteStartElement("Preferences");
    writer.WriteStartElement("preferedProjectFolder");
    writer.WriteAttributeString("value", _preferences.PreferedProjectFolder);
    writer.WriteEndElement();
    // etc.


THIS works. (the Unicode encoding).
However when I change that to Default, it doesn't. I even added UTF-8
encoding specification to the XML file, no luck. Now, the docs state that
the codepage of the local system is used with 'default'. I did set the
codepage of my system to all kinds of wicked pages, but also no luck.
Unicode solved it (obviously). However, 'Default' THUS doesn't work for
characters other than plain ASCII.

read:
XmlTextReader reader = new XmlTextReader(
    Path.Combine(Application.StartupPath, ApplicationConstants.PreferencesFilename));

try
{
    // Read the nodes and store the values as they are found in the preferences object.
    while(reader.Read())
    {
        switch(reader.Name)
        {
            case "preferedProjectFolder":
                // <-- crash here, character could not be loaded.
                // Character was a scandinavian character 'ae' (combined to 1 char).
                _preferences.PreferedProjectFolder = reader.GetAttribute("value");
                break;
            // etc..

>> In both cases, the files do
>> NOT have an XML heading explaining the encoding.
>
> Notepad isn't going to look at the XML header anyway, of course. I
> don't see what the XML header has to do with anything, here, to be
> honest. What relevance do you think it has to how a file is opened in
> notepad?

I wasn't talking about notepad :) I write an XML file and read it
back the next time the app starts. It crashed then (it didn't while saving
the XML). However because it is XML, I thought an encoding specification
would be better in the XML header. But if you add that (UTF-8) and you've
saved with 'Default' the file can't be opened with the XmlTextReader
because of some byte encoding issue. (IIRC).

>> The actual encoding is in
>> the bytes in the file (and probably in a meta-data property in NTFS).
>
> The encoding isn't "in" the bytes of the file - it's perfectly possible
> to have a file which means two different things when considered as
> being in two different encodings. How would it be in the meta-data
> anyway? As far as the file system is concerned, it's just a stream of
> bytes.

that's what I was thinking too, however the errors I had made me
draw that conclusion. However I can be wrong, what I DO know is that
characters in extended ascii can't be handled with Encoding.Default.

FB
--
Get LLBLGen Pro, the new O/R mapper for .NET: http://www.llblgen.com
Cor
Guest
 
Posts: n/a
#32: Nov 15 '05

re: BUG in StreamWriter


Hi Jon,

A question for you.
I was watching (not following) your impossible struggle to do this right.
I completely agree with the message from Marco Martin.

But this whole thread is cross-posted.
And although you did your best, and it may be useful, the reactions have
turned it into trash.

Maybe next time you can remove the newsgroups you are not answering from
when this is the kind of reply you get.

:-))

Cor


Jon Skeet [C# MVP]
Guest
 
Posts: n/a
#33: Nov 15 '05

re: BUG in StreamWriter


Frans Bouma <perseus.usenetNOSPAM@xs4all.nl> wrote:
> > What do you mean by this, exactly?
>
> that I had the same XML data in the file, one written away with
> Encoding.Default and the other with Encoding.Unicode. Both looked the same
> in notepad, I had NO encoding specifcation. however one couldn't be loaded
> due to a an 'ae' character, the other one could be loaded (or better: be
> serialized back). I found this very odd, because there was NO encoding
> specifier in the XML, so the encoding has to be stored somewhere else.

It's not odd at all - it would have been preferable to have the
encoding specifier in the XML, but Notepad wouldn't have used it
anyway.

In fact, it seems that Notepad on XP *does* read UTF-8 files. If you use
the following code:

using System;
using System.IO;
using System.Text;

public class Test
{
    static void Main()
    {
        using (StreamWriter sw = new StreamWriter ("test.txt"))
        {
            sw.WriteLine ("\u00e9");
        }
    }
}

to generate a file test.txt, which has contents 0xc3 0xa9 0x0d 0x0a,
then if you open it in Notepad with encoding UTF-8, it correctly
displays an e-acute. If you open it in Notepad with encoding ANSI, it
displays Ã© (again, correctly).
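
The byte-level behaviour described above can be checked without Notepad. This is only a sketch: the class and method names are made up for illustration, and ISO-8859-1 is used as a stand-in for the Windows ANSI code page (an assumption; the two agree for the bytes 0xc3 and 0xa9, and ISO-8859-1 is available on every .NET runtime without extra setup).

```csharp
using System;
using System.Text;

public static class Utf8BytesDemo
{
    // Formats bytes as lowercase hex separated by spaces, e.g. "c3 a9".
    public static string Hex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", " ").ToLowerInvariant();
    }

    // Encodes the text as UTF-8, then decodes those same bytes as ISO-8859-1,
    // which is what a viewer does when it wrongly assumes a single-byte encoding.
    public static string MisreadUtf8AsLatin1(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8);
    }

    public static void Main()
    {
        // U+00E9 (e-acute) becomes two bytes in UTF-8.
        Console.WriteLine(Hex(Encoding.UTF8.GetBytes("\u00e9")));          // c3 a9

        // Read back as a single-byte encoding, the two bytes become
        // U+00C3 (A-tilde) followed by U+00A9 (copyright sign).
        Console.WriteLine(MisreadUtf8AsLatin1("\u00e9") == "\u00c3\u00a9"); // True
    }
}
```

Nothing is lost in the file itself; the same bytes decode back to an e-acute as soon as the reader is told they are UTF-8.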

Now, if your XML didn't include an encoding specifier, the XML parser
should have assumed UTF-8. If you used Encoding.Default (instead of
UTF-8) then you would indeed get an error if the file was not a valid
UTF-8 file. From the XML specification:

<quote>
In the absence of information provided by an external transport
protocol (e.g. HTTP or MIME), it is an error for an entity including an
encoding declaration to be presented to the XML processor in an
encoding other than that named in the declaration, or for an entity
which begins with neither a Byte Order Mark nor an encoding declaration
to use an encoding other than UTF-8.
</quote>

When you used the Unicode encoding, I suspect you got a byte-order mark
which allowed the parser to tell that it was using that encoding.
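
To see which encodings announce themselves with a byte-order mark, you can print their preambles. This is a small sketch rather than anything from the framework docs: the class name is made up, and ISO-8859-1 again stands in for an ANSI code page (an assumption).

```csharp
using System;
using System.Text;

public static class PreambleDemo
{
    // Formats a preamble as uppercase hex, or "(none)" when it is empty.
    public static string Hex(byte[] bytes)
    {
        return bytes.Length == 0 ? "(none)" : BitConverter.ToString(bytes).Replace("-", " ");
    }

    public static void Main()
    {
        // Encoding.Unicode is little-endian UTF-16 and has a two-byte BOM.
        Console.WriteLine("Unicode:        " + Hex(Encoding.Unicode.GetPreamble()));            // FF FE
        // Encoding.UTF8 has a three-byte BOM ...
        Console.WriteLine("UTF8:           " + Hex(Encoding.UTF8.GetPreamble()));               // EF BB BF
        // ... but a UTF8Encoding constructed with false has none.
        Console.WriteLine("UTF8 (no BOM):  " + Hex(new UTF8Encoding(false).GetPreamble()));     // (none)
        // Single-byte encodings have no BOM at all.
        Console.WriteLine("ISO-8859-1:     " + Hex(Encoding.GetEncoding("iso-8859-1").GetPreamble())); // (none)
    }
}
```

A file written in a single-byte code page therefore carries no marker of its own, so an XML parser has only the declaration (or the UTF-8 default) to go on.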

> >> I was bitten by the same thing. I had
> >> to explicitly state Encoding.Unicode. WHen I used Encoding.Default, it
> >> should work according to the docs, but it didn't. It did save stuff like
> >> scandinavian characters away in the file, but it couldn't read it back
> >> correctly, even if I stated UTF-8 as encoding or whatever in the xml
> >> header. So I think he's right.
> >
> > I really don't think so - please provide a complete example stating
> > *exactly* what you expected, and what you got.
>
> write:
> XmlTextWriter writer = new XmlTextWriter(Path.Combine
> (Application.StartupPath, ApplicationConstants.PreferencesFilename),
> System.Text.Encoding.Unicode);
>
> try
> {
> writer.WriteStartElement("Preferences");
> writer.WriteStartElement("preferedProjectFolder");
> writer.WriteAttributeString("value",
> _preferences.PreferedProjectFolder);
> writer.WriteEndElement();
> // etc.
>
>
> THIS works. (the Unicode encoding).
> However when I change that to Default, it doesn't. I even added UTF-8
> encoding specification to the XML file, no luck.

No, it wouldn't - for the reasons given above.

> Now, the docs state that
> the codepage of the local system is used with 'default'. I did set the
> codepage of my system to all kinds of wicked pages, but also no luck.
> Unicode solved it (obviously). However, 'Default' THUS doesn't work for
> characters other than plain ASCII.

It does, but not when you've told the XML parser to expect UTF-8 and
then don't give it UTF-8!

> >> In both cases, the files do
> >> NOT have an XML heading explaining the encoding.
> >
> > Notepad isn't going to look at the XML header anyway, of course. I
> > don't see what the XML header has to do with anything, here, to be
> > honest. What relevance do you think it has to how a file is opened in
> > notepad?
>
> I wasn't talking about notepad :) I write an XML file and read it
> back the next time the app starts. It crashed then (it didn't while saving
> the XML). However because it is XML, I thought an encoding specification
> would be better in the XML header. But if you add that (UTF-8) and you've
> saved with 'Default' the file can't be opened with the XmlTextReader
> because of some byte encoding issue. (IIRC).

Yup, that makes perfect sense, in the same way that if you tell someone
that you're going to talk English and then you talk French they may
well get confused. You've got to actually use the encoding you specify
in the XML header.
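
Here is a rough sketch of that mismatch, under some assumptions: the sample document and all names are invented for illustration, ISO-8859-1 stands in for the platform default code page, and XmlReader is used instead of XmlTextReader. The same undeclared XML is presented to the parser once as UTF-8 bytes and once as single-byte bytes.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class XmlMismatchDemo
{
    // Hypothetical sample document. It has no XML declaration and the bytes
    // built from it carry no BOM, so a conforming parser has to assume UTF-8.
    public const string SampleXml = "<Preferences><folder value=\"\u00e6\"/></Preferences>";

    // Returns the attribute value, or null if the parser rejects the bytes.
    public static string TryReadValue(byte[] bytes)
    {
        try
        {
            using (XmlReader reader = XmlReader.Create(new MemoryStream(bytes)))
            {
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "folder")
                        return reader.GetAttribute("value");
                }
            }
            return null;
        }
        catch (XmlException)
        {
            // The bytes were not valid in the encoding the parser assumed.
            return null;
        }
    }

    public static void Main()
    {
        byte[] utf8 = new UTF8Encoding(false).GetBytes(SampleXml);
        byte[] latin1 = Encoding.GetEncoding("iso-8859-1").GetBytes(SampleXml);

        // Genuine UTF-8 bytes round-trip.
        Console.WriteLine(TryReadValue(utf8) == "\u00e6");
        // Single-byte bytes passed off as UTF-8 are expected to be rejected.
        Console.WriteLine(TryReadValue(latin1) == null ? "rejected" : "accepted");
    }
}
```

The single byte 0xe6 is not a complete UTF-8 sequence, which is why the second read cannot give back the original character.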

> >> The actual encoding is in
> >> the bytes in the file (and probably in a meta-data property in NTFS).
> >
> > The encoding isn't "in" the bytes of the file - it's perfectly possible
> > to have a file which means two different things when considered as
> > being in two different encodings. How would it be in the meta-data
> > anyway? As far as the file system is concerned, it's just a stream of
> > bytes.
>
> that's what I was thinking too, however the errors I had made me
> draw that conclusion. However I can be wrong, what I DO know is that
> characters in extended ascii can't be handled with Encoding.Default.

a) There's no such thing as "extended ASCII". There are various
encodings which are 8-bit extensions to ASCII, but they are all
different, and there's no one true "extended ASCII".
b) Characters within an ANSI code-page *can* be used if you correctly
specify the character encoding in the XML header. I suspect that an
encoding of "windows-1252" would have worked. I haven't tried it
and I wouldn't recommend it though - I'd just stick to UTF-8.
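
As an illustration of point b), here is a sketch in which the declaration and the bytes agree on a single-byte encoding. It is not tested against "windows-1252" itself: ISO-8859-1 is used instead (an assumption, chosen because it needs no extra code-page registration on newer runtimes), and the class, method and element names are made up.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class DeclaredEncodingDemo
{
    // Writes a one-element document in the given encoding. WriteStartDocument
    // emits an XML declaration that names the writer's encoding.
    public static byte[] Write(string value, Encoding encoding)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Encoding = encoding;
        using (MemoryStream ms = new MemoryStream())
        {
            using (XmlWriter writer = XmlWriter.Create(ms, settings))
            {
                writer.WriteStartDocument();
                writer.WriteStartElement("folder");
                writer.WriteAttributeString("value", value);
                writer.WriteEndElement();
                writer.WriteEndDocument();
            }
            return ms.ToArray();
        }
    }

    // Reads the attribute back, letting the parser pick the encoding
    // from the declaration at the top of the document.
    public static string Read(byte[] xml)
    {
        using (XmlReader reader = XmlReader.Create(new MemoryStream(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "folder")
                    return reader.GetAttribute("value");
            }
        }
        return null;
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        byte[] xml = Write("\u00e6", latin1);
        Console.WriteLine(latin1.GetString(xml));   // shows the declaration and the element
        Console.WriteLine(Read(xml) == "\u00e6");   // True
    }
}
```

Letting the writer produce the declaration keeps the declared name and the actual bytes in step, which is the part that went wrong when the header was added by hand.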

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Frans Bouma
Guest
 
Posts: n/a
#34: Nov 15 '05

re: BUG in StreamWriter


Jon Skeet [C# MVP] <skeet@pobox.com> wrote in
news:MPG.1a52adec48ab0b15989cbf@msnews.microsoft.com:

Ok Thanks Jon, for clearing that up. :)

Frans




--
Get LLBLGen Pro, the new O/R mapper for .NET: http://www.llblgen.com
Closed Thread