Odd characters being written weird

I have a problem with encoding, I think (??). I have the following code:

StreamWriter sw = File.CreateText(l);
sw.Write(data);
sw.Close();

Now, "data" is a string that contains text, but sometimes contains special characters, like "§".

When I upload this file to a web server, what I get back is a preceding junk character. For instance, for the character "§", when I browse the file in IE I get "Â§".

If I open the file in Notepad, the character is "§", but when I view it in a command prompt with the "type" command, I get "┬º". If I type "§" in Notepad and save it, the command prompt shows it as "º", so the junk character in the command prompt is the preceding "┬".

What is going on and how do I fix this?

Jon
Nov 15 '05 #1
Jon Davis <jo*@REMOVE.ME.PLEASE.jondavis.net> wrote:
I have a problem with encoding, I think (??). I have the following code:

StreamWriter sw = File.CreateText(l);
sw.Write(data);
sw.Close();


That's writing it out in UTF-8. See
http://www.pobox.com/~skeet/csharp/unicode.html
for more about what character encodings are and which one you might
want.
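
To make that concrete, here is a minimal sketch (not part of the original reply; the class and method names are made up for illustration). UTF-8 stores "§" (U+00A7) as the two bytes C2 A7, and a reader that assumes a single-byte encoding such as ISO-8859-1 turns those two bytes into two characters:

```csharp
using System.Text;

static class EncodingDemo
{
    // Bytes that a UTF-8 writer (such as File.CreateText) emits for the text.
    public static byte[] ToUtf8(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }

    // What a reader sees if it wrongly treats those bytes as ISO-8859-1.
    public static string MisreadAsLatin1(byte[] utf8Bytes)
    {
        return Encoding.GetEncoding("iso-8859-1").GetString(utf8Bytes);
    }
}
```

Calling EncodingDemo.MisreadAsLatin1(EncodingDemo.ToUtf8("\u00A7")) gives "Â§", which is the junk character followed by the real one. The same two bytes shown in the command prompt's code page 437 come out as "┬º".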

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 15 '05 #2
> Now, "data" is a string that contains text, but sometimes contains special characters, like "§".

I should also note that data's value comes from an XmlNode's InnerText property.

Any help would be appreciated.

Jon
Nov 15 '05 #3
Hi

As Jon Skeet points out, you are likely to have an encoding problem.

Your code is:

StreamWriter sw = File.CreateText(l);
sw.Write(data);
sw.Close();

Instead try:

StreamWriter sw = new StreamWriter(l, System.Text.Encoding.UTF8); // or whatever encoding you want to use
sw.Write(data);
sw.Close();

When you create the sw object this way, you will be able to control the
encoding used.

But it is also possible that the error comes from the way you read the
XML. You need to read the XML with the right encoding (set it explicitly,
the same way as when you created the sw).
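
A minimal sketch of reading the XML with an explicit encoding, assuming the document is loaded from a file (the class name, method name and "path" parameter are placeholders, not from the original post):

```csharp
using System.IO;
using System.Text;
using System.Xml;

static class XmlLoading
{
    // Loads an XML file, decoding it with the given encoding rather than
    // leaving the choice to XmlDocument. A byte order mark at the start
    // of the file, if present, still takes precedence in StreamReader.
    public static XmlDocument Load(string path, Encoding encoding)
    {
        XmlDocument doc = new XmlDocument();
        using (StreamReader reader = new StreamReader(path, encoding))
        {
            doc.Load(reader);
        }
        return doc;
    }
}
```

For a UTF-8 file that would be XmlLoading.Load(path, Encoding.UTF8). If the characters already look right in the XmlNode's InnerText, the reading side is probably fine and the problem is further downstream.", new System.Text.UTF8Encoding(false));
System.Xml.XmlDocument d = XmlLoading.Load(p, System.Text.Encoding.UTF8);
if (d.DocumentElement.InnerText != "\u00A7") throw new System.Exception("InnerText did not round-trip");
System.IO.File.Delete(p);
</test>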

Hope it helps.

Lars

Nov 15 '05 #4
Thanks Lars!!

Jon
"Lars Hansen" <madknight@___remove___post.cybercity.dk> wrote in message
news:gD**********************@news.easynews.com...
> StreamWriter sw = new StreamWriter(l, System.Text.Encoding.UTF8);

Sorry, it's like this instead (the second parameter is true to append, false to overwrite):

StreamWriter sw = new StreamWriter(l, false, System.Text.Encoding.UTF8);

Lars
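
A slightly fuller sketch of the same call, with the writer wrapped in a using block (the class name, method name and the emitBom switch are illustrative additions, not something from the posts above). One difference worth knowing: Encoding.UTF8 writes a byte order mark at the start of a new file, while File.CreateText writes UTF-8 without one.

```csharp
using System.IO;
using System.Text;

static class TextFileWriter
{
    // Writes text as UTF-8. The second StreamWriter argument is "append":
    // false overwrites an existing file, true adds to the end of it.
    // emitBom controls whether the 3-byte UTF-8 signature (EF BB BF)
    // is written at the start of the file.
    public static void WriteUtf8(string path, string data, bool emitBom)
    {
        Encoding utf8 = new UTF8Encoding(emitBom);
        using (StreamWriter sw = new StreamWriter(path, false, utf8))
        {
            sw.Write(data);
        } // Dispose flushes and closes the file even if Write throws.
    }
}
```

Either way the "§" itself is stored as the same two bytes, so switching between these two variants changes only the signature at the start of the file.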

Nov 15 '05 #5
UTF8 didn't work but Unicode seems to work.

Jon
Nov 15 '05 #6
Well crap, Unicode "fixes" the problem, but only as a workaround. On another
web app, an ASP classic app, now I get:

Active Server Pages error 'ASP 0239'
Cannot process file
/blog/default.htm, line 1
UNICODE ASP files are not supported.

So apparently Unicode files are the exception and not the norm. Unacceptable, then... So how on earth do I fix this? UTF8 encoding for the file output didn't do a thing for me, as I get those junk characters when I do.
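
For context, a small sketch of what "Unicode" means on the .NET side (the class and method names are made up for illustration). Encoding.Unicode is UTF-16 little-endian, so every character in the file, including plain ASCII markup, takes at least two bytes, and a new file starts with the signature FF FE. That is presumably what ASP classic is objecting to.

```csharp
using System.Text;

static class Utf16Demo
{
    // Encoding.Unicode is UTF-16 little-endian: "<" becomes 3C 00,
    // "\u00A7" becomes A7 00.
    public static byte[] ToUtf16(string text)
    {
        return Encoding.Unicode.GetBytes(text);
    }

    // The byte order mark a StreamWriter puts at the start of such a file.
    public static byte[] Utf16Preamble()
    {
        return Encoding.Unicode.GetPreamble();
    }
}
```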

FYI, the encoding of the XML file is UTF-8.

I hope, Lars, that you are still following this thread, otherwise I need to
start another one.

Jon
Nov 15 '05 #7
The thing to do is work out which encoding you actually *do* want. I
would suggest that if the files are going to be web pages, you
actually stick to ASCII for the contents, using &#x1234;-type entities
for non-ASCII characters.
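
A rough sketch of that approach (the class and method names are hypothetical, and this only handles the non-ASCII part; it does not escape markup characters such as < or &):

```csharp
using System.Text;

static class HtmlAscii
{
    // Replaces every non-ASCII character with a hexadecimal numeric
    // character reference (&#xHHHH;) and leaves ASCII untouched.
    public static string ToAsciiWithEntities(string text)
    {
        StringBuilder sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c < 128)
            {
                sb.Append(c);
            }
            else if (char.IsHighSurrogate(c) && i + 1 < text.Length
                     && char.IsLowSurrogate(text[i + 1]))
            {
                // Characters outside the BMP are stored as a surrogate pair.
                int codePoint = char.ConvertToUtf32(c, text[i + 1]);
                sb.Append("&#x").Append(codePoint.ToString("X")).Append(';');
                i++; // skip the low surrogate
            }
            else
            {
                sb.Append("&#x").Append(((int)c).ToString("X")).Append(';');
            }
        }
        return sb.ToString();
    }
}
```

With this, "§" is written as &#xA7; and the resulting file is pure ASCII, so it reads the same whichever of the common encodings the server or browser assumes.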

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 15 '05 #8
Go figure. This works. Now I'm ticked ...

// Needs: using System.IO; using System.Text; using System.Xml; using System.Windows.Forms;
XmlDocument xDoc = new XmlDocument();
xDoc.LoadXml("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n<test />");
XmlNode testNode = xDoc.SelectSingleNode("/test");
testNode.InnerText = "§";

// Write the node text as UTF-8, overwriting any existing file.
StreamWriter sw = new StreamWriter(@"C:\__test.htm", false, Encoding.UTF8);
sw.Write(testNode.InnerText);
sw.Close();

// File.OpenText also decodes as UTF-8, so the text round-trips intact.
StreamReader sr = File.OpenText(@"C:\__test.htm");
string s = sr.ReadToEnd();
sr.Close();
//File.Delete(@"C:\__test.htm");
MessageBox.Show(s);
Jon
Nov 15 '05 #9
OK, right now, looking at the uploaded file as a flat file from within IE, it
looks fine. The file is embedded via <!--#include...--> in an ASP classic
page with no specified encoding, and that is where I see the junk characters.
So the problem seems to be one of:

* ASP classic's #INCLUDE feature not handling UTF-8 files properly,
* ASP classic's serving of UTF-8 encoded files, or
* IE's presentation of a downloaded HTML resource that contains UTF-8 text.
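
Whichever of these it turns out to be, it may help to first confirm what actually ended up in the file on disk. A small sketch (the class and method names are made up for illustration) that reports which byte order mark, if any, the file starts with:

```csharp
using System.IO;

static class BomSniffer
{
    // Returns a label for the byte order mark at the start of the file,
    // or "none" if it does not start with one of these signatures.
    // UTF-32 signatures are not distinguished in this sketch.
    public static string Describe(string path)
    {
        byte[] head = new byte[3];
        int read;
        using (FileStream fs = File.OpenRead(path))
        {
            read = fs.Read(head, 0, head.Length);
        }
        if (read >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
            return "UTF-8 BOM";
        if (read >= 2 && head[0] == 0xFF && head[1] == 0xFE)
            return "UTF-16 LE BOM";
        if (read >= 2 && head[0] == 0xFE && head[1] == 0xFF)
            return "UTF-16 BE BOM";
        return "none";
    }
}
```

A UTF-8 file without a signature carries no marker of its own, so anything that reads it has to be told the encoding some other way or it will fall back to a default.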

I will have to move this to an ASP Classic or IE newsgroup. Alternatively, I
can change the chars to &#[###]; but I really don't want to do that, as it
changes the content, which may have unintended repercussions for the users
of my software (weblogging / blogging software).

Jon
Nov 15 '05 #10
