encoding question

Hello,

tried this in framework.aspnet without any luck so far, maybe someone here has
a comment ...

TIA, Jim

.NET, C#, HttpHandler, plain HTML form at the browser.

GBP pound sign problem (I know, I know - I *can* decode it, but I've got to
understand what I should be doing and why)

I am uploading text data from a form. This data is either typed directly into a
textarea or arrives as a file stream originating from a .txt file (or some other
basic text file, such as one from a Mac or Unix; at present I can't assume it's
only .txt).

The page encoding is :-
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

On arrival at the server the content encoding is, sure enough, UTF-8.

Data input via the textarea and read into a string is displayed in the debugger
as pounds (£).

Data arriving as a file stream contains a single byte, 0xA3, for each GBP pound
sign.

I process the input stream like this :-

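using System.IO;
using System.Text;

public static string StreamToString(Stream aStream)
{
    // Rewind and read the whole stream; Read may return fewer bytes than asked for.
    aStream.Position = 0;
    byte[] buffer = new byte[aStream.Length];
    int offset = 0;
    while (offset < buffer.Length)
    {
        int read = aStream.Read(buffer, offset, buffer.Length - offset);
        if (read == 0) break; // end of stream reached early
        offset += read;
    }
    return BytesToUTF8String(buffer);
}

public static string BytesToUTF8String(byte[] Array)
{
    // Decodes the bytes as UTF-8; a lone 0xA3 byte is not a valid UTF-8 sequence.
    Encoding utf8 = Encoding.UTF8;
    char[] utf8Chars = new char[utf8.GetCharCount(Array, 0, Array.Length)];
    utf8.GetChars(Array, 0, Array.Length, utf8Chars, 0);

    return new string(utf8Chars);
}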

The resulting string contains nothing ...

If I use ASCII instead of UTF8, I get sensible text except that my GBP signs
come out as question marks (?).

If I use UTF7 I get an apparently OK decoding.

I am dubious about using UTF7 for no better reason than that it works. Is there
logic here? What should I be doing?

Thanks,
Jim
Nov 16 '05 #1
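
For anyone reproducing the symptoms in the post above, a single byte is enough. What follows is only a minimal sketch, not the poster's handler: the class name `PoundByteDemo` and the helper `Describe` are invented for illustration, and it assumes a current .NET runtime, whose handling of invalid UTF-8 may differ from the runtime used in the original post. Code points are printed as hex so the result does not depend on console fonts.

```csharp
using System;
using System.Linq;
using System.Text;

public static class PoundByteDemo
{
    // Hypothetical helper: decodes the bytes with the given encoding and
    // returns the resulting UTF-16 code units as "U+XXXX" strings.
    public static string Describe(Encoding encoding, byte[] bytes)
    {
        string text = encoding.GetString(bytes);
        return string.Join(" ", text.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    public static void Main()
    {
        byte[] singleByte = { 0xA3 };       // the byte found in the uploaded file
        byte[] utf8Pound = { 0xC2, 0xA3 };  // how UTF-8 itself encodes U+00A3

        Console.WriteLine("ASCII, A3      : " + Describe(Encoding.ASCII, singleByte));
        Console.WriteLine("UTF-8, A3      : " + Describe(Encoding.UTF8, singleByte));
        Console.WriteLine("ISO-8859-1, A3 : " + Describe(Encoding.GetEncoding(28591), singleByte));
        Console.WriteLine("UTF-8, C2 A3   : " + Describe(Encoding.UTF8, utf8Pound));
    }
}
```

The point of the comparison is that a pound sign in a UTF-8 file is the two bytes C2 A3, so a file holding a bare A3 was saved in some single-byte encoding and is simply not UTF-8, whatever the page's charset says.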
Jim Lawton <uc**@use.your.initiative> wrote:

<snip>
> I am dubious about using UTF7 for no better reason than that it works. Is there
> logic here? What should I be doing?


You should probably be using Encoding.Default, or Encoding.GetEncoding
(28591) (i.e. ISO-8859-1).

You almost certainly *don't* want UTF-7 really.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #2
Jon Skeet [C# MVP] <sk***@pobox.com> wrote:
> > I am dubious about using UTF7 for no better reason than that it works. Is there
> > logic here? What should I be doing?
>
> You should probably be using Encoding.Default, or Encoding.GetEncoding
> (28591) (i.e. ISO-8859-1).
>
> You almost certainly *don't* want UTF-7 really.


Thinking about it further, ISO-8859-1 won't work either - basically you
need to know the original encoding of the file. It may well be Windows
CP-1252, which will be what Encoding.Default will probably return if
you're in Western Europe or the US, unless you've changed the defaults,
but really you're still going to be at the whim of files which *aren't*
written with that encoding :(

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #3
Jim,

In addition to Jon's reply: A3 is the pound sign in code page 1252
http://www.microsoft.com/globaldev/r.../sbcs/1252.htm

To force that encoding you could try something like
mystring = new System.IO.StreamReader(aStream, System.Text.Encoding.GetEncoding(1252)).ReadToEnd();

I hope this helps?

Cor
Nov 16 '05 #4
Jon Skeet [C# MVP] <sk***@pobox.com> wrote:
> Jon Skeet [C# MVP] <sk***@pobox.com> wrote:
> > > I am dubious about using UTF7 for no better reason than that it works. Is there
> > > logic here? What should I be doing?
> >
> > You should probably be using Encoding.Default, or Encoding.GetEncoding
> > (28591) (i.e. ISO-8859-1).
> >
> > You almost certainly *don't* want UTF-7 really.
>
> Thinking about it further, ISO-8859-1 won't work either


Apparently I didn't think about it enough. Ignore me - ISO-8859-1
*will* work for converting byte A3 into a pound sign. Whether it'll
work for the rest of the file depends on the contents of the file.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #5
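
The remark that ISO-8859-1 handles A3 but may not suit the rest of the file can be made concrete by comparing it with Windows CP-1252 byte by byte. This is a sketch under stated assumptions: the class and method names are hypothetical, and the `RegisterProvider` call reflects .NET Core / .NET 5+, where code page 1252 is not available until the code-pages provider is registered (on the classic .NET Framework it is typically available without that step).

```csharp
using System;
using System.Text;

public static class Latin1VersusCp1252
{
    // Assumption: running on .NET Core / .NET 5+, where code page 1252 is only
    // available after the code-pages provider has been registered.
    public static Encoding Cp1252()
    {
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(1252);
    }

    // Decodes a single byte and returns the resulting code point.
    public static int Decode(Encoding encoding, byte value)
    {
        return encoding.GetString(new[] { value })[0];
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding(28591); // ISO-8859-1
        Encoding cp1252 = Cp1252();

        // 0xA3 is the pound sign in both; 0x80 and 0x93 are where they part ways.
        foreach (byte b in new byte[] { 0xA3, 0x80, 0x93 })
        {
            Console.WriteLine("0x{0:X2}: ISO-8859-1 -> U+{1:X4}, CP-1252 -> U+{2:X4}",
                b, Decode(latin1, b), Decode(cp1252, b));
        }
    }
}
```

Both encodings map A3 to U+00A3. They differ in the 0x80 to 0x9F range, where CP-1252 has printable characters such as the euro sign (0x80) and curly quotes (0x93) and ISO-8859-1 has control codes, so text written on Windows with those characters would come out wrong under ISO-8859-1.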
On Wed, 12 Jan 2005 07:07:07 -0000, Jon Skeet [C# MVP] <sk***@pobox.com> wrote:
> Jon Skeet [C# MVP] <sk***@pobox.com> wrote:
> > Jon Skeet [C# MVP] <sk***@pobox.com> wrote:
> > > > I am dubious about using UTF7 for no better reason than that it works. Is there
> > > > logic here? What should I be doing?
> > >
> > > You should probably be using Encoding.Default, or Encoding.GetEncoding
> > > (28591) (i.e. ISO-8859-1).
> > >
> > > You almost certainly *don't* want UTF-7 really.
> >
> > Thinking about it further, ISO-8859-1 won't work either
>
> Apparently I didn't think about it enough. Ignore me - ISO-8859-1
> *will* work for converting byte A3 into a pound sign. Whether it'll
> work for the rest of the file depends on the contents of the file.

:-) ... thanks for all your thoughts Jon - if nothing else it gives me some
confidence that the whole encoding issue is a can of worms! It smells a bit of
the old "DLL Hell" to me - all we need is a few bytes on the front of any file
to say what encoding it is, but we'll never get it!

Jim
Nov 16 '05 #6
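
On the wish for "a few bytes on the front of any file": Unicode files can carry exactly that, a byte order mark (BOM), although plain single-byte files such as the one in this thread have none. The sketch below shows one way to honour a BOM when present and fall back to a chosen encoding otherwise. The class name `BomAwareReader` and the choice of ISO-8859-1 as the fallback are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Text;

public static class BomAwareReader
{
    // Reads the whole stream as text. If the stream starts with a Unicode
    // byte order mark, that encoding is used; otherwise the caller's
    // fallback is used. The encoding actually used is returned via encodingUsed.
    public static string ReadAllText(Stream stream, Encoding fallback, out Encoding encodingUsed)
    {
        // Third argument: detectEncodingFromByteOrderMarks.
        using (var reader = new StreamReader(stream, fallback, true))
        {
            string text = reader.ReadToEnd();
            encodingUsed = reader.CurrentEncoding; // only meaningful after reading
            return text;
        }
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding(28591); // assumed fallback

        // UTF-8 BOM (EF BB BF) followed by the UTF-8 encoding of the pound sign.
        byte[] withBom = { 0xEF, 0xBB, 0xBF, 0xC2, 0xA3 };
        // No BOM: just the single byte discussed in the thread.
        byte[] withoutBom = { 0xA3 };

        foreach (byte[] data in new[] { withBom, withoutBom })
        {
            Encoding used;
            string text = ReadAllText(new MemoryStream(data), latin1, out used);
            Console.WriteLine("{0}: U+{1:X4}", used.WebName, (int)text[0]);
        }
    }
}
```

A BOM only helps for files that have one; for everything else the fallback is still a guess, which is the situation described above.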
On Tue, 11 Jan 2005 22:22:56 +0100, "Cor Ligthert" <no************@planet.nl>
wrote:
> Jim,
>
> In addition to Jon's reply: A3 is the pound sign in code page 1252
> http://www.microsoft.com/globaldev/r.../sbcs/1252.htm
>
> To force that encoding you could try something like
> mystring = new System.IO.StreamReader(aStream, System.Text.Encoding.GetEncoding(1252)).ReadToEnd();
>
> I hope this helps?
>
> Cor


I'll give it a whirl - thanks ...
J

Nov 16 '05 #7
On Wed, 12 Jan 2005 07:07:07 -0000, Jon Skeet [C# MVP] <sk***@pobox.com> wrote:


> Apparently I didn't think about it enough. Ignore me - ISO-8859-1
> *will* work for converting byte A3 into a pound sign. Whether it'll
> work for the rest of the file depends on the contents of the file.

I think that's right - works well enough ... I've inspected ("watched") the
contents of the request, and I can't see anything which relates to the encoding
of the bytestream - just text/plain so I'm down to guessing. Input will always
be from the UK ...

Cheers Jim
Nov 16 '05 #8
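
When the request gives no charset and the encoding has to be guessed, one common heuristic is to try strict UTF-8 first and fall back to a single-byte encoding only if the bytes are not valid UTF-8. The following is a sketch of that idea rather than anything proposed in the thread: `GuessingDecoder` is a hypothetical name, and ISO-8859-1 is assumed as the fallback because it was reported above to work well enough for UK input.

```csharp
using System;
using System.Text;

public static class GuessingDecoder
{
    // Strict UTF-8: throws DecoderFallbackException on invalid input
    // instead of silently substituting or dropping bytes.
    private static readonly Encoding StrictUtf8 =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // Assumed fallback for this sketch: ISO-8859-1 (code page 28591).
    private static readonly Encoding Fallback = Encoding.GetEncoding(28591);

    public static string Decode(byte[] bytes)
    {
        try
        {
            return StrictUtf8.GetString(bytes);
        }
        catch (DecoderFallbackException)
        {
            // Not valid UTF-8, so treat every byte as one ISO-8859-1 character.
            return Fallback.GetString(bytes);
        }
    }

    public static void Main()
    {
        string fromLegacyFile = Decode(new byte[] { 0xA3 });      // bare A3
        string fromUtf8File = Decode(new byte[] { 0xC2, 0xA3 });  // UTF-8 pound sign
        Console.WriteLine(fromLegacyFile == fromUtf8File);        // both decode to U+00A3
    }
}
```

This remains a guess: a legacy file whose high bytes happen to form valid UTF-8 sequences would be misread, although that is unusual in ordinary English text.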
