encoding problem

Hi,

.NET C# HttpHandler, straight HTML form at the browser.

GBP pound sign problem (I know I know - I *can* decode it, but I've got to
understand what and why I should be doing stuff)

I am uploading text data from a form. This data is either typed directly into a
textarea, or is a file stream originating from a .txt file (or some other basic
text file, e.g. from Mac or Unix; at present I can't assume it's only .txt).

The page encoding is :-
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

On arrival at the server the content encoding is, sure enough, UTF-8.

Data input via the textarea and read into a string is displayed in the debugger
as pounds (£).

Data input as a filestream has in the stream single bytes containing 0xA3 for
the GBP pound sign.

I process the input stream like this :-

using System.IO;
using System.Text;

public static string StreamToString(Stream aStream)
{
    // Rewind, then read the whole stream into a buffer.
    aStream.Position = 0;
    byte[] buffer = new byte[aStream.Length];

    // Read may return fewer bytes than requested, so loop until done.
    int offset = 0;
    while (offset < buffer.Length)
    {
        int read = aStream.Read(buffer, offset, buffer.Length - offset);
        if (read == 0) break; // end of stream reached early
        offset += read;
    }
    return BytesToUTF8String(buffer);
}

public static string BytesToUTF8String(byte[] bytes)
{
    // Decode the bytes as UTF-8.
    return Encoding.UTF8.GetString(bytes);
}

The resulting string contains nothing ...

If I use ASCII instead of UTF8, the text is readable except that my GBP signs
come out as question marks (?).

If I use UTF7 I get an apparently OK decoding.

I am dubious about using UTF7 for no better reason than that it works. Is there
logic here? What should I be doing?

Thanks,
Jim
Nov 19 '05 #1


It doesn't really matter what encoding you use for the page response; what's
important is the encoding used on the post from the browser. The browser
picks this (though often it will match the page). You should check the
Content-Type header the browser sends to determine the character set. For an
HTML form post (application/x-www-form-urlencoded), ISO-8859-1 is the default
character set, not UTF-8.
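
A minimal sketch of that idea, assuming the Content-Type value is available as a plain string. EncodingFromContentType is a hypothetical helper (not a framework API) that picks a decoder from the charset parameter and falls back to ISO-8859-1 when none is given. The byte values are the ones from the question.

```csharp
using System;
using System.Text;

public static class CharsetDemo
{
    // Hypothetical helper: choose a decoder from a Content-Type header value.
    // Falls back to ISO-8859-1 when the header carries no charset parameter.
    public static Encoding EncodingFromContentType(string contentType)
    {
        const string key = "charset=";
        if (contentType != null)
        {
            int i = contentType.IndexOf(key, StringComparison.OrdinalIgnoreCase);
            if (i >= 0)
            {
                string name = contentType.Substring(i + key.Length);
                int end = name.IndexOf(';');
                if (end >= 0) name = name.Substring(0, end);
                // Throws ArgumentException if the charset name is unknown.
                return Encoding.GetEncoding(name.Trim().Trim('"'));
            }
        }
        return Encoding.GetEncoding("iso-8859-1");
    }

    public static void Main()
    {
        byte[] latin1Pound = { 0xA3 };       // pound sign in ISO-8859-1 (one byte)
        byte[] utf8Pound = { 0xC2, 0xA3 };   // pound sign in UTF-8 (two bytes)

        Encoding fallback = EncodingFromContentType("application/x-www-form-urlencoded");
        Encoding declared = EncodingFromContentType("text/plain; charset=utf-8");

        Console.WriteLine(fallback.GetString(latin1Pound) == "\u00A3"); // True
        Console.WriteLine(declared.GetString(utf8Pound) == "\u00A3");   // True
        // A lone 0xA3 byte is not valid UTF-8, so this does not yield a pound sign.
        Console.WriteLine(declared.GetString(latin1Pound) == "\u00A3"); // False
    }
}
```

The point is only that the decoder has to match the bytes actually sent: a single 0xA3 byte is a pound sign in ISO-8859-1, while UTF-8 encodes the same character as the two bytes 0xC2 0xA3.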

-- bruce (sqlwork.com)
Nov 19 '05 #2

On Tue, 11 Jan 2005 10:03:03 -0800, "bruce barker" <no***********@safeco.com>
wrote:
| it doesn't really matter what encoding you use for the page response, what's
| important is the encoding used on the post from the browser. the browser
| picks this (though often it will match). you should check the content-type
| header the browser sends to determine the character set. for a html form
| post (application/x-www-form-urlencoded) ISO-8859-1 is the default character
| set not utf8.
|
| -- bruce (sqlwork.com)


Thanks Bruce,

for anyone googling this topic in future, there's more in
dotnet.languages.csharp
Message-ID: <36********************************@4ax.com>

cheers Jim
Nov 19 '05 #3
