Non-ascii characters in VS.NET service

I've created a simple .NET 1.1 web service using VS.NET 2003: it has one
method that takes a string parameter. It iterates through the input string,
turning each character into hex and appending it to an output string, and
returns the result.

I now send this service SOAP messages containing non-ASCII characters in the
field that becomes the input string. Each SOAP message has an XML header
that correctly describes the format of the non-ASCII characters. (I've
tried both iso-8859-1 and utf-8).

For some reason, each XML character that's non-ASCII has been turned into a
question mark "?" in the input string. Actually, a UTF-8 character that
contains two bytes becomes two question marks. This happens before any of
my code (or any code VS.NET is willing to show me) runs, so I'm at a loss to
know how to investigate it. Question marks are often generated when trying
to represent a character in a character set that doesn't contain it, but in
this case the target is a C# string, which can represent any Unicode
character.

I'd appreciate any insights about this.
Feb 9 '07 #1
Mike Schilling <ms*************@hotmail.com> wrote:
> I've created a simple .NET 1.1 web service using VS.NET 2003: it has one
> method that takes a string parameter. It iterates through the input string,
> turning each character into hex and appending it to an output string, and
> returns the result.

How is it turning the character into hex?

> I now send this service SOAP messages containing non-ASCII characters in the
> field that becomes the input string. Each SOAP message has an XML header
> that correctly describes the format of the non-ASCII characters. (I've
> tried both iso-8859-1 and utf-8).

What do you mean by "an XML header"? It should just be in the XML
declaration.

> For some reason, each XML character that's non-ASCII has been turned into a
> question mark "?" in the input string. Actually, a UTF-8 character that
> contains two bytes becomes two question marks.

That suggests that whatever's producing the XML file is wrong, *or*
that you're looking at the XML in an inappropriate editor. How are you
looking at the XML?

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Feb 9 '07 #2

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
> Mike Schilling <ms*************@hotmail.com> wrote:
>> I've created a simple .NET 1.1 web service using VS.NET 2003: it has one
>> method that takes a string parameter. It iterates through the input
>> string, turning each character into hex and appending it to an output
>> string, and returns the result.
>
> How is it turning the character into hex?

ret += "0x" + ((int)s[i]).ToString("X");

>> I now send this service SOAP messages containing non-ASCII characters in
>> the field that becomes the input string. Each SOAP message has an XML
>> header that correctly describes the format of the non-ASCII characters.
>> (I've tried both iso-8859-1 and utf-8).
>
> What do you mean by "an XML header"? It should just be in the XML
> declaration.

Exactly. Each SOAP message specifies the correct encoding in its XML
declaration, as shown below.

>> For some reason, each XML character that's non-ASCII has been turned into
>> a question mark "?" in the input string. Actually, a UTF-8 character that
>> contains two bytes becomes two question marks.
>
> That suggests that whatever's producing the XML file is wrong, *or*
> that you're looking at the XML in an inappropriate editor. How are you
> looking at the XML?

<?xml version="1.0" encoding="iso-8859-1"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema"
               xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <ToHex xmlns="http://tempuri.org/">
      <s>aéìæf</s>
    </ToHex>
  </soap:Body>
</soap:Envelope>

NNTP is likely to garble the non-ASCII characters, but in octal the bytes of
the string inside the <s> tags are

141 351 354 346 146

verified by using od -b.

In iso-8859-1, these are respectively

a, e with an acute accent, i with a grave accent, ae, f
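
The octal dump above can be checked mechanically. The following is a minimal sketch, assuming a current .NET runtime rather than the .NET 1.1 used in this thread; the class name OctalDump and the helper DecodeOctal are made up for illustration and are not part of the service discussed here. It parses the byte values printed by od -b, decodes them as iso-8859-1, and prints each resulting character in hex.

```csharp
using System;
using System.Text;

public static class OctalDump
{
    // Hypothetical helper: parse space-separated octal byte values (as printed
    // by `od -b`) and decode the resulting bytes using the named encoding.
    public static string DecodeOctal(string octalBytes, string encodingName)
    {
        string[] parts = octalBytes.Split(new char[] { ' ' },
                                          StringSplitOptions.RemoveEmptyEntries);
        byte[] bytes = new byte[parts.Length];
        for (int i = 0; i < parts.Length; i++)
        {
            bytes[i] = Convert.ToByte(parts[i], 8); // 8 = parse as octal
        }
        return Encoding.GetEncoding(encodingName).GetString(bytes);
    }

    public static void Main()
    {
        string s = DecodeOctal("141 351 354 346 146", "iso-8859-1");
        foreach (char c in s)
        {
            // Same per-character hex conversion idea as the ToHex method.
            Console.Write("0x{0:X} ", (int)c);
        }
        Console.WriteLine();
    }
}
```

The five bytes should come out as 0x61 0xE9 0xEC 0xE6 0x66, which are the code points of a, é, ì, æ and f, so the file on disk is what it claims to be.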

And I can parse the same file locally and observe that it's correct (e.g.
with the program below). It only acts oddly when processed by the web
service.

using System;
using System.Xml;

namespace XMLParser
{
    class ParseXML
    {
        static void Main(string[] args)
        {
            // Load the same SOAP message from a local file.
            XmlDocument doc = new XmlDocument();
            doc.Load("c:\\java\\toHex.xml");
            dumpStrings(doc);
            Console.WriteLine("<Done>");
        }

        // Recursively print the value of every character-data node.
        private static void dumpStrings(XmlNode node)
        {
            if (node is XmlCharacterData)
            {
                Console.Out.WriteLine(node.Value);
            }
            else
            {
                for (XmlNode child = node.FirstChild;
                     child != null;
                     child = child.NextSibling)
                {
                    dumpStrings(child);
                }
            }
        }
    }
}

Feb 9 '07 #3
Mike Schilling <ap@newsgroup.nospam> wrote:

<snip>
> And I can parse the same file locally and observe that it's correct (e.g.
> with the program below). It only acts oddly when processed by the web
> service.

That's pretty odd :(

I've passed non-ASCII characters in web services before with no
problems... this is very odd.

Do you have a solution with just a web service and just a test app that
I could have a look at?

Feb 10 '07 #4

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
> Mike Schilling <ap@newsgroup.nospam> wrote:
>
> <snip>
>> And I can parse the same file locally and observe that it's correct (e.g.
>> with the program below). It only acts oddly when processed by the web
>> service.
>
> That's pretty odd :(
>
> I've passed non-ASCII characters in web services before with no
> problems... this is very odd.
>
> Do you have a solution with just a web service and just a test app that
> I could have a look at?

Here's the web service:

using System;
using System.Collections;
using System.ComponentModel;
using System.Data;
using System.Diagnostics;
using System.Web;
using System.Web.Services;

namespace HexString
{
    public class Service1 : System.Web.Services.WebService
    {
        public Service1()
        {
            InitializeComponent();
        }

        private IContainer components = null;

        private void InitializeComponent() { }

        protected override void Dispose(bool disposing)
        {
            if (disposing && components != null)
            {
                components.Dispose();
            }
            base.Dispose(disposing);
        }

        // Returns the UTF-16 code unit of each character of s in hex,
        // separated by ", " (for example "0x61, 0xE9").
        [WebMethod]
        public string ToHex(String s)
        {
            String ret = "";
            for (int i = 0; i < s.Length; i++)
            {
                ret += "0x" + ((int)s[i]).ToString("X");
                if (i < s.Length - 1)
                    ret += ", ";
            }
            return ret;
        }
    }
}

and here is the client

using System;
using System.Text;
using System.Net;

namespace HexStringClient
{
    class Client
    {
        [STAThread]
        static void Main(string[] args)
        {
            WebClient wc = new WebClient();
            byte[] bytes = Encoding.GetEncoding("iso-8859-1").GetBytes(doc);

            try
            {
                wc.Headers.Add("SOAPAction", "\"http://tempuri.org/ToHex\"");
                wc.Headers.Add("content-type", "text/xml");
                byte[] response =
                    wc.UploadData("http://localhost/HexString/Service1.asmx",
                                  "POST", bytes);
                Console.Out.WriteLine(Encoding.ASCII.GetString(response));
            }
            catch (Exception ex)
            {
                Console.Out.WriteLine(ex);
            }
        }

        static String doc =
            "<?xml version='1.0' encoding='iso-8859-1'?>\n" +
            "<soap:Envelope xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'\n" +
            "xmlns:xsd='http://www.w3.org/2001/XMLSchema'\n" +
            "xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'>\n" +
            "<soap:Body>\n" +
            "<ToHex xmlns='http://tempuri.org/'>\n" +
            " <s>a\u00e9\u00ec\u00e6f</s>\n" +
            "</ToHex>\n" +
            "</soap:Body>\n" +
            "</soap:Envelope>";
    }
}

Feb 11 '07 #5
Mike Schilling <ap@newsgroup.nospam> wrote:
>> Do you have a solution with just a web service and just a test app that
>> I could have a look at?
>
> Here's the web service:

Got it :)

The SOAP handler is using the content type at the HTTP level to decode
the data. If you change your client content type line to:

wc.Headers.Add("content-type", "text/xml; charset=ISO-8859-1");

then it works.
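
One way to keep the HTTP header and the body bytes from drifting apart is to derive both from the same Encoding object. This is only a sketch under that assumption, written for a current .NET runtime; ContentTypeDemo and XmlContentType are hypothetical names, not part of WebClient or of the client posted above.

```csharp
using System;
using System.Text;

public static class ContentTypeDemo
{
    // Hypothetical helper: build a Content-Type value whose charset
    // parameter names the same encoding used to produce the body bytes.
    public static string XmlContentType(Encoding bodyEncoding)
    {
        return "text/xml; charset=" + bodyEncoding.WebName;
    }

    public static void Main()
    {
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // The body bytes and the header value come from one Encoding instance,
        // so they cannot disagree with each other.
        byte[] body = latin1.GetBytes("<s>a\u00e9\u00ec\u00e6f</s>");
        Console.WriteLine(XmlContentType(latin1));
        Console.WriteLine(XmlContentType(Encoding.UTF8));
        Console.WriteLine(body.Length);
    }
}
```

In the posted client, the returned string would be passed as the second argument of wc.Headers.Add("content-type", ...) and the byte array to wc.UploadData. The encoding named in the XML declaration still has to be kept in step by hand.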

Feb 12 '07 #6

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
> Mike Schilling <ap@newsgroup.nospam> wrote:
>>> Do you have a solution with just a web service and just a test app that
>>> I could have a look at?
>>
>> Here's the web service:
>
> Got it :)
>
> The SOAP handler is using the content type at the HTTP level to decode
> the data. If you change your client content type line to:
>
> wc.Headers.Add("content-type", "text/xml; charset=ISO-8859-1");
>
> then it works.

So it does. Thanks.

Now, how did you figure this out?
Feb 12 '07 #7
Mike Schilling <ap@newsgroup.nospam> wrote:
>> The SOAP handler is using the content type at the HTTP level to decode
>> the data. If you change your client content type line to:
>>
>> wc.Headers.Add("content-type", "text/xml; charset=ISO-8859-1");
>>
>> then it works.
>
> So it does. Thanks.
>
> Now, how did you figure this out?

With the help of your app, I put a breakpoint in the method. Go up the
stack a few levels, have a look at the HttpRequest involved, look at
what it thinks the content encoding is, and hope :)

Feb 12 '07 #8

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
> Mike Schilling <ap@newsgroup.nospam> wrote:
>>> The SOAP handler is using the content type at the HTTP level to decode
>>> the data. If you change your client content type line to:
>>>
>>> wc.Headers.Add("content-type", "text/xml; charset=ISO-8859-1");
>>>
>>> then it works.
>>
>> So it does. Thanks.
>>
>> Now, how did you figure this out?
>
> With the help of your app, I put a breakpoint in the method. Go up the
> stack a few levels, have a look at the HttpRequest involved, look at
> what it thinks the content encoding is, and hope :)

Odd. Using VS.NET 2003, the only call stack I get is

hexstring.dll!HexString.Service1.ToHex(string s = "a???f") Line 57 C#
<non-user code>

But I can get the HttpRequest from the current stack frame and see that its
encoding is UTF-8. OK, let's change the client to send a UTF-8 string but
leave the content encoding unspecified. No, that still fails.

Trying a few more things gives these results:

Column 1: Encoding specified in content-type header
Column 2: Value of HttpRequest.ContentEncoding
Column 3: Apparent effective encoding

<none>       UTF8         ASCII
UTF8         UTF8         UTF8
ISO-8859-1   ISO-8859-1   ISO-8859-1
ASCII        ASCII        ASCII

I don't entirely understand line 1, but I do know how to solve the problem.
Thanks!
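
The symptoms in the first row of the table can be reproduced outside the web service. The sketch below assumes a current .NET runtime, where Encoding.ASCII replaces every byte above 0x7F with "?". It does not claim to show what the ASP.NET 1.1 SOAP handler does internally; MismatchDemo and Transcode are hypothetical names. For background, RFC 3023 says that text/xml sent without a charset parameter is to be treated as us-ascii, which would be consistent with that row, although nothing in this thread confirms that the handler follows that rule.

```csharp
using System;
using System.Text;

public static class MismatchDemo
{
    // Encode text with one encoding, then decode the bytes with another,
    // as a receiver would if it assumed the wrong charset.
    public static string Transcode(string text, Encoding sender, Encoding receiver)
    {
        return receiver.GetString(sender.GetBytes(text));
    }

    public static void Main()
    {
        string text = "a\u00e9\u00ec\u00e6f";
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        // One byte per accented character, so one "?" each.
        Console.WriteLine(Transcode(text, latin1, Encoding.ASCII));

        // Two bytes per accented character in UTF-8, so two "?" each.
        Console.WriteLine(Transcode(text, Encoding.UTF8, Encoding.ASCII));

        // Matching encodings round-trip cleanly.
        Console.WriteLine(Transcode(text, latin1, latin1) == text);
    }
}
```

The first line printed is a???f, the same value the debugger shows for s, and the second is a??????f, matching the earlier observation that a two-byte UTF-8 character becomes two question marks.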
Feb 12 '07 #9
Mike Schilling <ap@newsgroup.nospam> wrote:
>> With the help of your app, I put a breakpoint in the method. Go up the
>> stack a few levels, have a look at the HttpRequest involved, look at
>> what it thinks the content encoding is, and hope :)
>
> Odd. Using VS.NET 2003, the only call stack I get is
>
> hexstring.dll!HexString.Service1.ToHex(string s = "a???f") Line 57 C#
> <non-user code>

You need to show the non-user code in order to get further up the
stack.

> But I can get the HttpRequest from the current stack frame and see that its
> encoding is UTF-8. OK, let's change the client to send a UTF-8 string but
> leave the content encoding unspecified. No, that still fails.
>
> Trying a few more things gives these results:
>
> Column 1: Encoding specified in content-type header
> Column 2: Value of HttpRequest.ContentEncoding
> Column 3: Apparent effective encoding
>
> <none>       UTF8         ASCII
> UTF8         UTF8         UTF8
> ISO-8859-1   ISO-8859-1   ISO-8859-1
> ASCII        ASCII        ASCII
>
> I don't entirely understand line 1, but I do know how to solve the problem.
> Thanks!

That's very odd...

Feb 12 '07 #10

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP***********************@msnews.microsoft.com...
> Mike Schilling <ap@newsgroup.nospam> wrote:
>>> With the help of your app, I put a breakpoint in the method. Go up the
>>> stack a few levels, have a look at the HttpRequest involved, look at
>>> what it thinks the content encoding is, and hope :)
>>
>> Odd. Using VS.NET 2003, the only call stack I get is
>>
>> hexstring.dll!HexString.Service1.ToHex(string s = "a???f") Line 57 C#
>> <non-user code>
>
> You need to show the non-user code in order to get further up the
> stack.

Cool; I didn't know you could do that.
Feb 12 '07 #11
