Strange character transformations

Hi,

I have an ASP.Net website, which allows users to upload a file which is
then inserted into a database.

This is all fine until it reads a line with the string +Anu in it.
It transforms this to this char ɻ (which, if Googled for, is
described as Unicode Character 'LATIN SMALL LETTER TURNED R WITH HOOK'
(U+027B) or, in Phonetics, as a 'Retroflex approximant'.)

Has anyone seen this behaviour before, and know how to stop it?
The code's simple - here's an example. The ɻ appears in the output
where the input is +Anu - it's transformed before I can touch it!

string line;
using (StreamReader sr = new StreamReader(strFile, System.Text.Encoding.UTF7)) {
    // Read and display lines from the file until the end of the file is reached.
    while ((line = sr.ReadLine()) != null) {
        Response.Write(line);
    }
}

Regards

Adam

Aug 9 '06 #1
Looks like an encoding issue, alright.
Have you tried using the StreamReader constructor that does not require a
character encoding?

Aug 9 '06 #2
Graven,

I'm not sure how a 4 letter string like this could be seen as an
encoding issue, but I will certainly give it a go. Thanks for the
suggestion.

Adam

Graven wrote:
Try plain Latin-1 encoding. I think it's a Unicode normalization
issue, but I don't know whether StreamReader performs normalization by
default.
Aug 12 '06 #3
Larry,

You were spot on - changing to UTF8 stopped this transformation. Thanks.

It's not quite solved my problem though.
The file is a text file, each line being a series of fields delimited by
the ¦ character, as this was unlikely to ever appear in the actual
data.

Unfortunately, UTF8 encoding strips these characters completely. ASCII
encoding, on the other hand, replaces them with "?".

Oh the joy of character encoding.

Regards

Adam

Larry Lard wrote:
This is why you are seeing what you are seeing. UTF7 encodes characters
outside the printable 7 bit range using UTF16 then modified base64, with
+ as the indicator mark for this encoding. I haven't checked, but I
imagine +Anu is the UTF7 encoding of that character. You shouldn't use a
UTF7 reader to read a file that you don't know for sure was produced by
a UTF7 writer.
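
The guess about +Anu can be checked by hand. The base64 characters A, n and u carry the 6-bit values 0, 39 and 46, which is the bit string 000000 100111 101110. The first 16 bits are 0x027B, and a lenient decoder throws away the two left over. Below is a minimal sketch of the same check in C#, assuming a runtime where Encoding.UTF7 is still usable (it is marked obsolete, warning SYSLIB0001, from .NET 5 onward). The class and method names are made up for illustration and are not from the original code.

```csharp
using System;
using System.Text;

static class Utf7Demo
{
    // Decodes raw bytes the way a StreamReader built with Encoding.UTF7 would.
    // Assumption: Encoding.UTF7 is available (obsolete in .NET 5 and later).
    public static string DecodeAsUtf7(byte[] raw)
    {
        return Encoding.UTF7.GetString(raw);
    }

    // Decodes the same bytes as UTF-8 for comparison.
    public static string DecodeAsUtf8(byte[] raw)
    {
        return Encoding.UTF8.GetString(raw);
    }

    static void Main()
    {
        // The four plain ASCII bytes that appear in the uploaded file.
        byte[] raw = Encoding.ASCII.GetBytes("+Anu");

        // Print the code points produced by the UTF-7 decoder.
        // The thread reports a single U+027B here.
        foreach (char c in DecodeAsUtf7(raw))
        {
            Console.Write("U+{0:X4} ", (int)c);
        }
        Console.WriteLine();

        // The UTF-8 decoder leaves the four ASCII characters untouched.
        Console.WriteLine(DecodeAsUtf8(raw));
    }
}
```

This is why a four-character ASCII string can still be an encoding issue: under UTF-7 the + sign switches the decoder into base64 mode, so ordinary text that happens to follow a + gets reinterpreted.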

The correct way to read the file depends on what kind of file it is. If
it is text of an unknown encoding, there is no way to be absolutely
sure, but UTF8 is a good starting point. If it's binary data, you
shouldn't be using a TextReader class at all.
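
The ¦ delimiter problem described above looks like the same kind of mismatch. A rough sketch follows, assuming (the thread does not say) that the file was saved in a single-byte encoding such as Latin-1 or Windows-1252, where ¦ is the single byte 0xA6. That byte is not valid UTF-8 on its own and is outside the ASCII range, so neither decoder can give the delimiter back. The helper name SplitFields and the sample bytes are hypothetical.

```csharp
using System;
using System.Text;

static class DelimiterDemo
{
    // Decodes one raw line with the given encoding, then splits it on
    // the broken-bar delimiter (U+00A6).
    public static string[] SplitFields(byte[] rawLine, Encoding encoding)
    {
        return encoding.GetString(rawLine).Split('\u00A6');
    }

    static void Main()
    {
        // "a¦b" as a Latin-1 editor would save it: 0x61 0xA6 0x62.
        byte[] rawLine = { 0x61, 0xA6, 0x62 };

        // Assumption: the file really is Latin-1. "iso-8859-1" is a
        // built-in encoding name, so no extra provider is needed.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        Console.WriteLine(SplitFields(rawLine, latin1).Length);         // 2: delimiter survives
        Console.WriteLine(SplitFields(rawLine, Encoding.UTF8).Length);  // 1: 0xA6 is invalid UTF-8
        Console.WriteLine(SplitFields(rawLine, Encoding.ASCII).Length); // 1: 0xA6 becomes '?'
    }
}
```

If this assumption holds, passing the matching single-byte encoding to the StreamReader constructor should keep the delimiter intact. If the file turns out to be in some other encoding, the same test with that encoding is the thing to try.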
Aug 16 '06 #4
