StreamReader mutilates '+' characters in UTF-7 files

Hi,

I need to process files that are created in UTF-7 format.
This works fine up to the point where a '+' character
(0x2B/43) appears in the line. The string is mutilated...

The reader appears to have a bug - or am I doing
something wrong here???
The code:
StreamReader Reader = new StreamReader
(@"MyLocalFile.txt", System.Text.Encoding.UTF7);
string sLine;
while((sLine = Reader.ReadLine()) != null)
{
// Process the line
}

The text:
#(6+sections)

Can anybody give me a clue to what is happening here?

Thanks, Hans
Nov 15 '05 #1
Hans <an*******@discussions.microsoft.com> wrote:
> I need to process files that are created in UTF-7 format.
> This works fine up to the point where a '+' character
> (0x2B/43) appears in the line. The string is mutilated...
>
> The reader appears to have a bug - or am I doing
> something wrong here???
> The code:
> StreamReader Reader = new StreamReader
> (@"MyLocalFile.txt", System.Text.Encoding.UTF7);
> string sLine;
> while((sLine = Reader.ReadLine()) != null)
> {
> // Process the line
> }
>
> The text:
> #(6+sections)
>
> Can anybody give me a clue to what is happening here?

Are you absolutely sure it's UTF-7? In UTF-7, the "+" character
signifies a shift into a modified Base64 mode. See
http://www.faqs.org/rfcs/rfc2152.html for more details.
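
To see the shift in action, here is a minimal sketch that decodes the sample line as UTF-7, and then a variant where the plus sign is written as "+-", which RFC 2152 defines as the encoding of a literal '+'. The class and method names are made up for illustration, and the pragma is only there because newer .NET versions flag Encoding.UTF7 as obsolete.

```csharp
using System;
using System.Text;

#pragma warning disable SYSLIB0001 // Encoding.UTF7 is flagged obsolete on newer .NET

public static class Utf7PlusDemo
{
    // Treats the ASCII bytes of 'text' as UTF-7 data and decodes them.
    public static string DecodeAsUtf7(string text)
    {
        byte[] raw = Encoding.ASCII.GetBytes(text);
        return Encoding.UTF7.GetString(raw);
    }

    public static void Main()
    {
        // '+' starts a Base64 run, so "sections" is consumed as Base64 data.
        string mangled = DecodeAsUtf7("#(6+sections)");
        Console.WriteLine(mangled == "#(6+sections)"); // False

        // "+-" is the UTF-7 spelling of a literal plus sign.
        Console.WriteLine(DecodeAsUtf7("#(6+-sections)")); // #(6+sections)
    }
}
```

If the file really were UTF-7, whatever wrote it would have emitted "+-" for the plus sign, so a bare '+' in front of ordinary text is a strong hint that the file is in some other encoding.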

Where do you get this text file from? UTF-7 is not a very common
character encoding at all - UTF-8 is rather more likely.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #2
Every older piece of Windows software, as well as the Unix
world, produces UTF-7 type files. And that's where the file
comes from.

Please explain why Notepad/WordPad/Visual Studio/... CAN
display the contents of the file without a problem!

Regards, Hans
Nov 15 '05 #3
I just found out that
StreamReader Reader = new StreamReader
(@"MyLocalFile.txt", System.Text.Encoding.Default);

produces the desired result.

One of my books, however, says that "using the Default
property is discouraged".

Can anybody tell me why???

Thanks, Hans
Nov 15 '05 #4
<an*******@discussions.microsoft.com> wrote:
> I just found out that
> StreamReader Reader = new StreamReader
> (@"MyLocalFile.txt", System.Text.Encoding.Default);
>
> produces the desired result.

In which case, as I suspected, it *wasn't* UTF-7.

> One of my books, however, says that "using the Default
> property is discouraged".
>
> Can anybody tell me why???


It means that only people with the same default will get the same
results - and the default will depend on things like operating system,
regional settings etc.
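
As a rough illustration of that dependence, the sketch below decodes one and the same byte under two different Windows code pages. The class and method names are hypothetical, and the RegisterProvider call assumes a .NET Core 3.0+ / .NET 5+ runtime, where legacy code pages have to be enabled explicitly (on those runtimes Encoding.Default is always UTF-8 rather than the ANSI code page).

```csharp
using System;
using System.Text;

public static class CodePageDemo
{
    // Decodes the same bytes under a given Windows code page number.
    public static string Decode(byte[] data, int codePage)
    {
        // Needed on .NET Core / .NET 5+, where legacy code pages are opt-in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        return Encoding.GetEncoding(codePage).GetString(data);
    }

    public static void Main()
    {
        byte[] data = { 0xA3 }; // one byte as it might sit in an "ANSI" file

        // The console may not render these, but the decoded characters differ.
        Console.WriteLine(Decode(data, 1252)); // Western European: U+00A3 (pound sign)
        Console.WriteLine(Decode(data, 1250)); // Central European: U+0141 (L with stroke)
    }
}
```

A file written on a machine using code page 1252 and read with Encoding.Default on a machine using 1250 would go wrong in exactly this way.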

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #5
Hans <an*******@discussions.microsoft.com> wrote:
> Every older piece of Windows software, as well as the Unix
> world, produces UTF-7 type files. And that's where the file
> comes from.
>
> Please explain why Notepad/WordPad/Visual Studio/... CAN
> display the contents of the file without a problem!


I don't think UTF-7 means what you think it means. UTF-7 is a way of
encoding Unicode characters within ASCII files.
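
A small sketch of what that means in practice: encoding a string with a non-ASCII character as UTF-7 yields only 7-bit bytes, and decoding them gives the original text back. The names are illustrative, and the pragma again assumes a newer .NET where Encoding.UTF7 is marked obsolete.

```csharp
using System;
using System.Linq;
using System.Text;

#pragma warning disable SYSLIB0001 // Encoding.UTF7 is flagged obsolete on newer .NET

public static class Utf7AsciiDemo
{
    // Encodes text as UTF-7; every resulting byte fits in 7 bits.
    public static byte[] EncodeUtf7(string text)
    {
        return Encoding.UTF7.GetBytes(text);
    }

    public static void Main()
    {
        string original = "caf\u00E9"; // "cafe" with an acute accent on the e

        byte[] encoded = EncodeUtf7(original);
        Console.WriteLine(encoded.All(b => b < 0x80));               // True
        Console.WriteLine(Encoding.UTF7.GetString(encoded) == original); // True

        // The accented letter appears as a '+'-introduced Base64 run.
        Console.WriteLine(Encoding.ASCII.GetString(encoded));
    }
}
```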

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #6
Sun and Notepad (at least by default) produce ANSI encoded
files. 'System.Text.Encoding.Default' (like GetACP())
encodes according to... the system's current ANSI code
page.

If there's another way to get ANSI-encoding do tell me!!!

Hans.

Nov 15 '05 #7
Hans <an*******@discussions.microsoft.com> wrote:
> Sun and Notepad (at least by default) produce ANSI encoded
> files. 'System.Text.Encoding.Default' (like GetACP())
> encodes according to... the system's current ANSI code
> page.
>
> If there's another way to get ANSI-encoding do tell me!!!


It's a case of *which* ANSI-encoding to use though. If you always use
the default one for the computer, it means that if you transfer files
to/from another computer with a different default, you're in trouble.
If you let the user specify the encoding (using Encoding.Default as the
default, but not relying on it) you give a lot more flexibility - and
if you also give the option of reading/writing in UTF-8, you end up
with the full flexibility of Unicode in a fairly compact form.

(Certainly if you don't need an older tool to understand the file
you're writing, I'd go with UTF-8 virtually every time.)
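
One way this could look in code is sketched below: the reader takes the encoding as a parameter and only falls back to Encoding.Default when the caller supplies nothing. The class name, method name and temp-file usage are assumptions for the sake of a runnable example, not anything from the original program.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class ConfigurableReader
{
    // Reads all lines with the caller's encoding; falls back to the
    // machine default only when none is given.
    public static List<string> ReadLines(string path, Encoding encoding = null)
    {
        var lines = new List<string>();
        using (var reader = new StreamReader(path, encoding ?? Encoding.Default))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }

    public static void Main()
    {
        // Write a small UTF-8 file (no byte order mark), then read it back.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "#(6+sections)\n", new UTF8Encoding(false));

        foreach (string line in ReadLines(path, Encoding.UTF8))
        {
            Console.WriteLine(line); // #(6+sections)
        }
        File.Delete(path);
    }
}
```

The encoding would typically come from a setting or command-line option, so the same program keeps working when files move between machines with different code pages.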

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #8
