Lenard Gunda <frenzy@fbi.hu> wrote:
> I have the following problem. I need to read data from a TXT file our
> company receives.
> I would use StreamReader, and process it line by line using ReadLine,
> however, the following problem occurs.
>
> The file contains characters with ASCII codes above 128.
No it doesn't, because there are no such things. ASCII is a 7-bit
encoding, so its codes only run from 0 to 127.
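
As a rough illustration of that point, the sketch below decodes the same three bytes with three encodings and prints the resulting code points. The sample bytes and the choice of ISO-8859-1 are assumptions made for the demo, not something known about the file in question.

```csharp
using System;
using System.Text;

class DecodeDemo
{
    static void Main()
    {
        // Hypothetical sample: 'A', one byte above 127 (0xE9), and '+' (0x2B).
        byte[] data = { 0x41, 0xE9, 0x2B };

        Show("ASCII", Encoding.ASCII, data);
        Show("UTF-8", Encoding.UTF8, data);
        // ISO-8859-1 is only a guess; the real file may use a different code page.
        Show("ISO-8859-1", Encoding.GetEncoding("iso-8859-1"), data);
    }

    // Decodes the bytes with the given encoding and prints each char as U+XXXX.
    static void Show(string label, Encoding encoding, byte[] data)
    {
        string text = encoding.GetString(data);
        var codes = new StringBuilder();
        foreach (char c in text)
        {
            codes.AppendFormat("U+{0:X4} ", (int)c);
        }
        Console.WriteLine("{0}: {1}", label, codes.ToString().TrimEnd());
    }
}
```

On current .NET versions the ASCII and UTF-8 decoders cannot interpret a lone 0xE9 byte and substitute a replacement character for it, while a single-byte code page maps it to a letter. The '+' byte comes through unchanged in all three.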
> But the file is still text (nothing like UTF7/8 or the like).
UTF-7 and UTF-8 are text encodings - a file containing text in UTF-8
encoding is still a text file.
> It also might contain + signs. As a result:
>
> UTF8 encoding doesn't read characters above 128
> UTF7 encoding reads everything ok, except eats the + signs, and some
> characters after them
> ASCII encoding reads the + sign ok, however, characters above 128
> disappear.
>
> Because the file arrives in this form, I have no control over how it
> looks. The best idea so far was to write my own ReadLine method that
> reads the file byte by byte and converts using UTF7, while taking
> special care to feed the + character (ASCII code 43) to an ASCII encoder.
> This way I could build a string from a line, that contains exactly what's in
> the file.
>
> But would there be a nicer way, or is doing it manually the only option?
It sounds like you need to find out what encoding your file is
*really* in. Have you tried Encoding.Default? See
http://www.pobox.com/~skeet/csharp/unicode.html for more information.
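
A minimal sketch of trying that suggestion, assuming the file turns out to be in the system's default code page. The class name and the fallback path "input.txt" are placeholders for illustration only.

```csharp
using System;
using System.IO;
using System.Text;

class ReadLinesDemo
{
    static void Main(string[] args)
    {
        // Hypothetical path; pass the real file name as the first argument.
        string path = args.Length > 0 ? args[0] : "input.txt";
        if (!File.Exists(path))
        {
            Console.Error.WriteLine("File not found: " + path);
            return;
        }

        // Encoding.Default is the system ANSI code page on .NET Framework.
        // On .NET Core and .NET 5+ it is UTF-8, so there you would have to
        // name the code page explicitly instead.
        Encoding encoding = Encoding.Default;

        using (var reader = new StreamReader(path, encoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                Console.WriteLine(line);
            }
        }
    }
}
```

If the decoded text still looks wrong, swapping in another single-byte encoding via Encoding.GetEncoding is the next thing to try; no hand-written ReadLine should be needed once the right encoding is known.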
--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too