EBCDIC Codepage file to Windows based file

Ram
Dear All,

Good Day

I am trying to convert a file generated on an AS/400 with code page 00420 (a mix of Arabic and English data) to a Windows-based file, without success. Using the same code (with 20420 changed to 708), I can convert a file from code page ASMO 708 to a Windows-based file, and the result is perfect. The code I tried is below. I used code page 20420, which is the nearest match for 00420, but no luck.

// Open the file for reading and set the encoding to 20420
StreamReader sr = new StreamReader((System.IO.Stream)File.OpenRead(strInFile), Encoding.GetEncoding(20420));

// Open the file for writing and set the encoding to the default
StreamWriter sw = new StreamWriter(strOutFile, false, Encoding.Default);

I then read the input file line by line and write each line to the output file:

// Read line
Line = sr.ReadLine();

// Loop logic here, then write to file

// Write line
sw.WriteLine(Line);

I verified in the Windows regional settings that code page 20420 is checked. Still, I am not able to convert the file. The output is full of question marks, which I believe is because the characters were not recognized.

Could anyone please help me solve this issue?

Thanks in advance
Ram
Nov 16 '05 #1
"Ram" <Ra*@discussions.microsoft.com> wrote in message
news:3B**********************************@microsoft.com...
> Anyone please help me to solve this issue.

This might help: http://www.eggheadcafe.com/articles/20030521.asp
Nov 16 '05 #2
Are you sure that all the characters in the file are available in
Encoding.Default?
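
A minimal sketch of why that question matters (not from the original post): when a .NET encoder meets a character that its target encoding cannot represent, it typically substitutes '?'. Here Encoding.ASCII is only a stand-in for whatever Encoding.Default resolves to on the machine, which is an assumption, and the class name is illustrative.

```csharp
using System;
using System.Text;

class FallbackDemo
{
    static void Main()
    {
        // 'A' followed by U+0633 ARABIC LETTER SEEN, written as an escape so
        // the encoding of this source file does not matter.
        string text = "A\u0633";

        // ASCII stands in for any target encoding that lacks Arabic letters.
        byte[] ascii = Encoding.ASCII.GetBytes(text);
        Console.WriteLine(BitConverter.ToString(ascii)); // 41-3F (0x3F is '?')

        // UTF-8 can represent every Unicode character, so nothing is lost.
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        Console.WriteLine(BitConverter.ToString(utf8));  // 41-D8-B3
    }
}
```

If the reading side turns out to be correct, question marks in the output would point at the writer's encoding rather than at code page 20420.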

I would suggest you separate the reading part from the writing part -
read some data and then write out the Unicode values you've read, eg

foreach (char c in line)
{
    Console.WriteLine((int)c);
}

Then look on www.unicode.org to see whether those are correct.

When you've got the reading of the data sorted, you can then move onto
trying to write it correctly.
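
As a rough, self-contained sketch of the "reading only" step suggested above: the program below decodes the first line of a file with code page 20420 and prints each character's code point, writing no output file at all. The class name and the command-line argument are hypothetical. The RegisterProvider call is an assumption about the runtime: it applies to .NET Core and later, where EBCDIC code pages come from the System.Text.Encoding.CodePages package, and it should be dropped on the .NET Framework used in the original post.

```csharp
using System;
using System.IO;
using System.Text;

class DumpCodePoints
{
    static void Main(string[] args)
    {
        // Only needed on .NET Core / .NET 5+; remove on .NET Framework.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        // Hypothetical input: path to the AS/400 export, passed as first argument.
        string inputPath = args[0];
        Encoding ebcdic = Encoding.GetEncoding(20420);

        using (StreamReader reader = new StreamReader(inputPath, ebcdic))
        {
            // Inspect just the first line; that is enough to check the decoding.
            string line = reader.ReadLine();
            if (line == null)
            {
                return; // empty file
            }

            foreach (char c in line)
            {
                // Print each UTF-16 code unit as U+XXXX for lookup in the charts.
                Console.WriteLine("U+{0:X4}", (int)c);
            }
        }
    }
}
```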

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #3
Ram
When you refer to Unicode values, I am confused. Is it because I am using the 20420 encoding that the string becomes Unicode?

I am now displaying the values, but which character mapping should I look at on Unicode.org?

I am new to this code page conversion stuff. If this is a basic question, please excuse me.

Thanks & Regards
Ram
Nov 16 '05 #4
Ram <Ra*@discussions.microsoft.com> wrote:
> When you refer to Unicode Values i am confused. Is it because I am
> using 20420 encoding the string becomes unicode?

Strings in .NET are *always* in Unicode.

> Now i am displaying the values, but which character mapping should i
> look in Unicode.org.

Look in http://www.unicode.org/charts and find the chart with the right
region of characters in. For instance, if you saw character 0x0785
you'd look down the list until you found
http://www.unicode.org/charts/PDF/U0780.pdf - unfortunately you
basically need to look at where the links go to work out which one to
follow.

I should have said before - it'll be easier for you if you print out
the values in hex, eg

Console.WriteLine ("{0:x}", (int)c);

> I am new to this codepage conversion stuff. If it is a basic question
> please excuse me.

It's hard to say whether it's a basic question or not at the moment :)

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information though.
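
The hex printing and chart lookup described above can be sketched roughly as follows. This is only an illustration: the ChartFor helper and its short table of blocks are assumptions made for this example (a handful of blocks likely to matter for mixed Arabic and English data), not a complete index of the Unicode charts, and the sample string is made up.

```csharp
using System;

class ChartLookup
{
    // Hypothetical helper: maps a character to the chart PDF covering its block.
    // Covers only a few blocks; everything else falls through to the index page.
    static string ChartFor(char c)
    {
        int cp = c;
        if (cp <= 0x007F) return "U0000.pdf (Basic Latin)";
        if (cp >= 0x0600 && cp <= 0x06FF) return "U0600.pdf (Arabic)";
        if (cp >= 0x0780 && cp <= 0x07BF) return "U0780.pdf (Thaana)";
        if (cp >= 0xFB50 && cp <= 0xFDFF) return "UFB50.pdf (Arabic Presentation Forms-A)";
        if (cp >= 0xFE70 && cp <= 0xFEFF) return "UFE70.pdf (Arabic Presentation Forms-B)";
        return "not in this short table; see http://www.unicode.org/charts";
    }

    static void Main()
    {
        // Made-up sample: 'A', an Arabic letter, and the 0x0785 value from the example.
        string line = "A\u0633\u0785";

        foreach (char c in line)
        {
            // Hex value first, then the chart to open for that value.
            Console.WriteLine("{0:x} -> {1}", (int)c, ChartFor(c));
        }
    }
}
```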

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #5
