
Convert DOS Cyrillic text to Unicode

How can I convert DOS Cyrillic text to Unicode?
Jul 21 '05 #1
17 Replies


Nikolay Petrov <jo******@mail.bg> wrote:
> How can I convert DOS Cyrillic text to Unicode?


See http://www.pobox.com/~skeet/csharp/unicode.html

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #2

I have read this and other material on Unicode.
My question is how I can do it in VB. I need the code.
Jul 21 '05 #3

Nikolay Petrov <jo******@mail.bg> wrote:
> I have read this and other material on Unicode.
> My question is how I can do it in VB. I need the code.


I provide some C# code to read a file in one encoding and write it in
another. It's very simple code - it should be easy to understand and
rewrite in VB.NET. The important thing is really just the creation of
the StreamReader with the right encoding.
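
A minimal sketch of that idea, written in Python rather than the C# or VB.NET discussed in this thread: open the file with the source encoding named explicitly (the counterpart of creating the StreamReader with the right encoding) and write the text back out in another encoding. The function name `convert_file` and the default of CP866 are assumptions for illustration; at this point the thread has not established which DOS code page the files actually use.

```python
def convert_file(src_path, dst_path, src_encoding="cp866", dst_encoding="utf-8"):
    """Read src_path as src_encoding and write the same text as dst_encoding.

    cp866 is only an assumed default; pass the code page the file really uses.
    """
    # newline="" keeps the original line endings untouched in both directions.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()  # 'text' is now a Unicode string
    with open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(text)
    return text
```

The only step that matters is naming the source encoding when the bytes are read; once the text is a Unicode string it can be written out in any encoding that covers Cyrillic.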

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #4

My problem is that I am not reading a file.
The DOS Cyrillic text is pasted into one text box and should appear in another.
That's all.
I don't have anything in binary form.
Jul 21 '05 #5

Hi Jon,

In the language.VB newsgroup I pointed Nikolay to you and to Jay B, who
answered a message there, although that answer was also not complete enough
for Nikolay. Jay B will probably not be active on this newsgroup before 13:00
GMT.

I am curious as well: which encoding do you think is the right one for this
Cyrillic problem?

Nikolay wrote in the language.VB group that he pasted it from Notepad,
so I guess UTF-16?

:-)

Cor


Jul 21 '05 #6

Nikolay Petrov <jo******@mail.bg> wrote:
> My problem is that I am not reading a file.
> The DOS Cyrillic text is pasted into one text box and should appear in another.
> That's all.
> I don't have anything in binary form.


If it's in a text box, you should have it as Unicode text already. All
strings are in Unicode in .NET.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #7

Cor Ligthert <no**********@planet.nl> wrote:
> In the language.VB newsgroup I pointed Nikolay to you and to Jay B, who
> answered a message there, although that answer was also not complete enough
> for Nikolay. Jay B will probably not be active on this newsgroup before 13:00
> GMT.
>
> I am curious as well: which encoding do you think is the right one for this
> Cyrillic problem?

Not sure - but it sounds like it won't actually be a problem, as if
he's got the data in Notepad to start with, there's no encoding change
required - cut and paste should sort everything out.

> Nikolay wrote in the language.VB group that he pasted it from Notepad,
> so I guess UTF-16?

No way - DOS precedes UTF-16 by a long time!

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #8

The user pastes text from text files that contain DOS Cyrillic characters.
When it is pasted into a text box, or even into a Notepad window, it looks
like garbage.
I am not sure whether I can post a file here as an attachment so you can see it.
Jul 21 '05 #9

Nikolay Petrov <jo******@mail.bg> wrote:
> The user pastes text from text files that contain DOS Cyrillic characters.

What does he have the text open in? It sounds like the existing app is
probably not putting it into the clipboard in Unicode :(

> When it is pasted into a text box, or even into a Notepad window, it looks
> like garbage.

Ah - I thought you meant he had it working in Notepad to start with.

> I am not sure whether I can post a file here as an attachment so you can see it.

It's probably best if you email it to me.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #10

Hi Jon,

> It's probably best if you email it to me.

I am also interested in this question, so why not send it to the newsgroup?

Cor
Jul 21 '05 #11

Cor Ligthert <no**********@planet.nl> wrote:
> > It's probably best if you email it to me.
>
> I am also interested in this question, so why not send it to the
> newsgroup?


It's more that depending on the way of attaching the file, it might get
converted during the attachment process - that's less likely to happen
in a mail message.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #12

> It's more that depending on the way of attaching the file, it might get
> converted during the attachment process - that's less likely to happen
> in a mail message.

So I'll wait for the results, and then maybe you can send it to me when all
is clear?

Cor
Jul 21 '05 #13

Cor Ligthert <no**********@planet.nl> wrote:
> > It's more that depending on the way of attaching the file, it might get
> > converted during the attachment process - that's less likely to happen
> > in a mail message.
>
> So I'll wait for the results, and then maybe you can send it to me when all
> is clear?


Yup, sure. I suspect there's nothing particularly interesting about the
file though - it's just I should be able to work out what encoding it's
in, so that if the OP *does* want to read it directly (rather than with
c'n'p) he should be able to.
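
One rough way to work out the encoding from the raw bytes, sketched here in Python as an illustration and not as what was actually done with the mailed file: decode the bytes with each candidate code page and see which result consists mostly of ordinary Cyrillic letters. The candidate list, the scoring rule and the names `cyrillic_score` and `rank_encodings` are all assumptions made for this sketch.

```python
# Candidate single-byte Cyrillic code pages to try (an assumed, incomplete list).
CANDIDATES = ("cp866", "cp855", "koi8_r", "cp1251", "iso8859_5")


def cyrillic_score(text):
    """Fraction of non-ASCII characters that are basic Cyrillic letters (А..я)."""
    high = [ch for ch in text if ord(ch) > 0x7F]
    if not high:
        return 0.0
    letters = sum(1 for ch in high if "\u0410" <= ch <= "\u044f")
    return letters / len(high)


def rank_encodings(raw, candidates=CANDIDATES):
    """Return (score, encoding, decoded_text) tuples, best score first."""
    results = []
    for name in candidates:
        # errors="replace" keeps undefined bytes from raising; they just score badly.
        text = raw.decode(name, errors="replace")
        results.append((cyrillic_score(text), name, text))
    results.sort(key=lambda item: item[0], reverse=True)
    return results
```

This is only a heuristic: short samples can tie between code pages, so the top-ranked decoding still needs to be checked by someone who can read the text.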

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #14

OK guys, I have mailed it to both of you.

I'll also put some of this DOS text here, in case anyone else is interested:

???<?'? ?? 6 ?. 2004??".
Jul 21 '05 #15

New problem ;-(
The text is only partially converted.
All capital letters are fine, and some of the lowercase ones, but not all.
What may have caused this?
Jul 21 '05 #16

Nikolay Petrov <jo******@mail.bg> wrote:
> New problem ;-(
> The text is only partially converted.

At what stage?

> All capital letters are fine, and some of the lowercase ones, but not all.
> What may have caused this?

No idea - are you saying the original files are corrupt, basically?

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #17
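
One possible explanation for that symptom, which nobody in this thread confirms: the files may be in a different DOS Cyrillic code page than the one used for the conversion. CP866 and the Bulgarian MIK code page agree on all capital letters and on lowercase а..п (bytes 0x80..0xAF), but CP866 stores р..я at 0xE0..0xEF while MIK continues straight on at 0xB0..0xBF. MIK text decoded as CP866 therefore shows correct capitals, some correct lowercase letters, and box-drawing characters for the rest. The Python sketch below uses hand-written helpers (`encode_mik_letters`, `decode_mik_letters`, both hypothetical names) that cover only ASCII and the letters А..я, purely to show the effect.

```python
def encode_mik_letters(text):
    """Encode ASCII and the letters А..я using the MIK layout (0x80..0xBF)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(cp)
        elif 0x0410 <= cp <= 0x044F:
            out.append(0x80 + (cp - 0x0410))  # А..я form one contiguous run in MIK
        else:
            raise ValueError("character not covered by this sketch: %r" % ch)
    return bytes(out)


def decode_mik_letters(raw):
    """Decode bytes assuming the MIK layout; other high bytes become U+FFFD."""
    out = []
    for b in raw:
        if b < 0x80:
            out.append(chr(b))
        elif b <= 0xBF:
            out.append(chr(0x0410 + (b - 0x80)))
        else:
            out.append("\ufffd")  # box-drawing and symbol bytes are not mapped here
    return "".join(out)


def decode_wrongly_as_cp866(raw):
    """Show what the same bytes look like when CP866 is assumed instead."""
    return raw.decode("cp866")
```

For the word "България" stored in MIK, `decode_wrongly_as_cp866` gets Б, л, г, а and и right but turns ъ, р and я into box-drawing characters, while `decode_mik_letters` recovers the whole word.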

Hi,

"Nikolay Petrov" <jo******@mail.bg> wrote in message news:<eu**************@TK2MSFTNGP10.phx.gbl>...
> New problem ;-(
> The text is only partially converted.
> All capital letters are fine, and some of the lowercase ones, but not all.
> What may have caused this?


You did not answer Jon's question, but it was critical:
in what _program_ does your user open the text file with DOS Cyrillic?

I have been working with Cyrillic encodings since 1995 :) so I have dealt
with most of them, including CP-866.

The easiest way in your scenario would be:

Open that DOS Cyrillic .txt file in MS Word 2000 or newer,
choosing the "Cyrillic (DOS)" encoding in the process:
http://ourworld.compuserve.com/homep.../cp_e.htm#open

Now your user should see normal Russian text, already converted to
Unicode by Word, and can paste it into your text box.

Otherwise, if you try to open a file that contains text in
DOS Cyrillic encoding in some regular MS Windows text editor,
you *will* see just gibberish: the editor expects one of the _Windows_
encodings, not a DOS one.
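
The gibberish can be reproduced, and sometimes undone, in a few lines. This Python sketch assumes the file is CP866 and that the editor guessed Windows-1251; both encodings and the names `view_as` and `repair_mojibake` are assumptions for illustration only.

```python
def view_as(raw, assumed_encoding):
    """Decode raw bytes the way a program would if it assumed this encoding."""
    return raw.decode(assumed_encoding, errors="replace")


def repair_mojibake(garbled, wrong_encoding="cp1251", real_encoding="cp866"):
    """Undo a wrong decoding: recover the original bytes, then decode them properly.

    Works only if no byte was lost or replaced on the way, for example turned
    into '?' by a clipboard or a mail gateway.
    """
    return garbled.encode(wrong_encoding).decode(real_encoding)
```

For example, `view_as("Привет".encode("cp866"), "cp1251")` yields unreadable text, and `repair_mojibake` turns that text back into "Привет". Text that has already been reduced to question marks cannot be repaired this way, so it is safer to read the original file with the right encoding.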

There are many more ways to get it done, for example converter programs that
make "Cyrillic (Windows), 1251" text from your DOS Cyrillic text, or
I18n-aware editors that, like Word, let you specify explicitly
the encoding of your file, such as
http://www.esperanto.mv.ru/UniRed/ENG/
etc., etc.

--
Regards,
Paul Gorodyansky
"Cyrillic (Russian): instructions for Windows and Internet":
http://RusWin.net
Russian On-screen Keyboard: http://Kbd.RusWin.net
Jul 21 '05 #18
