Converting Unicode

Hi @all,

I'm looking for a solution to the following problem:
I want to replace all Unicode characters in a string with a valid
substitution.

For example:

string s = "Catalán";
string s2 = ModifyMyString(s); //s2 = "Catal\xC3\xA1n"

Since replacing Unicode characters in a string this way should be a
very common task, I wondered whether there is a function in the
.NET Framework that does this job. Calling s.Replace("á","\xC3\xA1")
for each one would not be very efficient because there are "many" Unicode
characters. :-)

Thanks for help
Sams

Nov 17 '05 #1
Hi,

You can probably use regular expressions to replace multiple occurrences of
the same character with the substitution sequence.

--
Sincerely,
Dmytro Lapshyn [Visual Developer - Visual C# MVP]

Nov 17 '05 #2
What is the nature of your substitution? Are you trying to convert
Unicode to UTF-8? If so, there are methods for doing this within the
Framework.

If the encoding is totally your own then you would need to create a new
subclass of Encoding (if you want to build the Cadillac version).
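
As a minimal sketch of the "methods within the Framework" mentioned above, assuming the goal really is a Unicode-to-UTF-8 conversion: Encoding.UTF8.GetBytes turns a string into UTF-8 bytes, and Encoding.UTF8.GetString turns them back. The class and method names (Utf8RoundTrip, ToUtf8, FromUtf8) are made up for illustration.

```csharp
using System;
using System.Text;

public static class Utf8RoundTrip
{
    // Encode a .NET string (UTF-16 in memory) as UTF-8 bytes.
    public static byte[] ToUtf8(string text)
    {
        return Encoding.UTF8.GetBytes(text);
    }

    // Decode UTF-8 bytes back into a .NET string.
    public static string FromUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    public static void Main()
    {
        // "Catalán", written with an escape so the source file encoding does not matter.
        string s = "Catal\u00E1n";
        byte[] utf8 = ToUtf8(s);

        // Hex dump of the bytes: the á becomes the two bytes C3 A1.
        Console.WriteLine(BitConverter.ToString(utf8)); // 43-61-74-61-6C-C3-A1-6E

        // Decoding restores the original string.
        Console.WriteLine(FromUtf8(utf8) == s); // True
    }
}
```

The string has 7 characters but encodes to 8 bytes, because á needs two bytes in UTF-8.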

Nov 17 '05 #3
Thanks for your suggestions so far.

To answer Bruce's question about the nature of the substitution:
I'm reading content from SQL Server 2000 and want to insert it
into a PostgreSQL DB. I think the database driver I use only accepts
ASCII (characters [0..9][A..z] plus the replacement strings
I already mentioned). The database, of course, is Unicode compatible.
Since I don't know enough about Unicode/UTF-8, I'm going
to look for the functions Bruce mentioned. I will keep you up to date. If
someone has another idea, I would be very happy to hear it.

Sams

Nov 17 '05 #4
Sams,

The real question is how you are going to pass UTF-8 characters to the
driver. Remember that System.String is *always* Unicode, so unless you have
a way to pass a byte array, you might have a hard time passing a UTF-8 string.
Could you please elaborate on the driver interface you are using?

--
Sincerely,
Dmytro Lapshyn [Visual Developer - Visual C# MVP]

Nov 17 '05 #6
sams wrote:
> Hi @all,
>
> I'm searching for a solution for the following problem:
> I want to replace all unicode characters in a string with a valid
> substituition.
>
> For example:
>
> string s = "Catalán";
> string s2 = ModifyMyString(s); //s2 = "Catal\xC3\xA1n"
>
> Since replacing unicode characters in a string that way, should be a
> very common task, I asked myself whether there is a function in the
> .NET-Framework, that does this job. Doing a s.Replace("á","\xC3\xA1")
> would not be a very effective way cause there are "many" unicode
> characters. :-)


Characters and thus strings in .NET are always Unicode. There's no
difference between replacing characters with characters and replacing
Unicode characters with characters. And "\xC3\xA1" is not a character,
but a string that says
\xC3\xA1

You seem to be confusing these things with character encoding?
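
To make the distinction concrete, here is a small sketch of how C# itself reads such a literal. In a regular string literal, \x starts a hexadecimal escape, so "\xC3\xA1" is two characters (U+00C3 and U+00A1). In a verbatim literal (prefixed with @) the backslashes stay as text, giving eight characters. Neither is the single character á (U+00E1). The class and constant names are made up for illustration.

```csharp
using System;

public static class EscapeDemo
{
    // Regular literal: \xC3 and \xA1 are hex escapes, giving U+00C3 and U+00A1.
    public const string Escaped = "\xC3\xA1";

    // Verbatim literal: the backslashes are kept, so this is the text \xC3\xA1.
    public const string Verbatim = @"\xC3\xA1";

    // The single character á.
    public const string AAcute = "\u00E1";

    public static void Main()
    {
        Console.WriteLine(Escaped.Length);  // 2
        Console.WriteLine(Verbatim.Length); // 8
        Console.WriteLine(AAcute.Length);   // 1

        // The two characters of Escaped have the code points 0xC3 and 0xA1.
        Console.WriteLine(Escaped[0] == '\u00C3' && Escaped[1] == '\u00A1'); // True
    }
}
```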

Cheers,
--
http://www.joergjooss.de
mailto:ne********@joergjooss.de
Nov 17 '05 #7
> string s = "Catalán";
> string s2 = ModifyMyString(s); //s2 = "Catal\xC3\xA1n"


C3 A1 are the bytes used to represent á as UTF-8.
A .NET string is Unicode (UTF-16 representation), so probably
what you want is to convert a string to a UTF-8 byte array.
If this is the case, take a look at System.Text.UTF8Encoding

But depending on what mechanism you are using to interact with the database,
you may not need to do your own conversion.
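
If an ASCII-only string with literal \xNN sequences really is what the driver expects (an assumption; nothing in this thread confirms it), a rough sketch of the ModifyMyString function from the original question could look like this. EscapeNonAscii is a hypothetical helper name, not a Framework method. It encodes the text as UTF-8 and writes every byte of 0x80 or above as \x followed by two hex digits.

```csharp
using System;
using System.Text;

public static class Utf8Escaper
{
    // Hypothetical helper: UTF-8 encodes the text, copies ASCII bytes
    // unchanged, and writes every byte >= 0x80 as a literal \xNN sequence.
    public static string EscapeNonAscii(string text)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(text);
        StringBuilder sb = new StringBuilder(utf8.Length);
        foreach (byte b in utf8)
        {
            if (b < 0x80)
            {
                sb.Append((char)b);
            }
            else
            {
                // "X2" formats the byte as two uppercase hex digits.
                sb.Append(@"\x").Append(b.ToString("X2"));
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // "Catalán", written with an escape so the source file encoding does not matter.
        Console.WriteLine(EscapeNonAscii("Catal\u00E1n")); // prints Catal\xC3\xA1n
    }
}
```

Note that this sketch does not escape backslashes already present in the input, so the result cannot always be decoded unambiguously. Passing the string or a UTF-8 byte array directly to the driver is likely the cleaner route where the driver supports it.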

--
Mihai Nita [Microsoft MVP, Windows - SDK]
------------------------------------------
Replace _year_ with _ to get the real email
Nov 17 '05 #8
