ToUpper()

Hello,

I'm trying to convert strings to uppercase without the accents. At the
moment, ToUpper converts é to É, an E that still has its accent.
I tried setting the English culture (en), but the result is the same.

Any ideas?

Ornette.

Mar 1 '07 #1
OK, in the end I did it like this:

    private string ReplaceAccents(string chaine)
    {
        // Accented characters and their unaccented replacements,
        // matched position by position.
        string strAccents =
            "ÀÁÂÃÄÅàáâãäåÒÓÔÕÖØòóôõöøÈÉÊËèéêëÌÍÎÏìíîïÙÚÛÜùúûüÿÑñÇç";
        string strNoAccents =
            "AAAAAAaaaaaaOOOOOOooooooEEEEeeeeIIIIiiiiUUUUuuuuyNnCc";

        char[] tAccent = strAccents.ToCharArray();
        char[] tNoAccent = strNoAccents.ToCharArray();

        // One Replace pass over the whole string per accented character.
        for (int i = 0; i < strAccents.Length; i++)
        {
            chaine = chaine.Replace(tAccent[i], tNoAccent[i]);
        }
        return chaine;
    }

I couldn't find anything better, even though it loops a bit more than necessary...

Ornette.
Mar 1 '07 #2
Hello,

This is better:

    byte[] bString =
        System.Text.Encoding.GetEncoding(1251).GetBytes(StringAvecAccents);
    string stringSansAccent = System.Text.Encoding.ASCII.GetString(bString);

Code page reference:
http://www.microsoft.com/globaldev/r...sbcs/1251.mspx

Ornette.
Mar 1 '07 #3
Ornette <abstrait...nos pam...@free.fr> wrote:
> This is better:
>
> byte[] bString =
>     System.Text.Encoding.GetEncoding(1251).GetBytes(StringAvecAccents);
> string stringSansAccent = System.Text.Encoding.ASCII.GetString(bString);

Well, that's assuming that the encoding will find the closest match
letter. It may work now, but there's no guarantee that it will in the
future.
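
This caveat can be seen with the ASCII encoding alone: what an encoder does with a character it cannot represent is decided by its fallback setting, not by any accent-stripping rule. The sketch below is only an illustration; the class and method names are made up, and it deliberately avoids code page 1251.

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Round-trips text through Encoding.ASCII with its default settings.
    // The default fallback substitutes '?' for anything outside ASCII.
    public static string RoundTripDefault(string input)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(input);
        return Encoding.ASCII.GetString(bytes);
    }

    // Uses an ASCII encoder configured to throw instead of substituting,
    // so unrepresentable characters are detected rather than silently changed.
    public static bool IsPureAscii(string input)
    {
        Encoding strict = Encoding.GetEncoding(
            "us-ascii",
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        try
        {
            strict.GetBytes(input);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }
}
```

With the default fallback, `AsciiFallbackDemo.RoundTripDefault("café")` gives `"caf?"`, not `"cafe"`, so any unaccenting seen with another code page comes from that code page's own substitution behaviour.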

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Mar 1 '07 #4
Hello,

So how would you do it?

Ornette.
Mar 1 '07 #5
Ornette <abstrait...nos pam...@free.fr> wrote:
> So how would you do ?

The mapping table idea you had before looked best to me, although I
wouldn't quite implement it the same way. I'd have a look up table for
every possible character, where it defaults to the Unicode character,
but for all the accented characters you care about, you specify the
non-accented version.

You'd then call ToCharArray() on the string in question, go through
each character replacing the original with the mapped character, and
then create a new string with the char array.

It does require you to manually map all the accented characters you
care about though.
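
The lookup-table idea described above could be sketched roughly as follows. This is one possible reading of it, not code from the thread: the class name `AccentTable`, the method name `ToUpperNoAccents`, and the decision to uppercase after mapping are assumptions, and the accent list is simply the one from the earlier `ReplaceAccents` post.

```csharp
using System;

public static class AccentTable
{
    // One entry per UTF-16 code unit; built once and reused for every call.
    private static readonly char[] Table = BuildTable();

    private static char[] BuildTable()
    {
        // By default every character maps to itself.
        char[] table = new char[char.MaxValue + 1];
        for (int i = 0; i < table.Length; i++)
        {
            table[i] = (char)i;
        }

        // Manually listed accented characters and their replacements,
        // matched position by position (list taken from the earlier post).
        const string accented =
            "ÀÁÂÃÄÅàáâãäåÒÓÔÕÖØòóôõöøÈÉÊËèéêëÌÍÎÏìíîïÙÚÛÜùúûüÿÑñÇç";
        const string plain =
            "AAAAAAaaaaaaOOOOOOooooooEEEEeeeeIIIIiiiiUUUUuuuuyNnCc";
        for (int i = 0; i < accented.Length; i++)
        {
            table[accented[i]] = plain[i];
        }
        return table;
    }

    // Maps each character through the table in a single pass,
    // then uppercases the result.
    public static string ToUpperNoAccents(string input)
    {
        char[] chars = input.ToCharArray();
        for (int i = 0; i < chars.Length; i++)
        {
            chars[i] = Table[chars[i]];
        }
        return new string(chars).ToUpperInvariant();
    }
}
```

Compared with calling `Replace` once per accented character, this walks the input only once, at the cost of a 65,536-entry table held in memory.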

My guess is that there are libraries around to do this somewhere, but I
don't know of any myself.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Mar 1 '07 #6
Ok, thank you for your point of view.
I really agree.

As for libraries, I didn't find any either.

Have a nice day and thanks again.

Ornette.
Mar 2 '07 #7
The closest thing that comes to mind is an RFC called stringprep. There are
a wide variety of stringprep profiles, and while they don't quite do what
you're looking for, they're close. Included in stringprep is a set of
mapping tables for Upper->Lower case conversions. These are (in that
context) called case-foldings, and are found in table B.2. Unfortunately,
they're Upper->Lower, not the other way around.

Stringprep:
http://www.faqs.org/rfcs/rfc3454.html

There are a number of profiles:
[Profile for International Domain Names]
http://www.rfc-editor.org/rfc/rfc3491.txt

[Profile for iSCSI names]
http://tools.ietf.org/html/draft-iet...string-prep-01

[Profile for SASL UserNames & Passwords]
http://www.ietf.org/rfc/rfc4013.txt

[Profile for XMPP Resources]
http://www.xmpp.org/internet-drafts/...ceprep-02.html

There's a C# implementation of this RFC that's part of the libidn library.
There's also a C++ & Java version.
http://www.gnu.org/software/libidn/

We've actually got a full implementation of stringprep as well - it's much
more .NET 2.0-ish than the libidn one, which is just a native C++ app that
was then ported to Java & .NET. It's found in our open-source SoapBox
Framework.

--
Chris Mullins, MCSD.NET, MCPD:Enterprise, Microsoft C# MVP
http://www.coversant.com/blogs/cmullins

Mar 2 '07 #8
JR
In a more general way:

There is a Unicode database at

http://www.unicode.org/Public/UNIDATA/

You could do what you want in two steps: decompose the string to base
characters followed by accent (NFKD normalization), then remove the accents.
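
To see what the first of those two steps produces, a small sketch like the one below prints each code unit of the NFKD form with its Unicode category. The class and method names are hypothetical; `String.Normalize` and `CharUnicodeInfo.GetUnicodeCategory` are the standard .NET 2.0+ calls.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class DecompositionDemo
{
    // Returns one "U+XXXX:Category" entry per UTF-16 code unit
    // of the NFKD-normalized input, separated by spaces.
    public static string Describe(string input)
    {
        string decomposed = input.Normalize(NormalizationForm.FormKD);
        StringBuilder sb = new StringBuilder();
        foreach (char c in decomposed)
        {
            if (sb.Length > 0)
            {
                sb.Append(' ');
            }
            sb.Append("U+").Append(((int)c).ToString("X4"));
            sb.Append(':').Append(CharUnicodeInfo.GetUnicodeCategory(c));
        }
        return sb.ToString();
    }
}
```

For example, `DecompositionDemo.Describe("é")` returns `"U+0065:LowercaseLetter U+0301:NonSpacingMark"`: the base letter followed by a separate combining accent, which is what makes the second step (dropping the accents) possible.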

JR

Mar 2 '07 #9
I hadn't thought of that, but it's certainly an option.

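Doing the normalization in .NET 2.0 is easy enough:

    string s = "test";
    // FormKD is the decomposing form (NFKD); FormKC would recompose
    // each letter and its accent back into a single character.
    string normalized = s.Normalize(NormalizationForm.FormKD);

Then you can iterate over the normalized string looking for (and
removing) the accents.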
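
A minimal sketch of that iteration might look like this. It assumes a decomposing form is used (a composing form leaves é as a single character, so there would be no separate accent to remove) and it treats "accent" as any character in the `NonSpacingMark` category. The names `AccentStripper` and `ToUpperWithoutAccents` are invented for the example.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class AccentStripper
{
    public static string ToUpperWithoutAccents(string input)
    {
        // Step 1: split each accented letter into base letter + combining marks.
        string decomposed = input.Normalize(NormalizationForm.FormKD);

        // Step 2: keep everything except the combining (non-spacing) marks.
        StringBuilder sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }

        // Step 3: uppercase what is left.
        return sb.ToString().ToUpperInvariant();
    }
}
```

For instance, `AccentStripper.ToUpperWithoutAccents("été")` returns `"ETE"`. Characters that have no Unicode decomposition, such as Ø, pass through unchanged, so the result is not identical to a hand-written mapping table.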

--
Chris Mullins

Mar 2 '07 #10
