
regex with accents

Hi,

I can't get a regex to match characters with accents. This is my code:
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
var MyText1 = "éléphant1";
var MyText2 = "elephant1";
var MyReg = /^[\w]+$/;

if (MyReg.test(MyText1))
    alert(MyText1 + " is OK");
else
    alert(MyText1 + " is not valid");
if (MyReg.test(MyText2))
    alert(MyText2 + " is OK");
else
    alert(MyText2 + " is not valid");
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Here's what I get :
éléphant1 is not valid
elephant1 is OK

I'd like éléphant1 to be OK, but I can't.
Can you help me ?

Thanks in advance,

Albert
Sep 22 '07 #1
albert wrote:
I can't get a regex to match characters with accents. This is my code:
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
var MyText1 = "éléphant1";
var MyText2 = "elephant1";
var MyReg = /^[\w]+$/;
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Here's what I get :
éléphant1 is not valid
elephant1 is OK

I'd like éléphant1 to be OK, but I can't.
Can you help me ?
ECMA-262 15.10.2.12 defines \w as being equivalent to the character class
[A-Za-z0-9_]. The w suggests word, but that is deceptive. Support for
internationalization in JavaScript's RegExp is virtually nonexistent.

You need to define your own character class.
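To see exactly which characters make the test fail, a small diagnostic can help. The sketch below is not from the thread: `nonWordChars` is a hypothetical helper name, and the sample word is written with \u escapes so that the file encoding cannot interfere.

```javascript
// Hypothetical helper (not from the thread): list the characters of a
// string that \w rejects. \w covers only A-Z, a-z, 0-9 and underscore.
function nonWordChars(text) {
  return Array.from(text).filter(function (ch) {
    return !/\w/.test(ch);
  });
}

const accented = "\u00E9l\u00E9phant1"; // "éléphant1" with precomposed é
const plain = "elephant1";

console.log(nonWordChars(accented)); // the two é characters
console.log(nonWordChars(plain));    // an empty array
```

Any character the helper reports has to be added to a custom class by hand.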

http://javascript.crockford.com/
Sep 22 '07 #2
ECMA-262 15.10.2.12 defines \w as being equivalent to the character class
[A-Za-z0-9_]. The w suggests word, but that is deceptive. Support for
internationalization in JavaScript's RegExp is virtually nonexistent.

You need to define your own character class.
How can I do so ?
albert
Sep 22 '07 #3
albert wrote on 22 sep 2007 in comp.lang.javascript:
>ECMA-262 15.10.2.12 defines \w as being equivalent to the character
class [A-Za-z0-9_]. The w suggests word, but that is deceptive.
Support for internationalization in JavaScript's RegExp is virtually
nonexistent.

You need to define your own character class.

How can I do so ?
var MyReg = /^[\wáéíóäëiöúàèììù]+$/i;

Depending on your local requirements.
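Listing accented letters one by one gets long quickly. One alternative, sketched here under the assumption that the input uses precomposed characters, is to add the Latin-1 Supplement letters (U+00C0 to U+00FF) as ranges. The name `latinWord` is hypothetical. The ranges skip U+00D7 (multiplication sign) and U+00F7 (division sign), and the French letters Œ, œ and Ÿ sit outside that block, so they are added explicitly.

```javascript
// \w plus the Latin-1 Supplement letters, minus the two symbols in that
// block, plus U+0152 (Œ), U+0153 (œ) and U+0178 (Ÿ).
const latinWord = /^[\w\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF\u0152\u0153\u0178]+$/;

console.log(latinWord.test("\u00E9l\u00E9phant1")); // true  (éléphant1)
console.log(latinWord.test("c\u0153ur"));           // true  (cœur)
console.log(latinWord.test("3\u00D74"));            // false (3×4 contains U+00D7)
```

Because both upper-case and lower-case letters fall inside the ranges, the i flag is not needed here.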

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my email address)
Sep 22 '07 #4
var MyReg = /^[\wáéíóäëiöúàèììù]+$/i;
>
Depending on your local requirements.

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my email address)
I've got french... that's no pb.
But I also have arabic & hebrew, this is more difficult.
albert
Sep 22 '07 #5
albert wrote on 22 sep 2007 in comp.lang.javascript:
>var MyReg = /^[\wáéíóäëiöúàèììù]+$/i;

Depending on your local requirements.
[please do not quote signatures on usenet. removed]
>
I've got french... that's no pb.
pb? [please no sms-language on usenet]
But I also have arabic & hebrew, this is more difficult.
Why should it be easy?

Javascript accommodates unicode.
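In practice, "accommodates unicode" means that \uXXXX escapes work both in string literals and inside regular expressions. The sketch below is illustrative only; `toCodePoints` is a hypothetical helper for finding out which code points, and therefore which Unicode block, a sample word uses.

```javascript
// Hypothetical helper: show each character of a string as U+XXXX.
function toCodePoints(text) {
  return Array.from(text).map(function (ch) {
    return "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
  });
}

const eAcute = "\u00E9"; // é written as an escape

console.log(eAcute === String.fromCharCode(0xE9)); // true
console.log(/^\u00E9$/.test(eAcute));              // true

// An Arabic sample word, written with escapes
console.log(toCodePoints("\u0633\u0644\u0627\u0645").join(" "));
// U+0633 U+0644 U+0627 U+0645
```

Once the code points are known, the matching range can be looked up in the Unicode charts and added to the character class.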

--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my email address)
Sep 22 '07 #6
In comp.lang.javascript message
<S_******************@newssvr13.news.prodigy.net>, Sat, 22 Sep 2007 13:44:18,
Douglas Crockford <no****@sbcglobal.net> posted:
>
ECMA-262 15.10.2.12 defines \w as being equivalent to the character
class [A-Za-z0-9_]. The w suggests word, but that is deceptive. Support
for internationalization in JavaScript's RegExp is virtually
nonexistent.
<URL:http://www.merlyn.demon.co.uk/humourous.htm#FredHoyleadvises> <G>
:-
Fred Hoyle (1915-2001) :-
"'Dam’ good idea. Always force foreigner to learn English.'"
Alexis Ivan Alexandrov, in "The Black Cloud", Chap. 10, para 4.

--
(c) John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQqish topics, acronyms & links;
Astro stuff via astron-1.htm, gravity0.htm ; quotings.htm, pascal.htm, etc.
No Encoding. Quotes before replies. Snip well. Write clearly. Don't Mail News.
Sep 22 '07 #7
>I've got french... that's no pb.
>
pb? [please no sms-language on usenet]
pb = problem (sorry, I thought it was obvious).
>
>But I also have arabic & hebrew, this is more difficult.

Why should it be easy?
I've never said it should be easy. Don't waste time to answer here...
>
Javascript accommodates unicode.
Well I tried a simple word in Arabic with the following regex :

^[\w]+$

still, the "test" function always returned false. Do you have any good
working example about it ?
thx, oops, sorry, I meant "Thanks" ;-)
albert
Sep 23 '07 #8
albert wrote on 23 sep 2007 in comp.lang.javascript:
>>I've got french... that's no pb.

pb? [please no sms-language on usenet]

pb = problem (sorry, I thought it was obvious).
Not to me. Usenet has its own limited set of abbreviations.
If anything, Pb would be lead.
>>But I also have arabic & hebrew, this is more difficult.

Why should it be easy?

I've never said it should be easy. Don't waste time to answer here...
You are the OP, so ...
>Javascript accommodates unicode.

Well I tried a simple word in Arabic with the following regex :

^[\w]+$
Would you allow for figures 0-9?
Otherwise this is better for simple Latin chars:

/^[a-z]+$/i
still, the "test" function always returned false.
I showed you how to do that with accents,
did you understand the regex?

Why would Arabic characters match
where accented characters do not?
Do you have any good
working example about it ?
I am not into working examples, but will give you a hint.

Arabic should work the same as accented ones:

/^[a-z\u0600-\u06ff]+$/

[http://unicode.org/charts/PDF/U0600.pdf]

Not knowing Arabic I cannot test that.
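The hint above can be extended to cover the other requirements mentioned in the thread. This is a sketch, not a tested production pattern: it adds digits, the i flag for upper-case Latin letters, and the Hebrew block (U+0590 to U+05FF) next to the Arabic block (U+0600 to U+06FF). The name `multiScript` is hypothetical. Both blocks contain marks, punctuation and digits as well as letters, so the class is more permissive than "letters only".

```javascript
// ASCII letters and digits plus the whole Hebrew and Arabic blocks.
const multiScript = /^[a-z0-9\u0590-\u05FF\u0600-\u06FF]+$/i;

const arabic = "\u0633\u0644\u0627\u0645"; // an Arabic sample word
const hebrew = "\u05E9\u05DC\u05D5\u05DD"; // a Hebrew sample word

console.log(multiScript.test(arabic));                // true
console.log(multiScript.test(hebrew + "1"));          // true
console.log(multiScript.test("\u00E9l\u00E9phant1")); // false: é is not in this class
```

French accents would still need their own entries in the class, as in the earlier reply.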
thx, oops, sorry, I meant "Thanks" ;-)
--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my email address)
Sep 23 '07 #9
You are the OP, so ...

Now it's my turn :-)
What does OP mean ?
>>
Well I tried a simple word in Arabic with the following regex :

^[\w]+$

Would you allow for figures 0-9?
Yes
Otherwise this is better for simple Latin chars:

/^[a-z]+$/i
>still, the "test" function always returned false.

I showed you how to do that with accents,
did you understand the regex?
Yes
>
Why would Arabic characters match
where accented characters do not?
You're right.
>
>Do you have any good
working example about it ?

I am not into working examples, but will give you a hint.

Arabic should work the same as accented ones:

/^[a-z\u0600-\u06ff]+$/

[http://unicode.org/charts/PDF/U0600.pdf]

Not knowing Arabic I cannot test that.
I tested. It works :-)

Thank you for your help !
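For readers on a current engine, there is an alternative that postdates this thread: ES2018 Unicode property escapes. The sketch below assumes an engine that supports them and requires the u flag; `anyScriptWord` is a hypothetical name. Unlike \w, this class does not include the underscore.

```javascript
// \p{L} = any letter, \p{M} = combining marks, \p{N} = any number.
// Requires the u flag and an ES2018-capable engine.
const anyScriptWord = /^[\p{L}\p{M}\p{N}]+$/u;

console.log(anyScriptWord.test("\u00E9l\u00E9phant1"));      // true (precomposed é)
console.log(anyScriptWord.test("e\u0301le\u0301phant1"));    // true (e + combining acute)
console.log(anyScriptWord.test("\u0633\u0644\u0627\u0645")); // true (Arabic sample)
console.log(anyScriptWord.test("\u05E9\u05DC\u05D5\u05DD")); // true (Hebrew sample)
console.log(anyScriptWord.test("two words"));                // false (space)
```

Including \p{M} lets decomposed accents and Arabic or Hebrew vowel marks pass; drop it if only base letters and digits should be accepted.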
albert
Sep 24 '07 #10
