I have this function:
------------------------------------------------------------
function isAlfaNumeric(vnos,space) {
if (space==false) {
validRegExp = /^[a-zA-Z0-9]{0,}$/;
}
else {
validRegExp = /^[a-zA-Z0-9\s]{0,}$/;
}
return vnos.search(validRegExp)
}
-------------------------------------------------------------
The function checks whether the string "vnos" contains any non-alphanumeric
characters. It works fine, except that it returns true if the string contains
characters from my country's alphabet, like ž and š. I tried the following:
validRegExp = /^[a-zA-Z0-9žš]{0,}$/; and also
validRegExp = /^[a-zA-Z0-9\ž\š]{0,}$/; but the result was the same.
Does anyone know how to check for foreign characters in a string using a
regular expression?
Smash wrote on 20 jan 2004 in comp.lang.javascript :
function isAlfaNumeric(vnos,space) {
if (space==false) {
validRegExp = /^[a-zA-Z0-9]{0,}$/;
}
else {
validRegExp = /^[a-zA-Z0-9\s]{0,}$/;
}
return vnos.search(validRegExp)
}
the function is checking if the string "vnos" contains any non-alfanumeric characters... it works fine except it returns true if the string contains my country characters like ž,š.....i tried to do the following
validRegExp = /^[a-zA-Z0-9žš]{0,}$/; and also
validRegExp = /^[a-zA-Z0-9\ž\š]{0,}$/; but result was the same
{0,} is the same as *; use + if the empty string should not pass
for 0-9 use \d
\s is all kinds of whitespace, like tabs etc.
use test rather than search if you want a true/false result
try this:
<SCRIPT>
function isAlfaNumeric(s,sp) {
var r = /^[a-zA-Z\džš]+$/; // letters, digits, ž, š
var rs = /^[a-zA-Z\džš\s]+$/; // the same, plus whitespace
return (sp)? rs.test(s) : r.test(s);
};
alert(isAlfaNumeric("12astš",true)); // true
alert(isAlfaNumeric("34astš",false)); // true
alert(isAlfaNumeric("56as tš",true)); // true
alert(isAlfaNumeric("78as tš",false)); // false, the space is not allowed
</SCRIPT>
If you want to accept empty strings as true, use:
r = /^[a-zA-Z\džš]*$/;
rs = /^[a-zA-Z\džš\s]*$/;
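The advice to use test instead of search matters more than it may look: search returns the position of the first match (or -1 if there is none), not a boolean, so the original function's result reads the wrong way round when used in an if. A minimal sketch of the difference follows; the helper names checkWithSearch and checkWithTest are made up for illustration, and writing ž and š as \u escapes is an assumption made here to stay independent of the file's encoding.

```javascript
// Allowed: ASCII letters, digits, ž (U+017E) and š (U+0161). Empty string passes.
var allowed = /^[a-zA-Z\d\u017E\u0161]*$/;

// search() returns an index: 0 when this anchored pattern matches, -1 when it does not.
function checkWithSearch(s) {
  return s.search(allowed);
}

// test() returns a real boolean.
function checkWithTest(s) {
  return allowed.test(s);
}

console.log(checkWithSearch("abc\u0161")); // 0  (falsy, although the string is valid)
console.log(checkWithSearch("abc!"));      // -1 (truthy, although the string is invalid)
console.log(checkWithTest("abc\u0161"));   // true
console.log(checkWithTest("abc!"));        // false
```

If search is kept for some reason, comparing its result with -1 gives the same answer as test.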
this one works the other way around (it looks for any character that is not allowed) and accepts empty strings:
<SCRIPT>
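function isAlfaNumeric(s,sp) {
var r = /[^a-zA-Z\džš]/; // any character that is not a letter, digit, ž or š
var rs = /[^a-zA-Z\džš\s]/; // the same, but whitespace is also allowed
return ! ((sp)? rs.test(s) : r.test(s));
};
alert(isAlfaNumeric("12astš",true)); // true
alert(isAlfaNumeric("34astš",false)); // true
alert(isAlfaNumeric("56as tš",true)); // true
alert(isAlfaNumeric("78as tš",false)); // false, the space is not allowed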
</SCRIPT>
--
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
sm*****@email.si (Smash) writes:
Does anyone know how to check for foreign characters in string using regular expression??
I think the safest is to use the \w escape, which matches "word characters".
That includes letters, international included, digits and the underscore.
If you can live with that:
if (space==false) {
validRegExp = /^[\w]*$/;
}
else {
validRegExp = /^[\w\s]*$/;
}
/L
--
Lasse Reichstein Nielsen - lr*@hotpop.com
DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
'Faith without judgement merely degrades the spirit divine.'
JRS: In article <72*************************@posting.google.com> , seen
in news:comp.lang.javascript, Smash <sm*****@email.si> posted at Tue, 20
Jan 2004 00:37:31 :-
function isAlfaNumeric(vnos,space) {
if (space==false) {
if (!space) { // or if (space) and swap the rest
Does anyone know how to check for foreign characters in string using regular expression??
"Foreign" does not mean "non-Anglo"; Americans & British are foreigners
too.
AIUI, a string can contain any Unicode character, and there are tens of
thousands of those, a large proportion of which are letters in some
language or other. Therefore, to test fully for letters outside A-Za-z,
one needs in some form or another either a list of *all* letters or a
list of *all* non-letters, or both.
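Engines that implement ECMAScript 2018 or later (none existed when this thread was written) carry such a list of all letters themselves, exposed as Unicode property escapes. A minimal sketch, assuming such an engine; the function name isAlphaNumericUnicode and the variable names are made up for illustration.

```javascript
// \p{L} matches any Unicode letter, \p{N} any Unicode number.
// Both are only recognised when the regular expression has the "u" flag.
var lettersAndDigits = /^[\p{L}\p{N}]*$/u;
var lettersDigitsSpace = /^[\p{L}\p{N}\s]*$/u;

function isAlphaNumericUnicode(s, allowSpace) {
  return (allowSpace ? lettersDigitsSpace : lettersAndDigits).test(s);
}

console.log(isAlphaNumericUnicode("12ast\u0161", false));           // true  (š)
console.log(isAlphaNumericUnicode("\u00C5se \u00C9 \u00E4", true)); // true  (Å, É, ä)
console.log(isAlphaNumericUnicode("12ast!", false));                // false
```

Note that \p{N} is broader than 0-9 (it also covers, for example, digits of other scripts), so \d may be the better choice if only ASCII digits should pass.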
I don't know Slovenian; but I guess that it has a relatively small
number of non-Anglo letters; those could be listed and tested for, but
that would not be entirely helpful to a Scandinavian visitor.
There *should* be a javascript function to test whether the current font
has a specific glyph for a given character, or for all those in a
string; but AFAIK there is not.
--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 IE 4 ©
<URL:http://jibbering.com/faq/> Jim Ley's FAQ for news:comp.lang.javascript
<URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
<URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.
Dr John Stockton wrote on 20 jan 2004 in comp.lang.javascript :
There *should* be a javascript function to test whether the current font has a specific glyph for a given character, or for all those in a string; but AFAIK there is not.
If we had a Regex syntax for a character above-a/below-a/in-a-range-of
certain char number(s), even without the knowledge of the specific font,
that would be nice.
regex.definerange('%3','>#80')
regex.definerange('%5','>#0','<#20')
boolean = /aa\%5+bb[^\%3]?/.test(string)
--
Evertjan.
The Netherlands.
"Evertjan." <ex**************@interxnl.net> writes:
If we had a Regex syntax for a character above-a/below-a/in-a-range-of certain char number(s), even without the knowledge of the specific font, that would be nice.
regex.definerange('%3','>#80')
regex.definerange('%5','>#0','<#20')
boolean = /aa\%5+bb[^\%3]?/.test(string)
Try:
var boolean = /aa[\x01-\x1f]+bb[^\x81-\uffff]?/.test(string);
It says true for
var string = "aa\n\rbb\u1268";
(which is 7 characters long).
/L
--
Lasse Reichstein Nielsen - lr*@hotpop.com
Lasse Reichstein Nielsen wrote on 21 jan 2004 in comp.lang.javascript :
Try:
var boolean = /aa[\x01-\x1f]+bb[^\x81-\uffff]?/.test(string);
It says true for
var string = "aa\n\rbb\u1268";
(which is 7 characters long).
[\x01-\x1f] etc
Nice, never thought of that !
--
Evertjan.
The Netherlands.
JRS: In article <8y**********@hotpop.com>, seen in
news:comp.lang.javascript, Lasse Reichstein Nielsen <lr*@hotpop.com>
posted at Tue, 20 Jan 2004 22:47:33 :-
sm*****@email.si (Smash) writes:
Does anyone know how to check for foreign characters in string using regular expression??
I think the safest is to use the \w escape, which matches "word characters". That includes letters, international included, digits and the underscore.
In MSIE4, it does not match É (E-acute), ä (a-umlaut), Å (A-ring); and,
I suppose, others.
A Netscape 1.3 reference page include(s|d) :
Matches any alphanumeric character including the underscore.
Equivalent to [A-Za-z0-9_].
It would be nice to be able to match *any* letter, including non-anglo;
but ISTM that \w is fundamentally matching the characters that normally
appear in identifiers, and there it would be very wrong for that to be
altered.
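The ECMAScript standard defines \w the same way as that reference page, as exactly [A-Za-z0-9_] when no special flags are used, so the MSIE4 behaviour is what any conforming engine should show. A quick check (the variable name wordOnly is only for illustration):

```javascript
// \w is [A-Za-z0-9_]; accented letters are not "word characters" to the engine.
var wordOnly = /^\w*$/;

console.log(wordOnly.test("abc_123")); // true
console.log(wordOnly.test("\u00C9"));  // false (É)
console.log(wordOnly.test("\u00E4"));  // false (ä)
console.log(wordOnly.test("\u017E"));  // false (ž)
```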
--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 IE 4 ©
Dr John Stockton <sp**@merlyn.demon.co.uk> writes:
In MSIE4, it does not match É (E-acute), ä (a-umlaut), Å (A-ring); and, I suppose, others.
Yes, that was me misremembering. Bummer. It would have been nice to have
an escape that matches alphanumeric Unicode characters, and not just
ASCII ones, and I thought ECMAScript had it. That was apparently
just wishful thinking.
/L
--
Lasse Reichstein Nielsen - lr*@hotpop.com
JRS: In article <pt**********@hotpop.com>, seen in
news:comp.lang.javascript, Lasse Reichstein Nielsen <lr*@hotpop.com>
posted at Wed, 21 Jan 2004 18:24:12 :-
Try:
var boolean = /aa[\x01-\x1f]+bb[^\x81-\uffff]?/.test(string);
It says true for
var string = "aa\n\rbb\u1268";
(which is 7 characters long).
But for that approach to do the original job in full, one needs to read
the entire Unicode table and decide which squashed spiders are foreign
letters and which are foreign non-letters.
I've seen AJF's Unicode table in HTML; but I don't recall seeing one
written in ISO-7 and intended for simple machine-reading.
http://ppewww.ph.gla.ac.uk/~flavell/...e/unidata.html
--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 IE 4 ©
Dr John Stockton wrote on 21 jan 2004 in comp.lang.javascript :
But for that approach to do the original job in full, one needs to read the entire Unicode table and decide which squashed spiders are foreign letters and which are foreign non-letters.
A perfect solution is impossible as long as Unicode is not redesigned
to have separate ranges for both types. And that probably will not happen.
An imperfect solution, say for most European languages, could be done
with a standard string along the lines of [\x01-\x1f].
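A minimal sketch of such an imperfect, range-based string, assuming that "most European languages" can be approximated by ASCII plus the letters of the Latin-1 Supplement and Latin Extended-A blocks. That choice of blocks, and the name isEuropeanAlphaNumeric, are assumptions made here for illustration; Greek, Cyrillic and many other letters are deliberately left out.

```javascript
// ASCII letters and digits, plus:
//   U+00C0-U+00FF  Latin-1 Supplement letters (skipping U+00D7 "×" and U+00F7 "÷")
//   U+0100-U+017F  Latin Extended-A (contains š U+0161 and ž U+017E)
var europeanAlnum = /^[a-zA-Z\d\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u017F]*$/;

function isEuropeanAlphaNumeric(s) {
  return europeanAlnum.test(s);
}

console.log(isEuropeanAlphaNumeric("12ast\u0161\u017E"));   // true  (š, ž)
console.log(isEuropeanAlphaNumeric("Gr\u00FC\u00DFe"));     // true  (ü, ß)
console.log(isEuropeanAlphaNumeric("3\u00D74"));            // false (× is not a letter)
console.log(isEuropeanAlphaNumeric("as t\u0161"));          // false (space not allowed)
```

Writing the characters as \u escapes keeps the pattern independent of the encoding in which the script file is saved and served.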
Seems a perfect job for you, John, to collect suggestions from many of us
about their local lingo needs. ;-}
Would the same unicode number stand for different [letter vs nonletter]
types in different European languages ?
Or in different fonts ?
I hope not.
--
Evertjan.
The Netherlands.
JRS: In article <Xn********************@194.109.133.29>, seen in
news:comp.lang.javascript, Evertjan. <ex**************@interxnl.net>
posted at Thu, 22 Jan 2004 08:47:39 :-
Dr John Stockton wrote on 21 jan 2004 in comp.lang.javascript:
But for that approach to do the original job in full, one needs to read the entire Unicode table and decide which squashed spiders are foreign letters and which are foreign non-letters.
A perfect solution is impossible as long as Unicode is not redesigned to have separate ranges for both types. And that probably will not happen.
An imperfect solution, say for most European languages, could be done with a standard string along the lines of [\x01-\x1f].
Seems a perfect job for you, John, to collect suggestions from many of us about their local lingo needs. ;-}
Would the same unicode number stand for different [letter vs nonletter] types in different European languages ? Or in different fonts ?
Read AJF's cited page, and others, on Unicode. Look and see what is
actually in Unicode.
AIUI, the idea of Unicode is that a given character has a given number,
independently of font, size, and language; \u0041 is 'A' and \u0061 is
'a' *everywhere*. If it's not \u0061, it's not our 'a', whatever it
looks like.
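A small illustration of that point, using Cyrillic small a (U+0430), which most fonts draw exactly like Latin 'a'. The variable names are only for illustration.

```javascript
var latinA = "\u0061";    // LATIN SMALL LETTER A
var cyrillicA = "\u0430"; // CYRILLIC SMALL LETTER A, usually the same glyph

console.log(latinA === "a");                        // true
console.log(cyrillicA === "a");                     // false
console.log(cyrillicA.charCodeAt(0).toString(16));  // "430"
console.log(/^[a-z]$/.test(cyrillicA));             // false, it is outside the a-z range
```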
In practice, though, a letter only counts as a letter if it is a letter
of the current language. In English, Nijmegen has eight letters; I
suspect it of having only seven in Dutch, only six of which are English.
--
© John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v4.00 IE 4 ©
Dr John Stockton wrote on 22 jan 2004 in comp.lang.javascript :
In practice, though, a letter only counts as a letter if it is a letter of the current language.
I do not think so. It depends on the definition, of course. I would say a
letter in computer strings can also be a letter in another language and
still be counted as a letter. The u-umlaut [ü] is definitely a letter in
English in the sense that it is definitely not a non-letter, like
!?#%&.,.
In English, Nijmegen has eight letters; I suspect it of having only seven in Dutch, only six of which are English.
This concept was abandoned long ago, in this time of computer-generated and
computer-sorted telephone books. The "ij", though it counts as one letter in
the linguistic Dutch sense, has definitely become a two-letter "thing" like
the "ph".
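Unicode does keep a single-character form of the Dutch ij for compatibility (U+0133, LATIN SMALL LIGATURE IJ), so both letter counts can be reproduced in a string. A minimal sketch, assuming an engine with String.prototype.normalize (ECMAScript 2015 or later); the variable names are only for illustration.

```javascript
var twoLetters = "Nijmegen";    // i and j as two separate characters
var ligature = "N\u0133megen";  // U+0133 LATIN SMALL LIGATURE IJ

console.log(twoLetters.length); // 8
console.log(ligature.length);   // 7

// Compatibility normalization (NFKC) replaces the ligature with plain "ij".
console.log(ligature.normalize("NFKC") === twoLetters); // true
```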
The "ph", however, can also be pronounced as two separate letters in
words like:
poephark
ophaalbrug
Generaal van Opheusden ;-)
If there were a word with the ij pronounced as separate letters, the j
should have two little points [de trema] like an umlaut. This is not
available in current fonts, I definitely presume, because the j is
usually thought of as a consonant.
[The above thoughts are not tested on recent or old versions of eastern
languages, nor on Netscape]
--
Evertjan.
The Netherlands.