Hi,
I'm trying to construct a RegEx pattern which will validate a string so that
it can contain:
only the numerical characters from 0 to 9 i.e. no decimal points, negative
signs, exponentials etc
only the 26 letters of the standard Western alphabet in either upper or
lower case
spaces i.e. ASCII character 32
I seem to be doing OK with the first two criteria, but am having trouble
with the space character.
E.g. the following works perfectly:
Regex.IsMatch("ThisIsThe2ndString", @"[^0-9][^a-z][^A-Z]")
However, this doesn't work:
Regex.IsMatch("This Is The 2nd String", @"[^0-9][^a-z][^A-Z]")
I've tried various combinations of [\s] and [^\s] but with little success.
However, the following works, though I don't really understand why:
Regex.IsMatch("This is the 2nd string", @"[^0-9][^a-z][^A-Z]",
RegexOptions.IgnoreCase)
Any assistance gratefully received.
Mark
Mark Rae wrote: I'm trying to construct a RegEx pattern which will validate a string so that it can contain [only digits, letters and spaces]
I think you want something like this:
^[a-zA-Z0-9 ]*$
i.e. every character between ^ start and $ end must be in the [group],
and there can be * zero or more of them (you'd use + if you want at
least one character in there). Be aware that "\s" would match some
things that aren't spaces (like tabs and newlines).
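A minimal sketch of the difference between * and +, and between a literal space and "\s". The class name QuantifierDemo and the sample strings are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

class QuantifierDemo
{
    static void Main()
    {
        // '*' allows zero characters, so the empty string passes.
        Console.WriteLine(Regex.IsMatch("", @"^[a-zA-Z0-9 ]*$"));      // True
        // '+' requires at least one character, so the empty string fails.
        Console.WriteLine(Regex.IsMatch("", @"^[a-zA-Z0-9 ]+$"));      // False
        // A literal space in the class accepts spaces but not tabs.
        Console.WriteLine(Regex.IsMatch("a\tb", @"^[a-zA-Z0-9 ]+$"));  // False
        // \s also accepts tabs (and newlines).
        Console.WriteLine(Regex.IsMatch("a\tb", @"^[a-zA-Z0-9\s]+$")); // True
    }
}
```

One thing to watch: in .NET, $ also matches just before a trailing newline at the end of the input, so \z may be the safer end anchor if that matters for your data.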
Of course, if you're having special trouble with spaces, you could do
s.Replace(" ", "") first to get rid of them in your validator.
Finally, I'm not convinced that regexes are ideal in .NET for this
kind of trivial check (as opposed to something complicated like nested
expressions and optional segments), because they're a special library
call and not a native operator as in Perl, which I suspect you might
have come from. I expect a loop like this would be more efficient:
bool valid = true;
for (int i = 0; i < s.Length; i++)
{
    if (!((s[i] >= 'A' && s[i] <= 'Z') || (s[i] >= 'a' && s[i] <= 'z')
        || (s[i] >= '0' && s[i] <= '9') || s[i] == ' '))
    {
        valid = false; break;
    }
}
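The same loop could be wrapped in a reusable method, roughly like this. The names AsciiCheck and IsLettersDigitsSpaces are hypothetical, and no timing has been done to back up the efficiency guess:

```csharp
using System;

static class AsciiCheck
{
    // Hypothetical helper: true when every character is an ASCII letter,
    // an ASCII digit or a space (an empty string also returns true).
    public static bool IsLettersDigitsSpaces(string s)
    {
        foreach (char c in s)
        {
            bool ok = (c >= 'A' && c <= 'Z')
                   || (c >= 'a' && c <= 'z')
                   || (c >= '0' && c <= '9')
                   || c == ' ';
            if (!ok) return false;
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(IsLettersDigitsSpaces("This Is The 2nd String")); // True
        Console.WriteLine(IsLettersDigitsSpaces("ABC1.1"));                 // False
    }
}
```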
Eq.
string[] strs = new string[] { "ABC123", "ABC1.1", "ABC 123", "ABC 123 .." };
string srx = @"[^\.]+|[\w\s\d]+";
Regex rx = new Regex(srx, RegexOptions.ECMAScript);
foreach (string str in strs)
{
    Console.WriteLine("{0} {1}", str, rx.Match(str).Length == str.Length);
}
This works (if I understood your problem correctly). IsMatch returns
true if the pattern matches anywhere in the string, so I don't think it
is the method you want.
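To illustrate the point about IsMatch, here is a small sketch comparing an unanchored pattern, an anchored one, and the length comparison. The class name AnchorDemo and the character class used are chosen for illustration only:

```csharp
using System;
using System.Text.RegularExpressions;

class AnchorDemo
{
    static void Main()
    {
        string input = "ABC1.1";
        // Unanchored: succeeds because "ABC1" is a run of allowed
        // characters somewhere in the input.
        Console.WriteLine(Regex.IsMatch(input, @"[a-zA-Z0-9 ]+"));   // True
        // Anchored: the whole input must consist of allowed characters.
        Console.WriteLine(Regex.IsMatch(input, @"^[a-zA-Z0-9 ]+$")); // False
        // Comparing the first match's length with the input length.
        Match m = Regex.Match(input, @"[a-zA-Z0-9 ]+");
        Console.WriteLine(m.Length == input.Length);                 // False
    }
}
```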
Regards,
Tasos
You can use a literal space in your character set:
(?i)[^a-z 0-9]
The "(?i)" indicates case-insensitivity. Note the literal space between
"a-z" and "0-9". Because the class is negated, it matches any character
that is not a letter, a digit or a space.
The "\s" indicates *any* white-space character, including such things as
tabs. If that is what you want, use:
(?i)[^a-z\s0-9]
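Since a negated class finds a character that is *not* allowed, the result of IsMatch has to be inverted to use it for validation. A minimal sketch, with a hypothetical helper named IsValid:

```csharp
using System;
using System.Text.RegularExpressions;

class NegatedClassDemo
{
    // Hypothetical helper: the string is valid when no character
    // outside the allowed set can be found in it.
    public static bool IsValid(string s)
    {
        return !Regex.IsMatch(s, @"(?i)[^a-z 0-9]");
    }

    static void Main()
    {
        Console.WriteLine(IsValid("This Is The 2nd String")); // True
        Console.WriteLine(IsValid("ThisIsThe2ndString"));     // True
        Console.WriteLine(IsValid("1.5e-3"));                 // False
    }
}
```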
--
HTH,
Kevin Spencer
Microsoft MVP
Professional Numbskull
The man who questions opinions is wise.
The man who quarrels with facts is a fool.
"Kevin Spencer" <ke***@DIESPAMMERSDIEtakempis.com> wrote in message
news:eF**************@TK2MSFTNGP03.phx.gbl... You can use a literal space in your character set:
(?i)[^a-z 0-9]
The "(?i)" indicates case-insensitivity. Note the literal space between "a-z" and "0-9". This excludes the space character as well.
The "\s" indicates *any* white-space character, including such things as tabs. If that is what you want, use:
(?i)[^a-z\s0-9]
Excellent! Thanks very much.
"Tasos Vogiatzoglou" <tv*****@gmail.com> wrote in message
news:11**********************@j33g2000cwa.googlegroups.com... This works (if I understood correctly your problem).
It doesn't.
IsMatch returns true for any match in the string so I don't think this is the one you want.
There you go, then... :-)
"Paul E Collins" <fi******************@CL4.org> wrote in message
news:CI******************************@bt.com... I think you want something like this: ^[a-zA-Z0-9 ]*$ i.e. every character between ^ start and $ end must be in the [group], and there can be * zero or more of them (you'd use + if you want at least one character in there).
Doesn't work...
Of course, if you're having special trouble with spaces, you could do s.Replace(" ", "") first to get rid of them in your validator.
I could do that, or even not do any validation at all...
Finally, I'm not convinced that regexes are ideal in .NET for this kind of trivial check (as opposed to something complicated like nested expressions and optional segments), because they're a special library call and not a native operator as in Perl, which I suspect you might have come from.
I've never written a line of Perl in my life...
I expect a loop like this would be more efficient:
I wouldn't know...
Mark Rae <ma**@markN-O-S-P-A-M.co.uk> wrote: "Tasos Vogiatzoglou" <tv*****@gmail.com> wrote in message news:11**********************@j33g2000cwa.googlegroups.com...
This works (if I understood correctly your problem).
It doesn't.
When a proposed solution doesn't work, could you explain in what way?
It makes life a lot easier for people who want to make further
suggestions.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com... When a proposed solution doesn't work, could you explain in what way?
I'm afraid I can't in this case, other than to say it always seems to find a
match no matter what string I pass into it...
I simply don't know enough about regular expressions to make a valuable
response - I don't mind confessing that it remains one area of coding which
I find very difficult to get my head around, to the extent where I still
find it difficult to look at even the simplest of patterns and understand
instinctively what it's trying to do...
It makes life a lot easier for people who want to make further suggestions.
I couldn't agree more! However, in this case, Kevin Spencer has solved my
problem completely.
Mark Rae <ma**@markN-O-S-P-A-M.co.uk> wrote: "Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message news:MP************************@msnews.microsoft.com...
When a proposed solution doesn't work, could you explain in what way?
I'm afraid I can't in this case, other than to say it always seems to find a match no matter what string I pass into it...
That's enough - just an example of something which should fail but
passes would be good.
I simply don't know enough about regular expressions to make a valuable response
A sample which doesn't do what you want to is the most valuable
response you can make in this case :)
It makes life a lot easier for people who want to make further suggestions.
I couldn't agree more! However, in this case, Kevin Spencer has solved my problem completely.
Right. I'd still be interested in an example which should fail but
passes, so I can try to beef up my own regex experience.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com... I'm afraid I can't in this case, other than to say it always seems to find a match no matter what string I pass into it...
That's enough - just an example of something which should fail but passes would be good.
I simply don't know enough about regular expressions to make a valuable response
A sample which doesn't do what you want to is the most valuable response you can make in this case :)
See the reply I'm referring to:
IsMatch returns true for any match in the string so I don't think this is the one you want.
That's correct - no matter what string I pass into it, it always returns
true...
Hi Mark,
I may be able to help you there. It helps to understand how the Regular
Expressions Engine works. First, it evaluates a character at a time, and it
is procedural in nature. A regular expression is like a series of
instructions, rather than a real single pattern. In your case:
Regex.IsMatch("This is the 2nd string", @"[^0-9][^a-z][^A-Z]", RegexOptions.IgnoreCase)
Basically, this is using character classes. A character class is a series of
tokens inside square brackets, and it can be translated as "this type of
character or this type of character or this type of character..." In other
words, multiple character types or literals are joined with an implicit "or"
operator:
[\dA!] literally means "any single digit or an 'A' or an '!' character".
Note that it also implies a singular value, that is, one character.
Quantifiers indicate how many times a character from the character class
may be repeated, as in:
[\dA!] (any of these characters 1 time)
[\dA!]* (any of these characters 0 or more times)
[\dA!]+ (any of these characters 1 or more times)
etc.
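At the start of a character class, the '^' is the logical "Not" operator,
which means "Not any of these characters." (Outside a character class, '^'
anchors the match to the start of the string.)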
So, you had at first "[^0-9]" (Not a digit between 0 and 9)
followed by "[^a-z]" (Not a character between a and z)
and followed by "[^A-Z]" (Not a character between A and Z)
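Now, remember that it's looking for a match. A match satisfies *all* of the
criteria you specify, in order, and each character class consumes one
character. So you can think of this as joining all of these character
classes with "followed by", as in:
"A character that is not a digit between 0 and 9, followed by a character
that is not between a and z, followed by a character that is not between A
and Z."
Note that the space character is not in any of those sets, so it can fill
any of the three positions. Using negation is tricky. In fact, the pattern
matches as soon as *any* three consecutive characters anywhere in the string
fit that description, which is why IsMatch returns true for most inputs.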
The character class is used to apply the same rules to a set of characters.
The only time you need to separate them into groups is when the rules
(specifically logical Not or quantifiers) do not apply the same to all of
the characters.
Also, as a regular expression is basically procedural (although it does
employ backtracking), you should be careful about the order of the matches.
The following 2 sets are NOT the same:
[\dA!][0X]
[0X][\dA!]
In the first case, "A0" would match. In the second case it would *not*.
This is because the string and the pattern are evaluated in sequence. One
term for this is "consumption" - a regular expression "consumes" a string as
it evaluates it.
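A short sketch showing what the original pattern actually consumes, and how the order of the classes matters. The class name SequenceDemo and the test string "A0" are chosen here for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

class SequenceDemo
{
    static void Main()
    {
        string pattern = @"[^0-9][^a-z][^A-Z]";

        // Three classes in a row consume three consecutive characters.
        Match m = Regex.Match("ThisIsThe2ndString", pattern);
        Console.WriteLine("[" + m.Value + "]");          // [sIs]
        m = Regex.Match("This Is The 2nd String", pattern);
        Console.WriteLine("[" + m.Value + "]");          // [ Is]

        // With fewer than three characters there is nothing to match.
        Console.WriteLine(Regex.IsMatch("ab", pattern)); // False

        // Order matters: the same two classes, swapped.
        Console.WriteLine(Regex.IsMatch("A0", @"[\dA!][0X]")); // True
        Console.WriteLine(Regex.IsMatch("A0", @"[0X][\dA!]")); // False
    }
}
```

Both of the strings from the original question produce a match, which would explain why IsMatch kept returning true.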
--
HTH,
Kevin Spencer
Microsoft MVP
Professional Numbskull
The man who questions opinions is wise.
The man who quarrels with facts is a fool.
"Kevin Spencer" <ke***@DIESPAMMERSDIEtakempis.com> wrote in message
news:OS**************@TK2MSFTNGP05.phx.gbl... I may be able to help you there.
Very interesting - thanks.
I still find it really hard to get my head round it, though...
Mark Rae <ma**@markN-O-S-P-A-M.co.uk> wrote: IsMatch returns true for any match in the string so I don't think this is the one you want.
That's correct - no matter what string I pass into it, it always returns true...
Well, I've only tried the version that Paul Collins gave (which you
replied to with the same "doesn't work" answer), and that seems to
work:
using System;
using System.Text.RegularExpressions;

class Test
{
    static void Main()
    {
        Regex r = new Regex("^[a-zA-Z0-9 ]*$");
        Console.WriteLine(r.IsMatch("Hello"));
        Console.WriteLine(r.IsMatch("Hello there"));
        Console.WriteLine(r.IsMatch("Hell#o"));
    }
}
Produces:
True
True
False
This is why it's important to give a specific example of something that
fails - preferably with a short but complete program which
demonstrates what you've been trying it with.
--
Jon Skeet - <sk***@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Hi Mark,
You may find the following article informative: http://www.codeproject.com/csharp/regex.asp
--
HTH,
Kevin Spencer
Microsoft MVP
Professional Numbskull
The man who questions opinions is wise.
The man who quarrels with facts is a fool.
"Kevin Spencer" <ke***@DIESPAMMERSDIEtakempis.com> wrote in message
news:us**************@TK2MSFTNGP03.phx.gbl... You may find the following article informative:
http://www.codeproject.com/csharp/regex.asp
I love it - it's almost "RegEx for Dummies"... :-)
Just what I need!