Regular expressions

Folks,

I've started using regular expressions for parsing some data strings
that come along - works quite nicely; however, as a newbie to RE, I'm
still struggling with some special cases:

1) I get strings representing decimal values, and they have the
following format:
* first, a plus or minus sign
* then, a number of numeric chars
* if the precision is > 0, then a dot and "precision" number of
numerical characters will follow

This last thing is what's killing me - how can I put the "conditional"
thing ("if precision > 0, then a dot and digits") into a single regular
expression?

What I have now is this:

[+-]\d{x,x}.\d{p,p}

which works fine, as long as I have a DECIMAL number with x digits
before and p digits after the decimal point (e.g. +120.45 if x=3 and p=2).

However, if p=0, a string such as +120 won't be matched, since it doesn't
have the dot at the end.

Any ideas??

2) For a date, which comes along as DD.MM.YYYY, I'd like to be able to
match both the cases where DD is either "01" or " 1" (leading 0 or
space). I tried various things, but nothing seems to work - it seems
if I allow it to have a leading whitespace, it'll also match
" 01.12.2003" (leading whitespace and then two digits for the day) -
this is *not* what I want!

I've tried \d{2,2}.\d{2,2}.\d{4,4}, which works fine if the day is
specified with a leading zero - but will fail if I pass it
" 1.12.2003". I also tried (\s\d|\d{2,2}).\d{2,2}.\d{4,4}, but as I
mentioned - in this case, all these dates are being matched:
"01.01.2003" - okay
" 1.01.2003" - okay
" 01.01.2003" - NOT okay (leading space + 2 digits)

Any takers?

Thanks!
Marc

================================================================
Marc Scheuner May The Source Be With You!
Bern, Switzerland m.scheuner(at)inova.ch
Nov 15 '05 #1
Got another riddle:

how can I make a regular expression that will match all of the
following:

DECIMAL
DECIMAL(5)
DECIMAL(15, 5) (fifteen-comma-space-five)

Anyone? I've tried numerous expressions - either I get too much
(everything), or I get the two last ones (with the parenthesis) - but
I can't seem to make it work for all three cases in just one
expression.......

Thanks!
Marc

Nov 15 '05 #2
For #1, I don't know of a way to make a match depend on an
arithmetic comparison (like p>0), so you would probably
have to use two different expressions:

if p>0 -> [+-]\d{x}\.\d{p}
if p=0 -> [+-]\d{x}

The following single expression makes sure that a result does
not end with only a decimal point, but it will accept -123 even
if p>0:

[+-]\d{x}(?(\.\d)\.\d{p})$

Additionally, if you just wanted to make sure that no more
than p digits followed the decimal point, you could use
this:

[+-]\d{x}(?(\.\d)\.\d{0,p})$
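
A minimal sketch of the two-expression approach, in Python rather than .NET (the constructs used here are written the same way in both). It assumes x and p are known before the pattern is built; the helper name decimal_pattern is made up for illustration. Python's re module has no lookahead conditional like (?(\.\d)...), so the fractional part is simply added or left out when the pattern string is assembled:

```python
import re


def decimal_pattern(x: int, p: int) -> re.Pattern:
    """Sign, exactly x digits and, only if p > 0, a dot plus exactly p digits."""
    # Hypothetical helper: the fractional part is dropped entirely when p == 0.
    fraction = rf"\.\d{{{p}}}" if p > 0 else ""
    return re.compile(rf"[+-]\d{{{x}}}{fraction}")


two_places = decimal_pattern(3, 2)  # pattern text: [+-]\d{3}\.\d{2}
no_places = decimal_pattern(3, 0)   # pattern text: [+-]\d{3}

# fullmatch requires the whole string to fit the pattern.
print(two_places.fullmatch("+120.45") is not None)  # True
print(two_places.fullmatch("+120") is not None)     # False
print(no_places.fullmatch("+120") is not None)      # True
print(no_places.fullmatch("+120.") is not None)     # False
```
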
----------------------------------------
For #2, the last expression you listed:

(\s\d|\d{2,2}).\d{2,2}.\d{4,4}

matches correctly. If you check the Match object's value,
it will not include the leading space if a leading zero
exists. If you want to make sure that the input string
contains only what you are trying to match, place ^
(beginning of string) at the beginning and $ (end of
string) at the end, and escape the dots so that they match
only a literal period:

^(\s\d|\d{2})\.\d{2}\.\d{4}$
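
A small sketch of the difference the anchors make, written in Python rather than .NET. It assumes the dots are meant as literal periods and the leading blank is a plain space (so a literal space is used instead of \s); the names DATE, UNANCHORED and is_date are made up for illustration:

```python
import re

# Anchored: the whole input must be a date with "DD" or " D" as the day.
DATE = re.compile(r"^( \d|\d{2})\.\d{2}\.\d{4}$")
# Unanchored: the same pattern, free to start anywhere in the input.
UNANCHORED = re.compile(r"( \d|\d{2})\.\d{2}\.\d{4}")


def is_date(text: str) -> bool:
    return DATE.match(text) is not None


print(is_date("01.01.2003"))   # True
print(is_date(" 1.01.2003"))   # True
print(is_date(" 01.01.2003"))  # False

# Without anchors the search skips the leading space and matches the rest.
print(repr(UNANCHORED.search(" 01.01.2003").group()))  # '01.01.2003'
```
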
Hope this helps,

Brian Davis
www.knowdotnet.com


Nov 15 '05 #3

Try this one:

^DECIMAL(\(\d+(, \d+)?\))?$
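
A quick check of this pattern against the three wanted forms plus two near misses, sketched in Python (the pattern text itself is unchanged; the variable names are made up for illustration):

```python
import re

DECIMAL_TYPE = re.compile(r"^DECIMAL(\(\d+(, \d+)?\))?$")

samples = ["DECIMAL", "DECIMAL(5)", "DECIMAL(15, 5)", "DECIMAL(15,5)", "DECIMAL()"]
for text in samples:
    # The last two fail: the pattern insists on ", " and on at least one digit.
    print(f"{text!r}: {DECIMAL_TYPE.match(text) is not None}")
```

The nested optional groups do the work: the outer (...)? makes the whole parenthesized part optional, and the inner (, \d+)? makes the second number optional.
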
Brian Davis
www.knowdotnet.com

Nov 15 '05 #4
Marc,

The usual way to do what you want is to use the '?' quantifier, which means
"match 0 or 1 time". So, to add the plus or minus, you get:

(\+|-)?\d+

plus or minus one or zero times, followed by one or more digits. (The plus
sign has to be escaped, because an unescaped + is itself a quantifier.)

Add in the decimal part:

(\+|-)?\d+(\.\d+)?

adds in "." followed by one or more digits, with that whole group matched
zero or one time.
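
A short sketch of the optional-group idea in Python (the '?' quantifier behaves the same way as in .NET here). The plus sign is escaped, since an unescaped + is a quantifier; the names NUMBER and samples are made up for illustration:

```python
import re

# Optional sign, one or more digits, optional "." followed by one or more digits.
NUMBER = re.compile(r"(\+|-)?\d+(\.\d+)?")

samples = ["120", "+120", "-120.45", "120.", ".45"]
for text in samples:
    # fullmatch: the entire string has to fit, so "120." and ".45" are rejected.
    print(f"{text!r}: {NUMBER.fullmatch(text) is not None}")
```
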
Lastly, you might want to download my regular expression workbench at the
csharp site below. It will make playing with regex much easier.

--
Eric Gunnerson

Visit the C# product team at http://www.csharp.net
Eric's blog is at http://blogs.gotdotnet.com/ericgu/

This posting is provided "AS IS" with no warranties, and confers no rights.

Nov 15 '05 #5
>The usual way to do what you want is to use the '?' quantifier, which means
>"match 0 or 1 time". So, to add the plus or minus, you get:
>Add in the decimal part:
>(\+|-)?\d+(\.\d+)?

Yeah, I stumbled across that after a while of *not* looking at my
regex :-) Thanks.

>Lastly, you might want to download my regular expression workbench at the
>csharp site below. It will make playing with regex much easier.

Excellent, thanks so much!

Marc

Nov 15 '05 #6
>Try this one:
>^DECIMAL(\(\d+(, \d+)?\))?$

Thanks - it would work, trouble is, I also need to recognize other
types such as DATE, INTEGER, CHAR(x), VARCHAR(y) and so forth, and
they're not on a line of their own (so I can't use the ^ and $
delimiters).

I think I got it figured out by now - thanks for your input! Highly
appreciated.
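
One possible way to handle type names that sit in the middle of a line is to swap the ^ and $ anchors for \b word boundaries and scan with a search. This is only a sketch in Python, not necessarily the solution arrived at in this thread; the sample line, the list of type names in the pattern and the variable names are all made up for illustration:

```python
import re

# \b keeps CHAR from matching inside VARCHAR; the size part stays optional.
SQL_TYPE = re.compile(r"\b(DECIMAL|INTEGER|VARCHAR|CHAR|DATE)\b(\(\d+(, \d+)?\))?")

# Hypothetical input: types embedded in a longer line.
line = "price DECIMAL(15, 5), name VARCHAR(40), born DATE"

found = [m.group(0) for m in SQL_TYPE.finditer(line)]
print(found)  # ['DECIMAL(15, 5)', 'VARCHAR(40)', 'DATE']
```
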

Marc

Nov 15 '05 #7
