regular expression for integer and decimal numbers

I want to pick all integers and decimal numbers out of a string.
Would this be the most correct regular expression to use?

"\d+\.?\d*"
Jul 18 '05 #1
On 23 Sep 2004 17:51:17 -0700, gary <ga*********@gm ail.com> wrote:
I want to pick all integers and decimal numbers out of a string.
Would this be the most correct regular expression to use?

"\d+\.?\d*"


That will work for numbers such as 0123 12.345 12. 0.5 -- but it
won't work for the following:
0x12AB .5 10e-3 -15 123L
If you want to handle some of those, then you'll need a more complicated regex.
If you want to accept numbers of the form .5 but don't care about 12.
then a better regex would be
\d*\.?\d+
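A quick way to check the cases listed above is to test each pattern against whole tokens. This is only a sketch in modern Python 3; the helper name `accepts` and the sample list are made up for illustration, and `re.fullmatch` is used so that a partial match (for example the `10` in `10e-3`) does not count as success.

```python
import re

FIRST = re.compile(r"\d+\.?\d*")   # pattern from the original question
SECOND = re.compile(r"\d*\.?\d+")  # variant that accepts .5 but not 12.

def accepts(pattern, text):
    """Return True only if the whole token matches the pattern."""
    return pattern.fullmatch(text) is not None

samples = ["0123", "12.345", "12.", "0.5", "0x12AB", ".5", "10e-3", "-15", "123L"]
for text in samples:
    # Columns: token, accepted by FIRST, accepted by SECOND
    print(text, accepts(FIRST, text), accepts(SECOND, text))
```

With `re.findall` instead of `re.fullmatch`, both patterns would still pull fragments such as `0` and `12` out of `0x12AB`, which is usually not what is wanted.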
Jul 18 '05 #2
Andrew Durdin wrote:
That will work for numbers such as 0123 12.345 12. 0.5 -- but it
won't work for the following:
0x12AB .5 10e-3 -15 123L


This will handle the normal floats including a leading + or -
and trailing exponent, all optional.

r"[+-]?((\d+(\.\d*)?)|\.\d+)([eE][+-]?[0-9]+)?"
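As a rough illustration of how that pattern behaves, here is a small Python 3 sketch. The names `FLOAT_RE` and `find_numbers` are invented for this example. Because the pattern contains capture groups, `re.findall` would return tuples of groups, so the sketch uses `finditer` and takes `group(0)` to get the full matched text.

```python
import re

# Optional sign, then digits with an optional fraction or a bare fraction,
# then an optional exponent.
FLOAT_RE = re.compile(r"[+-]?((\d+(\.\d*)?)|\.\d+)([eE][+-]?[0-9]+)?")

def find_numbers(text):
    """Return the full text of every match, in order of appearance."""
    return [m.group(0) for m in FLOAT_RE.finditer(text)]

print(find_numbers("0123 12.345 12. 0.5 .5 10e-3 -15"))
# ['0123', '12.345', '12.', '0.5', '.5', '10e-3', '-15']

print(find_numbers("4-11-36%"))
# ['4', '-11', '-36']
```

The second call shows a catch for dash-separated fields: each dash is read as a minus sign, so such fields need to be split before this pattern is applied.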

Andrew
da***@dalkescie ntific.com
Jul 18 '05 #3
gary wrote:
I want to pick all integers and decimal numbers out of a string.
Would this be the most correct regular expression to use?

"\d+\.?\d*"


Examples, including the most extreme cases you want to handle,
are always a good idea.

-Peter
Jul 18 '05 #4
Peter Hansen <pe***@engcorp. com> wrote in message news:<pb******* *************@p owergate.ca>...
gary wrote:
I want to pick all integers and decimal numbers out of a string.
Would this be the most correct regular expression to use?

"\d+\.?\d*"


Examples, including the most extreme cases you want to handle,
are always a good idea.

-Peter


Here is an example of what I will be dealing with:
"""
TOTAL FIRST DOWNS 19 21
By Rushing 11 6
By Passing 6 10
By Penalty 2 5
THIRD DOWN EFFICIENCY 4-11-36% 6-14-43%
FOURTH DOWN EFFICIENCY 0-1-0% 0-0-0%
TOTAL NET YARDS 379 271
Total Offensive Plays (inc. times thrown passing) 58 63
Average gain per offensive play 6.5 4.3
NET YARDS RUSHING 264 115
"""

I can only hope that they were nice and put a leading zero in front of
numbers less than 1.
Jul 18 '05 #5
On 25 Sep 2004 13:13:22 -0700, ga*********@gma il.com (gary) wrote:
Peter Hansen <pe***@engcorp. com> wrote in message news:<pb******* *************@p owergate.ca>...
gary wrote:
> I want to pick all integers and decimal numbers out of a string.
> Would this be the most correct regular expression to use?
>
> "\d+\.?\d*"


Examples, including the most extreme cases you want to handle,
are always a good idea.

-Peter


Here is an example of what I will be dealing with:
"""
TOTAL FIRST DOWNS 19 21
By Rushing 11 6
By Passing 6 10
By Penalty 2 5
THIRD DOWN EFFICIENCY 4-11-36% 6-14-43%
FOURTH DOWN EFFICIENCY 0-1-0% 0-0-0%
TOTAL NET YARDS 379 271
Total Offensive Plays (inc. times thrown passing) 58 63
Average gain per offensive play 6.5 4.3
NET YARDS RUSHING 264 115
"""

I can only hope that they were nice and put a leading zero in front of
numbers less than 1.


Are you sure you want to throw away all the info implicit in the structure of that data?
How about the columns? Will you get other input with more columns? Otherwise if your
numeric fields are as they appear, maybe just
>>> def extract(s):
...     for a in s.split():
...         if not a[0].isdigit(): continue
...         if a.endswith('%'):
...             for i in map(int,a[:-1].split('-')): yield i
...         elif '.' in a: yield float(a)
...         else: yield int(a)
...
>>> s = (
... """
... TOTAL FIRST DOWNS 19 21
... By Rushing 11 6
... By Passing 6 10
... By Penalty 2 5
... THIRD DOWN EFFICIENCY 4-11-36% 6-14-43%
... FOURTH DOWN EFFICIENCY 0-1-0% 0-0-0%
... TOTAL NET YARDS 379 271
... Total Offensive Plays (inc. times thrown passing) 58 63
... Average gain per offensive play 6.5 4.3
... NET YARDS RUSHING 264 115
... """
... )
>>> for num in extract(s): print num,
...
19 21 11 6 6 10 2 5 4 11 36 6 14 43 0 1 0 0 0 0 379 271 58 63 6.5 4.3 264 115

But I doubt that's what you really want ;-)
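If the row labels and columns do matter, one possible approach is to peel value tokens off the right-hand end of each line and keep the rest as the label. This is a sketch in Python 3 under the assumption that every value token starts with a digit or a minus sign followed by a digit; `split_row`, `parse_stats`, and `SAMPLE` are hypothetical names, and the values are left as strings so that fields like 4-11-36% can be handled separately.

```python
import re

# A token counts as a value if it starts with an optional minus sign and a digit.
VALUE_TOKEN = re.compile(r"-?\d")

def split_row(line):
    """Split one stat line into (label, [value tokens]), working from the right."""
    tokens = line.split()
    values = []
    while tokens and VALUE_TOKEN.match(tokens[-1]):
        values.insert(0, tokens.pop())
    return " ".join(tokens), values

def parse_stats(text):
    """Map each row label to its list of column values (still strings)."""
    rows = {}
    for line in text.splitlines():
        label, values = split_row(line)
        if label and values:
            rows[label] = values
    return rows

SAMPLE = """
By Rushing 11 6
THIRD DOWN EFFICIENCY 4-11-36% 6-14-43%
Total Offensive Plays (inc. times thrown passing) 58 63
Average gain per offensive play 6.5 4.3
"""

for label, values in parse_stats(SAMPLE).items():
    print(label, "->", values)
```

Keeping the label next to its values makes it possible to tell which number belongs to which team, which the flat list above cannot do.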

Regards,
Bengt Richter
Jul 18 '05 #6
gary wrote:
Peter Hansen <pe***@engcorp. com> wrote in message news:<pb******* *************@p owergate.ca>...
Examples, including the most extreme cases you want to handle,
are always a good idea.


Here is an example of what I will be dealing with:
"""
TOTAL FIRST DOWNS 19 21
By Rushing 11 6
By Passing 6 10
By Penalty 2 5
THIRD DOWN EFFICIENCY 4-11-36% 6-14-43%
FOURTH DOWN EFFICIENCY 0-1-0% 0-0-0%
TOTAL NET YARDS 379 271
Total Offensive Plays (inc. times thrown passing) 58 63
Average gain per offensive play 6.5 4.3
NET YARDS RUSHING 264 115
"""

I can only hope that they were nice and put a leading zero in front of
numbers less than 1.


Good example of the input. Now all you need to do is tell
us exactly what kind of output you would expect to come
from the routine which you seek. ;-)

-Peter
Jul 18 '05 #7
bo**@oz.net (Bengt Richter) wrote in message news:<cj******* *************** ***@theriver.co m>...
On 25 Sep 2004 13:13:22 -0700, ga*********@gma il.com (gary) wrote:
Peter Hansen <pe***@engcorp. com> wrote in message news:<pb******* *************@p owergate.ca>...
gary wrote:
> I want to pick all integers and decimal numbers out of a string.
> Would this be the most correct regular expression to use?
>
> "\d+\.?\d*"

Examples, including the most extreme cases you want to handle,
are always a good idea.

-Peter
Here is an example of what I will be dealing with:
"""
TOTAL FIRST DOWNS 19 21
By Rushing 11 6
By Passing 6 10
By Penalty 2 5
THIRD DOWN EFFICIENCY 4-11-36% 6-14-43%
FOURTH DOWN EFFICIENCY 0-1-0% 0-0-0%
TOTAL NET YARDS 379 271
Total Offensive Plays (inc. times thrown passing) 58 63
Average gain per offensive play 6.5 4.3
NET YARDS RUSHING 264 115
"""
Are you sure you want to throw away all the info implicit in the structure of that data?
How about the columns? Will you get other input with more columns?


There are several other instances in the files that I am extracting
data from where the numbers are not so nicely arranged in columns, so
I am really looking for something that could be used in all instances.
(http://www.nfl.com/gamecenter/gamebo...020929_TEN@OAK)

I do however still need to convert everything from string to numbers.
I was thinking about using the following for that unless someone has a
better solution:
>>> def StrToNum(str):
...     try: return int(str)
...     except ValueError:
...         try: return float(str)
...         except ValueError: return str
...
>>> statlist = ['10', '6', '2002', 'tampa bay buccaneers', 'atlanta falcons', 'the georgia dome', '1', '03', 'pm', 'est', 'artificial',
... '0', '3', '7', '10', '0', '20', '3', '0', '3', '0', '0', '6', '15',
... '14', '5', '2', '9', '10', '1', '2', '4', '13', '31', '3', '14', '21',
... '1', '1', '100', '0', '1', '0', '327', '243', '59', '64', '5.5',
... '3.8', '74', '70', '26', '22', '2.8', '3.2', '2', '3', '2', '3',
... '253', '173', '2', '8', '4', '14', '261', '187', '31', '17', '1',
... '38', '17', '4', '7.7', '4.1', '5', '3', '0', '3', '2', '2', '5',
... '43.2', '5', '45.6', '0', '0', '0', '0', '0', '0', '31.2', '41.6',
... '50', '40', '0', '0', '3', '40', '0', '0', '5', '120', '4', '50', '1',
... '0', '6', '35', '6', '41', '1', '1', '0', '0', '2', '0', '0', '0',
... '1', '0', '1', '0', '2', '2', '0', '0', '2', '2', '0', '0', '2', '2',
... '2', '3', '0', '2', '0', '0', '2', '0', '0', '1', '0', '0', '0', '0',
... '0', '0', '20', '6', '29', '34', '30', '26', '3', '37', '9', '59',
... '9', '35', '6', '23', 0, 0, '11', '23', '5', '01', '5', '25', '8',
... '37', 0, 0, '26']
>>> [StrToNum(item) for item in statlist]
[10, 6, 2002, 'tampa bay buccaneers', 'atlanta falcons', 'the georgia dome',
1, 3, 'pm', 'est', 'artificial', 0, 3, 7, 10, 0, 20, 3, 0, 3,
0, 0, 6, 15, 14, 5, 2, 9, 10, 1, 2, 4, 13, 31, 3, 14, 21, 1, 1, 100,
0, 1, 0, 327, 243, 59, 64, 5.5, 3.7999999999999998, 74, 70, 26, 22,
2.7999999999999998, 3.2000000000000002, 2, 3, 2, 3, 253, 173, 2, 8, 4,
14, 261, 187, 31, 17, 1, 38, 17, 4, 7.7000000000000002,
4.0999999999999996, 5, 3, 0, 3, 2, 2, 5, 43.200000000000003, 5,
45.600000000000001, 0, 0, 0, 0, 0, 0, 31.199999999999999,
41.600000000000001, 50, 40, 0, 0, 3, 40, 0, 0, 5, 120, 4, 50, 1, 0, 6,
35, 6, 41, 1, 1, 0, 0, 2, 0, 0, 0, 1, 0, 1, 0, 2, 2, 0, 0, 2, 2, 0, 0,
2, 2, 2, 3, 0, 2, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 20, 6, 29, 34,
30, 26, 3, 37, 9, 59, 9, 35, 6, 23, 0, 0, 11, 23, 5, 1, 5, 25, 8, 37,
0, 0, 26]
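The long tails in that output (3.7999999999999998 for '3.8') are binary floating point showing through. If exact decimal values matter, one option is `decimal.Decimal` from the standard library. The following is a Python 3 sketch rather than a drop-in replacement; `str_to_num` is a hypothetical name, and it also avoids shadowing the built-in `str`.

```python
from decimal import Decimal, InvalidOperation

def str_to_num(item):
    """Return an int, a Decimal, or the original item if it is not numeric."""
    if not isinstance(item, str):
        return item  # already a number, like the bare 0 entries in statlist
    try:
        return int(item)
    except ValueError:
        pass
    try:
        return Decimal(item)
    except InvalidOperation:
        return item  # not a number at all, e.g. 'pm'

print([str_to_num(x) for x in ['10', '03', '3.8', 'pm', 0]])
# [10, 3, Decimal('3.8'), 'pm', 0]
```

One caveat: `Decimal` also accepts strings such as 'NaN' and 'Infinity', so free text containing those words on their own would be converted too.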

Another thing was that I found a negative number, which kind of screws up
the regexes previously discussed. So I came up with the workaround
below:
>>> import re
>>> str = """
... FGs - PATs Had Blocked 0-0 0-0
... Net Punting Average -6.3 33.3
... TOTAL RETURN YARDAGE (Not Including Kickoffs) 14 257
... No. and Yards Punt Returns 1-14 2-157
... """
>>> str = re.sub(r"(\d+)-",r"\1 ",str) #replace number followed by dash with number followed by space
>>> teamstats = re.findall(r"-?\d+\.?\d*",str) #regex discussed before but with an optional negative sign in front
>>> teamstats
['0', '0', '0', '0', '-6.3', '33.3', '14', '257', '1', '14', '2',
'157']
>>> [StrToNum(item) for item in teamstats]
[0, 0, 0, 0, -6.2999999999999998, 33.299999999999997, 14, 257, 1, 14,
2, 157]
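The same result can probably be had in one pass with a negative lookbehind, so the text does not need to be rewritten first. This is a Python 3 sketch that assumes a dash directly after a digit is always a separator and never a minus sign; `NUMBER_RE` and `extract_numbers` are made-up names.

```python
import re

# (?<!\d) refuses a match that starts right after a digit, so the dash in
# "1-14" is skipped as a separator while the one in " -6.3" is kept as a sign.
NUMBER_RE = re.compile(r"(?<!\d)-?\d+(?:\.\d+)?")

def extract_numbers(text):
    """Return ints and floats in order of appearance."""
    return [float(tok) if "." in tok else int(tok)
            for tok in NUMBER_RE.findall(text)]

sample = """
FGs - PATs Had Blocked 0-0 0-0
Net Punting Average -6.3 33.3
TOTAL RETURN YARDAGE (Not Including Kickoffs) 14 257
No. and Yards Punt Returns 1-14 2-157
"""

print(extract_numbers(sample))
# [0, 0, 0, 0, -6.3, 33.3, 14, 257, 1, 14, 2, 157]
```

The pattern has no capture groups, so `findall` returns the matched text directly.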

Gary
Jul 18 '05 #8
Peter Hansen <pe***@engcorp. com> wrote in message news:<jf******* *************@p owergate.ca>...
Good example of the input. Now all you need to do is tell
us exactly what kind of output you would expect to come
from the routine which you seek. ;-)

-Peter


Well for that particular example something of the form...

Cleveland at Cincinnati +8

would be nice ;-)
Jul 18 '05 #9
gary wrote:
Peter Hansen <pe***@engcorp. com> wrote in message news:<jf******* *************@p owergate.ca>...
Good example of the input. Now all you need to do is tell
us exactly what kind of output you would expect to come
from the routine which you seek. ;-)


Well for that particular example something of the form...

Cleveland at Cincinnati +8

would be nice ;-)


I know nothing about American football except that it
isn't played with a puck, so I don't think I get the joke...

-Peter
Jul 18 '05 #10
