Getting different Regular Expression results

Hi ,

Could anybody help me fix this problem? I am getting different results from the same regular expression.

Sample 1:

    import re
    strLine = ' 1 THRU 20    '
    sList = re.findall(r'\d+ THRU \d+|\d+', strLine)
    print(sList)

Output:
['1 THRU 20'] (which is correct)

Sample 2:

    import re
    strLine = '  8001  THRU 10828  '
    sList = re.findall(r'\d+ THRU \d+|\d+', strLine)
    print(sList)

Output:
['8001', '10828'] (which differs from the result above)

The correct output should be:

['8001 THRU 10828']


Thanks
PSB
Mar 29 '07 #1
bartonc
Your regular expression doesn't account for the two spaces between '8001' and 'THRU'.
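A minimal sketch of that mismatch, reusing the pattern and the two strings from the question. The variable names `one_space` and `two_spaces` are only illustrative and do not appear in the original code.

```python
import re

pattern = r'\d+ THRU \d+|\d+'   # a single literal space on each side of THRU

one_space = ' 1 THRU 20    '          # one space before THRU
two_spaces = '  8001  THRU 10828  '   # two spaces before THRU

# The first alternative needs exactly one space, so it matches here.
result_one = re.findall(pattern, one_space)

# With two spaces the first alternative fails at '8001', and the
# fallback alternative \d+ picks up each number on its own.
result_two = re.findall(pattern, two_spaces)

print(result_one)   # ['1 THRU 20']
print(result_two)   # ['8001', '10828']
```

The pattern is not behaving inconsistently. The literal space in it matches exactly one space character, nothing more.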
Mar 29 '07 #2
psbasha
Thanks Barton,

How can I fix this problem?

PSB
Mar 29 '07 #3
bartonc
You are welcome.

Regular Expressions are a good tool. Well worth the time to learn.
Mar 29 '07 #4
bvdet
    >>> import re
    >>> strLine = '  8001  THRU 10828  '
    >>> sList = re.findall(r'\d+.+THRU.+\d+|\d+', strLine)
    >>> sList
    ['8001  THRU 10828']
    >>>
Mar 30 '07 #5
psbasha
Hi BV ,

Thanks for the reply.

I have an issue when the string is in the format below. How can I fix it?

Sample 1:

    import re
    strLine = '1 = 11001 THRU 11848'
    sList = re.findall(r'\d+ THRU \d+|\d+', strLine)
    print(sList)

Output: ['1', '11001 THRU 11848']


Sample 2:

    import re
    strLine = '1 = 11001 THRU 11848'
    sList = re.findall(r'\d+.+THRU.+\d+|\d+', strLine)
    print(sList)

Output: ['1 = 11001 THRU 11848']

I would like my output to be ['1', '11001 THRU 11848'].

Likewise, for the strLine '14 = 8001 THRU 10828 ' the output should be ['14', '8001 THRU 10828'].

-PSB
Mar 31 '07 #6
bvdet
    >>> strLine = '1 = 11001 THRU 11848'
    >>> re.findall(r'\d+ +THRU +\d+|\d+', strLine)
    ['1', '11001 THRU 11848']
    >>> re.findall(r'\d+\s+THRU\s+\d+|\d+', strLine)
    ['1', '11001 THRU 11848']
    >>>
Notice what happens when I rearrange the expression a bit:
    >>> re.findall(r'\d+|\d+\s+THRU\s+\d+', strLine)
    ['1', '11001', '11848']
    >>>
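The order matters because Python's `re` module tries the alternatives from left to right at each position and takes the first one that succeeds, not the longest one. A small sketch using the same string; the names `specific_first` and `generic_first` are only illustrative.

```python
import re

strLine = '1 = 11001 THRU 11848'

# Longer, more specific alternative first: the range is tried before the bare number.
specific_first = re.findall(r'\d+\s+THRU\s+\d+|\d+', strLine)

# Bare number first: \d+ succeeds wherever a run of digits starts,
# so the THRU alternative is never reached.
generic_first = re.findall(r'\d+|\d+\s+THRU\s+\d+', strLine)

print(specific_first)  # ['1', '11001 THRU 11848']
print(generic_first)   # ['1', '11001', '11848']
```

As a rule of thumb, put the most specific alternative first and the catch-all last.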
Mar 31 '07 #7
ghostdog74
You could have easily got your results with split():

    >>> '14 = 8001 THRU 10828'.split(" = ")
    ['14', '8001 THRU 10828']
    >>>
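If the spacing around the '=' can vary, a slightly more tolerant variant of the same idea is to split on the '=' alone and strip each part. This is only a sketch: it assumes every line has one '=' separating the ID from the range, and `split_set_line` is a hypothetical helper name, not something from the original code.

```python
def split_set_line(line):
    """Split 'ID = RANGE' on the first '=' and trim whitespace from both parts."""
    left, sep, right = line.partition('=')
    if not sep:
        # partition() returns an empty separator when '=' is absent
        raise ValueError('no "=" found in line: %r' % line)
    return [left.strip(), right.strip()]

print(split_set_line('14 = 8001 THRU 10828 '))   # ['14', '8001 THRU 10828']
print(split_set_line('1=11001 THRU 11848'))      # ['1', '11001 THRU 11848']
```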
Mar 31 '07 #8
psbasha
HI BV,

Thanks for the reply.

I still have a problem reading the string data. In my earlier post, "Reading and writing a text file", I am reading a different SETS file format. When I use this regular expression pattern, the data from the earlier file can no longer be read.

Could you please refer to my posting "Reading and writing a text file" and let me know the exact pattern?

-PSB
Apr 1 '07 #9
bvdet
PSB - See my last post in the referenced thread. I posted the code for parsing the data as you had described. The only difference between that data and your problem in this thread is extra spaces around keyword "THRU" - is this correct? If so, all you have to do is modify function getThruData(s) slightly as follows:
Old code:

    sList = re.findall('\d+ THRU \d+|\d+', s)

New code:

    sList = re.findall('\d+ +THRU +\d+|\d+', s)

The added '+' characters enable a match if one or more spaces occur around "THRU". You really need to study that code to understand how it works.
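To make the effect of the quantifier concrete, here is a small comparison of a bare space, ' +', and ' *'. The ' *' variant and the third sample string (no spaces at all) are my own additions for contrast and do not come from the data in this thread.

```python
import re

samples = ['8001 THRU 10828', '8001  THRU   10828', '8001THRU10828']

exactly_one = re.compile(r'\d+ THRU \d+')      # ' '  : exactly one space
one_or_more = re.compile(r'\d+ +THRU +\d+')    # ' +' : one or more spaces
zero_or_more = re.compile(r'\d+ *THRU *\d+')   # ' *' : zero or more spaces

# Print, for each sample, whether each pattern finds a match.
for text in samples:
    print(repr(text),
          bool(exactly_one.search(text)),
          bool(one_or_more.search(text)),
          bool(zero_or_more.search(text)))
```

Only the first sample matches all three patterns. The second needs ' +' or ' *', and the third matches ' *' only.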
Apr 1 '07 #10
psbasha
Hi BV,

Thanks, it is working fine.

How do we decide on the pattern when using regular expressions?

Do we go by trial and error, or do we first collect the different possible formats and then decide on the pattern?

-PSB
Apr 1 '07 #11
bvdet
Ideally the data should be in a strict format designed for ease of parsing. In my experience, it is best to determine the different formats first and then decide how best to parse them.
Apr 1 '07 #12
ghostdog74
You have to sit down and think through the different formats your input file may take, and then construct the expression to fit all possible scenarios. Regular expressions are a powerful tool, but they can confuse new users. Also, overusing them makes your programs harder to debug and enhance. It takes practice to really understand their mechanics too. As the saying goes:
Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems. (Jamie Zawinski)
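One way to put the "collect the formats first" advice into practice is to keep the known input lines next to their expected results and check the pattern against all of them. This is a rough sketch: the cases are taken from the strings in this thread, while `THRU_PATTERN` and `check_pattern` are hypothetical names, not part of the code posted in the referenced thread.

```python
import re

THRU_PATTERN = re.compile(r'\d+\s+THRU\s+\d+|\d+')

# (input line, expected result) pairs taken from the examples in this thread
cases = [
    (' 1 THRU 20    ', ['1 THRU 20']),
    ('  8001  THRU 10828  ', ['8001  THRU 10828']),
    ('1 = 11001 THRU 11848', ['1', '11001 THRU 11848']),
    ('14 = 8001 THRU 10828 ', ['14', '8001 THRU 10828']),
]

def check_pattern(pattern, cases):
    """Return the (line, expected, actual) triples for which findall disagrees."""
    failures = []
    for line, expected in cases:
        actual = pattern.findall(line)
        if actual != expected:
            failures.append((line, expected, actual))
    return failures

failures = check_pattern(THRU_PATTERN, cases)
print(failures)  # an empty list means every known format is handled
```

When a new file format turns up, add a line for it to `cases` before touching the pattern. That way a change that fixes one format cannot silently break another.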
Apr 2 '07 #13
