regexp qns

Hi,
Suppose I have a string like

test1?test2t-test3*test4*test5$test6#test7*test8

How can I construct a regexp that returns test3*test4*test5 and
test7*test8, i.e., one that matches each * together with the words before and after it?
Thanks

Jan 20 '07 #1
py> import re
py> s = 'test1?test2t-test3*test4*test5$test6#test7*test8'
py> r = re.compile(r'(test\d(?:\*test\d)+)')
py> r.findall(s)
['test3*test4*test5', 'test7*test8']

James
Jan 20 '07 #2
Thanks!
I checked the regexp docs, which say:
"""
(?:...)
A non-grouping version of regular parentheses. Matches whatever
regular expression is inside the parentheses, but the substring matched
by the group cannot be retrieved after performing a match or referenced
later in the pattern.
"""
But I could not understand this: r'(test\d(?:\*test\d)+)'. Which
parentheses is the doc referring to? Sorry, could you explain the solution?
Thanks

Jan 20 '07 #3

I suppose this is just an example and you mean "any word" instead of test1,
test2, etc.
So your pattern would be: word*word*word*word, that is, word* repeated many
times, followed by another word.
To match a word we'll use "\w+", to match an * we have to use "\*" (it's a
special character)
So the regexp would be: "(\w+\*)+\w+"
Since we are not interested in the () as a group by itself -it was just to
describe the repeating pattern- we change it into a non-grouping
parenthesis.
Final version: "(?:\w+\*)+\w+"

import re

# One or more "word*" pieces, followed by a final word.
rexp = re.compile(r"(?:\w+\*)+\w+")
lines = [
    'test1?test2t-test3*test4*test5$test6#test7*test8',
    'test1?test2t-test3*test4$test6#test7_test8',
    'test1?nada-que-ver$esto.no.matchea',
    'test1?test2t-test3*test4*',
    'test1?test2t-test3*test4',
    'test1?test2t-test3*',
]

# Print each line followed by every match found in it.
for line in lines:
    print(line)
    for txt in rexp.findall(line):
        print('->', txt)

Test it with some corner cases and see if it does what you expect: no "*",
starting with "*", ending with "*", embedded whitespace before and after the
"*", whitespace inside a word, the very definition of "word"...
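A few of the corner cases listed above can be checked with a short script. This is only a sketch: the sample strings and the `cases` and `results` names are made up for illustration and are not from the original post.

```python
import re

rexp = re.compile(r"(?:\w+\*)+\w+")

# Hypothetical inputs, one per corner case mentioned above.
cases = {
    "no star": "test1?test2",
    "leading star": "*test1*test2",
    "trailing star": "test1*test2*",
    "spaces around star": "test1 * test2",
    "space inside a word": "te st1*test2",
    "underscore and digits": "a_1*b_2",
}

# Map each case name to the list of matches found in its sample string.
results = {name: rexp.findall(text) for name, text in cases.items()}

for name, found in results.items():
    print(f"{name}: {found}")
```

Note that a leading or trailing "*" is silently dropped from the match, and whitespace next to the "*" prevents a match altogether, because "\w" matches only letters, digits and the underscore.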

--
Gabriel Genellina
Jan 20 '07 #4
ei***********@yahoo.com wrote:
> but i could not understand this : r'(test\d(?:\*test\d)+)'. which
> parenthesis is it referring to? Sorry, could you explain the solution ?
The outer parentheses are the grouping operator. What they match is saved and
accessible from a match object via the group() or groups() methods. The "\d"
part matches a single digit 0-9. The (?:...) construct makes a non-grouping
operator: it bundles "\*test\d" so that the "+" can repeat it, but its match
is not remembered for access through the group() or groups() methods. The
expression can also reference earlier groups (for example with \1), but not
groups specified with the non-grouping operator.
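The difference is easiest to see by running the same pattern with and without the "?:". The following is a small sketch, not part of the original reply; the `capturing` variant and the `backref` example string are added here purely for comparison.

```python
import re

s = 'test1?test2t-test3*test4*test5$test6#test7*test8'

# Outer group is remembered; inner (?:...) only groups for repetition.
non_capturing = re.compile(r'(test\d(?:\*test\d)+)')
# Same pattern, but the inner parentheses are a remembered group too.
capturing = re.compile(r'(test\d(\*test\d)+)')

m1 = non_capturing.search(s)
m2 = capturing.search(s)

print(m1.group(0))  # the whole match
print(m1.groups())  # one remembered group
print(m2.groups())  # two remembered groups; a repeated group keeps its last repetition

# With one group findall returns strings, with two it returns tuples.
print(non_capturing.findall(s))
print(capturing.findall(s))

# A backreference (\1) must repeat exactly what group 1 matched.
backref = re.search(r'(test\d)\*\1', 'test4*test3*test3')
print(backref.group(0))
```

With the capturing variant, findall() returns tuples instead of plain strings, which is why the non-grouping form is the more convenient one here.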

You may want to note that this is the most specific regular expression
that would match your given example.
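To illustrate how specific it is, here is a rough comparison with the more general "(?:\w+\*)+\w+" pattern suggested earlier in the thread. The sample strings and the `specific`, `general` and `outcome` names are invented for this sketch.

```python
import re

specific = re.compile(r'(test\d(?:\*test\d)+)')
general = re.compile(r'(?:\w+\*)+\w+')

# Hypothetical inputs: single digits, two-digit numbers, and other words.
samples = ['test3*test4', 'test10*test11', 'foo*bar']

# For each sample, record what each pattern finds.
outcome = {text: (specific.findall(text), general.findall(text))
           for text in samples}

for text, (found_specific, found_general) in outcome.items():
    print(text, found_specific, found_general)
```

The specific pattern requires the literal text "test" followed by exactly one digit on both sides of each "*", so it finds nothing in the last two samples, while the general pattern accepts any run of word characters.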

James
Jan 21 '07 #5
