regexp qns

hi
suppose i have a string like

test1?test2t-test3*test4*test5$test6#test7*test8

how can i construct the regexp to get test3*test4*test5 and
test7*test8, ie, i want to match * and the words before and after?
thanks

Jan 20 '07 #1
ei***********@yahoo.com wrote:
> hi
> suppose i have a string like
>
> test1?test2t-test3*test4*test5$test6#test7*test8
>
> how can i construct the regexp to get test3*test4*test5 and
> test7*test8, ie, i want to match * and the words before and after?
> thanks

py> import re
py> s = 'test1?test2t-test3*test4*test5$test6#test7*test8'
py> r = re.compile(r'(test\d(?:\*test\d)+)')
py> r.findall(s)
['test3*test4*test5', 'test7*test8']

James
Jan 20 '07 #2
James Stroud wrote:
> ei***********@yahoo.com wrote:
>> hi
>> suppose i have a string like
>>
>> test1?test2t-test3*test4*test5$test6#test7*test8
>>
>> how can i construct the regexp to get test3*test4*test5 and
>> test7*test8, ie, i want to match * and the words before and after?
>> thanks
>
> py> import re
> py> s = 'test1?test2t-test3*test4*test5$test6#test7*test8'
> py> r = re.compile(r'(test\d(?:\*test\d)+)')
> py> r.findall(s)
> ['test3*test4*test5', 'test7*test8']
>
> James

thanks!
I checked the regexp doc; it says:
"""
(?:...)
A non-grouping version of regular parentheses. Matches whatever
regular expression is inside the parentheses, but the substring matched
by the group cannot be retrieved after performing a match or referenced
later in the pattern.
"""
but i could not understand this: r'(test\d(?:\*test\d)+)'. which
parenthesis is it referring to? Sorry, could you explain the solution?
thanks

Jan 20 '07 #3

<ei***********@yahoo.com> wrote in message
news:11*********************@l53g2000cwa.googlegroups.com...
> hi
> suppose i have a string like
>
> test1?test2t-test3*test4*test5$test6#test7*test8
>
> how can i construct the regexp to get test3*test4*test5 and
> test7*test8, ie, i want to match * and the words before and after?
> thanks

I suppose this is just an example and you mean "any word" instead of test1,
test2, etc.
So your pattern would be: word*word*word*word, that is, word* repeated one or
more times, followed by another word.
To match a word we'll use "\w+"; to match an * we have to use "\*" (it's a
special character).
So the regexp would be: "(\w+\*)+\w+"
Since we are not interested in the () as a group by itself (it was just to
describe the repeating pattern), we change it into a non-grouping
parenthesis.
Final version: "(?:\w+\*)+\w+"

import re

# One or more "word*" units, followed by a final word
rexp = re.compile(r"(?:\w+\*)+\w+")
lines = [
    'test1?test2t-test3*test4*test5$test6#test7*test8',
    'test1?test2t-test3*test4$test6#test7_test8',
    'test1?nada-que-ver$esto.no.matchea',
    'test1?test2t-test3*test4*',
    'test1?test2t-test3*test4',
    'test1?test2t-test3*',
]

# Print each input line, then every match found in it
for line in lines:
    print(line)
    for txt in rexp.findall(line):
        print('->', txt)

Test it with some corner cases and see if it does what you expect: no "*",
starting with "*", ending with "*", embedded whitespace before and after the
"*", whitespace inside a word, the very definition of "word"...

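A minimal sketch of those corner cases, assuming Python 3 and the final pattern from this post. The sample strings and the names `cases` and `results` are made up for illustration and are not from the original thread:

```python
import re

# Final pattern from the post: one or more "word*" units, then a closing word.
rexp = re.compile(r"(?:\w+\*)+\w+")

# Hypothetical corner-case inputs (illustrative only).
cases = {
    "no star": "alpha beta",
    "leading star": "*alpha*beta",
    "trailing star": "alpha*beta*",
    "spaces around star": "alpha * beta",
    "space inside a word": "al pha*beta",
    "underscore and digits": "a_1*b2",
}

# Collect all matches for each case so they can be inspected or asserted on.
results = {name: rexp.findall(text) for name, text in cases.items()}

for name, found in results.items():
    print(f"{name:24} -> {found}")
```

Two things worth noticing when you run it: a space next to the "*" prevents any match, and "\w" also accepts digits and underscores, so whether this fits depends on what you mean by "word".
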
--
Gabriel Genellina
Jan 20 '07 #4
ei***********@yahoo.com wrote:
> James Stroud wrote:
>> ei***********@yahoo.com wrote:
>>> hi
>>> suppose i have a string like
>>>
>>> test1?test2t-test3*test4*test5$test6#test7*test8
>>>
>>> how can i construct the regexp to get test3*test4*test5 and
>>> test7*test8, ie, i want to match * and the words before and after?
>>> thanks
>>
>> py> import re
>> py> s = 'test1?test2t-test3*test4*test5$test6#test7*test8'
>> py> r = re.compile(r'(test\d(?:\*test\d)+)')
>> py> r.findall(s)
>> ['test3*test4*test5', 'test7*test8']
>>
>> James
>
> thanks!
> I checked the regexp doc; it says:
> """
> (?:...)
> A non-grouping version of regular parentheses. Matches whatever
> regular expression is inside the parentheses, but the substring matched
> by the group cannot be retrieved after performing a match or referenced
> later in the pattern.
> """
> but i could not understand this: r'(test\d(?:\*test\d)+)'. which
> parenthesis is it referring to? Sorry, could you explain the solution?
> thanks

The outer parentheses are the grouping operator. Whatever they match is
saved and accessible from a match object via the group() or groups()
methods. The "\d" part matches a single digit 0-9. The (?:...) construct
makes a non-grouping operator that is not itself remembered for access
through the group() or groups() methods. The expression can also reference
earlier groups (with a backreference such as \1), but not groups specified
with the non-grouping operator.
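
To make the difference concrete, here is a small sketch, assuming Python 3. The variable names and the backreference pattern are only illustrative and were not part of the original reply:

```python
import re

s = 'test1?test2t-test3*test4*test5$test6#test7*test8'

# Outer (...) captures; inner (?:...) only groups "*testN" for repetition.
m = re.search(r'(test\d(?:\*test\d)+)', s)
whole = m.group(0)    # the entire match
outer = m.group(1)    # text captured by the outer parentheses
groups = m.groups()   # only one group is remembered

# If the inner parentheses are capturing too, findall() returns tuples, and
# the repeated inner group keeps only its last repetition.
both_capturing = re.findall(r'(test\d(\*test\d)+)', s)

# A backreference (\1) can refer to an earlier capturing group.
backref = re.search(r'(test)\d\*\1\d', s).group(0)

print(whole, outer, groups)
print(both_capturing)
print(backref)
```

With the non-grouping inner parentheses, findall() hands back plain strings, which is why the original solution returns the two substrings directly.
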

You may want to note that this regular expression is very specific to your
given example: it matches only the literal text "test" followed by a single
digit, so it will not match arbitrary words.
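
As a rough illustration of that specificity, the sketch below (assuming Python 3) compares this pattern with the more general "\w+" pattern suggested elsewhere in the thread. The input string and variable names are invented for the comparison:

```python
import re

# Specific pattern from this reply (outer capturing group omitted for brevity).
specific = re.compile(r'test\d(?:\*test\d)+')
# General "any word" pattern suggested elsewhere in the thread.
general = re.compile(r'(?:\w+\*)+\w+')

# Hypothetical input mixing other words and two-digit suffixes.
text = 'foo*bar-test1*test2+test10*test11'

specific_hits = specific.findall(text)
general_hits = general.findall(text)

print(specific_hits)
print(general_hits)
```

The specific pattern skips both "foo*bar" and "test10*test11" (because "\d" allows exactly one digit), while the general one picks up all three runs.
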

James
Jan 21 '07 #5
