regexp qns - Python

eight02645999

Hi,
Suppose I have a string like

test1?test2t-test3*test4*test5$test6#test7*test8

How can I construct a regexp to get test3*test4*test5 and
test7*test8? That is, I want to match each * together with the words before and after it.
Thanks

Jan 20 '07 #1


James Stroud


>>> import re
>>> s = 'test1?test2t-test3*test4*test5$test6#test7*test8'
>>> r = re.compile(r'(test\d(?:\*test\d)+)')
>>> r.findall(s)
['test3*test4*test5', 'test7*test8']

James

Jan 20 '07 #2

eight02645999


Thanks!
I checked the regexp docs, which say:
"""
(?:...)
A non-grouping version of regular parentheses. Matches whatever
regular expression is inside the parentheses, but the substring matched
by the group cannot be retrieved after performing a match or referenced
later in the pattern.
"""
But I could not understand this: r'(test\d(?:\*test\d)+)'. Which
parentheses is it referring to? Sorry, could you explain the solution?
Thanks

Jan 20 '07 #3

Gabriel Genellina

I suppose this is just an example and you mean "any word" instead of test1,
test2, etc.
So your pattern would be: word*word*word*word, that is, word* repeated one or
more times, followed by another word.
To match a word we'll use "\w+"; to match a * we have to use "\*" (it's a
special character).
So the regexp would be: "(\w+\*)+\w+"
Since we are not interested in the () as a group by itself (it was just to
describe the repeating pattern), we change it into a non-grouping
parenthesis.
Final version: "(?:\w+\*)+\w+"

import re

# One or more "word*" units followed by a final word. The (?:...) group is
# non-capturing, so findall returns each whole match.
rexp = re.compile(r"(?:\w+\*)+\w+")
lines = [
    'test1?test2t-test3*test4*test5$test6#test7*test8',
    'test1?test2t-test3*test4$test6#test7_test8',
    'test1?nada-que-ver$esto.no.matchea',
    'test1?test2t-test3*test4*',
    'test1?test2t-test3*test4',
    'test1?test2t-test3*',
]

for line in lines:
    print(line)
    for txt in rexp.findall(line):
        print('->', txt)

Test it with some corner cases and see if it does what you expect: no "*",
starting with "*", ending with "*", embedded whitespace before and after the
"*", whitespace inside a word, the very definition of "word"...

--
Gabriel Genellina
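
The corner cases listed in the reply above can be checked with a short script. This is only a sketch: the pattern is the final version from the post, but the sample strings and the names `cases` and `results` are made up here for illustration.

```python
import re

# Final pattern from the post: one or more "word*" units, then a closing word.
rexp = re.compile(r"(?:\w+\*)+\w+")

# Hypothetical sample inputs, one per corner case.
cases = {
    "no star": "alpha beta",
    "starts with star": "*alpha*beta",
    "ends with star": "alpha*beta*",
    "space around star": "alpha * beta",
    "space inside a word": "al pha*beta",
    "underscore and digits": "a_1*b2",
}

results = {name: rexp.findall(text) for name, text in cases.items()}
for name, found in results.items():
    print(name, "->", found)
```

Note that a leading or trailing "*" is simply left out of the match, whitespace next to the "*" prevents a match altogether, and "\w" treats underscores and digits as part of a word. Whether that is the desired behaviour depends on what counts as a "word" in the real data.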

Jan 20 '07 #4

James Stroud


The outer parentheses are the grouping operator. Whatever they match is saved
and accessible from a match object via the group() or groups() methods. The "\d"
part matches a single digit 0-9. The (?:...) construct is used to make
a non-grouping operator that is not itself remembered for access through
the group() or groups() methods. The expression can also reference
earlier groups, but not groups specified with the non-grouping operator.
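
To make the difference concrete, here is a small sketch comparing the capturing outer parentheses with the non-grouping (?:...) inside them. The variable names and the two extra patterns (`both`, `repeated`) are chosen here for illustration; only standard `re` calls are used.

```python
import re

s = 'test1?test2t-test3*test4*test5$test6#test7*test8'

# Outer (...) is a capturing group; inner (?:...) only groups for repetition.
outer_only = re.compile(r'(test\d(?:\*test\d)+)')
m = outer_only.search(s)
first_groups = m.groups()        # a single captured group: the whole run

# Make the inner parentheses capturing too and a second group appears,
# holding only the last repetition of the inner part.
both = re.compile(r'(test\d(\*test\d)+)')
m2 = both.search(s)
both_groups = m2.groups()

# findall returns tuples when a pattern has more than one group.
both_findall = both.findall(s)

# A backreference such as \1 can refer to a capturing group only.
repeated = re.compile(r'(test\d)\*\1')
backref_hit = repeated.search('test3*test3')
backref_miss = repeated.search('test3*test4')

print(first_groups)
print(both_groups)
print(both_findall)
print(backref_hit.group(0), backref_miss)
```

With a single capturing group, findall returns plain strings; once the inner group captures as well, each result becomes a tuple, which is usually not what is wanted here.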

You may want to note that this regular expression is very specific to your
given example: it only matches the literal text "test" followed by a single
digit.
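
As a rough illustration of how specific that pattern is, the sketch below runs it next to the more general \w+ pattern from Gabriel Genellina's reply. The sample string is an assumption made up for this comparison, not text from the thread.

```python
import re

specific = re.compile(r'(test\d(?:\*test\d)+)')   # literal "test" + one digit
general = re.compile(r'(?:\w+\*)+\w+')            # any words joined by "*"

sample = 'foo*bar-test1*test2-test10*test11'       # hypothetical input

specific_hits = specific.findall(sample)
general_hits = general.findall(sample)
print(specific_hits)
print(general_hits)
```

The specific pattern skips both "foo*bar" and the two-digit "test10*test11", while the general one picks up all three runs.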

James

Jan 21 '07 #5
