473,787 Members | 2,857 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Enumerating Regular Expressions

Hi all,

Does anybody know of a module that allows you to enumerate all the
strings a particular regular expression describes?

Cheers,
-Blair

May 10 '06 #1
13 2213
bl************* @gmail.com wrote:
Hi all,

Does anybody know of a module that allows you to enumerate all the
strings a particular regular expression describes?

Cheers,
-Blair


You mean like re.compile(r'.* ') ?

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
May 10 '06 #2
James Stroud wrote:
You mean like re.compile(r'.* ') ?


No. I mean like:
regex = re.compile(r'a| b')
regex.enumerate () a
b


May 10 '06 #3
bl************* @gmail.com wrote:
James Stroud wrote:
You mean like re.compile(r'.* ') ?

No. I mean like:
regex = re.compile(r'a| b')
regex.enume rate()


a
b


You see the difficulty don't you? How will the computer know in advance
that the regex matches only a finite set of possible strings?

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
May 10 '06 #4
James Stroud wrote:
You see the difficulty don't you? How will the computer know in advance
that the regex matches only a finite set of possible strings?


Well sure it might be a little difficult to figure _that_ out, although
probably not all that hard if you converted to an FSA or something. I
imagine detecting looping would be as simple as applying the right
graph algorithm.

But that's not the point, you don't strictly need to know in advance
whether the regex represents a finite or infinite language. I just
want to enumerate it, if it's going to take longer than the life of the
universe and a stack size to match to do it, then so be it. It's
surely up to the user to be careful about how they form their
expressions. One way to get around it would be to disallow the *
operator in enumerations.

Cheers,
-Blair

May 10 '06 #5
In article <e3**********@d aisy.noc.ucla.e du>,
James Stroud <js*****@ucla.e du> wrote:
bl************* @gmail.com wrote:
James Stroud wrote:
You mean like re.compile(r'.* ') ?

No. I mean like:
>regex = re.compile(r'a| b')
>regex.enume rate()


a
b


You see the difficulty don't you? How will the computer know in advance
that the regex matches only a finite set of possible strings?


You don't. Hence, you want something that behaves like a generator, and
will produce the strings one at a time. Preferably, for the purposes of
useful computation, in some canonical order.

I'm sorry to say I don't know of an existing Python module to do this,
although you could write one for at least the basic regular expression
operators if you wanted. The basic problem isn't all that hard to
solve, though the full generality of the re module's input syntax is far
more expressive than truly "regular" expressions from language theory.

Cheers,
-M

--
Michael J. Fromberger | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/ | Dartmouth College, Hanover, NH, USA
May 10 '06 #6
Michael J. Fromberger wrote:
You see the difficulty don't you? How will the computer know in advance
that the regex matches only a finite set of possible strings?
You don't. Hence, you want something that behaves like a generator, and
will produce the strings one at a time. Preferably, for the purposes of
useful computation, in some canonical order.


Precisely, probably in order of length and value.
I'm sorry to say I don't know of an existing Python module to do this,
although you could write one for at least the basic regular expression
operators if you wanted. The basic problem isn't all that hard to
solve, though the full generality of the re module's input syntax is far
more expressive than truly "regular" expressions from language theory.


I thought that was the case, I've found a paper on the topic at least.
Maybe once I've finished some other work I'll give it a shot. It seems
like a fairly useful thing to be able to do with a regular expression
so I just guessed that somebody must have done it already. Perhaps I
can find it for another language. I'm not so sure I fancy the idea of
digging around in the re module in order to add the functionality
there...

-Blair

May 10 '06 #7
Any regular expression that has an asterisk in it has an infinite number of
possible matches.

If it has two asterisks, that's an infinite number squared and that's a
really big number.

You wouldn't want to print them out.
bl************* @gmail.com wrote:
Hi all,

Does anybody know of a module that allows you to enumerate all the
strings a particular regular expression describes?

Cheers,
-Blair


--
Dale Strickland-Clark
Riverhall Systems - www.riverhall.co.uk

May 10 '06 #8
Dale Strickland-Clark wrote:
Any regular expression that has an asterisk in it has an infinite number of
possible matches.

If it has two asterisks, that's an infinite number squared and that's a
really big number.

You wouldn't want to print them out.


We've been over this already. Why are people getting stuck on infinite
regular languages? I've made it quite clear that I'm only really
interested in doing this for finite languages, but that shouldn't
matter anyway. Maybe I should phrase it differently; I want a tool
that can enumerate a regex, with support for generating each string
described by the regex in some predefined order. So something like:
# pattern is the regex representing language L
reg_lang_gen = re.compile('pat tern').enumerat or()
while len(s) < n:
s = reg_lang_gen.ne xt() # get next string in L
...

It's easy to imagine a problem where you'd want to enumerate a regex
that represents an infinite regular language. E.g. a particularly dumb
bruteforce password cracker, you might want to keep generating strings
from the language of choice until you find one that matches your
encrpyted version, naturally you might want to stop if you hadn't found
it within 10 characters or so.

Cheers,
-B

May 10 '06 #9
> I thought that was the case, I've found a paper on the topic at least.
Maybe once I've finished some other work I'll give it a shot. It seems
like a fairly useful thing to be able to do with a regular expression
so I just guessed that somebody must have done it already.


Just wandering: whatfor do you perceive it useful? I've been done quite a
few things with rexes - yet never it occured to me that I'd be in need of
enumeration all the gazillion of possible matches. YMMV - so what for?

Regards,

Diez
May 10 '06 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
4184
by: Kenneth McDonald | last post by:
I'm working on the 0.8 release of my 'rex' module, and would appreciate feedback, suggestions, and criticism as I work towards finalizing the API and feature sets. rex is a module intended to make regular expressions easier to create and use (and in my experience as a regular expression user, it makes them MUCH easier to create and use.) I'm still working on formal documentation, and in any case, such documentation isn't necessarily the...
2
5100
by: Sehboo | last post by:
Hi, I have several regular expressions that I need to run against documents. Is it possible to combine several expressions in one expression in Regex object. So that it is faster, or will I have to use all the expressions seperately? Here are my regular expressions that check for valid email address and link Dim Expression As String =
4
5187
by: Együd Csaba | last post by:
Hi All, I'd like to "compress" the following two filter expressions into one - assuming that it makes sense regarding query execution performance. .... where (adate LIKE "2004.01.10 __:30" or adate LIKE "2004.01.10 __:15") .... into something like this: .... where adate LIKE "2004.01.10 __:(30/15)" ...
7
3831
by: Billa | last post by:
Hi, I am replaceing a big string using different regular expressions (see some example at the end of the message). The problem is whenever I apply a "replace" it makes a new copy of string and I want to avoid that. My question here is if there is a way to pass either a memory stream or array of "find", "replace" expressions or any other way to avoid multiple copies of a string. Any help will be highly appreciated
3
3027
by: a | last post by:
I'm a newbie needing to use some Regular Expressions in PHP. Can I safely use the results of my tests using 'The Regex Coach' (http://www.weitz.de/regex-coach/index.html) Are the Regular Expressions used in Perl identical to the Regular Expressions in PHP?
25
5169
by: Mike | last post by:
I have a regular expression (^(.+)(?=\s*).*\1 ) that results in matches. I would like to get what the actual regular expression is. In other words, when I apply ^(.+)(?=\s*).*\1 to " HEART (CONDUCTION DEFECT) 37.33/2 HEART (CONDUCTION DEFECT) WITH CATHETER 37.34/2 " the expression is "HEART (CONDUCTION DEFECT)". How do I gain access to the expression (not the matches) at runtime? Thanks, Mike
1
4387
by: Allan Ebdrup | last post by:
I have a dynamic list of regular expressions, the expressions don't change very often but they can change. And I have a single string that I want to match the regular expressions against and find the first regular expression that matches the string. I've gor the regular expressions ordered so that the highest priority is first (if two or more regular expressions match the string I want the first one returned) The code that does this has...
13
7493
by: Wiseman | last post by:
I'm kind of disappointed with the re regular expressions module. In particular, the lack of support for recursion ( (?R) or (?n) ) is a major drawback to me. There are so many great things that can be accomplished with regular expressions this way, such as validating a mathematical expression or parsing a language with nested parens, quoting or expressions. Another feature I'm missing is once-only subpatterns and possessive quantifiers...
12
2493
by: FAQEditor | last post by:
Anybody have any URL's to tutorials and/or references for Regular Expressions? The four I have so far are: http://docs.sun.com/source/816-6408-10/regexp.htm http://en.wikipedia.org/wiki/Regular_expression http://www.regular-expressions.info/javascript.html http://www.webreference.com/js/column5/
0
9655
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9497
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
10363
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10169
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10110
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
8993
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7517
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
2
3670
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2894
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.