Bytes | Software Development & Data Engineering Community

prefix matching

Hello, there!

Given a list with strings.

What is the most pythonic way to check if a given string starts with one of
the strings in that list?

I started by composing a regular expression pattern which consists of all
the strings in the list separated by "|" in a for loop. Then I used that
pattern to do a regexp match.

Seems rather complicated to me. Any alternatives?
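For reference, the regular-expression approach described above can be sketched roughly as follows. This is a minimal sketch, not the poster's actual code: the helper name compile_prefix_pattern is hypothetical, "|".join() replaces the for loop, and re.escape is added on the assumption that the prefixes are meant as literal text.

```python
import re

def compile_prefix_pattern(prefixes):
    """Build one pattern that matches any of the given literal prefixes."""
    # re.escape keeps characters such as "." or "|" inside a prefix literal.
    return re.compile("|".join(re.escape(p) for p in prefixes))

pattern = compile_prefix_pattern(["so", "what", "a.b"])

# pattern.match() only matches at the start of the string,
# so a successful match means the string begins with one of the prefixes.
print(bool(pattern.match("so what")))  # True
print(bool(pattern.match("else")))     # False
print(bool(pattern.match("axb")))      # False, "." was escaped
```

Note that an empty prefix list produces an empty pattern, which matches every string, so that case probably needs separate handling.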

Christian
Jul 18 '05 #1
Here's another simple approach:
    def startswith_one(s, prefixes):
        for p in prefixes:
            if s.startswith(p): return True
        return False
If speed is important and "prefixes" doesn't change frequently, then
coding an FSM in C is the way to go. The last time this question went
around, I learned that REs like
    a|b|c
essentially try the alternatives one after another, rather than
compiling into an FSM like I learned in school, so the amount of time
taken is proportional to the total length of the RE, not to the length
of the longest alternative.
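To illustrate the FSM idea without going to C, here is a rough Python sketch that walks a trie built from the prefixes. It is an assumption-laden illustration, not the C implementation suggested above: the names build_trie and starts_with_any and the nested-dict layout are made up for this example. The point is only that the work per test is bounded by the length of the longest prefix rather than by the total length of all prefixes.

```python
def build_trie(prefixes):
    """Build a nested-dict trie; the key None marks the end of a prefix."""
    root = {}
    for prefix in prefixes:
        node = root
        for ch in prefix:
            node = node.setdefault(ch, {})
        node[None] = True
    return root

def starts_with_any(s, trie):
    """Walk the trie along s; succeed as soon as a complete prefix is seen."""
    node = trie
    for ch in s:
        if None in node:
            return True
        node = node.get(ch)
        if node is None:
            return False
    # s ended; it matches only if it ended exactly on a complete prefix.
    return None in node

trie = build_trie(["so", "what", "else"])
print(starts_with_any("so what", trie))  # True
print(starts_with_any("s", trie))        # False
print(starts_with_any("other", trie))    # False
```

In pure Python this is unlikely to beat str.startswith for short prefix lists; it is shown only to make the state-machine idea concrete.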

Jeff

Jul 18 '05 #2
Must be something like
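    l = ["pref1", "pref2"]
    s = "String"
    # list() matters on Python 3, where filter() returns an iterator that is always true
    if list(filter(lambda x: s[:len(x)] == x, l)):
        pass  # do something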

Cheers,
Stefan

On Wednesday 26 May 2004 12:08 pm, Christian Gudrian wrote:

Hello, there!

Given a list with strings.

What is the most pythonic way to check if a given string starts with one of
the strings in that list?

I started by composing a regular expression pattern which consists of all
the strings in the list separated by "|" in a for loop. Then I used that
pattern to do a regexp match.

Seems rather complicated to me. Any alternatives?

Christian


--
Stefan Meier
Computational Biology & Informatics
Exelixis, Inc.
170 Harbor Way, P.O. Box 511
South San Francisco, CA 94083-511
fon. (650)837 7816

Jul 18 '05 #3
Christian Gudrian wrote:
What is the most pythonic way to check if a given string starts with one
of the strings in that list?

I started by composing a regular expression pattern which consists of all
the strings in the list separated by "|" in a for loop. Then I used that
pattern to do a regexp match.

Seems rather complicated to me. Any alternatives?


Taken from the itertools examples at
http://docs.python.org/lib/itertools-example.html
    >>> import itertools
    >>> True in itertools.imap("so what".startswith, ["so", "what", "else"])
    True
    >>> True in itertools.imap("so what".startswith, ["what", "else"])
    False


It's wrapped in a function - any() - there.
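On later Python versions the same idea needs no itertools: any() has been a built-in since Python 2.5, and itertools.imap no longer exists in Python 3. A minimal sketch, reusing the function name startswith_one from earlier in the thread:

```python
def startswith_one(s, prefixes):
    # any() stops at the first True, like the explicit loop earlier in the thread.
    return any(s.startswith(p) for p in prefixes)

print(startswith_one("so what", ["so", "what", "else"]))  # True
print(startswith_one("so what", ["what", "else"]))        # False
```

With an empty list of prefixes this returns False, since any() of an empty iterable is False.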

Peter

Jul 18 '05 #4

"Jeff Epler" <je****@unpythonic.net> wrote:
    def startswith_one(s, prefixes):
        for p in prefixes:
            if s.startswith(p): return True
        return False
Thanks! startswith sounds good.
If speed is important and "prefixes" doesn't change frequently, then
coding a FSM in C is the way to go.


The prefixes don't change very often but can be configured by the user. So a
FSM is not the preferred approach here I think. And speed does not really
matter.
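For user-configured prefixes where speed does not matter, it may be worth knowing that since Python 2.5 str.startswith also accepts a tuple of prefixes directly. The sketch below assumes the configuration arrives as a list of strings; load_prefixes is a hypothetical helper, not part of any library.

```python
def load_prefixes(configured):
    """Turn a user-supplied iterable of strings into a tuple for startswith."""
    # Hypothetical clean-up step: strip whitespace and drop empty entries,
    # because an empty prefix would match every string.
    return tuple(p.strip() for p in configured if p.strip())

prefixes = load_prefixes(["so", " what ", ""])

# startswith requires a tuple here; passing a list raises TypeError.
print("so what".startswith(prefixes))  # True
print("else".startswith(prefixes))     # False
```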

Christian
Jul 18 '05 #5


Christian Gudrian wrote:
Given a list with strings.

What is the most pythonic way to check if a given string starts with one of
the strings in that list?

I started by composing a regular expression pattern which consists of all
the strings in the list separated by "|" in a for loop. Then I used that
pattern to do a regexp match.

Seems rather complicated to me. Any alternatives?


I'd use something like this:

    import operator
    from functools import reduce  # reduce is a built-in on Python 2
    reduce(operator.or_,
           [string_to_test.startswith(x) for x in list_with_strings], False)

Greetings,

Holger

Jul 18 '05 #6
Christian Gudrian wrote:
I started by composing a regular expression pattern which consists of all
the strings in the list separated by "|" in a for loop. Then I used that
pattern to do a regexp match.

Seems rather complicated to me. Any alternatives?


does it work? is it fast enough?

(if the answer is yes and yes, what's wrong with you ;-)

</F>


Jul 18 '05 #7
Jeff Epler wrote:
The last time this question went around, I learned that REs like
    a|b|c
essentially try the alternatives one after another, rather than
compiling into an FSM


that depends somewhat on what a, b, and c happen to be.

for example,

if all alternatives consist of a single literal character, the entire
subexpression is replaced with a character set ("a|b|c" becomes
"[abc]")

for alternatives that start with literal text or a character set, the
engine never checks alternatives that cannot possibly match (if
you feed "aha" to "a...|b...|c...", the second and third alternatives
are never checked).

if all alternatives share a common prefix, that prefix will be checked
before any alternative is tried; if the prefix matches, only the suffixes
will be checked for each alternative. ("aa|ab|ac" becomes "a(?:a|b|c)"
becomes "a[abc]")

when searching, the engine uses a KMP-style overlap table to skip
over places where the prefix cannot possibly match. (which explains
why re.search can sometimes run faster than string.find)
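The rewrites described above happen inside the engine and are not directly visible, but the equivalence they rely on can be checked from the outside. The following sketch only confirms that "aa|ab|ac" and "a[abc]" accept the same strings over a small test alphabet; it says nothing about how any particular version of the re module compiles them.

```python
import itertools
import re

alternation = re.compile(r"aa|ab|ac")
char_class = re.compile(r"a[abc]")

# Every string of length 0..3 over a small alphabet.
alphabet = "abcd"
candidates = [
    "".join(chars)
    for n in range(4)
    for chars in itertools.product(alphabet, repeat=n)
]

# Collect strings on which the two patterns disagree at the start of the string.
mismatches = [
    s for s in candidates
    if bool(alternation.match(s)) != bool(char_class.match(s))
]
print(mismatches)  # []
```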

</F>

Jul 18 '05 #8

"Fredrik Lundh" <fr*****@pythonware.com> schrieb:
does it work? is it fast enough?

(if the answer is yes and yes, what's wrong with you ;-)


Well, um, the answer is indeed ("yes", "yes"). But if I intended to write code
that just works and runs fast enough I wouldn't be here. :)

Just like many of us I'm quite new to Python. Learning a high level
programming language is a matter of some hours nowadays. What really takes
time is digging into the libraries. And that's what my question aimed at:
learning some new functions and methods.

Christian
Jul 18 '05 #9
