startswith( prefix[, start[, end]]) Query

Hi

The documentation for startswith(prefix[, start[, end]]) states:

Return True if string starts with the prefix, otherwise return False.
prefix can also be a tuple of prefixes to look for. However, when I try
to pass a tuple of prefixes I get the following error:

TypeError: expected a character buffer object

For example:

file = f.readlines()
for line in file:
    if line.startswith(("abc", "df")):
        CODE

It would generate the above error

To overcome this problem, I am currently just chaining individual
startswith calls,
i.e. if line.startswith("if") or line.startswith("df"),
but I know there must be a way to define all my prefixes in one tuple.

Thanks in advance

Sep 6 '07 #1
cj***@bath.ac.uk wrote:
Hi

startswith( prefix[, start[, end]]) States:

Return True if string starts with the prefix, otherwise return False.
prefix can also be a tuple of suffixes to look for.
That particular aspect of the functionality (the multiple
prefixes in a tuple) was only added in Python 2.5. If you're
using <= 2.4 you'll need to use "or" or some other approach,
eg looping over a sequence of prefixes.
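
A minimal sketch of the looping approach, written in current Python syntax. The helper name starts_with_any and the sample lines are made up for illustration; they are not part of the standard library or of the original question.

```python
def starts_with_any(text, prefixes):
    """Return True if text starts with at least one of the given prefixes."""
    for prefix in prefixes:
        # Calling startswith() with a single string works on every Python version.
        if text.startswith(prefix):
            return True
    return False


# Hypothetical sample data standing in for the lines of a file.
lines = ["abc first", "df second", "xyz third"]
matching = [line for line in lines if starts_with_any(line, ("abc", "df"))]
print(matching)
```

The loop stops at the first matching prefix, so it does no more work than a chain of "or" tests would.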

TJG
Sep 6 '07 #2
On Sep 6, 7:09 am, cj...@bath.ac.uk wrote:
Hi

startswith( prefix[, start[, end]]) States:

Return True if string starts with the prefix, otherwise return False.
prefix can also be a tuple of suffixes to look for. However when I try
and add a tuple of suffixes I get the following error:

Type Error: expected a character buffer object

For example:

file = f.readlines()
for line in file:
    if line.startswith(("abc", "df")):
        CODE

It would generate the above error
(snipped)

You seem to be using an older version of Python.
For me it works as advertised with 2.5.1,
but runs into the problem you described with 2.4.4:

Python 2.5.1c1 (r251c1:54692, Apr 17 2007, 21:12:16)
[GCC 4.0.0 (Apple Computer, Inc. build 5026)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> line = "foobar"
>>> if line.startswith(("foo", "bar")): print line
...
foobar
>>> if line.startswith(("foo", "bar")):
...     print line
...
foobar
VS.

Python 2.4.4 (#1, Oct 18 2006, 10:34:39)
[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> line = "foobar"
>>> if line.startswith(("foo", "bar")): print line
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: expected a character buffer object
--
Hope this helps,
Steven

Sep 6 '07 #3
cj***@bath.ac.uk wrote:
Hi

startswith( prefix[, start[, end]]) States:

Return True if string starts with the prefix, otherwise return False.
prefix can also be a tuple of suffixes to look for. However when I try
and add a tuple of suffixes I get the following error:

Type Error: expected a character buffer object

For example:

file = f.readlines()
for line in file:
slightly OT, but:
1/ you should not use 'file' as an identifier, it shadows the builtin
file type
2/ FWIW, it's also a pretty bad naming choice for a list of lines - why
not just name this list 'lines' ?-)
3/ anyway, unless you need to store this whole list in memory, you'd be
better using the iterator idiom (Python files are iterables):

f = open('some_file.ext')
for line in f:
    print line

if line.startswith(("abc","df"))
CODE

It would generate the above error
May I suggest that you read the appropriate version of the doc? That
is, the one corresponding to your installed Python version ?-)

Passing a tuple to str.startswith is new in 2.5. I bet you're trying it
on a 2.4 or older version.
To overcome this problem, I am currently just joining individual
startswith methods
i.e. if line.startswith("if") or line.startswith("df")
but know there must be a way to define all my suffixes in one tuple.
You may want to try with a regexp, but I'm not sure it's worth it (hint:
the timeit module is great for quick small benchmarks).
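
For reference, the regexp route could look roughly like the sketch below. The names prefixes, pattern and starts_with_regex are illustrative only, and whether this beats plain startswith calls is exactly what timeit would have to show.

```python
import re

prefixes = ("abc", "df")

# re.escape() keeps characters such as "." or "*" inside a prefix literal.
pattern = re.compile("|".join(re.escape(p) for p in prefixes))


def starts_with_regex(text):
    # match() only succeeds at the beginning of the string.
    return pattern.match(text) is not None


print(starts_with_regex("dfoo"))
print(starts_with_regex("xdf"))
```

Compiling the pattern once outside the loop matters here; rebuilding it for every line would likely cancel any gain.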

Else, you could as well write your own testing function:

def str_starts_with(astring, *prefixes):
    startswith = astring.startswith
    for prefix in prefixes:
        if startswith(prefix):
            return True
    return False

for line in f:
    if str_starts_with(line, 'abc', 'de', 'xxx'):
        pass  # CODE HERE

HTH
Sep 6 '07 #4
On 06/09/07, Bruno Desthuilliers
<br********************@wtf.websiteburo.oops.com> wrote:
You may want to try with a regexp, but I'm not sure it's worth it (hint:
the timeit module is great for quick small benchmarks).

Else, you could as well write your own testing function:

def str_starts_with(astring, *prefixes):
    startswith = astring.startswith
    for prefix in prefixes:
        if startswith(prefix):
            return True
    return False

for line in f:
    if str_starts_with(line, 'abc', 'de', 'xxx'):
        pass  # CODE HERE
Isn't slicing still faster than startswith? As you mention timeit,
then you should probably add slicing to the pot too :)

if astring[:len(prefix)] == prefix:
    do_stuff()

:)
Sep 6 '07 #5
Else, you could as well write your own testing function:

def str_starts_with(astring, *prefixes):
    startswith = astring.startswith
    for prefix in prefixes:
        if startswith(prefix):
            return True
    return False
What is the reason for
startswith = astring.startswith
startswith(prefix)

instead of
astring.startswith(prefix)

Sep 7 '07 #6
TheFlyingDutchman wrote:
Else, you could as well write your own testing function:

def str_starts_with(astring, *prefixes):
    startswith = astring.startswith
    for prefix in prefixes:
        if startswith(prefix):
            return True
    return False

What is the reason for
startswith = astring.startswith
startswith(prefix)

instead of
astring.startswith(prefix)
It's an optimization: the assignment creates a "bound method" (i.e. a
method associated with a specific string instance) and avoids having to
look up the startswith method of astring for each iteration of the inner
loop.
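
A small sketch of what the assignment does, in current Python syntax. The variable names are arbitrary, and the __self__ check is only there to show which string the method is tied to.

```python
text = "foobar"

# Looking up the attribute once yields a bound method object that
# remembers the string it was taken from.
startswith = text.startswith

# The bound method carries a reference to its string instance.
print(startswith.__self__ is text)

# Calling it gives the same answer as the usual attribute-lookup form.
print(startswith("foo") == text.startswith("foo"))
```

Inside a loop, calling startswith(prefix) skips the attribute lookup that text.startswith(prefix) repeats on every pass.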

Probably not really necessary, though, and they do say that premature
optimization is the root of all evil ...

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
--------------- Asciimercial ------------------
Get on the web: Blog, lens and tag the Internet
Many services currently offer free registration
----------- Thank You for Reading -------------

Sep 7 '07 #7
Steve Holden wrote:
TheFlyingDutchman wrote:
Else, you could as well write your own testing function:

def str_starts_with(astring, *prefixes):
    startswith = astring.startswith
    for prefix in prefixes:
        if startswith(prefix):
            return True
    return False

What is the reason for
startswith = astring.startswith
startswith(prefix)

instead of
astring.startswith(prefix)
It's an optimization: the assignment creates a "bound method" (i.e. a
method associated with a specific string instance) and avoids having to
look up the startswith method of astring for each iteration of the inner
loop.

Probably not really necessary, though, and they do say that premature
optimization is the root of all evil ...
I wouldn't call this one "premature" optimization, since it doesn't
change the algorithm, doesn't introduce (much) complication, and is
proven to really save on lookup time.

Now I do agree that unless you have quite a lot of prefixes to test, it
might not be that necessary in this particular case...
Sep 7 '07 #8
"Tim Williams" <li********@tdw.net> wrote:
Isn't slicing still faster than startswith? As you mention timeit,
then you should probably add slicing to the pot too :)
Possibly, but there are so many other factors that affect the timing
that writing it clearly should be your first choice.

Some timings:

@echo off
setlocal
cd \python25\lib
echo "startswith"
..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2'" s.startswith(t)
..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1'" s.startswith(t)
echo "prebound startswith"
..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2';startswith=s.startswith" startswith(t)
..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1';startswith=s.startswith" startswith(t)
echo "slice with len"
..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2'" s[:len(t)]==t
..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1'" s[:len(t)]==t
echo "slice with magic number"
..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra2'" s[:12]==t
..\python timeit.py -s "s='abracadabra1'*1000;t='abracadabra1'" s[:12]==t

and typical output from this is:

"startswith"
1000000 loops, best of 3: 0.542 usec per loop
1000000 loops, best of 3: 0.514 usec per loop
"prebound startswith"
1000000 loops, best of 3: 0.472 usec per loop
1000000 loops, best of 3: 0.474 usec per loop
"slice with len"
1000000 loops, best of 3: 0.501 usec per loop
1000000 loops, best of 3: 0.456 usec per loop
"slice with magic number"
1000000 loops, best of 3: 0.34 usec per loop
1000000 loops, best of 3: 0.315 usec per loop

So for these particular strings, the naive slice wins if the comparison is
true, but loses to the pre-bound method if the comparison fails. The slice is
taking a hit from calling len every time, so pre-calculating the length
(which should be possible in the same situations as pre-binding startswith)
might be worthwhile, but I would still favour using startswith unless I knew
the code was time critical.
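
The same comparison can also be driven from Python instead of a batch file. This is only a sketch: the labels, the variable n for the pre-calculated length and the loop count are my own choices, and the numbers it prints will differ from machine to machine.

```python
import timeit

# Same test strings as above; n is the pre-calculated prefix length.
setup = (
    "s = 'abracadabra1' * 1000; t = 'abracadabra1'; "
    "n = len(t); startswith = s.startswith"
)

statements = {
    "startswith": "s.startswith(t)",
    "prebound startswith": "startswith(t)",
    "slice with len": "s[:len(t)] == t",
    "slice with precomputed length": "s[:n] == t",
}

number = 100000
for label, stmt in statements.items():
    # repeat() returns one total time per run; the minimum is the least noisy.
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=number))
    print("%-30s %.3f usec per loop" % (label, best / number * 1e6))
```

Using n avoids both the len call and a hard-coded magic number, which is the pre-calculation mentioned above.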
Sep 7 '07 #9
Bruno Desthuilliers wrote:
Steve Holden wrote:
[...]
Probably not really necessary, though, and they do say that premature
optimization is the root of all evil ...

I wouldn't call this one "premature" optimization, since it doesn't
change the algorithm, doesn't introduce (much) complication, and is
proven to really save on lookup time.

Now I do agree that unless you have quite a lot of prefixes to test, it
might not be that necessary in this particular case...
The defense rests.

regards
Steve

Sep 7 '07 #10
Duncan Booth <du**********@invalid.invalid> wrote in
news:Xn*************************@127.0.0.1:

I went through your example to get timings for my machine, and I
ran into an issue I didn't expect.

My bat file did the following 10 times in a row:
(the command line wraps in this post)

call timeit -s "s='abracadabra1'*1000;t='abracadabra2';
startswith=s.startswith" startswith(t)
... giving me these times:

1000000 loops, best of 3: 0.483 usec per loop
1000000 loops, best of 3: 0.49 usec per loop
1000000 loops, best of 3: 0.489 usec per loop
1000000 loops, best of 3: 0.491 usec per loop
1000000 loops, best of 3: 0.488 usec per loop
1000000 loops, best of 3: 0.492 usec per loop
1000000 loops, best of 3: 0.49 usec per loop
1000000 loops, best of 3: 0.493 usec per loop
1000000 loops, best of 3: 0.486 usec per loop
1000000 loops, best of 3: 0.489 usec per loop

Then I thought that a shorter name for the lookup might affect the
timings, so I changed the bat file, which now did the following 10
times in a row:

timeit -s "s='abracadabra1'*1000;t='abracadabra2';
sw=s.startswith" sw(t)

... giving me these times:
1000000 loops, best of 3: 0.516 usec per loop
1000000 loops, best of 3: 0.512 usec per loop
1000000 loops, best of 3: 0.514 usec per loop
1000000 loops, best of 3: 0.517 usec per loop
1000000 loops, best of 3: 0.515 usec per loop
1000000 loops, best of 3: 0.518 usec per loop
1000000 loops, best of 3: 0.523 usec per loop
1000000 loops, best of 3: 0.513 usec per loop
1000000 loops, best of 3: 0.514 usec per loop
1000000 loops, best of 3: 0.515 usec per loop

In other words, the shorter name did seem to affect the timings,
but in a negative way. Why it would actually change at all is
beyond me, but it is consistently this way on my machine.

Can anyone explain this?

--
rzed
Sep 8 '07 #11
Steve Holden wrote:
Bruno Desthuilliers wrote:
Steve Holden wrote:

[...]
Probably not really necessary, though, and they do say that premature
optimization is the root of all evil ...


I wouldn't call this one "premature" optimization, since it doesn't
change the algorithm, doesn't introduce (much) complication, and is
proven to really save on lookup time.

Now I do agree that unless you have quite a lot of prefixes to test,
it might not be that necessary in this particular case...


The defense rests.
Sorry, I don't understand this one (please bear with a poor French boy).

Sep 11 '07 #12
