String functions: what's the difference?

(absolute beginner here, sorry if this seems basic)

Section 7.10 of 'How to Think Like a Computer Scientist' contains this
discussion of string.find and other string functions:

(quote)
We can use these constants and find to classify characters. For example, if
find(lowercase, ch) returns a value other than -1, then ch must be lowercase:

def isLower(ch):
    return string.find(string.lowercase, ch) != -1

Alternatively, we can take advantage of the in operator, which determines
whether a character appears in a string:

def isLower(ch):
    return ch in string.lowercase

As yet another alternative, we can use the comparison operator:

def isLower(ch):
    return 'a' <= ch <= 'z'

If ch is between a and z, it must be a lowercase letter.

As an exercise, discuss which version of isLower you think will be
fastest. Can you think of other reasons besides speed to prefer one
or the other?

(end quote)

I've tried all three, but the function is so small (test a single letter) I
can't measure the difference. I'm using time.time() to see how long it takes to
execute the function.
I could use a loop to increase execution time, but then I might be measuring
mostly overhead.

I'd expect the third option to be the fastest (involves looking up 3 values,
where the others have to iterate through a-z), but am I right?
And reasons to prefer one? a-z doesn't contain all lowercase letters (it omits
accents and symbols), but other than that?
--
Harro de Jong
remove the extra Xs from xmsnet to mail me
Mar 9 '06 #1
Harro de Jong wrote:
> I've tried all three, but the function is so small (test a single letter) I
> can't measure the difference. I'm using time.time() to see how long it takes to
> execute the function.
> I could use a loop to increase execution time, but then I might be measuring
> mostly overhead.

Still, this is what you should do. Try the timeit.py module; it does the
loop for you. Surprisingly, one of the faster ways to do a loop is

nones = [None]*10000000

<start timer>
for x in nones:
    <action>
<stop timer>

This is fast because no Python integers are created to implement the
loop.
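
For readers on a current Python 3, the loop described above could be made runnable roughly as follows. This is only a sketch: time_loop and is_lower are hypothetical names, time.perf_counter() stands in for the <start timer> and <stop timer> placeholders, and the lambda adds some call overhead of its own.

```python
import time


def time_loop(action, repetitions=100_000):
    """Time `action` by iterating over a prebuilt list of None values."""
    nones = [None] * repetitions  # built before the timer starts
    start = time.perf_counter()
    for _ in nones:
        action()
    return time.perf_counter() - start


def is_lower(ch):
    # Hypothetical stand-in for the function being measured.
    return 'a' <= ch <= 'z'


elapsed = time_loop(lambda: is_lower('x'))
print("%.6f seconds for 100000 calls" % elapsed)
```

The timeit module relies on the same idea internally: it iterates over itertools.repeat(None, number) instead of counting with integers.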
> I'd expect the third option to be the fastest (involves looking up 3 values,
> where the others have to iterate through a-z), but am I right?

Just measure it for yourself. I just did, and the third option indeed
came out fastest, with the "in" operator only slightly slower.
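
A measurement along these lines might look like the sketch below. It assumes Python 3, where string.lowercase and the string.find function no longer exist, so string.ascii_lowercase and the str.find method are used instead (note that ascii_lowercase is not locale-dependent, unlike the old string.lowercase). The function names and the best_time helper are made up for illustration, and the numbers will vary by machine and interpreter version.

```python
import string
import timeit


def is_lower_find(ch):
    # Python 3 spelling of string.find(string.lowercase, ch) != -1
    return string.ascii_lowercase.find(ch) != -1


def is_lower_in(ch):
    return ch in string.ascii_lowercase


def is_lower_cmp(ch):
    return 'a' <= ch <= 'z'


variants = [is_lower_find, is_lower_in, is_lower_cmp]


def best_time(func, ch, number=50_000, repeat=3):
    """Return the fastest of `repeat` runs of `number` calls each."""
    timer = timeit.Timer(lambda: func(ch))
    return min(timer.repeat(repeat=repeat, number=number))


results = {func.__name__: best_time(func, 'x') for func in variants}
for name, seconds in results.items():
    print("%-14s %.4f s" % (name, seconds))
```

Taking the minimum of several repeats is the usual advice, since slower runs mostly reflect other activity on the machine.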
> And reasons to prefer one?


For what purpose? To find out whether a letter is lower-case?

Just use the .islower() method on the character for that.
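
To illustrate how .islower() differs from the 'a' <= ch <= 'z' comparison on accented letters, here is a small sketch. It assumes Python 3, where strings are Unicode; under Python 2 with byte strings the result for accented characters depends on the locale. The compare helper is a hypothetical name.

```python
def compare(ch):
    """Return (range test, str.islower()) for a single character."""
    return ('a' <= ch <= 'z', ch.islower())


# "\u00e9" is the single code point for e with an acute accent.
for ch in ['x', 'A', '\u00e9', '7']:
    in_range, is_lower = compare(ch)
    print("%r: a-z range=%s, islower()=%s" % (ch, in_range, is_lower))
```

The two tests agree on 'x', 'A' and '7', but only .islower() reports the accented letter as lowercase.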

Regards,
Martin
Mar 9 '06 #2
gry
First, don't apologize for asking questions. You read, you thought,
and you tested. That's more than many people on this list do. Bravo!

One suggestion: when asking questions here it's a good idea to always
briefly mention which version of python and what platform (linux,
windows, etc) you're using. It helps us answer your questions more
effectively.

For testing performance the "timeit" module is great. Try something
like:
python -mtimeit -s 'import string;from myfile import isLower' "isLower('x')"

You didn't mention the test data, i.e. the character you're feeding to
isLower.
It might make a difference if the character is near the beginning or
end of the range.
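
One way to check whether the input matters is to time the same function with characters from different positions, plus one that is not in the string at all. The sketch below assumes Python 3 and uses a hypothetical is_lower_in function; whether the differences are measurable depends on the interpreter, so no results are claimed here.

```python
import string
import timeit


def is_lower_in(ch):
    # Membership test against the 26 ASCII lowercase letters.
    return ch in string.ascii_lowercase


inputs = ['a', 'm', 'z', 'A']  # start, middle, end, and not present

timings = {}
for ch in inputs:
    # ch=ch binds the current character so each lambda tests its own input.
    runs = timeit.repeat(lambda ch=ch: is_lower_in(ch), number=50_000, repeat=3)
    timings[ch] = min(runs)

for ch, seconds in timings.items():
    print("%r: %.4f s" % (ch, seconds))
```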

As to reasons to prefer one or another implementation, one *very*
important question is "which one is clearer?". It may sound like a
minor thing, but when I'm accosted first thing in the
morning (pre-coffee) about a nasty urgent bug and sit down to pore over
code and face "string.find(string.lowercase, ch) != -1", I'm not happy.

Have fun with python!
-- George Young

Mar 9 '06 #3
<gr*@ll.mit.edu> wrote:

> One suggestion: when asking questions here it's a good idea to always
> briefly mention which version of python and what platform (linux,
> windows, etc) you're using.

Of course, forgot about that. It's Python 2.4.2 for Windows.

> For testing performance the "timeit" module is great. Try something
> like:
> python -mtimeit -s 'import string;from myfile import isLower' "isLower('x')"

Thanks for the pointer. I was using time.time(), which I now see isn't
very accurate on Windows.

> You didn't mention the test data, i.e. the character you're feeding to
> isLower.
> It might make a difference if the character is near the beginning or
> end of the range.

(slaps forehead) I used "A" as the input, of course that would make a
difference.

....
> Have fun with python!


If the few days I've spent on it so far are any indication, I will. This
is my first foray into programming since college; I like 'How to Think
Like a Computer Scientist' much better than my impenetrable C textbook.

--
Harro de Jong
remove the extra Xs from xmsnet to mail me
Mar 9 '06 #4
Harro de Jong wrote:
> Thanks for the pointer. I was using time.time(), which I now see isn't
> very accurate on Windows.


time.clock() is more accurate on Windows (and much less so on
Linux, where it also measures something completely different.)
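
A note for readers on a current Python: time.clock() was deprecated in Python 3.3 and removed in 3.8. The sketch below assumes Python 3.3 or later and shows time.perf_counter(), the high-resolution timer that timeit uses by default on all platforms.

```python
import time
import timeit

# Describe the clock behind time.perf_counter() on this platform.
info = time.get_clock_info('perf_counter')
print("perf_counter resolution:", info.resolution)

# timeit picks this timer by default, so there is no need to choose one per platform.
uses_perf_counter = timeit.default_timer is time.perf_counter
print("timeit uses perf_counter:", uses_perf_counter)
```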
Mar 9 '06 #5
