Still new. I am trying to make a simple word count script.
I found this in the great Python Cookbook, which allows me to process
every word in a file. But how do I use it to count the items generated?

    def words_of_file(thefilepath, line_to_words=str.split):
        the_file = open(thefilepath)
        for line in the_file:
            for word in line_to_words(line):
                yield word
        the_file.close()

    for word in words_of_file(thefilepath):
        dosomethingwith(word)

The best I could come up with:

    def words_of_file(thefilepath, line_to_words=str.split):
        the_file = open(thefilepath)
        for line in the_file:
            for word in line_to_words(line):
                yield word
        the_file.close()

    len(list(words_of_file(thefilepath)))

But that seems clunky.

BartlebyScrivener <rp*******@gmail.com> wrote:
> The best I could come up with:
> (snip)
>     len(list(words_of_file(thefilepath)))
> But that seems clunky.

My preference would be (with the original definition of
words_of_file) to code

    numwords = sum(1 for w in words_of_file(thefilepath))

Alex
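
A minimal sketch of the generator-expression count suggested above, written for Python 3. The sample file and the helper name count_words are assumptions made for illustration only; the generator is the one from the question, with the file handled by a "with" block so it is closed even if iteration stops early.

```python
import os
import tempfile


def words_of_file(thefilepath, line_to_words=str.split):
    # Same generator as in the question; "with" closes the file for us.
    with open(thefilepath) as the_file:
        for line in the_file:
            for word in line_to_words(line):
                yield word


def count_words(thefilepath):
    # Hypothetical helper: consumes the generator one word at a time,
    # so no intermediate list of words is ever built.
    return sum(1 for _ in words_of_file(thefilepath))


# Throwaway sample file, created only for this demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("the quick brown fox\njumps over\n\nthe lazy dog\n")
    sample_path = tmp.name

numwords = count_words(sample_path)
os.remove(sample_path)
print(numwords)  # 9
```

Compared with len(list(...)), this keeps only one word in memory at a time, which matters mostly for large files.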

BartlebyScrivener wrote:
> (snip)
>     len(list(words_of_file(thefilepath)))
> But that seems clunky.

As clunky as it seems, I don't think you can beat it in terms of
brevity; if you care about memory efficiency though, here's what I use:

    def length(iterable):
        try: return len(iterable)
        except:
            i = 0
            for x in iterable: i += 1
            return i

You can even shadow the builtin len() if you prefer:

    import __builtin__

    def len(iterable):
        try: return __builtin__.len(iterable)
        except:
            i = 0
            for x in iterable: i += 1
            return i

HTH,
George
Thanks! And thanks for the Cookbook.
rd
"There is no abstract art. You must always start with something.
Afterward you can remove all traces of reality."--Pablo Picasso

"George Sakkis" <ge***********@gmail.com> writes:
> As clunky as it seems, I don't think you can beat it in terms of
> brevity; if you care about memory efficiency though, here's what I use:
> (snip)

Alex's example amounted to something like that, for the generator
case. Notice that the argument to sum() was a generator
comprehension. The sum function then iterated through it.

Paul Rubin <http://ph****@NOSPAM.invalid> wrote:
> Alex's example amounted to something like that, for the generator
> case. Notice that the argument to sum() was a generator
> comprehension. The sum function then iterated through it.

True. Changing the except clause here to

    except: return sum(1 for x in iterable)

keeps George's optimization (O(1), not O(N), for containers) and is a
bit faster (while still O(N)) for non-container iterables.
Alex
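
Putting the pieces together, a sketch of the combined helper might look like the following. It assumes Python 3 and narrows the bare except to TypeError, which is what len() raises for an object without __len__; the name length is taken from the post above.

```python
def length(iterable):
    # Assumption: Python 3, where len() on an object lacking __len__
    # raises TypeError.
    try:
        return len(iterable)             # O(1) for lists, tuples, dicts, sets
    except TypeError:
        return sum(1 for _ in iterable)  # O(N); consumes the iterator


print(length([10, 20, 30]))                  # 3
print(length(w for w in "a b c d".split()))  # 4
```

Note that the fallback branch exhausts whatever iterator it is given, so the caller cannot iterate over it again afterwards.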

In article <1h***************************@mac.com>,
Alex Martelli <al***@mac.com> wrote:

cl****@lairds.us (Cameron Laird) writes:
> For that matter, would it be an advantage for len() to operate on iterables?

    print len(itertools.count())

Ouch!!

>> True. Changing the except clause here to
>>     except: return sum(1 for x in iterable)
>> keeps George's optimization (O(1), not O(N), for containers) and is a
>> bit faster (while still O(N)) for non-container iterables.

Everything was going just great. Now I have to think again.
Thank you all.
rick

Paul Rubin wrote:
> cl****@lairds.us (Cameron Laird) writes:
>> For that matter, would it be an advantage for len() to operate on iterables?
>
>     print len(itertools.count())
>
> Ouch!!

How is this worse than list(itertools.count()) ?

Cameron Laird <cl****@lairds.us> wrote:
> In article <1h***************************@mac.com>,
> Alex Martelli <al***@mac.com> wrote:
> . . .
>> My preference would be (with the original definition of
>> words_of_file) to code
>>
>>     numwords = sum(1 for w in words_of_file(thefilepath))
> . . .
> There are times when
>
>     numwords = len(list(words_of_file(thefilepath)))
>
> will be advantageous.

Can you please give some examples? None comes readily to mind...

> For that matter, would it be an advantage for len() to operate on
> iterables? It could be faster and thriftier on memory than either of
> the above, and my first impression is that it's sufficiently natural
> not to offend those of us suspicious of language bloat.

I'd be a bit worried about having len(x) change x's state into an
unusable one. Yes, it happens in other cases (if y in x:), but adding
more such problematic cases doesn't seem advisable to me anyway -- I'd
evaluate this proposal as a -0, even taking into account the potential
optimizations to be garnered by having some iterables expose __len__
(e.g., a genexp such as (f(x) for x in foo), without an if-clause, might
be optimized to delegate __len__ to foo -- again, there may be semantic
alterations lurking that make this optimization a bit iffy).
Alex
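
The state-change concern can be seen in a small Python 3 sketch. The helper count_items is hypothetical; it stands in for what an iterating len() would have to do when handed a generator.

```python
def count_items(iterable):
    # Hypothetical helper: counts by iterating, as an iterating len() would.
    return sum(1 for _ in iterable)


words = (w for w in "to be or not to be".split())

first_count = count_items(words)   # walks the whole generator
second_count = count_items(words)  # nothing is left to walk
leftover = list(words)             # the generator is exhausted

print(first_count, second_count, leftover)  # 6 0 []
```

After the first count the generator is used up, so any code that still expected to read the words from it gets nothing.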

George Sakkis <ge***********@gmail.com> wrote:
> Paul Rubin wrote:
>> cl****@lairds.us (Cameron Laird) writes:
>>> For that matter, would it be an advantage for len() to operate on iterables?
>>
>>     print len(itertools.count())
>>
>> Ouch!!
>
> How is this worse than list(itertools.count()) ?

It's a slightly worse trap because list(x) ALWAYS iterates on x (just
like "for y in x:"), while len(x) MAY OR MAY NOT iterate on x (under
Cameron's proposal; it currently never does).
Yes, there are other subtle traps of this ilk already in Python, such as
"if y in x:" -- this, too, may or may not iterate. But the fact that a
potential problem exists in some corner cases need not be a good reason
to extend the problem to higher frequency;-).
Alex
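
The existing "if y in x:" trap mentioned above can be illustrated with a short Python 3 sketch. The variable names are illustrative only: on a list the membership test leaves the container untouched, while on a plain iterator it falls back to iteration and consumes items up to and including the match.

```python
numbers_list = [1, 2, 3, 4, 5]
numbers_iter = iter(numbers_list)

found_in_list = 3 in numbers_list  # uses list.__contains__; list is unchanged
found_in_iter = 3 in numbers_iter  # iterates, consuming 1, 2 and 3
remaining = list(numbers_iter)     # only what the membership test left behind

print(found_in_list, found_in_iter, remaining)  # True True [4, 5]
print(len(numbers_list))                        # 5
```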

George Sakkis wrote:
(snip)
> def length(iterable):
>     try: return len(iterable)
>     except:

    except TypeError:

>         i = 0
>         for x in iterable: i += 1
>         return i

(snip)