
syntax philosophy

P: n/a
I'm checking out Python as a candidate for replacing Perl as my "Swiss
Army knife" tool. The longer I can remember the syntax for performing
a task, the more likely I am to use it on the spot if the need arises.
If I have to go off and look it up, as I increasingly have to do with
Perl's ever hairier syntax, I'm more likely to just skip it, making me
even less likely to remember the syntax the next time.

So I hear that Python is easier to remember between uses than Perl. So
far, I like what I see. Iterators and generators, for example, are
great. Basic loops and other things are very convenient in Python.

But I'm surprised at what you apparently have to go through to do
something as common as counting the frequency of elements in a
collection. For example, counting word frequency in a file in Perl
means looping over all the words with the following line:

$histogram{$word}++;

The creation of the dictionary, the creation of new elements, the
initialization to zero, are all taken care of automatically. You just
tell it to start counting and the rest is taken care of for you. And
the incrementing is just a simple "++".

In Python, apparently you have to first remember to declare your
dictionary outside the loop:

histogram = {}

Then within the loop you use the following construct:

histogram[word] = histogram.get(word, 0) + 1

That's quite a bit hairier and it requires remembering to use braces
{}, then square brackets [], then parentheses (), and accessing the
dictionary via two different techniques in the same line.

This seems sort of unPythonesque to me, given the relative cleanliness
and obviousness (after seeing it once) of other common Python
constructs.

But I guess I'm making assumptions about what Python's philosophy
really is. I would expect that a language with something as nice as

[x**3 for x in my_list]

would want to use something like:

histogram[word]++

or even combine them into something like:

{histogram[word]++ for word in my_list}

Is this just something that hasn't been done yet but is on the way, or
is it a violation of Python's philosophy in some way?

Since I'm trying to choose a good Swiss-Army-knife programming
language, I'm wondering if this Python histogram technique is the sort
of thing that bothers Pythonistas and gets cleaned up in subsequent
versions because it violates Python's "philosophy" or whether, on the
contrary, it's just the way Pythonistas like to do things and is a
fair representation of what the community (or Guido) wants the
language to be.

Thanks.
Jul 18 '05 #1
22 Replies


P: n/a
Hi,
> But I'm surprised at what you apparently have to go through to do
> something as common as counting the frequency of elements in a
> collection. For example, counting word frequency in a file in Perl
> means looping over all the words with the following line:
>
> $histogram{$word}++;

Is this all that's needed - or is it actually something like this:

for $word in $words:
    $histogram{$word}++

(I've never used perl, so I blatantly mix python/perl syntax here - but I
think you'll understand anyway :))
> Since I'm trying to choose a good Swiss-Army-knife programming
> language, I'm wondering if this Python histogram technique is the sort
> of thing that bothers Pythonistas and gets cleaned up in subsequent
> versions because it violates Python's "philosophy" or whether, on the
> contrary, it's just the way Pythonistas like to do things and is a
> fair representation of what the community (or Guido) wants the
> language to be.


From what I understand of your example, the power of Perl here
comes from its untypedness - having an unbound variable histogram and
accessing it with the {}-operator seems to implicitly make it a
dictionary. The same seems true for accessing a nonexistent key - it
doesn't provoke an exception, instead it returns nothing (is there a
keyword/constant for that?) - and obviously the plus-operator implicitly
casts that to a 0, so it can perform the addition.

You won't see this in Python, as Python is a strongly typed
programming language. That's not to be confused with being a statically
typed language - which it is not.

So, to answer your question - I think that's the way we like it :) After all,
you can always put your code into a histogram function and call that -
which I don't find too much work ;-)

Regards,

Diez
Jul 18 '05 #2

P: n/a
No direct answer, but if you need histograms that often, you might consider
putting something along these lines into your toolbox:

import copy

class histogram(dict):
    def __init__(self, default):
        dict.__init__(self)
        self.default = default
    def __getitem__(self, key):
        # a missing key gets its own copy of the default value
        return self.setdefault(key, copy.copy(self.default))

# numeric default: subtotal the frequencies per key
h = histogram(0)
sample = [(1, 10), (2, 10), (0, 1), (1, 1)]
for key, freq in sample:
    h[key] += freq
print(h)

# list default: collect the items per key
h = histogram([])
for key, item in sample:
    h[key].append(item)
print(h)

While 0 and 0.0 are certainly the most frequent default values, they are not
the only options.

Peter
Jul 18 '05 #3

P: n/a
Tuang wrote:
> I'm checking out Python as a candidate for replacing Perl as my "Swiss
> Army knife" tool. The longer I can remember the syntax for performing
> a task, the more likely I am to use it on the spot if the need arises.
> If I have to go off and look it up, as I increasingly have to do with
> Perl's ever hairier syntax, I'm more likely to just skip it, making me
> even less likely to remember the syntax the next time.
>
> So I hear that Python is easier to remember between uses than Perl. So
> far, I like what I see. Iterators and generators, for example, are
> great. Basic loops and other things are very convenient in Python.
>
> But I'm surprised at what you apparently have to go through to do
> something as common as counting the frequency of elements in a
> collection. For example, counting word frequency in a file in Perl
> means looping over all the words with the following line:
>
> $histogram{$word}++;

Hi Tuang,

Here's my _opinion_: Perl is especially geared towards text processing, where
maybe counting word frequency is fairly common. Python is more of a general
purpose programming language, in which counting the word frequency is a pretty
rare operation (I can't remember needing to do that more than once or twice in
the past several years). As such, it probably doesn't make sense to support
that feature at the language level - it would burden a lot of people with
knowing syntax they'd rarely use.

[snip]
> But I guess I'm making assumptions about what Python's philosophy
> really is. I would expect that a language with something as nice as
>
> [x**3 for x in my_list]

Building a list out of another list, however, is far more common, hence (in my
view at least) the appropriateness of syntax-level support.

> Is this just something that hasn't been done yet but is on the way, or
> is it a violation of Python's philosophy in some way?

Python can automatically import custom modules and functions on startup (search
for information on the site module), so if I were you I'd write a
WordHistogram function in my custom site module just once and never look back.
The added benefit is that

histogram = WordHistogram(text)

is much more readable to me as well as others than

$histogram{$word}++;
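
A minimal sketch of what such a helper might look like, written for present-day Python 3 rather than the 2.3 of this thread. The name word_histogram and the whitespace-splitting rule are assumptions for illustration; the post does not say how WordHistogram would be implemented.

```python
def word_histogram(text):
    """Return a dict mapping each whitespace-separated word to its count."""
    histogram = {}
    for word in text.split():
        # .get supplies 0 the first time a word is seen
        histogram[word] = histogram.get(word, 0) + 1
    return histogram


counts = word_histogram("a a bb a c bb a")
print(counts)
```

Once a function like this lives in a personal module, the call site is a single readable line and the dict bookkeeping is written only once.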
> Since I'm trying to choose a good Swiss-Army-knife programming
> language, I'm wondering if this Python histogram technique is the sort
> of thing that bothers Pythonistas and gets cleaned up in subsequent
> versions because it violates Python's "philosophy" or whether, on the
> contrary, it's just the way Pythonistas like to do things and is a
> fair representation of what the community (or Guido) wants the
> language to be.


My impression is that features generally get added if (1) there is a good
enough case for their broad usefulness and (2) they don't overly compromise the
relatively clean syntax of the language. In this specific example, the
histogram-builder function fails both tests, so such functionality would best
live in some separate module. If over time enough people wanted it, it could
always be shipped as one of the "standard" Python modules.

-Dave
Jul 18 '05 #4

P: n/a
On 17 Nov 2003 13:29:16 -0800, tu******@hotmail.com (Tuang) wrote:
I'm checking out Python as a candidate for replacing Perl as my "Swiss
Army knife" tool. The longer I can remember the syntax for performing
a task, the more likely I am to use it on the spot if the need arises.
If I have to go off and look it up, as I increasingly have to do with
Perl's ever hairier syntax, I'm more likely to just skip it, making me
even less likely to remember the syntax the next time.

So I hear that Python is easier to remember between uses than Perl. So
far, I like what I see. Iterators and generators, for example, are
great. Basic loops and other things are very convenient in Python.

But I'm surprised at what you apparently have to go through to do
something as common as counting the frequency of elements in a
collection. For example, counting word frequency in a file in Perl
means looping over all the words with the following line:

$histogram{$word}++;

The creation of the dictionary, the creation of new elements, the
initialization to zero, are all taken care of automatically. You just
tell it to start counting and the rest is taken care of for you. And
the incrementing is just a simple "++".

In Python, apparently you have to first remember to declare your
dictionary outside the loop:

histogram = {}

Then within the loop you use the following construct:

histogram[word] = histogram.get(word, 0) + 1

That's quite a bit hairier and it requires remembering to use braces
{}, then square brackets [], then parentheses (), and accessing the
dictionary via two different techniques in the same line.

This seems sort of unPythonesque to me, given the relative cleanliness
and obviousness (after seeing it once) of other common Python
constructs.

But I guess I'm making assumptions about what Python's philosophy
really is. I would expect that a language with something as nice as

[x**3 for x in my_list]

would want to use something like:

histogram[word]++

or even combine them into something like:

{histogram[word]++ for word in my_list}

Is this just something that hasn't been done yet but is on the way, or
is it a violation of Python's philosophy in some way?

Since I'm trying to choose a good Swiss-Army-knife programming
language, I'm wondering if this Python histogram technique is the sort
of thing that bothers Pythonistas and gets cleaned up in subsequent
versions because it violates Python's "philosophy" or whether, on the
contrary, it's just the way Pythonistas like to do things and is a
fair representation of what the community (or Guido) wants the
language to be.

Thanks.


IMO Python is a very good Swiss Army Knife. If you think you have a pattern
that you will want to re-use, then it is pretty easy to make something to
hide the stuff you want as default, and leave out some unnecessaries. E.g.,
if you want histograms, it's easy to make a histogram class that will
take a word sequence and give you a histogram object that will do what you want,
and that you can add to as your requirements change. E.g.,
>>> class Histogram(dict):
...     def __iadd__(self, name):
...         self[name] = self.get(name, 0) + 1
...         return self
...     def __init__(self, wordseq=None):
...         if wordseq is not None:
...             for w in wordseq: self += w
...

now you can start to use this, e.g., pick some "words"
>>> words = 'a a bb a c bb a'.split()
>>> words
['a', 'a', 'bb', 'a', 'c', 'bb', 'a']

and make a histogram object
>>> h = Histogram(words)
>>> h
{'a': 4, 'c': 1, 'bb': 2}

That's the __repr__ of the underlying dict showing the data. We can use other
dict methods, e.g.,
>>> for name, value in h.items(): print '%6s: %s' %(name, value*'*')
...
     a: ****
     c: *
    bb: **

or we could have overridden .items() to return a list sorted by names or first by frequency value.
Or we could add some specialized methods to do anything you like.

pump some more data:
>>> for c in 'cccc': h+=c
...
>>> h
{'a': 4, 'c': 5, 'bb': 2}

and another type:
>>> for i in range(5): h+=i
...
>>> h
{'a': 4, 0: 1, 'c': 5, 3: 1, 4: 1, 'bb': 2, 1: 1, 2: 1}

that's not very orderly
>>> hitems = h.items()
>>> hitems.sort()
>>> hitems
[(0, 1), (1, 1), (2, 1), (3, 1), (4, 1), ('a', 4), ('bb', 2), ('c', 5)]

Of course we could override the items method of our class to return a sorted list,
by key or by value (which is occurrence frequency here)

and we could override __repr__ and/or __str__ to return other representations of
the histogram. Etc., etc.
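
One possible shape for those sorted views, sketched in present-day Python 3 rather than the 2.3 session above. The method names add, by_name and by_frequency are made up for illustration, and this variant uses a plain method instead of overloading +=.

```python
class Histogram(dict):
    """dict subclass that counts items and offers sorted views."""

    def __init__(self, items=()):
        super().__init__()
        for item in items:
            self.add(item)

    def add(self, item):
        # count one more occurrence of item
        self[item] = self.get(item, 0) + 1

    def by_name(self):
        # (key, count) pairs sorted by key
        return sorted(self.items())

    def by_frequency(self):
        # most frequent first; ties broken by key
        return sorted(self.items(), key=lambda kv: (-kv[1], kv[0]))


h = Histogram("a a bb a c bb a".split())
for ch in "cccc":
    h.add(ch)
print(h.by_name())
print(h.by_frequency())
```

Keeping the sorted views as separate methods leaves items() with its normal dict behaviour, so other code that expects a dict is not surprised.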

The point is, if we created a built-in way to handle every little problem someone
would like a concise way to handle, python would become a gigantic midden of one-offs.

So Python makes your one-offs easy instead ;-)
OTOH, if enough people like something, eventually it may get added.

Regards,
Bengt Richter
Jul 18 '05 #5

P: n/a
At 2003-11-17T21:29:16Z, tu******@hotmail.com (Tuang) writes:
In Python, apparently you have to first remember to declare your
dictionary outside the loop:

histogram = {}


Note that you have to declare:

my %histogram;

outside the loop in Perl if you want it to be "use strict"-safe. And you
*do* want that, don't you?
--
Kirk Strauser
The Strauser Group
Open. Solutions. Simple.
http://www.strausergroup.com/
Jul 18 '05 #6

P: n/a
> Then within the loop you use the following construct:

histogram[word] = histogram.get(word, 0) + 1


Why not this?

if word in histogram:
    histogram[word] += 1
else:
    histogram[word] = 1

Isn't that crystal clear? Or, for that matter:

if word not in histogram:
    histogram[word] = 0
histogram[word] += 1
Jul 18 '05 #7

P: n/a
"Dave Brueck" <da**@pythonapocrypha.com> wrote in message news:<ma************************************@python.org>...
Tuang wrote:
But I'm surprised at what you apparently have to go through to do
something as common as counting the frequency of elements in a
collection. For example, counting word frequency in a file in Perl
means looping over all the words with the following line:

$histogram{$word}++;


Hi Tuang,

Here's my _opinion_: Perl is especially geared towards text processing, where
maybe counting word frequency is fairly common. Python is more of a general
purpose programming language, in which counting the word frequency is a pretty
rare operation (I can't remember needing to do that more than once or twice in
the past several years). As such, it probably doesn't make sense to support
that feature at the language level - it would burden a lot of people with
knowing syntax they'd rarely use.


Oops. I appear to have given the impression that word frequency was
what I was after. I was just using that as an easy to explain example
of a very common task: subtotaling.

Imagine that you have a list of records -- lines in a text file will
do fine. Let's say each record is a person and you're interested in
favorite colors.

You iterate thru the lines, regexing the "favorite color" field out of
each and put it in the variable $color. Then you just use the line:

$dict{$color}++

to count up what you find. The first time that line is called, it
creates the dictionary, then creates a key for $color, initializes its
value to zero, then increments it to 1.

As you continue iterating, each new color it encounters creates a new
key, initializing it to zero and incrementing. When it finds a color
that already has a key, it just increments the count.

When you get to the end, your dictionary has keys for all the favorite
colors listed by at least one person, along with a count for how many
people listed that as their favorite color.

You can then sort, either on keys or more likely on values, and list
the favorite colors in order of popularity.

The word frequency in a list (e.g. a file) of words is just another
similar operation. Create counters for each new word you encounter and
increment them each time you see them again.
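
For comparison, the favorite-color count described above can be sketched in Python 3 as follows. The record layout, the sample names and the regular expression are invented for illustration; only the counting and sorting steps come from the description.

```python
import re

# hypothetical records, one person per line
records = [
    "name=Ann; color=blue",
    "name=Bob; color=red",
    "name=Cho; color=blue",
    "name=Dee; color=green",
    "name=Eli; color=blue",
]

color_field = re.compile(r"color=(\w+)")
favorites = {}
for line in records:
    match = color_field.search(line)
    if match:
        color = match.group(1)
        # new colors start from 0, known colors are incremented
        favorites[color] = favorites.get(color, 0) + 1

# most popular first; ties broken alphabetically
ranking = sorted(favorites.items(), key=lambda kv: (-kv[1], kv[0]))
print(ranking)
```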

And the actual application that brought this up is that I'm going thru
the Python online tutorial where it shows an algorithm for finding
primes. I just got curious about the distribution of gaps between
primes. It's the same problem: Find a prime. Then find the next higher
prime. Subtract the smaller from the larger to find the gap, then
increment the dictionary using that gap as the key.
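
A rough Python 3 sketch of that prime-gap tally. The helper primes_below is an assumption (simple trial division, not the tutorial's code), and the limit of 30 is chosen only to keep the example small.

```python
def primes_below(limit):
    """Return all primes less than limit using trial division."""
    primes = []
    for n in range(2, limit):
        # n is prime if no smaller prime up to sqrt(n) divides it
        if all(n % p for p in primes if p * p <= n):
            primes.append(n)
    return primes


found = primes_below(30)
gaps = {}
for smaller, larger in zip(found, found[1:]):
    gap = larger - smaller
    # the gap is the dictionary key, the value is how often it occurred
    gaps[gap] = gaps.get(gap, 0) + 1
print(gaps)
```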

This is a very common data analysis problem. It's the SQL database
operation of GROUP BY and then returning COUNT, but applied to any
sequence.

You can also subtotal nearly the same way. If you have sales records
for a bunch of salespeople, you just iterate thru the sales records,
plucking out a name for $salesperson and a sale amount ($amount), then
call:

$dict{$salesperson} += $amount

Instead of adding one, which "++" does, this increments by the amount
of the sale, resulting in subtotals for each salesperson.
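
The subtotal variant looks almost the same in Python 3. This is only a sketch; the sales records below are made-up sample data.

```python
# hypothetical (salesperson, amount) records
sales = [
    ("alice", 120),
    ("bob", 75),
    ("alice", 30),
    ("bob", 25),
    ("carol", 10),
]

totals = {}
for salesperson, amount in sales:
    # add the sale to the running subtotal, starting from 0
    totals[salesperson] = totals.get(salesperson, 0) + amount
print(totals)
```

The only change from counting is that the loop adds the amount instead of 1.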

This is such a common operation for Perl users that I'm surprised that
it's not easier to express in Python.

But I may be misunderstanding Python's philosophy a bit. I'm surprised
that value++ has to be spelled out as value = value+1, too, so I'm not
quite sure that I understand the philosophy.

[snip]
But I guess I'm making assumptions about what Python's philosophy
really is. I would expect that a language with something as nice as

[x**3 for x in my_list]
Building a list out of another list, however, is far more common, hence (in my
view at least) the appropriateness of syntax-level support.
Is this just something that hasn't been done yet but is on the way, or
is it a violation of Python's philosophy in some way?


Python can automatically import custom modules and functions on startup (search
for information on the site module), so if I were you I'd write a
WordHistogram function in my custom site module just once and never look back.
The added benefit is that

histogram = WordHistogram(text)

is much more readable to me as well as others than

$histogram{$word}++;


I agree for word frequency, but not for something as general as GROUP
BY and (some operation, such as COUNT or SUM). Maybe using some of the
functional programming constructs of Python (before they're removed in
Python 3) would be the way to build my own.

And thanks for the tip on the "site module"! No matter what, that
sounds like something useful.

My impression is that features generally get added if (1) there is a good
enough case for their broad usefulness and (2) they don't overly compromise the
relatively clean syntax of the language. In this specific example, the
histogram-builder function fails both tests,


As I said, it shouldn't fail (1) if people understand it, unless
Python programmers are significantly different from Perl programmers.
They may actually be (which is why I'm asking), but it may just be
that its broad usefulness wasn't clear from my explanation.
Jul 18 '05 #8

P: n/a
"Diez B. Roggisch" <de************@web.de> wrote in message news:<bp*************@news.t-online.com>...
Hi,
But I'm surprised at what you apparently have to go through to do
something as common as counting the frequency of elements in a
collection. For example, counting word frequency in a file in Perl
means looping over all the words with the following line:

$histogram{$word}++;


Is this all that's needed - or is it actually something like this:

for $word in $words:
    $histogram{$word}++

(I've never used perl, so I blatantly mix python/perl syntax here - but I
think you'll understand anyway :))


Yes, you guessed correctly. I just left out the loop wrapper, which is
essentially the same as the pseudocode version you wrote above,
because it's about the same as Python's. I was just talking about the
"guts" of the loop. ;-)
Jul 18 '05 #9

P: n/a
Tuang:
But I'm surprised at what you apparently have to go through to do
something as common as counting the frequency of elements in a
collection. For example, counting word frequency in a file in Perl
means looping over all the words with the following line:
...
This seems sort of unPythonesque to me, given the relative cleanliness
and obviousness (after seeing it once) of other common Python
constructs.

But I guess I'm making assumptions about what Python's philosophy
really is.


I see several replies already, but they don't seem to address your
question about the philosophical reasons for this choice.

A Python philosophy is that "Errors should never pass silently."
(To see some of the other points, 'import this' from the Python
prompt.)

When you reference '$histogram{$word}++' in Perl it automatically
creates the hash 'histogram' and creates an entry for $word with
the value of 0 (I think; it may set it to undef or ""). This is great,
as long as you don't make mistakes.

But people do make mistakes and misspell variables. Had you
written '$histrogram{$word}' then Perl would have simply
created a new hash for you with that name. This is enough of
a problem in Perl that it's recommended you 'use strict'
and declare the hash beforehand, as 'my %histogram'.

Python takes this approach by default, so there's no need for
the 'use strict' declaration. Python also doesn't have the
sigil-based typing of Perl so you need to tell it what object
to create, hence the need for 'histogram={}'.

Similarly, dictionaries require that entries be created before they
can be used. This is because it's impossible for Python to
know which value you want for the default. Python is strongly
typed, so "2"+1 will raise an exception, unlike Perl where it
yields the number 3. If Python used a 0 for the default then
what if you really wanted to concatenate strings? If it used
"" then what if you wanted to add numbers? Whatever choice
you make, it will be wrong for most cases.
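
A small Python 3 illustration of that point, with invented sample values: mixing a string and a number raises an exception, and the right starting value depends on whether you mean to add or to concatenate.

```python
# "2" + 1 is rejected rather than silently coerced
try:
    "2" + 1
    raised = False
except TypeError:
    raised = True

total = 0      # the right default when adding numbers
joined = ""    # the right default when concatenating strings
for item in ["2", "3"]:
    total = total + int(item)
    joined = joined + item

print(raised, total, joined)
```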

It works in Perl because of Perl's weak type system -- or
permissive coercion system if you want to look at it that way --
and corresponding 'typed' operators, so that + and . coerce
to numbers or strings, respectively.

I've also found that the requirement that the key exists before
being used catches mistakes similar to the requirement that
variables exist before being used.
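
Both kinds of mistake can be seen in a few lines of Python 3. The misspellings below are deliberate, and the dictionary contents are made up for the example.

```python
histogram = {"red": 2}

# misspelled key: Python raises KeyError instead of creating "rde"
try:
    histogram["rde"] += 1
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

# misspelled variable: Python raises NameError instead of creating a new dict
try:
    histrogram["red"] += 1
    name_error = False
except NameError:
    name_error = True

print(missing_key, name_error, histogram)
```

In both cases the original dictionary is left untouched, so the typo shows up at once instead of as a wrong count later.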

If you want, you can make a class which acts like a perl
hash, and assigns a default value or lets you redefine what
to use for that default. Here's a start (similar in result but
different in approach to Peter Otten's example)

import UserDict

class PerlDict(UserDict.DictMixin):
    def __init__(self, default = 0):
        self.data = {}
        self.default = default
    def __getitem__(self, key):
        try:
            return self.data[key]
        except KeyError:
            # first access creates the entry, as a Perl hash would
            self.data[key] = self.default
            return self.default
    def __setitem__(self, key, item):
        self.data[key] = item
    def __delitem__(self, key):
        try:
            del self.data[key]
        except KeyError:
            pass
    def keys(self):
        # DictMixin builds its other methods on keys()
        return self.data.keys()

But even with this you won't save much code. Here's what
it looks like:

histogram = PerlDict()
for line in open(filename):
    for word in line.split():
        histogram[word] += 1

Compare that to the canonical Python implementation

histogram = {}
for line in open(filename):
    for word in line.split():
        histogram[word] = histogram.get(word, 0) + 1

As several people pointed out, for this example you should
consider using a histogram/counter class, which would
separate intent from the actual calculation, as in

histogram = Histogram()
for line in open(filename):
    for word in line.split():
        histogram.count(word)

Your reply is that you're looking for the philosophy behind
Python, using the histogram as an example. That actually
is part of the philosophy -- in Python it's much easier to
make a class and instantiate an object with the appropriate
behaviours than it is in Perl, what with Perl's "bless" and
shift and @ISA. The above 'Histogram' is simply

class Histogram:
    def __init__(self):
        self.histogram = {}
    def count(self, word):
        self.histogram[word] = self.histogram.get(word, 0) + 1

In Perl the equivalent would be something like (and only
roughly like -- I never did fully figure out how to do Perl
OO correctly)

package Histogram;
sub new {
    my ($class) = @_;
    my $obj = { 'histogram' => {} };
    return bless $obj, $class;
}
sub count {
    my ($obj, $word) = @_;
    $obj->{'histogram'}{$word}++;
}

However, for a one-off histogram this level of abstraction isn't
worthwhile.

To summarize, Python's philosophical differences from Perl
for your example are:
- variables must be declared before use (reduces errors)
- dict entries must be declared before use (reduces errors)
- dict entries cannot have a default value (strong typing)
- classes are easy to create (letting you create objects which
better fit your domain)

Andrew
da***@dalkescientific.com
Jul 18 '05 #10

P: n/a
Tuang wrote:
> ...
> $dict{$color}++
>
> to count up what you find. The first time that line is called, it
> creates the dictionary, then creates a key for $color, initializes its
> value to zero, then increments it to 1.

Right: it does a lot of different things depending on context. Python
tends to avoid context-dependency while Perl tends to use it with
enthusiasm.

> This is a very common data analysis problem. It's the SQL database
> operation of GROUP BY and then returning COUNT, but applied to any
> sequence.

histogram = dict([ (value,seq.count(value)) for value in sets.Set(seq) ])

is a higher-order expression of the same concept. Not quite as fast
as the lower-level expression, because each .count step is a separate
loop over seq, but sometimes one prefers abstraction and concision to
speed. When speed IS preferred, being just a tad more explicit:

histogram = {}
for value in seq:
    histogram[value] = 1 + histogram.get(value, 0)

doesn't seem all that big a deal to me -- basically, you just have
to initialize the histogram to be the empty dictionary (no implicit
creation!) and be explicit about using 0 as "previous mapping" for
values that are not keys in the histogram yet. I know that Raymond
Hettinger considers bags (aka multisets) just as fundamental as sets
(which he's laboring to make built-ins in the future 2.4 release),
so there may be a histogram.add(value) if he has his way -- but you
still will have to initialize histogram (to a bag, in that case).

Python will never second-guess you in terms of "oh he's using it as
a [set/bag/dict/list], and it doesn't exist, so I'll just create a
[whatever type] instance on the fly".
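
For anyone reading this on a later Python: the standard library has since gained collections.defaultdict and collections.Counter, neither of which exists in the 2.3 discussed here. A short sketch with made-up sample data; note that in both cases the container still has to be created explicitly before the loop.

```python
from collections import Counter, defaultdict

seq = ["red", "blue", "red", "green", "blue", "red"]

# Counter is a dict subclass that behaves like a bag / multiset
histogram = Counter(seq)
top = histogram.most_common(1)

# defaultdict(int) supplies 0 for missing keys, but only because we asked for it
subtotal = defaultdict(int)
for value in seq:
    subtotal[value] += 1

print(top, dict(subtotal))
```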

> But I may be misunderstanding Python's philosophy a bit. I'm surprised
> that value++ has to be spelled out as value = value+1, too, so I'm not

value += 1 is preferred these days. But in the histogram case, there
is no previous "value" to increment, and guesstimating to insert a 0
there just isn't Python's way.

> quite sure that I understand the philosophy.

There is an ideal about "only one obvious way to do it", just as, in
C (per the Rationale to the C standard) there is an ideal to "provide
only one way to do an operation". So, having both value++ and
value += 1 should never happen (though in C it did: ideals are ideals,
the real world is sometimes messier:-). Perl's enthusiastic abundance
of multiple ways to perform each task is a very different philosophy.

> I agree for word frequency, but not for something as general as GROUP
> BY and (some operation, such as COUNT or SUM). Maybe using some of the
> functional programming constructs of Python (before they're removed in
> Python 3) would be the way to build my own.


List comprehensions (which Python copied from Haskell) are the key
FP construct in Python, and far from being removed they're growing
(with genexp's coming in 2.4 -- they're "lazy", like Haskell's LCs).

A more general GROUP BY (dict of lists) is built by:

histogram = {}
for value in seq:
    histogram.setdefault(f(value), []).append(value)

where f(...) represents the grouping -- e.g., to group by the
value of an attribute x.key of each item x,

histogram.setdefault(x.key, []).append(x)

or if you want the higher-order abstraction,

histogram = dict([ (key, [y for y in seq if y.key==key] )
for key in sets.Set( [x.key for x in seq] )
])
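
A worked instance of the setdefault grouping, as a Python 3 sketch. The records and the choice of the first tuple element as grouping key are invented for the example.

```python
# hypothetical (kind, name) records
records = [
    ("fruit", "apple"),
    ("veg", "carrot"),
    ("fruit", "pear"),
    ("veg", "leek"),
    ("fruit", "fig"),
]

groups = {}
for kind, name in records:
    # setdefault creates the empty list the first time a kind is seen
    groups.setdefault(kind, []).append(name)

# a COUNT per group falls out of the grouped lists
counts = {kind: len(names) for kind, names in groups.items()}
print(groups, counts)
```

Grouping first and aggregating afterwards mirrors GROUP BY followed by COUNT; swapping len for sum would give subtotals of numeric values instead.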
Alex

Jul 18 '05 #11

P: n/a
[Tuang wrote]
I'm checking out Python as a candidate for replacing Perl as my "Swiss
Army knife" tool. The longer I can remember the syntax for performing
a task, the more likely I am to use it on the spot if the need arises.
If I have to go off and look it up, as I increasingly have to do with
Perl's ever hairier syntax, I'm more likely to just skip it, making me
even less likely to remember the syntax the next time.
and

[Tuang wrote]
Imagine that you have a list of records -- lines in a text file will
do fine. Let's say each record is a person and you're interested in
favorite colors.

You iterate thru the lines, regexing the "favorite color" field out of
each and put it in the variable $color. Then you just use the line:

$dict{$color}++

to count up what you find. The first time that line is called, it
creates the dictionary, then creates a key for $color, initializes its
value to zero, then increments it to 1.

As you continue iterating, each new color it encounters creates a new
key, initializing it to zero and incrementing. When it finds a color
that already has a key, it just increments the count.


It seems to me that it is not syntax that is the issue here, but
semantics.

You've listed a number of data structures that are created implicitly
by Perl, given the above syntax, i.e. dictionaries, keys, etc, are
created implicitly as your code is run.

That, IMHO, is what hinders you from remembering the "syntax" to carry
out the job. You don't just have to remember the syntax, but also all
of the implicit object creation semantics that go with it. That is
more likely to send you reaching for the reference manual, and also
more likely to prevent you using such a construct (according to
your statement "If I have to go off and look it up .. I'm more likely
to just skip it").

The problem with such implicit semantics is that they increase the
complexity of what it is that you're learning, and thus steepen your
learning curve. It gets worse if/when there are lots of special cases
because the implicit object creation semantics don't fit every
situation, and the semantics implied by a syntax vary depending on
context.

I think the fundamental difference between Perl and Python is one of
philosophy.

Perl people seem to like implied semantics, because it gives them
terser code, meaning that

1. More steps can be achieved with less syntax, e.g. golf
competitions[1].
2. Increasing knowledge of the "secret operation of the machine"
brings on feelings of guru-hood.
3. Which is shown off by writing code which depends upon the detailed
implicit semantics of the machine, that no-one else can understand
without a similar detailed understanding of the machine.

Python takes the opposite philosophical view: Explicit is better than
Implicit. Meaning object and data creation semantics are always
explicit. This tends to make python code easier to read, because there
is a limited (but well-designed and powerful) set of semantics to
learn, which must be explicitly stated by all code that uses them.

Have you done this yet?

shell>python
Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


Welcome to the python community.

--
alan kennedy
-----------------------------------------------------
check http headers here: http://xhaus.com/headers
email alan: http://xhaus.com/mailto/alan

[1] http://perlmonks.thepen.com/130140.html
Jul 18 '05 #12

P: n/a
bo**@oz.net (Bengt Richter) wrote

IMO Python is a very good Swiss Army Knife. If you think you have a pattern
that you will want to re-use, then it is pretty easy to make something to
hide the stuff you want as default, and leave out some unnecessaries. E.g.,
if you want histograms, it's easy to make a histogram class that will
take a word sequence and give you a histogram object that will do what you want,
and that you can add to as your requirements change. E.g.,


[...lots of useful stuff for me to copy for future reference...]

Thanks!

I appreciate your examples of how easy it is to create abstractions
somewhat above the level of the built-ins for future reuse. It can be
done in Perl too, of course, but it appears to be so much more
convenient in Python that I think I actually will start assembling a
bunch of prebuilt tools of this sort to put in a standard module of my
own (my personal toolbox). As I said, it can be done in Perl, too, but
it's just difficult enough that I never quite bothered to do it. I
definitely like this aspect of Python.
Jul 18 '05 #13

P: n/a
In article <hv*****************@newsread2.news.pas.earthlink. net>, Andrew Dalke wrote:
Similarly, dictionaries require that entries be created before they
can be used. This is because it's impossible for Python to
know which value you want for the default. Python is strongly
typed, so "2"+1 will raise an exception, unlike Perl where it
yields the number 3. If Python used a 0 for the default then
what if you really wanted to concatenate strings? If it used
"" then what if you wanted to add numbers? Whatever choice
you make, it will be wrong for most cases.


I agree with everything else you said here, but want to caution against your
claim that it's "impossible for Python to know" what to supply as a default.
For instance--and I'm not saying this is better or worse, nor that it is in
harmony with the Python Way--Ruby works like this instead:

irb(main):001:0> d = {}
=> {}
irb(main):002:0> d['a']
=> nil
irb(main):003:0> d.default = 0
=> 0
irb(main):004:0> d['a']
=> 0
irb(main):005:0> d.fetch('a')
IndexError: key not found
        from (irb):5:in `fetch'
        from (irb):5

So, as you can see, the default behavior in Ruby is akin to Python's
dict.get (which allows you to supply a default), and the exception-throwing
method is "fetch", instead of [].

In summary:

Python          Ruby
-------------   ----------------------
d['a']          d.fetch('a')
d.get('a')      d['a']
d.get('a', 0)   d.default = 0; d['a']

I prefer the Python way because the lazier syntax ([]) is fail-fast.
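
A minimal Python sketch of the table above, assuming a made-up dictionary d whose only key is 'b'. It shows [] failing fast on a missing key, while get() quietly returns None or whatever default is supplied:

```python
# Illustrative only: the dictionary contents are made up for this sketch.
d = {'b': 1}

# d['a'] fails fast: a missing key raises KeyError.
try:
    d['a']
    missing_raised = False
except KeyError:
    missing_raised = True

# d.get('a') returns None; d.get('a', 0) returns the supplied default.
no_default = d.get('a')
with_default = d.get('a', 0)

print(missing_raised)
print(no_default)
print(with_default)
```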

--
..:[ dave benjamin (ramenboy) -:- www.ramenfest.com -:- www.3dex.com ]:.
: d r i n k i n g l i f e o u t o f t h e c o n t a i n e r :
Jul 18 '05 #14

P: n/a
Kirk Strauser <ki**@strauser.com> wrote in message news:<87************@strauser.com>...
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

At 2003-11-17T21:29:16Z, tu******@hotmail.com (Tuang) writes:
In Python, apparently you have to first remember to declare your
dictionary outside the loop:

histogram = {}


Note that you have to declare:

my %histogram;

outside the loop in Perl if you want it to be "use strict"-safe. And you
*do* want that, don't you?


Actually, I don't, so your point is a good one. I occasionally build
something big enough in Perl that "use strict" matters (me:"What? Perl
CGI is the only platform available?", them:"Yep. Take it or leave
it"), but that's not my preferred use for Perl. My preferred use is
when I encounter some data in one form and I want to quickly process
it into another form with about two minutes of work and less than ten
lines of code. Sort of like using a particularly powerful
sed+awk+grep+sort+... all rolled into one utility. I don't want to
have to set things up. I just want to tell it what I want and have it
take care of the overhead (declaring variables, initializing them to
zero, taking care of memory, etc.)

I was hoping that Python would be just as easy to use in that role AND
scale better to larger programs. I'm convinced that it will scale
better, but there is apparently a rather small but still real cost in
inconvenience at the "one-liner" level.
Jul 18 '05 #15

P: n/a
Dave Benjamin
I agree with everything else you said here, but want to caution against your claim that it's "impossible for Python to know" what to supply as a default.

The full quote is 'This is because it's impossible for Python to know which
value you want for the default'. Note the "you want".

I later said:
Whatever choice you make [should have said "Python makes"], it
will be wrong for most cases.

irb(main):002:0> d['a']
=> nil

In most cases, this is the wrong return value, yes?

irb(main):003:0> d.default = 0
=> 0

And there's where you tell Ruby what you want.

d.get('a', 0)   d.default = 0; d['a']

I prefer the Python way because the lazier syntax ([]) is fail-fast.


And because it's thread-safe.

I do see your point that 'get' does provide a default value of
None if the item doesn't exist, but the behaviour, given that
the default is passed as part of the function parameter, means
that that value isn't really associated with the dictionary.
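
To sketch that last point (the names counts, 'a' and 'b' are made up for illustration): the default handed to get() is returned but never stored, whereas setdefault() does store it in the dictionary.

```python
# Illustrative only: 'counts' and its keys are made-up examples.
counts = {}

# get() hands back the default but does not store it in the dictionary.
value = counts.get('a', 0)
stored_after_get = 'a' in counts

# setdefault() returns the default and also stores it under the key.
counts.setdefault('b', 0)
stored_after_setdefault = 'b' in counts

print(value)
print(stored_after_get)
print(stored_after_setdefault)
```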

Andrew
da***@dalkescientific.com
Jul 18 '05 #16

P: n/a
"Andrew Dalke" <ad****@mindspring.com> wrote in message news:<hv*****************@newsread2.news.pas.earth link.net>...
Tuang:
But I'm surprised at what you apparently have to go through to do
something as common as counting the frequency of elements in a
collection. For example, counting word frequency in a file in Perl
means looping over all the words with the following line: ...
This seems sort of unPythonesque to me, given the relative cleanliness
and obviousness (after seeing it once) of other common Python
constructs.

But I guess I'm making assumptions about what Python's philosophy
really is.


I see several replies already, but they don't seem to address your
question about the philosophical reasons for this choice.


Thanks for a spot-on answer to my exact question. ;-)

A Python philosophy is that "Errors should never pass silently."
(To see some of the other points, 'import this' from the Python
prompt.)

When you reference '$histogram{$word}++' in Perl it automatically
creates the hash 'histogram' and creates an entry for $word with
the value of 0 (I think; it may set it to undef or ""). This is great,
as long as you don't make mistakes.

But people do make mistakes and misspell variables. Had you
written '$histrogram{$word}' then Perl would have simply
created a new hash for you with that name. This is enough of
a problem in Perl that it's recommended you 'use strict'
and declare the hash beforehand, as 'my %histogram'.
Very true. Other respondents made this point as well, which I think is
a good one. It sort of points out that my favorite use of Perl is for
quick one-liners that let me do exactly what I want on the spur of the
moment, as opposed to writing bigger "programs" for repeated use. I've
done both in Perl, but I think it's better suited to the former than
the latter. Python appears to scale up much more gracefully.

That use of Perl for one-liners, though, is why having them be easy,
automatic, and memorable matters. That's why I would prefer the same
convenience in anything I use to replace Perl for "Swiss Army knife"
use. I would apparently have to give up a bit of the convenience at
the one-liner level to buy the superior scaling (and other relative
benefits) of Python.
...Python is strongly
typed, so "2"+1 will raise an exception, unlike Perl where it
yields the number 3. If Python used a 0 for the default then
what if you really wanted to concatenate strings? If it used
"" then what if you wanted to add numbers? Whatever choice
you make, it will be wrong for most cases.
I find Perl's defaults to be right almost *all* of the time. Increment
a counter? Fine, $counter++ creates the counter if it needs to be
created, initializes it if it was just created, and then increments
it. On the rare occasions when I want it initialized to something
other than 0, I just initialize it myself, but the default
initialization of zero is right almost all the time for numbers, so
why not take advantage of that fact?

And how often would you want to initialize a string to anything other
than an empty string? So in Perl, you just start concatenating, and if
the string doesn't already exist, it's automatically created for you
and your first concatenation concatenates to an empty string. Again,
Perl's default is almost always what you want, and in those rare cases
where it's not what you want, you're free to do what you have to do in
Python every single time.

Your point about the typing, though, may be a reason why such a thing
couldn't be done in Python, and your point about "use strict"
indicates to me that such convenience may have proven to be a
liability as soon as your program starts to grow.

I do know that there have been a few occasions when I've been stumped
by Perl's guessing incorrectly at what data type I meant and my not
having any way to explicitly tell it.

Your reply is that you're looking for the philosophy behind
Python, using the histogram as an example. That actually
is part of the philosophy -- in Python it's much easier to
make a class and instantiate an object with the appropriate
behaviours than it is in Perl, what with Perl's "bless" and
shift and @ISA.
This is very true. It's starting to appear as though Python's
built-ins are often less convenient than Perl's, but in every case
that I've seen so far it has been easier to combine those built-ins
into higher-level abstractions in Python than in Perl.
The above 'Histogram' is simply
class Histogram:
    def __init__(self):
        self.histogram = {}
    def count(self, word):
        self.histogram[word] = self.histogram.get(word, 0) + 1
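
As a usage sketch, the class above could be driven like this. The sample sentence is made up here and is not from the thread:

```python
class Histogram:
    def __init__(self):
        # Word -> number of times count() has been called with it.
        self.histogram = {}

    def count(self, word):
        # get() supplies 0 the first time a word is seen.
        self.histogram[word] = self.histogram.get(word, 0) + 1


# Made-up sample input for illustration.
words = "the cat and the hat and the bat".split()

h = Histogram()
for word in words:
    h.count(word)

print(h.histogram['the'])
print(h.histogram['and'])
```

The dictionary stays an ordinary attribute, so anything that works on a dict (sorting, iteration, lookup) still works on h.histogram.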

In Perl the equivalent would be something like (and only
roughly like -- I never did fully figure out how to do Perl
OO correctly)
LOL! Neither have I. It's not that we're retarded -- well, you be the
judge ;-) -- but it's just more trouble than it's worth in Perl. There
are so many useful Perl modules that I try to remember just enough to
be able to use other peoples', but it's just not worth the bother
remembering how to make them myself. After a few hours with Python, I
was already better at object-oriented Python than I ever was at
object-oriented Perl.


package Histogram;
sub new {
    my ($class) = @_;
    my $obj = { 'histogram' => {} };   # per-object word counts
    return bless $obj, $class;         # bless takes (reference, class)
}
sub count {
    my ($obj, $word) = @_;             # the object is the first argument
    $obj->{'histogram'}{$word}++;
}

However, for a one-off histogram this level of abstraction isn't
worthwhile.
Amen. But $hist{$word++} works beautifully.

To summarize, Python's philosophical differences from Perl
for your example are:
- variables must be declared before use (reduces errors)
- dict entries must be declared before use (reduces errors)
- dict entries cannot have a default value (strong typing)
- classes are easy to create (letting you create objects which
better fit your domain)


Thanks. That perfectly explained the philosophy behind my example.
Jul 18 '05 #17

P: n/a
Tuang:
That use of Perl for one-liners, though, is why having them be easy,
automatic, and memorable matters. That's why I would prefer the same
convenience in anything I use to replace Perl for "Swiss Army knife"
use. I would apparently have to give up a bit of the convenience at
the one-liner level to buy the superior scaling (and other relative
benefits) of Python.
One thing to consider is Python's interactive nature means you don't
have to do everything as one-liners. For more complicated cases where
I would have done a one-off one-liner I now start up Python and
do things step-by-step. Python is *not* a good shell substitute, though
PyShell does help a lot, judging from the demo I saw.
I find Perl's defaults to be right almost *all* of the time.
That is true, but only works because of weak types and typed
operations.
Increment a counter? Fine, $counter++ creates the counter if it
needs to be created, initializes it if it was just created, and then
increments it.
And your use of ++ tells it to be a number. I agree that Perl
puts a lot of flexibility into just a few characters.
On the rare occasions when I want it initialized to something
other than 0, I just initialize it myself, but the default
initialization of zero is right almost all the time for numbers, so
why not take advantage of that fact?
It doesn't initialize to 0; I think it initializes to undef. It's only
when you do ++ or += 1 where the undef gets converted into
an integer. You can test it out by doing ' .= "X"' instead of ++.
It should not start with a '0', which it would if the default value
was 0.
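
The Python side of the same point can be sketched as follows (variable names are illustrative only). No single default suits both counting and concatenation, because Python will not coerce between int and str:

```python
# A default of 0 suits counting...
count = 0
count += 1

# ...and a default of "" suits concatenation...
text = ""
text += "X"

# ...but mixing the two raises TypeError instead of guessing.
try:
    "2" + 1
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(count)
print(text)
print(mixed_ok)
```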
Amen. But $hist{$word++} works beautifully.
Especially if you typed it correctly ;)
Thanks. That perfectly explained the philosophy behind my example.


You're welcome.

Andrew
da***@dalkescientific.com
Jul 18 '05 #18

P: n/a
Alan Kennedy <al****@hotmail.com> wrote

[lots of good stuff...]
Welcome to the python community.


Thanks. It looks like a pretty good community. The graciousness of
this community in its response to expressions of concern or scepticism
by newcomers stands in sharp contrast to the hostility with which
similar questions are met by certain other language communities and
definitely increases my interest in Python.
Jul 18 '05 #19

P: n/a

"Tuang" <tu******@hotmail.com> wrote in message
news:df**************************@posting.google.c om...
$dict{$color}++
to count up what you find.
Your wish is Python's command:

class cdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return 0

>>> h=cdict()
>>> for item in [1,1,2,3,2,3,1,4]:
...     h[item]+=1
...
>>> h
{1: 3, 2: 2, 3: 2, 4: 1}

$dict{$salesperson} += $amount


Trivial modification:

>>> t=cdict()
>>> for key,amt in ((1,2), (1,3), (2,5), (3,1), (3,2), (3,3)):
...     t[key] += amt
...
>>> t
{1: 5, 2: 5, 3: 6}

Terry J. Reedy


Jul 18 '05 #20

P: n/a
"Terry Reedy" <tj*****@udel.edu> wrote in message news:<v7********************@comcast.com>...
"Tuang" <tu******@hotmail.com> wrote in message
news:df**************************@posting.google.c om...
$dict{$color}++
to count up what you find.


Your wish is Python's command:

class cdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return 0

>>> h=cdict()
>>> for item in [1,1,2,3,2,3,1,4]:
...     h[item]+=1
...
>>> h
{1: 3, 2: 2, 3: 2, 4: 1}

$dict{$salesperson} += $amount


Trivial modification:

>>> t=cdict()
>>> for key,amt in ((1,2), (1,3), (2,5), (3,1), (3,2), (3,3)):
...     t[key] += amt
...
>>> t
{1: 5, 2: 5, 3: 6}


Nice. I like that. It appears that in Python it pays to think at a
slightly higher level of abstraction than in Perl. Put a little extra
time into creating a little tool for the job the first time, then
reuse it the next time. I do that with Perl code snippets, but using
small classes or functions instead appeals to me.
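
For readers on Python releases later than the 2.3 discussed in this thread, the standard library ships ready-made versions of this little tool: collections.defaultdict (added in Python 2.5) and collections.Counter (added in 2.7 and 3.1). A sketch that reuses the sample data from the cdict example, so it will not run on 2.3 itself:

```python
from collections import Counter, defaultdict

# defaultdict(int) supplies 0 for a missing key, much like cdict above,
# except that the default is also stored in the dictionary on first access.
h = defaultdict(int)
for item in [1, 1, 2, 3, 2, 3, 1, 4]:
    h[item] += 1

# The same container handles the running-total case.
t = defaultdict(int)
for key, amt in ((1, 2), (1, 3), (2, 5), (3, 1), (3, 2), (3, 3)):
    t[key] += amt

# Counter does the frequency count in a single call.
c = Counter([1, 1, 2, 3, 2, 3, 1, 4])

print(dict(h))
print(dict(t))
print(dict(c))
```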
Jul 18 '05 #21

P: n/a
I just started learning Python over the weekend, and I have to say that it
was very frustrating at first. This thread came at just the right time,
and it's been very enlightening. Understanding Python's approach has
helped me accept and appreciate why it does things the way it does.

I was originally a C advocate, lured to Perl by the rapid
development... I'm a Unix sysadmin and Perl was perfect for me. But as I
look at doing larger projects, I find Perl's modules, scoping and OO to be
esoteric. Python looks like just the ticket.

Thanks to everyone who contributed.

--
Carl D Cravens (ra***@phoenyx.net)
I'm not lost, I'm "locationally challenged".
Jul 18 '05 #22

P: n/a
tu******@hotmail.com (Tuang) writes:
Nice. I like that. It appears that in Python it pays to think at a
slightly higher level of abstraction than in Perl. Put a little extra
time into creating a little tool for the job the first time, then
reuse it the next time. I do that with Perl code snippets, but using
small classes or functions instead appeals to me.


Yep, w/ Python, if you want magic, you just implement it yourself or
use a premade module that implements the magic. It's all doable, it's
just not the standard, out of the box way. IMO the "convenience"
argument for having the magic built into the language is worthless,
because getting to that magic is easy enough w/ the tools we already
have. doRE("var=~s/hi/hello/i"), anyone? :-)

And having something as an out-of-the-box __builtins__ is still
infinitely preferable to having an implementation of something
intermingled w/ the code generator for the language, even w/ the
minimal sacrifice of performance.

--
Ville Vainio http://www.students.tut.fi/~vainio24
Jul 18 '05 #23
