user-defined operators: a very modest proposal

P: n/a
I have been studying Python recently, and I read a comment on one
web page that said something like "the people using Python for heavy math
really wish they could define their own operators". The specific
example was to define an "outer product" operator for matrices. (There
was even a PEP, number 211, about this.)

I gave it some thought, and Googled for previous discussions about this,
and came up with this suggestion:

User-defined operators could be defined like the following: ]+[

I'm not any kind of language design expert, but this seems to me like a
syntax that would be easy for Python to recognize. Because the square
braces are reversed from the usual "[]" order, this should not look like
any currently-valid code. And square braces, IMHO, do not fail the
"low-toner printout" test. (Some earlier proposals included operators like
"~+" and these were deemed too hard to read.)

For improved readability, Python could even enforce a requirement that
there should be white space on either side of a user-defined operator.
I don't really think that's necessary.

It should be possible to define operators using punctuation,
alphanumerics, or both:

]+[
]add[
]outer*[

Examples of use:

m = m0 ]*[ m1
m = m0]*[m1

m = m0 ]outer*[ m1
m = m0]outer*[m1

It looks a lot better with the white space, I think, but it's not horrible
without the white space.

Also, there should be a way to declare what kind of precedence the user-defined
operators use. Python already has lots of operators with different precedence,
and I think the best way is just to indicate which Python operator the new
operator's precedence should match:

class MyExcellentMatrix(object):
    @precedence('*')
    def __op_outer*__(self, right):
        # ...do stuff...

I think a decorator is a good way to set the precedence.
Perhaps the default precedence should be that of '+'.
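
The decorator half of this idea can be tried in today's Python, with the caveat that nothing in the parser would ever consult it. A minimal sketch, assuming a hypothetical method name `__op_outer__` (an asterisk is not legal in an identifier) and a hypothetical attribute `precedence_like`; neither is part of the proposal as written:

```python
def precedence(like):
    """Record which built-in operator's precedence a method wants to share."""
    def decorate(func):
        # Hypothetical attribute; no Python parser reads it.
        func.precedence_like = like
        return func
    return decorate


class MyExcellentMatrix(object):
    def __init__(self, values):
        self.values = list(values)

    @precedence('*')
    def __op_outer__(self, right):
        # Outer product of two vectors stored as flat lists.
        return [[a * b for b in right.values] for a in self.values]


m0 = MyExcellentMatrix([1, 2])
m1 = MyExcellentMatrix([3, 4])
print(m0.__op_outer__(m1))                              # [[3, 4], [6, 8]]
print(MyExcellentMatrix.__op_outer__.precedence_like)   # *
```

The sketch only stores the requested precedence as metadata. Making the parser honour it is the hard part of the proposal.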

Augmented forms should be supported:

]+=[
]*=[
]outer*=[

Examples:

m ]*=[ m0
m]*=[m0

m ]outer*=[ m0
m]outer*=[m0

Either I actually have made a sensible suggestion, or else people will now
explain why this idea isn't good (and I'll learn something). Either way,
I look forward to your comments.

References:

Elementwise/Objectwise Operators
http://www.python.org/peps/pep-0225.html
Adding A New Outer Product Operator
http://www.python.org/peps/pep-0211.html
--
Steve R. Hastings "Vita est"
st***@hastings.org http://www.blarg.net/~steveha
Nov 22 '05 #1
17 Replies


P: n/a
On Tue, 22 Nov 2005 13:48:05 -0800, Steve R. Hastings wrote:
User-defined operators could be defined like the following: ]+[
[snip]
Examples of use:

m = m0 ]*[ m1
m = m0]*[m1
That looks to me like multiplying two lists. I have to look twice to see
that the operands are merely m0 and m1 and not [m0] and [m1].
m = m0 ]outer*[ m1
m = m0]outer*[m1


That just looks weird.
Here is a thought: Python already supports an unlimited number of
operators, if you write them in prefix notation:

inner_product(m0, m1)
outer_product(m0, m1)
etc.
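
As an illustration of the prefix-notation point, here is one way the two functions could look for vectors held in plain lists. This is only a sketch; the function bodies are assumptions, since the post names the functions without defining them.

```python
def inner_product(u, v):
    """Sum of pairwise products of two equal-length sequences."""
    if len(u) != len(v):
        raise ValueError("operands must have the same length")
    return sum(a * b for a, b in zip(u, v))


def outer_product(u, v):
    """Matrix (list of rows) whose [i][j] entry is u[i] * v[j]."""
    return [[a * b for b in v] for a in u]


m0 = [1, 2, 3]
m1 = [4, 5, 6]
print(inner_product(m0, m1))   # 32
print(outer_product(m0, m1))   # [[4, 5, 6], [8, 10, 12], [12, 15, 18]]
```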

Here is some syntax that I don't object to, although that's not saying
much. In mathematics, there are operators of a plus sign within a circle,
multiply sign within a circle, etc. The closest we can get in plain ASCII
would be:

m0(+)m1
m0(*)m1
m0(-)m1
etc.
--
Steven.

Nov 22 '05 #2

P: n/a
If your proposal is implemented, what does this code mean?
if [1,2]+[3,4] != [1,2,3,4]: raise TestFailed, 'list concatenation'
Since it contains ']+[' I assume it must now be parsed as a user-defined
operator, but this code currently has a meaning in Python.

(This is the first example I found, in Python 2.3's test/test_types.py, so it
is real code.)
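
The collision can be checked against the current tokenizer. A minimal sketch using the standard `tokenize` module in Python 3; only the expression part of the test line is tokenized, because the `raise TestFailed, '...'` form is Python 2 syntax:

```python
import io
import tokenize

source = "[1,2]+[3,4] != [1,2,3,4]\n"

# Collect the text of every operator token the tokenizer reports.
ops = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.OP
]
print(ops)
print(ops[2:5])   # [']', '+', '[']
```

Today `]`, `+` and `[` come out as three separate tokens, so a tokenizer that treated `]+[` as a single operator would change the meaning of this line.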

I don't believe that Python needs user-defined operators, but let me share my
terrible proposal anyway: Each unicode character in the class 'Sm' (Symbol,
Math) whose value is greater than 127 may be used as a user-defined operator.
The special method called depends on the ord() of the unicode character, so
that __u2044__ is called when the source code contains u'\N{FRACTION SLASH}'.
Whatever alternate syntax is adopted to allow unicode identifier characters to
be typed in pure ASCII will also apply to typing user-defined operators. "r"
and "i" versions of the operators will of course exist, as in __ru2044__ and
__iu2044__.
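
The naming rule described above is mechanical enough to sketch. The helper names `is_operator_candidate` and `operator_method_names` are invented for this illustration; only the `__u2044__`, `__ru2044__` and `__iu2044__` spellings come from the post.

```python
import unicodedata


def is_operator_candidate(ch):
    """True for non-ASCII characters in Unicode category 'Sm' (Symbol, Math)."""
    return ord(ch) > 127 and unicodedata.category(ch) == "Sm"


def operator_method_names(ch):
    """Return the plain, reflected ("r") and in-place ("i") method names."""
    code = "u%04x" % ord(ch)
    return ["__%s__" % code, "__r%s__" % code, "__i%s__" % code]


slash = "\N{FRACTION SLASH}"
print(is_operator_candidate(slash))    # True
print(operator_method_names(slash))    # ['__u2044__', '__ru2044__', '__iu2044__']
print(is_operator_candidate("+"))      # False: '+' is in 'Sm' but is plain ASCII
```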

Also, to accommodate operators such as u'\N{DOUBLE INTEGRAL}', which are not
simple unary or binary operators, the character u'\N{NO BREAK SPACE}' will be
used to separate arguments. When necessary, parentheses will be added to
remove ambiguity. This leads naturally to expressions like
\N{DOUBLE INTEGRAL} (y * x**2) \N{NO BREAK SPACE} dx \N{NO BREAK SPACE} dy
(corresponding to the call (y*x**2).__u222c__(dx, dy)) which are clearly easy
to love, except for the small issue that many inferior editors will not clearly
display the \N{NO BREAK SPACE} characters.

Some items on which I think I'd like to hear the community's ideas are:
* Do we give special meaning to comparison characters like
\N{NEITHER LESS-THAN NOR GREATER-THAN}, or let users define them in new
ways? We could just provide, on object,
def __u2279__(self, other): return not self.__gt__(other) and not other.__gt__(self)
which would in effect satisfy all users.

* Do we immediately implement the combination of operators with nonspacing
marks, or defer it? If we implement it, do we allow the combination with
pure ASCII operators, as in
u'\N{COMBINING LEFT RIGHT ARROW ABOVE}+'
or treat it as a syntax error? (BTW the method name for this would be
__u20e1u002b__, even though it might be tempting to support __u20e1x2b__,
__u20e1add__ and similar method names) How and when do we normalize
operators combined with more than one nonspacing mark?

* Which unicode operator methods should be supported by built-in types?
Implementing __u222a__ and __iu222a__ for sets is a no-brainer,
obviously, but what about __iu2206__ for integers and long?

* Should some of the unicode mathematical symbols be reserved for literals?
It would be greatly preferable to write \u2205 instead of the other proposed
empty-set literal notation, {-}. Perhaps nullary operators could be defined,
so that writing \u2205 alone is the same as __u2205__() i.e., calling the
nullary function, whether it is defined at the local, lexical, module, or
built-in scope.

* Do we support characters from the category 'So' (symbol, other)? Not
doing so means preventing programmers from using operators like
u"\N{HEAVY CONCAVE-POINTED BLACK RIGHTWARDS ARROW}". Who are we to
make those kinds of choices for our users?

Jeff


Nov 22 '05 #3

P: n/a
Steve R. Hastings wrote:
I have been studying Python recently, and I read a comment on one
web page that said something like "the people using Python for heavy math
really wish they could define their own operators". The specific
example was to define an "outer product" operator for matrices. (There
was even a PEP, number 211, about this.)

I gave it some thought, and Googled for previous discussions about this,
and came up with this suggestion:

User-defined operators could be defined like the following: ]+[

I'm not any kind of language design expert, but this seems to me like a
syntax that would be easy for Python to recognize. Because the square
braces are reversed from the usual "[]" order, this should not look like
any currently-valid code.
Is [a,b]+[c] the concatenation of two lists, or a single two-element
list containing a and b ]+[ c?
And square braces, IMHO, do not fail the "low-toner printout" test.


They do. Just yesterday I printed some code in which some of the
square braces didn't show up.

Nov 22 '05 #4

P: n/a
"Steve R. Hastings" <st*****@localhost.localdomain> writes:
I have been studying Python recently, and I read a comment on one
web page that said something like "the people using Python for heavy math
really wish they could define their own operators". The specific
example was to define an "outer product" operator for matrices. (There
was even a PEP, number 211, about this.)
I gave it some thought, and Googled for previous discussions about this,
and came up with this suggestion:
User-defined operators could be defined like the following: ]+[


See <URL:
http://aspn.activestate.com/ASPN/Coo.../Recipe/384122 > for
some better suggestions, including an implementation in Python.
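
For readers who cannot follow the link: a well-known way to get infix-looking user operators in unmodified Python is to wrap a function in an object that overloads an existing operator on both sides. The sketch below shows that general trick and is not claimed to be the linked recipe's code; the class name `Infix` and the choice of `|` are assumptions made for this illustration.

```python
class Infix(object):
    """Wrap a two-argument function so it can be written as  x |op| y."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, left):
        # Evaluates  left | op : remember the left operand.
        return Infix(lambda right: self.func(left, right))

    def __or__(self, right):
        # Evaluates  (left | op) | right : call the stored function.
        return self.func(right)


@Infix
def outer(u, v):
    return [[a * b for b in v] for a in u]


print([1, 2] |outer| [3, 4])   # [[3, 4], [6, 8]]
```

The operator inherits the precedence of `|`, which is lower than that of the arithmetic operators, so an expression such as `a + b |outer| c` groups as `(a + b) |outer| c`.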

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Nov 23 '05 #5

P: n/a
> Here is a thought: Python already supports an unlimited number of
operators, if you write them in prefix notation:
And indeed, so far Python hasn't added user-defined operators because this
has been adequate.

Here is some syntax that I don't object to, although that's not saying
much.

m0(+)m1


That form was discussed previously, as were "[+]", "<+>", etc. The
favorite was "{+}". I believe such forms were considered hard to tell
from code. In particular, m0(+) looks like a function call.

See the PEP:

http://www.python.org/peps/pep-0225.html

Alas, the links to the discussion about this don't work. But it is
possible to use the Google Groups archive of comp.lang.python to read some
of the discussion.
--
Steve R. Hastings "Vita est"
st***@hastings.org http://www.blarg.net/~steveha

Nov 23 '05 #6

P: n/a
> if [1,2]+[3,4] != [1,2,3,4]: raise TestFailed, 'list concatenation'
Since it contains ']+[' I assume it must now be parsed as a user-defined
operator, but this code currently has a meaning in Python.


Yes. I agree that this is a fatal flaw in my suggestion.

Perhaps there is no syntax that can be done inside the bounds of ASCII
that will please everyone and not break existing code.
Your suggestion of Unicode makes a lot of sense. There are glyphs for
math operators, and if Python can accept Unicode source files, that seems
to me like a much better solution than hacks involving ASCII characters.

I didn't notice it before, but PEP 263 allows Python source files to be
Unicode:

http://www.python.org/peps/pep-0263.html

So the latest versions of Python already have support for Unicode source
files!
Could such Unicode sources be exported to ASCII for porting code to
platforms that don't allow Unicode Python files? Yes: just replace the
Unicode character with a symbol like __op__, where op is the operator.

Actually, that's a better syntax than the one I proposed, too:

__+__
# __add__ # this one's already in use, so not allowed
__outer*__
--
Steve R. Hastings "Vita est"
st***@hastings.org http://www.blarg.net/~steveha

Nov 23 '05 #7

P: n/a
On Tue, Nov 22, 2005 at 04:08:41PM -0800, Steve R. Hastings wrote:
Actually, that's a better syntax than the one I proposed, too:

__+__
# __add__ # this one's already in use, so not allowed
__outer*__


Again, this means something already.
>>> __ = 3
>>> __+__
6
>>> __outer = 'x'
>>> __outer*__
'xxx'

Jeff


Nov 23 '05 #8

P: n/a
On Tue, 22 Nov 2005, Steve R. Hastings wrote:
User-defined operators could be defined like the following: ]+[
Eeek. That really doesn't look right.

Could you remind me of the reason we can't say [+]? It seems to me that an
operator can never be a legal filling for an array literal or a subscript,
so there wouldn't be ambiguity.

We could even just say that [?] is an array version of whatever operator ?
is, and let python do the heavy lifting (excuse the pun) of looping it
over the operands. [[?]] would obviously be a doubly-lifted version.
Although that would mean [*] is a componentwise product, rather than an
outer product, which wouldn't really help you very much! Maybe we could
define {?} as the generalised outer/tensor version of the ? operator ...
For improved readability, Python could even enforce a requirement that
there should be white space on either side of a user-defined operator. I
don't really think that's necessary.
Indeed, it would be extremely wrong - normal operators don't require that,
and special cases aren't special enough to break the rules.

Reminds me of my idea for using spaces instead of parentheses for grouping
in expressions, so a+b * c+d evaluates as (a+b)*(c+d) - one of my worst
ideas ever, i'd say, up there with gin milkshakes.
Also, there should be a way to declare what kind of precedence the
user-defined operators use.


Can't be done - different uses of the same operator symbol on different
classes could have different precedence, right? So python would need to
know what the class of the receiver is before it can work out the
evaluation order of the expression; python does evaluation order at
compile time, but only knows classes at execute time, so no dice.
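
The compile-time point can be seen with the standard `ast` module: the parser fixes the grouping of an expression from the operator symbols alone, before any operand has a class. A small sketch; `shape` is a helper name invented for this illustration.

```python
import ast


def shape(expr):
    """Describe the grouping the parser chose, using operator class names."""
    node = ast.parse(expr, mode="eval").body

    def walk(n):
        if isinstance(n, ast.BinOp):
            return (type(n.op).__name__, walk(n.left), walk(n.right))
        return n.id  # operands in these examples are plain names

    return walk(node)


print(shape("a + b * c"))   # ('Add', 'a', ('Mult', 'b', 'c'))
print(shape("a * b + c"))   # ('Add', ('Mult', 'a', 'b'), 'c')
```

Neither `a`, `b` nor `c` exists when the tree is built, so a per-class precedence declaration would arrive too late to influence it.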

Also, i'm pretty sure you could cook up a situation where you could
exploit differing precedences of different definitions of one symbol to
generate ambiguous cases, but i'm not in a twisted enough mood to actually
work out a concrete example!

And now for something completely different.

For Py4k, i think we should allow any sequence of characters that doesn't
mean something else to be an operator, supported with one special method
to rule them all, __oper__(self, ator, and), so:

a + b

Becomes:

a.__oper__("+", b)

And:

a --{--@ b

Becomes:

a.__oper__("--{--@", b) # Euler's 'single rose' operator

Etc. We need to be able to distinguish a + -b from a +- b, but this is
where i can bring my grouping-by-whitespace idea into play, requiring
whitespace separating operands and operators - after all, if it's good
enough for grouping statements (as it evidently is at present), it's good
enough for expressions. The character ']' would be treated as whitespace,
so a[b] would be handled as a.__oper__("[", b). Naturally, the . operator
would also be handled through __oper__.
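
Tongue-in-cheek as it is, the single-method dispatch can be approximated today for the operators Python already parses. A sketch under stated assumptions: the base class `Oper` and the subclass `Logged` are invented names, and the third parameter is spelled `and_` because `and` is a reserved word.

```python
class Oper(object):
    """Base class routing a few standard operators through __oper__."""

    def __add__(self, other):
        return self.__oper__("+", other)

    def __mul__(self, other):
        return self.__oper__("*", other)

    def __getitem__(self, key):
        return self.__oper__("[", key)


class Logged(Oper):
    def __init__(self, name):
        self.name = name

    def __oper__(self, ator, and_):
        # Report which operator was routed here and with what operand.
        return "%s %s %r" % (self.name, ator, and_)


a = Logged("a")
print(a + 1)   # a + 1
print(a[2])    # a [ 2
```

Arbitrary symbol sequences such as `--{--@` would still need tokenizer support; this only covers the forwarding half of the idea.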

Jeff Epler's proposal to use unicode operators would synergise most
excellently with this, allowing python to finally reach, and even surpass,
the level of expressiveness found in languages such as perl, APL and
INTERCAL.

tom

--
I DO IT WRONG!!!
Nov 23 '05 #9

P: n/a
Tom Anderson wrote:
Jeff Epler's proposal to use unicode operators would synergise most
excellently with this, allowing python to finally reach, and even surpass,
the level of expressiveness found in languages such as perl, APL and
INTERCAL.

tom

What do you mean by unicode operators? Link?
Nov 23 '05 #10

P: n/a
On Tue, 22 Nov 2005 je****@unpythonic.net wrote:
Each unicode character in the class 'Sm' (Symbol,
Math) whose value is greater than 127 may be used as a user-defined operator.
EXCELLENT idea, Jeff!
Also, to accommodate operators such as u'\N{DOUBLE INTEGRAL}', which are not
simple unary or binary operators, the character u'\N{NO BREAK SPACE}' will be
used to separate arguments. When necessary, parentheses will be added to
remove ambiguity. This leads naturally to expressions like
\N{DOUBLE INTEGRAL} (y * x**2) \N{NO BREAK SPACE} dx \N{NO BREAK SPACE} dy
(corresponding to the call (y*x**2).__u222c__(dx, dy)) which are clearly easy
to love, except for the small issue that many inferior editors will not clearly
display the \N{NO BREAK SPACE} characters.
Could we use '\u2202' instead of 'd'? Or, to be more correct, is there a
d-which-is-not-a-d somewhere in the mathematical character sets? It would
be very useful to be able to distinguish d'x', as it were, from 'dx'.
* Do we immediately implement the combination of operators with nonspacing
marks, or defer it?
As long as you don't use normalisation form D, i'm happy.
* Should some of the unicode mathematical symbols be reserved for literals?
It would be greatly preferable to write \u2205 instead of the other proposed
empty-set literal notation, {-}. Perhaps nullary operators could be defined,
so that writing \u2205 alone is the same as __u2205__() i.e., calling the
nullary function, whether it is defined at the local, lexical, module, or
built-in scope.


Sounds like a good idea. \u211D and relatives would also be a candidate
for this treatment.

And for those of you out there who are laughing at this, i'd point out
that Perl IS ACTUALLY DOING THIS.

tom

--
I DO IT WRONG!!!
Nov 23 '05 #11

P: n/a
Joseph Garvin wrote:
Jeff Epler's proposal to use unicode operators would synergise most
excellently with this, allowing python to finally reach, and even surpass,
the level of expressiveness found in languages such as perl, APL and
INTERCAL.

What do you mean by unicode operators? Link?


a few messages earlier in the thread you're posting to. if your mail or news
provider is dropping messages, you can read the group via e.g.

http://news.gmane.org/gmane.comp.python.general

jeff's proposal is here:

http://article.gmane.org/gmane.comp....general/433247

</F>

Nov 23 '05 #12

P: n/a
On 23/11/05, Joseph Garvin <k0*****@kzoo.edu> wrote:
What do you mean by unicode operators? Link?


http://fishbowl.pastiche.org/2003/03...d_operator_set

--
Cheers,
Simon B,
si***@brunningonline.net,
http://www.brunningonline.net/simon/blog/
Nov 23 '05 #13

P: n/a
On 23/11/05, Fredrik Lundh <fr*****@pythonware.com> wrote:
see also:

http://www.brunningonline.net/simon/...es/000666.html
http://www.python.org/peps/pep-0666.html


PEP 666 should have been left open. There are a number of ideas that
come up here that should be added to it - and i'm sure there'll be
more.

--
Cheers,
Simon B,
si***@brunningonline.net,
http://www.brunningonline.net/simon/blog/
Nov 23 '05 #15

P: n/a
Joseph Garvin wrote:
Tom Anderson wrote:
Jeff Epler's proposal to use unicode operators would synergise most
excellently with this, allowing python to finally reach, and even
surpass, the level of expressiveness found in languages such as perl,
APL and INTERCAL.


s/expressiveness/unreadability/
--
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in 'o****@xiludom.gro'.split('@')])"
Nov 23 '05 #16

P: n/a
Steve R. Hastings wrote:
It should be possible to define operators using punctuation,
alphanumerics, or both:

]+[
]add[
]outer*[


It seems you are looking for advanced source-code editors. Some ideas have
been around for quite a while, e.g. here

http://en.wikipedia.org/wiki/Intentional_programming

I'm not sure if current computer algebra systems also offer a WYSIWYG
input mode? Of course this is not clutter and line noise but domain
specific standard notation.

There has also been a more Python related ambitious multi-language
project called Logix that enabled user-defined operators but it seems
to be dead.

Kay

Nov 23 '05 #17

P: n/a
Op 2005-11-22, je****@unpythonic.net schreef <je****@unpythonic.net>:
* Should some of the unicode mathematical symbols be reserved for literals?
It would be greatly preferable to write \u2205 instead of the other proposed
empty-set literal notation, {-}. Perhaps nullary operators could be defined,
so that writing \u2205 alone is the same as __u2205__() i.e., calling the
nullary function, whether it is defined at the local, lexical, module, or
built-in scope.


Isn't this essentially already happening with lists?

And isn't something like this already possible with properties, except
for the scoping?

If python would develop the property idea a bit further and have
variables that would call a function each time they are accessed,
something like this could work.
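
A minimal sketch of the property comparison, assuming a hypothetical class `Namespace` with an attribute `empty` standing in for the proposed empty-set symbol. As noted above, this works only for attribute access on an object, not for a bare name in local or module scope.

```python
class Namespace(object):
    @property
    def empty(self):
        # Runs on every attribute access, so each caller gets a new set.
        return set()


ns = Namespace()
first = ns.empty
second = ns.empty
first.add(1)
print(first, second)   # {1} set()

# The [] display behaves the same way: each evaluation builds a new list.
x = []
y = []
print(x is y)          # False
```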

--
Antoon Pardon
Nov 24 '05 #18
