# boolean operations on sets

Hi, I have been playing with set operations lately and came across a
kind of surprising result given that it is not mentioned in the
standard Python tutorial:

With Python sets, intersections and unions are supposed to be done
like this:
In [7]:set('casa') & set('porca')
Out[7]:set(['a', 'c'])

In [8]:set('casa') | set('porca')
Out[8]:set(['a', 'c', 'o', 'p', 's', 'r'])

and they work correctly. Now what is confusing is that if you do:

In [5]:set('casa') and set('porca')
Out[5]:set(['a', 'p', 'c', 'r', 'o'])

In [6]:set('casa') or set('porca')
Out[6]:set(['a', 'c', 's'])

The results are not what you would expect from an AND or OR
operation, from the mathematical point of view! Apparently the "and"
operation is returning the second set, and the "or" operation is
returning the first.

If Python developers wanted these operations to reflect the
traditional (Python) truth value for data structures (False for empty
data structures and True otherwise), why not simply return True or
False?

So my question is: Why has this been implemented in this way? I can
see this confusing many newbies...

Aug 6 '07 #1
Flavio wrote:
> [...]
> So my question is: Why has this been implemented in this way? I can
> see this confusing many newbies...

It has been implemented this way to conform to the definitions of
"and" and "or", which were never intended to perform set
operations. These operators have always returned one of their
operands, and they continue to do so with set operands.
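
As a minimal sketch of that distinction (written for Python 3, where sets print as `{...}` rather than `set([...])`; the variable names are only illustrative), the following checks that `and`/`or` hand back one of their operands unchanged, while `&`/`|` build new sets:

```python
casa = set("casa")
porca = set("porca")

# `and` returns the first falsy operand, otherwise the last operand.
and_result = casa and porca
# `or` returns the first truthy operand, otherwise the last operand.
or_result = casa or porca

print(and_result is porca)   # True: the very same object as the second operand
print(or_result is casa)     # True: the very same object as the first operand

# The set operators compute new sets instead.
print(sorted(casa & porca))  # ['a', 'c']
print(sorted(casa | porca))  # ['a', 'c', 'o', 'p', 'r', 's']

# With an empty (falsy) first operand the roles flip.
empty = set()
print((empty and porca) is empty)  # True
print((empty or porca) is porca)   # True
```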

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden

Aug 6 '07 #2
On Monday 06 August 2007, Flavio wrote:
> So my question is: Why has this been implemented in this way? I can
> see this confusing many newbies...

I did not implement this, so I cannot say, but it does have useful
side-effects, for example:

x = A or B

is equivalent to:

if A:
    x = A
else:
    x = B

Also, in Python implementations without the (y if x else z) syntax, you can
use (x and y or z) with nearly the same result*. This implementation of
and/or might well be faster, too ;-)

*: this doesn't work the same if y is a false value; (x and [y] or [z])[0] is
less readable, but works for all y
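
The corner case in the footnote can be seen in a short sketch. The helper names `and_or`, `and_or_safe` and `conditional` are hypothetical, introduced only for this comparison; note that wrapping the idiom in a function evaluates all arguments eagerly, so only the selection logic is being compared here, not short-circuiting.

```python
def and_or(cond, then, default):
    # The old idiom: picks `default` by mistake whenever `then` is falsy.
    return cond and then or default


def and_or_safe(cond, then, default):
    # One-element lists are always truthy, so the right branch is chosen
    # and then unwrapped with [0].
    return (cond and [then] or [default])[0]


def conditional(cond, then, default):
    # The conditional expression, available since Python 2.5.
    return then if cond else default


print(and_or(True, 5, 99))       # 5
print(and_or(True, 0, 99))       # 99 (the corner case: 0 is falsy)
print(and_or_safe(True, 0, 99))  # 0
print(conditional(True, 0, 99))  # 0
```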

--
Regards, Thomas Jollans

Aug 6 '07 #3
Flavio wrote:
> [...]
> So my question is: Why has this been implemented in this way? I can
> see this confusing many newbies...

It has nothing to do with sets - it stems from the fact that certain values
in Python are considered false, and all others true. These semantics
were introduced at a point when there was no explicit True/False, so the
operators were defined in exactly the way you observed.

Consider this:

"foo" or "bar" -> "foo"

So - nothing to do with sets.

Diez
Aug 6 '07 #4
On Mon, 06 Aug 2007 14:13:51 +0000, Flavio wrote:
> [...]
> The results are not what you would expect from an AND or OR operation,
> from the mathematical point of view! Apparently the "and" operation is
> returning the second set, and the "or" operation is returning the
> first.

That may be because `and` and `or` are not mathematical operators in
Python (at least not in the sense you have in mind). The symbolic
operators, e.g. `|` and `&`, are overloadable, so a type can give them
any meaning it wants.

The `and` and `or` operators, though, are built into the language, and
there is no way to make them behave differently from their default.
Removing this behaviour, or making them overloadable as well, has been
discussed, but it hasn't got far, as far as I remember.
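
A rough sketch of that difference, assuming Python 3 (where the truth-value hook is `__bool__`; Python 2 called it `__nonzero__`). The `Tags` class is hypothetical and exists only for illustration: it overloads `&` and `|`, but the most it can do for `and`/`or` is decide whether an instance counts as true.

```python
class Tags:
    """Hypothetical wrapper used only to illustrate operator overloading."""

    def __init__(self, *names):
        self.names = frozenset(names)

    def __and__(self, other):
        # Invoked for `self & other`.
        return Tags(*(self.names & other.names))

    def __or__(self, other):
        # Invoked for `self | other`.
        return Tags(*(self.names | other.names))

    def __bool__(self):
        # Consulted by `and`, `or`, `not` and `if`; it cannot change
        # which operand `and`/`or` return, only how truth is decided.
        return bool(self.names)


a = Tags("x", "y")
b = Tags("y", "z")

print(sorted((a & b).names))  # ['y']
print(sorted((a | b).names))  # ['x', 'y', 'z']
print((a and b) is b)         # True: `and` still returns an operand
print((Tags() or b) is b)     # True: an empty Tags is falsy
```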

> If Python developers wanted these operations to reflect the traditional
> (Python) truth value for data structures: False for empty data
> structures and True otherwise, why not return simply True or False?

Because in most cases, returning True or False simply has no
advantage. Returning the actual operands, on the other hand, has been
quite useful, e.g. as a stand-in for the if expression ("ternary
operator"): ``COND and THEN or DEFAULT`` instead of ``THEN if COND else
DEFAULT`` (although the former has some bad corner cases).

> So my question is: Why has this been implemented in this way? I can see
> this confusing many newbies...

Hmm, you could be right there. But then they shouldn't be biased by
default boolean behaviour anyway.
Aug 6 '07 #5
In article <5h*************@mid.uni-berlin.de>,
"Diez B. Roggisch" <de***@nospam.web.de> wrote:
> Flavio wrote:
>> Hi, I have been playing with set operations lately and came across a
>> kind of surprising result given that it is not mentioned in the
>> standard Python tutorial:
>>
>> With Python sets, intersections and unions are supposed to be done
>> like this:
>> [...]
>>
>> The results are not what you would expect from an AND or OR
>> operation, from the mathematical point of view! Apparently the "and"
>> operation is returning the second set, and the "or" operation is
>> returning the first.
>> [...]
>
> It has nothing to do with sets - it stems from the fact that certain values
> in Python are considered false, and all others true. And these semantics
> were introduced at a point where there was no explicit True/False, so the
> operators were defined in exactly the way you observed.
>
> Consider this:
>
> "foo" or "bar" -> "foo"
>
> So - nothing to do with sets.

In addition to what Diez wrote above, it is worth noting that the
practice of returning the value of the determining expression turns out
to be convenient for the programmer in some cases. Consider the
following example:

x = some_function(a, b, c) or another_function(d, e)

This is a rather nice shorthand notation for the following behaviour:

t = some_function(a, b, c)
if t:
    x = t
else:
    x = another_function(d, e)

In other words, the short-circuit behaviour of the logical operators
gives you a compact notation for evaluating certain types of conditional
expressions and capturing their values. If the "or" operator converted
the result to True or False, you could not use it this way.

Similarly,

x = some_function(a, b, c) and another_function(d, e)

... behaves as if you had written:

x = some_function(a, b, c)
if x:
    x = another_function(d, e)

Again, as above, if the results were forcibly converted to Boolean
values, you could not use the shorthand.
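
The two shorthands can be traced with a small sketch. The bodies of `some_function` and `another_function` below are stand-ins invented for illustration (they take no arguments and merely record that they ran), so that the short-circuiting becomes visible:

```python
calls = []


def some_function():
    # Stand-in that records each invocation and returns a falsy value.
    calls.append("some_function")
    return []


def another_function():
    # Stand-in fallback that returns a truthy value.
    calls.append("another_function")
    return ["fallback"]


# `or`: the left operand is falsy, so the right one is evaluated and returned.
x = some_function() or another_function()
print(x)      # ['fallback']
print(calls)  # ['some_function', 'another_function']

calls.clear()

# `and`: the left operand is falsy, so it is returned as-is and the
# right operand is never evaluated.
y = some_function() and another_function()
print(y)      # []
print(calls)  # ['some_function']
```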

Now that Python provides an expression variety of "if", this is perhaps
not as useful as it once was; however, it still has a role. Suppose,
for example, that a call to some_function() is very time-consuming; you
would not want to write:

x = some_function(a, b, c) \
    if some_function(a, b, c) else another_function(d, e)

... because then some_function would get evaluated twice. Python does
not permit assignment within an expression, so you can't get rid of the
second call without changing the syntax.
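
That restriction held when this was written. Assuming Python 3.8 or later, assignment expressions (the `:=` operator) allow the conditional form to call the function only once. A minimal sketch, again with invented stand-in functions:

```python
calls = []


def some_function():
    # Hypothetical stand-in for an expensive call; records each invocation.
    calls.append("some_function")
    return 0  # a falsy result


def another_function():
    return 42


# The condition is evaluated first and binds `t`, so some_function()
# runs exactly once before either branch is chosen.
x = t if (t := some_function()) else another_function()

print(x)           # 42
print(len(calls))  # 1
```

For this particular pattern, `some_function() or another_function()` remains the shorter spelling.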

Also, it is a common behaviour in many programming languages for logical
connectives to both short-circuit and yield their values, so I'd argue
that most programmers are probably accustomed to it. The && and ||
operators of C and its descendants also behave in this manner, as do the
AND and OR of Lisp or Scheme. It is possible that beginners may find it
a little bit confusing at first, but I believe such confusion is minor
and easily remedied.

Cheers,
-M

--
Michael J. Fromberger | Lecturer, Dept. of Computer Science
http://www.dartmouth.edu/~sting/ | Dartmouth College, Hanover, NH, USA
Aug 6 '07 #6
Michael J. Fromberger <Mi******************@Clothing.Dartmouth.EDU>
wrote:
> ...
> Also, it is a common behaviour in many programming languages for logical
> connectives to both short-circuit and yield their values, so I'd argue
> that most programmers are probably accustomed to it. The && and ||
> operators of C and its descendants also behave in this manner, as do the

Untrue, alas...:

brain:~ alex$ cat a.c
#include <stdio.h>

int main()
{
    printf("%d\n", 23 && 45);
    return 0;
}
brain:~ alex$ gcc a.c
brain:~ alex$ ./a.out
1

In C, && and || _do_ "short circuit", BUT they always return 0 or 1,
*NOT* "yield their values" (interpreted as "return the false or true
value of either operand", as in Python).
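
For comparison, a Python counterpart of the C program above might look like the sketch below (the variable name `value` is illustrative). The C result of 1 corresponds to what Python gives only after an explicit conversion with `bool()`:

```python
value = 23 and 45

print(value)             # 45: the second operand itself
print(bool(value))       # True
print(int(bool(value)))  # 1, matching what the C program prints

print(0 and 45)          # 0: the falsy first operand is returned
print(0 or 45)           # 45
```
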
Alex

Aug 7 '07 #7
Flavio wrote:
> [...]
> The results are not what you would expect from an AND or OR
> operation, from the mathematical point of view! Apparently the "and"
> operation is returning the second set, and the "or" operation is
> returning the first.

The semantics of the 'and' and 'or' operators in Python are well defined
and work the same for all types, AFAIK.

> If Python developers wanted these operations to reflect the
> traditional (Python) truth value for data structures: False for empty
> data structures and True otherwise, why not return simply True or
> False?
>
> So my question is: Why has this been implemented in this way?

Because Python long lived without the 'bool' type - considering None,
numeric zero, empty string and empty containers as false (i.e.
'nothing'), and anything else as true (i.e. 'something').
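
A quick way to check the "nothing" versus "something" rule is to pass sample values through `bool()`. The lists below are illustrative samples chosen for this sketch, not an exhaustive catalogue:

```python
falsy_values = [None, False, 0, 0.0, "", [], (), {}, set()]
truthy_values = [True, 1, -1, "0", [0], (None,), {"k": None}, {0}]

for value in falsy_values:
    print(repr(value), "->", bool(value))   # each line ends in False

for value in truthy_values:
    print(repr(value), "->", bool(value))   # each line ends in True

# Because `or` returns the first truthy operand, it can pick the first
# "something" out of a chain of candidates.
name = "" or None or "anonymous"
print(name)  # anonymous
```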

> I can
> see this confusing many newbies...

Yes, and this has been one of the arguments against the introduction of
the bool type. Changing this behaviour would have broken a lot of existing
code, and indeed, not changing it makes things confusing.

OTOH - and while I agree that there may be cases of useless complexity
in Python - stripping a language of anything that might confuse a
newbie doesn't make for a great language.
Aug 7 '07 #8
