470,810 Members | 1,495 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 470,810 developers. It's quick & easy.

Python Module Exposure

I have created what I think may be a useful Python module, but I'd like
to share it with the Python community to get feedback, i.e. if it's
Pythonic. If it's considered useful by Pythonistas, I'll see about
hosting it on Sourceforge or something like that. Is this a good forum
for exposing modules to the public, or is there somewhere
more-acceptable? Does this newsgroup find attachments acceptable?

--
Jacob
Jul 21 '05 #1
17 1902
Jacob Page wrote:
I have created what I think may be a useful Python module, but I'd like
to share it with the Python community to get feedback, i.e. if it's
Pythonic. If it's considered useful by Pythonistas, I'll see about
hosting it on Sourceforge or something like that. Is this a good forum
for exposing modules to the public, or is there somewhere
more-acceptable? Does this newsgroup find attachments acceptable?


No. Please put files somewhere on the web and post a URL. This would be
a good forum to informally announce and discuss your module. Formal
announcements once you, e.g. put it on SF should go to c.l.py.announce .

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Jul 21 '05 #2
Robert Kern wrote:
Jacob Page wrote:
I have created what I think may be a useful Python module, but I'd
like to share it with the Python community to get feedback, i.e. if
it's Pythonic. If it's considered useful by Pythonistas, I'll see
about hosting it on Sourceforge or something like that. Is this a
good forum for exposing modules to the public, or is there somewhere
more-acceptable? Does this newsgroup find attachments acceptable?


No. Please put files somewhere on the web and post a URL. This would be
a good forum to informally announce and discuss your module. Formal
announcements once you, e.g. put it on SF should go to c.l.py.announce .


Thanks for the information, Robert. Anyway, here's my informal
announcement:

The iset module is a pure Python module that provides the ISet class and
some helper functions for creating them. Unlike Python sets, which are
sets of discrete values, an ISet is a set of intervals. An ISet could,
for example, stand for all values less than 0, all values from 2 up to,
but not including 62, or all values not equal to zero. ISets can also
pertain to non-numeric values.

ISets can be used in much the same way as sets. They can be or'ed,
and'ed, xor'ed, added, subtracted, and inversed. Membership testing is
done the same as with a set. The documentation has some examples of how
to create and use ISets.

The iset module is for Python 2.4 or later.

I am seeking feedback from programmers and mathematicians on how to
possibly make this module more user-friendly, better-named,
better-documented, better-tested, and more Pythonic. Then, if this
module is considered acceptable by the community, I'll create a more
permanent home for this project.

To download the iset module and view its pydoc-generated documentation,
please visit http://members.cox.net/apoco/iset/.
Jul 21 '05 #3
Robert Kern wrote:
Jacob Page wrote:
Does this newsgroup find attachments acceptable?


No. Please put files somewhere on the web and post a URL. This would be
a good forum to informally announce and discuss your module.


To add to what Robert said, keep in mind this newsgroup is also mirrored
to a mailing list, so posting anything but example code snippets would
quickly fill up people's inboxes.
Jul 21 '05 #4
Jacob Page wrote:
better-named,


Just a quick remark, without even having looked at it yet: the name is not
really descriptive and runs a chance of misleading people. The example I'm
thinking of is using zope.interface in the same project: it's customary to
name interfaces ISomething.

--
Thomas

Jul 21 '05 #5
Thomas Lotze wrote:
Jacob Page wrote:
better-named,


Just a quick remark, without even having looked at it yet: the name is not
really descriptive and runs a chance of misleading people. The example I'm
thinking of is using zope.interface in the same project: it's customary to
name interfaces ISomething.


I've thought of IntervalSet (which is a very long name), IntSet (might
be mistaken for integers), ItvlSet (doesn't roll off the fingers),
ConSet, and many others. Perhaps IntervalSet is the best choice out of
these. I'd love any suggestions.
Jul 21 '05 #6
Jacob Page wrote:
Thomas Lotze wrote:
Jacob Page wrote:
better-named,


Just a quick remark, without even having looked at it yet: the name is not
really descriptive and runs a chance of misleading people. The example I'm
thinking of is using zope.interface in the same project: it's customary to
name interfaces ISomething.


I've thought of IntervalSet (which is a very long name), IntSet (might
be mistaken for integers), ItvlSet (doesn't roll off the fingers),
ConSet, and many others. Perhaps IntervalSet is the best choice out of
these. I'd love any suggestions.

Hi Jacob,

I liked a lot the idea of interval sets; I wonder why no one has
thought of it before (hell, why *I* haven't thought about it before ?
:)) I haven't had a chance to look into the internals, so my remarks
now are only API-related and stylistic:

1. As already noted, ISet is not really descriptive of what the class
does. How about RangeSet ? It's not that long and I find it pretty
descriptive. In this case, it would be a good idea to change Interval
to Range to make the association easier.

2. The module's "helper functions" -- which are usually called factory
functions/methods because they are essentially alternative constructors
of ISets -- would perhaps better be defined as classmethods of ISet;
that's a common way to define instance factories in python. Except for
'eq' and 'ne', the rest define an ISet of a single Interval, so they
would rather be classmethods of Interval. Also the function names could
take some improvement; at the very least they should not be 2-letter
abbreviations. Finally I would omit 'eq', an "interval" of a single
value; single values can be given in ISet's constructor anyway. Here's
a different layout:

class Range(object): # Interval

@classmethod
def lowerThan(cls, value, closed=False):
# lt, for closed==False
# le, for closed==True
return cls(None, False, value, closed)

@classmethod
def greaterThan(cls, value, closed=False):
# gt, for closed==False
# ge, for closed==True
return cls(value, closed, None, False)

@classmethod
def between(cls, min, max, closed=False):
# exInterval, for closed==False
# incInterval, for closed==True
return cls(min, closed, max, closed)
class RangeSet(object): # ISet

@classmethod
def allExcept(cls, value): # ne
return cls(Range.lowerThan(value), Range.greaterThan(value))

3. Having more than one names for the same thing is good to be avoided
in general. So keep either "none" or "empty" (preferably "empty" to
avoid confusion with None) and remove ISet.__add__ since it is synonym
to ISet.__or__.

4. Intervals shouldn't be comparable; the way __cmp__ works is
arbitrary and not obvious.

5. Interval.adjacentTo() could take an optional argument to control
whether an endpoint is allowed to be in both ranges or not (e.g.
whether (1,3], [3, 7] are adjacent or not).

6. Possible ideas in your TODO list:
- Implement the whole interface of sets for ISet's so that a client
can use either or them transparently.
- Add an ISet.remove() for removing elements, Intervals, ISets as
complementary to ISet.append().
- More generally, think about mutable vs immutable Intervals and
ISets. The sets module in the standard library will give you a good
idea of the relevant design and implementation issues.

After I look into the module's internals, I'll try to make some changes
and send it back to you for feedback.

Regards,
George

Jul 21 '05 #7
Jacob Page schrieb:
I have created what I think may be a useful Python module, but I'd like
to share it with the Python community to get feedback, i.e. if it's
Pythonic. If it's considered useful by Pythonistas, I'll see about
hosting it on Sourceforge or something like that. Is this a good forum
for exposing modules to the public, or is there somewhere
more-acceptable? Does this newsgroup find attachments acceptable?

--
Jacob


One side-question: has anyone made experiences in hosting his open
source project on

http://www.python-hosting.com/

Regards,
Kay

Jul 21 '05 #8
George Sakkis wrote:
1. As already noted, ISet is not really descriptive of what the class
does. How about RangeSet ? It's not that long and I find it pretty
descriptive. In this case, it would be a good idea to change Interval
to Range to make the association easier.
The reason I decided not to use the term Range was because that could be
confused with the range function, which produces a discrete set of
integers. Interval isn't a term used in the standard/built-in library,
AFAIK, so I may stick with it. Is this sound reasoning?
2. The module's "helper functions" -- which are usually called factory
functions/methods because they are essentially alternative constructors
of ISets -- would perhaps better be defined as classmethods of ISet;
that's a common way to define instance factories in python. Except for
'eq' and 'ne', the rest define an ISet of a single Interval, so they
would rather be classmethods of Interval. Also the function names could
take some improvement; at the very least they should not be 2-letter
abbreviations. Finally I would omit 'eq', an "interval" of a single
value; single values can be given in ISet's constructor anyway. Here's
a different layout:
First, as far as having the factory functions create Interval instances
(Range instances in your examples), I was hoping to avoid Intervals
being publically "exposed." I think it's cleaner if the end-programmer
only has to deal with one object interface.

I like the idea of lengthening the factory function names, but I'm not
sure that the factory methods would be better as class methods.
Consider a use-case:
import iset
interval = iset.ISet.lowerThan(4)
as opposed to:
import iset
interval = iset.lowerThan(4)


This is less typing and probably eliminates some run-time overhead. Can
you list any advantages of making the factory functions class methods?
class Range(object): # Interval

@classmethod
def lowerThan(cls, value, closed=False):
# lt, for closed==False
# le, for closed==True
return cls(None, False, value, closed)

@classmethod
def greaterThan(cls, value, closed=False):
# gt, for closed==False
# ge, for closed==True
return cls(value, closed, None, False)

@classmethod
def between(cls, min, max, closed=False):
# exInterval, for closed==False
# incInterval, for closed==True
return cls(min, closed, max, closed)
I like the function names, but in my dialect, lessThan would be more proper.
class RangeSet(object): # ISet

@classmethod
def allExcept(cls, value): # ne
return cls(Range.lowerThan(value), Range.greaterThan(value))

3. Having more than one names for the same thing is good to be avoided
in general. So keep either "none" or "empty" (preferably "empty" to
avoid confusion with None) and remove ISet.__add__ since it is synonym
to ISet.__or__.
I agree that "none" should be removed. However, some programmers may
prefer to use +, some |, so I'd like to provide both.
4. Intervals shouldn't be comparable; the way __cmp__ works is
arbitrary and not obvious.
I agree that __cmp__ is being used arbitrarily. I wanted sorting to be
easy, but there's other ways of doing that. I'll take your suggestion.
5. Interval.adjacentTo() could take an optional argument to control
whether an endpoint is allowed to be in both ranges or not (e.g.
whether (1,3], [3, 7] are adjacent or not).
Interval objects weren't meant to be public; adjacentTo is used by ISet
for combinatory functions. I could look into ways to hide the class
from the public, but hiding things never seemed Pythonic to me; someone
may want to use them for some reason. Maybe pydoc can be made to not
document certain objects.
6. Possible ideas in your TODO list:
- Implement the whole interface of sets for ISet's so that a client
can use either or them transparently.
That was my intention. I'll have to see which interface functions I'm
missing.
- Add an ISet.remove() for removing elements, Intervals, ISets as
complementary to ISet.append().
- More generally, think about mutable vs immutable Intervals and
ISets. The sets module in the standard library will give you a good
idea of the relevant design and implementation issues.
Both good ideas.
After I look into the module's internals, I'll try to make some changes
and send it back to you for feedback.


Thanks for your feedback, George. I look forward to any additional
comments you come up with.
Jul 21 '05 #9
"Jacob Page" <ap***@cox.net> wrote:
George Sakkis wrote:
1. As already noted, ISet is not really descriptive of what the class
does. How about RangeSet ? It's not that long and I find it pretty
descriptive. In this case, it would be a good idea to change Interval
to Range to make the association easier.
The reason I decided not to use the term Range was because that could be
confused with the range function, which produces a discrete set of
integers. Interval isn't a term used in the standard/built-in library,
AFAIK, so I may stick with it. Is this sound reasoning?


Yes, it is not unreasonable; I can't argue strongly against Interval.
Still I'm a bit more in favor of Range and I don't think it is
particularly confusing with range() because:
1. Range has to be either qualified with the name of the package (e.g.
rangesets.Range) or imported as "from rangesets import Range", so one
cannot mistake it for the range builtin.
2. Most popular naming conventions use lower first letter for functions
and capital for classes.
3. If you insist that only RangeSet should be exposed from the module's
interface and Range (Interval) should be hidden, the potential conflict
between range() and RangeSet is even less.
2. The module's "helper functions" -- which are usually called factory
functions/methods because they are essentially alternative constructors
of ISets -- would perhaps better be defined as classmethods of ISet;
that's a common way to define instance factories in python. Except for
'eq' and 'ne', the rest define an ISet of a single Interval, so they
would rather be classmethods of Interval. Also the function names could
take some improvement; at the very least they should not be 2-letter
abbreviations. Finally I would omit 'eq', an "interval" of a single
value; single values can be given in ISet's constructor anyway. Here's
a different layout:


First, as far as having the factory functions create Interval instances
(Range instances in your examples), I was hoping to avoid Intervals
being publically "exposed." I think it's cleaner if the end-programmer
only has to deal with one object interface.


First off, the python convention for names you intend to be 'private'
is to prefix them with a single underscore, i.e. _Interval, so it was
not obvious at all by reading the documentation that this was your
intention. Assuming that Interval is to be exposed, I found
Interval.lowerThan(5) a bit more intuitive than
IntervalSet.lowerThan(5). The only slight problem is the 'ne'/
allExcept factory which doesn't return a continuous interval and
therefore cannot be a classmethod in Interval.

On whether Interval should be exposed or not: I believe that interval
is a useful abstraction by itself and has the important property of
being continuous, which IntervalSet doesn't. Having a simple
single-class interface is a valid argument, but it may turn out to be
restricted later. For example, I was thinking that a useful method of
IntervalSet would be an iterator over its Intervals:
for interval in myIntervalSet:
print interval.min, interval.max
There are several possible use cases where dealing directly with
intervals would be appropriate or necessary, so it's good to have them
supported directly by the module.
I like the idea of lengthening the factory function names, but I'm not
sure that the factory methods would be better as class methods.
Consider a use-case:
>>> import iset
>>> interval = iset.ISet.lowerThan(4)
as opposed to:
>>> import iset
>>> interval = iset.lowerThan(4)
I was thinking along the lines of:
from rangesets import RangeSet, Range
mySet = RangeSet(Range.lowerThan(4), Range.greaterThan(5))

That's similar to "from sets import Set" before set became builtin.
This is less typing and probably eliminates some run-time overhead. Can
you list any advantages of making the factory functions class methods?


One of the main reason for introducing classmethods was to allow
alternate constructors that play well with subclasses. So if Interval
is subclassed, say for a FrozenInterval class,
FrozenInterval.lowerThan() returns FrozenInterval instances
automatically.
class Range(object): # Interval

@classmethod
def lowerThan(cls, value, closed=False):
# lt, for closed==False
# le, for closed==True
return cls(None, False, value, closed)

@classmethod
def greaterThan(cls, value, closed=False):
# gt, for closed==False
# ge, for closed==True
return cls(value, closed, None, False)

@classmethod
def between(cls, min, max, closed=False):
# exInterval, for closed==False
# incInterval, for closed==True
return cls(min, closed, max, closed)


I like the function names, but in my dialect, lessThan would be more proper.


Either (lowerThan,greaterThan) or (lessThan,moreThan) are fine with me;
I'm very slightly in favor of the second for brevity.
class RangeSet(object): # ISet

@classmethod
def allExcept(cls, value): # ne
return cls(Range.lowerThan(value), Range.greaterThan(value))

3. Having more than one names for the same thing is good to be avoided
in general. So keep either "none" or "empty" (preferably "empty" to
avoid confusion with None) and remove ISet.__add__ since it is synonym
to ISet.__or__.


I agree that "none" should be removed. However, some programmers may
prefer to use +, some |, so I'd like to provide both.


Interval.__add__ is ok, but IntervalSet should be as close to regular
sets as possible; anyway, that's a minor point.
4. Intervals shouldn't be comparable; the way __cmp__ works is
arbitrary and not obvious.


I agree that __cmp__ is being used arbitrarily. I wanted sorting to be
easy, but there's other ways of doing that. I'll take your suggestion.
5. Interval.adjacentTo() could take an optional argument to control
whether an endpoint is allowed to be in both ranges or not (e.g.
whether (1,3], [3, 7] are adjacent or not).


Interval objects weren't meant to be public; adjacentTo is used by ISet
for combinatory functions. I could look into ways to hide the class
from the public, but hiding things never seemed Pythonic to me; someone
may want to use them for some reason. Maybe pydoc can be made to not
document certain objects.


I mentioned already the arguments for keeping it public. If you insist
though, several tools recognize the single underscore convention as
meaning 'private'; epydoc does, I don't know about pydoc. Also, "from
module import *" imports only the top level names that don't start with
an undescore.
6. Possible ideas in your TODO list:
- Implement the whole interface of sets for ISet's so that a client
can use either or them transparently.


That was my intention. I'll have to see which interface functions I'm
missing.
- Add an ISet.remove() for removing elements, Intervals, ISets as
complementary to ISet.append().
- More generally, think about mutable vs immutable Intervals and
ISets. The sets module in the standard library will give you a good
idea of the relevant design and implementation issues.


Both good ideas.
After I look into the module's internals, I'll try to make some changes
and send it back to you for feedback.


Thanks for your feedback, George. I look forward to any additional
comments you come up with.


Regards,
George

Jul 21 '05 #10
George Sakkis wrote:
"Jacob Page" <ap***@cox.net> wrote:
George Sakkis wrote:
1. As already noted, ISet is not really descriptive of what the class
does. How about RangeSet ? It's not that long and I find it pretty
descriptive. In this case, it would be a good idea to change Interval
to Range to make the association easier.
The reason I decided not to use the term Range was because that could be
confused with the range function, which produces a discrete set of
integers. Interval isn't a term used in the standard/built-in library,
AFAIK, so I may stick with it. Is this sound reasoning?


Yes, it is not unreasonable; I can't argue strongly against Interval.
Still I'm a bit more in favor of Range and I don't think it is
particularly confusing with range() because:
1. Range has to be either qualified with the name of the package (e.g.
rangesets.Range) or imported as "from rangesets import Range", so one
cannot mistake it for the range builtin.
2. Most popular naming conventions use lower first letter for functions
and capital for classes.
3. If you insist that only RangeSet should be exposed from the module's
interface and Range (Interval) should be hidden, the potential conflict
between range() and RangeSet is even less.


Those are pretty good arguments, but after doing some poking around on
planetmath.org and reading
http://planetmath.org/encyclopedia/Interval.html, I've now settled on
Interval, since that seems to be the proper use of the term.
2. The module's "helper functions" -- which are usually called factory
functions/methods because they are essentially alternative constructors
of ISets -- would perhaps better be defined as classmethods of ISet;
that's a common way to define instance factories in python. Except for
'eq' and 'ne', the rest define an ISet of a single Interval, so they
would rather be classmethods of Interval. Also the function names could
take some improvement; at the very least they should not be 2-letter
abbreviations.


First, as far as having the factory functions create Interval instances
(Range instances in your examples), I was hoping to avoid Intervals
being publically "exposed." I think it's cleaner if the end-programmer
only has to deal with one object interface.


First off, the python convention for names you intend to be 'private'
is to prefix them with a single underscore, i.e. _Interval, so it was
not obvious at all by reading the documentation that this was your
intention. Assuming that Interval is to be exposed, I found
Interval.lowerThan(5) a bit more intuitive than
IntervalSet.lowerThan(5). The only slight problem is the 'ne'/
allExcept factory which doesn't return a continuous interval and
therefore cannot be a classmethod in Interval.


If the factories resided in separate classes, it seems like they might
be less convenient to use. I wanted these things to be easily
constructed. Maybe a good compromise is to implement lessThan and
greaterThan in both Interval and IntervalSet.
On whether Interval should be exposed or not: I believe that interval
is a useful abstraction by itself and has the important property of
being continuous, which IntervalSet doesn't.
Perhaps I should add a boolean function for IntervalSet called
continuous (isContinuous?).

Having a simple single-class interface is a valid argument, but it may turn out to be
restricted later. For example, I was thinking that a useful method of
IntervalSet would be an iterator over its Intervals:
for interval in myIntervalSet:
print interval.min, interval.max
I like the idea of allowing iteration over the Intervals.
There are several possible use cases where dealing directly with
intervals would be appropriate or necessary, so it's good to have them
supported directly by the module.


I think I will keep Interval exposed. It sort of raises a bunch of
hard-to-answer design questions having two class interfaces, though.
For example, would Interval.between(2, 3) + Interval.between(5, 7) raise
an error (as it currently does) because the intervals are disjoint or
yield an IntervalSet, or should it not even be implemented? How about
subtraction, xoring, and anding? An exposed class should have a more
complete interface.

I think that IntervalSet.between(5, 7) | IntervalSet.between(2, 3) is
more intuitive than IntervalSet(Interval.between(5, 7),
Interval.between(2, 3)), but I can understand the reverse. I think I'll
just support both.
I like the idea of lengthening the factory function names, but I'm not
sure that the factory methods would be better as class methods.


One of the main reason for introducing classmethods was to allow
alternate constructors that play well with subclasses. So if Interval
is subclassed, say for a FrozenInterval class,
FrozenInterval.lowerThan() returns FrozenInterval instances
automatically.


You've convinced me :). I sure hate to require the programmer to have
to type the class name when using a factory method, but I'd hate
providing a duo of frozen and non-frozen factories even more.
I like the function names, but in my dialect, lessThan would be more proper.


Either (lowerThan,greaterThan) or (lessThan,moreThan) are fine with me;
I'm very slightly in favor of the second for brevity.


Less than/greater than seem to be the usual way to pronounce < and >.
At least it matches the Python, which uses __lt__, __le__, __gt__, and
__ge__. I'll stick with lessThan and greaterThan.

Thanks for your input. I'm going to start implementing your
suggestions. I hope more people add feedback so I can make this an even
better module.
Jul 21 '05 #11
"Jacob Page" <ap***@cox.net> wrote:
George Sakkis wrote:
There are several possible use cases where dealing directly with
intervals would be appropriate or necessary, so it's good to have them
supported directly by the module.


I think I will keep Interval exposed. It sort of raises a bunch of
hard-to-answer design questions having two class interfaces, though.
For example, would Interval.between(2, 3) + Interval.between(5, 7) raise
an error (as it currently does) because the intervals are disjoint or
yield an IntervalSet, or should it not even be implemented? How about
subtraction, xoring, and anding? An exposed class should have a more
complete interface.

I think that IntervalSet.between(5, 7) | IntervalSet.between(2, 3) is
more intuitive than IntervalSet(Interval.between(5, 7),
Interval.between(2, 3)), but I can understand the reverse. I think I'll
just support both.


As I see it, there are two main options you have:

1. Keep Intervals immutable and pass all the responsibility of
combining them to IntervalSet. In this case Interval.__add__ would have
to go. This is simple to implement, but it's probably not the most
convenient to the user.

2. Give Interval the same interface with IntervalSet, at least as far
as interval combinations are concerned, so that Interval.between(2,3) |
Interval.greaterThan(7) returns an IntervalSet. Apart from being user
friendlier, an extra benefit is that you don't have to support
factories for IntervalSets, so I am more in favor of this option.

Another hard design problem is how to combine intervals when
inheritance comes to play. Say that you have FrozenInterval and
FrozenIntervalSet subclasses. What should "Interval.between(2,3) |
FrozenInterval.greaterThan(7)" return ? I ran into this problem for
another datastructure that I wanted it to support both mutable and
immutable versions. I was surprised to find out that:
s1 = set([1])
s2 = frozenset([2])
s1|s2 == s2|s1 True type(s1|s2) == type(s2|s1)

False

It make sense why this happens if you think about the implementation,
but I guess most users expect the result of a commutative operation to
be independent of the operands' order. In the worst case, this may lead
to some hard-to-find bugs.

George

Jul 21 '05 #12
George Sakkis wrote:
"Jacob Page" <ap***@cox.net> wrote:
I think I will keep Interval exposed. It sort of raises a bunch of
hard-to-answer design questions having two class interfaces, though.
For example, would Interval.between(2, 3) + Interval.between(5, 7) raise
an error (as it currently does) because the intervals are disjoint or
yield an IntervalSet, or should it not even be implemented? How about
subtraction, xoring, and anding? An exposed class should have a more
complete interface.

I think that IntervalSet.between(5, 7) | IntervalSet.between(2, 3) is
more intuitive than IntervalSet(Interval.between(5, 7),
Interval.between(2, 3)), but I can understand the reverse. I think I'll
just support both.
As I see it, there are two main options you have:

1. Keep Intervals immutable and pass all the responsibility of
combining them to IntervalSet. In this case Interval.__add__ would have
to go. This is simple to implement, but it's probably not the most
convenient to the user.

2. Give Interval the same interface with IntervalSet, at least as far
as interval combinations are concerned, so that Interval.between(2,3) |
Interval.greaterThan(7) returns an IntervalSet. Apart from being user
friendlier, an extra benefit is that you don't have to support
factories for IntervalSets, so I am more in favor of this option.


I selected option one; Intervals are immutable. However, this doesn't
mean that __add__ has to go, as that function has no side-effects. The
reason I chose option one was because it's uncommon for a mathematical
operation on two objects to return a different type altogether.
Another hard design problem is how to combine intervals when
inheritance comes to play. Say that you have FrozenInterval and
FrozenIntervalSet subclasses. What should "Interval.between(2,3) |
FrozenInterval.greaterThan(7)" return ?


For now, operations will return mutable instances. They can always be
frozen later if needs be.
Jul 21 '05 #13
"Jacob Page" <ap***@cox.net> wrote:
George Sakkis wrote:
As I see it, there are two main options you have:

1. Keep Intervals immutable and pass all the responsibility of
combining them to IntervalSet. In this case Interval.__add__ would have
to go. This is simple to implement, but it's probably not the most
convenient to the user.

2. Give Interval the same interface with IntervalSet, at least as far
as interval combinations are concerned, so that Interval.between(2,3) |
Interval.greaterThan(7) returns an IntervalSet. Apart from being user
friendlier, an extra benefit is that you don't have to support
factories for IntervalSets, so I am more in favor of this option.


I selected option one; Intervals are immutable. However, this doesn't
mean that __add__ has to go, as that function has no side-effects. The
reason I chose option one was because it's uncommon for a mathematical
operation on two objects to return a different type altogether.


Mathematically what you described corresponds to sets that are not
closed under some operation and it's not uncommon at all; a few
examples are:
- The set of integers is not closed under division: int / int ->
rational
- The set of real numbers is not closed under square root: sqrt(real)
-> complex
- The set of positive number is not closed under subtraction:
pos_number - pos_number -> number
- And yes, the set of intervals is not closed under union.

On another note, I noticed you use __contains__ both for membership and
is-subset queries. This is problematic in case of Intervals that have
other Intervals as members. The set builtin type uses __contains__ for
membership checks and issubset for subset checks (with __le__ as
synonym); it's good to keep the same interface.

George

Jul 21 '05 #14
George Sakkis wrote:
"Jacob Page" <ap***@cox.net> wrote:
I selected option one; Intervals are immutable. However, this doesn't
mean that __add__ has to go, as that function has no side-effects. The
reason I chose option one was because it's uncommon for a mathematical
operation on two objects to return a different type altogether.
Mathematically what you described corresponds to sets that are not
closed under some operation and it's not uncommon at all; a few
examples are:
- The set of integers is not closed under division: int / int ->
rational
- The set of real numbers is not closed under square root: sqrt(real)
-> complex
- The set of positive number is not closed under subtraction:
pos_number - pos_number -> number
- And yes, the set of intervals is not closed under union.


Yes, but I wasn't talking about mathematical operations in general; I
was talking about mathematical operations in Python. Example: 6 / 5
doesn't yield a float (though I heard that might change in future
versions). If the union of two integers yielded a set of integers, then
it'd make more since for the union of two Intervals to yield an
IntervalSet. But it doesn't. Just as set([2, 6]) creates a set of two
integers, IntervalSet(Interval(...), Interval(...)) creates a set of two
intervals.
On another note, I noticed you use __contains__ both for membership and
is-subset queries. This is problematic in case of Intervals that have
other Intervals as members. The set builtin type uses __contains__ for
membership checks and issubset for subset checks (with __le__ as
synonym); it's good to keep the same interface.


Good catch. I think I've made similar mistakes for a few of the other
functions, too.

Thanks for your help.
Jul 21 '05 #15
"Jacob Page" <ap***@cox.net> wrote:
If the union of two integers yielded a set of integers, then
it'd make more since for the union of two Intervals to yield an
IntervalSet.


AFAIK union is defined over sets, not numbers, so I'm not sure what you
mean by the "union of two integers". What I'm saying is that while the
union of two intervals is always defined (since intervals are sets),
the result set is not guaranteed to be an interval. More specifically,
the result is an interval if and only if the intervals overlap, e.g.
(2,4] | [3,7] = (2,7]
but
(2,4] | [5,7] = { (2,4], [5,7] }
That is, the set of intervals is not closed under union. OTOH, the set
of intervals _is_ closed under intersection; intersecting two
non-overlapping intervals gives the empty interval.

George

Jul 21 '05 #16
George Sakkis wrote:
"Jacob Page" <ap***@cox.net> wrote:
If the union of two integers yielded a set of integers, then
it'd make more since for the union of two Intervals to yield an
IntervalSet.


AFAIK union is defined over sets, not numbers, so I'm not sure what you
mean by the "union of two integers". What I'm saying is that while the
union of two intervals is always defined (since intervals are sets),
the result set is not guaranteed to be an interval. More specifically,
the result is an interval if and only if the intervals overlap, e.g.
(2,4] | [3,7] = (2,7]
but
(2,4] | [5,7] = { (2,4], [5,7] }
That is, the set of intervals is not closed under union. OTOH, the set
of intervals _is_ closed under intersection; intersecting two
non-overlapping intervals gives the empty interval.


OK, you've convinced me now to support and, or, and xor between every
combination of Intervals and IntervalSets, Intervals and IntervalSets,
and IntervalSets and Intervals. However, I'm not sure I like the idea
of an operation generating either one type or another. Thus, I'll have
| and ^ operations between Intervals always return an IntervalSet
instead of returning either an IntervalSet or an Interval. & will
return an Interval. I suppose that means I should just have + do a
union and - return an IntervalSet. It will just have to be documented
which types are to be expected for the return values depending on the
operands.
Jul 21 '05 #17
"Jacob Page" <ap***@cox.net> wrote:
OK, you've convinced me now to support and, or, and xor between every
combination of Intervals and IntervalSets, Intervals and IntervalSets,
and IntervalSets and Intervals.
I'm sorry, this was not my intention <wink>.
However, I'm not sure I like the idea
of an operation generating either one type or another.
I don't like it either; it makes the user's work harder by forcing him
to check the result's type, and then either process each type
differently or wrap Intervals into IntervalSets and deal with the
latter only.
Thus, I'll have
| and ^ operations between Intervals always return an IntervalSet
instead of returning either an IntervalSet or an Interval. & will
return an Interval. I suppose that means I should just have + do a
union and - return an IntervalSet. It will just have to be documented
which types are to be expected for the return values depending on the
operands.


This is better, but still I'm not sure if it's good enough. Splitting
the set of operators into those returning Interval and those returning
IntervalSet has to be documented of course, but nevertheles it is not
intuitive for the simple user who doesn't think about closed and
non-closed operations. It's a viable option though. The other two
options are:

- Return always an IntervalSet from all five operators (~, |, &, ^, -).
This is inconvenient for at least intersection and difference which are
known to be closed operations.

- Go back to your initial preference, that is don't support any
operator at all for Intervals. Given that in your new version there are
factories in IntervalSet as well, it's not as bad as before; simply the
user can create Intervals through the IntervalSet factories. You can
take this even further by disallowing (or discouraging at least) the
user to instantiate Intervals directly. An analogy is the Match type in
the re module. Match objects are returned by re.match() and re.search()
and they expose a set of useful methods (group(), groups(), etc.).
However if the user attempts to create a Match instance, a TypeError is
raised. Currently I like this option better; it's both user and
developer friendly :)

George

Jul 21 '05 #18

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

699 posts views Thread by mike420 | last post: by
6 posts views Thread by Aubrey Hutchison | last post: by
46 posts views Thread by Scott Chapman | last post: by
reply views Thread by Emile van Sebille | last post: by
76 posts views Thread by Nick Coghlan | last post: by
31 posts views Thread by surfunbear | last post: by
112 posts views Thread by mystilleef | last post: by
4 posts views Thread by daniel | last post: by
reply views Thread by mihailmihai484 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.