I'm adding a thread for comments on Gerrit Holl's pre-pep, which
can be found here: http://tinyurl.com/2578q
Frankly, I like the idea. It's about time that all of the file
and directory stuff in the os module got objectified
properly (or at least with some semblance of OO propriety!)
In the issues section:
1) Should path be a subclass of str?
No. Outside of the difficulty of deciding whether it's a
subclass of single byte or unicode strings, it's a pure
and simple case of Feature Envy. Granted, there will
be times a developer wants to use string methods, but
the most common operations should be supported directly.
2) virtual file system extensibility.
No opinion at this time. I'd like to see a couple
of attempts at an implementation first before
settling on a single design.
3) Should the / operator map joinpath.
I agree. No way. In the first place, that's a unixism
(Windows uses \, the Mac uses :) In the second
place it doesn't fit the common use of /, which is
to divide (separate) things. If we want an operator
for join (not a bad idea) I'd suggest using '+'. String
already overloads it for concatenation, and as I said
above, I'd just as soon *not* have this be a subclass
of string.
4) Should path expose an iterator for listdir(?)
I don't see why not, as long as the path is to a
directory.
5) Should == operator be the same as os.path.samefile()?
Why not...
6) Path.open()?
Of course.
7) Should the various gettime methods return Datetime
objects.
Of course.
8) Touch method?
Of course.
9) Current OS constants?
What are they? Are we talking about the four
constants in the access() function, or about something
else?
10) Commonprefix, walk and sameopenfile?
Commonprefix should be a string or list method,
it doesn't fit here.
walk is a nice function, but it should be redone to
use the visitor pattern directly, with different method
names for files, directories and whatever else a
particular file system has in its warped little mind.
sameopenfile doesn't belong in the os.path module
in the first place. It belongs in the os module under
6.1.3 - File Descriptor Operations.
11) rename join and split.
I wouldn't bother. Since I'm against making it a
subclass of str(), the issue doesn't arise.
12) Should == compare file sizes.
No. Might have a method to do that.
13) chdir, chmod, etc?
No. This has nothing to do with pathname.
14. Unicode filenames
Have to have them on Windows and probably
on the Mac.
15. Should files and directories be the same
class.
Probably not. While they share a lot of common
functionality (which should be spelled out as an
interface) they also have a lot of dissimilar
functionality. Separating them also makes it easy
to create objects for things like symbolic links.
In addition to this, we should have the ability
to update the other times (utime()) directly
using another file or directory object as well
as a Datetime object.
John Roth
In article <vv************@news.supernews.com>,
"John Roth" <ne********@jhrothjr.com> wrote: I'm adding a thread for comments on Gerrit Holl's pre-pep, which can be found here:
http://tinyurl.com/2578q
Frankly, I like the idea. It's about time that all of the file and directory stuff in the os module got objectified properly (or at least with some semblance of OO propriety!)
In the issues section:
[ snipping those points where I agree with John ]
4) Should path expose an iterator for listdir(?)
I don't see why not, as long as the path is to a directory.
_An_ iterator, sure, but not __iter__. How about path.listdir()? :)
__iter__ could also iterate over the path elements, so it's ambiguous at
least.
15. Should files and directories be the same class.
Probably not. While they share a lot of common functionality (which should be spelled out as an interface) they also have a lot of dissimilar functionality. Separating them also makes it easy to create objects for things like symbolic links.
But what about paths for not-yet-existing files or folders? I don't
think you should actually _hit_ the file system, if all you're doing is
path.join().
Just
"Just" <ju**@xs4all.nl> wrote in message
news:ju************************@news1.news.xs4all.nl... In article <vv************@news.supernews.com>, "John Roth" <ne********@jhrothjr.com> wrote:
I'm adding a thread for comments on Gerrit Holl's pre-pep, which can be found here:
http://tinyurl.com/2578q
Frankly, I like the idea. It's about time that all of the file and directory stuff in the os module got objectified properly (or at least with some semblance of OO propriety!)
In the issues section: [ snipping those points where I agree with John ]
4) Should path expose an iterator for listdir(?)
I don't see why not, as long as the path is to a directory.
_An_ iterator, sure, but not __iter__. How about path.listdir()? :) __iter__ could also iterate over the path elements, so it's ambiguous at least.
I see what you're saying. I'd argue (softly) that iterating over
the directory entries is the natural interpretation, though.
15. Should files and directories be the same class.
Probably not. While they share a lot of common functionality (which should be spelled out as an interface) they also have a lot of dissimilar functionality. Separating them also makes it easy to create objects for things like symbolic links.
But what about paths for not-yet-existing files of folders? I don't think you should actually _hit_ the file system, if all you're doing is path.join().
I agree here. I haven't looked at any of the candidate implementations
yet, so I don't know what they're doing. I'm thinking of a
three class structure: the parent class is just the path manipulations;
it has two subclasses, one for real files and one for real directories.
That way they can not only inherit all of the common path manipulation
stuff, but the developer can instantiate a pure path manipulation
class as well.
There might also be a mixin that encapsulates the stuff that's common
to real files and directories like accessing and changing dates and
permissions.
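A minimal sketch of the three-class structure described above, assuming hypothetical names (PathBase, StatMixin, FilePath, DirPath) that appear in none of the candidate implementations. The base class only manipulates text, so it can be instantiated for paths that do not exist; the mixin and the subclasses are the only parts that touch the file system.

```python
import os
import posixpath


class PathBase:
    """Pure path manipulation; never touches the file system."""

    def __init__(self, text):
        self._text = str(text)

    def __str__(self):
        return self._text

    def join(self, *parts):
        # The result is a plain PathBase: nothing is known yet about
        # whether it names a file, a directory, or nothing at all.
        return PathBase(posixpath.join(self._text, *parts))

    def basename(self):
        return posixpath.basename(self._text)


class StatMixin:
    """Operations shared by real files and real directories."""

    def mtime(self):
        return os.stat(str(self)).st_mtime


class FilePath(StatMixin, PathBase):
    def read_text(self):
        with open(str(self)) as handle:
            return handle.read()


class DirPath(StatMixin, PathBase):
    def entries(self):
        return sorted(os.listdir(str(self)))


# Pure manipulation works for a path that does not exist.
planned = PathBase("/no/such/dir").join("report.txt")
print(planned, planned.basename())
```

Note that join() has to return the plain base class, which is exactly where the "how do I get a FilePath or DirPath back" question raised in this thread comes from.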
I'm sure there are use cases that will throw a curve at that structure
as well. Just
In article <vv************@news.supernews.com>,
"John Roth" <ne********@jhrothjr.com> wrote: 4) Should path expose an iterator for listdir(?)
I don't see why not, as long as the path is to a directory.
_An_ iterator, sure, but not __iter__. How about path.listdir()? :) __iter__ could also iterate over the path elements, so it's ambiguous at least.
I see what you're saying. I'd argue (softly) that iterating over the directory entries is the natural interpretation, though.
It's far too implicit for my taste, for one thing because it's a
folder-only operation (and I don't see much merit in having separate
classes for folder and file paths). Would you also be in favor of
iterating over file paths meaning iterating over the lines in the file?
Just
On Mon, Jan 05, 2004 at 10:06:59AM -0500, John Roth wrote: I'm adding a thread for comments on Gerrit Holl's pre-pep, which can be found here:
http://tinyurl.com/2578q
Frankly, I like the idea. It's about time that all of the file and directory stuff in the os module got objectified properly (or at least with some semblance of OO propriety!)
"Properly"? There is nothing particularly "proper" or "improper"
about objects or any other programming paradigm supported by Python.
Objectifying is not a goal in itself. I like the Path object because
the interface is easier to learn and use, not because it is
"objectified".
5) Should == operator be the same as os.path.samefile()?
Why not...
No. Symbolic links are something you would sometimes want to treat as
distinct from the files they point to.
walk is a nice function, but it should be redone to use the visitor pattern directly, with different method names for files, directories and whatever else a particular file system has in it's warped little mind.
I find a generator and a couple of elifs are much easier to read. No
need to define a class, pass context information to the methods of
that class, etc.
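A rough sketch of the generator-plus-elifs style, built on os.walk. The names walk_entries and summarize are made up for illustration and are not part of the pre-PEP; the point is only that the caller's loop keeps its own local state instead of passing context into visitor methods.

```python
import os
import tempfile


def walk_entries(top):
    """Yield (kind, path) for every entry below top."""
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                yield "link", full
            elif os.path.isdir(full):
                yield "dir", full
            else:
                yield "file", full


def summarize(top):
    """Count entries by kind; the counters are plain local variables."""
    counts = {"file": 0, "dir": 0, "link": 0}
    for kind, _path in walk_entries(top):
        counts[kind] += 1
    return counts


# Small demonstration tree: one subdirectory containing one file.
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    with open(os.path.join(top, "sub", "a.txt"), "w") as handle:
        handle.write("x")
    demo_counts = summarize(top)

print(demo_counts)
```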
13) chdir, chmod, etc?
No. This has nothing to do with pathname.
What's the difference between chmod and touch? They both affect the
file metadata in similar ways.
Oren
John Roth wrote: Subject: PRE-PEP: new Path class
I'm adding a thread for comments on Gerrit Holl's pre-pep, which can be found here:
http://tinyurl.com/2578q
I will update the Pre-PEP tomorrow based on the comments I already have
seen in this PEP. Note that it is very 'pre' and opinions expressed in
the PEP are not guaranteed to be consistent in any way ;)
I will comment on the comments later.
yours,
Gerrit.
--
Asperger's Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/english/
John Roth wrote: "Just" <ju**@xs4all.nl> wrote in message news:ju************************@news1.news.xs4all.nl...
In article <vv************@news.supernews.com>, "John Roth" <ne********@jhrothjr.com> wrote:
.... 15. Should files and directories be the same class.
Probably not. While they share a lot of common functionality (which should be spelled out as an interface) they also have a lot of dissimilar functionality. Separating them also makes it easy to create objects for things like symbolic links.
But what about paths for not-yet-existing files of folders? I don't think you should actually _hit_ the file system, if all you're doing is path.join().
I agree here. I haven't looked at any of the candidate implementations yet, so I don't know what they're doing. I'm thinking of a three class structure: the parent class is just the path manipulations; it has two subclasses, one for real files and one for real directories. That way they can not only inherit all of the common path manipulation stuff, but the developer can instantiate a pure path manipulation class as well.
There might also be a mixin that encapsulates the stuff that's common to real files and directories like accessing and changing dates and permissions.
I'm sure there are use cases that will throw a curve at that structure as well.
My implementation combines the two into a single class. Here's the logic:
* There is no necessary distinction between files and directories at
the path level
o Particularly with upcoming ReiserFS 4, where structured
storage shows up, it's possible to have files behaving much
like directories.
o Zip files also come to mind if we have virtual file system
support eventually
* These objects represent paths, not the things pointed to by the paths.
o They allow you to operate on the path, which is "almost" the
filesystem, but not quite.
o In the space of "paths", there's no distinction between a
file and a directory, really.
o Even a path that traverses a few symbolic links, and drops
into a zip-file is still just a path, it's a way of
specifying something, similar to a "name" or "location" class.
o You can find out what the path-object points to via the path
methods, but the path itself isn't those objects.
* Don't want to have to explicitly cast your paths to file/directory
to get the basic file/directory operations when joining paths.
o Mix-ins require changing the class of the instance by
somehow figuring out that it's a file, that requires a
file-system access (to what may be a non-existent or very
expensive-to-access file).
o There's not much of a conflict between the file/directory
path operations
Enjoy,
Mike
_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder http://members.rogers.com/mcfletch/
John Roth wrote:
.... 1) Should path be a subclass of str?
No. Outside of the difficulty of deciding whether it's a subclass of single byte or unicode strings, it's a pure and simple case of Feature Envy. Granted, there will be times a developer wants to use string methods, but the most common operations should be supported directly.
It's not the methods that make me want to use str/unicode as the base,
it's the ability to pass the resulting instances to built-in methods
that explicitly expect/require str/unicode types. Not sure how many of
the core functions/libraries still have such requirements, but I'd guess
it's a few.
That said, I don't mind making path its own base-class, I just *really*
want to be able to pass them to path-unaware code without extra coercion
(otherwise switching a module to *producing* paths instead of raw
strings will cause the *clients* of that module to break, which is a
serious problem for rapid adoption).
3) Should the / operator map joinpath.
Agreed, no. As for using + for join, that will break a lot of code that
does things like this:
p = mymodule.getSomeFilename()
backup = p + '.bak'
copyfile( p, backup )
open( p, 'w').write( whatever )
i.e. since we're thinking of returning these things in a lot of
situations where strings were previously returned, string-like operations
should, IMO, be the norm. But then we disagree on that anyway ;) .
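To make the breakage concrete, here is a small sketch with a hypothetical PlusJoinPath class (not taken from any of the proposals) in which + means "join". The same client expression, p + '.bak', yields a sibling file name for a string but a child entry for the path object.

```python
import posixpath


class PlusJoinPath:
    """Hypothetical path type where + means 'join'."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        return PlusJoinPath(posixpath.join(self.text, str(other)))

    def __str__(self):
        return self.text


as_string = "/data/report.txt"
as_path = PlusJoinPath("/data/report.txt")

# Client code written for strings: append a suffix to the file name.
backup_from_string = as_string + ".bak"
# The same expression on the path object joins a new component instead.
backup_from_path = str(as_path + ".bak")

print(backup_from_string)
print(backup_from_path)
```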
4) Should path expose an iterator for listdir(?)
I don't see why not, as long as the path is to a directory.
Seems ambiguous to me. Also seems silly to use a generator when we're
producing a list anyway from the underlying call, might as well return
the list to allow length checks and random access. Iterators for
"ancestors" might be useful, but again, doesn't really seem like it
needs to be __iter__ instead of "ancestors".
5) Should == operator be the same as os.path.samefile()?
Why not...
__eq__ sounds about right. I gather this call goes out to the
filesystem first, though. Might be good to first check for absolute
equality (i.e. the same actual path) before doing that.
6) Path.open()?
Of course.
Ditto.
7) Should the various gettime methods return Datetime objects.
Of course.
What are we doing for Python 2.2 then? I agree with the principle, but
we should likely have a fallback when datetime isn't available.
8) Touch method?
Of course.
Neutral, seems fine.
9) Current OS constants?
What are they? Are we talking about the four constants in the access() function, or about something else?
Don't know myself.
10) Commonprefix, walk and sameopenfile?
Commonprefix should be a string or list method, it doesn't fit here.
Path commonprefix is a different operation from str commonprefix. Paths
should only accept entire path-segments (names) as being equal, while
strings should accept any set of characters:
'/this/that/those/them'
'/this/thatly/those/them'
should see '/this/' as the commonprefix for the paths, not '/this/that'.
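The difference can be checked with the standard library. This sketch uses posixpath.commonprefix (character-wise) and posixpath.commonpath (segment-wise); commonpath was added to Python long after this thread, so it is shown only as an illustration of the segment-based behaviour, and it returns the prefix without a trailing slash.

```python
import posixpath

paths = ["/this/that/those/them", "/this/thatly/those/them"]

# Character-by-character comparison, as for plain strings.
char_prefix = posixpath.commonprefix(paths)

# Comparison on whole path segments.
segment_prefix = posixpath.commonpath(paths)

print(char_prefix)
print(segment_prefix)
```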
walk is a nice function, but it should be redone to use the visitor pattern directly, with different method names for files, directories and whatever else a particular file system has in it's warped little mind.
Reworking walk is probably a good idea. I'll let others worry about it,
as I've re-implemented the functionality so many times for my own code
that I'm just sick of it :) .
11) rename join and split.
I wouldn't bother. Since I'm against making it a subclass of str(), the issue doesn't arise.
No real preference one way or another here. join -> "append" for paths
seems fine. split -> "elements" or "steps" for paths also seems fine.
12) Should == compare file sizes.
No. Might have a method to do that.
Agreed, though even then, if we have a method that returns file-sizes:
path( ... ).size() == path( ... ).size()
seems almost as reasonable as having a method for it?
13) chdir, chmod, etc?
No. This has nothing to do with pathname.
chmod has to do with altering the access mode of a file/directory by
specifying its path, no? Seems like it could readily be a method of
the path. chdir should accept a path, otherwise doesn't seem like it
should be a method.
14. Unicode filenames
Have to have them on Windows and probably on the Mac.
Yes.
15. Should files and directories be the same class.
Replied to this in the sub-thread...
Enjoy all,
Mike
"Mike C. Fletcher" <mc******@rogers.com> wrote in message
news:ma**************************************@python.org... John Roth wrote: ...
1) Should path be a subclass of str?
No. Outside of the difficulty of deciding whether it's a subclass of single byte or unicode strings, it's a pure and simple case of Feature Envy. Granted, there will be times a developer wants to use string methods, but the most common operations should be supported directly.
It's not the methods that make me want to use str/unicode as the base, it's the ability to pass the resulting instances to built-in methods that explicitly expect/require str/unicode types. Not sure how many of the core functions/libraries still have such requirements, but I'd guess it's a few.
That said, I don't mind making path it's own base-class, I just *really* want to be able to pass them to path-unaware code without extra coercian (otherwise switching a module to *producing* paths instead of raw strings will cause the *clients* of that module to break, which is a serious problem for rapid adoption).
That's an excellent point, but it begs the question of which
string class it should subclass. Unless it's got some way of
changing its base class depending on the system it's running
on. That, in turn, probably violates the Principle of Least
Astonishment.
5) Should == operator be the same as os.path.samefile()?
Why not...
__eq__ sounds about right. I gather this call goes out to the filesystem first, though. Might be good to first check for absolute equality (i.e. the same actual path) before doing that.
I think this has to do with "conceptual integrity." Are we talking
about a path object that happens to have the ability to do file
system operations in appropriate circumstances, or are we talking
about a file system object that includes all of the usual path
manipulations? You seem to be thinking of the first approach,
and I'm thinking of the second. You're beginning to convince me,
though.
7) Should the various gettime methods return Datetime objects.
Of course.
What are we doing for Python 2.2 then? I agree with the principle, but we should likely have a fallback when datetime isn't available.
Do we care? If this is going into Python, it will be in 2.4 at the
earliest, with a possible addon to a late 2.3 release. I don't see
it going into 2.2 at all, although a backwards version would
be nice.
10) Commonprefix, walk and sameopenfile?
Commonprefix should be a string or list method, it doesn't fit here.
Path commonprefix are different operations from str commonprefix. Paths should only accept entire path-segments (names) as being equal, while strings should accept any set of characters:
'/this/that/those/them' '/this/thatly/those/them'
should see '/this/' as the commonprefix for the paths, not '/this/that'.
Good point if you're thinking of heterogeneous collections. If you're
thinking (as I am) that an object can represent a directory, then it
seems like a singularly useless method.
walk is a nice function, but it should be redone to use the visitor pattern directly, with different method names for files, directories and whatever else a particular file system has in its warped little mind.
Reworking walk is probably a good idea. I'll let others worry about it, as I've re-implemented the functionality so many times for my own code that I'm just sick of it :) .
I can understand that. [grin]
13) chdir, chmod, etc?
No. This has nothing to do with pathname.
chmod has to do with altering the access mode of a file/directory by specifying it's path, no? Seems like it could readily be a method of the path.
Right. I forgot that these are two totally different issues.
chdir should accept a path, otherwise doesn't seem like it should be a method.
If the path object describes a directory, then I'd see
a .chdir() method as useful. Otherwise, it belongs
somewhere else, although I don't have a clue where
at the moment. Enjoy all, Mike
John Roth
"John Roth" <ne********@jhrothjr.com> wrote in message news:<vv************@news.supernews.com>... I'm adding a thread for comments on Gerrit Holl's pre-pep, which can be found here:
http://tinyurl.com/2578q
.... 1) Should path be a subclass of str?
No.
So will the file constructor be "overloaded" to accept path objects?
What about all those functions in the os module?
[Mike C. Fletcher] That said, I don't mind making path it's own base-class, I just *really* want to be able to pass them to path-unaware code without extra coercian (otherwise switching a module to *producing* paths instead of raw strings will cause the *clients* of that module to break, which is a serious problem for rapid adoption).
[John Roth] That's an excellent point, but it begs the question of which string class it should subclass. Unless it's got some way of changing its base class depending on the system it's running on. That, in turn, probably violates the Principle of Least Astonishment.
That's in fact exactly what Jason Orendorff's path module does. But it's
buggy due to os.path.supports_unicode_filenames being buggy.
It would be interesting to figure out to what extent non-string (and
non-unicode) path objects can or can't be made to work for existing
string-accepting code. I would very much prefer a path _not_ to inherit
from str or unicode, but Mike's point is an important one. What is
missing in Python to allow non-string objects to act like (unicode)
strings?
Just
John Roth wrote: I'm adding a thread for comments on Gerrit Holl's pre-pep, which can be found here:
http://tinyurl.com/2578q
Frankly, I like the idea. It's about time that all of the file and directory stuff in the os module got objectified properly (or at least with some semblance of OO propriety!)
Great I've been thinking of something like this, while I've been writing
a "File-Class" (I'll open a thread for this ...)
[1] I think Path being a subclass of str is odd. There are a lot of
string-operations that don't fit to path (some of them should be
implemented in a different way e.g. __mul__ if at all).
However, the point about the old os functions etc. is very sound. So it
might be a good idea to have Path be a subclass of str *for
transition*. But eventually all those functions should call str(argument)
instead of demanding a str object as argument (if they don't already
today).
This requires Path.__str__ to return a valid path string.
Path's constructor should be able to use such a string.
(when there are Path-Classes for other stuff like URLs we maybe need a
factory ...)
[21] I think file-paths and directory-paths shouldn't be the same class
(they have different meaning. Think about a "walk"-function for dirs)
But it might be hard if the path doesn't exist, how to decide whether
it's a file-path or a dir-path?
You could do the following (for Unixes): if its string representation
ends with '/' it's a directory, otherwise it's a file.
Bash autocompletes this way, but "cd /home" is valid too and would
cause trouble with this algorithm.
(If the path exists it's easier)
file-path and directory-path should have a common superclass.
There seems to be a distinction between existing and non-existing paths.
Probably a lot of things shared between file-paths and directory-paths
are valid for non-existing-paths.
"about a path object that happens to have the ability to do file
system operations in appropriate circumstances" [John Roth]
This is a good thing but I think there are problems: maybe the given
path does not exist.
This takes me to my last point:
What about invalid paths?
Should Path-Class take care of always being a valid path (this doesn't
necessarily mean a path of an existing file/directory)
Especially if someone uses string methods on a Path object, invalid
paths could arise, even if the path is finally valid again.
The validity of filesystem paths is OS/filesystem dependent.
Christoph Becker-Freyseng
Dan Bishop wrote: "John Roth" <ne********@jhrothjr.com> wrote in message news:<vv************@news.supernews.com>...
I'm adding a thread for comments on Gerrit Holl's pre-pep, which can be found here:
http://tinyurl.com/2578q
...
1) Should path be a subclass of str?
No.
So will the file constructor be "overloaded" to accept path objects? What about all those functions in the os module?
IMO this is the better way.
However, it might be useful to subclass str for the transition (see my
other posting).
Mike C. Fletcher wrote: 9) Current OS constants?
What are they? Are we talking about the four constants in the access() function, or about something else?
Don't know myself.
I meant os.path constants: curdir, pathsep, defpath, etc.
They should be included.
use the visitor pattern directly, with different method names for files, directories and whatever else a particular file system has in it's warped little mind.
Reworking walk is probably a good idea. I'll let others worry about it, as I've re-implemented the functionality so many times for my own code that I'm just sick of it :) .
I think os.walk is good as it is.
yours,
Gerrit.
--
34. If a ... or a ... harm the property of a captain, injure the
captain, or take away from the captain a gift presented to him by the
king, then the ... or ... shall be put to death.
-- 1780 BC, Hammurabi, Code of Law
Christoph Becker-Freyseng wrote: [1] I think Path being a subclass of str is odd. There are a lot of string-operations that don't fit to path (some of them should be implemented in a different way e.g. __mul__ if at all). However the point with the old os function etc. is very sound. So it might be a good idea to have Path being a subclass of str *for transition*. But finally all those functions should call str(argument) instead of of demanding a str-object as argument (if they don't already today).
Another possibility, which I have put in the Pre-PEP, is:
We can add a method .openwith(), which takes a callable as its first
argument: p.openwith(f, *args) would result in f(str(p), *args). This
would make p.open(*args) a shorthand for p.openwith(file, *args).
What do you think?
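A minimal sketch of the .openwith() idea, assuming a hypothetical Path class that is not a str subclass. The file builtin mentioned above no longer exists in current Python, so the sketch uses the built-in open in its place.

```python
import os.path


class Path:
    """Hypothetical minimal path object; deliberately not a str subclass."""

    def __init__(self, text):
        self._text = str(text)

    def __str__(self):
        return self._text

    def openwith(self, func, *args):
        # Hand the string form to any path-unaware callable.
        return func(str(self), *args)

    def open(self, *args):
        # Inside a method, the name "open" still refers to the builtin.
        return self.openwith(open, *args)


# Any function that expects a filename string can be used this way.
name = Path("/tmp/example.txt").openwith(os.path.basename)
print(name)
```

The cost is the one already pointed out in the thread: the path is always passed as the first positional argument.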
This takes me to my last point: What about invalid paths? Should Path-Class take care of always being a valid path (this doesn't necessarily mean a path of an existing file/directory)
It may be a good idea to do so. At first, I didn't understand what it
meant, an 'invalid path', but let's define it as anything that triggers
a TypeError when passed to open or listdir. On POSIX, I know only one
case: \0 in path. It may be a lot more difficult on Windows or the Mac.
I'm not sure about this idea yet.
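A sketch of that definition of "invalid", with a hypothetical helper named looks_valid. One caveat: in the CPython 3 releases I am aware of, a NUL character in the name raises ValueError rather than TypeError, so the helper treats both as "invalid". A missing file is a different matter; it raises an OSError and still counts as a valid path.

```python
def looks_valid(name):
    """Return False if open() rejects the name itself, True otherwise."""
    try:
        with open(name):
            pass
    except (TypeError, ValueError):
        # The name was rejected before the file system was consulted.
        return False
    except OSError:
        # Valid name; the file is merely missing or inaccessible.
        return True
    return True


print(looks_valid("bad\0name"))
print(looks_valid("surely-missing-file-for-demo.txt"))
```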
Especially if someone uses string-methods on a Path-object there could arise invalid paths, even if finaly the path is valid again.
Yes. But I can't really think of a use case for doing operations on a
path which make it invalid. Does it occur in practice?
yours,
Gerrit.
--
123. If he turn it over for safe keeping without witness or contract,
and if he to whom it was given deny it, then he has no legitimate claim.
-- 1780 BC, Hammurabi, Code of Law
Just wrote: In article <vv************@news.supernews.com>, "John Roth" <ne********@jhrothjr.com> wrote: 4) Should path expose an iterator for listdir(?)
I don't see why not, as long as the path is to a directory.
_An_ iterator, sure, but not __iter__. How about path.listdir()? :) __iter__ could also iterate over the path elements, so it's ambiguous at least.
I think it should be called .list(): this way, it is better extendable
to archive files like zip and tar. Indeed: I know at least 3 different
possibilities for path.__iter__. Because of "In the face of ambiguity,
refuse the temptation to guess.", I think there should be no __iter__
(which is yet another reason not to subclass str, by the way) [0].
15. Should files and directories be the same class.
Probably not. While they share a lot of common functionality (which should be spelled out as an interface) they also have a lot of dissimilar functionality. Separating them also makes it easy to create objects for things like symbolic links.
But what about paths for not-yet-existing files of folders? I don't think you should actually _hit_ the file system, if all your doing is path.join().
Another problem is that I may not know whether I have a file or a
directory. If a directory is a different type than a file, it would
probably have a different constructor as well, and I may end up doing:
p = path(foo)
if p.isdir():
    p = dirpath(foo)
If this is done implicitly, you can't create a path without
fs-interaction, which is bad for virtual-fs extensibility and confusing
if it doesn't mean a path always exists [1].
[0] http://people.nl.linux.org/~gerrit/c...-foo-in-mypath
[1] http://people.nl.linux.org/~gerrit/c...l#absent-paths
yours,
Gerrit.
--
257. If any one hire a field laborer, he shall pay him eight gur of
corn per year.
-- 1780 BC, Hammurabi, Code of Law
[PEP] 7) Should the various gettime methods return Datetime objects.
[John Roth]Of course.
I was hesitating, because of backwards compatibility, but this should
not be a reason not to make things better, of course (TV isn't backward
compatible with radio either ;)
[Mike C. Fletcher] What are we doing for Python 2.2 then? I agree with the principle, but we should likely have a fallback when datetime isn't available.
[John Roth] Do we care? If this is going into Python, it will be in 2.4 at the earliest, with a possible addon to a late 2.3 release. I don't see it going into 2.2 at all, although a backwards version would be nice.
If the PEP is finished and accepted, I think that the
roadmap for introducing the feature will be like sets: In 2.4, it's a
library, and if it's successful/popular, it may become a builtin in 2.5.
Path commonprefix are different operations from str commonprefix. Paths should only accept entire path-segments (names) as being equal, while strings should accept any set of characters:
'/this/that/those/them' '/this/thatly/those/them'
should see '/this/' as the commonprefix for the paths, not '/this/that'.
It should... it doesn't seem to do so currently.
Good point if you're thinking of heterogenous collections. If you're thinking (as I am) that an object can represent a directory, then it seems like a singularly useless method.
The only place where I can think of a use is a tarfile/zipfile. For a
path, it means nothing. It can be useful but since it needs multiple
paths, it can't be a method. The only thing I can think of is a
classmethod, a constructor, but I don't really like the idea much.
13) chdir, chmod, etc?
No. This has nothing to do with pathname.
Is p.chdir() better or worse than chdir(p)?
The latter reads better, but that may be because we're used to it.
But on the other hand, that may be a convincing argument just as well :)
chdir should accept a path, otherwise doesn't seem like it should be a method.
If the path object describes a directory, then I'd see a .chdir() method as useful. Otherwise, it belongs somewhere else, although I don't have a clue where at the moment.
I think there should be no distinction between files and directories,
and that p.chdir() for a non-directory should raise the same exception
as currently (OSError Errno 20).
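For reference, a sketch of what os.chdir does today when given a regular file. The helper name chdir_error is made up for this example. Errno 20 is ENOTDIR on Linux; the sketch compares against errno.ENOTDIR rather than the literal number, since the numeric value and the exact error can differ between platforms.

```python
import errno
import os
import tempfile


def chdir_error(path):
    """Return the OSError raised by os.chdir(path), or None on success."""
    original = os.getcwd()
    try:
        os.chdir(path)
    except OSError as exc:
        return exc
    # Restore the working directory if the call unexpectedly succeeded.
    os.chdir(original)
    return None


with tempfile.TemporaryDirectory() as top:
    plain_file = os.path.join(top, "plain.txt")
    with open(plain_file, "w") as handle:
        handle.write("x")
    error = chdir_error(plain_file)

is_enotdir = error is not None and error.errno == errno.ENOTDIR
print(type(error).__name__, is_enotdir)
```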
yours,
Gerrit.
--
27. If a chieftain or man be caught in the misfortune of the king
(captured in battle), and if his fields and garden be given to another and
he take possession, if he return and reaches his place, his field and
garden shall be returned to him, he shall take it over again.
-- 1780 BC, Hammurabi, Code of Law
"John Roth" <ne********@jhrothjr.com> writes: 5) Should == operator be the same as os.path.samefile()?
Why not...
ISTM, that it would essentially make path objects non-hashable.
posixpath.samefile compares the os.stat values for both filenames.
These values can change over time for various reasons so object equality
changes too. That it changes is desired, obviously, but what do you use
as hash value? p1 == p2 has to imply that hash(p1) == hash(p2) but the
only way to achieve that is to give all path objects the same hash
value. That would make dicts with path objects as keys very
inefficient, though.
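To make the hashing problem concrete, here is a minimal sketch. The class name SameFilePath is hypothetical (it is not from the pre-PEP); it only assumes that == is delegated to os.path.samefile while the hash is taken from the spelling of the path.

```python
import os
import tempfile


class SameFilePath:
    """Hypothetical path whose == asks the filesystem (os.path.samefile)."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        if not isinstance(other, SameFilePath):
            return NotImplemented
        return os.path.samefile(self.name, other.name)

    def __hash__(self):
        # Hashing the spelling is the obvious choice, but it disagrees with
        # __eq__ whenever two different spellings name the same file.
        return hash(self.name)


def demo():
    """Return (p1 == p2, p1.name == p2.name) for two spellings of one file."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "data.txt")
        with open(target, "w") as f:
            f.write("x")
        p1 = SameFilePath(target)
        p2 = SameFilePath(os.path.join(d, ".", "data.txt"))
        return p1 == p2, p1.name == p2.name
```

Here the two objects compare equal although their spellings (and so, in practice, their hashes) differ, so a dict keyed on one would normally fail to find the other. The only hash that is always consistent with such an __eq__ is a constant.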
As for the path objects in general: Cute idea, but IMO strings for
filenames work fine and there's nothing unpythonic about it. The
virtual filesystem bit seems like a good reason not to introduce the path
type just yet:
2) virtual file system extensibility.
No opinion at this time. I'd like to see a couple of attempts at an implementation first before settling on a single design.
At this point it doesn't seem clear what virtual filesystems would mean
for Python, so it's unclear, too, what it would mean for a Path class.
Introduce a Path class once there is a need for having several distinct
classes, not earlier.
Bernhard
--
Intevation GmbH http://intevation.de/
Sketch http://sketch.sourceforge.net/
Thuban http://thuban.intevation.org/
Christoph Becker-Freyseng wrote: openwith would be a nice add-on. I see two problems with it: 1.) it's long. I mean f(str(path), *args) is shorter than path.openwith(f, *args)
This is indeed a disadvantage. On the other hand, although p.openwith is
longer, I do think it is more readable. It occurs often that shorter is
not more readable: just think of all those 'obfuscated-oneliners'
contests in C and Perl.
path > (f, arg1, arg2, ...)
(this works by overwriting path.__gt__)
I think this is not a good idea. In my opinion, any __gt__ method should
always compare, no more, no less. Further, it's very unusual to call
something like this.
Another possibility is defining __call__:
path(f, *args) == f(str(path), *args)
which may be inconvenient as well, however. Is it intuitive to let
calling mean opening?
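A minimal sketch of the two spellings under discussion, assuming a Path class that is not a str subclass. The class body is hypothetical; only the names openwith and __call__ come from the thread.

```python
import os


class Path:
    """Hypothetical stand-in for the pre-PEP path type (not a str subclass)."""

    def __init__(self, name):
        self._name = name

    def __str__(self):
        return self._name

    def openwith(self, f, *args):
        # Hand the path to a function that only understands strings.
        return f(str(self), *args)

    # The __call__ spelling: path(f, *args) behaves like f(str(path), *args).
    __call__ = openwith


p = Path("spam")
joined_long = p.openwith(os.path.join, "eggs")
joined_short = p(os.path.join, "eggs")
```

Both calls end up as os.path.join("spam", "eggs"); the difference is purely in how the call reads.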
2.) the position of the argument for the path can only be the first one. (maybe one could misuse even more operators e.g. the __r...__ ones; But I think this will result in obscure code)
Hm, I think almost all file constructors have the path as the first
argument. Are there counter-examples?
path.open shouldn't always call the ordinary file-constructor. I mean it should be changed for special path-classes like FTPPath ... (Of course for ordinary file-paths the ordinary file-class is the right one.)
Yes... I was planning to define it in a class which is able to 'touch'
the filesystem, so an FTPPath could subclass basepath without the need
to overload open, or subclass ReadablePath with this need.
Additionally, path validity is filesystem-dependent. And worse, on a system like Linux there can be more than one file system within the same root / and they could all have different restrictions! (If I were a vicious guy I could even umount and mount a different fs, making a valid path possibly invalid.)
I think this makes validating a path essentially impossible to get
right. Let's say we can declare a path to be invalid, but we can't declare
a path to be valid. Is it a good thing to add a method for it then? (I
think yes)
So I think the better solution would be to define a path.isValid()
I agree that it's better. We should allow invalid paths after all.
We also need a path.exists() method.
Sure.
I'm not sure how both should interact ??
Well, if path.exists(), path.isValid(). The question is - should
path.isValid() read the filesystem?
Another Point: Should Path be immutable like string?
I have thought about this, too. It should certainly not be fully mutable,
because if a path changes, it changes. But maybe we could have a
.normalise_inplace() which mutates the Path? What consequences would
this have for hashability?
I like paths to be hashable, so they probably should be immutable.
yours,
Gerrit.
--
16. If any one receive into his house a runaway male or female slave of
the court, or of a freedman, and does not bring it out at the public
proclamation of the major domus, the master of the house shall be put to
death.
-- 1780 BC, Hammurabi, Code of Law
--
PrePEP: Builtin path type http://people.nl.linux.org/~gerrit/c.../pep-xxxx.html
Asperger's Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/english/
John Roth wrote: I'm adding a thread for comments on Gerrit Holl's pre-pep, which can be found here:
http://tinyurl.com/2578q
I have updated a lot of things on the PEP in the past few days.
There are still larger and smaller open issues, though, besides the
usual 'I-can-change-my-mind-and-the-PEP-can-change-its-mind' things:
Quoting my own PEP:
Should path.__eq__ match path.samefile?
There are at least 2 possible ways to do it:
- Normalize both operands by checking to which actual file they
point (same (l)stat).
- Try to find out whether the paths point to the same filesystem
entry, without doing anything with the filesystem.
pro
- A path usually points to a certain place on the filesystem, and
two paths with different string representations may point to the same place,
which means they are essentially equal in usage.
con
- We would have to choose a way, so we should first decide which is
better and whether the difference is intuitive enough.
- It makes hashing more difficult/impossible.
conclusion
- I don't know.
links
- Bernhard Herzog `points out
<http://mail.python.org/pipermail/python-list/2004-January/201857.html>`__
that it would essentially make path-objects non-hashable.
- `James Orendorff's Path`_ inherits str.__eq__.
- `Mike C. Fletcher's Path`_ chooses the first variant.
Do we need to treat unicode filenames specially?
I have no idea.
links
- An `explanation
<http://mail.python.org/pipermail/python-list/2004-January/201418.html>`__
by Martin von Loewis.
- should os.tempnam be included?
- can normpath be coded using only os.path constants? (If so, it belongs in
the 'platform-independent' class; I think not.)
- Should normalize be called normalized or not?
- Should stat be defined in the platform-dependent or -independent class?
- Should we include chdir and chroot? If so, where?
- Should rename return a new path object?
- Should renames be included?
And one meta-question:
Shall I submit this as an official PEP? Or shall I first fill in more
open issues and perhaps give the possibility to change "closed" issues?
See also: http://people.nl.linux.org/~gerrit/c.../pep-xxxx.html
yours,
Gerrit.
--
202. If any one strike the body of a man higher in rank than he, he
shall receive sixty blows with an ox-whip in public.
-- 1780 BC, Hammurabi, Code of Law
--
PrePEP: Builtin path type http://people.nl.linux.org/~gerrit/c.../pep-xxxx.html
Asperger's Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/english/
Gerrit Holl wrote: Christoph Becker-Freyseng wrote:
openwith would be a nice add-on. I see two problems with it: 1.) it's long. I mean f(str(path), *args) is shorter than path.openwith(f, *args)
This is indeed a disadvantage. On the other hand, although p.openwith is longer, I do think it is more readable. It occurs often that shorter is not more readable: just think of all those 'obfuscated-oneliners' contests in C and Perl.
Sure. However this case seems to be quite obvious both ways.
path > (f, arg1, arg2, ...)
(this works by overwriting path.__gt__)
I think this is not a good idea. In my opinion, any __gt__ method should always compare, no more, no less. Further, it's very unusual to call something like this.
Additionally, getting the right documentation for these "operator-tricks"
is harder.
Another possibility is defining __call__:
path(f, *args) == f(str(path), *args)
which may be inconvenient as well, however. Is it intuitive to let calling mean opening?
I like this one. What else could calling a path mean?
2.) the position of the argument for the path can only be the first one. (maybe one could misuse even more operators e.g. the __r...__ ones; But I think this will result in obscure code)
Hm, I think almost all file constructors have the path as the first argument. Are there counter-examples?
The whole openwith (other than path.open) is IMO mainly for
"backward-compatibility" if the function doesn't know the path-class.
I think openwith or better __call__ could be used for other things, too
--- not only for opening a file. E.g. there could be some
"FileWatcher-Modules" that might only accept strings and have a call like:
watchFile(onChangeFunc, path_string)
For a different position of the path_string we could make a special case
if the path-object is given as an argument of the call.
path(f, arg1, arg2, path, arg3, arg4, ...)
results in: f(arg1, arg2, str(path), arg3, arg4, ...)
Changing old code to use the new Path-class could be done with a minimal
amount of work then.
OLD: result= f(arg1, arg2, path, arg3, arg4, ...) # path is/was a string here
..... result= f,arg1, arg2, path, arg3, arg4, ...)
..... result= (f, arg1, arg2, path, arg3, arg4, ...)
NEW: result= path(f, arg1, arg2, path, arg3, arg4, ...)
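The special case described above could be sketched as follows. This is only an illustration under assumptions: the Path class and the watch_file function are hypothetical stand-ins, and "the path is among the arguments" is detected by identity.

```python
class Path:
    """Hypothetical path type; only __str__ and __call__ matter here."""

    def __init__(self, name):
        self._name = name

    def __str__(self):
        return self._name

    def __call__(self, f, *args):
        if any(a is self for a in args):
            # The path itself appears among the arguments:
            # substitute its string form at that position.
            args = tuple(str(a) if a is self else a for a in args)
            return f(*args)
        # Otherwise fall back to "the path goes first".
        return f(str(self), *args)


def watch_file(on_change, path_string):
    """Hypothetical legacy API that only accepts a string path."""
    if not isinstance(path_string, str):
        raise TypeError("path_string must be a str")
    return (on_change, path_string)


path = Path("log.txt")
registered = path(watch_file, print, path)   # watch_file(print, "log.txt")
length = path(len)                           # len("log.txt")
```

One consequence of this design is that the path has to be named twice in the call, which is exactly the trade-off of the OLD/NEW rewrite shown above.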
path.open shouldn't always call the ordinary file-constructor. I mean it should be changed for special path-classes like FTPPath ... (Of course for ordinary file-paths the ordinary file-class is the right one.)
Yes... I was planning to define it in a class which is able to 'touch' the filesystem, so an FTPPath could subclass basepath without the need to overload open, or subclass ReadablePath with this need.
Fine :-)
Christoph Becker-Freyseng
P.S.: ((I'll post the other things in different Re:'s))
Gerrit Holl wrote: Christoph Becker-Freyseng wrote:
Additionally, path validity is filesystem-dependent. And worse, on a system like Linux there can be more than one file system within the same root / and they could all have different restrictions! (If I were a vicious guy I could even umount and mount a different fs, making a valid path possibly invalid.)
I think this makes validating a path essentially impossible to get right. Let's say we can declare a path to be invalid, but we can't declare a path to be valid. Is it a good thing to add a method for it then? (I think yes)
So I think the better solution would be to define a path.isValid()
I agree that it's better. We should allow invalid paths after all.
Yes. path.isValid would make it possible to check better for situations
where calling things like mkdir and mkdirs (they directly depend on the
path being valid) causes trouble.
We could also add an InvalidPathException, which will at least help
debugging. isValid could have a default argument "raiseExc=False" to
make checking in these functions convenient, e.g.
    def mkdir(self):
        self.isValid(raiseExc=True)
        moreStuff ...
If the path is invalid it will stop with an InvalidPathException.
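A runnable sketch of that guard, with one loud assumption: the validity rule used here (non-empty, no NUL character) is invented purely for illustration, since the real rules depend on the filesystem. The Path class is a hypothetical stand-in.

```python
import os


class InvalidPathException(Exception):
    """Raised when an operation is attempted on an invalid path."""


class Path:
    def __init__(self, name):
        self._name = name

    def __str__(self):
        return self._name

    def isValid(self, raiseExc=False):
        # Assumption for illustration only: a path is invalid if it is
        # empty or contains a NUL character. Real rules are
        # filesystem-dependent.
        ok = bool(self._name) and "\0" not in self._name
        if not ok and raiseExc:
            raise InvalidPathException(repr(self._name))
        return ok

    def mkdir(self):
        self.isValid(raiseExc=True)  # stops here for an invalid path
        os.mkdir(str(self))
```

With this shape, callers that only want a yes/no answer use isValid(), while mutating methods get a uniform exception for free.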
Also path.exists should depend on path.isValid (not the other way).
If the full-path doesn't exist one can go up all the parent-dirs until
one exists. Here we can check if the specified sub-path is valid by
getting some information about the filesystem where the parent-dir is
stored. *This implicitly makes isValid a reading method* --- however AFAIK
isValid is only needed for reading and writing methods.
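The "go up the parent-dirs until one exists" step might look like this. The function name is hypothetical, and the sketch only finds the ancestor; querying that ancestor's filesystem for its naming rules is left out.

```python
import os


def nearest_existing_ancestor(path):
    """Walk up from path until an existing directory entry is found."""
    path = os.path.abspath(path)
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the root without finding anything that exists.
            break
        path = parent
    return path
```

Note that every loop iteration touches the filesystem, which is why this makes isValid a reading method.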
We also need a path.exists() method.
Sure.
I'm not sure how both should interact ??
Well, if path.exists(), path.isValid(). The question is - should path.isValid() read the filesystem?
Yes as stated above.
path.exists should first check whether the path isValid at all. If it
isn't, a statement about its existence is senseless. In this case it
should return None, which also evaluates to False but is different (this
is three-valued logic, "dreiwertige Logik" in German: 3 states, true,
false, unknown).
FIXME: We have to finetune the "recursive" behavior of isValid and
exists otherwise we have a lot of unnecessary calls as exists and
isValid call each other going one dir up ...
Christoph Becker-Freyseng
Gerrit Holl wrote: Christoph Becker-Freyseng wrote:
Should Path be immutable like string?
I have thought about this, too. It should certainly not be fully mutable, because if a path changes, it changes. But maybe we could have a .normalise_inplace() which mutates the Path? What consequences would this have for hashability?
I like paths to be hashable, so they probably should be immutable.
Yes. (already in the PEP)
While paths aren't strings they have a lot in common because paths (as I
now think of them) are not directly associated with files. (Paths can be
nonexistent or even invalid)
Moreover the basic operations like __eq__ shouldn't be reading methods!
__hash__ has to be compatible with __eq__.
hash(p1) == hash(p2) <<<=== p1 == p2
Also
hash(p1) == hash(p2) ===>>> p1 == p2
should be true as far as possible.
I think
    def __hash__(self):
        return hash(str(self.normalized()))
would do this fine.
So for __eq__ it follows naturally
    def __eq__(self, other):
        # FIXME: isinstance checking
        return (str(self.normalized()) == str(other.normalized()))
It cares about nonexistent paths, too. (The samefile-solution won't ---
we might code a special case for it ...)
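Put together, the proposal might look like the sketch below. It assumes normalized() is purely textual and behaves like os.path.normpath (an assumption; the thread has not pinned this down), and the Path class is a hypothetical stand-in.

```python
import os


class Path:
    def __init__(self, name):
        self._name = name

    def __str__(self):
        return self._name

    def normalized(self):
        # Assumption: purely textual normalization, no filesystem access.
        return Path(os.path.normpath(self._name))

    def __eq__(self, other):
        if not isinstance(other, Path):
            return NotImplemented
        return str(self.normalized()) == str(other.normalized())

    def __hash__(self):
        # Same key as __eq__, so equal paths always hash alike.
        return hash(str(self.normalized()))


sizes = {Path("a/b/../c"): 42}
found = Path("a/c") in sizes
```

Because neither method reads the filesystem, this works for nonexistent paths and keeps paths usable as dict keys.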
What about __cmp__?
I've to admit that __cmp__ comparing the file-sizes is nice (__eq__=
samefile is attractive, too --- they're both evil temptations :-) )
However __eq__ and __cmp__ returning possibly different results is odd.
Finally, implementing __cmp__ that way would make it a reading method and
is problematic for nonexistent paths.
I'd like an implementation of __cmp__ which is more path specific than
just string.__cmp__. But it should be consistent with __eq__.
Could we do something about parent and sub dirs?
Christoph Becker-Freyseng
I think the implementation should be changed for the "NormalFSPath".
import errno, os

def exists(self):
    try:
        os.stat(str(self))
        return True
    except OSError as exc:  # Couldn't stat so what's up
        if exc.errno == errno.ENOENT:  # it simply doesn't exist
            return False
        return None  # the path is invalid

def isValid(self, raiseExc=False):
    if self.exists() is None:
        if raiseExc:
            raise InvalidPath
        else:
            return False
    else:
        return True
Christoph Becker-Freyseng
Christoph Becker-Freyseng <we*******@beyond-thoughts.com> writes:
So for __eq__ it follows naturally
    def __eq__(self, other):
        # FIXME: isinstance checking
        return (str(self.normalized()) == str(other.normalized()))
It cares about nonexistent paths, too. (The samefile-solution won't --- we might code a special case for it ...)
What exactly does normalized() do? If it's equivalent to
os.path.normpath, then p1 == p2 might be true even though they refer to
different files (on posix, a/../b is not necessarily the same file as
b). OTOH, if it also called os.path.realpath to take symlinks into
account, __eq__ would depend on the state of the filesystem which is
also bad.
IMO __eq__ should simply compare the strings without any modification.
If you want to compare normalized paths you should have to normalize
them explicitly.
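The a/../b case can be reproduced with a symlink. The sketch below assumes a POSIX system where os.symlink is permitted; the directory layout is made up for the demonstration.

```python
import os
import tempfile


def normpath_changes_target():
    """Return (content read via a/../b, content read via normpath(a/../b))."""
    with tempfile.TemporaryDirectory() as d:
        os.makedirs(os.path.join(d, "other", "deep"))
        with open(os.path.join(d, "other", "b"), "w") as f:
            f.write("other/b")
        with open(os.path.join(d, "b"), "w") as f:
            f.write("top-level b")
        # "a" is a symlink into another directory tree.
        os.symlink(os.path.join(d, "other", "deep"), os.path.join(d, "a"))

        raw = os.path.join(d, "a", "..", "b")
        with open(raw) as f:
            via_raw = f.read()          # the OS follows the symlink first
        with open(os.path.normpath(raw)) as f:
            via_norm = f.read()         # normpath collapsed "a/.." textually
        return via_raw, via_norm
```

The two reads return different file contents, so an __eq__ built on textual normalization would call these two different files equal.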
Bernhard
--
Intevation GmbH http://intevation.de/
Sketch http://sketch.sourceforge.net/
Thuban http://thuban.intevation.org/
Bernhard Herzog wrote: Christoph Becker-Freyseng <we*******@beyond-thoughts.com> writes:
So for __eq__ it follows naturally
    def __eq__(self, other):
        # FIXME: isinstance checking
        return (str(self.normalized()) == str(other.normalized()))
It cares about nonexistent paths, too. (The samefile-solution won't --- we might code a special case for it ...)
What exactly does normalized() do? If it's equivalent to os.path.normpath, then p1 == p2 might be true even though they refer to different files (on posix, a/../b is not necessarily the same file as b). OTOH, if it also called os.path.realpath to take symlinks into account, __eq__ would depend on the state of the filesystem which is also bad.
IMO yes (it's equivalent to os.path.normpath).
IMO __eq__ should simply compare the strings without any modification. If you want to compare normalized paths you should have to normalize them explicitly.
I agree with that. While it would be nice if __eq__ could match such
things it is ambiguous.
So better let __eq__ be a bit strict than faulty.
Christoph Becker-Freyseng
As I pointed out path.__cmp__ should not be used for e.g. comparing
filesizes.
But features like sorting on filesizes are very useful.
I'm not sure if Gerrit Holl already meant this in his conclusion on
"Comparing files" in the PEP.
I'll outline it a bit ...
I propose a callable singleton class whose only instance we assign to
sort_on (defined in the path-module).
It will have methods like: filesize, extension, filename, etc.
They will all be defined like:
def filesize(self, path1, path2):
    try:
        return path1._cmp_filesize(path2)
    except XXX:  # catch Exceptions that are raised because path1 doesn't know how to compare with path2 (for different path-subclasses)
        XXX
    try:
        return (-1) * path2._cmp_filesize(path1)  # is this the best way to do this?
    except XXX:
        XXX
    raise YYY  # "path1 and path2 can't be compared on filesize; class1 and class2 are not compatible"
And
def __call__(self, *args):
    if len(args) == 0:
        return self.filesize  # example!
    elif len(args) == 1:  # allow comparing uncommon things for subclasses of path e.g. ServerName/IPs for FTPPath ...
        def cmp_x(path1, path2, what=str(args[0])):
            # like filesize but
            pathCmpFunc = getattr(path1, '_cmp_' + what)
            return pathCmpFunc(path2)
            # Catch exceptions ...
        return cmp_x
    elif len(args) == 2:  # default comparison
        return self.filesize(*args)  # example!
    else:
        raise TypeError("Won't work ... FIXME")
Then we can have things like:
l= [path1, path2, path3]
l.sort(path.sort_on.filesize)
l.sort(path.sort_on.extension)
.....
I like this :-)
What do you think?
Christoph Becker-Freyseng
On Fri, Jan 09, 2004 at 07:41:32PM +0100, Christoph Becker-Freyseng wrote: As I pointed out path.__cmp__ should not be used for e.g. comparing filesizes.
But features like sorting on filesizes are very useful. I'm not sure if Gerrit Holl already meant this in his conclusion on "Comparing files" in the PEP. I'll outline it a bit ...
This seems to be covered by the new builtin DSU support which will exist
in 2.4. See the (many, many) posts on python-dev on the "groupby" iterator: http://mail.python.org/pipermail/pyt...er/thread.html
In particular, the ones talking about `attrget'.
Jp
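For readers without the python-dev context: the feature that shipped in Python 2.4 is the key= argument to list.sort() and sorted(), together with operator.attrgetter. A sketch of how it covers the sort_on use case follows; the FileInfo class is hypothetical and merely stands in for whatever attributes a path object would expose.

```python
import os
import tempfile
from operator import attrgetter


class FileInfo:
    """Hypothetical record; a real path object could expose these itself."""

    def __init__(self, name):
        self.name = name
        self.size = os.path.getsize(name)
        self.extension = os.path.splitext(name)[1]


def demo():
    """Create three files and return their names sorted by size and by extension."""
    with tempfile.TemporaryDirectory() as d:
        infos = []
        for fname, content in [("big.txt", "x" * 100), ("small.py", "x"), ("mid.c", "x" * 10)]:
            full = os.path.join(d, fname)
            with open(full, "w") as f:
                f.write(content)
            infos.append(FileInfo(full))
        by_size = [os.path.basename(i.name) for i in sorted(infos, key=attrgetter("size"))]
        by_ext = [os.path.basename(i.name) for i in sorted(infos, key=attrgetter("extension"))]
        return by_size, by_ext
```

The key function is called once per element (decorate-sort-undecorate), so no pairwise _cmp_* methods are needed on the path classes.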
Gerrit Holl wrote: John Roth wrote:
I'm adding a thread for comments on Gerrit Holl's pre-pep, which
[...] Shall I submit this as an official PEP? Or shall I first fill in more open issues and perhaps give the possibility to change "closed" issues?
I think there are still a lot of issues. I think letting things settle
down first is wiser. And then present a PEP where many (even better
all) contributors agree. (I didn't like the result of PEP308 very much ...)
In addition there are still good points in the older discussions and
existing modules that should be integrated (you linked them in the prePEP).
Moreover I'd like to extend the pre-PEP (and PEP):
"PEP xxx: new path module"
Because there is more than just the Path-Class: BaseClasses, Exceptions,
Helper-Functions ...
Christoph Becker-Freyseng
Christoph Becker-Freyseng wrote: Gerrit Holl wrote:John Roth wrote:
I'm adding a thread for comments on Gerrit Holl's pre-pep, which [...]Shall I submit this as an official PEP? Or shall I first fill in more open issues and perhaps give the possibility to change "closed" issues?
I think there are still a lot of issues. I think letting things settle down first is wiser. And then present a PEP where many (even better all) contributors agree. (I didn't like the result of PEP308 very much ...)
Yes. But of course, a PEP being an official PEP does not mean there
can't be any more changes to it. So the question is, at what point does
a pre-PEP become a PEP? Some PEPs have a $Revision: 1.20$, after all.
In addition there are still good points in the older discussions and existing modules that should be integrated (you linked them in the prePEP).
Yes, that's true. I'll do that.
yours,
Gerrit.
--
49. If any one take money from a merchant, and give the merchant a
field tillable for corn or sesame and order him to plant corn or sesame in
the field, and to harvest the crop; if the cultivator plant corn or sesame
in the field, at the harvest the corn or sesame that is in the field shall
belong to the owner of the field and he shall pay corn as rent, for the
money he received from the merchant, and the livelihood of the cultivator
shall he give to the merchant.
-- 1780 BC, Hammurabi, Code of Law
--
PrePEP: Builtin path type http://people.nl.linux.org/~gerrit/c.../pep-xxxx.html
Asperger's Syndrome - a personal approach: http://people.nl.linux.org/~gerrit/english/