PRE-PEP: new Path class

I'm adding a thread for comments on Gerrit Holl's pre-pep, which
can be found here:

http://tinyurl.com/2578q

Frankly, I like the idea. It's about time that all of the file
and directory stuff in the os module got objectified
properly (or at least with some semblance of OO propriety!)

In the issues section:

1) Should path be a subclass of str?

No. Outside of the difficulty of deciding whether it's a
subclass of single byte or unicode strings, it's a pure
and simple case of Feature Envy. Granted, there will
be times a developer wants to use string methods, but
the most common operations should be supported directly.

2) virtual file system extensibility.

No opinion at this time. I'd like to see a couple
of attempts at an implementation first before
settling on a single design.

3) Should the / operator map joinpath.

I agree. No way. In the first place, that's a unixism
(Windows uses \, the Mac uses :) In the second
place it doesn't fit the common use of /, which is
to divide (separate) things. If we want an operator
for join (not a bad idea) I'd suggest using '+'. String
already overloads it for concatenation, and as I said
above, I'd just as soon *not* have this be a subclass
of string.

4) Should path expose an iterator for listdir(?)

I don't see why not, as long as the path is to a
directory.

5) Should == operator be the same as os.path.samefile()?

Why not...

6) Path.open()?

Of course.

7) Should the various gettime methods return Datetime
objects.

Of course.

8) Touch method?

Of course.

9) Current OS constants?

What are they? Are we talking about the four
constants in the access() function, or about something
else?

10) Commonprefix, walk and sameopenfile?

Commonprefix should be a string or list method,
it doesn't fit here.

walk is a nice function, but it should be redone to
use the visitor pattern directly, with different method
names for files, directories and whatever else a
particular file system has in its warped little mind.

sameopenfile doesn't belong in the os.path module
in the first place. It belongs in the os module under
6.1.3 - File Descriptor Operations.

11) rename join and split.

I wouldn't bother. Since I'm against making it a
subclass of str(), the issue doesn't arise.

12) Should == compare file sizes.

No. Might have a method to do that.

13) chdir, chmod, etc?

No. This has nothing to do with pathname.

14. Unicode filenames

Have to have them on Windows and probably
on the Mac.

15. Should files and directories be the same
class.

Probably not. While they share a lot of common
functionality (which should be spelled out as an
interface) they also have a lot of dissimilar
functionality. Separating them also makes it easy
to create objects for things like symbolic links.

In addition to this, we should have the ability
to update the other times (utime()) directly
using another file or directory object as well
as a Datetime object.

John Roth
Jul 18 '05 #1
In article <vv************ @news.supernews .com>,
"John Roth" <ne********@jhr othjr.com> wrote:
I'm adding a thread for comments on Gerrit Holl's pre-pep, which
can be found here:

http://tinyurl.com/2578q

Frankly, I like the idea. It's about time that all of the file
and directory stuff in the os module got objectified
properly (or at least with some semblance of OO propriety!)

In the issues section:
[ snipping those points where I agree with John ]
4) Should path expose an iterator for listdir(?)

I don't see why not, as long as the path is to a
directory.

_An_ iterator, sure, but not __iter__. How about path.listdir()? :)
__iter__ could also iterate over the path elements, so it's ambiguous at
least.
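
A minimal sketch of the two readings, assuming a hypothetical `Path` class with explicit `listdir()` and `parts()` methods (neither name is taken from the pre-PEP). With both spelled out, nothing has to be guessed about what iterating over a path means:

```python
import os
import tempfile


class Path:
    """Hypothetical path object; not the pre-PEP's actual API."""

    def __init__(self, text):
        self.text = text

    def listdir(self):
        # Explicit: hits the file system and returns the directory entries.
        return sorted(os.listdir(self.text))

    def parts(self):
        # Explicit: pure string work, splits the path into its elements.
        return [p for p in self.text.split(os.sep) if p]


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.txt"), "w").close()
    d = Path(tmp)
    entries = d.listdir()   # contents of the directory
    elements = d.parts()    # elements of the path itself
```

An `__iter__` would have to pick one of these two results, which is the ambiguity being pointed out.
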
15. Should files and directories be the same
class.

Probably not. While they share a lot of common
functionality (which should be spelled out as an
interface) they also have a lot of dissimilar
functionality. Separating them also makes it easy
to create objects for things like symbolic links.


But what about paths for not-yet-existing files or folders? I don't
think you should actually _hit_ the file system, if all you're doing is
path.join().

Just
Jul 18 '05 #2

"Just" <ju**@xs4all.nl > wrote in message
news:ju******** *************** *@news1.news.xs 4all.nl...
In article <vv************ @news.supernews .com>,
"John Roth" <ne********@jhr othjr.com> wrote:
I'm adding a thread for comments on Gerrit Holl's pre-pep, which
can be found here:

http://tinyurl.com/2578q

Frankly, I like the idea. It's about time that all of the file
and directory stuff in the os module got objectified
properly (or at least with some semblance of OO propriety!)

In the issues section:
[ snipping those points where I agree with John ]
4) Should path expose an iterator for listdir(?)

I don't see why not, as long as the path is to a
directory.


_An_ iterator, sure, but not __iter__. How about path.listdir()? :)
__iter__ could also iterate over the path elements, so it's ambiguous at
least.


I see what you're saying. I'd argue (softly) that iterating over
the directory entries is the natural interpretation, though.
15. Should files and directories be the same
class.

Probably not. While they share a lot of common
functionality (which should be spelled out as an
interface) they also have a lot of dissimilar
functionality. Separating them also makes it easy
to create objects for things like symbolic links.


But what about paths for not-yet-existing files or folders? I don't
think you should actually _hit_ the file system, if all you're doing is
path.join().


I agree here. I haven't looked at any of the candidate implementations
yet, so I don't know what they're doing. I'm thinking of a
three class structure: the parent class is just the path manipulations;
it has two subclasses, one for real files and one for real directories.
That way they can not only inherit all of the common path manipulation
stuff, but the developer can instantiate a pure path manipulation
class as well.

There might also be a mixin that encapsulates the stuff that's common
to real files and directories like accessing and changing dates and
permissions.

I'm sure there are use cases that will throw a curve at that structure
as well.
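
A rough sketch of that three-class structure, under stated assumptions: the names `PurePath`, `StatMixin`, `FilePath` and `DirPath` are made up for illustration, POSIX-style paths are assumed, and none of this comes from a candidate implementation.

```python
import os
import posixpath  # assumption: POSIX-style paths, to keep the sketch simple


class PurePath:
    """Path manipulation only; never touches the file system."""

    def __init__(self, text):
        self.text = text

    def join(self, *names):
        # Always returns a pure path: joining cannot know what the
        # result will point to without asking the file system.
        return PurePath(posixpath.join(self.text, *names))

    def basename(self):
        return posixpath.basename(self.text)


class StatMixin:
    """Metadata operations shared by real files and real directories."""

    def mtime(self):
        return os.stat(self.text).st_mtime


class FilePath(StatMixin, PurePath):
    def open(self, mode="r"):
        return open(self.text, mode)


class DirPath(StatMixin, PurePath):
    def listdir(self):
        return os.listdir(self.text)


# A pure path can be built and joined without any file-system access.
p = PurePath("/no/such/place").join("file.txt")
```

One curve shows up immediately: the result of `join()` is a pure path, so the caller has to convert it to `FilePath` or `DirPath` before using the file or directory operations.
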


Just

Jul 18 '05 #3
In article <vv************ @news.supernews .com>,
"John Roth" <ne********@jhr othjr.com> wrote:
4) Should path expose an iterator for listdir(?)

I don't see why not, as long as the path is to a
directory.


_An_ iterator, sure, but not __iter__. How about path.listdir()? :)
__iter__ could also iterate over the path elements, so it's ambiguous at
least.


I see what you're saying. I'd argue (softly) that iterating over
the directory entries is the natural interpretation, though.


It's far too implicit to my taste; for one since it's a folder-only
operation (and I don't see much merit in having separate classes for
folder and file paths). Would you also be in favor of iterating over
file-paths meaning iterating over the lines in the file?

Just
Jul 18 '05 #4
On Mon, Jan 05, 2004 at 10:06:59AM -0500, John Roth wrote:
I'm adding a thread for comments on Gerrit Holl's pre-pep, which
can be found here:

http://tinyurl.com/2578q

Frankly, I like the idea. It's about time that all of the file
and directory stuff in the os module got objectified
properly (or at least with some semblance of OO propriety!)

"Properly"? There is nothing particularly "proper" or "improper"
about objects or any other programming paradigm supported by Python.
Objectifying is not a goal in itself. I like the Path object because
the interface is easier to learn and use, not because it is
"objectified".

5) Should == operator be the same as os.path.samefile()?

Why not...

No. Symbolic links are something you would sometimes want to treat as
distinct from the files they point to.

walk is a nice function, but it should be redone to
use the visitor pattern directly, with different method
names for files, directories and whatever else a
particular file system has in its warped little mind.

I find a generator and a couple of elifs are much easier to read. No
need to define a class, pass context information to the methods of
that class, etc.
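
A minimal sketch of the generator approach, written for current Python 3 (it assumes `os.scandir` and `yield from`, which did not exist when this thread was written). The function name `walk` and the kind labels are illustrative only:

```python
import os
import tempfile


def walk(top):
    """Yield (kind, path) pairs for everything below top, depth first."""
    for entry in sorted(os.scandir(top), key=lambda e: e.name):
        if entry.is_symlink():
            yield "link", entry.path
        elif entry.is_dir():
            yield "dir", entry.path
            yield from walk(entry.path)
        elif entry.is_file():
            yield "file", entry.path
        else:
            yield "other", entry.path


counts = {"dir": 0, "file": 0}
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "sub", "a.txt"), "w").close()
    # The caller keeps its context in local variables; no visitor class.
    for kind, path in walk(tmp):
        if kind == "dir":
            counts["dir"] += 1
        elif kind == "file":
            counts["file"] += 1
```
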
13) chdir, chmod, etc?

No. This has nothing to do with pathname.


What's the difference between chmod and touch? They both affect the
file metadata in similar ways.

Oren
Jul 18 '05 #5
John Roth wrote:
Subject: PRE-PEP: new Path class

I'm adding a thread for comments on Gerrit Holl's pre-pep, which
can be found here:

http://tinyurl.com/2578q


I will update the Pre-PEP tomorrow based on the comments I already have
seen in this PEP. Note that it is very 'pre' and opinions expressed in
the PEP are not guaranteed to be consistent in any way ;)

I will comment on the comments later.

yours,
Gerrit.

--
Asperger's Syndrome - a personal approach:
http://people.nl.linux.org/~gerrit/english/

Jul 18 '05 #6
John Roth wrote:
"Just" <ju**@xs4all.nl > wrote in message
news:ju******* *************** **@news1.news.x s4all.nl...

In article <vv************ @news.supernews .com>,
"John Roth" <ne********@jhr othjr.com> wrote:

....
15. Should files and directories be the same
class.

Probably not. While they share a lot of common
functionality (which should be spelled out as an
interface) they also have a lot of dissimilar
functionality. Separating them also makes it easy
to create objects for things like symbolic links.

But what about paths for not-yet-existing files or folders? I don't
think you should actually _hit_ the file system, if all you're doing is
path.join().


I agree here. I haven't looked at any of the candidate implementations
yet, so I don't know what they're doing. I'm thinking of a
three class structure: the parent class is just the path manipulations;
it has two subclasses, one for real files and one for real directories.
That way they can not only inherit all of the common path manipulation
stuff, but the developer can instantiate a pure path manipulation
class as well.

There might also be a mixin that encapsulates the stuff that's common
to real files and directories like accessing and changing dates and
permissions.

I'm sure there are use cases that will throw a curve at that structure
as well.

My implementation combines the two into a single class. Here's the logic:

* There is no necessary distinction between files and directories at
  the path level
    o Particularly with upcoming ReiserFS 4, where structured
      storage shows up, it's possible to have files behaving much
      like directories.
    o Zip files also come to mind if we have virtual file system
      support eventually
* These objects represent paths, not the things pointed to by the paths.
    o They allow you to operate on the path, which is "almost" the
      filesystem, but not quite.
    o In the space of "paths", there's no distinction between a
      file and a directory, really.
    o Even a path that traverses a few symbolic links, and drops
      into a zip-file is still just a path, it's a way of
      specifying something, similar to a "name" or "location" class.
    o You can find out what the path-object points to via the path
      methods, but the path itself isn't those objects.
* Don't want to have to explicitly cast your paths to file/directory
  to get the basic file/directory operations when joining paths.
    o Mix-ins require changing the class of the instance by
      somehow figuring out that it's a file, that requires a
      file-system access (to what may be a non-existent or very
      expensive-to-access file).
    o There's not much of a conflict between the file/directory
      path operations
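
The single-class argument can be sketched roughly as follows. This is an illustration only, not Mike's actual implementation: the class name `Path` and its methods are assumptions, and POSIX-style paths are assumed.

```python
import os
import posixpath  # assumption: POSIX-style paths


class Path:
    """One class for every path; hypothetical, for illustration only."""

    def __init__(self, text):
        self.text = text

    def join(self, *names):
        # Pure path arithmetic: no file-system access, no casting needed.
        return Path(posixpath.join(self.text, *names))

    def isdir(self):
        # The file system is consulted only when the question is asked.
        return os.path.isdir(self.text)

    def isfile(self):
        return os.path.isfile(self.text)

    def listdir(self):
        # Fails at call time if the path is not a directory.
        return [self.join(name) for name in os.listdir(self.text)]


# Joining through something that might be a zip file is still just a path.
p = Path("/no/such/place").join("archive.zip", "member.txt")
```

The trade-off is that an operation such as `listdir()` is available on every path and can only fail when it is called.
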

Enjoy,
Mike

_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/


Jul 18 '05 #7
John Roth wrote:
....
1) Should path be a subclass of str?

No. Outside of the difficulty of deciding whether it's a
subclass of single byte or unicode strings, it's a pure
and simple case of Feature Envy. Granted, there will
be times a developer wants to use string methods, but
the most common operations should be supported directly.

It's not the methods that make me want to use str/unicode as the base,
it's the ability to pass the resulting instances to built-in methods
that explicitly expect/require str/unicode types. Not sure how many of
the core functions/libraries still have such requirements, but I'd guess
it's a few.

That said, I don't mind making path its own base-class, I just *really*
want to be able to pass them to path-unaware code without extra coercion
(otherwise switching a module to *producing* paths instead of raw
strings will cause the *clients* of that module to break, which is a
serious problem for rapid adoption).
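
A small sketch of why a str base makes adoption painless, assuming a hypothetical `Path(str)` class and a hypothetical path-unaware function `legacy_size`. It is written for Python 3, where `str` is already Unicode, so it does not settle the str-versus-unicode question raised earlier:

```python
import os
import tempfile


class Path(str):
    """Hypothetical str-based path; extra methods sit on top of str."""

    def joinpath(self, *names):
        return Path(os.path.join(self, *names))

    def exists(self):
        return os.path.exists(self)


def legacy_size(filename):
    # Path-unaware client code that insists on a plain string.
    assert isinstance(filename, str)
    with open(filename, "rb") as f:
        return len(f.read())


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp).joinpath("data.bin")
    with open(target, "wb") as f:
        f.write(b"abc")
    size = legacy_size(target)  # works with no coercion by the caller
```
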
3) Should the / operator map joinpath.

Agreed, no. As for using + for join, that will break a lot of code that
does things like this:

p = mymodule.getSomeFilename()
backup = p + '.bak'
copyfile( p, backup )
open( p, 'w').write( whatever )

i.e. we're thinking of returning these things in a lot of situations
where strings were previously returned, string-like operations should
IMO, be the norm. But then we disagree on that anyway ;) .
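
To make the breakage concrete, here is a sketch with a hypothetical non-str class `JoinPath` in which `+` means join (POSIX-style paths assumed). The same expression that used to append a suffix now adds a path element:

```python
import posixpath  # assumption: POSIX-style paths


class JoinPath:
    """Hypothetical non-str path where '+' means join."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        return JoinPath(posixpath.join(self.text, str(other)))

    def __str__(self):
        return self.text


old_style = "/data/report.txt" + ".bak"            # plain string
new_style = JoinPath("/data/report.txt") + ".bak"  # same expression, path object

print(old_style)        # /data/report.txt.bak
print(str(new_style))   # /data/report.txt/.bak
```
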
4) Should path expose an iterator for listdir(?)

I don't see why not, as long as the path is to a
directory.

Seems ambiguous to me. Also seems silly to use a generator when we're
producing a list anyway from the underlying call, might as well return
the list to allow length checks and random access. Iterators for
"ancestors" might be useful, but again, doesn't really seem like it
needs to be __iter__ instead of "ancestors".
5) Should == operator be the same as os.path.samefile()?

Why not...

__eq__ sounds about right. I gather this call goes out to the
filesystem first, though. Might be good to first check for absolute
equality (i.e. the same actual path) before doing that.
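
One possible shape for such an `__eq__`, as a sketch only: the `Path` class here is hypothetical, and treating a failed `samefile()` call as "not equal" is an assumption rather than anything the pre-PEP specifies.

```python
import os
import tempfile


class Path:
    """Hypothetical path object illustrating one possible __eq__."""

    def __init__(self, text):
        self.text = text

    def __eq__(self, other):
        if not isinstance(other, Path):
            return NotImplemented
        # Cheap check first: identical spelling after normalisation.
        if os.path.normpath(self.text) == os.path.normpath(other.text):
            return True
        # Only then ask the file system; missing files compare unequal.
        try:
            return os.path.samefile(self.text, other.text)
        except OSError:
            return False

    # Defining __eq__ without __hash__ leaves instances unhashable.


with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "real.txt")
    open(real, "w").close()
    same_spelling = Path(real) == Path(os.path.join(tmp, ".", "real.txt"))
    missing = Path(real) == Path(os.path.join(tmp, "missing.txt"))
```

Because `os.path.samefile()` follows symbolic links, a link and its target compare equal under this sketch, which is exactly the behaviour objected to earlier in the thread.
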
6) Path.open()?

Of course.

Ditto.
7) Should the various gettime methods return Datetime
objects.

Of course.

What are we doing for Python 2.2 then? I agree with the principle, but
we should likely have a fallback when datetime isn't available.
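
A fallback could look something like this sketch, where the helper name `getmtime` and the choice of a float timestamp as the fallback value are assumptions:

```python
import os

try:
    import datetime
except ImportError:  # e.g. Python 2.2, which predates the datetime module
    datetime = None


def getmtime(path):
    """Return mtime as a datetime when possible, else as a float timestamp."""
    stamp = os.path.getmtime(path)
    if datetime is None:
        return stamp
    return datetime.datetime.fromtimestamp(stamp)


result = getmtime(os.curdir)
```

The drawback is that callers then see two different return types depending on the Python version.
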
8) Touch method?

Of course.

Neutral, seems fine.
9) Current OS constants?

What are they? Are we talking about the four
constants in the access() function, or about something
else?

Don't know myself.
10) Commonprefix, walk and sameopenfile?

Commonprefix should be a string or list method,
it doesn't fit here.

Path commonprefix is a different operation from str commonprefix. Paths
should only accept entire path-segments (names) as being equal, while
strings should accept any set of characters:

'/this/that/those/them'
'/this/thatly/those/them'

should see '/this/' as the commonprefix for the paths, not '/this/that'.
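
The difference can be checked directly. In this sketch `common_segments` is a hypothetical helper (POSIX-style paths assumed) and it returns the shared directory without a trailing slash:

```python
import posixpath  # assumption: POSIX-style paths, as in the example above

paths = ["/this/that/those/them", "/this/thatly/those/them"]

# Character-by-character comparison, as os.path.commonprefix does.
by_chars = posixpath.commonprefix(paths)


def common_segments(paths):
    """Longest shared run of whole path segments (hypothetical helper)."""
    split = [p.split("/") for p in paths]
    shared = []
    for names in zip(*split):
        if len(set(names)) != 1:
            break
        shared.append(names[0])
    if shared == [""]:
        return "/"  # absolute paths that share only the root
    return "/".join(shared)


by_segments = common_segments(paths)

print(by_chars)     # /this/that
print(by_segments)  # /this
```

Later Python versions (3.5 and up) provide `os.path.commonpath()`, which compares whole segments in the same way.
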
walk is a nice function, but it should be redone to
use the visitor pattern directly, with different method
names for files, directories and whatever else a
particular file system has in its warped little mind.

Reworking walk is probably a good idea. I'll let others worry about it,
as I've re-implemented the functionality so many times for my own code
that I'm just sick of it :) .
11) rename join and split.

I wouldn't bother. Since I'm against making it a
subclass of str(), the issue doesn't arise.

No real preference one way or another here. join -> "append" for paths
seems fine. split -> "elements" or "steps" for paths also seems fine.
12) Should == compare file sizes.

No. Might have a method to do that.

Agreed, though even then, if we have a method that returns file-sizes:

path( ... ).size() == path( ... ).size()

seems almost as reasonable as having a method for it?
13) chdir, chmod, etc?

No. This has nothing to do with pathname.

chmod has to do with altering the access mode of a file/directory by
specifying its path, no? Seems like it could readily be a method of
the path. chdir should accept a path, otherwise doesn't seem like it
should be a method.
14. Unicode filenames

Have to have them on Windows and probably
on the Mac.

Yes.
15. Should files and directories be the same
class.

Replied to this in the sub-thread...

Enjoy all,
Mike

_______________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://members.rogers.com/mcfletch/


Jul 18 '05 #8

"Mike C. Fletcher" <mc******@roger s.com> wrote in message
news:ma******** *************** *************** @python.org...
John Roth wrote:
...
1) Should path be a subclass of str?

No. Outside of the difficulty of deciding whether it's a
subclass of single byte or unicode strings, it's a pure
and simple case of Feature Envy. Granted, there will
be times a developer wants to use string methods, but
the most common operations should be supported directly.

It's not the methods that make me want to use str/unicode as the base,
it's the ability to pass the resulting instances to built-in methods
that explicitly expect/require str/unicode types. Not sure how many of
the core functions/libraries still have such requirements, but I'd guess
it's a few.

That said, I don't mind making path its own base-class, I just *really*
want to be able to pass them to path-unaware code without extra coercion
(otherwise switching a module to *producing* paths instead of raw
strings will cause the *clients* of that module to break, which is a
serious problem for rapid adoption).


That's an excellent point, but it begs the question of which
string class it should subclass. Unless it's got some way of
changing its base class depending on the system it's running
on. That, in turn, probably violates the Principle of Least
Astonishment.
5) Should == operator be the same as os.path.samefile()?

Why not...

__eq__ sounds about right. I gather this call goes out to the
filesystem first, though. Might be good to first check for absolute
equality (i.e. the same actual path) before doing that.


I think this has to do with "conceptual integrity." Are we talking
about a path object that happens to have the ability to do file
system operations in appropriate circumstances, or are we talking
about a file system object that includes all of the usual path
manipulations? You seem to be thinking of the first approach,
and I'm thinking of the second. You're beginning to convince me,
though.
7) Should the various gettime methods return Datetime
objects.

Of course.

What are we doing for Python 2.2 then? I agree with the principle, but
we should likely have a fallback when datetime isn't available.


Do we care? If this is going into Python, it will be in 2.4 at the
earliest, with a possible addon to a late 2.3 release. I don't see
it going into 2.2 at all, although a backwards version would
be nice.
10) Commonprefix, walk and sameopenfile?

Commonprefix should be a string or list method,
it doesn't fit here.

Path commonprefix are different operations from str commonprefix. Paths
should only accept entire path-segments (names) as being equal, while
strings should accept any set of characters:

'/this/that/those/them'
'/this/thatly/those/them'

should see '/this/' as the commonprefix for the paths, not '/this/that'.


Good point if you're thinking of heterogeneous collections. If you're
thinking (as I am) that an object can represent a directory, then it
seems like a singularly useless method.
walk is a nice function, but it should be redone to
use the visitor pattern directly, with different method
names for files, directories and whatever else a
particular file system has in its warped little mind.

Reworking walk is probably a good idea. I'll let others worry about it,
as I've re-implemented the functionality so many times for my own code
that I'm just sick of it :) .

I can understand that. [grin]
13) chdir, chmod, etc?

No. This has nothing to do with pathname.

chmod has to do with altering the access mode of a file/directory by
specifying its path, no? Seems like it could readily be a method of
the path.


Right. I forgot that these are two totally different issues.
chdir should accept a path, otherwise doesn't seem like it
should be a method.
If the path object describes a directory, then I'd see
a .chdir() method as useful. Otherwise, it belongs
somewhere else, although I don't have a clue where
at the moment.

Enjoy all,
Mike


John Roth
Jul 18 '05 #9
"John Roth" <ne********@jhr othjr.com> wrote in message news:<vv******* *****@news.supe rnews.com>...
I'm adding a thread for comments on Gerrit Holl's pre-pep, which
can be found here:

http://tinyurl.com/2578q
....

1) Should path be a subclass of str?

No.


So will the file constructor be "overloaded" to accept path objects?
What about all those functions in the os module?
Jul 18 '05 #10
