[path-PEP] Path inherits from basestring again

Hi,

the arguments in the previous thread were convincing enough, so I made the
Path class inherit from str/unicode again.

It still can be found in CVS: /python/nondist/sandbox/path/{path.py,test_path.py}

One thing is still different, though: a Path instance won't compare to a regular
string.

Other minor differences, as requested on python-dev, are:

* size property -> getsize() method.
* atime/mtime/ctime properties -> atime()/mtime()/ctime() methods

* dirname() method -> directory property
* no parent property
* basename() method -> basename property
* no name property

* listdir() method -> children() method
* there is still a listdir() method, but with the semantics of os.listdir
* dirs() method -> subdirs() method
* joinpath() method -> added alias joinwith()
* splitall() method -> parts() method

* Default constructor: Path() == Path(os.curdir)
* staticmethod Path.getcwd() -> Path.cwd()

* bytes() / lines() / text() -> read_file_{bytes,lines,text} methods
* write_{bytes,lines,text} -> write_file_{bytes,lines,text} methods

These may be removed though.

Reinhold
Jul 23 '05 #1
Reinhold Birkenfeld wrote:
One thing is still different, though: a Path instance won't compare to a regular
string.
Could you please expand on what this means? Are you referring to doing
< and >= type operations on Paths and strings, or == and != or all those
or something else entirely?
Other minor differences, as requested on python-dev, are:

* size property -> getsize() method.
* atime/mtime/ctime properties -> atime()/mtime()/ctime() methods
What does this mean? The .size property and a getsize() method both
already exist (in my copy of path.py anyway) and do the same thing.
Same with the other ones mentioned above. Is someone working from an
out-of-date copy of path.py?
* dirs() method -> subdirs() method
Given that .files() exists, and returns a list of the files contained in
a path which represents a folder, why would one want to use subdirs()
instead of just dirs() to do the same operation for contained folders?
If subdirs() is preferred, then I suggest subfiles() as well. Otherwise
the change seems arbitrary and ill-conceived.
* joinpath() method -> added alias joinwith()
* splitall() method -> parts() method
This reminds me of the *one* advantage I can think of for not
subclassing basestring, though it still doesn't make the difference in
my mind: strings already have "split()", so Jason had to go with
"splitpath()" for the basic split operation to avoid a conflict. A
minor wart I guess.
* Default constructor: Path() == Path(os.curdir)
To construct an empty path then one can still do Path('') ?
* staticmethod Path.getcwd() -> Path.cwd()

* bytes() / lines() / text() -> read_file_{bytes,lines,text} methods
* write_{bytes,lines,text} -> write_file_{bytes,lines,text} methods


Under Linux isn't it possible to open and read from directories much as
with files? If that's true, the above would seem to conflict with that
in some way. As with the .subdirs() suggestion above, these changes
seem to me somewhat arbitrary. .bytes() and friends have felt quite
friendly in actual use, and I suspect .read_file_bytes() will feel quite
unwieldy. Not a show-stopper however.

-Peter
Jul 23 '05 #2
Peter Hansen wrote:
Under Linux isn't it possible to open and read from directories much as
with files?


Not really, no.

Python 2.3.4 (#2, Jan 5 2005, 08:24:51)
[GCC 3.3.5 (Debian 1:3.3.5-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> d = open('/usr/bin')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IOError: [Errno 21] Is a directory


--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter

Jul 23 '05 #3
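
For readers on a current interpreter, the check shown above can be reproduced roughly as follows. This is only a sketch, assuming Python 3, where the failure surfaces as an OSError subclass rather than the IOError of the 2.3 session; try_open_directory is a hypothetical helper name, not part of path.py.

```python
import os
import tempfile


def try_open_directory(path):
    """Return the exception raised by open() on a directory, or None."""
    try:
        with open(path):
            return None
    except OSError as exc:
        # Typically IsADirectoryError on POSIX, PermissionError on Windows.
        return exc


with tempfile.TemporaryDirectory() as tmp:
    err = try_open_directory(tmp)
    print(type(err).__name__, os.path.isdir(tmp))
```

Catching the OSError base class keeps the sketch portable, since the concrete subclass depends on the platform.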
Peter Hansen wrote:
Reinhold Birkenfeld wrote:
One thing is still different, though: a Path instance won't compare to a regular
string.


Could you please expand on what this means? Are you referring to doing
< and >= type operations on Paths and strings, or == and != or all those
or something else entirely?


All of these. Do you need them?
Other minor differences, as requested on python-dev, are:

* size property -> getsize() method.
* atime/mtime/ctime properties -> atime()/mtime()/ctime() methods


What does this mean? The .size property and a getsize() method both
already exist (in my copy of path.py anyway) and do the same thing.
Same with the other ones mentioned above. Is someone working from an
out-of-date copy of path.py?


No. But the size of a file is somewhat volatile, and does not feel like
a "property" of the path to it. Remember: the path is not the file. Same
goes with the xtime() methods.

Different is the basename/directory/etc.: as long as the path stays the same,
these properties will stay the same.
* dirs() method -> subdirs() method


Given that .files() exists, and returns a list of the files contained in
a path which represents a folder, why would one want to use subdirs()
instead of just dirs() to do the same operation for contained folders?
If subdirs() is preferred, then I suggest subfiles() as well. Otherwise
the change seems arbitrary and ill-conceived.


Well, I think that's right. Will change back to dirs().
* joinpath() method -> added alias joinwith()
* splitall() method -> parts() method


This reminds me of the *one* advantage I can think of for not
subclassing basestring, though it still doesn't make the difference in
my mind: strings already have "split()", so Jason had to go with
"splitpath()" for the basic split operation to avoid a conflict. A
minor wart I guess.


At the moment, I think about overriding certain string methods that make
absolutely no sense on a path and raising an exception from them.
* Default constructor: Path() == Path(os.curdir)


To construct an empty path then one can still do Path('') ?


Yes.
* staticmethod Path.getcwd() -> Path.cwd()

* bytes() / lines() / text() -> read_file_{bytes,lines,text} methods
* write_{bytes,lines,text} -> write_file_{bytes,lines,text} methods


Under Linux isn't it possible to open and read from directories much as
with files? If that's true, the above would seem to conflict with that
in some way. As with the .subdirs() suggestion above, these changes
seem to me somewhat arbitrary. .bytes() and friends have felt quite
friendly in actual use, and I suspect .read_file_bytes() will feel quite
unwieldy. Not a show-stopper however.


It has even been suggested to throw them out, as they don't have so much to
do with a path per se. When the interface is too burdened, we'll have less
chance to be accepted. Renaming these makes clear that they are not operations
on the path, but on a file the path points to.

Phillip J. Eby suggested these to be set_file_xxx and get_file_xxx to demonstrate
that they do not read or write a stream; how about that?

Reinhold

Jul 23 '05 #4
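
The distinction Reinhold draws between stable properties and volatile methods can be sketched as below. This is a toy str subclass written for Python 3, not the sandbox path.py; only the names basename, directory and getsize() come from the thread, everything else is an assumption made for illustration.

```python
import os


class Path(str):
    """Toy str subclass; not the sandbox path.py implementation."""

    @property
    def basename(self):
        # Derived from the path string alone, so it never changes.
        return os.path.basename(self)

    @property
    def directory(self):
        # Also purely textual; returns another Path.
        return Path(os.path.dirname(self))

    def getsize(self):
        # Asks the filesystem: the answer can change between calls,
        # and the call raises OSError if the file does not exist.
        return os.path.getsize(self)


p = Path(os.path.join("some", "dir", "report.txt"))
print(p.basename)          # report.txt
print(repr(p.directory))
```

The properties work even when nothing exists at the path, while getsize() is a method call that may fail, which is the argument made above for spelling it differently.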
Reinhold Birkenfeld wrote:
Peter Hansen wrote (on Paths not allowing comparison with strings):
Could you please expand on what this means? Are you referring to doing
< and >= type operations on Paths and strings, or == and != or all those
or something else entirely?
All of these. Do you need them?


I believe so. If they are going to be basestring subclasses, why should
they be restricted in any particular way? I suppose that if you wanted
to compare a Path to a string, you could just wrap the string in a Path
first, but if the Path is already a basestring subclass, why make
someone jump through that particular hoop?
Other minor differences, as requested on python-dev, are:

* size property -> getsize() method.
* atime/mtime/ctime properties -> atime()/mtime()/ctime() methods


What does this mean? The .size property and a getsize() method both
already exist (in my copy of path.py anyway) and do the same thing.
Same with the other ones mentioned above. Is someone working from an
out-of-date copy of path.py?


No. But the size of a file is somewhat volatile, and does not feel like
a "property" of the path to it. Remember: the path is not the file. Same
goes with the xtime() methods.


Oh, so your original text was meant to imply that those properties *were
being removed*. That wasn't at all clear to me.

I understand the reasoning, but I'm unsure I agree with it. I fully
accept that the path is not the file, and yet I have a feeling this is a
pedanticism: most of the time when one is dealing with the _file_ one is
concerned with the content, and not much else. When one is dealing with
the _path_ one often wants to check the size, the modification time, and
so forth. For example, once one has the file open, one very rarely is
interested in when it was last modified.

In other words, I feel once again that Jason's original intuition here
was excellent, and that he chose practicality over purity in appropriate
ways, in a very Pythonic fashion. I confess to feeling that the
suggested changes are being proposed by those who have never actually
tried to put path.py to use in practical code, though I'm sure that's
not the case for everyone making those suggestions.

Still, once again this doesn't seem a critical issue to me and I'm happy
with either approach, if it means Path gets accepted in the stdlib.
At the moment, I think about overriding certain string methods that make
absolutely no sense on a path and raising an exception from them.
That would seem reasonable. It seems best to be very tolerant about
what "makes no sense", though istitle() would surely be one of those to
go first. Also capitalize() (in spite of what Windows Explorer seems to
do sometimes), center(), expandtabs(), ljust(), rjust(), splitlines(),
title(), and zfill(). Hmm... maybe not zfill() actually. I could
imagine an actual (if rare) use for that.
.bytes() and friends have felt quite
friendly in actual use, and I suspect .read_file_bytes() will feel quite
unwieldy. Not a show-stopper however.


It has even been suggested to throw them out, as they don't have so much to
do with a path per se. When the interface is too burdened, we'll have less
chance to be accepted. Renaming these makes clear that they are not operations
on the path, but on a file the path points to.


Here again I would claim the "practicality over purity" argument. When
one has a Path, it is very frequently because one intends to open a file
object using it and do reads and writes (obviously). Also very often,
the type of reading and writing one wants to do is an "all at once" type
of thing, as those methods support. They're merely a convenience, to
save one doing the Path(xxx).open('rb').read thing when one can merely
do Path(xxx).bytes(), in much the same way that the whole justification
for Path() is that it bundles useful and commonly used operations
together into one place.
Phillip J. Eby suggested these to be set_file_xxx and get_file_xxx to demonstrate
that they do not read or write a stream; how about that?


If they are there, they do exactly what they do, don't they? And they
do file.read() and file.write() operations, with slight nuances in the
mode passed to open() or the way the data is manipulated. Why would one
want to hide that, making it even harder to tie these operations
together with what is really going on under the covers? I think the
existing names, or at least ones with _read_ and _write_ in them
somewhere are better than set/get alternatives. It's just rare in
Python to encounter names quite as cumbersome as _write_file_bytes().

It might be good for those involved to discuss and agree on the
philosophy/principles behind using Path in the first place. If it's one
of pragmatism, then the arguments in favour of strictly differentiating
between path- and file- related operations should probably not be given
as much weight as those in favour of simple and convenient access to
commonly needed functionality. If, on the other hand, Path is seen as
some kind of a Java-esque universal path object which is cleanly and
tightly decoupled from everything else, then it would probably be best
to eliminate things like .getsize() and .read_file_bytes()/.bytes()
entirely and leave those in the hands of the cleanly defined and tightly
decoupled File object (currently spelled "file"?), again in a Java-esque
fashion. IMHO. :-)

(I'd like to say for the record that I feel that just about *any* form
of Path with even just the basics, basestring-based or not, would be a
huge improvement over the status quo, and I'm not trying to make a big
war out of this. Just offering my own view as a recent (a month or two
ago) but very enthusiastic convert to path.py.)

-Peter

Jul 23 '05 #5
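
The idea of disabling string methods that make no sense on a path might look something like the sketch below. It assumes Python 3 and a toy str subclass; the list of method names is taken from Peter's post (minus zfill(), which he wanted to keep), while the choice of TypeError and the helper _make_disabled are assumptions, since the thread does not say which exception would be raised.

```python
class Path(str):
    """Toy str subclass used only to illustrate the idea."""

    # Methods named in the thread as making little sense on a path.
    _DISABLED = ("istitle", "capitalize", "center", "expandtabs",
                 "ljust", "rjust", "splitlines", "title")


def _make_disabled(name):
    """Build a replacement method that refuses to run (hypothetical helper)."""
    def method(self, *args, **kwargs):
        raise TypeError("str.%s() is not supported on Path objects" % name)
    method.__name__ = name
    return method


for _name in Path._DISABLED:
    setattr(Path, _name, _make_disabled(_name))

p = Path("/tmp/readme.txt")
print(p.upper())           # inherited str methods still work
try:
    p.title()
except TypeError as exc:
    print(exc)
```

Everything not listed in _DISABLED, such as endswith() or zfill(), is still inherited from str unchanged.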
Peter Hansen <pe***@engcorp.com> writes:
* staticmethod Path.getcwd() -> Path.cwd()
* bytes() / lines() / text() -> read_file_{bytes,lines,text} methods
* write_{bytes,lines,text} -> write_file_{bytes,lines,text} methods

Under Linux isn't it possible to open and read from directories much
as with files?


The OS doesn't matter - python won't let you open a directory as a
file, even if the underlying OS will. The comment in
Objects/fileobject.c is:

/* On Unix, fopen will succeed for directories.
In Python, there should be no file objects referring to
directories, so we need a check. */

I think - but I'm not positive, and don't have a Linux box handy to
check on - that this comment is false if your Unix is really Linux.

<mike
--
Mike Meyer <mw*@mired.org> http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
Jul 23 '05 #6
Peter Hansen wrote:
Reinhold Birkenfeld wrote:
Peter Hansen wrote (on Paths not allowing comparison with strings):
Could you please expand on what this means? Are you referring to doing
< and >= type operations on Paths and strings, or == and != or all those
or something else entirely?
All of these. Do you need them?


I believe so. If they are going to be basestring subclasses, why should
they be restricted in any particular way? I suppose that if you wanted
to compare a Path to a string, you could just wrap the string in a Path
first, but if the Path is already a basestring subclass, why make
someone jump through that particular hoop?


Do you have a use case for the comparison? Paths should be compared only
with other paths.
Other minor differences, as requested on python-dev, are:

* size property -> getsize() method.
* atime/mtime/ctime properties -> atime()/mtime()/ctime() methods

What does this mean? The .size property and a getsize() method both
already exist (in my copy of path.py anyway) and do the same thing.
Same with the other ones mentioned above. Is someone working from an
out-of-date copy of path.py?


No. But the size of a file is somewhat volatile, and does not feel like
a "property" of the path to it. Remember: the path is not the file. Same
goes with the xtime() methods.


Oh, so your original text was meant to imply that those properties *were
being removed*. That wasn't at all clear to me.

I understand the reasoning, but I'm unsure I agree with it. I fully
accept that the path is not the file, and yet I have a feeling this is a
pedanticism: most of the time when one is dealing with the _file_ one is
concerned with the content, and not much else. When one is dealing with
the _path_ one often wants to check the size, the modification time, and
so forth. For example, once one has the file open, one very rarely is
interested in when it was last modified.


My line of thought is that a path may, but does not need to refer to an
existing, metadata-readable file. For this, I think a property is not
proper.
In other words, I feel once again that Jason's original intuition here
was excellent, and that he chose practicality over purity in appropriate
ways, in a very Pythonic fashion. I confess to feeling that the
suggested changes are being proposed by those who have never actually
tried to put path.py to use in practical code, though I'm sure that's
not the case for everyone making those suggestions.

Still, once again this doesn't seem a critical issue to me and I'm happy
with either approach, if it means Path gets accepted in the stdlib.
At the moment, I think about overriding certain string methods that make
absolutely no sense on a path and raising an exception from them.
That would seem reasonable. It seems best to be very tolerant about
what "makes no sense", though istitle() would surely be one of those to
go first. Also capitalize() (in spite of what Windows Explorer seems to
do sometimes), center(), expandtabs(), ljust(), rjust(), splitlines(),
title(), and zfill(). Hmm... maybe not zfill() actually. I could
imagine an actual (if rare) use for that.


I'll look into it. What about iteration and indexing? Should it support
"for element in path" or "for char in path" or nothing?
.bytes() and friends have felt quite
friendly in actual use, and I suspect .read_file_bytes() will feel quite
unwieldy. Not a show-stopper however.


It has even been suggested to throw them out, as they don't have so much to
do with a path per se. When the interface is too burdened, we'll have less
chance to be accepted. Renaming these makes clear that they are not operations
on the path, but on a file the path points to.


Here again I would claim the "practicality over purity" argument. When
one has a Path, it is very frequently because one intends to open a file
object using it and do reads and writes (obviously). Also very often,
the type of reading and writing one wants to do is an "all at once" type
of thing, as those methods support. They're merely a convenience, to
save one doing the Path(xxx).open('rb').read thing when one can merely
do Path(xxx).bytes(), in much the same way that the whole justification
for Path() is that it bundles useful and commonly used operations
together into one place.
Phillip J. Eby suggested these to be set_file_xxx and get_file_xxx to demonstrate
that they do not read or write a stream; how about that?


If they are there, they do exactly what they do, don't they? And they
do file.read() and file.write() operations, with slight nuances in the
mode passed to open() or the way the data is manipulated. Why would one
want to hide that, making it even harder to tie these operations
together with what is really going on under the covers? I think the
existing names, or at least ones with _read_ and _write_ in them
somewhere are better than set/get alternatives. It's just rare in
Python to encounter names quite as cumbersome as _write_file_bytes().


I think it is not exactly bad that these names are somehow outstanding,
as that demonstrates that something complex and special happens.
It might be good for those involved to discuss and agree on the
philosophy/principles behind using Path in the first place. If it's one
of pragmatism, then the arguments in favour of strictly differentiating
between path- and file- related operations should probably not be given
as much weight as those in favour of simple and convenient access to
commonly needed functionality. If, on the other hand, Path is seen as
some kind of a Java-esque universal path object which is cleanly and
tightly decoupled from everything else, then it would probably be best
to eliminate things like .getsize() and .read_file_bytes()/.bytes()
entirely and leave those in the hands of the cleanly defined and tightly
decoupled File object (currently spelled "file"?), again in a Java-esque
fashion. IMHO. :-)
Hm. No, that's not my intention either. I think that path as it is is already
very good. The PEP must follow, and stress this point.
(I'd like to say for the record that I feel that just about *any* form
of Path with even just the basics, basestring-based or not, would be a
huge improvement over the status quo, and I'm not trying to make a big
war out of this. Just offering my own view as a recent (a month or two
ago) but very enthusiastic convert to path.py.)


That's a basis we can build on. ;)

Reinhold
Jul 23 '05 #7

"Reinhold Birkenfeld" <re************************@wolke7.net> wrote in
message news:3k************@individual.net...

I'll look into it. What about iteration and indexing? Should it support
"for element in path" or "for char in path" or nothing?
I frankly can't think of a use for iterating over the characters in
the path, but I have a number of programs that check elements,
iterate over them and index them (frequently backwards).

I also like to know the number of elements, which seems to make
sense as len(path). Again, the number of characters in the path seems
to be utterly useless information - at least, I can't imagine a use for
it.

John Roth



Jul 24 '05 #8
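
The element-wise view of a path that John describes (checking elements, indexing them from the end, counting them) can be tried out with pathlib, which entered the standard library in Python 3.4, long after this thread. The sketch below uses PurePosixPath purely as a stand-in; it is not the sandbox Path class, and the example path is made up.

```python
from pathlib import PurePosixPath

p = PurePosixPath("/usr/local/lib/python/site-packages")
parts = p.parts            # tuple of path elements

print(parts)               # ('/', 'usr', 'local', 'lib', 'python', 'site-packages')
print(len(parts))          # 6 elements
print(parts[-1])           # site-packages
print(parts[-2])           # python

for element in parts:      # iterate over elements, not characters
    print(element)

print(len(str(p)))         # character count is still available via str()
```

Note that pathlib exposes the elements through a separate parts attribute instead of making len() and indexing on the path object itself element-based.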
Reinhold Birkenfeld wrote:
[on comparing Paths and strings]
Do you have a use case for the comparison? Paths should be compared only
with other paths.
I can think of lots, though I don't know that I've used any in my
existing (somewhat limited) code that uses Path, but they all involve
cases where I would expect, if comparisons were disallowed, to just wrap
the string in a Path first, even though to me that seems like it should
be an unnecessary step:

    if mypath.splitpath()[0] == 'c:/temp':

    if 'tests' in mypath.dirs():

and lots of other uses which start by treating a Path as a string
first, such as by doing .endswith('_unit.py')

Any of these could be resolved by ensuring both are Paths, but then I'm
not sure there's much justification left for using a baseclass of
basestring in the first place:

    if mypath.splitpath()[0] == Path('c:/temp'):

if Path('tests') in mypath.dirs():

Question: would this latter one actually work? Would this check items
in the list using comparison or identity? Identity would simply be
wrong here.
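
On the comparison-or-identity question: list membership in Python checks identity first and then falls back to equality, so the outcome depends on how Path defines __eq__. The sketch below assumes Python 3 and a toy Path that refuses to compare equal to plain strings; it only illustrates the behaviour described in this thread and is not the sandbox implementation.

```python
class Path(str):
    """Toy str subclass that only compares equal to other Path objects."""

    def __eq__(self, other):
        if not isinstance(other, Path):
            # Returning False (not NotImplemented) stops Python from
            # falling back to the plain str comparison.
            return False
        return str.__eq__(self, other)

    def __ne__(self, other):
        return not self.__eq__(other)

    # Defining __eq__ clears the inherited __hash__ in Python 3.
    __hash__ = str.__hash__


dirs = [Path("docs"), Path("tests")]
print(Path("tests") in dirs)   # True: equal by value, not the same object
print("tests" in dirs)         # False: plain strings never match
```

So with such a class the Path('tests') form works through equality, while the bare 'tests' form silently stops matching.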

[on removing properties in favour of methods for volatile data]
My line of thought is that a path may, but does not need to refer to an
existing, metadata-readable file. For this, I think a property is not
proper.
Fair enough, though in either case an attempt to access that information
leads to the same exception. I can't make a strong argument in favour
of properties (nor against them, really).
What about iteration and indexing? Should it support
"for element in path" or "for char in path" or nothing?
As John Roth suggests, the former seems a much more useful thing to do.
The latter is probably as rarely needed as it is with regular strings
(which I believe is roughly "never" in Python).

[on .read_file_bytes() etc]
I think it is not exactly bad that these names are somehow outstanding,
as that demonstrates that something complex and special happens.


Point taken. What about ditching the "file" part, since it is redundant
and obvious that a file is in fact what is being accessed? Thus:
.read_bytes(), .read_text(), .write_lines() etc.

-Peter
Jul 24 '05 #9
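
The "all at once" convenience methods under the shorter names proposed above could be sketched like this. It assumes Python 3 and a toy str subclass; the method bodies and the utf-8 default are assumptions for illustration, not the behaviour of path.py.

```python
import os
import tempfile


class Path(str):
    """Toy str subclass with whole-file convenience methods."""

    def read_bytes(self):
        # Equivalent to open(path, 'rb').read(), with the file closed promptly.
        with open(self, "rb") as f:
            return f.read()

    def write_bytes(self, data):
        with open(self, "wb") as f:
            f.write(data)

    def read_text(self, encoding="utf-8"):
        with open(self, "r", encoding=encoding) as f:
            return f.read()


with tempfile.TemporaryDirectory() as tmp:
    target = Path(os.path.join(tmp, "demo.bin"))
    target.write_bytes(b"hello")
    print(target.read_bytes())   # b'hello'
    print(target.read_text())    # hello
```

Each method opens, transfers and closes in one call, which is the convenience being argued for; anything that needs streaming would still go through open().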
On Sat, 23 Jul 2005 17:51:31 -0600, John Roth wrote:
I also like to know the number of elements, which seems to make
sense as len(path). Again, the number of characters in the path seems
to be utterly useless information - at least, I can't imagine a use for
it.


There are (were?) operating systems that could only deal with a maximum
length for pathnames. If I recall correctly, and I probably don't, Classic
Mac (pre-OS X) was limited to file names of 31 or fewer characters and no
more than 250-odd for the entire pathname. At the very least, some file
manager routines would work and some would not.

If you are printing the pathname, you may care about the length so that
you can truncate it:

    # raw strings, so that \r and \n are not read as escape sequences
    longname = r"C:\really\really\really\really\really\long\path\name.txt"
    if len(longname) > 30:
        # do magic here
        print(r"C:\really\ ... \path\name.txt")
    else:
        print(longname)

--
Steven.

Jul 24 '05 #10
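
One way the "do magic here" step could be filled in is sketched below, assuming Python 3. The function name shorten, the 30-character limit and the rule of keeping the first component plus the last two are all assumptions for illustration; the post does not specify how the truncation should work.

```python
def shorten(path, limit=30, sep="\\"):
    """Return path unchanged if short enough, else elide the middle parts."""
    if len(path) <= limit:
        return path
    parts = path.split(sep)
    if len(parts) < 4:
        return path              # too few components to drop anything
    # Keep the first component and the last two, replace the rest.
    return sep.join([parts[0], "...", parts[-2], parts[-1]])


longname = r"C:\really\really\really\really\really\long\path\name.txt"
print(shorten(longname))         # C:\...\path\name.txt
print(shorten(r"C:\a.txt"))      # C:\a.txt
```

This is exactly the kind of use where the character length of a path matters, as opposed to its number of elements.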
