PEP on path module for standard library

Many of you are familiar with Jason Orendorff's path module
<http://www.jorendorff.com/articles/python/path/>, which is frequently
recommended here on c.l.p. I submitted an RFE to add it to the Python
standard library, and Reinhold Birkenfeld started a discussion on it in
python-dev
<http://mail.python.org/pipermail/python-dev/2005-June/054438.html>.

The upshot of the discussion was that many python-dev'ers wanted path
added to the stdlib, but Guido was not convinced and said it must have a
PEP. So Reinhold and I are going to work on one. Reinhold has already
made some changes to the module to fit the python-dev discussion and put
it in CPython CVS at nondist/sandbox/path.

For the PEP, do any of you have arguments for or against including path?
Code samples that are much easier or more difficult with this class
would also be most helpful.

I use path in more of my modules and scripts than any other third-party
module, and I know it will be very helpful when I no longer have to
worry about deploying it.

Thanks in advance,
--
Michael Hoffman
Jul 21 '05
on 22.07.2005 16:14 George Sakkis said the following:
"Duncan Booth" <du**********@i nvalid.invalid> wrote:
Personally I think the concept of a specific path type is a good one, but
subclassing string just cries out to me as the wrong thing to do. In other
words, to me a path represents something in a filesystem, the fact that it
has one, or indeed several string representations does not mean that the
path itself is simply a more specific type of string.
[snip] practicality-vs-purity decision. You're right, conceptually a path
HAS_A string description, not IS_A string, so from a pure OO point of
view, it should not inherit string.


Java has `File` which mixes the concepts "an object in the filesystem"
and "a structured locator for such objects (in a hierarchical fs) that
might or might not correspond to an object that is actually there".

`file` and `path` separate that. I think this is very reasonable.

(It would be nice to get `path`(s) easily from a `file`, at the moment
there is only file.name if I'm not mistaken).

And a `path`, to me, actually IS_A string (unicode string) that happens
to have special behaviour (such as OS-dependent quirks like a path
separator that automatically gets translated, comparable to '\n' used
internally in strings being translated to '\n'|'\r\n').

stefan

Jul 22 '05 #31
George Sakkis wrote:
Having said that, I think
the choice between subclassing or not is going to be a
practicality-vs-purity decision. You're right, conceptually a path
HAS_A string description, not IS_A string, so from a pure OO point of
view, it should not inherit string. OTOH, people in favor of the
subclassing point out the convenience for many (or most) common cases.


It would be an entirely different matter if we were designing a language
from scratch. But we have to deal with an existing codebase that expects
strings.

Here's some code I just wrote seconds ago to construct a path for a scp
upload:

"""
DST_DIRPATH = path("host:~/destination")
RSS_EXT = "rss"

dst_filenamebase = os.extsep.join([postcode.lower(), RSS_EXT])
dst_filepath = DST_DIRPATH.joinpath(dst_filenamebase)
"""

With the current path implementation, this Just Works. If I were using
something that parsed and understood paths, the scp/rcp convention of
host:filename would either cause an error or have to be programmed in
separately. The current implementation is much more flexible.

What are the practical advantages and conveniences of *not* subclassing
from basestring?
--
Michael Hoffman
Jul 22 '05 #32
"Duncan Booth" <du**********@i nvalid.invalid> wrote:
Personally I think the concept of a specific path type is a good one, but
subclassing string just cries out to me as the wrong thing to do. In other
words, to me a path represents something in a filesystem, the fact that it
has one, or indeed several string representations does not mean that the
path itself is simply a more specific type of string.

You should need an explicit call to convert a path to a string and that
forces you when passing the path to something that requires a string to
think whether you wanted the string relative, absolute, UNC, uri etc.


First off, I find this is a relatively small detail overall, that is, regardless of whether path
subclasses string or not, its addition to the standard library will be a step forward. Having said
that, I think the choice between subclassing or not is going to be a practicality-vs-purity
decision. You're right, conceptually a path HAS_A string description, not IS_A string, so from a
pure OO point of view, it should not inherit string. OTOH, people in favor of the subclassing point
out the convenience for many (or most) common cases. It's a tradeoff, so arguments for both cases
should be discussed.

George


Jul 22 '05 #33
Peter Hansen wrote:
Duncan, are you another formerly non-user of path who has this opinion,
or have you already attempted to use path extensively in your code?

I'm a currently non-user of path who would probably use it if it were in
the standard library but so far have been satisfied to use os.path.

I'm not saying I dismiss the opinions of those who haven't actually
tried working with a string-based path object, but it's worth
considering that you might adopt a different opinion after using it for
a while.

I fully accept that. My point is simply that as a non-user, it sounds to me
as though subclassing string is the wrong approach. I would have expected a
path object to be a sequence of path elements rather than a sequence of
characters. This is basically just a gut feeling though, so I'm perfectly
happy to be told that I'm wrong.

BTW, does it matter at all in practical use that the base class of path
varies between str and unicode depending on the platform?

John Roth wrote: You have to start somewhere. One of the lessons that's beginning
to seep into people's minds is that getting something that works
out there is almost always preferable to (over) design by committee.


Dead right, but once it goes into the standard library it has to pretty
well stop evolving, so it needs to be right, or as close as possible before
that happens.

Jul 22 '05 #34
Duncan Booth wrote:
I would have expected a
path object to be a sequence of path elements rather than a sequence of
characters.


Maybe it's nitpicking, but I don't think that a path object should be a
'sequence of path elements' in an iterator context.

This means that

for element in pathobject:

has no intuitive meaning for me, so it shouldn't be allowed.

Daniel
Jul 22 '05 #35
"Duncan Booth" <du**********@i nvalid.invalid> wrote in message
news:Xn******** *************** **@127.0.0.1...

John Roth wrote:
You have to start somewhere. One of the lessons that's beginning
to seep into people's minds is that getting something that works
out there is almost always preferable to (over) design by committee.
Dead right, but once it goes into the standard library it has to pretty
well stop evolving, so it needs to be right, or as close as possible
before
that happens.


It has to stop evolving in incompatible directions, at least. Although
there is a precedent with the process functions, classes, module,
whatever it is. It's up to five versions now, isn't it?

AFAICT, from a very broad brush perspective, there is really
only one substantive issue: how to handle multiple path-like
"things". URLs have been mentioned in this thread, different
file systems and a possible in-memory file system have been
mentioned in other threads.

So whatever gets out there first shouldn't preempt the ability
to eventually fit into a wider structure without substantial
and incompatible modifications.

John Roth


Jul 22 '05 #36
"Daniel Dittmar" <da************ @sap.corp> wrote in message
news:db******** **@news.sap-ag.de...
Duncan Booth wrote:
I would have expected a path object to be a sequence of path elements
rather than a sequence of characters.
Maybe it's nitpicking, but I don't think that a path object should be a
'sequence of path elements' in an iterator context.

This means that

for element in pathobject:

has no intuitive meaning for me, so it shouldn't be allowed.


However, a path as a sequence of characters has even less
meaning - I can't think of a use, while I have an application
where traversing a path as a sequence of path elements makes
perfect sense: I need to descend the directory structure, directory
by directory, looking for specific files and types.
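
A minimal sketch of the traversal described above, written in modern Python 3 rather than the Python 2 used in this thread. The helper name `descend` is made up for illustration, and it assumes a relative POSIX-style path with '/' separators:

```python
import posixpath


def descend(pathname):
    """Yield each successively deeper directory prefix of a relative path."""
    current = ""
    # Treat the path as a sequence of elements, not a sequence of characters.
    for element in pathname.split("/"):
        current = posixpath.join(current, element)
        yield current


for prefix in descend("usr/local/lib"):
    print(prefix)
```

This only manipulates names and never touches the filesystem; a real search would inspect each prefix for the files of interest.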

John Roth
Daniel


Jul 22 '05 #37
Michael Hoffman wrote:
Here's some code I just wrote seconds ago to construct a path for a
scp upload:

"""
DST_DIRPATH = path("host:~/destination")
RSS_EXT = "rss"

dst_filenamebase = os.extsep.join([postcode.lower(), RSS_EXT])
dst_filepath = DST_DIRPATH.joinpath(dst_filenamebase)
"""

With the current path implementation, this Just Works.

It isn't at all obvious to me that it works:

>>> import os
>>> from path import path
>>> postcode = "AA2 9ZZ"
>>> DST_DIRPATH = path("host:~/destination")
>>> RSS_EXT = "rss"
>>>
>>> dst_filenamebase = os.extsep.join([postcode.lower(), RSS_EXT])
>>> dst_filepath = DST_DIRPATH.joinpath(dst_filenamebase)
>>> print dst_filepath
host:~/destination\aa2 9zz.rss

If I were using
something that parsed and understood paths, the scp/rcp convention of
host:filename would either cause an error or have to be programmed in
separately. The current implementation is much more flexible.

You still have to program your scp path separately from your filesystem
path in order to handle the different conventions for path separator
characters and maybe also escaping special characters in the path (I don't
use scp much so I don't know if this is required).

What are the practical advantages and conveniences of *not*
subclassing from basestring?


Simplification of the api: not having methods such as center, expandtabs
and zfill.

Not having the base class change from str to unicode depending on which
system you run your code?

Fewer undetected bugs (explicit is better than implicit)?

Perhaps none of these matter in practice. As I said elsewhere I haven't
used path for anything real, so I'm still finding surprises such as why
this doesn't do what I expect:
>>> p = path('a/b')
>>> q = path('c/d')
>>> p+q
path(u'a/bc/d')

If path didn't subclass string then either this would have been
implemented, and probably would Do The Right Thing, or it wouldn't be
implemented so I'd quickly realise I needed to do something else. Instead
it does something surprising.
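
For anyone who wants to see the surprise in isolation, here is a toy sketch in Python 3. The `path` class below is a hypothetical stand-in, not Jason Orendorff's module (whose `+` returns a path object rather than a plain string, as the session above shows), and it uses posixpath so the separator is always '/':

```python
import posixpath


class path(str):
    """Toy str-derived path class, for illustration only."""

    def joinpath(self, *parts):
        # Join with a separator, the way os.path.join does on POSIX.
        return path(posixpath.join(self, *parts))


p = path("a/b")
q = path("c/d")

concatenated = p + q    # inherited str.__add__: plain character concatenation
joined = p.joinpath(q)  # explicit join: inserts the separator

print(repr(concatenated))
print(repr(joined))
```

The inherited `+` glues the characters together with no separator, which is the behaviour complained about above, while the explicit join gives the result most people would expect.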
Jul 22 '05 #38
Duncan Booth wrote:
Personally I think the concept of a specific path type is a good one, but
subclassing string just cries out to me as the wrong thing to do.
I disagree. I've tried using a class which wasn't derived from
a basestring and kept running into places where it didn't work well.
For example, "open" and "mkdir" take strings as input. There is no
automatic coercion.
>>> class Spam:
...     def __getattr__(self, name):
...         print "Want", repr(name)
...         raise AttributeError, name
...
>>> open(Spam())
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: coercing to Unicode: need string or buffer, instance found
>>> import os
>>> os.mkdir(Spam())
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: coercing to Unicode: need string or buffer, instance found


The solutions to this are:
1) make the path object be derived from str or unicode. Doing
this does not conflict with any OO design practice (eg, Liskov
substitution).

2) develop a new "I represent a filename" protocol, probably done
via adapt().

I've considered the second of these but I think it's a more
complicated solution and it won't fit well with existing APIs
which do things like
if isinstance(input, basestring):
    input = open(input, "rU")
for line in input:
    print line

I showed several places in the stdlib and in 3rd party packages
where this is used.
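
A small sketch of why that idiom matters for the subclassing question, in Python 3 (where `str` takes the place of `basestring`). Both classes and the helper `looks_like_filename` are hypothetical names used only for illustration:

```python
class StrPath(str):
    """Hypothetical path type derived from str."""


class WrapperPath:
    """Hypothetical path type that merely holds a string."""

    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


def looks_like_filename(source):
    # Same dispatch as the idiom above: strings are filenames to open,
    # anything else is assumed to be an already-open file-like object.
    return isinstance(source, str)


print(looks_like_filename(StrPath("data.txt")))      # treated as a filename
print(looks_like_filename(WrapperPath("data.txt")))  # mistaken for a file object
```

Existing code written this way keeps working with the str-derived class, while the wrapper would need an explicit `str()` call at every such boundary.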

In other words, to me a path represents something in a filesystem,

Being picky - or something that could be in a filesystem.

the fact that it
has one, or indeed several string representations does not mean that the
path itself is simply a more specific type of string.

I didn't follow this.

You should need an explicit call to convert a path to a string and that
forces you when passing the path to something that requires a string to
think whether you wanted the string relative, absolute, UNC, uri etc.

You are broadening the definition of a file path to include URIs?
That's making life more complicated. Eg, the rules for joining
file paths may be different than the rules for joining URIs.
Consider if I have a file named "mail:da***@example.com" and I
join that with "file://home/dalke/badfiles/".

Additionally, the actions done on URIs are different than on file
paths. What should os.listdir("http://www.python.org/") do?

As I mentioned, I tried some classes which emulated file
paths. One was something like

class TempDir:
    """removes the directory when the refcount goes to 0"""
    def __init__(self):
        self.filename = ... use a function from the tempfile module
    def __del__(self):
        if os.path.exists(self.filename):
            shutil.rmtree(self.filename)
    def __str__(self):
        return self.filename

I could do

dirname = TempDir()

but then instead of

os.mkdir(dirname)
tmpfile = os.path.join(dirname, "blah.txt")

I needed to write it as

os.mkdir(str(dirname))
tmpfile = os.path.join(str(dirname), "blah.txt")

or have two variables, one which could delete the
directory and the other for the name. I didn't think
that was good design.
If I had derived from str/unicode then things would
have been cleaner.
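
A rough sketch of what a str-derived variant could look like, in Python 3. It is not the code from the post: it assumes explicit cleanup through a hypothetical `remove()` method instead of `__del__`, and it uses `tempfile.mkdtemp()` (which creates the directory itself) where the original leaves the tempfile call unspecified:

```python
import os
import shutil
import tempfile


class TempDir(str):
    """A temporary directory whose object is also its own pathname."""

    def __new__(cls):
        # mkdtemp() creates the directory and returns its name as a str.
        return super().__new__(cls, tempfile.mkdtemp())

    def remove(self):
        # Explicit cleanup; the original relied on __del__ instead.
        if os.path.exists(self):
            shutil.rmtree(self)


dirname = TempDir()
tmpfile = os.path.join(dirname, "blah.txt")  # no str() call needed
with open(tmpfile, "w") as handle:
    handle.write("hello\n")
created = os.path.exists(tmpfile)
dirname.remove()
```

Because the object is a string, it can be handed straight to `os.path.join` and `open` while still carrying the cleanup behaviour.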

Please note, btw, that some filesystems are unicode
based and others are not. As I recall, one nice thing
about the path module is that it chooses the appropriate
base class at import time. My "str()" example above
does not and would fail on a Unicode filesystem aware
Python build.
It may even be that we need a hierarchy of path
classes: URLs need similar but not identical manipulations
to file paths, so if we want to address the failings
of os.path perhaps we should also look at the failings
of urlparse at the same time.


I've found that hierarchies are rarely useful compared
to the number of times they are proposed and used. One
of the joys to me of Python is its deemphasis of class
hierarchies.

I think the same is true here. File paths and URIs are
sufficiently different that there are only a few bits
of commonality between them. Consider 'split' which
for files creates (dirname, filename) while for urls
it creates (scheme, netloc, path, query, fragment)
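
To make the difference concrete, here is a short Python 3 sketch using the standard library (`posixpath.split` and `urllib.parse.urlsplit`; in the Python 2 of this thread the latter lived in the `urlparse` module). The example path and URL are made up:

```python
import posixpath
from urllib.parse import urlsplit

# A file path splits into two parts: (dirname, filename).
file_parts = posixpath.split("/home/dalke/badfiles/report.txt")

# A URL splits into five: (scheme, netloc, path, query, fragment).
url_parts = tuple(urlsplit("http://www.python.org/doc/index.html?lang=en#top"))

print(file_parts)
print(url_parts)
```

The two results do not even have the same shape, which is the point being made about how little the two kinds of "path" share.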

Andrew
da***@dalkescientific.com

Jul 22 '05 #39
George Sakkis wrote:
You're right, conceptually a path
HAS_A string description, not IS_A string, so from a pure OO point of
view, it should not inherit string.


How did you decide it's "has-a" vs. "is-a"?

All C calls use a "char *" for filenames and paths,
meaning the C model file for the filesystem says
paths are strings.

Paths as strings fit the Liskov substitution principle
in that any path object can be used any time a
string is used (eg, "loading from " + filename)

Good information hiding suggests that a better API
is one that requires less knowledge. I haven't
seen an example of how deriving from (unicode)
string makes things more complicated than not doing so.

Andrew
da***@dalkescientific.com

Jul 22 '05 #40
