pre-PEP: Object-oriented file module

Kenneth McDonald

I'd like to propose a new PEP [no, that isn't a redundant 'process'
in there :-)--pre-PEP is a different process than PEP], for a
standard library module that deals with files and file paths in an
object oriented manner. I believe this module should be included as
part of the standard Python distribution.

Background
==========
Some time ago, I wrote such a module for myself, and have found it
extremely useful. Recently, I found a reference to a similar module,
http://www.jorendorff.com/articles/python/path/ by Jeff Orendorff.
There are of course differences--I think mine is more comprehensive
but probably less stable--but the similarities in thought are
striking. Both work by creating a class representing file paths, and
then using that class to unify methods from shutil, os.path, and some
builtin functions such as 'open' (and maybe some other stuff I can't
remember).

I haven't looked at Jeff's code yet, but for my own, a major enabler
of the enhanced functionality has been the inclusion of generators in
Python. This allows, for example, a method which yields all of the
lines in a file and automatically closes that file afterwards. The
availability of attributes also makes certain things cleaner than was
the case in previous versions of Python.
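
The generator idea can be sketched roughly as follows. This is a minimal illustration in present-day Python 3, not the code from either module; the helper name `iterlines` and its `encoding` parameter are assumptions made for the example.

```python
def iterlines(path, encoding="utf-8"):
    """Yield each line of the file at `path` without its trailing newline.

    Hypothetical helper for illustration only.
    """
    # The `with` block closes the file once the last line has been
    # yielded, or when the generator is closed or discarded early.
    with open(path, encoding=encoding) as handle:
        for line in handle:
            yield line.rstrip("\n")
```

Because generators are lazy, a missing file is only reported when iteration starts (for example in the `for` statement), not at the moment `iterlines(path)` is called.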

Fit With Python Philosophy
==========================
One of the strengths of Python is that it is a highly object-oriented
language, but this is not true when it comes to handling files. As
far as Python is concerned, a file path is just a string. There are
a bunch of things you can do with it, but they all have to be done
with function calls (not methods), since there is no concept of a
file path object. Even worse, these functions are spread out across
various modules, and often have cryptic names that hardly make it
obvious what they do.
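
For illustration, here is roughly what a handful of everyday path operations look like with the existing standard library functions. The file names are made up for the example, and everything happens in a temporary directory.

```python
import os
import os.path
import shutil
import tempfile

# Work in a throwaway directory so the example leaves nothing behind.
workdir = tempfile.mkdtemp()

# Joining and splitting paths lives in os.path ...
source = os.path.join(workdir, "some.txt")

# ... opening a file is a builtin ...
with open(source, "w") as handle:
    handle.write("hello\n")

name = os.path.basename(source)    # last component of the path
parent = os.path.dirname(source)   # everything before the last component
exists = os.path.exists(source)

# ... copying is in shutil ...
backup = os.path.join(workdir, "some.bak")
shutil.copy(source, backup)

# ... and listing and deleting are in os itself.
entries = sorted(os.listdir(workdir))
os.remove(backup)
shutil.rmtree(workdir)
```

Every call takes the path as a plain string argument, and the functions come from three different places: `os`, `os.path` and `shutil`, plus the builtin `open`.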

Given that two different people concluded that such a module was
desirable, and independently implemented modules that are actually
very similar, I suspect there is an 'object-oriented mindset' to
which this way of addressing files and file paths is natural. And
that should be part of Python.

Pragmatic Justification
=======================
I've been using my module for about a year and a half now. The ease
of use and uniformity make a huge (I'm tempted to say 'vast')
difference in dealing with files. I believe other users would
experience an increase in efficiency when dealing with files ranging
from 'significant' to 'very large' (in precise technical terms :-) ).
Also, I think this type of API would be much easier for new users to
learn and use.

Examples
========
A few examples are in order. Again, these are from my own library,
since I'm not too familiar with Jeff's. Also, this is stuff I'm just
typing in right now as an illustration--there may be syntactic
errors. (However, all of this functionality is present.) And these by
no means represent the full functionality that is already defined.

# define a new path object
mydir = filepath("#&*$directory")

# Note that special characters are automatically escaped
# by filepath, as necessary for the current OS. If a character
# is illegal in a file name no matter what (cannot be escaped),
# an exception will be raised.

# A file in that directory
f = mydir / "some.txt"

# Go through the lines in the file. When all lines are done,
# the file will be closed automatically. If the file does not
# actually exist, an appropriate exception will be raised.
for line in f.iterlines():
    ...  # do something with each line

# The directory containing f is, of course, 'mydir'
assert f.parent == mydir

# Another path
aPath = filepath(....)

# In my module (not in Jeff's), a file path is considered
# semantically as a sequence of directory names terminated
# by the name of a file or directory. This makes it easy to
# obtain the name of the file at the end of a path:
theFile = aPath[-1]

# or the directory leading to that file
parentDir = aPath[0:-1]

# Of course, these two common indexes/slices are accessible through
# attributes
theFile = aPath.basename
parentDir = aPath.parent

# A more powerful 'walk'-type method is included. Below,
# the 'recursive' indicates that directories should be recursively
# walked, and the 'preorder' indicates that directories should
# be included in the iteration _before_ their contents are given.
# There is also a 'postorder' argument, and both may be used to
# yield directories both before and after their contents.
aPath.iterfiles(recursive=True, preorder=True)

# With the advent of the 'itertools' module in Python, there is no
# need to provide an argument taking a function that is applied
# during the walk process, so in that sense, iterfiles is actually
# simpler than walk.

# ...and more. All of the various file capabilities available in
# Python are provided in a unified package in this module.
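
To make the examples above concrete, here is a minimal sketch of how such a class could be put together in present-day Python 3. It is not the code from either module. The class name `FilePath`, the use of `pathlib.PurePath` to split a path into components, and the exact meaning given to `recursive`, `preorder` and `postorder` are all assumptions made for the illustration, and the automatic escaping of special characters is not attempted.

```python
import os
from pathlib import PurePath


class FilePath:
    """Hypothetical stand-in for the `filepath` class described above."""

    def __init__(self, path):
        self._path = os.path.normpath(str(path))

    def __str__(self):
        return self._path

    def __repr__(self):
        return "FilePath(%r)" % self._path

    def __eq__(self, other):
        return isinstance(other, FilePath) and self._path == other._path

    def __hash__(self):
        return hash(self._path)

    def __truediv__(self, name):
        # Supports `mydir / "some.txt"` (Python 3 spells `/` as __truediv__).
        return FilePath(os.path.join(self._path, str(name)))

    def __getitem__(self, index):
        # Treat the path as a sequence of components: an integer index
        # returns one component name, a slice returns a new FilePath.
        parts = PurePath(self._path).parts
        if isinstance(index, slice):
            selected = parts[index]
            if not selected:
                return FilePath(os.curdir)
            return FilePath(os.path.join(*selected))
        return parts[index]

    @property
    def basename(self):
        return os.path.basename(self._path)

    @property
    def parent(self):
        return FilePath(os.path.dirname(self._path))

    def iterfiles(self, recursive=False, preorder=False, postorder=False):
        """Yield a FilePath for each entry below this directory.

        Assumed semantics: files are always yielded; a directory is
        yielded before its contents if `preorder` is true and after them
        if `postorder` is true; its contents are visited only if
        `recursive` is true.
        """
        for name in sorted(os.listdir(self._path)):
            child = self / name
            is_real_dir = os.path.isdir(str(child)) and not os.path.islink(str(child))
            if not is_real_dir:
                # Plain files, and symlinks (not followed, to avoid cycles).
                yield child
                continue
            if preorder:
                yield child
            if recursive:
                yield from child.iterfiles(recursive, preorder, postorder)
            if postorder:
                yield child


# Pure path manipulation, no file system access needed:
mydir = FilePath("projects")
f = mydir / "notes" / "some.txt"
assert f.parent == mydir / "notes"
assert f[-1] == f.basename == "some.txt"
assert f[0:-1] == f.parent
```

Since `iterfiles` is a generator, filtering needs no callback argument; a generator expression such as `(p for p in root.iterfiles(recursive=True) if p.basename.endswith(".txt"))` or the tools in `itertools` can be applied to its result directly.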

Aug 25 '05 #1

Martin v. Löwis

Kenneth McDonald wrote:

> I'd like to propose a new PEP [no, that isn't a redundant 'process' in
> there :-)--pre-PEP is a different process than PEP], for a standard
> library module that deals with files and file paths in an object
> oriented manner. I believe this module should be included as part of
> the standard Python distribution.

See the discussions at

http://python.org/sf/1226256
http://mail.python.org/pipermail/pyt...ne/054439.html
http://mail.python.org/pipermail/pyt...ly/054535.html

I'd personally be curious as to how you would deal with
Unicode file names. How to access file attributes is also
an interesting question (e.g. how to find out whether it is
a symlink, whether it is a hidden file, what the POSIX ACL
is, and what the 8.3 short name is).

Regards,
Martin

Aug 25 '05 #2

Martin v. Löwis

Kenneth McDonald wrote:

> Why would any of the issues below be any more difficult than they are with
> the current file functions? I'm not proposing a C replacement for current
> functions, merely a Python module that wraps all of those functions (and
> adds some additional ones) in an appropriate class.

I'm not saying they are difficult. I want to know how your library deals
with them. There is a good chance that some of these questions remain
unanswered in the PEP, and I just want to indicate that I would be
unhappy if they are. Specifying this API is a huge task, much more so
than coming up with an implementation that does "something".

This is one of the reasons why nothing like this has made it to the
standard library: as a library module, it would have to face many
more scenarios that the authors of the module originally did not
consider. Therefore, the documentation must be complete and consistent,
and there should be an agreement as to what this library can do and
what it cannot do.

Regards,
Martin

Aug 25 '05 #3
