
Parsing a path to components

Hello,

os.path.split returns the head and tail of a path, but what if I want
to have all the components ? I could not find a portable way to do
this in the standard library, so I've concocted the following
function. It uses os.path.split to be portable, at the expense of
efficiency.

----------------------------------
import os

def parse_path(path):
    """ Parses a path to its components.

        Example:
            parse_path("C:\\Python25\\lib\\site-packages\\zipextimporter.py")

        Returns:
            ['C:\\', 'Python25', 'lib', 'site-packages', 'zipextimporter.py']

        This function uses os.path.split in an attempt to be portable.
        It costs in performance.
    """
    lst = []

    while 1:
        head, tail = os.path.split(path)

        if tail == '':
            # Nothing left to split off: keep the root (if any) and stop.
            if head != '': lst.insert(0, head)
            break
        else:
            lst.insert(0, tail)
            path = head

    return lst
----------------------------------

Did I miss something and there is a way to do this standardly ?
Is this function valid, or will there be cases that will confuse it ?

Thanks in advance
Eli
Jun 27 '08 #1
8 Replies


On Jun 7, 12:55 am, eliben <eli...@gmail.com> wrote:
> Did I miss something and there is a way to do this standardly ?
> Is this function valid, or will there be cases that will confuse it ?

You can just split the path on `os.sep', which contains the path
separator of the platform on which Python is running:

components = pathString.split(os.sep)

Sebastian

Jun 27 '08 #2

On Fri, 06 Jun 2008 23:57:03 -0700, s0suk3 wrote:
> You can just split the path on `os.sep', which contains the path
> separator of the platform on which Python is running:
>
> components = pathString.split(os.sep)

Won't work on platforms with more than one path separator, or if a
separator is repeated. For example r'\foo\\bar/baz//spam.py' or:

In [140]: os.path.split('foo//bar')
Out[140]: ('foo', 'bar')

In [141]: 'foo//bar'.split(os.sep)
Out[141]: ['foo', '', 'bar']

Ciao,
Marc 'BlackJack' Rintsch
Jun 27 '08 #3
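
The two objections above can be reproduced on any platform by calling the platform-specific modules directly instead of os.path. This is only a small sketch: `posixpath` and `ntpath` are the standard-library modules behind os.path on POSIX and Windows, and the variable names are made up for illustration.

```python
import ntpath      # Windows path rules, importable on any platform
import posixpath   # POSIX path rules, importable on any platform

path = 'foo//bar'

# A plain string split keeps the empty component between the two slashes.
naive = path.split(posixpath.sep)

# posixpath.split strips the redundant separator from the head.
head, tail = posixpath.split(path)

# On Windows both '\\' and '/' separate components, so splitting on
# ntpath.sep alone leaves the forward-slash parts glued together.
mixed = r'\foo\\bar/baz//spam.py'
naive_windows = mixed.split(ntpath.sep)

print(naive)
print((head, tail))
print(naive_windows)
```

The last list still contains 'bar/baz//spam.py' as a single element, which is the "more than one path separator" problem in practice.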

On Jun 7, 10:15 am, Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
> Won't work on platforms with more than one path separator, or if a
> separator is repeated. For example r'\foo\\bar/baz//spam.py' or:
>
> In [140]: os.path.split('foo//bar')
> Out[140]: ('foo', 'bar')
>
> In [141]: 'foo//bar'.split(os.sep)
> Out[141]: ['foo', '', 'bar']

Can you recommend a generic way to achieve this ?
Eli
Jun 27 '08 #4

On Jun 7, 3:15 am, Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
> Won't work on platforms with more than one path separator, or if a
> separator is repeated. For example r'\foo\\bar/baz//spam.py' or:
>
> In [140]: os.path.split('foo//bar')
> Out[140]: ('foo', 'bar')
>
> In [141]: 'foo//bar'.split(os.sep)
> Out[141]: ['foo', '', 'bar']

But those are invalid paths, aren't they? If you have a jumble of a
path, I think the solution is to call os.path.normpath() before
splitting.

Sebastian
Jun 27 '08 #5

On Sat, 07 Jun 2008 02:15:07 -0700, s0suk3 wrote:
> But those are invalid paths, aren't they?

No. See `os.altsep` on Windows. And repeating separators is allowed too.

Ciao,
Marc 'BlackJack' Rintsch
Jun 27 '08 #6
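
For reference, the separators in question can be inspected directly, and one possible way to cope with both of them is to split on either character and drop the empty pieces. The helper below is hypothetical (its name and its `pathmod` parameter are not from the thread), and it deliberately throws away the information about whether the path was absolute.

```python
import ntpath
import posixpath
import re

print(repr(ntpath.sep), repr(ntpath.altsep))        # Windows: two separators
print(repr(posixpath.sep), repr(posixpath.altsep))  # POSIX: altsep is None

def split_on_any_sep(path, pathmod):
    """Hypothetical helper: split on sep or altsep, dropping empty parts."""
    seps = pathmod.sep + (pathmod.altsep or '')
    # Character class matching one or more consecutive separators.
    pattern = '[' + re.escape(seps) + ']+'
    return [part for part in re.split(pattern, path) if part]

parts = split_on_any_sep(r'\foo\\bar/baz//spam.py', ntpath)
print(parts)
```

Because the leading separator is discarded, '\foo' and 'foo' give the same result here, so this is not a drop-in replacement for the os.path.split loop.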


"eliben" <el****@gmail.com> wrote in message
news:e5**********************************@a70g2000hsh.googlegroups.com...
> Can you recommend a generic way to achieve this ?

>>> import os
>>> from os.path import normpath, abspath
>>> x = r'\foo\\bar/baz//spam.py'
>>> normpath(x)
'\\foo\\bar\\baz\\spam.py'
>>> normpath(abspath(x))
'C:\\foo\\bar\\baz\\spam.py'
>>> normpath(abspath(x)).split(os.sep)
['C:', 'foo', 'bar', 'baz', 'spam.py']

-Mark

Jun 27 '08 #7

"Mark Tolonen" <M8********@mailinator.com> wrote:
> >>> import os
> >>> from os.path import normpath, abspath
> >>> x = r'\foo\\bar/baz//spam.py'
> >>> normpath(x)
> '\\foo\\bar\\baz\\spam.py'
> >>> normpath(abspath(x))
> 'C:\\foo\\bar\\baz\\spam.py'
> >>> normpath(abspath(x)).split(os.sep)
> ['C:', 'foo', 'bar', 'baz', 'spam.py']

That gets a bit messy with UNC pathnames. With the OP's code the double
backslash lead-in is preserved (although arguably it has split one time too
many; '\\\\frodo' would make more sense as the first element):
>>> parse_path(r'\\frodo\foo\bar')
['\\\\', 'frodo', 'foo', 'bar']

With your code you just get two empty strings as the lead-in:
>>> normpath(abspath(r'\\frodo\foo\bar')).split(os.sep)
['', '', 'frodo', 'foo', 'bar']

--
Duncan Booth http://kupuguy.blogspot.com
Jun 27 '08 #8
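
One way to avoid the empty lead-in strings, sketched here under the assumption of a current Python 3, is to peel the drive off with `ntpath.splitdrive` before splitting the rest. In Python 3, splitdrive treats the whole `\\host\share` prefix of a UNC path as the drive, so the first element comes out as the share rather than as '\\\\frodo'. The function name `split_windows_path` is made up for this example.

```python
import ntpath  # Windows path rules, importable on any platform

def split_windows_path(path):
    """Hypothetical helper: keep the drive or UNC share as one leading part."""
    drive, rest = ntpath.splitdrive(ntpath.normpath(path))
    parts = [p for p in rest.split(ntpath.sep) if p]
    if rest.startswith(ntpath.sep):
        drive += ntpath.sep   # absolute path: keep the root with the drive
    return ([drive] if drive else []) + parts

print(split_windows_path(r'\\frodo\foo\bar'))
print(split_windows_path(r'C:\foo\\bar/baz//spam.py'))
print(split_windows_path(r'foo\bar'))
```

Whether the share belongs in the first component is a design choice; this sketch simply follows what splitdrive reports.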

eliben wrote:
.... a pretty good try ...
> def parse_path(path):
>     """..."""
By the way, the comment is fine. I am going for brevity here.
>     lst = []
>     while 1:
>         head, tail = os.path.split(path)
>         if tail == '':
>             if head != '': lst.insert(0, head)
>             break
>         else:
>             lst.insert(0, tail)
>             path = head
>     return lst
> ----------------------------------
>
> Did I miss something and there is a way to do this standardly ?
Nope, the requirement is rare.
> Is this function valid, or will there be cases that will confuse it ?
parse_path('/a/b/c//d/')

Try something like:
def parse_path(path):
    '''...same comment...'''
    head, tail = os.path.split(path)
    result = []
    if not tail:
        if head == path:
            return [head]
        # Perhaps result = [''] here to indicate ends-in-sep
        head, tail = os.path.split(head)
    while head and tail:
        result.append(tail)
        head, tail = os.path.split(head)
    result.append(head or tail)
    result.reverse()
    return result
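
To see what the '/a/b/c//d/' case does, the two versions can be run side by side. The sketch below copies both functions but calls `posixpath.split` instead of os.path.split, so the result is the same on every platform; the names `parse_path_original` and `parse_path_revised` are only labels for this comparison.

```python
import posixpath  # POSIX rules, so the result is the same on every platform

def parse_path_original(path):
    """The loop from the first post, using posixpath.split."""
    lst = []
    while True:
        head, tail = posixpath.split(path)
        if tail == '':
            if head != '':
                lst.insert(0, head)
            break
        lst.insert(0, tail)
        path = head
    return lst

def parse_path_revised(path):
    """The revised version from this reply, using posixpath.split."""
    head, tail = posixpath.split(path)
    result = []
    if not tail:
        if head == path:
            return [head]
        # Trailing separator: split once more to reach the last component.
        head, tail = posixpath.split(head)
    while head and tail:
        result.append(tail)
        head, tail = posixpath.split(head)
    result.append(head or tail)
    result.reverse()
    return result

print(parse_path_original('/a/b/c//d/'))  # → ['/a/b/c//d']
print(parse_path_revised('/a/b/c//d/'))   # → ['/', 'a', 'b', 'c', 'd']
```

The trailing separator makes the first split return an empty tail, so the original loop stops immediately and returns the whole remaining head as one component.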

--Scott David Daniels
Sc***********@Acm.Org
Jun 27 '08 #9
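
For readers on newer interpreters: the thread predates it, but Python 3.4 and later include the `pathlib` module, whose pure path classes expose a `parts` tuple that covers the cases raised above. A short sketch, using the example paths from this thread:

```python
from pathlib import PurePosixPath, PureWindowsPath

# The example from the original docstring.
win = PureWindowsPath(r'C:\Python25\lib\site-packages\zipextimporter.py')
print(win.parts)

# Repeated and trailing separators are collapsed.
posix = PurePosixPath('/a/b/c//d/')
print(posix.parts)

# For a UNC path the host and share stay together in the first part.
unc = PureWindowsPath(r'\\frodo\foo\bar')
print(unc.parts)
```

The pure classes do no filesystem access, so both flavours can be used on any platform.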
