Parsing a path to components

Hello,

os.path.split returns the head and tail of a path, but what if I want
to have all the components ? I could not find a portable way to do
this in the standard library, so I've concocted the following
function. It uses os.path.split to be portable, at the expense of
efficiency.

----------------------------------
import os

def parse_path(path):
    """ Parses a path to its components.

        Example:
            parse_path("C:\\Python25\\lib\\site-packages\\zipextimporter.py")

        Returns:
            ['C:\\', 'Python25', 'lib', 'site-packages',
             'zipextimporter.py']

        This function uses os.path.split in an attempt to be portable.
        It costs in performance.
    """
    lst = []

    while 1:
        head, tail = os.path.split(path)

        if tail == '':
            if head != '': lst.insert(0, head)
            break
        else:
            lst.insert(0, tail)
            path = head

    return lst
----------------------------------

Did I miss something, and is there a standard way to do this?
Is this function valid, or are there cases that will confuse it?

Thanks in advance
Eli
Jun 27 '08 #1
You can just split the path on `os.sep', which contains the path
separator of the platform on which Python is running:

components = pathString.split(os.sep)

Sebastian

Jun 27 '08 #2
That won't work on platforms with more than one path separator, or when a
separator is repeated. For example r'\foo\\bar/baz//spam.py' or:

In [140]: os.path.split('foo//bar')
Out[140]: ('foo', 'bar')

In [141]: 'foo//bar'.split(os.sep)
Out[141]: ['foo', '', 'bar']

Ciao,
Marc 'BlackJack' Rintsch
Jun 27 '08 #3
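
The difference described in this reply can be reproduced on any platform by calling the POSIX and Windows flavours of os.path directly. This is a minimal sketch, assuming Python 3; `posixpath` and `ntpath` are the standard-library modules behind `os.path`, while `naive_split` is a hypothetical helper name used only for illustration.

```python
import ntpath
import posixpath

def naive_split(path, sep):
    """Split on one separator string, like pathString.split(os.sep)."""
    return path.split(sep)

# A repeated separator leaves an empty component in the result.
posix_parts = naive_split('foo//bar', posixpath.sep)

# os.path.split treats the run of separators as a single boundary.
posix_pair = posixpath.split('foo//bar')

# Windows accepts both '\\' and '/', but only '\\' is os.sep there.
windows_parts = naive_split(r'foo\bar/baz', ntpath.sep)
windows_pair = ntpath.split(r'foo\bar/baz')

print(posix_parts)    # ['foo', '', 'bar']
print(posix_pair)     # ('foo', 'bar')
print(windows_parts)  # ['foo', 'bar/baz']
print(windows_pair)   # ('foo\\bar', 'baz')
```

The plain string split misses the forward slash on Windows and keeps the empty string on POSIX, which is the behaviour the reply warns about.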
Can you recommend a generic way to achieve this ?
Eli
Jun 27 '08 #4
But those are invalid paths, aren't they? If you have a jumble of a
path, I think the solution is to call os.path.normpath() before
splitting.

Sebastian
Jun 27 '08 #5
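
The normalize-then-split idea can be tried out without a Windows machine. The sketch below assumes Python 3 and uses `ntpath` and `posixpath` so the result does not depend on the host platform; `normalized_split` is a hypothetical name, not a library function.

```python
import ntpath
import posixpath

def normalized_split(path, pathmod):
    """Normalize with pathmod.normpath, then split on pathmod.sep."""
    return pathmod.normpath(path).split(pathmod.sep)

# Mixed and repeated separators are collapsed by normpath first.
windows_parts = normalized_split(r'\foo\\bar/baz//spam.py', ntpath)
posix_parts = normalized_split('foo//bar/./baz', posixpath)

# A rooted path still yields a leading empty string after the split.
posix_absolute = normalized_split('/foo//bar', posixpath)

print(windows_parts)   # ['', 'foo', 'bar', 'baz', 'spam.py']
print(posix_parts)     # ['foo', 'bar', 'baz']
print(posix_absolute)  # ['', 'foo', 'bar']
```

Two caveats apply: the root shows up as an empty string rather than as a component of its own, and normpath also collapses `..` segments, which can change the meaning of a path that goes through symbolic links.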
On Sat, 07 Jun 2008 02:15:07 -0700, s0suk3 wrote:
> But those are invalid paths, aren't they?
No. See `os.altsep` on Windows. Repeated separators are allowed too.

Ciao,
Marc 'BlackJack' Rintsch
Jun 27 '08 #6
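
To make the `os.altsep` point concrete, here is a small sketch that splits on every separator a platform accepts. It assumes Python 3; `split_on_any_separator` is a hypothetical helper, and dropping empty components is a design choice of this sketch, not something the thread prescribes.

```python
import ntpath
import posixpath
import re

def split_on_any_separator(path, pathmod):
    """Split on sep and altsep (if any), dropping empty components."""
    separators = pathmod.sep + (pathmod.altsep or '')
    pattern = '[' + re.escape(separators) + ']+'
    return [part for part in re.split(pattern, path) if part]

# ntpath.altsep is '/', posixpath.altsep is None.
windows_parts = split_on_any_separator(r'\foo\\bar/baz//spam.py', ntpath)
posix_parts = split_on_any_separator('foo//bar', posixpath)

print(windows_parts)  # ['foo', 'bar', 'baz', 'spam.py']
print(posix_parts)    # ['foo', 'bar']
```

Because empty components are discarded, this version cannot tell a rooted path from a relative one, so it only suits cases where that distinction does not matter.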

"eliben" <el****@gmail.com> wrote in message
news:e5**********************************@a70g2000hsh.googlegroups.com...
> Can you recommend a generic way to achieve this ?
> Eli

>>> import os
>>> from os.path import normpath, abspath
>>> x = r'\foo\\bar/baz//spam.py'
>>> normpath(x)
'\\foo\\bar\\baz\\spam.py'
>>> normpath(abspath(x))
'C:\\foo\\bar\\baz\\spam.py'
>>> normpath(abspath(x)).split(os.sep)
['C:', 'foo', 'bar', 'baz', 'spam.py']

-Mark

Jun 27 '08 #7
That gets a bit messy with UNC pathnames. With the OP's code the double
backslash lead-in is preserved (although arguably it has split one time too
many; '\\\\frodo' would make more sense as the first element):
>>> parse_path(r'\\frodo\foo\bar')
['\\\\', 'frodo', 'foo', 'bar']

With your code you just get two empty strings as the lead-in:
>>> normpath(abspath(r'\\frodo\foo\bar')).split(os.sep)
['', '', 'frodo', 'foo', 'bar']

--
Duncan Booth http://kupuguy.blogspot.com
Jun 27 '08 #8
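
For comparison, the standard library has its own view of where a UNC path's first element ends. The sketch below assumes Python 3 (pathlib was added in 3.4, well after this thread) and uses `ntpath` and `PureWindowsPath` so it runs on any platform; the variable names are illustrative only.

```python
import ntpath
from pathlib import PureWindowsPath

unc = r'\\frodo\foo\bar'

# splitdrive keeps the server and the share together as the "drive".
drive, rest = ntpath.splitdrive(unc)

# PureWindowsPath.parts keeps the whole UNC anchor as its first element.
parts = PureWindowsPath(unc).parts

print(drive)  # \\frodo\foo
print(rest)   # \bar
print(parts)  # ('\\\\frodo\\foo\\', 'bar')
```

Here the first element is the server plus the share, not the server alone, so it differs both from the output of the OP's function and from the '\\\\frodo' suggested above.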
eliben wrote:
.... a pretty good try ...
> def parse_path(path):
>     """..."""
By the way, the comment is fine. I am going for brevity here.
>     lst = []
>     while 1:
>         head, tail = os.path.split(path)
>         if tail == '':
>             if head != '': lst.insert(0, head)
>             break
>         else:
>             lst.insert(0, tail)
>             path = head
>     return lst
> ----------------------------------
>
> Did I miss something, and is there a standard way to do this?
Nope, the requirement is rare.
> Is this function valid, or are there cases that will confuse it?
parse_path('/a/b/c//d/')

Try something like:
def parse_path(path):
    '''...same comment...'''
    head, tail = os.path.split(path)
    result = []
    if not tail:
        if head == path:
            return [head]
        # Perhaps result = [''] here to indicate ends-in-sep
        head, tail = os.path.split(head)
    while head and tail:
        result.append(tail)
        head, tail = os.path.split(head)
    result.append(head or tail)
    result.reverse()
    return result

--Scott David Daniels
Sc***********@Acm.Org
Jun 27 '08 #9
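
The trailing-separator case raised above can be checked side by side. This is a sketch under stated assumptions: Python 3, both functions rewritten to take the path module as a parameter (`parse_path_original` and `parse_path_revised` are hypothetical names), and `posixpath` used so the result is the same on every platform. `PurePosixPath.parts` comes from pathlib, which joined the standard library in Python 3.4, after this thread was written.

```python
import posixpath
from pathlib import PurePosixPath

def parse_path_original(path, pathmod=posixpath):
    """The loop from the first post, with the path module as a parameter."""
    lst = []
    while True:
        head, tail = pathmod.split(path)
        if tail == '':
            if head != '':
                lst.insert(0, head)
            break
        lst.insert(0, tail)
        path = head
    return lst

def parse_path_revised(path, pathmod=posixpath):
    """The variant suggested above, with the path module as a parameter."""
    head, tail = pathmod.split(path)
    result = []
    if not tail:
        if head == path:
            return [head]
        # Trailing separator: step past it before collecting components.
        head, tail = pathmod.split(head)
    while head and tail:
        result.append(tail)
        head, tail = pathmod.split(head)
    result.append(head or tail)
    result.reverse()
    return result

trailing = '/a/b/c//d/'
original_result = parse_path_original(trailing)
revised_result = parse_path_revised(trailing)
pathlib_result = list(PurePosixPath(trailing).parts)

print(original_result)  # ['/a/b/c//d']
print(revised_result)   # ['/', 'a', 'b', 'c', 'd']
print(pathlib_result)   # ['/', 'a', 'b', 'c', 'd']
```

The original loop stops at the first empty tail, so a trailing separator makes it return the whole remaining path as one element; the revised version and pathlib agree on the component list for this input.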
