Parsing a path to components

Hello,

os.path.split returns the head and tail of a path, but what if I want
to have all the components ? I could not find a portable way to do
this in the standard library, so I've concocted the following
function. It uses os.path.split to be portable, at the expense of
efficiency.

----------------------------------
import os

def parse_path(path):
    """ Parses a path to its components.

        Example:
            parse_path("C:\\Python25\\lib\\site-packages\\zipextimporter.py")

        Returns:
            ['C:\\', 'Python25', 'lib', 'site-packages',
             'zipextimporter.py']

        This function uses os.path.split in an attempt to be portable.
        It costs in performance.
    """
    lst = []

    while 1:
        head, tail = os.path.split(path)

        if tail == '':
            # Nothing left to peel off; keep a non-empty head (e.g. the root).
            if head != '': lst.insert(0, head)
            break
        else:
            lst.insert(0, tail)
            path = head

    return lst
----------------------------------

Did I miss something and there is a way to do this standardly ?
Is this function valid, or will there be cases that will confuse it ?

Thanks in advance
Eli
Jun 27 '08 #1
On Jun 7, 12:55 am, eliben <eli...@gmail.com> wrote:
> Did I miss something and there is a way to do this standardly ?
> Is this function valid, or will there be cases that will confuse it ?

You can just split the path on `os.sep`, which contains the path
separator of the platform on which Python is running:

components = pathString.split(os.sep)

Sebastian

Jun 27 '08 #2
On Fri, 06 Jun 2008 23:57:03 -0700, s0suk3 wrote:
> You can just split the path on `os.sep', which contains the path
> separator of the platform on which Python is running:
>
> components = pathString.split(os.sep)

That won't work on platforms with more than one path separator, or when a
separator is repeated. For example r'\foo\\bar/baz//spam.py' or:

In [140]: os.path.split('foo//bar')
Out[140]: ('foo', 'bar')

In [141]: 'foo//bar'.split(os.sep)
Out[141]: ['foo', '', 'bar']

Ciao,
Marc 'BlackJack' Rintsch
Jun 27 '08 #3
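
A small sketch of the point above that runs on any platform. It assumes nothing beyond the standard `ntpath` module (the Windows implementation of `os.path`), used here directly so the result does not depend on the local `os.sep`. The sample path is the one from the post.

```python
import ntpath

# Sample path from the post: mixed separators and repeated separators.
path = r"\foo\\bar/baz//spam.py"

# Splitting on the Windows separator alone ignores '/' entirely and
# produces an empty string where the backslash is doubled.
naive = path.split(ntpath.sep)
print(naive)  # ['', 'foo', '', 'bar/baz//spam.py']
```

The leading empty string comes from the leading backslash, so a plain `str.split` also loses the information that the path is rooted.
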
On Jun 7, 10:15 am, Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
> Won't work for platforms with more than one path separator and if a
> separator is repeated.

Can you recommend a generic way to achieve this ?
Eli
Jun 27 '08 #4
On Jun 7, 3:15 am, Marc 'BlackJack' Rintsch <bj_...@gmx.net> wrote:
> Won't work for platforms with more than one path separator and if a
> separator is repeated. For example r'\foo\\bar/baz//spam.py' or:
> [...]

But those are invalid paths, aren't they? If you have a jumble of a
path, I think the solution is to call os.path.normpath() before
splitting.

Sebastian
Jun 27 '08 #5
On Sat, 07 Jun 2008 02:15:07 -0700, s0suk3 wrote:
> But those are invalid paths, aren't they?

No. See `os.altsep` on Windows. And repeating separators is allowed too.

Ciao,
Marc 'BlackJack' Rintsch
Jun 27 '08 #6
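
To make the `os.altsep` remark concrete, here is a minimal sketch that compares the two standard path flavours. It imports `ntpath` and `posixpath` explicitly (these are the modules behind `os.path` on Windows and POSIX respectively), so it behaves the same on any platform. The variable names are only illustrative.

```python
import ntpath
import posixpath

# Windows has a primary and an alternative separator; POSIX has only one.
win_seps = (ntpath.sep, ntpath.altsep)
posix_seps = (posixpath.sep, posixpath.altsep)
print(win_seps)    # ('\\', '/')
print(posix_seps)  # ('/', None)

# ntpath.split treats both separators alike and tolerates repeats.
repeated_fwd = ntpath.split("foo//bar")
repeated_back = ntpath.split(r"foo\\bar")
mixed = ntpath.split(r"foo\bar/baz")
print(repeated_fwd)   # ('foo', 'bar')
print(repeated_back)  # ('foo', 'bar')
print(mixed)          # ('foo\\bar', 'baz')
```
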

"eliben" <el****@gmail.comwrote in message
news:e5**********************************@a70g2000 hsh.googlegroups.com...
> Can you recommend a generic way to achieve this ?

>>> import os
>>> from os.path import normpath, abspath
>>> x = r'\foo\\bar/baz//spam.py'
>>> normpath(x)
'\\foo\\bar\\baz\\spam.py'
>>> normpath(abspath(x))
'C:\\foo\\bar\\baz\\spam.py'
>>> normpath(abspath(x)).split(os.sep)
['C:', 'foo', 'bar', 'baz', 'spam.py']

-Mark

Jun 27 '08 #7
"Mark Tolonen" <M8********@mailinator.comwrote:
>Can you recommend a generic way to achieve this ?
> [...]
> >>> normpath(abspath(x)).split(os.sep)
> ['C:', 'foo', 'bar', 'baz', 'spam.py']

That gets a bit messy with UNC pathnames. With the OP's code the double
backslash lead-in is preserved (although arguably it has split one time too
many; '\\\\frodo' would make more sense as the first element):

>>> parse_path(r'\\frodo\foo\bar')
['\\\\', 'frodo', 'foo', 'bar']

With your code you just get two empty strings as the lead-in:

>>> normpath(abspath(r'\\frodo\foo\bar')).split(os.sep)
['', '', 'frodo', 'foo', 'bar']

--
Duncan Booth http://kupuguy.blogspot.com
Jun 27 '08 #8
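
One possible way to handle the UNC case, sketched under some assumptions: it uses `ntpath.splitdrive`, which in Python 3 treats the `\\host\share` prefix as the "drive", so the first element comes out as the host and share together rather than `'\\\\frodo'` alone. The helper name `split_windows_path` is hypothetical and not from the thread.

```python
import ntpath

def split_windows_path(path):
    """Hypothetical helper: drive (or UNC share) plus root first, then components."""
    # normpath unifies '/' and '\' and collapses repeated separators.
    drive, rest = ntpath.splitdrive(ntpath.normpath(path))
    # Remember whether the remainder is rooted before discarding separators.
    root = ntpath.sep if rest.startswith(ntpath.sep) else ""
    parts = [p for p in rest.split(ntpath.sep) if p]
    anchor = drive + root
    return ([anchor] if anchor else []) + parts

unc = split_windows_path(r"\\frodo\foo\bar")
messy = split_windows_path(r"C:\foo\\bar/baz//spam.py")
print(unc)    # ['\\\\frodo\\foo\\', 'bar']
print(messy)  # ['C:\\', 'foo', 'bar', 'baz', 'spam.py']
```

Because the drive is split off before the remainder is cut on separators, the lead-in never turns into empty strings.
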
eliben wrote:
... a pretty good try ...
> def parse_path(path):
>     """..."""

By the way, the comment is fine. I am going for brevity here.

>     lst = []
>     while 1:
>         head, tail = os.path.split(path)
>         if tail == '':
>             if head != '': lst.insert(0, head)
>             break
>         else:
>             lst.insert(0, tail)
>             path = head
>     return lst
>
> Did I miss something and there is a way to do this standardly ?

Nope, the requirement is rare.

> Is this function valid, or will there be cases that will confuse it ?

parse_path('/a/b/c//d/')  # trailing separator: returns ['/a/b/c//d']

Try something like:

def parse_path(path):
    '''...same comment...'''
    head, tail = os.path.split(path)
    result = []
    if not tail:
        if head == path:
            return [head]
        # Perhaps result = [''] here to indicate ends-in-sep
        head, tail = os.path.split(head)
    while head and tail:
        result.append(tail)
        head, tail = os.path.split(head)
    result.append(head or tail)
    result.reverse()
    return result

--Scott David Daniels
Sc***********@Acm.Org
Jun 27 '08 #9
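
For readers on a current Python: the `pathlib` module, added to the standard library in Python 3.4 (well after this thread), exposes the components directly through the `parts` attribute. A minimal sketch, using the pure path classes so that both flavours can be tried on any platform; the variable names are only illustrative.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The example from the original post's docstring.
win_parts = PureWindowsPath(
    r"C:\Python25\lib\site-packages\zipextimporter.py"
).parts
print(win_parts)
# ('C:\\', 'Python25', 'lib', 'site-packages', 'zipextimporter.py')

# The case that trips up the first version of parse_path:
# a repeated separator and a trailing separator.
posix_parts = PurePosixPath("/a/b/c//d/").parts
print(posix_parts)
# ('/', 'a', 'b', 'c', 'd')
```

`parts` is a tuple rather than a list; wrap it in `list(...)` if a list is needed.
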
