Dumb glob question

I've run into an issue with glob and matching filenames with brackets '[]'
in them. The problem comes when I'm using part of such a filename as the
path I'm passing to glob. Here's a trimmed down dumb example. Let's say I
have a directory with the following files in it.

foo.par2
foo.vol0+1.par2
foo.vol1+1.par2
zzz [foo].par2
zzz [foo].vol0+1.par2
zzz [foo].vol1+1.par2

While processing one of the files I want to do certain things in batch so
I've been using glob as a means to get all of the files in a set. The
following code will print the filenames for parity volumes in each set
while working with the base checksum, unless there are brackets in the
name.
import glob, os, re

re2 = re.compile(r'vol', re.IGNORECASE)

for nuke in glob.glob('*.par2'):
    if not re2.search(nuke):
        # nuke[:-5] drops the '.par2' extension, leaving the set's base name
        list = glob.glob(nuke[:-5] + '*vol*')
        for name in list:
            print(os.path.join(os.getcwd(), name))

I'm sure there is something obvious I'm missing. I figured I could use
something like re.escape on the trimmed filename for matching but that
hasn't worked either. Using win32api.FindFiles instead of glob works but
I'd obviously rather do it the _right_ way and have it work properly in
*nix too.
Jul 18 '05 #1
Code like the one below will print all files ending in 'par2', except those
not containing 'vol' from the 5th position onward. Is that what you need?

import glob
for nuke in glob.glob(r"""c:\temp\*.par2"""):
    try:
        # index() raises ValueError when 'vol' is not found at or after position 5
        nuke.index('vol', 5)
        print(nuke)
    except ValueError as e:
        print(e)

Jul 18 '05 #2
"wi******@hotmail.com" <wi******@hotmail.com> wrote in comp.lang.python:
Code like the one below will print all files ending in 'par2', except those
not containing 'vol' from the 5th position onward. Is that what you need?

import glob
for nuke in glob.glob(r"""c:\temp\*.par2"""):
    try:
        # index() raises ValueError when 'vol' is not found at or after position 5
        nuke.index('vol', 5)
        print(nuke)
    except ValueError as e:
        print(e)


Not quite. I'm sorry my example wasn't very clear. While working with any
single file I need to be able to build a list of all the other files in a
particular set. Basically I just need globbing of the base filename.

glob.glob(basename+'.*some_extension')

So if I was working with 'foo.par2' at the moment...

glob.glob(filename[:-5]+'.*par2')

would catch all of the files belonging to the set including 'foo.par2'
'foo.vol0+1.par2' 'foo.vol1+1.par2' etc.

This works great (as expected) until you are working with a filename with
brackets '[]' in it. Then glob just returns an empty list. So if I happen
to be processing 'foo [bar].par2'

glob.glob(filename[:-5]+'.*par2')

doesn't return anything. Using win32api.FindFiles(filename[:-5]+'.*par2')
works perfectly, but I don't want to rely on win32api functions. I hope
that made more sense :).
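A minimal sketch that reproduces the behaviour described above, assuming Python 3 and a throwaway temporary directory holding the six files from the first post. The helper name glob_set is made up for this illustration; the pattern it builds is the one from the post.

```python
import glob
import os
import tempfile

FILES = [
    'foo.par2', 'foo.vol0+1.par2', 'foo.vol1+1.par2',
    'zzz [foo].par2', 'zzz [foo].vol0+1.par2', 'zzz [foo].vol1+1.par2',
]

def glob_set(directory, filename):
    """Glob for the set that filename belongs to, the way the post does."""
    pattern = filename[:-5] + '.*par2'   # strip '.par2', then add the wildcard
    hits = glob.glob(os.path.join(directory, pattern))
    return sorted(os.path.basename(h) for h in hits)

with tempfile.TemporaryDirectory() as tmp:
    # Create empty files so there is something to match against.
    for name in FILES:
        open(os.path.join(tmp, name), 'w').close()
    plain = glob_set(tmp, 'foo.par2')
    bracketed = glob_set(tmp, 'zzz [foo].par2')

print(plain)      # the three 'foo' files
print(bracketed)  # empty list: the brackets are read as pattern syntax
```

The sketch assumes the temporary directory path itself contains no glob metacharacters, which is normally the case.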

Jul 18 '05 #3
Python Dunce wrote:
So if I happen
to be processing 'foo [bar].par2'

glob.glob(filename[:-5]+'.*par2')

doesn't return anything. Using win32api.FindFiles(filename[:-5]+'.*par2')
works perfectly, but I don't want to rely on win32api functions. I hope
that made more sense :).


If you look in the source for glob.py, you will find that it calls the
fnmatch module, and this is the docstring for fnmatch.translate():

"""Translate a shell PATTERN to a regular expression.

There is no way to quote meta-characters.
"""

So you cannot do what you want with glob.
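To see why the bracketed name fails, it may help to test the pattern with fnmatch directly. In the sketch below (Python 3 assumed), '[foo]' in the pattern is treated as a character class matching a single 'f' or 'o', so the literal name never matches.

```python
import fnmatch

pattern = 'zzz [foo].*par2'

# The literal name does not match: '[' is not one of the characters f or o.
literal_match = fnmatch.fnmatchcase('zzz [foo].par2', pattern)

# A name with a single 'o' in that position does match the pattern.
class_match = fnmatch.fnmatchcase('zzz o.vol0+1.par2', pattern)

print(literal_match, class_match)
```

fnmatchcase is used here instead of fnmatch so that the comparison does not depend on the operating system's case rules.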

You can replace [] with ? in your glob string, if you are sure that
there won't be other characters there. That's a bit of a hack, and I
wouldn't do it.
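A rough sketch of the '?' workaround and why it is a hack, assuming Python 3. The helper loosen_brackets and the extra name 'zzz (foo).par2' are invented for this illustration; the point is that '?' accepts any single character, so unrelated files can slip in.

```python
import fnmatch

def loosen_brackets(base):
    """Hypothetical helper: swap each bracket for the single-character wildcard."""
    return base.replace('[', '?').replace(']', '?')

pattern = loosen_brackets('zzz [foo]') + '.*par2'   # becomes 'zzz ?foo?.*par2'

names = ['zzz [foo].par2', 'zzz [foo].vol0+1.par2', 'zzz (foo).par2', 'foo.par2']
matched = fnmatch.filter(names, pattern)

print(pattern)
print(matched)   # includes 'zzz (foo).par2', which is not part of the set
```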

In my mind it would probably be best to do:

import os, re

re_vol = re.compile(re.escape(startpart) + ".*vol.*")
lst = [filename for filename in os.listdir(".") if re_vol.match(filename)]

I changed "list" to "lst" because the former shadows a built-in.
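For readers on a newer Python than this thread was written for: version 3.4 added glob.escape(), which quotes the metacharacters '*', '?' and '[' by wrapping each in a character class. The sketch below assumes Python 3.4 or later; the function name par2_set is made up for this example and is not part of any library.

```python
import glob
import os
import tempfile

def par2_set(filename):
    """Return every file in the same set as filename (a path ending in '.par2')."""
    base = filename[:-5]                      # drop the '.par2' extension
    # glob.escape() makes the base literal; only the trailing '*' stays a wildcard.
    return sorted(glob.glob(glob.escape(base) + '.*par2'))

escaped = glob.escape('zzz [foo]')            # 'zzz [[]foo]'

with tempfile.TemporaryDirectory() as tmp:
    for name in ('foo.par2', 'zzz [foo].par2',
                 'zzz [foo].vol0+1.par2', 'zzz [foo].vol1+1.par2'):
        open(os.path.join(tmp, name), 'w').close()
    paths = par2_set(os.path.join(tmp, 'zzz [foo].par2'))
    found = [os.path.basename(p) for p in paths]

print(escaped)
print(found)
```

On older interpreters the same effect can be had by hand, replacing each '[' in the base name with '[[]' before appending the wildcard part.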
--
Michael Hoffman
Jul 18 '05 #4
Thanks, that should do the trick! I had tried basically the same thing
once but I was getting back empty lists. I think it was just a brain fart
involving a case-sensitive regex that didn't match the files I was testing
it on :/.
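A sketch of the suggested regex approach with the case problem taken out, assuming Python 3. The function name parity_volumes and the sample file names are made up for illustration. Note that re.IGNORECASE applies to the whole pattern, so the base name is matched case-insensitively too.

```python
import os
import re
import tempfile

def parity_volumes(startpart, directory='.'):
    """List files in directory that start with startpart and contain 'vol'."""
    # re.escape() makes the brackets literal; IGNORECASE also accepts 'VOL'.
    re_vol = re.compile(re.escape(startpart) + r'.*vol.*', re.IGNORECASE)
    return sorted(f for f in os.listdir(directory) if re_vol.match(f))

with tempfile.TemporaryDirectory() as tmp:
    for name in ('zzz [foo].par2', 'zzz [foo].VOL0+1.PAR2',
                 'zzz [foo].vol1+1.par2', 'foo.vol0+1.par2'):
        open(os.path.join(tmp, name), 'w').close()
    volumes = parity_volumes('zzz [foo]', tmp)

print(volumes)
```

Because re.match() anchors at the start of the string, 'foo.vol0+1.par2' is not picked up even though it contains 'vol'.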
Jul 18 '05 #5