Bytes | Software Development & Data Engineering Community

newb question: file searching

I'm new at Python and I need a little advice. Part of the script I'm
trying to write needs to be aware of all the files of a certain
extension in the script's path and all sub-directories. Can someone
set me on the right path to what modules and calls to use to do that?
You'd think that it would be a fairly simple proposition, but I can't
find examples anywhere. Thanks.

Aug 8 '06 #1
Hey, I've done similar things.

You can use system commands with the following

import os
os.system('command here')

You'd maybe want to do something like

import os
dir_name = 'mydirectory'
# Redirect the output of ls into a temporary file, then read it back.
os.system('ls ' + dir_name + ' > lsoutput.tmp')
fin = open('lsoutput.tmp', 'r')
file_list = fin.readlines()
fin.close()

Now you have a list of all the files in dir_name stored in file_list.

Then you'll have to parse the input with string methods (each line
still ends with a newline, for example). They're easy in Python.
Here's the list of them:
http://docs.python.org/lib/string-methods.html

There is probably a better way to get the data from an os.system
command, but I haven't figured it out. Instead, what I'm doing is
writing the stdout output to a file and reading in the data. It works
fine. Put it in your tmp dir if you're on Linux.
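
One way to skip the temporary file, sketched here on the assumption of a current Python 3 (3.7 or later) and its standard subprocess module. The helper name run_and_capture is made up for illustration, and the demo runs a child Python interpreter rather than ls so it does not depend on the platform:

```python
import subprocess
import sys

def run_and_capture(args):
    """Run a command given as a list of arguments and return its stdout lines."""
    # capture_output=True collects stdout/stderr in memory instead of a file;
    # text=True decodes the bytes to str; check=True raises if the command fails.
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

# Demo command: a child interpreter that prints two made-up file names.
lines = run_and_capture([sys.executable, "-c", "print('a.txt'); print('b.txt')"])
print(lines)
```

On a Unix-like system, run_and_capture(['ls', dir_name]) should give the same listing as the temp-file approach, already split into lines without trailing newlines.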
ja*******@gmail.com wrote:
I'm new at Python and I need a little advice. Part of the script I'm
trying to write needs to be aware of all the files of a certain
extension in the script's path and all sub-directories. Can someone
set me on the right path to what modules and calls to use to do that?
You'd think that it would be a fairly simple proposition, but I can't
find examples anywhere. Thanks.
Aug 8 '06 #2

ja*******@gmail.com wrote:
I'm new at Python and I need a little advice. Part of the script I'm
trying to write needs to be aware of all the files of a certain
extension in the script's path and all sub-directories.
What you want is os.walk().

http://www.python.org/doc/current/lib/os-file-dir.html

Aug 8 '06 #4
ja*******@gmail.com wrote:
I'm new at Python and I need a little advice. Part of the script I'm
trying to write needs to be aware of all the files of a certain
extension in the script's path and all sub-directories. Can someone
set me on the right path to what modules and calls to use to do that?
You'd think that it would be a fairly simple proposition, but I can't
find examples anywhere. Thanks.
dir_name = 'mydirectory'
extension = 'my extension'
import os
files = os.listdir(dir_name)
files_with_ext = [file for file in files if file.endswith(extension)]

That will only do the top level (not subdirectories), but you can use
the os.walk procedure (or some of the other procedures in the os and
os.path modules) to do that.
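
Putting the two ideas together (os.walk for the recursion, endswith for the test), a minimal sketch might look like this. It assumes a current Python 3; find_files and the demo file names are hypothetical, not something from the posts above:

```python
import os
import tempfile

def find_files(top, extension):
    """Return paths of all files under `top` whose names end with `extension`."""
    matches = []
    # os.walk yields (dirpath, dirnames, filenames) for top and each subdirectory.
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            if name.endswith(extension):
                matches.append(os.path.join(dirpath, name))
    return matches

# Small demo on a throwaway directory tree.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub"))
    for rel in ("a.txt", "b.log", os.path.join("sub", "c.txt")):
        open(os.path.join(top, rel), "w").close()
    # Show paths relative to the top directory so the output is readable.
    found = sorted(os.path.relpath(p, top) for p in find_files(top, ".txt"))
    print(found)
```

For the original question, the call would be something like find_files('mydirectory', '.txt'), with the extension written including its dot.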

--Dave

Aug 8 '06 #5
Mike Kent wrote:
What you want is os.walk().

http://www.python.org/doc/current/lib/os-file-dir.html
I'm thinking os.walk() could definitely be a big part of my solution,
but I need a little more info. If I'm reading this correctly, os.walk()
just goes file by file and serves it up for your script to decide what
to do with each one. Is that right? So, for each file it found, I'd
have to decide if it met the criteria of the filetype I'm searching for
and then add that info to whatever datatype I want to make a little
list for myself? Am I being coherent?

Something like:

for files in os.walk(top, topdown=False):
    for name in files:
        (do whatever to decide if criteria is met, etc.)

Does this look correct?

Aug 8 '06 #6
I'm thinking os.walk() could definitely be a big part of my solution,
but I need a little more info. If I'm reading this correctly, os.walk()
just goes file by file and serves it up for your script to decide what
to do with each one. Is that right? So, for each file it found, I'd
have to decide if it met the criteria of the filetype I'm searching for
and then add that info to whatever datatype I want to make a little
list for myself? Am I being coherent?

Something like:

for files in os.walk(top, topdown=False):
    for name in files:
        (do whatever to decide if criteria is met, etc.)

Does this look correct?
IIRC, iterating over os.walk does a depth-first traversal starting from
the directory you give it. Each iteration yields a tuple:
(<path of the current directory>, <list of subdirectories in it>,
<list of files in it>)

--dave

Aug 8 '06 #7
Thanks, Dave. That's exactly what I was looking for, well, except for
a few small alterations I'll make to achieve the desired effect. I
must ask, in the interest of learning, what is

[file for file in files if file.endswith(extension)]

actually doing? I know that 'file' is a type, but what's with the set
up and the brackets and all? Can someone run down the syntax for me on
that? And also, I'm still not sure I know exactly how os.walk() works.
And, finally, the python docs all note that symbols like . and ..
don't work with these commands. How can I grab the directory that my
script is residing in?

Thanks.
hiaips wrote:
ja*******@gmail.com wrote:
I'm new at Python and I need a little advice. Part of the script I'm
trying to write needs to be aware of all the files of a certain
extension in the script's path and all sub-directories. Can someone
set me on the right path to what modules and calls to use to do that?
You'd think that it would be a fairly simple proposition, but I can't
find examples anywhere. Thanks.

dir_name = 'mydirectory'
extension = 'my extension'
import os
files = os.listdir(dir_name)
files_with_ext = [file for file in files if file.endswith(extension)]

That will only do the top level (not subdirectories), but you can use
the os.walk procedure (or some of the other procedures in the os and
os.path modules) to do that.

--Dave
Aug 8 '06 #8

ja*******@gmail.com wrote:
Mike Kent wrote:
What you want is os.walk().

http://www.python.org/doc/current/lib/os-file-dir.html

I'm thinking os.walk() could definitely be a big part of my solution,
but I need a little more info. If I'm reading this correctly, os.walk()
just goes file by file and serves it up for your script to decide what
to do with each one. Is that right? So, for each file it found, I'd
have to decide if it met the criteria of the filetype I'm searching for
and then add that info to whatever datatype I want to make a little
list for myself? Am I being coherent?

Something like:

for files in os.walk(top, topdown=False):
    for name in files:
        (do whatever to decide if criteria is met, etc.)

Does this look correct?
Not completely. According to the documentation, os.walk returns a
tuple:
(directory, subdirectories, files)
So the files you want are in the third element of the tuple.

You can use the fnmatch module to find the names that match your
filename pattern.

You'll want to do something like this:
>>> import os, fnmatch
>>> for (dir, subdirs, files) in os.walk('.'):
...     for cppFile in fnmatch.filter(files, '*.cpp'):
...         print(cppFile)
...
ActiveX Test.cpp
ActiveX TestDoc.cpp
ActiveX TestView.cpp
MainFrm.cpp
StdAfx.cpp
>>>
Please note that your results will vary, of course.

Aug 8 '06 #9

ja*******@gmail.com wrote:
Thanks, Dave. That's exactly what I was looking for, well, except for
a few small alterations I'll make to achieve the desired effect. I
must ask, in the interest of learning, what is

[file for file in files if file.endswith(extension)]

actually doing? I know that 'file' is a type, but what's with the set
up and the brackets and all? Can someone run down the syntax for me on
that? And also, I'm still not sure I know exactly how os.walk() works.
And, finally, the python docs all note that symbols like . and ..
don't work with these commands. How can I grab the directory that my
script is residing in?
[file for file in files if file.endswith(extension)] is called a list
comprehension. Functionally, it is equivalent to something like this:
files_with_ext = []
for file in files:
    if file.endswith(extension):
        files_with_ext.append(file)
However, list comprehensions provide a much more terse, declarative
syntax without sacrificing readability.
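
As a rough rundown of the syntax, the general shape is [expression for item in iterable if condition]. The sketch below, which assumes a current Python 3 and uses made-up file names, builds the same list both ways and then shows that the leading expression can also transform each item:

```python
files = ["notes.txt", "photo.jpg", "todo.txt"]  # made-up names for illustration
extension = ".txt"

# Explicit loop: start empty, test each name, append the ones that pass.
with_loop = []
for name in files:
    if name.endswith(extension):
        with_loop.append(name)

# Same result as a list comprehension:
#   [expression for item in iterable if condition]
with_comprehension = [name for name in files if name.endswith(extension)]

# The leading expression does not have to be the item itself.
upper_names = [name.upper() for name in files if name.endswith(extension)]

print(with_loop == with_comprehension)
print(upper_names)
```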

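To get your current working directory (i.e., the directory your script
was started from, which is not necessarily the directory the script
file lives in):
cwd = os.getcwd()
For the directory containing the script itself, something like
os.path.dirname(os.path.abspath(__file__)) should work when the script
is run from a file.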
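
A caveat worth sketching: the current working directory and the directory holding the script file can differ, for instance when the script is launched from somewhere else. The snippet below assumes a current Python 3; the fallback to the working directory is just a choice made for this example:

```python
import os

# Directory the process is currently working in; changes with os.chdir().
cwd = os.getcwd()

# Directory containing this source file. __file__ is set when the code runs
# from a file; in an interactive session it is missing, so fall back to cwd.
try:
    script_dir = os.path.dirname(os.path.abspath(__file__))
except NameError:
    script_dir = cwd

print(cwd)
print(script_dir)
```

Either value can then be handed to os.walk or os.listdir as the starting directory.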

As far as os.walk...
This is actually a generator (effectively, a function that eventually
produces a sequence, but rather than returning the entire sequence, it
"yields" one value at a time). You might use it as follows:
for x in os.walk(mydirectory):
    for file in x[2]:   # x is (dirpath, dirnames, filenames)
        <whatever tests or code you need to run>
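
To make the "one value at a time" behaviour concrete, here is a small sketch, assuming a current Python 3; the temporary directory and the names in it are invented for the demo. Each value os.walk yields is a (dirpath, dirnames, filenames) tuple, so the file names are the third element:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as top:
    # Build a tiny tree: one file and one empty subdirectory.
    open(os.path.join(top, "a.txt"), "w").close()
    os.mkdir(os.path.join(top, "sub"))

    walker = os.walk(top)   # creates the generator; no directory is read yet
    first = next(walker)    # asks the generator for its first value only

    # Unpacking the tuple is usually clearer than indexing with x[0], x[1], x[2].
    dirpath, dirnames, filenames = first
    print(dirnames)
    print(filenames)
```

A for loop over os.walk does the same thing implicitly, calling next() until the generator runs out of directories.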

You might want to read up on the os and os.path modules - these
probably have many of the utilities you're looking for.

Good luck,
dave

Aug 8 '06 #10
