Expanding Search to Subfolders

This is the beginning of a script that I wrote to open all the text
files in a single directory, then process the data in the text files
line by line into a single index file.

os.chdir("C:\\Python23\\programs\\filetree")
mydir = glob.glob("*.txt")

index = open("index.rtf", 'w')

for File in mydir:
    count = 1
    file = open(File)
    fileContent = file.readlines()
    for line in fileContent:
        if not line.startswith("\n"):
            if count == 1:

I'm now trying to get the program to process all the text files in
subdirectories, so that I don't have to run the script more than once.
I know that the following script will SHOW me the contents of the
subdirectories, but I can't integrate the two:

def print_tree(tree_root_dir):
    def printall(junk, dirpath, namelist):
        for name in namelist:
            print os.path.join(dirpath, name)
    os.path.walk(tree_root_dir, printall, None)

print_tree("C:\\Python23\\programs\\filetree")

I've taught myself out of online tutorials, so I think that this is a
matter of a command that I haven't learned rather than a matter of logic.
Could someone tell me where to learn more about directory processes or
show me an improved version of my first script snippet?

Thanks

Jun 5 '06 #1
On 5 Jun 2006 10:01:06 -0700, PipedreamerGrey <pi*************@gmail.com> wrote:
Could someone tell me where to learn more about directory processes or
show me an improved version of my first script snippet?

Thanks

--
http://mail.python.org/mailman/listinfo/python-list

How about something like:

import os, stat

class DirectoryWalker:
    # a forward iterator that traverses a directory tree, and
    # returns the filename

    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while 1:
            try:
                file = self.files[self.index]
                self.index = self.index + 1
            except IndexError:
                # pop next directory from stack
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                # got a filename
                fullname = os.path.join(self.directory, file)
                if os.path.isdir(fullname) and not os.path.islink(fullname):
                    self.stack.append(fullname)
                else:
                    return fullname

for file, st in DirectoryWalker("."):
    your function here

not tested

Lou
--
Artificial Intelligence is no match for Natural Stupidity
Jun 5 '06 #2
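
The DirectoryWalker class above leans on the old `__getitem__` iteration protocol: the `for` loop keeps asking for the next item until an IndexError escapes, which here happens when `self.stack.pop()` is called on an empty stack. For readers on Python 3, roughly the same stack-based traversal can be sketched as a generator. This is only an illustration, not code from the thread, and the name `walk_files` is made up.

```python
import os

def walk_files(directory):
    """Yield the full path of every non-directory entry under `directory`.

    Hypothetical helper: same stack-based traversal as DirectoryWalker,
    written as a generator instead of a __getitem__ class.
    """
    stack = [directory]
    while stack:
        # Take the next directory to scan off the stack.
        current = stack.pop()
        for name in os.listdir(current):
            fullname = os.path.join(current, name)
            if os.path.isdir(fullname) and not os.path.islink(fullname):
                # Real subdirectory: remember it for a later pass.
                stack.append(fullname)
            else:
                # Regular file (or symlink): hand it to the caller.
                yield fullname
```

A loop such as `for path in walk_files("."):` then receives one path at a time, which matches how the class is used in the posts that follow.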
On 2006-06-05, PipedreamerGrey <pi*************@gmail.com> wrote:

Just in case you really are trying to accomplish something
other than learn Python, there are far easier ways to do these
tasks:

> This is the beginning of a script that I wrote to open all the
> text files in a single directory, then process the data in the
> text files line by line into a single index file.

#!/bin/bash
cat *.txt >outputfile

> I'm now trying to get the program to process all the text files in
> subdirectories, so that I don't have to run the script more
> than once.

#!/bin/bash
cat `find . -name '*.txt'` >outputfile

--
Grant Edwards (grante at visi.com)
Yow! An INK-LING? Sure -- TAKE one!! Did you BUY any COMMUNIST UNIFORMS??
Jun 5 '06 #3
> there are far easier ways
> #!/bin/bash
> cat *.txt >outputfile

Well, yes, but if he's kicking things off with:

os.chdir("C:\\Python23\\programs\\filetree")

I'm guessing he's not on Linux. Maybe you're trying to convert him?

rd

Jun 5 '06 #4
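
Since the original poster appears to be on Windows, a rough Python counterpart to the `cat`/`find` one-liners may be more useful there. The sketch below assumes Python 3.5 or later, where `glob.glob` accepts `recursive=True` and `**` matches any depth of subdirectory. `concat_text_files` is a hypothetical helper name, not something from the thread.

```python
import glob
import os

def concat_text_files(root, output_path):
    """Append the contents of every *.txt file under `root` to `output_path`.

    Hypothetical helper; a sketch of `cat $(find . -name '*.txt')` in Python.
    """
    # "**" with recursive=True matches zero or more directory levels,
    # so files directly in `root` are included as well.
    pattern = os.path.join(glob.escape(root), "**", "*.txt")
    out_abs = os.path.abspath(output_path)
    with open(output_path, "w") as out:
        for path in sorted(glob.glob(pattern, recursive=True)):
            # Do not copy the output file into itself.
            if os.path.abspath(path) == out_abs:
                continue
            with open(path) as src:
                out.write(src.read())
```

Unlike the original script, this copies files whole rather than filtering blank lines, which is closer to what `cat` does.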
> Could someone tell me where to learn more about directory
> processes or show me an improved version of my first
> script snippet?


Use os.walk

http://docs.python.org/lib/os-file-dir.html

It takes a little reading to get it if you are a beginner, but there
are zillions of examples if you just search this Google Group on
"os.walk"

http://tinyurl.com/kr3m6

Good luck

rd

"I don't have any solution, but I certainly admire the
problem."--Ashleigh Brilliant

Jun 6 '06 #5
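
To make the `os.walk` suggestion concrete, here is a minimal sketch of how the original single-directory script might be extended to subfolders. It is written for Python 3, which the thread predates, and the function names `find_text_files` and `build_index` are invented for illustration; only `os.walk`, `os.path.splitext`, `os.path.join` and `os.path.abspath` come from the standard library.

```python
import os

def find_text_files(root, ext=".txt"):
    """Yield paths of files under `root` (recursively) ending in `ext`."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # Assumption: compare extensions case-insensitively.
            if os.path.splitext(name)[1].lower() == ext:
                yield os.path.join(dirpath, name)

def build_index(root, index_path):
    """Copy each non-blank line of every matching file into `index_path`."""
    index_abs = os.path.abspath(index_path)
    with open(index_path, "w") as index:
        for path in find_text_files(root):
            # Skip the index itself if it happens to match the extension.
            if os.path.abspath(path) == index_abs:
                continue
            with open(path) as src:
                for line in src:
                    # Same test as the original script: drop empty lines.
                    if not line.startswith("\n"):
                        index.write(line)
            # Blank line between the contents of consecutive files.
            index.write("\n")
```

Calling `build_index(".", "index.rtf")` from the top-level folder would then do in one pass what previously required running the script once per directory.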
Thanks, that was a big help. It worked fine once I removed
os.chdir("C:\\Python23\\programs\\Magazine\\Sample sE")


and changed "for file, st in DirectoryWalker("."):"
to
"for file in DirectoryWalker("."):" (removing the "st")

Jun 6 '06 #6
Thanks everyone!

Jun 6 '06 #7
Here's the final working script. It opens all of the text files in a
directory and its subdirectories and combines them into one Rich text
file (index.rtf):

#! /usr/bin/python
import glob
import fileinput
import os
import string
import sys

index = open("index.rtf", 'w')

class DirectoryWalker:
    # a forward iterator that traverses a directory tree, and
    # returns the filename

    def __init__(self, directory):
        self.stack = [directory]
        self.files = []
        self.index = 0

    def __getitem__(self, index):
        while 1:
            try:
                file = self.files[self.index]
                self.index = self.index + 1
            except IndexError:
                # pop next directory from stack
                self.directory = self.stack.pop()
                self.files = os.listdir(self.directory)
                self.index = 0
            else:
                # get a filename, eliminate directories from list
                fullname = os.path.join(self.directory, file)
                if os.path.isdir(fullname) and not os.path.islink(fullname):
                    self.stack.append(fullname)
                else:
                    return fullname

for file in DirectoryWalker("."):
    # divide file names into path and extension
    path, ext = os.path.splitext(file)
    # choose the extension you would like to see in the list
    if ext == ".txt":
        print file

        # print the contents of each file into the index
        file = open(file)
        fileContent = file.readlines()
        for line in fileContent:
            if not line.startswith("\n"):
                index.write(line)
        index.write("\n")

index.close()

Jun 6 '06 #8
Lou Losee wrote:

> How about something like:
>
> import os, stat
>
> class DirectoryWalker:
>     # a forward iterator that traverses a directory tree, and
>     # returns the filename
> ...
>
> not tested


speak for yourself ;-)

(the code is taken from http://effbot.org/librarybook/os-path.htm )

</F>

Jun 6 '06 #9
On 6/6/06, Fredrik Lundh <fr*****@pythonware.com> wrote:
speak for yourself ;-)

(the code is taken from http://effbot.org/librarybook/os-path.htm )

</F>

Well, that is good to know :) I know I did not get it from there, but I
had it in my snippets directory.

Lou
--
Artificial Intelligence is no match for Natural Stupidity
Jun 6 '06 #10
