Simple script to merge multiple text files

Hi,

I have just started using Python and I am slowly getting into it. I wanted to make a little script to merge all files in a directory into one.

All of these files will be text files.

I know I need to use os.path.walk() to walk through all the files, and a check such as "if not os.path.isdir(filename)" together with open(filename, 'r').read() to read every file that is not a directory.

Can someone give me a clue or a very simple example that I can work with please?

Thanks in advance.
Nov 1 '07 #1
bvdet
2,851 Expert Mod 2GB
I think this code will do what you want:
import os

def dir_list(dir_name, subdir, *args):
    r'''Return a list of file names in directory 'dir_name'.
    If 'subdir' is True, recursively access subdirectories under 'dir_name'.
    Additional arguments, if any, are file extensions to add to the list.
    Example usage: fileList = dir_list(r'H:\TEMP', False, 'txt', 'py', 'dat', 'log', 'jpg')
    '''
    fileList = []
    for file in os.listdir(dir_name):
        dirfile = os.path.join(dir_name, file)
        if os.path.isfile(dirfile):
            # no extensions given: accept every file
            if len(args) == 0:
                fileList.append(dirfile)
            # otherwise compare the extension, without its leading dot
            elif os.path.splitext(dirfile)[1][1:] in args:
                fileList.append(dirfile)

        # recursively access file names in subdirectories
        elif os.path.isdir(dirfile) and subdir:
            # print("Accessing directory:", dirfile)
            fileList += dir_list(dirfile, subdir, *args)
    return fileList

def combine_files(fileList, fn):
    f = open(fn, 'w')
    for file in fileList:
        print('Writing file %s' % file)
        # close each input file as soon as it has been copied
        g = open(file)
        f.write(g.read())
        g.close()
    f.close()

if __name__ == '__main__':
    search_dir = "C:/directory"
    fn = "output_file.txt"
    combine_files(dir_list(search_dir, False, 'txt'), fn)
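The extension test in dir_list relies on os.path.splitext, which returns the extension with its leading dot; the [1:] slice drops the dot so that callers can pass 'txt' rather than '.txt'. Here is a minimal sketch of just that check. The name has_extension is a hypothetical helper for illustration and is not part of the code above.

```python
import os

def has_extension(path, *extensions):
    """Return True if 'path' ends in one of 'extensions' (given without the dot).

    Hypothetical helper for illustration; with no extensions, every path matches.
    """
    if not extensions:
        return True
    # splitext('notes.txt') -> ('notes', '.txt'); [1:] strips the leading dot
    return os.path.splitext(path)[1][1:] in extensions

print(has_extension("C:/directory/notes.txt", "txt"))      # True
print(has_extension("C:/directory/notes.TXT", "txt"))      # False: the test is case-sensitive
print(has_extension("C:/directory/archive.tar.gz", "gz"))  # True: only the last suffix counts
print(has_extension("C:/directory/README", "txt"))         # False: no extension at all
```

If upper-case extensions should match as well, lower-casing the extension before the comparison is one option.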
Nov 1 '07 #2
Thanks, you have helped a lot :)
Nov 1 '07 #3
ghostdog74
511 Expert 256MB
Another way:
import os, shutil

# open the output in append mode so existing content is kept
f = open("/tmp/fileappend.txt", "a")
for r, d, fi in os.walk("/home/me"):
    for files in fi:
        if files.endswith(".txt"):
            g = open(os.path.join(r, files))
            # copy the contents of g to the end of f
            shutil.copyfileobj(g, f)
            g.close()
f.close()
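The os.walk approach can also be wrapped in a function. The sketch below is only one possible arrangement: merge_text_files and its parameters are hypothetical names, binary mode is my own choice to sidestep encoding questions, and the check that skips the output file is an addition for the case where the output lives inside the directory being walked.

```python
import os
import shutil

def merge_text_files(search_dir, out_path, suffix=".txt"):
    """Append every file under 'search_dir' ending in 'suffix' to 'out_path'.

    Hypothetical wrapper for illustration. Returns the list of merged paths.
    """
    merged = []
    out_abs = os.path.abspath(out_path)
    with open(out_path, "ab") as out:
        for root, dirs, files in os.walk(search_dir):
            dirs.sort()  # visit subdirectories in a stable order
            for name in sorted(files):
                path = os.path.join(root, name)
                # skip other file types, and never copy the output into itself
                if not name.endswith(suffix) or os.path.abspath(path) == out_abs:
                    continue
                with open(path, "rb") as src:
                    shutil.copyfileobj(src, out)  # copies in chunks
                merged.append(path)
    return merged
```

Calling merge_text_files("/home/me", "/tmp/fileappend.txt") would mirror the script above. Because the output is opened in append mode, running it twice adds the contents twice.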
Nov 2 '07 #4
This seemed to work for a while, but now it fails with "NameError: name 'False' is not defined".
Nov 2 '07 #5
bvdet
2,851 Expert Mod 2GB
>>> bool(0)
False
>>> bool(1)
True
The integers 0 and 1 can be substituted for False and True respectively. The question is what happened to False on your system?
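To illustrate the substitution: the flag is only tested for truthiness, so the integers 0 and 1 behave like False and True there. This is a small sketch in which pick_mode is a hypothetical stand-in for the 'subdir' test in dir_list. Note that it has to be the integers, not the strings '0' and '1', since any non-empty string counts as true.

```python
def pick_mode(subdir):
    """Hypothetical stand-in for the 'subdir' test inside dir_list."""
    # only the truthiness of the flag is examined
    if subdir:
        return "recursive"
    return "top level only"

print(pick_mode(False), "/", pick_mode(0))  # both give "top level only"
print(pick_mode(True), "/", pick_mode(1))   # both give "recursive"
print(0 == False, 1 == True)                # True True: bool is a subclass of int
print(pick_mode("0"))                       # recursive: a non-empty string is truthy
```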
Nov 2 '07 #6
I did a quick reset on my machine and it is working now. Very strange.
Nov 9 '07 #7
