remove header line when reading/writing files

I'm a newbie with a large number of data files in multiple
directories. I want to uncompress, read, and copy the contents of
each file into one master data file. The code below seems to be doing
this perfectly. The problem is each of the data files has a header
row in the first line, which I do not want in the master file. How
can I skip that first line when writing to the master file? Any help
is much appreciated. Thank you.

import os
import sys
import glob
import gzip

zipdir = "G:/Research/Data/"
outfilename = "G:/Research/Data/master_data.txt"
outfile = open(outfilename, 'w')
os.chdir(zipdir)
dirlist = os.listdir(os.curdir)
for item in dirlist:
    if os.path.isdir(item):
        os.chdir(item)
        filelist = glob.glob("*.gz")
        for zipfile in filelist:
            filein = gzip.GzipFile(zipfile, 'r')
            filecontent = filein.read()
            filein.close()
            outfile.write(filecontent)
        os.chdir(os.pardir)
outfile.close()

Oct 11 '07 #1
> each file into one master data file. The code below seems to be doing
> this perfectly. The problem is each of the data files has a header
> row in the first line, which I do not want in the master file. How
> can I skip that first line when writing to the master file? Any help
> is much appreciated. Thank you.
[snip]
>         for zipfile in filelist:
>             filein = gzip.GzipFile(zipfile, 'r')
>             filecontent = filein.read()
>             filein.close()
>             outfile.write(filecontent)

for zipfile in filelist:
    for i, line in gzip.GzipFile(zipfile, 'r'):
        if i: outfile.write(line)

should do the trick for you.

If you like a little more readable code, you can change that line to

if i > 0: outfile.write(line)

or

if i == 0: continue
outfile.write(line)

whichever you like.

-tkc

Oct 11 '07 #2
Forgot the enumerate call, of all things:

for zipfile in filelist:
    for i, line in enumerate(gzip.GzipFile(zipfile, 'r')):
        if i: outfile.write(line)

Some days, I'm braindead.

-tkc
Oct 11 '07 #3
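
For readers on Python 3, the `enumerate` approach above might be written as the minimal sketch below. It assumes the data files are text, so it opens them with `gzip.open(..., "rt")` rather than `gzip.GzipFile`, which yields bytes on Python 3. The function name `append_without_header` is hypothetical and does not come from the thread.

```python
import gzip


def append_without_header(gz_path, out_file):
    """Copy every line except the first from a gzip file to out_file."""
    # "rt" decodes the compressed bytes to text, so the lines can be
    # written to an output file that was opened in text mode.
    with gzip.open(gz_path, "rt") as in_file:
        for i, line in enumerate(in_file):
            if i > 0:  # index 0 is the header row
                out_file.write(line)
```

It would be called once per file inside the existing loop, for example `append_without_header(zipfile, outfile)`.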
On Oct 12, 12:23 pm, Tim Chase <python.l...@tim.thechases.com> wrote:
> Forgot the enumerate call of all things
> for zipfile in filelist:
>     for i, line in enumerate(gzip.GzipFile(zipfile, 'r')):
>         if i: outfile.write(line)
>
> Some days, I'm braindead.
>
> -tkc

I would move the 'if' test outside the loop:

for zipfile in filelist:
    zfiter = iter(gzip.GzipFile(zipfile, 'r'))
    zfiter.next()  # ignore header line
    for i, line in enumerate(zfiter):
        outfile.write(line)

I'm not sure if the iter(...) is required. This will raise a
StopIteration exception if zipfile is empty.

Cheers
Tim

Oct 12 '07 #4
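
The `StopIteration` concern for empty files can be sidestepped by giving `next()` a default value. The sketch below assumes Python 3 and text data; the generator name `lines_after_header` is made up for illustration. On Python 3 the built-in `next(iterator)` replaces the `iterator.next()` method used in the thread.

```python
import gzip


def lines_after_header(gz_path):
    """Yield the lines of a gzip text file, minus the first one."""
    with gzip.open(gz_path, "rt") as in_file:
        # With a default argument, next() returns None for an empty
        # file instead of raising StopIteration.
        next(in_file, None)
        for line in in_file:
            yield line
```

A caller could then write `outfile.writelines(lines_after_header(zipfile))` for each file, and an empty archive simply contributes nothing.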
ti******@gmail.com wrote:
> ...
> for zipfile in filelist:
>     zfiter = iter(gzip.GzipFile(zipfile, 'r'))
>     zfiter.next()  # ignore header line
>     for i, line in enumerate(zfiter):
>         outfile.write(line)

Or even:

writes = outfile.write
for zipfile in filelist:
    zfiter = iter(gzip.GzipFile(zipfile, 'r'))
    zfiter.next()  # ignore header line
    for line in zfiter:
        writes(line)
Oct 12 '07 #5
On Thu, 11 Oct 2007 22:52:55 +0000, RyanL wrote:
> I'm a newbie with a large number of data files in multiple
> directories. I want to uncompress, read, and copy the contents of
> each file into one master data file. The code below seems to be doing
> this perfectly. The problem is each of the data files has a header
> row in the first line, which I do not want in the master file. How
> can I skip that first line when writing to the master file? Any help
> is much appreciated. Thank you.

Untested version with `itertools.islice()`:

import glob
import gzip
import os
from itertools import islice


def main():
    zipdir = 'G:/Research/Data/'
    outfilename = 'G:/Research/Data/master_data.txt'
    out_file = open(outfilename, 'w')
    os.chdir(zipdir)  # the listdir call below is relative to the data directory
    for name in os.listdir(os.curdir):
        if os.path.isdir(name):
            os.chdir(name)
            for zip_name in glob.glob('*.gz'):
                in_file = gzip.GzipFile(zip_name, 'r')
                # islice(in_file, 1, None) yields every line after the first
                out_file.writelines(islice(in_file, 1, None))
                in_file.close()
            os.chdir(os.pardir)
    out_file.close()


if __name__ == '__main__':
    main()

Ciao,
Marc 'BlackJack' Rintsch
Oct 12 '07 #6
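
A Python 3 variant of the `islice()` idea that avoids the repeated `os.chdir()` calls might look like the sketch below. It is an illustration under stated assumptions, not the code from the thread: the function name `build_master` is hypothetical, the files are assumed to be text, the archives are assumed to sit exactly one directory level below the data directory, and the sorted processing order is a choice made here for reproducibility.

```python
import glob
import gzip
import os
from itertools import islice


def build_master(zipdir, out_path):
    """Concatenate all *.gz files one level below zipdir, minus headers."""
    # Matches <zipdir>/<subdirectory>/<anything>.gz without changing
    # the working directory.
    pattern = os.path.join(zipdir, "*", "*.gz")
    with open(out_path, "w") as out_file:
        for gz_path in sorted(glob.glob(pattern)):
            with gzip.open(gz_path, "rt") as in_file:
                # islice(in_file, 1, None) skips line 0 and yields the
                # rest; an empty file just yields nothing.
                out_file.writelines(islice(in_file, 1, None))
```

With the paths from the original question, the call would be `build_master("G:/Research/Data/", "G:/Research/Data/master_data.txt")`. The `with` blocks close both files even if reading one archive fails.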
