remove header line when reading/writing files

I'm a newbie with a large number of data files in multiple
directories. I want to uncompress, read, and copy the contents of
each file into one master data file. The code below seems to be doing
this perfectly. The problem is each of the data files has a header
row in the first line, which I do not want in the master file. How
can I skip that first line when writing to the master file? Any help
is much appreciated. Thank you.

import os
import sys
import glob
import gzip
zipdir = "G:/Research/Data/"
outfilename = "G:/Research/Data/master_data.txt"
outfile = open(outfilename,'w')
os.chdir(zipdir)
dirlist = os.listdir(os.curdir)
for item in dirlist:
    if os.path.isdir(item):
        os.chdir(item)
        filelist = glob.glob("*.gz")
        for zipfile in filelist:
            filein = gzip.GzipFile(zipfile,'r')
            filecontent = filein.read()
            filein.close()
            outfile.write(filecontent)
        os.chdir(os.pardir)
outfile.close()

Oct 11 '07 #1
> each file into one master data file. The code below seems to be doing
> this perfectly. The problem is each of the data files has a header
> row in the first line, which I do not want in the master file. How
> can I skip that first line when writing to the master file? Any help
> is much appreciated. Thank you.
[snip]
>         for zipfile in filelist:
>             filein = gzip.GzipFile(zipfile,'r')
>             filecontent = filein.read()
>             filein.close()
>             outfile.write(filecontent)

for zipfile in filelist:
    for i, line in gzip.GzipFile(zipfile,'r'):
        if i: outfile.write(line)

should do the trick for you.

If you like a little more readable code, you can change that line to

if i > 0: outfile.write(line)

or

if i == 0: continue
outfile.write(line)

whichever you like.

-tkc

Oct 11 '07 #2
Forgot the enumerate call of all things
for zipfile in filelist:
    for i, line in enumerate(gzip.GzipFile(zipfile,'r')):
        if i: outfile.write(line)

Some days, I'm braindead.

-tkc
Oct 11 '07 #3
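
For readers on Python 3, the enumerate approach from the two posts above might look roughly like the sketch below. It is not the posters' code: the helper name `append_without_header` is hypothetical, and it assumes the archives hold text that can be read with `gzip.open(..., 'rt')` and that `outfile` is any object with a `write()` method.

```python
import gzip

def append_without_header(gz_path, outfile):
    """Copy every line except the first from a gzipped text file to outfile."""
    # 'rt' opens the archive in text mode, so each line is a str, not bytes.
    with gzip.open(gz_path, 'rt') as infile:
        for i, line in enumerate(infile):
            if i == 0:
                continue  # index 0 is the header row; skip it
            outfile.write(line)
```

Reading line by line also avoids loading a whole file into memory, which the original `filein.read()` call does.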
On Oct 12, 12:23 pm, Tim Chase <python.l...@tim.thechases.com> wrote:
> Forgot the enumerate call of all things
>
> for zipfile in filelist:
>     for i, line in enumerate(gzip.GzipFile(zipfile,'r')):
>         if i: outfile.write(line)
>
> Some days, I'm braindead.
>
> -tkc

I would move the 'if' test outside the loop:

for zipfile in filelist:
    zfiter = iter(gzip.GzipFile(zipfile,'r'))
    zfiter.next() # ignore header line (Python 2 spelling; next(zfiter) in Python 3)
    for i, line in enumerate(zfiter):
        outfile.write(line)

I'm not sure if the iter(...) is required. This will raise a
StopIteration exception if zipfile is empty.

Cheers
Tim

Oct 12 '07 #4
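
Both open points in the post above (whether `iter(...)` is needed, and the `StopIteration` on an empty file) can be sidestepped in Python 3. The sketch below is an illustration under assumptions, not the poster's code: `copy_body` is a hypothetical helper name, and the files are assumed to be gzipped text. The built-in `next()` accepts a default value that is returned instead of raising when the iterator is exhausted.

```python
import gzip

def copy_body(gz_path, outfile):
    """Write all lines after the first; write nothing for an empty file."""
    with gzip.open(gz_path, 'rt') as infile:
        # The file object is already its own iterator, so no iter() call
        # is needed. The default (None) suppresses StopIteration when the
        # file has no lines at all.
        next(infile, None)
        for line in infile:
            outfile.write(line)
```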
ti******@gmail.com wrote:
> ...
> for zipfile in filelist:
>     zfiter = iter(gzip.GzipFile(zipfile,'r'))
>     zfiter.next() # ignore header line
>     for i, line in enumerate(zfiter):
>         outfile.write(line)

Or even:
writes = outfile.write
for zipfile in filelist:
    zfiter = iter(gzip.GzipFile(zipfile,'r'))
    zfiter.next() # ignore header line
    for line in zfiter:
        writes(line)
Oct 12 '07 #5
On Thu, 11 Oct 2007 22:52:55 +0000, RyanL wrote:
> I'm a newbie with a large number of data files in multiple
> directories. I want to uncompress, read, and copy the contents of
> each file into one master data file. The code below seems to be doing
> this perfectly. The problem is each of the data files has a header
> row in the first line, which I do not want in the master file. How
> can I skip that first line when writing to the master file? Any help
> is much appreciated. Thank you.

Untested version with `itertools.islice()`:

import glob
import gzip
import os
from itertools import islice


def main():
    zipdir = 'G:/Research/Data/'
    outfilename = 'G:/Research/Data/master_data.txt'
    out_file = open(outfilename, 'w')
    os.chdir(zipdir)  # work relative to the data directory
    for name in os.listdir(os.curdir):
        if os.path.isdir(name):
            os.chdir(name)
            for zip_name in glob.glob('*.gz'):
                in_file = gzip.GzipFile(zip_name, 'r')
                # islice(in_file, 1, None) skips the first (header) line
                out_file.writelines(islice(in_file, 1, None))
                in_file.close()
            os.chdir(os.pardir)
    out_file.close()


if __name__ == '__main__':
    main()

Ciao,
Marc 'BlackJack' Rintsch
Oct 12 '07 #6
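
A Python 3 take on the `islice()` version above could avoid the `os.chdir()` calls by globbing one directory level down. This is a sketch under assumptions rather than anyone's posted code: `build_master` is a hypothetical name, the archives are assumed to contain text, and the `sorted()` call (not in the thread) is only there to make the output order predictable.

```python
import glob
import gzip
import os
from itertools import islice

def build_master(zipdir, outfilename):
    """Concatenate <zipdir>/*/*.gz into outfilename, dropping each header row."""
    # One level of subdirectories, matching the loop in the original post.
    pattern = os.path.join(zipdir, '*', '*.gz')
    with open(outfilename, 'w') as out_file:
        for zip_name in sorted(glob.glob(pattern)):
            with gzip.open(zip_name, 'rt') as in_file:
                # islice(in_file, 1, None) yields every line after the first.
                out_file.writelines(islice(in_file, 1, None))
```

With the paths from the question, the call would be `build_master('G:/Research/Data/', 'G:/Research/Data/master_data.txt')`. The `with` blocks close both files even if an archive turns out to be unreadable.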
