Pythonic use of CSV module to skip headers?

Hi --

I'm using the csv module to parse a tab-delimited file and wondered
whether there was a more elegant way to skip a possible header line.
I'm doing

    line = 0
    reader = csv.reader(file(filename))
    for row in reader:
        if ignoreFirstLine and line == 0:
            line = line+1
            continue
        line = line+1
        # do something with row

The only thing I could think of was to specialize the default reader
class with an extra skipHeaderLine constructor parameter so that its
next() method can skip the first line appropriately. Is there any other
cleaner way to do it w/out subclassing the stdlib?

Thanks!

Ramon
Jul 18 '05 #1
How about

    line = 0
    reader = csv.reader(file(filename))
    headerline = reader.next()
    for row in reader:
        line = line+1
        # do something with row

regards
Steve
--
http://www.holdenweb.com
http://pydish.holdenweb.com
Holden Web LLC +1 800 494 3119
Jul 18 '05 #2
What about:

    reader = csv.reader(file(filename))
    reader.next()  # Skip header line.
    for row in reader:
        # do something with row

Ciao,
Marc 'BlackJack' Rintsch
Jul 18 '05 #3
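On Python 3 the same idea still applies, but `file()` and `reader.next()` are gone and the built-in `next()` takes their place. A minimal sketch, assuming tab-delimited input; `read_rows` and its `skip_header` flag are hypothetical names standing in for the OP's `ignoreFirstLine`, and the sample data is made up:

```python
import csv
import io


def read_rows(f, skip_header=True, delimiter="\t"):
    """Return the data rows from an open text file, optionally dropping row one.

    read_rows and skip_header are illustrative names, not part of the csv module.
    """
    reader = csv.reader(f, delimiter=delimiter)
    if skip_header:
        # The default of None keeps an empty file from raising StopIteration.
        next(reader, None)
    return list(reader)


sample = "name\tqty\napple\t3\npear\t5\n"
rows = read_rows(io.StringIO(sample))
rows_with_header = read_rows(io.StringIO(sample), skip_header=False)
print(rows)
```

With a real file, the csv documentation recommends opening it with `open(path, newline="")` before handing it to the reader.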
>>> import csv
>>> f = file("tmp.csv")
>>> f.next()
'# header\n'
>>> for row in csv.reader(f):
...     print row
...
['a', 'b', 'c']
['1', '2', '3']

This way the reader need not mess with the header at all.

Peter

Jul 18 '05 #4
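The same trick of advancing the file before the reader ever sees it can be sketched for Python 3 as follows. This is only an illustration: an `io.StringIO` object stands in for the open file, and the sample contents mirror the session above.

```python
import csv
import io

# Stand-in for an open text file; the first line is a comment-style header.
f = io.StringIO("# header\na,b,c\n1,2,3\n")

header_line = next(f)          # consume one physical line from the file itself
rows = list(csv.reader(f))     # the reader only ever sees the remaining lines

print(repr(header_line))
print(rows)
```

One caveat: this skips one physical line, not one CSV record. If a header field could contain a quoted embedded newline, skipping through the reader instead is the safer choice.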

Ramon> I'm using the csv module to parse a tab-delimited file and
Ramon> wondered whether there was a more elegant way to skip a possible
Ramon> header line.

Assuming the header line has descriptive titles, I prefer the DictReader
class. Unfortunately, it requires you to specify the titles in its
constructor. My usual idiom is the following:

    f = open(filename, "rb")    # don't forget the 'b'!
    reader = csv.reader(f)
    titles = reader.next()
    reader = csv.DictReader(f, titles)
    for row in reader:
        ...

The advantage of the DictReader class is that you get dictionaries keyed by
the titles instead of lists. The code to manipulate them is more readable
and insensitive to changes in the order of the columns. On the down side,
if the titles aren't always named the same, lookups by title fail.

Skip
Jul 18 '05 #6
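To make the "insensitive to column order" point concrete, here is a rough Python 3 sketch of the same idiom. The helper name `total_qty`, the column titles, and the sample data are all made up for illustration:

```python
import csv
import io


def total_qty(f):
    """Sum the 'qty' column, taking the titles from the first row."""
    titles = next(csv.reader(f))           # first row becomes the field names
    reader = csv.DictReader(f, titles)     # continues from the second line
    return sum(int(row["qty"]) for row in reader)


a = io.StringIO("name,qty\napple,3\npear,5\n")
b = io.StringIO("qty,name\n3,apple\n5,pear\n")   # same data, columns swapped
total_a = total_qty(a)
total_b = total_qty(b)
print(total_a, total_b)
```

Both inputs give the same total because each row is looked up by title rather than by position. If a file spelled the title differently, `row["qty"]` would raise `KeyError`, which is the downside mentioned above.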
Skip Montanaro wrote:
> Assuming the header line has descriptive titles, I prefer the DictReader
> class. Unfortunately, it requires you to specify the titles in its
> constructor. My usual idiom is the following:


I deal so much with tab-delimited CSV files that I found it useful to
create a subclass of csv.DictReader to deal with this, so I can just write:

    for row in tabdelim.DictReader(file(filename)):
        ...

I think this is a lot easier than trying to remember this cumbersome
idiom every single time.
--
Michael Hoffman
Jul 18 '05 #7
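The `tabdelim` module itself is not shown in the thread, so the following is only a guess at its shape: a Python 3 sketch of a DictReader subclass that defaults to tabs. The class name `TabDictReader` is hypothetical, and the sketch relies on DictReader taking its titles from the first row when no fieldnames are given, which it does from Python 2.4 onward.

```python
import csv
import io


class TabDictReader(csv.DictReader):
    """DictReader that defaults to tab-delimited input (hypothetical name)."""

    def __init__(self, f, fieldnames=None, **kwds):
        # Only fill in the delimiter when the caller did not choose one.
        kwds.setdefault("delimiter", "\t")
        super().__init__(f, fieldnames, **kwds)


data = io.StringIO("name\tqty\napple\t3\n")
rows = [dict(row) for row in TabDictReader(data)]
print(rows)
```

Using `setdefault` rather than hard-coding the delimiter leaves callers free to pass `delimiter=","` for the occasional comma-separated file.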
Michael Hoffman wrote:
> I deal so much with tab-delimited CSV files that I found it useful to
> create a subclass of csv.DictReader to deal with this, so I can just write:
>
>     for row in tabdelim.DictReader(file(filename)):
>         ...
>
> I think this is a lot easier than trying to remember this cumbersome
> idiom every single time.


Python 2.4 makes the fieldnames parameter optional:
"If the fieldnames parameter is omitted, the values in the first row of the
csvfile will be used as the fieldnames."

i.e. the following should work fine in 2.4:

    for row in csv.DictReader(file(filename)):
        print sorted(row.items())

Cheers,
Nick.
Jul 18 '05 #8
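The optional fieldnames behaviour carries over to Python 3. A small sketch, assuming a tab-delimited input as in the original question; the sample data is made up and `io.StringIO` stands in for the open file:

```python
import csv
import io

data = io.StringIO("name\tqty\napple\t3\npear\t5\n")

reader = csv.DictReader(data, delimiter="\t")   # no fieldnames argument
rows = [sorted(row.items()) for row in reader]

print(reader.fieldnames)
print(rows)
```

The titles picked up from the first row remain available afterwards as `reader.fieldnames`, so the header is skipped without being lost.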
> Assuming the header line has descriptive titles, I prefer the
> DictReader class. Unfortunately, it requires you to specify the
> titles in its constructor. My usual idiom is the following:


Michael> I deal so much with tab-delimited CSV files that I found it
Michael> useful to create a subclass of csv.DictReader to deal with
Michael> this, so I can just write:

Michael>     for row in tabdelim.DictReader(file(filename)):
Michael>         ...

Michael> I think this is a lot easier than trying to remember this
Michael> cumbersome idiom every single time.

I'm not sure what the use of TABs as delimiters has to do with the OP's
problem. In my example I flubbed and failed to specify the delimiter to the
constructors (comma is the default delimiter).

You can create a subclass of DictReader that plucks the first line out as a
set of titles:

    class SmartDictReader(csv.DictReader):
        def __init__(self, f, *args, **kwds):
            # Read the first row from f itself to get the titles.
            rdr = csv.reader(f, *args, **kwds)
            titles = rdr.next()
            csv.DictReader.__init__(self, f, titles, *args, **kwds)

Is that what you were suggesting? I don't find the couple extra lines of
code in my original example all that cumbersome to type though.

Skip
Jul 18 '05 #9
Skip Montanaro wrote:
> I'm not sure what the use of TABs as delimiters has to do with the OP's
> problem.

Not much. :) I just happen to use tabs more often than commas, so my
subclass defaults to

> You can create a subclass of DictReader that plucks the first line out as a
> set of titles:
>
>     class SmartDictReader(csv.DictReader):
>         def __init__(self, f, *args, **kwds):
>             rdr = csv.reader(f, *args, **kwds)
>             titles = rdr.next()
>             csv.DictReader.__init__(self, f, titles, *args, **kwds)
>
> Is that what you were suggesting?

Exactly.

> I don't find the couple extra lines of
> code in my original example all that cumbersome to type though.


If you started about half of the programs you write with those extra
lines, you might <wink>. I'm a strong believer in OnceAndOnlyOnce.

Thanks to Nick Coghlan for pointing out that I no longer need do this in
Python 2.4.
--
Michael Hoffman
Jul 18 '05 #10
> I don't find the couple extra lines of code in my original example
> all that cumbersome to type though.


Michael> If you started about half of the programs you write with those
Michael> extra lines, you might <wink>. I'm a strong believer in
Michael> OnceAndOnlyOnce.

You're right of course. I do use csv a lot, but only from a couple
specialized programs.

Skip
Jul 18 '05 #11
