
Pythonic use of CSV module to skip headers?

Hi --

I'm using the csv module to parse a tab-delimited file and wondered
whether there was a more elegant way to skip a possible header line.
I'm doing

line = 0
reader = csv.reader(file(filename))
for row in reader:
    line = line+1
    if ignoreFirstLine and line == 1:
        continue
    # do something with row

The only thing I could think of was to specialize the default reader
class with an extra skipHeaderLine constructor parameter so that its
next() method can skip the first line appropriately. Is there any other
cleaner way to do it w/out subclassing the stdlib?

Thanks!

Ramon
Jul 18 '05 #1
10 Replies


Ramon Felciano wrote:
Hi --

I'm using the csv module to parse a tab-delimited file and wondered
whether there was a more elegant way to skip a possible header line.
I'm doing

line = 0
reader = csv.reader(file(filename))
for row in reader:
    line = line+1
    if ignoreFirstLine and line == 1:
        continue
    # do something with row

The only thing I could think of was to specialize the default reader
class with an extra skipHeaderLine constructor parameter so that its
next() method can skip the first line appropriately. Is there any other
cleaner way to do it w/out subclassing the stdlib?

Thanks!

Ramon


How about

line = 0
reader = csv.reader(file(filename))
headerline = reader.next()
for row in reader:
    line = line+1
    # do something with row

regards
Steve
--
http://www.holdenweb.com
http://pydish.holdenweb.com
Holden Web LLC +1 800 494 3119
Jul 18 '05 #2

In <76**************************@posting.google.com >, Ramon Felciano
wrote:
Hi --

I'm using the csv module to parse a tab-delimited file and wondered
whether there was a more elegant way to skip a possible header line.
I'm doing

line = 0
reader = csv.reader(file(filename))
for row in reader:
    line = line+1
    if ignoreFirstLine and line == 1:
        continue
    # do something with row


What about:

reader = csv.reader(file(filename))
reader.next() # Skip header line.
for row in reader:
    # do something with row

Ciao,
Marc 'BlackJack' Rintsch
Jul 18 '05 #3
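
For readers on Python 3, where `reader.next()` is spelled `next(reader)`, the skip-one-row idea from the replies above might be sketched as follows. The helper name `read_rows`, its `skip_header` and `delimiter` parameters, and the sample data are assumptions made for illustration; none of them appear in the thread.

```python
import csv
import io


def read_rows(lines, skip_header=True, delimiter="\t"):
    """Return the parsed rows, optionally dropping the first one."""
    reader = csv.reader(lines, delimiter=delimiter)
    if skip_header:
        # Passing a default to next() avoids StopIteration on empty input.
        next(reader, None)
    return list(reader)


# In-memory stand-in for a tab-delimited file with a header line.
sample = io.StringIO("name\tqty\napple\t3\npear\t5\n")
rows = read_rows(sample)
print(rows)
```

Because the header is consumed once before the loop, no per-row counter or flag check is needed inside the loop.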

Ramon Felciano wrote:
I'm using the csv module to parse a tab-delimited file and wondered
whether there was a more elegant way to skip a possible header line.
I'm doing

line = 0
reader = csv.reader(file(filename))
for row in reader:
    line = line+1
    if ignoreFirstLine and line == 1:
        continue
    # do something with row

The only thing I could think of was to specialize the default reader
class with an extra skipHeaderLine constructor parameter so that its
next() method can skip the first line appropriately. Is there any other
cleaner way to do it w/out subclassing the stdlib?

>>> import csv
>>> f = file("tmp.csv")
>>> f.next()
'# header\n'
>>> for row in csv.reader(f):
...     print row
...
['a', 'b', 'c']
['1', '2', '3']

This way the reader need not mess with the header at all.

Peter

Jul 18 '05 #4
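
The same idea of advancing the file object before the reader ever sees it could look like this in Python 3. This is only a sketch: an `io.StringIO` stands in for the `tmp.csv` file, and its contents are assumed from the session output shown in the post.

```python
import csv
import io

# In-memory stand-in for tmp.csv (contents assumed for illustration).
f = io.StringIO("# header\na,b,c\n1,2,3\n")

header_line = next(f)        # consume one raw line from the file itself
rows = list(csv.reader(f))   # the reader starts at the second line

print(repr(header_line))
print(rows)
```

One caveat with this variant: it skips exactly one physical line. A header containing a quoted field with an embedded newline spans several physical lines, and in that case skipping one parsed row through the reader is the safer choice.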



Ramon> I'm using the csv module to parse a tab-delimited file and
Ramon> wondered whether there was a more elegant way to skip a possible
Ramon> header line.

Assuming the header line has descriptive titles, I prefer the DictReader
class. Unfortunately, it requires you to specify the titles in its
constructor. My usual idiom is the following:

f = open(filename, "rb") # don't forget the 'b'!
reader = csv.reader(f)
titles = reader.next()
reader = csv.DictReader(f, titles)
for row in reader:
    ...

The advantage of the DictReader class is that you get dictionaries keyed by
the titles instead of lists. The code to manipulate them is more readable
and insensitive to changes in the order of the columns. On the down side,
if the titles aren't always named the same you lose.

Skip
Jul 18 '05 #6
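
To illustrate the point about column order, here is a rough Python 3 sketch of the read-the-titles-then-build-a-DictReader idiom. The `load` helper, the tab delimiter argument, and the two sample inputs are assumptions made for this example, not code from the thread.

```python
import csv
import io


def load(f, delimiter="\t"):
    """Read the title row, then key every later row by those titles."""
    reader = csv.reader(f, delimiter=delimiter)
    titles = next(reader)  # first row holds the column titles
    return list(csv.DictReader(f, titles, delimiter=delimiter))


# Same data, columns in a different order.
a = load(io.StringIO("name\tqty\napple\t3\n"))
b = load(io.StringIO("qty\tname\n3\tapple\n"))

print(a[0]["qty"], b[0]["qty"])
```

Code that indexes rows by title keeps working when the columns are reordered, whereas positional access such as `row[1]` would silently pick up the wrong field.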

Skip Montanaro wrote:
Assuming the header line has descriptive titles, I prefer the DictReader
class. Unfortunately, it requires you to specify the titles in its
constructor. My usual idiom is the following:


I deal so much with tab-delimited CSV files that I found it useful to
create a subclass of csv.DictReader to deal with this, so I can just write:

for row in tabdelim.DictReader(file(filename)):
    ...

I think this is a lot easier than trying to remember this cumbersome
idiom every single time.
--
Michael Hoffman
Jul 18 '05 #7

Michael Hoffman wrote:
I deal so much with tab-delimited CSV files that I found it useful to
create a subclass of csv.DictReader to deal with this, so I can just write:

for row in tabdelim.DictReader(file(filename)):
    ...

I think this is a lot easier than trying to remember this cumbersome
idiom every single time.


Python 2.4 makes the fieldnames parameter optional:
"If the fieldnames parameter is omitted, the values in the first row of the
csvfile will be used as the fieldnames."

i.e. the following should work fine in 2.4:

for row in csv.DictReader(file(filename)):
    print sorted(row.items())

Cheers,
Nick.
Jul 18 '05 #8
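
The optional `fieldnames` behaviour quoted above still holds in Python 3. A small sketch, with an in-memory sample and a tab delimiter chosen purely for illustration:

```python
import csv
import io

# In-memory stand-in for a tab-delimited file with a header line.
f = io.StringIO("name\tqty\napple\t3\npear\t5\n")

# No fieldnames argument: the first row supplies the keys.
reader = csv.DictReader(f, delimiter="\t")
records = [dict(row) for row in reader]

print(reader.fieldnames)
print(records)
```

The header row never shows up as data, so no explicit skipping step is required.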

Assuming the header line has descriptive titles, I prefer the
DictReader class. Unfortunately, it requires you to specify the
titles in its constructor. My usual idiom is the following:


Michael> I deal so much with tab-delimited CSV files that I found it
Michael> useful to create a subclass of csv.DictReader to deal with
Michael> this, so I can just write:

Michael> for row in tabdelim.DictReader(file(filename)):
Michael> ...

Michael> I think this is a lot easier than trying to remember this
Michael> cumbersome idiom every single time.

I'm not sure what the use of TABs as delimiters has to do with the OP's
problem. In my example I flubbed and failed to specify the delimiter to the
constructors (comma is the default delimiter).

You can create a subclass of DictReader that plucks the first line out as a
set of titles:

class SmartDictReader(csv.DictReader):
    def __init__(self, f, *args, **kwds):
        rdr = csv.reader(f, *args, **kwds)
        titles = rdr.next()
        csv.DictReader.__init__(self, f, titles, *args, **kwds)

Is that what you were suggesting? I don't find the couple extra lines of
code in my original example all that cumbersome to type though.

Skip
Jul 18 '05 #9
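
Since `csv.DictReader` in later Python versions reads the titles itself when `fieldnames` is omitted, a subclass mostly earns its keep by fixing defaults. The following Python 3 sketch shows one way a tab-defaulting reader might look. `TabDictReader` is a hypothetical name invented here; it is not the `tabdelim` module mentioned in the thread, whose source is not shown.

```python
import csv
import io


class TabDictReader(csv.DictReader):
    """Hypothetical DictReader that defaults to tab-delimited input."""

    def __init__(self, f, fieldnames=None, **kwds):
        # Use tabs unless the caller explicitly asks for another delimiter.
        kwds.setdefault("delimiter", "\t")
        super().__init__(f, fieldnames, **kwds)


tab_rows = [dict(r) for r in TabDictReader(io.StringIO("a\tb\n1\t2\n"))]
comma_rows = [
    dict(r) for r in TabDictReader(io.StringIO("a,b\n1,2\n"), delimiter=",")
]
print(tab_rows)
print(comma_rows)
```

Keeping the default in one place means each program only writes the `for row in ...` loop, which is the OnceAndOnlyOnce argument made in the replies.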

Skip Montanaro wrote:
Skip> I'm not sure what the use of TABs as delimiters has to do with the OP's
Skip> problem.

Not much. :) I just happen to use tabs more often than commas, so my
subclass defaults to

Skip> You can create a subclass of DictReader that plucks the first line out as a
Skip> set of titles:
Skip>
Skip> class SmartDictReader(csv.DictReader):
Skip>     def __init__(self, f, *args, **kwds):
Skip>         rdr = csv.reader(f, *args, **kwds)
Skip>         titles = rdr.next()
Skip>         csv.DictReader.__init__(self, f, titles, *args, **kwds)
Skip>
Skip> Is that what you were suggesting?

Exactly.

Skip> I don't find the couple extra lines of
Skip> code in my original example all that cumbersome to type though.


If you started about half of the programs you write with those extra
lines, you might <wink>. I'm a strong believer in OnceAndOnlyOnce.

Thanks to Nick Coghlan for pointing out that I no longer need do this in
Python 2.4.
--
Michael Hoffman
Jul 18 '05 #10

I don't find the couple extra lines of code in my original example
all that cumbersome to type though.


Michael> If you started about half of the programs you write with those
Michael> extra lines, you might <wink>. I'm a strong believer in
Michael> OnceAndOnlyOnce.

You're right of course. I do use csv a lot, but only from a couple
specialized programs.

Skip
Jul 18 '05 #11
