CSV with comments

In csv.reader, is there any way to skip lines that start with '#', or
empty lines?
I would like to add comments to my CSV file.

Jul 18 '06 #1

GinTon wrote:
> In csv.reader, is there any way to skip lines that start with '#', or
> empty lines?
> I would like to add comments to my CSV file.

To skip comments I use a dirty trick:

    reader = csv.reader(open(csv_file))
    for csv_line in reader:
        if csv_line[0].startswith('#'):
            continue

But this does not cope with blank lines.

I think the csv module should be able to skip comments and blank lines
automatically.

Jul 18 '06 #2
GinTon wrote:
> GinTon wrote:
>> In csv.reader, is there any way to skip lines that start with '#', or
>> empty lines?
>> I would like to add comments to my CSV file.
>
> To skip comments I use a dirty trick:
>
>     reader = csv.reader(open(csv_file))
>     for csv_line in reader:
>         if csv_line[0].startswith('#'):
>             continue
>
> But this does not cope with blank lines.
>
> I think the csv module should be able to skip comments and blank lines
> automatically.

Write an iterator that filters lines to your liking and use it as input
to csv.reader:

    def CommentStripper (iterator):
        for line in iterator:
            if line [:1] == '#':
                continue
            if not line.strip ():
                continue
            yield line

    reader = csv.reader (CommentStripper (open (csv_file)))

CommentStripper is actually quite useful for other files. Of course
there might be differences if a comment starts
- on the first character
- on the first non-blank character
- anywhere in the line
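
The three cases above could be sketched roughly as follows. This is a Python 3 illustration only: strip_comments, its mode argument and the sample text are names made up for this example, and the "anywhere" case is deliberately naive.

```python
import csv
import io

def strip_comments(lines, mode="first_char", marker="#"):
    """Yield lines with comments and blank lines removed.

    mode is a hypothetical switch for the three cases listed above:
      "first_char"     - marker must be the first character of the line
      "first_nonblank" - marker may follow leading whitespace
      "anywhere"       - everything from the marker onward is dropped
    """
    for line in lines:
        if mode == "first_char" and line.startswith(marker):
            continue
        if mode == "first_nonblank" and line.lstrip().startswith(marker):
            continue
        if mode == "anywhere" and marker in line:
            # Naive: this also cuts a marker that sits inside a quoted field.
            line = line.split(marker, 1)[0].rstrip() + "\n"
        if not line.strip():
            continue
        yield line

# Made-up sample covering all three comment positions plus a blank line.
sample = "# header comment\n  # indented comment\n\na,b # trailing\nc,d\n"

for mode in ("first_char", "first_nonblank", "anywhere"):
    rows = list(csv.reader(strip_comments(io.StringIO(sample), mode)))
    print(mode, rows)
```

With "first_char" the indented comment survives as a data row, which is the kind of difference meant above.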

Daniel
Jul 18 '06 #3
> In csv.reader, is there any way to skip lines that start with '#', or
> empty lines?

Nope. When we wrote the module we weren't aware of any "spec" that
specified comments or blank lines. You can easily write a file wrapper to
filter them out though:

    class BlankCommentCSVFile:
        def __init__(self, fp):
            self.fp = fp

        def __iter__(self):
            return self

        def next(self):
            line = self.fp.next()
            if not line.strip() or line[0] == "#":
                return self.next()
            return line

Use it like so:

    reader = csv.reader(BlankCommentCSVFile(open("somefile.csv")))
    for row in reader:
        print row

Skip
Jul 18 '06 #4
On 19/07/2006 5:34 AM, sk**@pobox.com wrote:
>> In csv.reader, is there any way to skip lines that start with '#', or
>> empty lines?
>
> Nope. When we wrote the module we weren't aware of any "spec" that
> specified comments or blank lines. You can easily write a file wrapper to
> filter them out though:
>
>     class BlankCommentCSVFile:
>         def __init__(self, fp):
>             self.fp = fp
>
>         def __iter__(self):
>             return self
>
>         def next(self):
>             line = self.fp.next()
>             if not line.strip() or line[0] == "#":
>                 return self.next()

This is recursive. Unlikely of course, but if the file contained a large
number of empty lines, might this not cause the recursion limit to be
exceeded?

>             return line
>
> Use it like so:
>
>     reader = csv.reader(BlankCommentCSVFile(open("somefile.csv")))
>     for row in reader:
>         print row

Hi Skip,

Is there any reason to prefer this approach to Daniel's, apart from
being stuck with an older (pre-yield) version of Python?

A file given to csv.reader is supposed to be opened with "rb" so that
newlines embedded in data fields can be handled properly, and also
(according to a post by Andrew MacNamara (IIRC)) for DIY emulation of
"rU". It is not apparent how well this all hangs together when a filter
is interposed, nor whether there are any special rules about what the
filter must/mustn't do. Perhaps a few lines for the docs?
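
To make the concern concrete, here is a small Python 3 sketch, not taken from the csv docs. It compares filtering lines before csv.reader with filtering rows after it. The function names and the sample data are invented for illustration. In Python 3 the documented advice is to open the file with newline="" rather than "rb", and io.StringIO stands in for the file here.

```python
import csv
import io

def comment_stripper(lines):
    """Line-level filter: runs before csv.reader sees the data."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        yield line

def skip_comment_rows(rows):
    """Row-level filter: runs after csv.reader has parsed each record."""
    for row in rows:
        if not row or row[0].startswith("#"):
            continue
        yield row

# Hypothetical data: one record whose quoted field spans two physical lines,
# the second of which happens to start with '#'.
data = '# real comment\na,"line1\n#not a comment"\n'

# newline="" leaves line endings untranslated, as the Python 3 csv docs
# recommend for files passed to csv.reader.
line_level = list(csv.reader(comment_stripper(io.StringIO(data, newline=""))))
row_level = list(skip_comment_rows(csv.reader(io.StringIO(data, newline=""))))

print(line_level)  # the second half of the quoted field has been dropped
print(row_level)   # the quoted field is intact
```

The row-level filter has its own catch: comment lines are parsed as CSV first, so a comment containing an unbalanced double quote would swallow the lines that follow it.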

Cheers,
John

Jul 18 '06 #5

    John> This is recursive. Unlikely of course, but if the file contained a
    John> large number of empty lines, might this not cause the recursion
    John> limit to be exceeded?

Sure, but I was lazy. ;-)
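
For reference, a non-recursive variant might look like the sketch below. It assumes Python 3, where the iterator method is __next__ rather than next, and uses io.StringIO with made-up sample data in place of a real file.

```python
import csv
import io

class BlankCommentCSVFile:
    """File wrapper that skips blank lines and lines starting with '#'."""

    def __init__(self, fp):
        self.fp = fp

    def __iter__(self):
        return self

    def __next__(self):
        # Loop instead of recursing, so a long run of skipped lines
        # cannot hit the recursion limit.
        while True:
            line = next(self.fp)  # StopIteration from fp ends the iteration
            if line.strip() and not line.startswith("#"):
                return line

# Hypothetical sample: far more blank lines than the default recursion limit.
sample = "# comment\n" + "\n" * 5000 + "a,b\n"
rows = list(csv.reader(BlankCommentCSVFile(io.StringIO(sample))))
print(rows)  # [['a', 'b']]
```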

Skip
Jul 19 '06 #6
Whoops, missed the second part.

    John> Is there any reason to prefer this approach to Daniel's, apart
    John> from being stuck with an older (pre-yield) version of Python?

No, it's just what I came up with off the top of my head.

    John> A file given to csv.reader is supposed to be opened with "rb" so
    John> that newlines embedded in data fields can be handled properly, and
    John> also (according to a post by Andrew MacNamara (IIRC)) for DIY
    John> emulation of "rU". It is not apparent how well this all hangs
    John> together when a filter is interposed, nor whether there are any
    John> special rules about what the filter must/mustn't do. Perhaps a few
    John> lines for the docs?

Yeah, I was also aware of that. In the common case though it's not too big
a deal. If the OP is editing a CSV file manually it probably isn't too
complex (no newlines inside fields, for example).

Skip

Jul 19 '06 #7
Daniel Dittmar <da************@sap.corp> wrote:
>     if line [:1] == '#':

What's wrong with line[0] == '#' ? (For one thing, it's fractionally
faster than [:1].)

--
\S -- si***@chiark.greenend.org.uk -- http://www.chaos.org.uk/~sion/
___ | "Frankly I have no feelings towards penguins one way or the other"
\X/ | -- Arthur C. Clarke
her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump
Jul 19 '06 #8
Sion Arrowsmith wrote:
> Daniel Dittmar <da************@sap.corp> wrote:
>>     if line [:1] == '#':
>
> What's wrong with line[0] == '#' ? (For one thing, it's fractionally
> faster than [:1].)

line[0] assumes that the line isn't blank. If the input iterator is a file
then that will hold true, but if you were ever to reuse CommentStripper on
a list of strings which didn't have a trailing newline it would break at
the first blank string.
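
A quick sketch of the difference, in Python 3 and with helper names invented for this example:

```python
def starts_with_hash_index(line):
    return line[0] == "#"        # raises IndexError when line is ""

def starts_with_hash_slice(line):
    return line[:1] == "#"       # ""[:1] is "", so this is just False

def starts_with_hash_method(line):
    return line.startswith("#")  # also False for ""

# Lines as a newline-stripping iterator would deliver them.
lines = ["# comment", "", "data"]

print([starts_with_hash_slice(s) for s in lines])
print([starts_with_hash_method(s) for s in lines])

try:
    [starts_with_hash_index(s) for s in lines]
except IndexError as exc:
    print("line[0] failed on the empty string:", exc)
```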

Personally I would use:

if line.startswith('#'):

which takes about three times as long to execute but I think reads more
clearly.

timeit.py -s "line=' hello world'" "line[:1]=='#'"
1000000 loops, best of 3: 0.236 usec per loop

timeit.py -s "line=' hello world'" "line[0]=='#'"
1000000 loops, best of 3: 0.218 usec per loop

timeit.py -s "line=' hello world'" "line.startswith('#')"
1000000 loops, best of 3: 0.639 usec per loop
Jul 19 '06 #9
Sion Arrowsmith wrote:
> Daniel Dittmar <da************@sap.corp> wrote:
>>     if line [:1] == '#':
>
> What's wrong with line[0] == '#' ? (For one thing, it's fractionally
> faster than [:1].)

For that matter, what's wrong with

line.startswith('#')

which expresses the intent rather better as well.

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://holdenweb.blogspot.com
Recent Ramblings http://del.icio.us/steve.holden

Jul 19 '06 #10
Sion Arrowsmith wrote:
> Daniel Dittmar <da************@sap.corp> wrote:
>>     if line [:1] == '#':
>
> What's wrong with line[0] == '#' ? (For one thing, it's fractionally
> faster than [:1].)

Matter of taste. Occasionally, I use line iterators that strip the '\n'
from the end of each line, so empty lines have to be handled. Of course,
in my example code, one could have moved the check for the empty line
before the check for the comment.

Daniel
Jul 19 '06 #11
And which method is best: Daniel's generator or Skip's wrapper class?

Jul 20 '06 #12
