Suggestions for workaround in CSV bug

Hi,

I've come across a bug in the csv module where csv.reader() raises an
exception if the input line contains '\r'. The example code and output
below show a test case where csv.reader() cannot read an array
written by csv.writer().

I believe this is a known bug that may have been fixed for Python 2.5.
However, I'm after suggestions for workarounds for Python 2.4.2.

This is part of a project where I'm storing large tables from
mainframe systems as CSVs for subsequent data cleansing and
post-processing. Some tables have 300 columns and tens of millions
of rows. The mainframe data fields are poorly documented, so I
don't know at the time of writing the CSV whether a '\r'
is part of a binary field and so must be retained,
or is a random byte in an uninitialised field and so
can safely be deleted. I'd therefore prefer a workaround
that alters the data as little as possible.

Any suggestions for how to proceed are most welcome!

Thanks in advance,

Stephen Simmons
#======================================================
# Bug in Python 2.4.2's csv module
# Stephen Simmons, mail at stevesimmons.com, 24 Jan 2006

import csv

s = [ ['a'], ['\r'], ['b'] ]
name = 'c://temp//test2.csv'

print 'Writing CSV file containing %s' % repr(s)
f = file(name, 'wb')
csv.writer(f).writerows(s)
f.close()

print 'CSV file is %s' % repr(file(name, 'rb').read())

print 'Now reading back as CSV...'
for r in csv.reader(file(name, 'rb')):
    print 'Read row containing %s' % repr(r)
# Output is
"""In [29]: run csv_error.py
Writing CSV file containing [['a'], ['\r'], ['b']]
Contents of the CSV file are 'a\r\n"\r"\r\nb\r\n'
Now reading back as CSV...
Read row containing ['a']
---------------------------------------------------------------------------
_csv.Error Traceback (most recent call last)
c:\temp\csv_error.py
14 print 'CSV file is %s' % repr(file(name, 'rb').read())
15
16 print 'Now reading back as CSV...'
---> 17 for r in csv.reader(file(name, 'rb')):
18 print 'Read row containing %s' % repr(r)

Error: newline inside string
WARNING: Failure executing file: <csv_error.py>

"""

Jan 23 '06 #1
Simmons, Stephen wrote:

I've come across a bug in the csv module where csv.reader() raises an
exception if the input line contains '\r'. The example code and output
below show a test case where csv.reader() cannot read an array
written by csv.writer().

Error: newline inside string
WARNING: Failure executing file: <csv_error.py>


Did you play with the csv.Dialect setting lineterminator='\n' ?

csv.reader(file(name, 'rb'), lineterminator='\n')

See also: http://docs.python.org/lib/csv-fmt-params.html

Ciao, Michael.
Jan 24 '06 #2
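
One other workaround that might fit the "change the data as little as possible" requirement is to escape '\r' reversibly before writing, so the file never contains a bare carriage return inside a field, and to undo the escape after reading. The sketch below is only an illustration under assumptions: the helper names escape_cr and unescape_cr are hypothetical, the backslash escape scheme is my own choice rather than anything the csv module provides, and it uses io.StringIO so it runs self-contained on a current Python (on 2.4.2 the same two helpers would wrap the file(name, 'wb') and file(name, 'rb') calls from the original script).

```python
import csv
import io
import re

def escape_cr(field):
    # Hypothetical helper: double existing backslashes first so the
    # mapping stays reversible, then turn CR into the two characters \r.
    return field.replace('\\', '\\\\').replace('\r', '\\r')

# Matches a backslash followed by any single character.
_ESCAPE = re.compile(r'\\(.)', re.DOTALL)

def unescape_cr(field):
    # Hypothetical helper: scan left to right so that an escaped
    # backslash followed by a literal 'r' is not mistaken for CR.
    return _ESCAPE.sub(
        lambda m: '\r' if m.group(1) == 'r' else m.group(1), field)

# Test rows from the original post, plus one with a literal backslash.
rows = [['a'], ['\r'], ['b'], ['back\\rslash']]

# Write the escaped rows; newline='' keeps the writer's own line endings.
buf = io.StringIO(newline='')
csv.writer(buf).writerows([[escape_cr(f) for f in row] for row in rows])
data = buf.getvalue()

# Read back and undo the escape on every field.
raw = list(csv.reader(io.StringIO(data, newline='')))
restored = [[unescape_cr(f) for f in row] for row in raw]
```

The fields as stored contain no '\r' at all, so the reader never sees a carriage return inside a quoted string, and the original bytes are recovered on the way back in. The cost is that any other tool reading the CSV has to know about the escape convention.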
