csv bugs

It seems that when a line termination is escaped (using the current
escape character), csv.reader treats it as a line continuation, which
is well and good -- but it doesn't discard the escape character;
instead, it escapes it implicitly. This seems like a bug to me. E.g.

foo:bar:baz\
frozz:bozz

with separator ':' and escape character '\\' is parsed into

['foo', 'bar', 'baz\\\nfrozz', 'bozz']

In my opinion, it *ought* to be parsed into

['foo', 'bar', 'baz\nfrozz', 'bozz']

As far as I know, this is the UNIX convention, as used in (e.g.)
/etc/passwd.
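
The example above can be reproduced with a short script. This is only a minimal sketch, assuming a current Python 3 interpreter rather than the 2.3 release discussed in this thread; the helper name `parse_escaped_continuation` is made up for illustration. It prints the parsed row without claiming which of the two results quoted above you will see, since that depends on the csv version installed.

```python
import csv
import io

# The two-line sample from the post: a backslash directly before the newline.
SAMPLE = "foo:bar:baz\\\nfrozz:bozz\n"


def parse_escaped_continuation(text):
    """Return the first row csv.reader yields for `text` (hypothetical helper)."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=":",
        escapechar="\\",
        quoting=csv.QUOTE_NONE,
    )
    return next(reader)


if __name__ == "__main__":
    # Compare the printed row with the two lists quoted in the post.
    print(parse_escaped_continuation(SAMPLE))
```

Looking at the third field of the printed row shows whether the escape character was kept or discarded.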

Am I off target here? If the current behaviour is desirable (although
I can't see why it should be) then at least I think there should be a
way of implementing "normal" line continuations (as in my example),
which is the standard UNIX behavior, and the behavior of Python
source, for that matter. Otherwise, csv can't be used to parse (e.g.)
/etc/passwd...

And another thing: Perhaps a 'passwd' dialect could be added alongside
'excel'? Something like:

class passwd(Dialect):
    delimiter = ':'
    doublequote = False
    escapechar = '\\'
    lineterminator = '\n'
    quotechar = '?'
    quoting = QUOTE_NONE
    skipinitialspace = False
register_dialect("passwd", passwd)

For some reason you *have* to supply a quotechar, even if you set
QUOTE_NONE... I guess that's a bug too, in my book.
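
For illustration, here is roughly how such a dialect could be registered and used. It is a sketch under assumptions: it targets Python 3, the sample entry is invented, and `read_entries` is a hypothetical helper. The placeholder quotechar is kept exactly as in the class above.

```python
import csv
import io


class Passwd(csv.Dialect):
    """Colon-separated dialect modelled on the class proposed above."""
    delimiter = ":"
    doublequote = False
    escapechar = "\\"
    lineterminator = "\n"
    quotechar = "?"  # placeholder kept from the post; not used with QUOTE_NONE
    quoting = csv.QUOTE_NONE
    skipinitialspace = False


csv.register_dialect("passwd", Passwd)

# Invented sample entry, not taken from a real system.
SAMPLE_ENTRY = "root:x:0:0:root:/root:/bin/bash\n"


def read_entries(text):
    """Parse colon-separated text into a list of rows (hypothetical helper)."""
    return list(csv.reader(io.StringIO(text), dialect="passwd"))


if __name__ == "__main__":
    print(read_entries(SAMPLE_ENTRY))
```

Once registered, the dialect can be selected by name, the same way 'excel' is.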

If there are no objections, I might submit some of this as a bug
report or two (or even a patch).

--
Magnus Lie Hetland "The mind is not a vessel to be filled,
http://hetland.org but a fire to be lighted." [Plutarch]
Jul 18 '05 #1

(A better place for this discussion would probably be cs*@mail.mojam.com.
I'm adding it to the cc list.)

Magnus> It seems that when a line termination is escaped (using the
Magnus> current escape character), csv.reader treats it as a line
Magnus> continuation, which is well and good -- but it doesn't discard
Magnus> the escape character; instead, it escapes it implicitly. This
Magnus> seems like a bug to me. E.g.

Magnus> foo:bar:baz\
Magnus> frozz:bozz

Magnus> with separator ':' and escape character '\\' is parsed into

Magnus> ['foo', 'bar', 'baz\\\nfrozz', 'bozz']

Magnus> In my opinion, it *ought* to be parsed into

Magnus> ['foo', 'bar', 'baz\nfrozz', 'bozz']

Magnus> As far as I know, this is the UNIX convention, as used in (e.g.)
Magnus> /etc/passwd.

That may be; however, development of the csv module's parser was driven by
how Microsoft Excel behaves. The assumption was (rightly I think) that
Excel reads or writes more CSV files than anything else. I don't believe it
does anything with backslashes.
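
The module's built-in 'excel' dialect is consistent with that: it defines no escape character, so a backslash is read as ordinary data. A quick check, assuming Python 3 (the helper name `read_default` is made up for illustration):

```python
import csv


def read_default(lines):
    """Parse lines with the default 'excel' dialect (hypothetical helper)."""
    return list(csv.reader(lines))


if __name__ == "__main__":
    # The stock 'excel' dialect has no escapechar set.
    print(csv.get_dialect("excel").escapechar)
    # The backslash stays in the field and the comma still splits it.
    print(read_default(["a,b\\,c"]))
```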

Magnus> Am I off target here? If the current behaviour is desirable
Magnus> (although I can't see why it should be) then at least I think
Magnus> there should be a way of implementing "normal" line
Magnus> continuations (as in my example), which is the standard UNIX
Magnus> behavior, and the behavior of Python source, for that
Magnus> matter. Otherwise, csv can't be used to parse (e.g.)
Magnus> /etc/passwd...

You're welcome to submit a patch. I don't have time for it.

Magnus> And another thing: Perhaps a 'passwd' dialect could be added
Magnus> alongside 'excel'? Something like:

Magnus> class passwd(Dialect):
Magnus>     delimiter = ':'
Magnus>     doublequote = False
Magnus>     escapechar = '\\'
Magnus>     lineterminator = '\n'
Magnus>     quotechar = '?'
Magnus>     quoting = QUOTE_NONE
Magnus>     skipinitialspace = False
Magnus> register_dialect("passwd", passwd)

I'll take a look at that.

Magnus> For some reason you *have* to supply a quotechar, even if you
Magnus> set QUOTE_NONE... I guess that's a bug too, in my book.

Maybe. Maybe just a feature.

Magnus> If there are no objections, I might submit some of this as a bug
Magnus> report or two (or even a patch).

Please do.

Skip

Jul 18 '05 #2
In <sl****************@furu.idi.ntnu.no> Magnus Lie Hetland wrote:
> And another thing: Perhaps a 'passwd' dialect could be added alongside
> 'excel'? Something like:

I wanted this, and started to write it in Nov-2003, but because of bugs
in csv, outlined in

http://groups.google.com/groups?selm....supernews.com

it is not possible to implement a passwd dialect, at least as of Python
2.3.2. Unless I missed something obvious.

--
Francis Avila
Jul 18 '05 #3