split CSV fields

What is a most simple expression for splitting a CSV line with "-protected fields?

s='"123","a,b,\"c\"",5.640'
Nov 16 '06 #1

s.split(',');
robert wrote:
> What is a most simple expression for splitting a CSV line with "-protected fields?
>
> s='"123","a,b,\"c\"",5.640'
Nov 16 '06 #2
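
A plain split on commas cannot tell a delimiter from a comma inside a quoted field. A minimal sketch, assuming Python 3 and the sample written as a raw string so the backslashes survive (the names `sample` and `pieces` are only illustrative):

```python
# Naive splitting treats every comma as a delimiter, including the
# commas inside the quoted second field.
sample = r'"123","a,b,\"c\"",5.640'

pieces = sample.split(',')

# Five fragments come back instead of three fields, and the quote
# characters and backslashes are still attached to them.
print(len(pieces))
print(pieces)
```
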
robert wrote:
> What is a most simple expression for splitting a CSV line
> with "-protected fields?
>
> s='"123","a,b,\"c\"",5.640'

import csv

the preferred way is to read the file using that module. if you insist
on processing a single line, you can do

cols = list(csv.reader([string]))

</F>

Nov 16 '06 #3
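
For the file-reading route mentioned above, a minimal sketch assuming Python 3. The `io.StringIO` object stands in for a real file and its contents are made up for illustration; with an actual file, the csv documentation recommends opening it with `newline=''`.

```python
import csv
import io

# Stand-in for an open file; the two rows are illustrative data only.
fake_file = io.StringIO('"123","a,b",5.640\r\n"456","x,y",1.5\r\n')

# csv.reader yields one list of strings per record.
rows = list(csv.reader(fake_file))
print(rows)
```
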
robert wrote:
> What is a most simple expression for splitting a CSV line with "-protected
> fields?
>
> s='"123","a,b,\"c\"",5.640'

Use the csv module. It should have a dialect for this, although I'm not 100%
sure whether escaping the " with a backslash is valid from the csv module's
point of view. It might require the Excel standard instead.

Diez
Nov 16 '06 #4
robert wrote:
> What is a most simple expression for splitting a CSV line with "-protected
> fields?
>
> s='"123","a,b,\"c\"",5.640'

>>> import csv
>>> class mydialect(csv.excel):
...     escapechar = "\\"
...
>>> csv.reader(['"123","a,b,\\"c\\"",5.640'], dialect=mydialect).next()
['123', 'a,b,"c"', '5.640']

Peter

Nov 16 '06 #5
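
The `.next()` method call above is Python 2 syntax. A sketch of the same approach for Python 3, where the built-in `next()` is used instead; the class name `BackslashEscaped` is an arbitrary choice:

```python
import csv

class BackslashEscaped(csv.excel):
    # Same as the excel dialect, except that a backslash escapes
    # the character that follows it.
    escapechar = "\\"

# Raw string so the backslashes reach the csv module unchanged.
line = r'"123","a,b,\"c\"",5.640'

row = next(csv.reader([line], dialect=BackslashEscaped))
print(row)
```
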
Fredrik Lundh wrote:
> robert wrote:
>> What is a most simple expression for splitting a CSV line
>> with "-protected fields?
>>
>> s='"123","a,b,\"c\"",5.640'
>
> import csv
>
> the preferred way is to read the file using that module. if you insist
> on processing a single line, you can do
>
> cols = list(csv.reader([string]))
>
> </F>

Python 2.5 (r25:51908, Sep 19 2006, 09:52:17) [MSC v.1310 32 bit (Intel)] on win32
| >>> import csv
| >>> s='"123","a,b,\"c\"",5.640'
| >>> cols = list(csv.reader([s]))
| >>> cols
[['123', 'a,b,c""', '5.640']]
# maybe we need a bit more:
| >>> cols = list(csv.reader([s]))[0]
| >>> cols
['123', 'a,b,c""', '5.640']

I'd guess that the OP is expecting 'a,b,"c"' for the second field.

Twiddling with the knobs doesn't appear to help:

| >>> list(csv.reader([s], escapechar='\\'))[0]
['123', 'a,b,c""', '5.640']
| >>> list(csv.reader([s], escapechar='\\', doublequote=False))[0]
['123', 'a,b,c""', '5.640']

Looks like a bug to me; AFAICT from the docs, the last attempt should
have worked.

Cheers,
John

Nov 16 '06 #6
Given Peter Otten's post, looks like
(1) there's a bug in the "fmtparam" mechanism -- it's ignoring the
escapechar in my first twiddle, which should give the same result as
Peter's.
(2)
| >>> csv.excel.doublequote
True
According to my reading of the docs:
"""
doublequote
Controls how instances of quotechar appearing inside a field should be
themselves be quoted. When True, the character is doubled. When False,
the escapechar is used as a prefix to the quotechar. It defaults to
True.
"""
Peter's example should not have worked.

Nov 16 '06 #7
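
The doublequote wording quoted above is easiest to see on the writing side. A minimal sketch, assuming Python 3; `write_row` is a hypothetical helper, not part of the csv module:

```python
import csv
import io

def write_row(row, **fmtparams):
    """Write one row with csv.writer and return the resulting text."""
    buf = io.StringIO()
    csv.writer(buf, **fmtparams).writerow(row)
    return buf.getvalue()

field = 'a,b,"c"'

# Default (doublequote=True): embedded quotes are doubled.
doubled = write_row([field])

# doublequote=False plus an escapechar: embedded quotes get a backslash prefix.
escaped = write_row([field], doublequote=False, escapechar='\\')

print(doubled)
print(escaped)
```

When reading, the two settings do not appear to exclude each other, which would be consistent with a reader that has `escapechar` set and `doublequote` left at its default still accepting backslash-escaped quotes.
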
John Machin wrote:
> Given Peter Otten's post, looks like
> (1) there's a bug in the "fmtparam" mechanism -- it's ignoring the
> escapechar in my first twiddle, which should give the same result as
> Peter's.
> (2)
> | >>> csv.excel.doublequote
> True
> According to my reading of the docs:
> """
> doublequote
> Controls how instances of quotechar appearing inside a field should be
> themselves be quoted. When True, the character is doubled. When False,
> the escapechar is used as a prefix to the quotechar. It defaults to
> True.
> """
> Peter's example should not have worked.

the documentation also mentions a "quoting" parameter that "controls
when quotes should be generated by the writer and recognised by the
reader.". not sure how that changes things.

anyway, it's either unclear documentation or a bug in the code. better
submit a bug report so someone can fix one of them.

</F>

Nov 16 '06 #8
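
As a rough illustration of what `quoting` changes on the reading side, assuming Python 3: with `csv.QUOTE_NONNUMERIC` the reader converts unquoted fields to floats, while quoted fields stay strings. The input line is a simplified variant of the sample without embedded quotes.

```python
import csv

# Simplified sample: no escaped quotes inside the second field.
line = '"123","a,b",5.640'

# Default quoting: every field comes back as a string.
default_row = next(csv.reader([line]))

# QUOTE_NONNUMERIC: unquoted fields are converted to float.
numeric_row = next(csv.reader([line], quoting=csv.QUOTE_NONNUMERIC))

print(default_row)
print(numeric_row)
```
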

Doh. The OP's string was a raw string. I need some sleep.
Scrap bug #1!

| >>> s=r'"123","a,b,\"c\"",5.640'
| >>> list(csv.reader([s]))[0]
['123', 'a,b,\\c\\""', '5.640']
# What's that???
| >>> list(csv.reader([s], escapechar='\\'))[0]
['123', 'a,b,"c"', '5.640']
| >>> list(csv.reader([s], escapechar='\\', doublequote=False))[0]
['123', 'a,b,"c"', '5.640']

And there's still the problem with doublequote ....

Goodnight ...

Nov 16 '06 #9
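
The odd second field in the first result above seems to come from the reader's lenient default: without an escapechar, the quote after the backslash ends the quoted section and the remaining characters are appended as ordinary text. A sketch, assuming Python 3, that uses the reader's `strict` parameter to turn such input into an error instead:

```python
import csv

s = r'"123","a,b,\"c\"",5.640'

# Default (lenient) parsing carries on and yields a mangled field.
lenient = next(csv.reader([s]))
print(lenient)

# strict=True raises csv.Error on the malformed quoting instead.
try:
    next(csv.reader([s], strict=True))
    strict_failed = False
except csv.Error as exc:
    strict_failed = True
    print("rejected:", exc)
```
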
John Machin wrote:
> | >>> s='"123","a,b,\"c\"",5.640'

Note how I fixed the input:

>>> '"123","a,b,\"c\"",5.640'
'"123","a,b,"c"",5.640'
>>> '"123","a,b,\\"c\\"",5.640'
'"123","a,b,\\"c\\"",5.640'

Peter
Nov 16 '06 #10
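
The difference shown above comes from Python's string literal rules: in a normal literal, `\"` is an escape sequence for a bare quote, so the backslash never reaches the csv module. A small sketch, assuming Python 3:

```python
plain = '"123","a,b,\"c\"",5.640'       # \" collapses to a bare "
raw = r'"123","a,b,\"c\"",5.640'        # raw literal keeps the backslashes
doubled = '"123","a,b,\\"c\\"",5.640'   # explicit \\ gives the same text as raw

print(plain)
print(raw)
print(raw == doubled)
```
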

Fredrik Lundh wrote:
> John Machin wrote:
>> Given Peter Otten's post, looks like
>> (1) there's a bug in the "fmtparam" mechanism -- it's ignoring the
>> escapechar in my first twiddle, which should give the same result as
>> Peter's.
>> (2)
>> | >>> csv.excel.doublequote
>> True
>> According to my reading of the docs:
>> """
>> doublequote
>> Controls how instances of quotechar appearing inside a field should be
>> themselves be quoted. When True, the character is doubled. When False,
>> the escapechar is used as a prefix to the quotechar. It defaults to
>> True.
>> """
>> Peter's example should not have worked.
>
> the documentation also mentions a "quoting" parameter that "controls
> when quotes should be generated by the writer and recognised by the
> reader.". not sure how that changes things.

Hi Fredrik, I read that carefully -- "quoting" appears to have no
effect in this situation.

> anyway, it's either unclear documentation or a bug in the code. better
> submit a bug report so someone can fix one of them.

Tomorrow :-)
Cheers,
John

Nov 16 '06 #11
