Using the CSV module

Hi,

I've been playing with the CSV module for parsing a few files. A row
in a file looks like this:

some_id\t|\tsome_data\t|\tsome_more_data\t|\tlast_data\t\n

so the lineterminator is \t\n and the delimiter is \t|\t. However, when
I subclass Dialect and try to set delimiter to "\t|\t", it says
delimiter can only be a character.

I know it's an easy fix to just do .strip("\t") on the output I get,
but I was wondering:
a) if there's a better way of doing this when the file is actually
being parsed by the csv module
b) why delimiters are only allowed to be one character in length.

Many Thanks in advance
Nathan
May 9 '07 #1
On May 9, 6:40 pm, "Nathan Harmston" <ratchetg...@googlemail.com>
wrote:
> Hi,
>
> I've been playing with the CSV module for parsing a few files. A row
> in a file looks like this:
>
> some_id\t|\tsome_data\t|\tsome_more_data\t|\tlast_data\t\n
>
> so the lineterminator is \t\n and the delimiter is \t|\t. However, when
> I subclass Dialect and try to set delimiter to "\t|\t", it says
> delimiter can only be a character.
>
> I know it's an easy fix to just do .strip("\t") on the output I get,
> but I was wondering
> a) if there's a better way of doing this when the file is actually
> being parsed by the csv module

No; usually one would want at least to do .strip() on each field
anyway to remove *all* leading and trailing whitespace. Replacing
multiple whitespace characters with one space is often a good idea.
One may want to get fancier and ensure that NO-BREAK SPACE aka &nbsp;
(\xA0 in many encodings) is treated as whitespace.

So your gloriously redundant tabs vanish, for free.
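
A minimal sketch of the approach described above, written for Python 3 and assuming the file is read with the single-character delimiter "|", so the surrounding tabs are left on each field and then cleaned away. The helper names `clean_field` and `read_clean_rows` are hypothetical and not part of the csv module. For a Python 3 `str`, `split()` with no arguments treats NO-BREAK SPACE (\xA0) as whitespace, so joining the pieces with one space both strips the field and collapses internal runs.

```python
import csv
import io


def clean_field(field):
    # split() with no arguments splits on any run of whitespace
    # (including \xa0 for a Python 3 str) and ignores leading/trailing runs.
    return " ".join(field.split())


def read_clean_rows(lines, delimiter="|"):
    # csv.reader accepts any iterable of lines; delimiter must be one character.
    for row in csv.reader(lines, delimiter=delimiter):
        yield [clean_field(field) for field in row]


# Sample row taken from the question, held in memory instead of a file.
sample = io.StringIO("some_id\t|\tsome_data\t|\tsome_more_data\t|\tlast_data\t\n")
rows = list(read_clean_rows(sample))
```

With a real file, the csv documentation recommends opening it with `newline=""` before passing it to the reader.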

> b) Why are delimiters only allowed to be one character in length?

Speed. The reader is a hand-crafted finite-state machine designed to
operate on a byte at a time. Allowing for variable-length delimiters
would increase the complexity and lower the speed -- for what gain?
How often does one see 2-byte or 3-byte delimiters?
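
If the only goal is to split on the full three-character delimiter, one alternative that bypasses the csv module is plain `str.split`. This is a sketch under the assumption that fields never contain quoted delimiters or embedded newlines, which is the handling the csv reader provides and this version does not. `split_rows` is a hypothetical helper name.

```python
def split_rows(lines, delimiter="\t|\t", terminator="\t\n"):
    # Works on any iterable of text lines, e.g. an open file object.
    for line in lines:
        if line.endswith(terminator):
            # Drop the multi-character line terminator from the question.
            line = line[: -len(terminator)]
        else:
            # Fall back to removing a bare newline, if present.
            line = line.rstrip("\n")
        # str.split accepts a separator of any length.
        yield line.split(delimiter)


sample_lines = ["some_id\t|\tsome_data\t|\tsome_more_data\t|\tlast_data\t\n"]
parsed = list(split_rows(sample_lines))
```

The trade-off is the one raised above: this is simple because it gives up quoting and escaping support.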

May 9 '07 #2
