473,221 Members | 1,932 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,221 software developers and data experts.

reading a specific column from file

Hi,

I have a file containing four columns of data separated by tabs (\t)
and I'd like to read a specific column from it (say the third). Is
there any simple way to do this in Python?

I've found quite interesting the linecache module but unfortunately
that is (to my knowledge) only working on lines, not columns.

Any suggestion?

Thanks and regards
Francesco
Jan 11 '08 #1
7 26337
On 2008-01-11, cesco <fd**********@gmail.comwrote:
Hi,

I have a file containing four columns of data separated by tabs (\t)
and I'd like to read a specific column from it (say the third). Is
there any simple way to do this in Python?

I've found quite interesting the linecache module but unfortunately
that is (to my knowledge) only working on lines, not columns.

Any suggestion?
the csv module may do what you want.
Jan 11 '08 #2
On Jan 11, 2:15 pm, cesco <fd.calabr...@gmail.comwrote:
Hi,

I have a file containing four columns of data separated by tabs (\t)
and I'd like to read a specific column from it (say the third). Is
there any simple way to do this in Python?

I've found quite interesting the linecache module but unfortunately
that is (to my knowledge) only working on lines, not columns.

Any suggestion?

Thanks and regards
Francesco
for (i, each_line) in enumerate(open('input_file.txt','rb')):
try:
column_3 = each_line.split('\t')[2].strip()
except IndexError:
print 'Not enough columns on line %i of file.' % (i+1)
continue

do_something_with_column_3()
Jan 11 '08 #3
cesco wrote:
I have a file containing four columns of data separated by tabs (\t)
and I'd like to read a specific column from it (say the third). Is
there any simple way to do this in Python?
use the "split" method and plain old indexing:

for line in open("file.txt"):
columns = line.split("\t")
print columns[2] # indexing starts at zero

also see the "csv" module, which can read all sorts of
comma/semicolon/tab-separated spreadsheet-style files.
I've found quite interesting the linecache module
the "linecache" module seems to be quite popular on comp.lang.python
these days, but it's designed for a very specific purpose (displaying
Python code in tracebacks), and is a really lousy way to read text files
in the general case. please unlearn.

</F>

Jan 11 '08 #4
A.T.Hofkamp wrote:
On 2008-01-11, cesco <fd**********@gmail.comwrote:
>Hi,

I have a file containing four columns of data separated by tabs (\t)
and I'd like to read a specific column from it (say the third). Is
there any simple way to do this in Python?

I've found quite interesting the linecache module but unfortunately
that is (to my knowledge) only working on lines, not columns.

Any suggestion?

the csv module may do what you want.
Here's an example:
>>print open("tmp.csv").read()
alpha beta gamma delta
one two three for
>>records = csv.reader(open("tmp.csv"), delimiter="\t")
[record[2] for record in records]
['gamma', 'three']

Peter
Jan 11 '08 #5
On Jan 11, 4:15 am, cesco <fd.calabr...@gmail.comwrote:
Hi,

I have a file containing four columns of data separated by tabs (\t)
and I'd like to read a specific column from it (say the third). Is
there any simple way to do this in Python?
You say you would like to "read" a specific column. I wonder if you
meant read all the data and then just seperate out the 3rd column or
if you really mean only do disk IO for the 3rd column of data and
thereby making your read faster. The second seems more interesting
but much harder and I wonder if any one has any ideas. As for the
just filtering out the third column, you have been given many
suggestions already.

Regards,
Ivan Novick
http://www.0x4849.net
Jan 11 '08 #6
Here is another suggestion:

col = 2 # third column
filename = '4columns.txt'
third_column = [line[:-1].split('\t')[col] for line in open(filename,
'r')]

third_column now contains a list of items in the third column.

This solution is great for small files (up to a couple of thousand of
lines). For larger file, performance could be a problem, so you might
need a different solution.
Jan 17 '08 #7
On Jan 17, 8:47 pm, Hai Vu <wuh...@gmail.comwrote:
Here is another suggestion:

col = 2 # third column
filename = '4columns.txt'
third_column = [line[:-1].split('\t')[col] for line in open(filename,
'r')]

third_column now contains a list of items in the third column.

This solution is great for small files (up to a couple of thousand of
lines). For larger file, performance could be a problem, so you might
need a different solution.
Using the maxsplit arg could speed it up a little:

line[:-1].split('\t', col+1)[col]

Jan 17 '08 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
by: Leeor Chernov | last post by:
Hi , I am using the method: DSSap.ReadXml( XmlPath,XmlReadMode.InferSchema ); And as I expected I get an error when the xml does not matching my DataSet that contains already a schema , The...
3
by: The Cool Giraffe | last post by:
I have a 3D matrix in MatLab and i have saved it binary (i think) to a file data.mat and now i'd like to retrieve those values in C++. Even though i'm a ground breaking genius working with...
4
waynetheengineer
by: waynetheengineer | last post by:
Hello everyone :) I am trying to write VB code for reading filenames and file property values in a specific directory. For example, I have a directory called C:/Bears and in that directory...
0
by: Problematic coder | last post by:
Hi, I have just been asked to create an app that will import a specific column from an excel spreadsheet, something like sheet1 columnD or a column with the first row with a value of 'ID'...
6
convexcube
by: convexcube | last post by:
After searching the web for a solution to total a specific column in a list box and not finding it, I came up with this: Dim varTotal As Currency Dim varRow As Integer For varRow = 1 To...
2
by: Derik | last post by:
I've got a XML file I read using a file_get_contents and turn into a simpleXML node every time index.php loads. I suspect this is causing a noticeable lag in my page-execution time. (Or the...
2
by: anatanat | last post by:
hello :) I want to bind a combobox to a specific column in a DataTable. I want the combobox items to display distinct values. I tried: combo.datasource = dataTable; combo.DisplayMember =...
1
jbt007
by: jbt007 | last post by:
Hi All, I am using the following code to import a spreadsheet into my Access 2003 tblWeekly table: DoCmd.TransferSpreadsheet acImport, acSpreadsheetTypeExcel8, "tblweekly", strOutFile, True,...
0
by: veera ravala | last post by:
ServiceNow is a powerful cloud-based platform that offers a wide range of services to help organizations manage their workflows, operations, and IT services more efficiently. At its core, ServiceNow...
0
by: VivesProcSPL | last post by:
Obviously, one of the original purposes of SQL is to make data query processing easy. The language uses many English-like terms and syntax in an effort to make it easy to learn, particularly for...
3
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 3 Jan 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). For other local times, please check World Time Buddy In...
0
by: jianzs | last post by:
Introduction Cloud-native applications are conventionally identified as those designed and nurtured on cloud infrastructure. Such applications, rooted in cloud technologies, skillfully benefit from...
2
by: jimatqsi | last post by:
The boss wants the word "CONFIDENTIAL" overlaying certain reports. He wants it large, slanted across the page, on every page, very light gray, outlined letters, not block letters. I thought Word Art...
2
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 7 Feb 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:30 (7.30PM). In this month's session, the creator of the excellent VBE...
0
by: fareedcanada | last post by:
Hello I am trying to split number on their count. suppose i have 121314151617 (12cnt) then number should be split like 12,13,14,15,16,17 and if 11314151617 (11cnt) then should be split like...
0
by: stefan129 | last post by:
Hey forum members, I'm exploring options for SSL certificates for multiple domains. Has anyone had experience with multi-domain SSL certificates? Any recommendations on reliable providers or specific...
0
by: MeoLessi9 | last post by:
I have VirtualBox installed on Windows 11 and now I would like to install Kali on a virtual machine. However, on the official website, I see two options: "Installer images" and "Virtual machines"....

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.