How to parse specific numeric data from csv file using python

Good day.

I have a series of data in a csv file like below:

1,,,
1,137.1,1198,1.6
2,159,300,0.4
3,176,253,0.3
4,197,231,0.3
5,198,525,0.7
6,199,326,0.4
7,215,183,0.2
8,217.1,178,0.2
9,244.2,416,0.5
10,245.1,316,0.4

I want to extract the rows with specific values in the second column, for example 217.1 and 245.1, and write them to a new file like this:

8,217.1,178,0.2
10,245.1,316,0.4

I use the csv module to read my csv file, but I can't extract the specific data I want. Could anyone kindly help me? Thank you.
May 14 '10 #1
10 Replies
Expert 256MB
Just open the csv file, and open a second csv file to write.

Then loop through the first csv file (for line in fileobject), and check whether the data in the second column matches your criteria. If it does, write the line to the second csv file.

Personally, if your data is really as simple as you suggest, I wouldn't bother with the csv module, and would just use the built-in open, read, write and close commands. It should only be a handful of lines. If you're struggling, post back with what you've tried, and I'll write out an example of the code. I'm rushing now, though, I'm afraid.
May 14 '10 #2
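The plain-file approach described above might look something like the following sketch (Python 3). The function name `copy_matching_lines` and the file names are assumptions made for illustration, and the demo uses a few rows of the sample data from the question written to a temporary directory. It assumes every line has at least two comma-separated fields, as in the sample.

```python
import os
import tempfile

# A few rows of the sample data from the question
SAMPLE = """1,,,
1,137.1,1198,1.6
8,217.1,178,0.2
9,244.2,416,0.5
10,245.1,316,0.4
"""


def copy_matching_lines(in_path, out_path, wanted):
    """Copy every line whose second comma-separated field is in `wanted`."""
    infile = open(in_path)
    outfile = open(out_path, "w")
    for line in infile:
        fields = line.strip().split(",")
        # Assumes at least two fields per line, as in the sample data
        if fields[1] in wanted:
            outfile.write(line)
    infile.close()
    outfile.close()


# Demo on a temporary file (hypothetical file names)
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "input.csv")
out_path = os.path.join(workdir, "output.csv")
with open(in_path, "w") as f:
    f.write(SAMPLE)

copy_matching_lines(in_path, out_path, ["217.1", "245.1"])
with open(out_path) as f:
    result = f.read()
print(result)
```

The second column is compared as text, so "217.1" matches but "217.10" would not.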
+1 on opening it as a normal file and splitting on the comma. Also, check the length of each split record before doing anything else. You should not rely on there not being any empty or malformed records in the file.
May 14 '10 #3
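One way the length check suggested above could be written is sketched below. The helper name `split_record` and the default of four fields per record are assumptions based on the sample data; the blank and two-field lines in the demo are made up to show the rejection cases.

```python
def split_record(line, expected_fields=4):
    """Return the fields of a line, or None if the record is empty or malformed."""
    fields = [field.strip() for field in line.strip().split(",")]
    if len(fields) != expected_fields:
        return None  # wrong number of columns (this also covers blank lines)
    if any(field == "" for field in fields):
        return None  # a record such as '1,,,' has empty columns
    return fields


# '1,,,' comes from the question; the blank and short lines are made-up bad records
lines = ["1,,,", "8,217.1,178,0.2", "", "9,244.2", "10,245.1,316,0.4"]
good = [rec for rec in map(split_record, lines) if rec is not None]
print(good)
```

Returning None for bad records lets the calling loop decide whether to skip them silently or report them.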
@Glenton
Dear Glenton,

Actually, I'm dealing with 10K rows of data in my csv file. I will try your suggestion. Thanks for the prompt reply.
May 14 '10 #4
Expert 256MB
10K should be trivial, i.e. comfortably less than 0.1s to run. But I was referring to the structure of the file rather than its size anyway. Let us know how it goes, and if you can, post back your solution. It will help future users!
May 16 '10 #5
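If you want to check the running time on your own machine, a rough sketch like the one below may help. The rows are synthetic (generated to have the same shape as the sample data, with made-up values), the function names are assumptions, and the timing covers only the in-memory filtering, not reading from disk.

```python
import time


def make_rows(n):
    """Build n synthetic lines shaped like the sample data (values are made up)."""
    return ["%d,%.1f,%d,%.1f" % (i, 100 + i * 0.1, 200 + i, 0.5)
            for i in range(1, n + 1)]


def filter_rows(lines, wanted):
    """Keep the lines whose second field is in the set `wanted`."""
    kept = []
    for line in lines:
        fields = line.split(",")
        if len(fields) > 1 and fields[1] in wanted:
            kept.append(line)
    return kept


lines = make_rows(10000)
start = time.perf_counter()
kept = filter_rows(lines, {"217.1", "245.1"})
elapsed = time.perf_counter() - start
print("kept %d of %d lines in %.4f s" % (len(kept), len(lines), elapsed))
```

The elapsed time will vary from machine to machine, so treat the printed figure as an indication only.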
@Glenton
Thanks for suggestion.

With help, I managed to extract the readings I wanted. Below is my code:

    import sys
    import re
    import csv

    filename = sys.argv[1]
    myfile = csv.reader(open(filename), delimiter=',')

    # Integer parts of the second-column values to keep
    d_values = ['288', '305', '347', '389', '437', '483']

    for row in myfile:
        for val in d_values:
            # Matches e.g. '288.4' for val '288'; the raw string keeps '\.' literal
            m = re.match(val + r'\.[0-9]*', row[1])
            if m:
                print(row)

The output on the screen will be,

['22', '288.4', '11239', '14.7']
['31', '305.2', '2241', '2.9']
['56', '347.2', '76661', '100']
['86', '389.2', '48408', '63.1']
['118', '437.3', '1701', '2.2']
['158', '483.2', '11048', '14.4']
['192', '521.3', '8429', '11']
['233', '563.3', '9916', '12.9']
['281', '613.4', '327', '0.4']
['295', '627.3', '370', '0.5']
['337', '669.4', '1032', '1.3']
['362', '695.3', '4592', '6']
['401', '737.3', '6065', '7.9']
['422', '759.3', '300', '0.4']
['439', '779.3', '1775', '2.3']
['527', '869.3', '1640', '2.1']
['567', '911.4', '1598', '2.1']
['21', '288.4', '14775', '18.3']
['30', '305.2', '1979', '2.4']
['57', '347.2', '80888', '100']
['84', '389.2', '52990', '65.5']
['118', '437.3', '2052', '2.5']
['155', '483.2', '12031', '14.9']
Now I'm stuck on how to write the output to a file. I also tried converting the list of strings to numeric values, but that failed as well.

I'm sorry for asking so many questions. Python is my first programming language and I'm still learning. Thanks for your time.
May 16 '10 #6
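For the string-to-number conversion mentioned above, a minimal sketch could look like this. The helper name `to_number` is an assumption for illustration; it returns None for empty fields such as those in the '1,,,' row, rather than raising an error.

```python
def to_number(text):
    """Convert a csv field to int or float; return None if it cannot be converted."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return None


# One of the rows printed by the code above
row = ['22', '288.4', '11239', '14.7']
numbers = [to_number(field) for field in row]
print(numbers)
```

Trying int() first keeps whole numbers such as '22' as integers, and float() handles the rest.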
Expert 256MB
Good job! Please use code tags for posting code.

The easiest way to write to file is to do so line by line, much like you printed it! You can also simplify the matching!

    import sys
    import csv

    filename = sys.argv[1]
    filename2 = sys.argv[2]  # output file name (assumed to be the second argument)
    myfile = csv.reader(open(filename), delimiter=',')
    myfile2 = open(filename2, "w")

    d_values = ['288', '305', '347', '389', '437', '483']

    for row in myfile:
        # Skip short rows; compare only the part before the decimal point
        if len(row) > 1 and row[1].split('.')[0] in d_values:
            # write() needs a string, so join the fields back together
            myfile2.write(','.join(row) + '\n')

It might be better for you to create a csv writer object instead and use writerow(row), but I haven't had much experience with this. What I gave you should work, though, since row comes out of csv.reader as a list of strings, and joining it with commas rebuilds the line.
May 16 '10 #7
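The csv writer object mentioned above could be used roughly as follows (Python 3). To keep the sketch self-contained it reads from and writes to in-memory `io.StringIO` objects instead of real files, which is an assumption for the demo. The first and third input rows are taken from the output posted earlier; the middle row is made up to show a non-matching value.

```python
import csv
import io

# In-memory stand-ins for the input and output files (demo assumption)
source = io.StringIO(
    "22,288.4,11239,14.7\n"
    "23,290.1,500,0.7\n"      # made-up row that should be filtered out
    "31,305.2,2241,2.9\n"
)
target = io.StringIO()

d_values = ['288', '305', '347', '389', '437', '483']

reader = csv.reader(source, delimiter=',')
writer = csv.writer(target, delimiter=',', lineterminator='\n')

for row in reader:
    # Keep rows whose second column has one of the wanted integer parts
    if len(row) > 1 and row[1].split('.')[0] in d_values:
        writer.writerow(row)

output = target.getvalue()
print(output)
```

With a real output file in Python 3, the csv documentation recommends opening it with `newline=''` so that the writer controls the line endings.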
@Glenton
Thanks for the reminder. I'll use code tags to post code in future. Thanks.
May 16 '10 #8
numberwhun
Expert Mod 2GB
While I am no Python expert, have you looked at all at the csv module in the Python library? It may prove helpful in the process.

Regards,

Jeff
May 16 '10 #9
Expert 256MB
@numberwhun
Thanks Jeff, we have looked at the csv module.
May 16 '10 #10
Expert 256MB
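Oh, by the way, you should close the output file at the end of your code. The csv.reader object has no close() method of its own, so to close the input file as well, keep the file object returned by open() in a variable and call close() on that.

    myfile2.close()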
May 16 '10 #11
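An alternative to calling close() by hand is the `with` statement, which closes the files automatically when the block ends. The sketch below (Python 3) uses hypothetical file names in a temporary directory and two rows from the output posted earlier; it copies every row unchanged, so add your own filter inside the loop.

```python
import csv
import os
import tempfile

# Hypothetical file names in a temporary directory, for the demo only
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "input.csv")
out_path = os.path.join(workdir, "output.csv")
with open(in_path, "w") as f:
    f.write("22,288.4,11239,14.7\n31,305.2,2241,2.9\n")

with open(in_path, newline='') as infile, open(out_path, "w") as outfile:
    for row in csv.reader(infile, delimiter=','):
        # Put the filtering condition here; this demo copies every row
        outfile.write(','.join(row) + '\n')

# Both files are closed here, even if an exception was raised inside the block
files_closed = infile.closed and outfile.closed
print(files_closed)
```

Because the file objects are bound to names by the `with` statement, the csv.reader never needs to be closed itself.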
