Bytes IT Community

Reading a line from a text file and separating it into variables / a structure

P: 10
I was recently forced to switch from C to Python for numerical analysis, and being new to Python/NumPy, I was wondering whether Python or NumPy has an equivalent of C's fscanf function, or how I would go about reading in a line of data and storing the individual "strings" in variables.

I figured my best bet was probably splitting the string into x pieces using split(), but I'm not entirely sure how to assign each individual piece of the string to a corresponding variable.

Also wondering if there is an equivalent of a C structure in Python.

Thank you for reading.
Feb 29 '12 #1

✓ answered by dwblas


6 Replies

Smygis
100+
P: 126
Guessing wildly here but I think what you are after is best accomplished with a dictionary.

But I'm not sure what you mean by "assign each individual piece of the string to a corresponding variable", what exactly the data looks like, or how you want to access it later.

But still a dictionary is a great tool.

    >>> dd = {}
    >>> dd["add"] = lambda x, y: x+y
    >>> dd["hello"] = "Hello world"
    >>> dd
    {'add': <function <lambda> at 0x0000000002C2CF98>, 'hello': 'Hello world'}
    >>> dd["add"](2,7)
    9
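For the "assign each piece to a variable" part of the question, here is a minimal sketch: split() returns a list, which can be unpacked into names directly or zipped into a dictionary. The field names (name, ra, dec) are guesses made up for illustration; the thread never says what the columns mean.

```python
# First three fields of the sample record posted in this thread.
line = "USB270.15385-29.63146 270.153847 -29.631455"

# split() with no argument splits on any run of whitespace.
fields = line.split()

# Unpack into variables, converting the numeric pieces to float
# (loosely what fscanf with "%s %lf %lf" would do in C).
name, ra, dec = fields[0], float(fields[1]), float(fields[2])

# Or keep the pieces in a dictionary keyed by hypothetical column names.
column_names = ["name", "ra", "dec"]
record = dict(zip(column_names, fields))

print(name, ra, dec)
print(record["dec"])
```

Note that the dictionary still holds strings; conversion to float has to be done explicitly, as in the unpacking line.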
Feb 29 '12 #2

P: 10
Okay, so the data I am dealing with looks like a set of thousands of what is below:

USB270.15385-29.63146 270.153847 -29.631455 2.966699e+03 -9.99 1.300391e+03 -9.99 -9.99 A-A-- 6.787463e+01 -9.99 1.555773e+02 -9.99 -9.99 10100 | ----- ------ ------ ------ | 0.373 13.554 12.928 12.670 AAA | ----- -------- - -------- - -------- - -------- - | --- ---------- - ---------- - --------- - --------- - --------- - ---------- -

I want to get each segment of the line for example "USB 270.15385-29.63146 270.153847" and store it in a two dimensional array named star[i][0].
And the second segment -29.631455 and store it in the same array star[i][1]... for the first 14 segments.

Using NumPy, I came across the function genfromtxt(), which lets me read text data and split it into fields based on a delimiter, but I'm not entirely sure how to turn that into the two-dimensional array I want.
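For reference, a minimal sketch of how numpy.genfromtxt can produce such a two-dimensional array. It assumes only the numeric columns 1 through 7 are wanted, and it uses an in-memory io.StringIO object holding the sample line twice as a stand-in for the real file; both choices are assumptions made for the demo.

```python
import io
import numpy as np

# The first 15 fields of the sample line, repeated twice to imitate a
# multi-line file (an assumption made only for this demo).
line = ("USB270.15385-29.63146 270.153847 -29.631455 2.966699e+03 -9.99 "
        "1.300391e+03 -9.99 -9.99 A-A-- 6.787463e+01 -9.99 1.555773e+02 "
        "-9.99 -9.99 10100")
fake_file = io.StringIO(line + "\n" + line + "\n")

# Read only columns 1..7, which are all numeric; runs of whitespace are
# the default delimiter.
star = np.genfromtxt(fake_file, usecols=tuple(range(1, 8)))

print(star.shape)   # (rows, selected columns)
print(star[0][1])   # second selected column of the first row
```

With a real file, the file name can be passed in place of fake_file. Text columns such as the name or the "A-A--" flag need a different dtype or separate handling.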
Feb 29 '12 #3

bvdet
Expert Mod 2.5K+
P: 2,851
It appears the data you displayed is on one line which makes sense. You can compile a two dimensional list by iterating on the file object something like:
    fileObj = open(file_name)
    data = []
    for line in fileObj:
        s = line.split()
        data.append((s[0:2], s[2:14]))
    fileObj.close()
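To make the resulting structure concrete, here is a small sketch that runs the same loop on the sample line from this thread. The io.StringIO object is only a stand-in for open(file_name), and the variable names are illustrative.

```python
import io

line = ("USB270.15385-29.63146 270.153847 -29.631455 2.966699e+03 -9.99 "
        "1.300391e+03 -9.99 -9.99 A-A-- 6.787463e+01 -9.99 1.555773e+02 "
        "-9.99 -9.99 10100")
file_obj = io.StringIO(line + "\n")   # stand-in for open(file_name)

data = []
for text_line in file_obj:
    s = text_line.split()
    # Each record is a tuple of two lists: fields 0-1 and fields 2-13.
    data.append((s[0:2], s[2:14]))

first_part, second_part = data[0]
print(first_part)         # the first two fields
print(len(second_part))   # number of fields in the slice s[2:14]
print(data[0][1][0])      # record 0, second group, first field
```

Each record therefore needs three indices (record, group, field), rather than the two indices of a flat list of lists.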
Feb 29 '12 #4

P: 10
I am getting a
TypeError: 'int' object is not subscriptable

Let me see if I get the gist of how the for loop works...
It's going to scan the file line by line.
For each line, it will split it into however many elements it contains.

I don't really understand the append line of the program. From what I understand, it inserts the first two segments of the string into the first dimension of the array, and the last 13 elements into the other part.

I think what I want to do is this here, but I can't get around the int object error :

    fileObj = open(file_name)
    data = []
    i=0
    for line in fileObj:
        s = line.split()
        for j in range(13):
           data[i].append(s[j:j+1])
        i=i+1
    fileObj.close()
where all of the data from a line goes into one row, selected by the int i, which is incremented for every new line
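One problem in the loop above is certain: data starts out as an empty list, so data[i] does not exist yet and data[i].append(...) raises an IndexError. A minimal sketch of one way to build the rows instead, assuming the first 13 fields of each line are wanted. The io.StringIO object stands in for the real file, and the sample line is repeated twice only for the demo.

```python
import io

line = ("USB270.15385-29.63146 270.153847 -29.631455 2.966699e+03 -9.99 "
        "1.300391e+03 -9.99 -9.99 A-A-- 6.787463e+01 -9.99 1.555773e+02 "
        "-9.99 -9.99 10100")
file_obj = io.StringIO(line + "\n" + line + "\n")   # stand-in for open(file_name)

data = []
for text_line in file_obj:
    s = text_line.split()
    row = []                  # build the new row first...
    for j in range(13):
        row.append(s[j])      # s[j] is the field; s[j:j+1] would be a one-item list
    data.append(row)          # ...then add it to the outer list

print(len(data), len(data[0]))
print(data[1][2])             # row 1, field 2
```

No counter i is needed, because append always adds the new row at the end. The inner loop could also be written as the slice s[0:13].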
Feb 29 '12 #5

Expert 100+
P: 626
If you want to store the entire line, just use data.append(s), which gives you a list of lists, i.e. two dimensions. The alternative is to append only a subset:
data.append([s[0], s[3], s[5]]) --> appends 3 fields only: the 1st, 4th, and 6th
Note the [] around the 3 fields, which creates a sub-list that can be treated as the second dimension. Unless your data set runs to gigabytes, it is just as easy to append the entire list and then use whichever part is relevant.

And you can always use a class as a replacement for a C structure, but since you know what each position holds, it is just as easy to write data[x][y] as some variable name.
    test_data = """USB270.15385-29.63146 270.153847 -29.631455 2.966699e+03 -9.99 1.300391e+03 -9.99 -9.99 A-A-- 6.787463e+01 -9.99 1.555773e+02 -9.99 -9.99 10100 | 0.373 13.554 12.928 12.670 AAA"""
    ## fileObj = open(file_name)
    ## splitting on "|" stands in for reading separate lines from a file
    bogus_file_obj = test_data.split("|")
    data = []
    for line in bogus_file_obj:
        s = line.split()
        data.append(s)

    for rec in data:     ## print the results
        print("-" * 30)
        for sub_rec in rec:
            print(sub_rec)

    # print using "array" indexing
    print("\n==================================\n")
    titles = ["Name", "Variance", "Third"]
    for x in range(len(data)):
        print("-" * 30)
        for y in range(len(data[x])):
            ## associate a name with the field location
            if y < len(titles):
                print(titles[y], end=" ")
            print(data[x][y])
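As a sketch of the "class as a C structure" idea, the example below uses collections.namedtuple from the standard library rather than a hand-written class. The field names are hypothetical, since the thread does not say what the columns mean.

```python
from collections import namedtuple

# Hypothetical field names, chosen only for illustration.
Star = namedtuple("Star", ["name", "ra", "dec"])

line = "USB270.15385-29.63146 270.153847 -29.631455"
s = line.split()

# Build one struct-like record, converting the numeric fields to float.
star = Star(name=s[0], ra=float(s[1]), dec=float(s[2]))

print(star.name)   # access by field name, like a C struct member
print(star.dec)
print(star[1])     # positional access still works, like data[x][y]
```

A namedtuple is immutable; if the fields need to change after creation, a regular class is the closer match to a C struct.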
Mar 1 '12 #6

P: 10
Thank you very much for your help!
Mar 1 '12 #7
