
how to access the individual elements of a matrix in python

100+
P: 111
My file is of the form

01 "\t" 10.19 "\t" 0.00 "\t" 10.65
02 "\t" 11.19 "\t" 10.12 "\t" 99.99

and I need to access the individual floating-point numbers in it. For example, the first number is 10.19; I want to access it and add one to it.

    filename=open("half.transfac","r")
    file_content=filename.readlines()
    sam=""
    for line in file_content:
        for char in line:
            if char=="\tchar\t\n":
                sam+=char
                print sam

The "for char" loop accesses every single character, not the whole numbers ("10.19", "0.00", etc.). How do I do this? Please help.
Jul 5 '07 #1
60 Replies


dshimer
Expert 100+
P: 136
There are powerful ways to do this all in one line, but by way of explanation, start by using split() to separate each line into individual data lists for further manipulation.

    >>> for line in file_content:
    ...     line.split()
    ... 
    ['01', '10.19', '0.00', '10.65']
    ['02', '11.19', '10.12', '99.99']

Each of which could be appended to an empty list, forming a multi-dimensional data set. Note that all the elements are strings and would have to be converted to numbers before the math.

    >>> datalist=[]
    >>> for line in file_content:
    ...     datalist.append(line.split())
    ... 
    >>> datalist
    [['01', '10.19', '0.00', '10.65'], ['02', '11.19', '10.12', '99.99']]
    >>> int(datalist[0][0])
    1
    >>> float(datalist[0][1])
    10.19
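Putting the two steps together, here is a minimal sketch in Python 3 syntax that splits each line, converts the fields, and adds one to the first number. The sample lines are inlined rather than read from half.transfac, and the names `rows`, `fields` and `numbers` are only illustrative.

```python
# Sample lines in the same tab-separated layout as the question's file.
file_content = ["01\t10.19\t0.00\t10.65\n", "02\t11.19\t10.12\t99.99\n"]

rows = []
for line in file_content:
    fields = line.split()                      # splits on any whitespace, tabs included
    position = fields[0]                       # keep the position label as a string
    numbers = [float(f) for f in fields[1:]]   # convert the remaining fields
    rows.append((position, numbers))

# Add one to the first number of the first row.
rows[0][1][0] += 1
print(rows[0])
```

Keeping the position label as a string preserves the leading zero, which `int()` would drop.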

Jul 5 '07 #2

bvdet
Expert Mod 2.5K+
P: 2,851
Here is another way to access the numbers from a dictionary:
    >>> lineList = open(file_name).readlines()
    >>> dataDict = {}
    >>> for line in lineList:
    ...     line = line.split()
    ...     dataDict[line[0]] = [float(r) for r in line[1:]]
    ...     
    >>> dataDict
    {'02': [11.19, 10.119999999999999, 99.989999999999995], '01': [10.19, 0.0, 10.65]}
    >>> dataDict['01'][0]
    10.19
    >>> dataDict['02'][2]
    99.989999999999995

You can perform mathematical operations on elements of the dictionary:

    >>> dataDict
    {'02': [11.19, 10.119999999999999, 99.989999999999995], '01': [10.19, 0.0, 10.65]}
    >>> dataDict['01'][0] += 1
    >>> dataDict
    {'02': [11.19, 10.119999999999999, 99.989999999999995], '01': [11.19, 0.0, 10.65]}
    >>> 
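The long values in the session above (10.119999999999999 rather than 10.12) are not in the file. They are how the interpreter used here displays the nearest binary floating-point number. A small sketch, in Python 3 syntax, of how the display can be controlled (the variable names are just for illustration):

```python
value = float("10.12")

# Two decimal places, as written in the file.
two_places = "%0.2f" % value

# 17 significant digits, which exposes the binary rounding error.
full_precision = format(value, ".17g")

print(two_places)
print(full_precision)
```

The stored number is the same in both cases; only the formatting differs, so arithmetic such as adding one is unaffected.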
To access individual characters:
    >>> for key in dataDict:
    ...     for item in dataDict[key]:
    ...         print ' '.join([ch for ch in '%0.2f' % item])
    ...         
    1 1 . 1 9
    1 0 . 1 2
    9 9 . 9 9
    1 0 . 1 9
    0 . 0 0
    1 0 . 6 5

OR

    >>> [ch for ch in '%0.2f' % dataDict['01'][0]]
    ['1', '1', '.', '1', '9']
Jul 5 '07 #3

100+
P: 111
This is my file. As you can see, the first column gives the position, and the second, third, etc. columns are the parts of the matrix. My problem is that I have to write code to access the individual values: if I say A[01] I should get the value 0.00, A[06] should give 3.46, C[04] should give 3.67, and so on. I also have to add one to each of these elements. Can you give me the code? My code is not working.

NA bap

PO A C G T

01 0.00 3.67 0.00 0.00

02 0.00 0.00 3.67 0.00

03 0.00 0.00 0.00 3.67

04 0.00 3.67 0.00 0.00

05 3.67 0.00 0.00 0.00

06 3.46 0.00 0.22 0.00

07 0.00 0.00 3.67 0.00

08 0.00 0.00 0.00 3.67

09 0.00 0.00 0.00 3.67

10 0.00 3.67 0.00 0.00

11 3.67 0.00 0.00 0.00

12 3.67 0.00 0.00 0.00

13 0.00 0.00 3.67 0.00

14 0.00 0.00 0.00 3.67

15 0.00 0.00 3.67 0.00

16 0.00 3.67 0.00 0.00

//

//

NA bcd
Jul 6 '07 #4

bartonc
Expert 5K+
P: 6,596
Files operate much like lists, so I always practice with a list, then go to a file.
Here's one way to look at such data in Python:

    rawdata = [
        '01 0.00 3.67 0.00 0.00',
        '02 0.00 0.00 3.67 0.00',
        '03 0.00 0.00 0.00 3.67',
        '04 0.00 3.67 0.00 0.00',
        '05 3.67 0.00 0.00 0.00',
        '06 3.46 0.00 0.22 0.00',
        '07 0.00 0.00 3.67 0.00',
        '08 0.00 0.00 0.00 3.67',
        '09 0.00 0.00 0.00 3.67',
        '10 0.00 3.67 0.00 0.00',
        '11 3.67 0.00 0.00 0.00',
        '12 3.67 0.00 0.00 0.00',
        '13 0.00 0.00 3.67 0.00',
        '14 0.00 0.00 0.00 3.67',
        '15 0.00 0.00 3.67 0.00',
    ]

    datadictionary = {}  # usually shorten the name to dd

    for line in rawdata:
        items = line.split()
        key = items[0]  # the position label, e.g. '09'
        datadictionary[key] = [float(item) for item in items[1:]]

    print datadictionary['09'][3]
Jul 6 '07 #5

bartonc
Expert 5K+
P: 6,596
<Moderator NOTE: Merged threads by OP on a single topic (after posting to the second)>
Jul 6 '07 #6

100+
P: 111
I have to consider the columns as well. For example, this is my file:
A T G C
01 1.00 2.00 3.00 4.00
02 2.00 3.00 4.00 5.00

Now if I say A[01] I should get the value 1.00, and C[01] should give 4.00.
A, T, G and C are the column names. I am not able to format my file properly in this post; I hope it is understood.
Jul 6 '07 #7

bartonc
Expert 5K+
P: 6,596
I hope you understand that you should be thinking "row 0n, column X", not the other way around. Rows enclose columns, so the row is the first index that you deal with.

    # Column indices. These must match the column order of your file's header line.
    A = 0; T = 1; G = 2; C = 3
    rawdata = [
        '01 0.00 3.67 0.00 0.00',
        '02 0.00 0.00 3.67 0.00',
        '03 0.00 0.00 0.00 3.67',
        '04 0.00 3.67 0.00 0.00',
        '05 3.67 0.00 0.00 0.00',
        '06 3.46 0.00 0.22 0.00',
        '07 0.00 0.00 3.67 0.00',
        '08 0.00 0.00 0.00 3.67',
        '09 0.00 0.00 0.00 3.67',
        '10 0.00 3.67 0.00 0.00',
        '11 3.67 0.00 0.00 0.00',
        '12 3.67 0.00 0.00 0.00',
        '13 0.00 0.00 3.67 0.00',
        '14 0.00 0.00 0.00 3.67',
        '15 0.00 0.00 3.67 0.00',
    ]

    datadictionary = {}  # usually shorten the name to dd

    for line in rawdata:
        items = line.split()
        key = items[0]  # the row (position) label
        datadictionary[key] = [float(item) for item in items[1:]]

    print datadictionary['09'][C]  # row first, then column
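Rather than hard-coding the column numbers, the indices could be derived from the header line so that they always follow the file's own column order. A minimal sketch in Python 3 syntax, assuming the header "PO A C G T" from the file posted earlier; the names `column` and `table` are hypothetical:

```python
header = "PO A C G T".split()[1:]                    # ['A', 'C', 'G', 'T']
column = {name: i for i, name in enumerate(header)}  # {'A': 0, 'C': 1, 'G': 2, 'T': 3}

rawdata = ["01 0.00 3.67 0.00 0.00",
           "06 3.46 0.00 0.22 0.00"]

table = {}
for line in rawdata:
    items = line.split()
    table[items[0]] = [float(x) for x in items[1:]]

# Row (position) first, then column, looked up by name.
print(table["06"][column["A"]])
print(table["01"][column["C"]])
```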
Jul 6 '07 #8

bvdet
Expert Mod 2.5K+
P: 2,851
For simplicity, let us assume the data file looks like this:

PO A C G T
01 0.00 3.67 0.00 0.00
02 0.00 0.00 3.67 0.00
03 0.00 0.00 0.00 3.67
04 0.00 3.67 0.00 0.00
05 3.67 0.00 0.00 0.00
06 3.46 0.00 0.22 0.00
07 0.00 0.00 3.67 0.00
08 0.00 0.00 0.00 3.67
09 0.00 0.00 0.00 3.67
10 0.00 3.67 0.00 0.00
11 3.67 0.00 0.00 0.00
12 3.67 0.00 0.00 0.00
13 0.00 0.00 3.67 0.00
14 0.00 0.00 0.00 3.67
15 0.00 0.00 3.67 0.00
16 0.00 3.67 0.00 0.00

Create a dictionary of dictionaries:
    fn = r'H:\TEMP\temsys\data7.txt'
    lineList = [line.strip().split() for line in open(fn).readlines() if line != '\n']

    headerList = lineList.pop(0)[1:]

    # Key list
    keys = [i[0] for i in lineList]
    # Values list
    values = [[float(s) for s in item] for item in [j[1:] for j in lineList]]

    # Create a dictionary from keys and values
    lineDict = dict(zip(keys, values))

    dataDict = {}

    for i, item in enumerate(headerList):
        dataDict[item] = {}
        for key in lineDict:
            dataDict[item][key] = lineDict[key][i]

    >>> dataDict['A']['05']
    3.6699999999999999
    >>> globals().update(dataDict)
    >>> A['05']
    3.6699999999999999
    >>> 
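As an alternative to globals().update(dataDict), which injects A, C, G and T as global names and can silently overwrite existing variables, the same lookup can be done through an ordinary dictionary. A minimal sketch in Python 3 syntax, with a few sample rows inlined instead of reading data7.txt (the names `text`, `data` and `position` are assumptions for illustration):

```python
text = """PO A C G T
01 0.00 3.67 0.00 0.00
05 3.67 0.00 0.00 0.00
06 3.46 0.00 0.22 0.00
"""

# Split every non-blank line into fields.
lines = [ln.split() for ln in text.splitlines() if ln.strip()]
header = lines[0][1:]                      # column names without 'PO'

# One sub-dictionary per column, keyed by position.
data = {name: {} for name in header}
for row in lines[1:]:
    position = row[0]
    for name, field in zip(header, row[1:]):
        data[name][position] = float(field)

A = data["A"]          # bind only the names that are actually needed
print(A["05"])
print(data["G"]["06"])
```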
Jul 6 '07 #9

100+
P: 111
Hey,
I don't really understand what this line does. Can you please explain?

    values = [[float(s) for s in item] for item in [j[1:] for j in lineList]]

and

    for i, item in enumerate(headerList):
        dataDict[item] = {}
        for key in lineDict:
            dataDict[item][key] = lineDict[key][i]

I don't understand this either.
When I run this code, I get an error which says
ValueError: invalid literal for float(): A
Jul 7 '07 #10

bvdet
Expert Mod 2.5K+
P: 2,851
The first question - The code is a list comprehension. Another way to write it would be:

    values = []
    for line in lineList:
        line = line[1:]          # drop the position column
        tem = []
        for item in line:
            tem.append(float(item))
        values.append(tem)

The result:

    [[0.0, 3.6699999999999999, 0.0, 0.0], [0.0, 0.0, 3.6699999999999999, 0.0], [0.0, 0.0, 0.0, 3.6699999999999999], [0.0, 3.6699999999999999, 0.0, 0.0], [3.6699999999999999, 0.0, 0.0, 0.0], [3.46, 0.0, 0.22, 0.0], [0.0, 0.0, 3.6699999999999999, 0.0], [0.0, 0.0, 0.0, 3.6699999999999999], [0.0, 0.0, 0.0, 3.6699999999999999], [0.0, 3.6699999999999999, 0.0, 0.0], [3.6699999999999999, 0.0, 0.0, 0.0], [3.6699999999999999, 0.0, 0.0, 0.0], [0.0, 0.0, 3.6699999999999999, 0.0], [0.0, 0.0, 0.0, 3.6699999999999999], [0.0, 0.0, 3.6699999999999999, 0.0], [0.0, 3.6699999999999999, 0.0, 0.0]]
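The inner `[j[1:] for j in lineList]` only drops the position column, so the comprehension can probably be written with one level less nesting. A quick sketch (Python 3 syntax, two made-up rows in the same layout) checking that both spellings give the same result:

```python
lineList = [["01", "0.00", "3.67", "0.00", "0.00"],
            ["06", "3.46", "0.00", "0.22", "0.00"]]

# The form used in the thread: slice every row first, then convert.
nested = [[float(s) for s in item] for item in [j[1:] for j in lineList]]

# An equivalent form that slices while iterating over each row.
simpler = [[float(s) for s in row[1:]] for row in lineList]

print(nested == simpler)
print(simpler)
```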
Jul 7 '07 #11

bvdet
Expert Mod 2.5K+
P: 2,851
The second question - This is 'headerList':

    >>> headerList
    ['A', 'C', 'G', 'T']
    >>> 

The values in 'headerList' will be the keys in 'dataDict'. 'dataDict' will be the main dictionary, and its values will be subdictionaries. A dictionary key is associated with a value - in this case the value will be another dictionary. Variable 'keys' contains the subdictionary keys:

    >>> keys
    ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12', '13', '14', '15', '16']
    >>> 

'lineDict' is a temporary dictionary created to make it easier to compile the data in the form you wanted:

    >>> lineDict
    {'02': [0.0, 0.0, 3.6699999999999999, 0.0], '03': [0.0, 0.0, 0.0, 3.6699999999999999], '13': [0.0, 0.0, 3.6699999999999999, 0.0], '01': [0.0, 3.6699999999999999, 0.0, 0.0], '06': [3.46, 0.0, 0.22, 0.0], '07': [0.0, 0.0, 3.6699999999999999, 0.0], '04': [0.0, 3.6699999999999999, 0.0, 0.0], '05': [3.6699999999999999, 0.0, 0.0, 0.0], '08': [0.0, 0.0, 0.0, 3.6699999999999999], '09': [0.0, 0.0, 0.0, 3.6699999999999999], '16': [0.0, 3.6699999999999999, 0.0, 0.0], '12': [3.6699999999999999, 0.0, 0.0, 0.0], '14': [0.0, 0.0, 0.0, 3.6699999999999999], '11': [3.6699999999999999, 0.0, 0.0, 0.0], '15': [0.0, 0.0, 3.6699999999999999, 0.0], '10': [0.0, 3.6699999999999999, 0.0, 0.0]}
    >>> 

Using 'enumerate' on 'headerList', Python gives me these values:

    >>> for i, item in enumerate(headerList):
    ...     print i, item
    ...     
    0 A
    1 C
    2 G
    3 T
    >>> 

Here's an interactive example showing what is happening inside the nested for loop. Here 'item' is 'T', whose index in 'headerList' is 3:

    >>> dataDict
    {}
    >>> key
    '10'
    >>> item
    'T'
    >>> lineDict
    {'02': [0.0, 0.0, 3.6699999999999999, 0.0], '03': [0.0, 0.0, 0.0, 3.6699999999999999], '13': [0.0, 0.0, 3.6699999999999999, 0.0], '01': [0.0, 3.6699999999999999, 0.0, 0.0], '06': [3.46, 0.0, 0.22, 0.0], '07': [0.0, 0.0, 3.6699999999999999, 0.0], '04': [0.0, 3.6699999999999999, 0.0, 0.0], '05': [3.6699999999999999, 0.0, 0.0, 0.0], '08': [0.0, 0.0, 0.0, 3.6699999999999999], '09': [0.0, 0.0, 0.0, 3.6699999999999999], '16': [0.0, 3.6699999999999999, 0.0, 0.0], '12': [3.6699999999999999, 0.0, 0.0, 0.0], '14': [0.0, 0.0, 0.0, 3.6699999999999999], '11': [3.6699999999999999, 0.0, 0.0, 0.0], '15': [0.0, 0.0, 3.6699999999999999, 0.0], '10': [0.0, 3.6699999999999999, 0.0, 0.0]}
    >>> lineDict[key][3]
    0.0
    >>> dataDict[item] = {}
    >>> dataDict[item][key] = lineDict[key][3]
    >>> dataDict
    {'T': {'10': 0.0}}
    >>> 

I hope this helps you understand what is happening.
Jul 7 '07 #12

bvdet
Expert Mod 2.5K+
P: 2,851
for this code,I am getting an error which says
value error:invalid literal for float():A
Check variable 'lineList'. It should look like this:
    >>> for line in lineList:
    ...     print line
    ...     
    ['01', '0.00', '3.67', '0.00', '0.00']
    ['02', '0.00', '0.00', '3.67', '0.00']
    ['03', '0.00', '0.00', '0.00', '3.67']
    ['04', '0.00', '3.67', '0.00', '0.00']
    ['05', '3.67', '0.00', '0.00', '0.00']
    ['06', '3.46', '0.00', '0.22', '0.00']
    ['07', '0.00', '0.00', '3.67', '0.00']
    ['08', '0.00', '0.00', '0.00', '3.67']
    ['09', '0.00', '0.00', '0.00', '3.67']
    ['10', '0.00', '3.67', '0.00', '0.00']
    ['11', '3.67', '0.00', '0.00', '0.00']
    ['12', '3.67', '0.00', '0.00', '0.00']
    ['13', '0.00', '0.00', '3.67', '0.00']
    ['14', '0.00', '0.00', '0.00', '3.67']
    ['15', '0.00', '0.00', '3.67', '0.00']
    ['16', '0.00', '3.67', '0.00', '0.00']
    >>> 
Jul 7 '07 #13

100+
P: 111
The header list is supposed to take the first four letters, right? Like A, T, G, C. That is not happening,
and my lineList contains the entire file (I mean including the A, T, G, C line).

Jul 8 '07 #14

100+
P: 111
    fn = 'half.txt'
    fn_=open("half.txt","r")
    file_content=fn_.readlines()

    for line in file_content:
        linelist=line.strip().split()
        print linelist

        headerList = linelist.pop(0)[1:]
        print headerList

When I do this, my code prints a headerList which has
0
1
2
3
4
5
6
7
8
9
0
1
2
3
4
5
6
Why is this happening?
And what does strip() do?
Jul 8 '07 #15

100+
P: 111
headerList = linelist.pop(0)[1:]
WHAT DOES THIS LINE DO?


Jul 8 '07 #16

bartonc
Expert 5K+
P: 6,596
<snip>what does the strip( ) do?
strip() removes whitespace (or any characters you tell it to) from BOTH ends of a string. Like this:

    >>> s = "   had_whitespace   "
    >>> t = s.strip()
    >>> print repr(t)  # repr() returns the string representation of a variable
    'had_whitespace'
    >>> 
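A few more variations, as a hedged sketch in Python 3 syntax; the sample strings are made up to resemble one line of the data file:

```python
line = "  01\t0.00\t3.67 \n"

both_ends = line.strip()            # whitespace removed from both ends
newline_only = line.rstrip("\n")    # only the trailing newline removed
slashes = "//bap//".strip("/")      # strip a specific character instead
fields = line.strip().split()       # the strip-then-split idiom used in this thread

print(repr(both_ends))
print(repr(newline_only))
print(repr(slashes))
print(fields)
```

Note that split() with no arguments already ignores leading and trailing whitespace, so calling strip() before it is harmless but not strictly needed.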
Jul 8 '07 #17

100+
P: 111
headerList = linelist.pop(0)[1:]
WHAT DOES THIS LINE DO?



Jul 8 '07 #18

bartonc
Expert 5K+
P: 6,596
headerList = linelist.pop(0)[1:]
WHAT DOES THIS LINE DO?
Since I didn't write that, I'll just have to do my best to describe it:
pop(0) removes the zeroth element from linelist and returns it (that element looks like another list). The result is held in an unseen, temporary value. [1:] takes all but the zeroth element of that temporary value, and that shortened list is stored in headerList. You can check it out by doing something like

    temp = linelist.pop(0)
    print temp
    headerList = temp[1:]   # this is called a "slice" of the temp list
    print headerList
Hope that helps.
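The same expression behaves differently depending on what the list holds, which may explain the single digits reported earlier in the thread. A small sketch in Python 3 syntax; the sample data and the names `rows`, `fields` and `first` are assumptions for illustration:

```python
# Case 1: a list of rows, where each row is itself a list of strings.
rows = [["PO", "A", "C", "G", "T"],
        ["01", "0.00", "3.67", "0.00", "0.00"]]
header_row = rows.pop(0)     # removes and returns the first row
header = header_row[1:]      # list slice: everything after 'PO'
print(header)

# Case 2: a single row (a list of strings), as built inside a per-line loop.
fields = "01 0.00 3.67 0.00 0.00".split()
first = fields.pop(0)        # removes and returns the string '01'
print(first[1:])             # string slice: only the second character remains
```

In the second case the slice is applied to a string, so each pass of a per-line loop yields one character of the position label rather than the column names.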
Jul 8 '07 #19

100+
P: 111
Thanks for that. Could you give me your e-mail address? I am not able to paste my exact file here and I am stuck with this problem. Please help!
Jul 8 '07 #20

bartonc
Expert 5K+
P: 6,596
thanks for that . but could you give ur mail id..because I am not able to paste my exact file here and am stuck with this problem.plz help!
Paste what you can here and I'll format it for you.
Jul 8 '07 #21

100+
P: 111
Paste what you can here and I'll format it for you.
NA bap
PO A C G T
01 0.00 3.67 0.00 0.00
02 0.00 0.00 3.67 0.00
03 0.00 0.00 0.00 3.67
04 0.00 3.67 0.00 0.00
05 3.67 0.00 0.00 0.00
06 3.46 0.00 0.22 0.00
07 0.00 0.00 3.67 0.00
08 0.00 0.00 0.00 3.67
09 0.00 0.00 0.00 3.67
10 0.00 3.67 0.00 0.00
11 3.67 0.00 0.00 0.00
12 3.67 0.00 0.00 0.00
13 0.00 0.00 3.67 0.00
14 0.00 0.00 0.00 3.67
15 0.00 0.00 3.67 0.00
16 0.00 3.67 0.00 0.00
//
//
NA bcd
PO A C G T
01 42.55 8.75 145.86 8.14
02 0.14 0.53 204.64 0.00
03 126.83 78.02 0.11 0.34
04 0.21 0.17 0.00 204.92
05 0.00 12.38 0.43 192.50
06 174.48 0.95 1.32 28.56
07 79.53 4.70 100.44 20.64
//
//
NA bin
PO A C G T
01 0.45 8.27 0.00 11.39
02 0.00 0.00 10.02 10.09
03 5.80 1.39 0.00 12.93
04 12.33 5.18 2.60 0.00
05 12.43 0.00 0.00 7.68
06 18.55 0.00 1.57 0.00
07 0.05 0.58 0.00 19.48
08 20.11 0.00 0.00 0.00
09 20.06 0.05 0.00 0.00
10 20.11 0.00 0.00 0.00
11 0.00 15.33 0.00 4.78
12 20.06 0.05 0.00 0.00
13 14.99 0.35 4.78 0.00
14 13.64 2.42 3.37 0.68
15 5.03 0.00 15.08 0.00
16 7.23 0.45 10.94 1.49
//
//
I want to add one to each and every element of the file (except, of course, the first column, which gives the position),
and I want to access the elements like A[01], C[02], etc.
Jul 8 '07 #22

bartonc
Expert 5K+
P: 6,596
Lacking your working program pasted here, I'm afraid that the best I can tell you is it should be as simple as:

    temp = A['01']   # the position keys are strings, so they need quotes
    print temp
    A['01'] = temp + 1
    print A['01']

or even

    print A['01']
    A['01'] += 1
    print A['01']

in a loop through your data structure.
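With the dictionary-of-dictionaries layout built earlier in the thread, that loop might look like the sketch below (Python 3 syntax; the small `data` dictionary is inlined here as a stand-in for the real one, and its name is an assumption):

```python
# Stand-in for the structure {column name: {position: value}}.
data = {
    "A": {"01": 0.00, "02": 3.67},
    "C": {"01": 3.67, "02": 0.00},
}

# Add one to every value. Only values change, never the set of keys,
# so it is safe to do this while iterating over each sub-dictionary.
for column in data.values():
    for position in column:
        column[position] += 1

print(data["A"]["01"])
print(data["C"]["01"])
```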
Jul 8 '07 #23

100+
P: 111
    file_=open("half1.txt","r")
    file_content=file_.readlines()
    linelist=[line.strip().split() for line in file_content if line != '\n']

    headerlist=linelist.pop(0)[1:]

    keys=[i[0] for i in linelist]

    values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]

    linedict =dict(zip(keys,values))

    datadict={}
    for i,item in enumerate(headerlist):
        datadict[item]={}
        for key in linedict:
            print key
            datadict[item][key]=linedict[key][i]
            print datadict['A']['01']

When I execute it, I get an error that says
KeyError: '01'

<removed quote=bvdet to make this post appear.>

Jul 8 '07 #24

bartonc
Expert 5K+
P: 6,596
    file_=open("half1.txt","r")
    file_content=file_.readlines()
    linelist=[line.strip().split() for line in file_content if line != '\n']

    headerlist=linelist.pop(0)[1:]

    keys=[i[0] for i in linelist]

    values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]

    linedict =dict(zip(keys,values))

    # Form the data dictionary first:
    datadict={}
    for i,item in enumerate(headerlist):
        datadict[item]={}
        for key in linedict:
            datadict[item][key]=linedict[key][i]

    # Then loop through the data structure. But since you have turned
    # the data sideways, I can not see the structure at this moment.
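One likely cause of the KeyError in the previous post is the print inside the inner loop: it looks up datadict['A']['01'] before the key '01' has necessarily been stored, because the loop visits the keys of linedict in whatever order the dictionary yields them. A small sketch (Python 3 syntax, with a tiny hand-made linedict as an assumption) with the lookup moved after the loops:

```python
# '02' is deliberately listed first to mimic a dictionary whose
# first key is not '01'.
linedict = {"02": [0.0, 0.0], "01": [0.0, 3.67]}
headerlist = ["A", "C"]

datadict = {}
for i, item in enumerate(headerlist):
    datadict[item] = {}
    for key in linedict:
        datadict[item][key] = linedict[key][i]
        # Looking up datadict['A']['01'] here would fail on the first
        # pass, because only '02' has been stored at that point.

# Once the loops have finished, every key is present.
print(datadict["A"]["01"])
print(datadict["C"]["01"])
```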
Jul 8 '07 #25

100+
P: 111
    file_=open("half1.txt","r")
    file_content=file_.readlines()
    linelist=[line.strip().split() for line in file_content if line != '\n']

    headerlist=linelist.pop(0)[1:]
    print headerlist

    keys=[i[0] for i in linelist]

    values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]

    linedict =dict(zip(keys,values))

    datadict={}
    for i,item in enumerate(headerlist):
        datadict[item]={}
        for key in linedict:
            print key
            datadict[item][key]=linedict[key][i]
            print datadict['A']['01']

This is my program. I am getting an error that says
KeyError: '01', and I do want to add one to all the elements
(I mean the float values).
Can you see the error?
Jul 8 '07 #26

bartonc
Expert 5K+
P: 6,596
I've answered above.

You must learn to use code tags. Instructions are on the right-hand side of the page when you are making your post (or reply). There is also a lot of helpful information in the How to ask a question section of our Posting Guidelines. Thanks.
Jul 8 '07 #27

bvdet
Expert Mod 2.5K+
P: 2,851
Lacking your working program pasted here, I'm afraid that the best I can tell you is it should be as simple as:
[code]
temp = datadict['A']['01']
print temp
datadict['A']['01'] = temp + 1
print datadict['A']['01']
[/code]
or even
[code]
print datadict['A']['01']
datadict['A']['01'] += 1
print datadict['A']['01']
[/code]
in a loop through your data structure.
It looks like you have three data sets in the file. Which one do you want to work on?

Notice that I simplified my example code by eliminating some of the lines in the data file. It appears that you have not adjusted the code to account for that.
Jul 8 '07 #28

bvdet
Expert Mod 2.5K+
P: 2,851
I have modified the first part of my example code to read the first data set in the OP file:
[code]
fn = r'H:\TEMP\temsys\data9.txt'
f = open(fn)

line = f.next()
while not line.startswith('PO'):
    line = f.next()

headerList = line.strip().split()[1:]
lineList = []

line = f.next().strip()
while not line.startswith('/'):
    if line != '':
        lineList.append(line.strip().split())
    line = f.next().strip()

f.close()
[/code]
This will add a set amount to every element in the data:
[code]
# Add 1.0 to every element in dataDict subdictionaries
for keyMain in dataDict:
    for keySub in dataDict[keyMain]:
        dataDict[keyMain][keySub] += 1.0
[/code]
[code]
>>> dataDict
{'A': {'02': 1.0, '03': 1.0, '13': 1.0, '01': 1.0, '06': 4.46, '07': 1.0, '04': 1.0, '05': 4.6699999999999999, '08': 1.0, '09': 1.0, '10': 1.0, '16': 1.0, '11': 4.6699999999999999, '15': 1.0, '12': 4.6699999999999999, '14': 1.0}, 'C': {'02': 1.0, '03': 1.0, '13': 1.0, '01': 4.6699999999999999, '06': 1.0, '07': 1.0, '04': 4.6699999999999999, '05': 1.0, '08': 1.0, '09': 1.0, '10': 4.6699999999999999, '16': 4.6699999999999999, '11': 1.0, '15': 1.0, '12': 1.0, '14': 1.0}, 'T': {'02': 1.0, '03': 4.6699999999999999, '13': 1.0, '01': 1.0, '06': 1.0, '07': 1.0, '04': 1.0, '05': 1.0, '08': 4.6699999999999999, '09': 4.6699999999999999, '10': 1.0, '16': 1.0, '11': 1.0, '15': 1.0, '12': 1.0, '14': 4.6699999999999999}, 'G': {'02': 4.6699999999999999, '03': 1.0, '13': 4.6699999999999999, '01': 1.0, '06': 1.22, '07': 4.6699999999999999, '04': 1.0, '05': 1.0, '08': 1.0, '09': 1.0, '10': 1.0, '16': 1.0, '11': 1.0, '15': 4.6699999999999999, '12': 1.0, '14': 1.0}}
>>> 
[/code]
Jul 8 '07 #29

100+
P: 111
[code]
f=open("weight_matrix.transfac.txt","r")
line=f.next()
while not line.startswith('PO'):
    line=f.next()

headerlist=line.strip().split()[1:]
linelist=[]

line=f.next().strip()
while not line.startswith('/'):
    if line != '':
        linelist.append(line.strip().split())
    line=f.next().strip()

keys=[i[0] for i in linelist]
values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]

linedict=dict(zip(keys,values))
datadict={}
for i,item in enumerate(headerlist):
    datadict[item]={}
    for key in linedict:
        datadict[item][key]=linedict[key][i]
        for keymain in datadict:
            for keysub in datadict[keymain]:
                datadict[keymain][keysub]+=1.0
                print datadict
[/code]
So here is the code that you suggested for creating dictionaries from a file (matrix).
Now I have to do something like this:
supposing the sequence entered is something like "ATATTA", A is in the first position, so A[01]*T[02]*A[03]*T[04]*T[05]*A[06]=??
The sequence is going to be entered by the user every time (so it will keep changing).
How do I do this? What changes should I make? Hope I am clear!
This is the file containing the matrix:
NA bap
PO A C G T
01 0.00 3.67 0.00 0.00
02 0.00 0.00 3.67 0.00
03 0.00 0.00 0.00 3.67
04 0.00 3.67 0.00 0.00
05 3.67 0.00 0.00 0.00
06 3.46 0.00 0.22 0.00
07 0.00 0.00 3.67 0.00
08 0.00 0.00 0.00 3.67
09 0.00 0.00 0.00 3.67
10 0.00 3.67 0.00 0.00
11 3.67 0.00 0.00 0.00
12 3.67 0.00 0.00 0.00
13 0.00 0.00 3.67 0.00
14 0.00 0.00 0.00 3.67
15 0.00 0.00 3.67 0.00
16 0.00 3.67 0.00 0.00
//
//
NA bcd
PO A C G T
01 42.55 8.75 145.86 8.14
02 0.14 0.53 204.64 0.00
03 126.83 78.02 0.11 0.34
04 0.21 0.17 0.00 204.92
05 0.00 12.38 0.43 192.50
06 174.48 0.95 1.32 28.56
07 79.53 4.70 100.44 20.64
//
//
NA bin
PO A C G T
01 0.45 8.27 0.00 11.39
02 0.00 0.00 10.02 10.09
03 5.80 1.39 0.00 12.93
04 12.33 5.18 2.60 0.00
05 12.43 0.00 0.00 7.68
06 18.55 0.00 1.57 0.00
07 0.05 0.58 0.00 19.48
08 20.11 0.00 0.00 0.00
09 20.06 0.05 0.00 0.00
10 20.11 0.00 0.00 0.00
11 0.00 15.33 0.00 4.78
12 20.06 0.05 0.00 0.00
13 14.99 0.35 4.78 0.00
14 13.64 2.42 3.37 0.68
15 5.03 0.00 15.08 0.00
16 7.23 0.45 10.94 1.49
//
//
Jul 9 '07 #30

bartonc
Expert 5K+
P: 6,596
<Mod NOTE: merged duplicate threads - again>
HERE IS MY code!!plz help how to proceed??
Thanks for using code tags.
First thing to do is try it. It looks like it should work.

Right after that, READ the rest of the POSTING GUIDELINES. You have committed several infractions and must consider yourself warned.
Jul 9 '07 #32

100+
P: 111
I am sorry about that. I will follow the rules henceforth. But I don't know how to proceed further with my problem.
Jul 9 '07 #33

bartonc
Expert 5K+
P: 6,596
Ok. All is forgiven. Here's the thing. I can't give you working code.
I can give you examples of what I see going on here so that you can try things on your own. It looks to me like you could multiply the specified elements (after your matrix has been created) like this:
[code]
>>> # dd is the nested dictionary built earlier (datadict in your code)
>>> seq = "ATATTA"
>>> res = 1
>>> for i, key in enumerate(seq):
...     res *= dd[key]["%02d" %(i + 1)]
...
>>> print res
[/code]
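The "%02d" part is what turns the loop counter into the zero-padded string keys the matrix uses. A small sketch with a made-up two-base matrix (the numbers are hypothetical, chosen so the product is easy to check by hand), written with print() so it also runs on Python 3:

```python
# Hypothetical matrix, shaped like the thread's nested dictionary.
dd = {'A': {'01': 2.0, '02': 1.0, '03': 4.0},
      'T': {'01': 1.0, '02': 3.0, '03': 1.0}}

seq = "ATA"
res = 1.0
for i, key in enumerate(seq):
    # enumerate counts from 0; "%02d" pads to two digits: 0 -> '01', 1 -> '02'
    position = "%02d" % (i + 1)
    res *= dd[key][position]

print(res)   # 2.0 * 3.0 * 4.0 = 24.0
```

Because the position comes from the sequence index, the same loop works for whatever sequence the user types, as long as it is no longer than the matrix.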
Jul 9 '07 #34

100+
P: 111
Thanks a lot for all the help! I still have a small doubt. What does this line do?
[code]
while not  line.startswith('PO'):
    line=f.next()
    print line
[/code]
I mean, this is supposed to refer to the lines which don't start with 'PO', right? When I say print line, it is printing
PO A C G T
but this is not what it means, right?
Jul 9 '07 #35

bvdet
Expert Mod 2.5K+
P: 2,851
f.next() merely advances one line in the file. The object is to advance to the first line that starts with "PO".
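A small sketch of that skipping loop, using a list of strings in place of the open file (the three lines are stand-ins for the start of the data file) and the built-in next(), which plays the same role as f.next():

```python
lines = iter(["NA bap\n", "PO A C G T\n", "01 0.00 3.67 0.00 0.00\n"])

line = next(lines)
# Keep reading while the current line is NOT the header we are looking for.
while not line.startswith('PO'):
    line = next(lines)

# The loop stops as soon as the header has been read, so 'line' now holds it.
print(line.strip())   # PO A C G T
```

A print placed inside the loop body runs right after each read, which is why the last thing it shows is the 'PO' line itself.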

I asked you a question in an earlier post about your data. It looks like there are three data sets in your data file. The example code I posted only parses the first data set.
Jul 9 '07 #36

100+
P: 111
Okay, then why do we say
'while not'? It should be 'while', right?
And yes, there are actually a lot of data sets in my file, around 15 to 20. First I am trying to make it work for one; then I will do it for the rest. (Sorry for not answering, I am really going crazy with the program.) I owe you an apology.
And I have one more doubt:
[code]
datadict={}
for i,item in enumerate(headerlist):
    datadict[item]={}
    for key in linedict:
        print key
        # when I put a print key here, it prints the keys twice and in no particular order. Why is this happening?
        datadict[item][key]=linedict[key][i]
        for keymain in datadict:
            for keysub in datadict[keymain]:
                datadict[keymain][keysub]+=1.0
                print datadict
[/code]
Looking forward to your reply!
Cheers!
Jul 9 '07 #37

bvdet
Expert Mod 2.5K+
P: 2,851
It should be 'while not'. We want to skip the lines until the line starts with 'PO'. You can use this same method to advance into later data sets.
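One possible way to carry the idea over to every data set is sketched below. It assumes each set starts with an 'NA name' line, is followed by a 'PO' header line, and ends with '//' lines, as in the file posted in this thread. The function name parse_matrices and the shortened sample are made up for the illustration; the code uses print() so it also runs on Python 3.

```python
def parse_matrices(lines):
    """Return {name: {base: {position: value}}} for every data set."""
    matrices = {}
    name = None
    headers = []
    for raw in lines:
        fields = raw.split()
        if not fields or fields[0].startswith('/'):
            continue                      # blank line or '//' separator
        if fields[0] == 'NA':
            name = fields[1]              # a new data set begins
        elif fields[0] == 'PO':
            headers = fields[1:]          # column names, e.g. A C G T
            matrices[name] = dict((h, {}) for h in headers)
        else:
            position = fields[0]          # row label such as '01'
            for h, s in zip(headers, fields[1:]):
                matrices[name][h][position] = float(s)
    return matrices

# Shortened sample: the first two rows of two of the posted data sets.
sample = """NA bap
PO A C G T
01 0.00 3.67 0.00 0.00
02 0.00 0.00 3.67 0.00
//
//
NA bcd
PO A C G T
01 42.55 8.75 145.86 8.14
02 0.14 0.53 204.64 0.00
//
//
""".splitlines()

matrices = parse_matrices(sample)
print(sorted(matrices))             # ['bap', 'bcd']
print(matrices['bcd']['G']['02'])   # 204.64
```

With a real file, passing the open file object instead of the sample list should work the same way, since the function only iterates over lines.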

No problem about not answering.

Dictionaries are unordered collections of data. You can print in an orderly fashion like this:
[code]
keys = lineDict.keys()
keys.sort()
for key in keys:
    print '%s = %s' % (key, lineDict[key])
[/code]
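The same ordered printing can be sketched with the built-in sorted(), which also works on Python 3, where keys() no longer returns a list with a sort() method. The small lineDict below is a hand-typed stand-in, not the real data:

```python
lineDict = {'10': [0.0, 3.67], '02': [0.0, 0.0], '01': [0.0, 3.67]}

# sorted() returns the keys as a new, ordered list.
ordered = sorted(lineDict)
for key in ordered:
    print('%s = %s' % (key, lineDict[key]))
```

The keys are strings, so they sort alphabetically. That matches the numeric order here only because the labels are zero-padded ('02' sorts before '10').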
Jul 9 '07 #38

100+
P: 111
My file:
NA bap
PO A C G T
01 0.00 3.67 0.00 0.00
02 0.00 0.00 3.67 0.00
03 0.00 0.00 0.00 3.67
04 0.00 3.67 0.00 0.00
05 3.67 0.00 0.00 0.00
06 3.46 0.00 0.22 0.00
07 0.00 0.00 3.67 0.00
08 0.00 0.00 0.00 3.67
09 0.00 0.00 0.00 3.67
10 0.00 3.67 0.00 0.00
11 3.67 0.00 0.00 0.00
12 3.67 0.00 0.00 0.00
13 0.00 0.00 3.67 0.00
14 0.00 0.00 0.00 3.67
15 0.00 0.00 3.67 0.00
16 0.00 3.67 0.00 0.00
//
//



This is my code:
[code]
f=open("deeps1.txt","r")
line=f.next()
while not line.startswith('PO'):
    line=f.next()

headerlist=line.strip().split()[1:]
linelist=[]

line=f.next().strip()
while not line.startswith('/'):
    if line != '':
        linelist.append(line.strip().split())
    line=f.next().strip()

keys=[i[0] for i in linelist]
values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]
array={}
linedict=dict(zip(keys,values))
keys = linedict.keys()
keys.sort()
for key in keys:
    array=[key,linedict[key]]

datadict={}
datadict1={}
for i,item in enumerate(headerlist):
    datadict[item]={}
    print datadict[item]#this one returns empty dictionary
    for key_ in linedict:
        print item# all items are getting printed
        datadict[item][key_]=linedict[key_][i]
        # but for the print statement below its saying key error:'C'
        print datadict['C']['01']
[/code]
I have written the problems as comments in the code!

Can you execute my code and see why this is happening? I have been stuck on this for a day now.
Waiting for your reply,
cheers!
Jul 9 '07 #39

bartonc
Expert 5K+
P: 6,596
For that to work, you'll have to print every new addition to the datadict.
[code]
        # print every item going into the datadict because 'C' has not been created yet
        print linedict[key_][i]
[/code]
Jul 9 '07 #40

bvdet
Expert Mod 2.5K+
P: 2,851
I made a few minor changes to your code:
[code]
f=open("deeps1.txt","r")
line=f.next()
while not line.startswith('PO'):
    line=f.next()

headerlist=line.strip().split()[1:]
linelist=[]

line=f.next().strip()
while not line.startswith('/'):
    if line != '':
        linelist.append(line.strip().split())
    line=f.next().strip()

keys=[i[0] for i in linelist]
values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]
array={}
linedict=dict(zip(keys,values))
keys = linedict.keys()
keys.sort()
# initialize list object 'array'
array = []
for key in keys:
    # append each item to list object
    array.append([key,linedict[key]])

# initialize dictionary
datadict={}

for i,item in enumerate(headerlist):
    datadict[item]={}
    # print 'item' here, datadict[item] is empty at this point
    print item
    for key_ in linedict:
        datadict[item][key_]=linedict[key_][i]

# Change indentation level
print datadict['C']['01']
[/code]
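Why the indentation matters can be seen in a small sketch (hand-typed stand-in data with only two columns and one row, not the real file). While the 'A' column is being filled, datadict has no 'C' entry yet, so a lookup of datadict['C'] inside the loop fails on the first pass but succeeds once the loops are done:

```python
headerlist = ['A', 'C']
linedict = {'01': [0.0, 3.67]}

datadict = {}
seen = []   # records whether 'C' already existed at each inner step
for i, item in enumerate(headerlist):
    datadict[item] = {}
    for key_ in linedict:
        datadict[item][key_] = linedict[key_][i]
        seen.append('C' in datadict)

print(seen)                  # [False, True]
print(datadict['C']['01'])   # 3.67
```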
Jul 9 '07 #41

100+
P: 111
Sorry,
that doesn't seem to do anything.
[code]
f=open("deeps1.txt","r")
line=f.next()
while not line.startswith('PO'):
    line=f.next()

headerlist=line.strip().split()[1:]
linelist=[]

line=f.next().strip()
while not line.startswith('/'):
    if line != '':
        linelist.append(line.strip().split())
    line=f.next().strip()

keys=[i[0] for i in linelist]
values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]
array={}
linedict=dict(zip(keys,values))
keys = linedict.keys()
keys.sort()
for key in keys:
    array=[key,linedict[key]]

datadict={}
datadict1={}
for i,item in enumerate(headerlist):
    datadict[item]={}
    print datadict[item]
    for key_ in linedict:

        datadict[item][key_]=linedict[key_][i]
        print linedict[key][i]#this is what i added and it still gives the same answer
        print datadict['C']['01']
[/code]
:( Or should I add it somewhere else?
Waiting for your reply,
cheers!
Jul 9 '07 #42

bvdet
Expert Mod 2.5K+
P: 2,851
CODE TAGS! This code:
[code]
for i,item in enumerate(headerlist):
    datadict[item]={}
    print item
    for key_ in linedict:
        datadict[item][key_]=linedict[key_][i]
        print linedict[key_][i]

print datadict['C']['01']
[/code]
gives me this output:
[code]
>>> A
0.0
0.0
0.0
0.0
3.46
0.0
0.0
3.67
0.0
0.0
0.0
3.67
0.0
3.67
0.0
0.0
C
0.0
0.0
0.0
3.67
0.0
0.0
3.67
0.0
0.0
0.0
3.67
0.0
0.0
0.0
0.0
3.67
G
3.67
0.0
3.67
0.0
0.22
3.67
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
3.67
0.0
T
0.0
3.67
0.0
0.0
0.0
0.0
0.0
0.0
3.67
3.67
0.0
0.0
3.67
0.0
0.0
0.0
3.67
>>> 
[/code]
Jul 9 '07 #43

100+
P: 111
Thank you, sir! That works for me too. But what I want when I type
datadict['C']['01'] is 3.67 (according to my file, printed below). Instead it is printing the entire thing.
This is what my file says:
NA bap
PO A C G T
01 0.00 3.67 0.00 0.00
02 0.00 0.00 3.67 0.00
03 0.00 0.00 0.00 3.67
04 0.00 3.67 0.00 0.00
05 3.67 0.00 0.00 0.00
06 3.46 0.00 0.22 0.00
07 0.00 0.00 3.67 0.00
08 0.00 0.00 0.00 3.67
09 0.00 0.00 0.00 3.67
10 0.00 3.67 0.00 0.00
11 3.67 0.00 0.00 0.00
12 3.67 0.00 0.00 0.00
13 0.00 0.00 3.67 0.00
14 0.00 0.00 0.00 3.67
15 0.00 0.00 3.67 0.00
16 0.00 3.67 0.00 0.00
//
//
So how do I change the code accordingly?
[code]
f=open("deeps1.txt","r")
line=f.next()
while not line.startswith('PO'):
    line=f.next()

headerlist=line.strip().split()[1:]
linelist=[]

line=f.next().strip()
while not line.startswith('/'):
    if line != '':
        linelist.append(line.strip().split())
    line=f.next().strip()

keys=[i[0] for i in linelist]
values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]
array={}
linedict=dict(zip(keys,values))
keys = linedict.keys()
keys.sort()
for key in keys:
    array=[key,linedict[key]]

datadict={}
datadict1={}
for i,item in enumerate(headerlist):
    datadict[item]={}
    print item
    for key_ in linedict:
        datadict[item][key_]=linedict[key_][i]
        print linedict[key_][i]
print datadict['C']['01']
[/code]
Waiting,
cheers!
Jul 9 '07 #44

bvdet
Expert Mod 2.5K+
P: 2,851
I tested your code exactly as you posted it:
[code]
>>> datadict['C']['01']
3.6699999999999999
>>> 
[/code]
The rest of what you see comes from the other print statements in the code.
Jul 9 '07 #45

100+
P: 111
Thanks for all the help. I wouldn't have come this far without the help of all of you.
There is a new problem:
[code]
f=open("deeps1.txt","r")
line=f.next()
while not line.startswith('PO'):
    line=f.next()

headerlist=line.strip().split()[1:]
linelist=[]

line=f.next().strip()
while not line.startswith('/'):
    if line != '':
        linelist.append(line.strip().split())
    line=f.next().strip()

keys=[i[0] for i in linelist]
values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]
array={}
linedict=dict(zip(keys,values))
keys = linedict.keys()
keys.sort()
for key in keys:
    array=[key,linedict[key]]

datadict={}
datadict1={}
for i,item in enumerate(headerlist):
    datadict[item]={}
    for key_ in linedict:
        datadict[item][key_]=linedict[key_][i]

for keymain in datadict:
    for keysub in datadict[keymain]:
        datadict[keymain][keysub]+=1.0
#print datadict['T']['16']
seq="ATA"
res=1
for i in range(1,len(seq)):
    key=seq[i]
    for keymain in datadict:
        if keymain==key:
            print key,i
  #print datadict[key]
#print res
[/code]
This is the code. As I already posted, I want to find something like
A[01]*T[02]*A[03]
But the problem I am facing is that the datadict has keys like "01", "02", while in the loop over seq I have 1, 2, 3, 4, and I can't start from zero. How can I make the loop over my seq produce 01, 02, etc.? If I say
for i in range('01',len(seq)):
it takes it as a string!
Waiting for your reply,
cheers!!
Jul 10 '07 #46

100+
P: 111
Oops, I am sorry!
The one you suggested is working.
This is the code:
[code]
f=open("deeps1.txt","r")
line=f.next()
while not line.startswith('PO'):
    line=f.next()

headerlist=line.strip().split()[1:]
linelist=[]

line=f.next().strip()
while not line.startswith('/'):
    if line != '':
        linelist.append(line.strip().split())
    line=f.next().strip()

keys=[i[0] for i in linelist]
values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]
array={}
linedict=dict(zip(keys,values))
keys = linedict.keys()
keys.sort()
for key in keys:
    array=[key,linedict[key]]

datadict={}
datadict1={}
for i,item in enumerate(headerlist):
    datadict[item]={}
    for key_ in linedict:
        datadict[item][key_]=linedict[key_][i]

for keymain in datadict:
    for keysub in datadict[keymain]:
        datadict[keymain][keysub]+=1.0
#print datadict['T']['16']
seq="ATATT"
res=1
for i,key in enumerate (seq):
    res*=datadict[key]["%02d"%(i+1)]# I dont understand this line.the formatting especially
print res
[/code]
This seems to work for the first letter only, "A".
Waiting for your reply,
cheers!
Jul 10 '07 #47

100+
P: 111
Sorry!!
It's working :)
Cheers!!
Jul 10 '07 #48

100+
P: 111
I am doing a simple task here and I am getting an error. I can't understand why that is happening!
I am trying to find a score for a sequence after creating all the dictionaries:
if seq="ACGT",
the value of A and T is 0.3,
the value of C and G is 0.2,
score=val(A)*val(C)*val(G)*val(T)
so it should be score=0.3*0.2*0.2*0.3=0.0036
My error is mentioned as a comment in the last lines of the code.
This is my code:
[code]
f=open("deeps1.txt","r")
line=f.next()
while not line.startswith('PO'):
    line=f.next()

headerlist=line.strip().split()[1:]
linelist=[]

line=f.next().strip()
while not line.startswith('/'):
    if line != '':
        linelist.append(line.strip().split())
    line=f.next().strip()

keys=[i[0] for i in linelist]
values=[[float(s) for s in item] for item in [j[1:] for j in linelist]]
array={}
linedict=dict(zip(keys,values))
keys = linedict.keys()
keys.sort()
for key in keys:
    array=[key,linedict[key]]

datadict={}
datadict1={}
for i,item in enumerate(headerlist):
    datadict[item]={}
    for key_ in linedict:
        datadict[item][key_]=linedict[key_][i]

for keymain in datadict:
    for keysub in datadict[keymain]:
        datadict[keymain][keysub]+=1.0
#print datadict['T']['16']
seq="CGTCAG"

res=1
for i in range(0,len(seq)):
    key=seq[i]
    res*=datadict[key]["%02d"%(i+1)]
    #print res
    score=1
    value={"A":"0.3","T":"0.3","C":"0.2","G":"0.2"}
    for it in value:
        for item in seq:
            if it==key:
                score=score*value[it]
                print score# I get an error that says TypeError: can't multiply sequence by non-int of type 'str'


print res
[/code]
Waiting for your reply,
cheers!
Jul 10 '07 #49

bvdet
Expert Mod 2.5K+
P: 2,851
You are receiving the error because the values in dictionary 'value' are strings. Either define them as numbers (e.g. "A":0.3,"T":0.3,....) or convert to float in the calculation:
[code]
score=score*float(value[it])
[/code]
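As a rough sketch of the first option (numbers in the dictionary instead of strings), the score for the "ACGT" example from the question could be computed as below. It looks each base up directly, which is a simplification of the nested loops in the posted code and not a drop-in replacement for them; print() is used so it also runs on Python 3.

```python
# Values from the question, stored as numbers rather than strings.
value = {"A": 0.3, "T": 0.3, "C": 0.2, "G": 0.2}

seq = "ACGT"
score = 1.0
for base in seq:
    score *= value[base]   # one factor per base in the sequence

# Rounded for display, since float multiplication is not exact.
print(round(score, 6))     # 0.0036
```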
Jul 10 '07 #50
