
# Concatenating dictionary values and keys, and further operations

I wrote the following code to concatenate every 2 keys of a dictionary and their corresponding values. E.g. if I have tiDict1 = {'a':[1,2],'b':[3,4,5]} I should get tiDict2 = {'ab':[1,2][3,4,5]}, and similarly for dicts with a larger number of features.

Now I want to check each pair to see if they are connected. One element of each pair will be from the first list and one from the second, e.g. for 'ab' I want to check if 1 and 3 are connected, then 1 and 4, then 1 and 5, then 2 and 3, then 2 and 4, then 2 and 5.

The information about which elements are connected is in a text file as follows:

    1,'a',2,'b'
    3,'a',5,'a'
    3,'a',6,'a'
    3,'a',7,'b'
    8,'a',7,'b'
    ..
    ..

This means 1 (type 'a') and 2 (type 'b') are connected, 3 and 5 are connected, and so on. I am not able to figure out how to do this. Any pointers would be helpful.

Here is the code I have written till now:

    def genTI(tiDict):
        tiDict1 = {}
        tiList = [tiDict1.keys(),tiDict1.values()]
        length =len(tiDict1.keys())-1
        for i in range(0,length,1):
            for j in range(0,length,1):
                for k in range(1,length+1,1):
                    if j+k <= length:
                        key = tiList[i][j] + tiList[i][j+k]
                        value = [tiList[i+1][j],tiList[i+1][j+k]]
                        tiDict2[key] = value
                        continue
                    continue
                continue
        return tiDict2

Thanks in advance,
girish

Jun 5 '06 #1
11 Replies

Girish Sahani wrote:

> I wrote the following code to concatenate every 2 keys of a dictionary and their corresponding values. E.g. if I have tiDict1 = {'a':[1,2],'b':[3,4,5]} I should get tiDict2 = {'ab':[1,2][3,4,5]}, and similarly for dicts with a larger number of features.
>
> Now I want to check each pair to see if they are connected. One element of each pair will be from the first list and one from the second, e.g. for 'ab' I want to check if 1 and 3 are connected, then 1 and 4, then 1 and 5, then 2 and 3, then 2 and 4, then 2 and 5.
>
> The information about which elements are connected is in a text file as follows:
>
>     1,'a',2,'b'
>     3,'a',5,'a'
>     3,'a',6,'a'
>     3,'a',7,'b'
>     8,'a',7,'b'
>     .
>     .
>
> This means 1 (type 'a') and 2 (type 'b') are connected, 3 and 5 are connected, and so on. I am not able to figure out how to do this. Any pointers would be helpful.

Girish

It seems you want the Cartesian product of every pair of lists in the dictionary, including the product of lists with themselves (but you don't say why ;-)).

I'm not sure the following is exactly what you want or if it is very efficient, but maybe it will start you off. It uses a function 'xcombine' taken from a recipe in the ASPN cookbook by David Klaffenbach (2004). (It should give every possibility, which you then check in your file.)

Gerard

    def nkRange(n,k):
        m = n - k + 1
        indexer = range(0, k)
        vector = range(1, k+1)
        last = range(m, n+1)
        yield vector
        while vector != last:
            high_value = -1
            high_index = -1
            for i in indexer:
                val = vector[i]
                if val > high_value and val < m + i:
                    high_value = val
                    high_index = i
            for j in range(k - high_index):
                vector[j+high_index] = high_value + j + 1
            yield vector

    def kSubsets( alist, k ):
        n = len(alist)
        for vector in nkRange(n, k):
            ret = []
            for i in vector:
                ret.append( alist[i-1] )
            yield ret

    data = { 'a': [1,2], 'b': [3,4,5], 'c': [1,4,7] }

    # every pair of distinct keys, plus each key paired with itself
    pairs = list( kSubsets(data.keys(),2) ) + [ [k,k] for k in data.iterkeys() ]
    print pairs
    for s in pairs:
        # xcombine comes from the ASPN recipe mentioned above
        for t in xcombine( data[s[0]], data[s[1]] ):
            print "%s,'%s',%s,'%s'" % ( t[0], s[0], t[1], s[1] )

Output:

    1,'a',1,'c'
    1,'a',4,'c'
    1,'a',7,'c'
    2,'a',1,'c'
    2,'a',4,'c'
    2,'a',7,'c'
    1,'a',3,'b'
    1,'a',4,'b'
    1,'a',5,'b'
    2,'a',3,'b'
    2,'a',4,'b'
    2,'a',5,'b'
    1,'c',3,'b'
    1,'c',4,'b'
    1,'c',5,'b'
    4,'c',3,'b'
    4,'c',4,'b'
    4,'c',5,'b'
    7,'c',3,'b'
    7,'c',4,'b'
    7,'c',5,'b'
    1,'a',1,'a'
    1,'a',2,'a'
    2,'a',1,'a'
    2,'a',2,'a'
    1,'c',1,'c'
    1,'c',4,'c'
    1,'c',7,'c'
    4,'c',1,'c'
    4,'c',4,'c'
    4,'c',7,'c'
    7,'c',1,'c'
    7,'c',4,'c'
    7,'c',7,'c'
    3,'b',3,'b'
    3,'b',4,'b'
    3,'b',5,'b'
    4,'b',3,'b'
    4,'b',4,'b'
    4,'b',5,'b'
    5,'b',3,'b'
    5,'b',4,'b'
    5,'b',5,'b'

Jun 5 '06 #2
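The approach in the reply above can probably be expressed with the standard library alone. The following is a minimal sketch, assuming Python 3 and `itertools` (neither of which the thread used); the function name `candidate_rows` is made up for illustration. `combinations_with_replacement` yields every unordered pair of keys, including a key paired with itself, and `product` yields the Cartesian product of the two lists.

```python
from itertools import combinations_with_replacement, product

def candidate_rows(data):
    """Yield (n1, key1, n2, key2) for every pair of keys, including a key with itself."""
    # sorted() makes the key order deterministic
    for k1, k2 in combinations_with_replacement(sorted(data), 2):
        for n1, n2 in product(data[k1], data[k2]):
            yield n1, k1, n2, k2

data = {'a': [1, 2], 'b': [3, 4, 5], 'c': [1, 4, 7]}
rows = list(candidate_rows(data))
for n1, k1, n2, k2 in rows:
    print("%s,'%s',%s,'%s'" % (n1, k1, n2, k2))
```

The rows come out in a different order from the listing above, and the pair of 'b' and 'c' appears as ('b', 'c') rather than ('c', 'b'), because the keys are sorted first.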

Gerard Flanagan wrote:

> Girish Sahani wrote:
>
> > I wrote the following code to concatenate every 2 keys of a dictionary and their corresponding values. E.g. if I have tiDict1 = {'a':[1,2],'b':[3,4,5]} I should get tiDict2 = {'ab':[1,2][3,4,5]}, and similarly for dicts with a larger number of features.
> >
> > Now I want to check each pair to see if they are connected. One element of each pair will be from the first list and one from the second, e.g. for 'ab' I want to check if 1 and 3 are connected, then 1 and 4, then 1 and 5, then 2 and 3, then 2 and 4, then 2 and 5.
> >
> > The information about which elements are connected is in a text file as follows:
> >
> >     1,'a',2,'b'
> >     3,'a',5,'a'
> >     3,'a',6,'a'
> >     3,'a',7,'b'
> >     8,'a',7,'b'
> >     .
> >     .
> >
> > This means 1 (type 'a') and 2 (type 'b') are connected, 3 and 5 are connected, and so on. I am not able to figure out how to do this. Any pointers would be helpful.
>
> Girish
>
> It seems you want the Cartesian product of every pair of lists in the dictionary, including the product of lists with themselves (but you don't say why ;-)).
>
> I'm not sure the following is exactly what you want or if it is very efficient, but maybe it will start you off. It uses a function 'xcombine' taken from a recipe in the ASPN cookbook by David Klaffenbach (2004).

http://aspn.activestate.com/ASPN/Coo.../Recipe/302478

Jun 5 '06 #3

I have a text file in the following format:

    1,'a',2,'b'
    3,'a',5,'c'
    3,'a',6,'c'
    3,'a',7,'b'
    8,'a',7,'b'
    ..
    ..
    ..

Now I need to generate 2 things by reading the file:

1) A dictionary with the numbers as keys and the letters as values. E.g. the above would give me a dictionary like {1:'a', 2:'b', 3:'a', 5:'c', 6:'c' ........}

2) A list containing pairs of numbers from each line. The above format would give me the list as [[1,2],[3,5],[3,6],[3,7],[8,7]......]

I wrote the following code for both of these, but the problem is that lines is a list like ["1,'a',2,'b'","3,'a',5,'c'","3,'a',6,'c'".....]. Because of the "" around each line, it is treated as one object (a single string) and I cannot access the elements of a line.

    #code to generate the dictionary
    def get_colocations(filename):
        lines = open(filename).read().split("\n")
        colocnDict = {}
        i = 0
        for line in lines:
            if i <= 2:
                colocnDict[line[i]] = line[i+1]
                i+=2
                continue
        return colocnDict

    def genPairs(filename):
        lines = open(filename).read().split("\n")
        pairList = []
        for line in lines:
            pair = [line,line]
            pairList.append(pair)
            i+=2
            continue
        return pairList

Please help :((

Jun 6 '06 #4

Girish Sahani wrote:

> 1) A dictionary with the numbers as keys and the letters as values. E.g. the above would give me a dictionary like {1:'a', 2:'b', 3:'a', 5:'c', 6:'c' ........}

    def get_dict( f ) :
        out = {}
        for line in file(f) :
            n1,s1,n2,s2 = line.split(',')
            out.update( { int(n1):s1, int(n2):s2 } )
        return out

> 2) A list containing pairs of numbers from each line. The above format would give me the list as [[1,2],[3,5],[3,6],[3,7],[8,7]......]

    def get_pairs( f ) :
        out = []
        for line in file(f) :
            n1,_,n2,_ = line.split(',')
            out.append( [int(n1),int(n2)] )
        return out

Regards
Sreeram

Jun 6 '06 #5
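One detail worth noting about the two functions above: `line.split(',')` keeps the single quotes around each letter and the trailing newline on the last field, so the dictionary values come out as `"'a'"` or `"'b'\n"` rather than `'a'` and `'b'`. Below is a minimal sketch of one way to clean the fields, assuming Python 3 and assuming the functions take an iterable of lines instead of a filename; the helper name `parse_line` is hypothetical.

```python
def parse_line(line):
    """Split one "1,'a',2,'b'" record into (n1, s1, n2, s2)."""
    # strip() drops the trailing newline and any stray spaces around a field
    n1, s1, n2, s2 = (field.strip() for field in line.split(','))
    # strip("'") removes the single quotes that surround each letter
    return int(n1), s1.strip("'"), int(n2), s2.strip("'")

def get_dict(lines):
    out = {}
    for line in lines:
        n1, s1, n2, s2 = parse_line(line)
        out[n1] = s1
        out[n2] = s2
    return out

def get_pairs(lines):
    return [[n1, n2] for n1, _, n2, _ in map(parse_line, lines)]

sample = ["1,'a',2,'b'\n", "3,'a',5,'c'\n", "3,'a',6,'c'\n",
          "3,'a',7,'b'\n", "8,'a',7,'b'\n"]
print(get_dict(sample))
print(get_pairs(sample))
```

An open file object is also an iterable of lines, so `get_dict(open(filename))` should work the same way, provided the file contains no blank lines.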

Girish Sahani wrote:

> Gerard Flanagan wrote:
>
> > Girish Sahani wrote:
> >
> > > I wrote the following code to concatenate every 2 keys of a dictionary and their corresponding values. E.g. if I have tiDict1 = {'a':[1,2],'b':[3,4,5]} I should get tiDict2 = {'ab':[1,2][3,4,5]}, and similarly for dicts with a larger number of features.
> > >
> > > Now I want to check each pair to see if they are connected. One element of each pair will be from the first list and one from the second, e.g. for 'ab' I want to check if 1 and 3 are connected, then 1 and 4, then 1 and 5, then 2 and 3, then 2 and 4, then 2 and 5.
> > >
> > > The information about which elements are connected is in a text file as follows:
> > >
> > >     1,'a',2,'b'
> > >     3,'a',5,'a'
> > >     3,'a',6,'a'
> > >     3,'a',7,'b'
> > >     8,'a',7,'b'
> > >     .
> > >     .
> > >
> > > This means 1 (type 'a') and 2 (type 'b') are connected, 3 and 5 are connected, and so on. I am not able to figure out how to do this. Any pointers would be helpful.
> >
> > Girish
> >
> > It seems you want the Cartesian product of every pair of lists in the dictionary, including the product of lists with themselves (but you don't say why ;-)).
> >
> > I'm not sure the following is exactly what you want or if it is very efficient, but maybe it will start you off. It uses a function 'xcombine' taken from a recipe in the ASPN cookbook by David Klaffenbach (2004).
> >
> > http://aspn.activestate.com/ASPN/Coo.../Recipe/302478
>
> Thanks a lot Gerard and Roberto, but I think I should explain the exact thing with an example. Roberto, what I have right now is concatenating the keys and the corresponding values: e.g. {'a':[1,2],'b':[3,4,5],'c':[6,7]} should give me {'ab':[1,2][3,4,5] 'ac':[1,2][6,7] 'bc':[3,4,5][6,7]}. The order doesn't matter here. It could be 'ac' followed by 'bc' and 'ac'. Also, order doesn't matter in a string: the pair 'ab':[1,2][3,4,5] is the same as 'ba':[3,4,5][1,2]. This representation means 'a' corresponds to the list [1,2] and 'b' corresponds to the list [3,4,5].
>
> Now, for each key-value pair, e.g. for 'ab', I must check each feature in the list of 'a', i.e. [1,2], with each feature in the list of 'b', i.e. [3,4,5]. So I want to take the cartesian product of ONLY the 2 lists [1,2] and [3,4,5].
>
> Finally, I want to check each pair to see if it is present in the file, whose format I had specified. The code Gerard has specified takes cartesian products of every 2 lists.

Hi Girish,

it's better to reply to the Group.

> Now, for each key-value pair, e.g. for 'ab', I must check each feature in the list of 'a', i.e. [1,2], with each feature in the list of 'b', i.e. [3,4,5]. So I want to take the cartesian product of ONLY the 2 lists [1,2] and [3,4,5].

I'm confused. You say *for each* key-value pair, and you wrote above that the keys were the 'concatenation' of "every 2 keys of a dictionary".

Sorry, too early for me. Maybe if you list every case you want, given the example data.

All the best.

Gerard

Jun 6 '06 #7

Really sorry for that indentation thing :) I tried out the code you have given, and also the one Sreeram had written. In all of these, I get the same type of error.

The error I get in Sreeram's code is:

    n1,_,n2,_ = line.split(',')
    ValueError: need more than 1 value to unpack

And the error I get in your code is:

    for n1, a1, n2, a2 in reader:
    ValueError: need more than 0 values to unpack

Any ideas why this is happening?

Thanks a lot,
girish

Jun 6 '06 #8

On 6/06/2006 4:15 PM, Girish Sahani wrote:

> Really sorry for that indentation thing :) I tried out the code you have given, and also the one sreeram had written. In all of these, i get the same error of this type:
>
> Error i get in Sreeram's code is:
>
>     n1,_,n2,_ = line.split(',')
>     ValueError: need more than 1 value to unpack
>
> And error i get in your code is:
>
>     for n1, a1, n2, a2 in reader:
>     ValueError: need more than 0 values to unpack
>
> Any ideas why this is happening?

In the case of my code, this is consistent with the line being empty, probably the last line. As my mentor Bruno D. would say, your test data does not match your spec :-) Which do you want to change, the spec or the data?

You can change my csv-reading code to detect dodgy data like this (for example):

    for row in reader:
        if not row:
            continue # ignore empty lines, wherever they appear
        if len(row) != 4:
            raise ValueError("Malformed row %r" % row)
        n1, a1, n2, a2 = row

In the case of Sreeram's code, perhaps you could try inserting

    print "line = ", repr(line)

before the statement that is causing the error.

> Thanks a lot,
> girish

Jun 6 '06 #9
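To make the diagnosis above concrete: splitting an empty string on a comma returns a list with one empty string, which is why the four-way unpacking in the plain `split` version fails with one value, while the csv reader yields an empty row. The sketch below applies the same guard to plain `split`; it assumes Python 3, and the function name `split_rows` is made up for illustration.

```python
def split_rows(lines):
    """Yield 4-field rows, skipping blank lines and rejecting anything else."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank line, e.g. the one left by a trailing newline
        row = line.split(',')
        if len(row) != 4:
            raise ValueError("Malformed row %r" % (row,))
        yield row

# "".split(',') gives a one-element list, so unpacking into four names fails.
blank_fields = "".split(',')
print(blank_fields)

# The text ends with a newline followed by a blank line, as a data file might.
text = "1,'a',2,'b'\n3,'a',5,'c'\n\n"
rows = list(split_rows(text.split("\n")))
print(rows)
```

Whether blank lines should be skipped silently or treated as errors is the "spec or data" question raised above; this sketch skips them.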

    Girish> I have a text file in the following format:

    Girish> 1,'a',2,'b'
    Girish> 3,'a',5,'c'
    Girish> 3,'a',6,'c'
    Girish> 3,'a',7,'b'
    Girish> 8,'a',7,'b'
    Girish> .
    Girish> .
    Girish> .
    Girish> Now i need to generate 2 things by reading the file:
    Girish> 1) A dictionary with the numbers as keys and the letters as values.
    Girish> e.g the above would give me a dictionary like
    Girish> {1:'a', 2:'b', 3:'a', 5:'c', 6:'c' ........}
    Girish> 2) A list containing pairs of numbers from each line.
    Girish> The above format would give me the list as
    Girish> [[1,2],[3,5],[3,6],[3,7],[8,7]......]

Running this:

    open("some.text.file", "w").write("""\
    1,'a',2,'b'
    3,'a',5,'c'
    3,'a',6,'c'
    3,'a',7,'b'
    8,'a',7,'b'
    """)

    import csv

    class dialect(csv.excel):
        quotechar = "'"

    reader = csv.reader(open("some.text.file", "rb"), dialect=dialect)
    mydict = {}
    mylist = []
    for row in reader:
        numbers = [int(n) for n in row[::2]]
        letters = row[1::2]
        mydict.update(dict(zip(numbers, letters)))
        mylist.append(numbers)

    print mydict
    print mylist

    import os
    os.unlink("some.text.file")

displays this:

    {1: 'a', 2: 'b', 3: 'a', 5: 'c', 6: 'c', 7: 'b', 8: 'a'}
    [[1, 2], [3, 5], [3, 6], [3, 7], [8, 7]]

That seems to be approximately what you're looking for.

Skip

Jun 6 '06 #10
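For readers on Python 3, the csv approach above might be sketched as follows. This is an adaptation, not the code from the thread: it reads from an in-memory `io.StringIO` instead of a temporary file, passes `quotechar` directly to `csv.reader` instead of defining a dialect class, and adds a guard for blank lines.

```python
import csv
import io

text = "1,'a',2,'b'\n3,'a',5,'c'\n3,'a',6,'c'\n3,'a',7,'b'\n8,'a',7,'b'\n"

mydict = {}
mylist = []
# quotechar="'" makes the reader drop the single quotes around each letter
for row in csv.reader(io.StringIO(text), quotechar="'"):
    if not row:
        continue  # csv.reader yields an empty list for a blank line
    numbers = [int(n) for n in row[::2]]   # fields 0 and 2
    letters = row[1::2]                    # fields 1 and 3
    mydict.update(zip(numbers, letters))
    mylist.append(numbers)

print(mydict)
print(mylist)
```

When reading from a real file in Python 3, the csv documentation recommends opening it in text mode with `newline=""` rather than in binary mode.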

> On 6/06/2006 4:15 PM, Girish Sahani wrote:
>
> > Really sorry for that indentation thing :) I tried out the code you have given, and also the one sreeram had written. In all of these, i get the same error of this type:
> >
> > Error i get in Sreeram's code is:
> >
> >     n1,_,n2,_ = line.split(',')
> >     ValueError: need more than 1 value to unpack
> >
> > And error i get in your code is:
> >
> >     for n1, a1, n2, a2 in reader:
> >     ValueError: need more than 0 values to unpack
> >
> > Any ideas why this is happening?
>
> In the case of my code, this is consistent with the line being empty, probably the last line. As my mentor Bruno D. would say, your test data does not match your spec :-) Which do you want to change, the spec or the data?

Thanks John, I just changed my data file so that it does not contain any empty lines. I guess that was the easier solution ;)

> You can change my csv-reading code to detect dodgy data like this (for example):
>
>     for row in reader:
>         if not row:
>             continue # ignore empty lines, wherever they appear
>         if len(row) != 4:
>             raise ValueError("Malformed row %r" % row)
>         n1, a1, n2, a2 = row
>
> In the case of Sreeram's code, perhaps you could try inserting
>
>     print "line = ", repr(line)
>
> before the statement that is causing the error.
>
> > Thanks a lot,
> > girish

Jun 7 '06 #11

Girish said, through Gerard's forwarded message:

> Thanks a lot Gerard and Roberto, but I think I should explain the exact thing with an example. Roberto, what I have right now is concatenating the keys and the corresponding values: e.g. {'a':[1,2],'b':[3,4,5],'c':[6,7]} should give me {'ab':[1,2][3,4,5] 'ac':[1,2][6,7] 'bc':[3,4,5][6,7]}. The order doesn't matter here. It could be 'ac' followed by 'bc' and 'ac'. Also, order doesn't matter in a string: the pair 'ab':[1,2][3,4,5] is the same as 'ba':[3,4,5][1,2]. This representation means 'a' corresponds to the list [1,2] and 'b' corresponds to the list [3,4,5].

The problem is that the two lists aren't distinguishable when concatenated, so what you get is [1, 2, 3, 4, 5]. You have to pack both lists in a tuple: {'ab': ([1, 2], [3, 4, 5]), ...}

    >>> d = {'a':[1, 2], 'b':[3, 4, 5], 'c':[6, 7]}
    >>> d2 = dict(((i + j), (d[i], d[j])) for i in d for j in d if i < j)
    >>> d2
    {'ac': ([1, 2], [6, 7]), 'ab': ([1, 2], [3, 4, 5]), 'bc': ([3, 4, 5], [6, 7])}

> Now, for each key-value pair, e.g. for 'ab', I must check each feature in the list of 'a', i.e. [1,2], with each feature in the list of 'b', i.e. [3,4,5]. So I want to take the cartesian product of ONLY the 2 lists [1,2] and [3,4,5].

You can do this without creating an additional dictionary:

    >>> d = {'a':[1, 2], 'b':[3, 4, 5], 'c':[6, 7]}
    >>> pairs = [i + j for i in d for j in d if i < j]
    >>> for i, j in pairs:
    ...     cartesian_product = [(x, y) for x in d[i] for y in d[j]]
    ...     print i + j, cartesian_product
    ...
    ac [(1, 6), (1, 7), (2, 6), (2, 7)]
    ab [(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5)]
    bc [(3, 6), (3, 7), (4, 6), (4, 7), (5, 6), (5, 7)]

You can do whatever you want with this cartesian product inside the loop.

> Finally, I want to check each pair to see if it is present in the file, whose format I had specified.

I don't understand the semantics of the file format, so I leave this as an exercise to the reader :)

Best regards.
-- Roberto Bonvallet

Jun 7 '06 #12
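The last step, checking each pair of the Cartesian product against the file, was left open in the thread. One possible sketch follows, assuming Python 3. It rests on two assumptions that the thread does not confirm: a connection is symmetric (1 connected to 3 is the same as 3 connected to 1), and a link is identified by its two numbers alone, ignoring the type letters. The function name `connected_pairs` and the sample `links` list are made up for illustration; in practice `links` would be the list of number pairs read from the file.

```python
from itertools import combinations, product

def connected_pairs(d, links):
    """Map each key pair to the cross-list pairs that appear in links."""
    # Store links as frozensets so (1, 3) and (3, 1) count as the same link.
    link_set = {frozenset(pair) for pair in links}
    result = {}
    for k1, k2 in combinations(sorted(d), 2):
        result[k1 + k2] = [
            (x, y) for x, y in product(d[k1], d[k2])
            if frozenset((x, y)) in link_set
        ]
    return result

d = {'a': [1, 2], 'b': [3, 4, 5], 'c': [6, 7]}
links = [[1, 3], [2, 5], [4, 6]]   # made-up sample links, not data from the thread
print(connected_pairs(d, links))
```

Building a set of links once keeps each membership test cheap, which should matter if the file is large and every element pair has to be checked.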
