
file to dictionary

Newbie
 
Join Date: Feb 2009
Posts: 31
#1: Mar 11 '09
I am given an open file in which every line has the form "key separator value". The file has many such lines, not just one. The objective is to convert the open file into a dictionary of the form {key: value}, where the separator marks the point at which each line is split into the key and the value.

for example:

File:
abc def ghj
jkl def jkk
uio def asd

The output should be: {"abc": "ghj", "jkl": "jkk", "uio": "asd"}

and the function signature should be:

def file_to_dict(file, separator):

How would I write a function like this? Any hints?
bvdet
Moderator
 
Join Date: Oct 2006
Location: Nashville, TN
Posts: 1,561
#2: Mar 11 '09

Have you tried to code this yourself? Maybe this will give you some hints:
s = '''
abc def ghj
jkl def jkk
uio def asd
'''

sList = [item for item in s.split('\n') if item]

sep = "def"
dd = {}
for item in sList:
    a,b = item.split(sep)
    dd[a.strip()] = b.strip()

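One way the hint above might be wrapped into the requested function is sketched here. It is written for Python 3 (the thread itself uses Python 2), and `io.StringIO` is only a stand-in for an already-open text file; neither choice comes from the original posts.

```python
import io

def file_to_dict(file, separator):
    """Map the text before the separator to the text after it, line by line."""
    result = {}
    for line in file:              # an open text file yields one line at a time
        if not line.strip():       # skip blank lines
            continue
        key, value = line.split(separator)
        result[key.strip()] = value.strip()
    return result

# io.StringIO stands in for an open file (an assumption made for this demo).
sample = io.StringIO("abc def ghj\njkl def jkk\nuio def asd\n")
print(file_to_dict(sample, "def"))
# {'abc': 'ghj', 'jkl': 'jkk', 'uio': 'asd'}
```

Iterating over the file object directly avoids reading the whole file into one string first. Like the hint, this version still raises ValueError on a line that does not contain the separator exactly once.
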
Newbie
 
Join Date: Feb 2009
Posts: 31
#3: Mar 11 '09

Yes, I've tried to code it myself. So far I've read the file into a string and split it using the separator. Now I'm stuck on making the word before the separator the key and the word after the separator the value.
Newbie
 
Join Date: Feb 2009
Posts: 31
#4: Mar 14 '09

import urllib

def file_to_dict(a, s):
    '''Return a dictionary that contains the contents of each line in a given 
    open file as a key-value pair. The file contains lines of the form key 
    separator value, where separator is the second parameter of the function. 
    There may be more than one occurrence of the separator in the line; 
    the key is before the first one - all others are part of the value string.
    Both key and value should be without leading or trailing whitespace. '''

    f = urllib.urlopen(a)
    for c in f:
        g = c.rstrip()

    sList = [item for item in g.split('\n') if item]

    sep = s
    dd = {}
    for item in sList:
        a,b = item.split(s)
        dd[a.strip()] = b.strip()
    print dd

I test it with:

file_to_dict("http://www.utsc.utoronto.ca/~szamosi/a20/lectures/w2/price_1.py", "price")

but I get the error:

File "y:\<string>", line 1, in <module>
  File "y:\<string>", line 34, in file_to_dict
ValueError: too many values to unpack

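For reference, a small sketch (Python 3, with made-up sample strings) of how unpacking the result of `split` into two names can fail. A line without the separator produces a single piece, and a line in which the separator appears more than once produces three or more pieces; both cases raise ValueError, only the message differs. Since the docstring above says that extra separators belong to the value, splitting at most once is one possible way to follow it. The helper name `split_pair` is hypothetical.

```python
def split_pair(line, separator):
    """Split at the first separator only; later ones stay in the value."""
    key, value = line.split(separator, 1)   # maxsplit=1 gives at most two pieces
    return key.strip(), value.strip()

# One piece: unpacking into two names would fail with too few values.
print("dfs ghj hge".split("def"))            # ['dfs ghj hge']

# Three pieces: unpacking into two names would fail with too many values.
print("a price b price c".split("price"))    # ['a ', ' b ', ' c']

# Splitting once keeps the remaining separators inside the value.
print(split_pair("a price b price c", "price"))   # ('a', 'b price c')
```

Note also that the loop `for c in f: g = c.rstrip()` in the post above rebinds `g` on every pass, so only the last line of the file reaches the splitting step.
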
Newbie
 
Join Date: Feb 2009
Posts: 31
#5: Mar 14 '09

I think I know why I got that error: I tested a file such as

s = '''
dfs ghj hge
abc def ghj
jkl def jkk
uio def asd
'''

The first line did not have the separator, hence the error.

Another thing: if the test file were

s = '''
dfs ghj hge
abc def ghj
jkl def jkk
uio def asd
dfs def qwe
'''

then, since there are two occurrences of dfs, the dictionary keeps only one of them instead of both. How should I write this program so that every "dfs" key and value is printed?

bvdet
Moderator
 
Join Date: Oct 2006
Location: Nashville, TN
Posts: 1,561
#6: Mar 14 '09

The first line has no separator, therefore the ValueError. We can trap that with a try/except block. Since we can have more than one data element with the same key, create a list for the values.
s = '''

dfs ghj hge
abc def ghj
jkl def jkk
uio def asd
dfs def qwe
dfs def ewq


'''


sList = [item for item in s.split('\n') if item.strip()]

sep = "def"
dd = {}
for i, item in enumerate(sList):
    try:
        a,b = [s.strip() for s in item.split(sep)]
        dd.setdefault(a, []).append(b)
    except ValueError, e:
        print "Malformed data line number %d" % (i+1)

for key in dd:
    for item in dd[key]:
        print "%s: %s" % (key, item)

Output:
>>> Malformed data line number 1
jkl: jkk
abc: ghj
dfs: qwe
dfs: ewq
uio: asd
>>> 
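
The same idea could be adapted to Python 3 and to the requested `file_to_dict(file, separator)` signature roughly as follows. This is a sketch under a few assumptions that are not in the posts above: `str.partition` replaces the try/except so that the split happens at the first separator only, `io.StringIO` stands in for an open file, and line numbers count every physical line (including blank ones) rather than only the non-blank ones.

```python
import io

def file_to_dict(file, separator):
    """Collect every value seen for a key; report lines lacking the separator."""
    result = {}
    for number, line in enumerate(file, start=1):
        if not line.strip():
            continue                          # ignore blank lines
        key, found, value = line.partition(separator)
        if not found:                         # separator absent from this line
            print("Malformed data line number %d" % number)
            continue
        # setdefault creates the list on first sight of a key, then appends.
        result.setdefault(key.strip(), []).append(value.strip())
    return result

data = io.StringIO("dfs ghj hge\nabc def ghj\ndfs def qwe\ndfs def ewq\n")
print(file_to_dict(data, "def"))
# Malformed data line number 1
# {'abc': ['ghj'], 'dfs': ['qwe', 'ewq']}
```

Returning the dictionary instead of printing it leaves the caller free to format the pairs however it likes.
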