Bytes IT Community

file to dictionary

P: 31
I am given an open file whose lines have the form "key separator value". It's not just one line but many lines in that form. The objective is to convert the open file into a dictionary of the form {key: value}, where the separator marks the point at which each line is split into the key and the value.

for example:

File:
abc def ghj
jkl def jkk
uio def asd

The output should be: {"abc": "ghj", "jkl": "jkk", "uio": "asd"}

and the parameters should be:

def file_to_dict(file, seperator)

How would I write a function like this? Any hints?
Mar 11 '09 #1
5 Replies


bvdet
Expert Mod 2.5K+
P: 2,851
Have you tried to code this yourself? Maybe this will give you some hints:
    # Sample data standing in for the contents of the file
    s = '''
    abc def ghj
    jkl def jkk
    uio def asd
    '''

    # Keep only the non-empty lines
    sList = [item for item in s.split('\n') if item]

    sep = "def"
    dd = {}
    for item in sList:
        # Text before the separator is the key, text after it is the value
        a,b = item.split(sep)
        dd[a.strip()] = b.strip()
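
The same idea can be wrapped in the requested function. What follows is only a sketch under some assumptions, not a definitive solution: it assumes `file` is any open text file object that can be iterated line by line (simulated here with `io.StringIO`), that every non-blank line contains the separator exactly once, and that the parameter is spelled `separator`. It is written so that it also runs on Python 3, although the thread itself uses Python 2.

```python
import io

def file_to_dict(file, separator):
    """Build {key: value} from lines of the form 'key separator value'."""
    result = {}
    for line in file:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: the separator occurs exactly once on each line.
        key, value = line.split(separator)
        result[key.strip()] = value.strip()
    return result

# io.StringIO stands in for an open file in this example.
sample = io.StringIO(u"abc def ghj\njkl def jkk\nuio def asd\n")
print(file_to_dict(sample, "def"))
```

Iterating over the file object directly avoids reading the whole file into one string first, which is usually enough for a line-oriented format like this one.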
Mar 11 '09 #2

P: 31
Yep, I've tried to code it myself. What I have so far basically reads the file into a string and then splits it using the separator. I'm just stuck now on making the word before the separator the key and the word after the separator the value.
Mar 11 '09 #3

P: 31
    import urllib

    def file_to_dict(a, s):
        '''Return a dictionary that contains the contents of each line in a given
        open file as a key-value pair. The file contains lines of the form key
        separator value, where separator is the second parameter of the function.
        There may be more than one occurrence of the separator in the line;
        the key is before the first one - all others are part of the value string.
        Both key and value should be without leading or trailing whitespace. '''

        f = urllib.urlopen(a)
        for c in f:
            g = c.rstrip()

        sList = [item for item in g.split('\n') if item]

        sep = s
        dd = {}
        for item in sList:
            a,b = item.split(s)
            dd[a.strip()] = b.strip()
        print dd

I test it with:

    file_to_dict("http://www.utsc.utoronto.ca/~szamosi/a20/lectures/w2/price_1.py", "price")

but I get the error:

    File "y:\<string>", line 1, in <module>
      File "y:\<string>", line 34, in file_to_dict
    ValueError: too many values to unpack
Mar 14 '09 #4

P: 31
@bvdet
I think I know why I got that error. It's because I tested a file such as

    s = '''
    dfs ghj hge
    abc def ghj
    jkl def jkk
    uio def asd
    '''

The first line did not have the separator, hence the error.
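
A quick way to check a diagnosis like this is to count how many pieces `split` returns, since `a,b = item.split(sep)` only works when there are exactly two. The sketch below uses made-up lines for illustration (the line with two separators is not from the thread). A line with no separator yields one piece, which is too few to unpack, while a line in which the separator occurs more than once yields three or more pieces, which is the case that is too many to unpack.

```python
sep = "def"

# No separator: split() returns a single piece, too few to unpack into a, b.
no_sep = "dfs ghj hge".split(sep)
print(len(no_sep))

# Exactly one separator: two pieces, so a, b = ... works.
one_sep = "abc def ghj".split(sep)
print(len(one_sep))

# Separator appears twice: three pieces, too many to unpack into a, b.
two_sep = "abc def ghj def x".split(sep)
print(len(two_sep))

# Limiting the split to the first separator gives at most two pieces.
limited = "abc def ghj def x".split(sep, 1)
print(limited)
```

Passing 1 as the second argument to `split` matches the requirement in the docstring that the key is whatever comes before the first separator and everything after it belongs to the value.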

Another thing: if the test file were

    s = '''
    dfs ghj hge
    abc def ghj
    jkl def jkk
    uio def asd
    dfs def qwe
    '''

then, since there are two occurrences of dfs, only one of them ends up in the dictionary. How should I write this program so that it prints both "dfs" keys and their values?
Mar 14 '09 #5

bvdet
Expert Mod 2.5K+
P: 2,851
The first line has no separator, therefore the ValueError. We can trap that with a try/except block. Since we can have more than one data element with the same key, create a list for the values.
    s = '''

    dfs ghj hge
    abc def ghj
    jkl def jkk
    uio def asd
    dfs def qwe
    dfs def ewq


    '''


    # Keep only the lines that contain something besides whitespace
    sList = [item for item in s.split('\n') if item.strip()]

    sep = "def"
    dd = {}
    for i, item in enumerate(sList):
        try:
            # Unpacking raises ValueError unless there are exactly two parts
            a,b = [part.strip() for part in item.split(sep)]
            # Collect every value seen for a key in a list
            dd.setdefault(a, []).append(b)
        except ValueError:
            print "Malformed data line number %d" % (i+1)

    for key in dd:
        for item in dd[key]:
            print "%s: %s" % (key, item)
Output:
    >>> Malformed data line number 1
    jkl: jkk
    abc: ghj
    dfs: qwe
    dfs: ewq
    uio: asd
    >>>
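
The pieces above can also be combined into a single function that reads from an open file. This is a sketch under stated assumptions rather than a definitive version: the name `file_to_multidict` and the choice to return the malformed line numbers alongside the dictionary are hypothetical, `io.StringIO` stands in for the open file, and `str.partition` is swapped in for `split` so that only the first separator on a line is used, as the docstring earlier in the thread requires. Line numbers here count every line of the file, including blank ones.

```python
import io

def file_to_multidict(file, separator):
    """Map each key to the list of its values.

    Returns (result, malformed), where malformed lists the 1-based
    numbers of non-blank lines that do not contain the separator.
    """
    result = {}
    malformed = []
    for number, line in enumerate(file, 1):
        line = line.strip()
        if not line:
            continue  # blank lines are ignored, not reported
        # partition() splits at the first separator only; the middle
        # element is an empty string when the separator is absent.
        key, found, value = line.partition(separator)
        if not found:
            malformed.append(number)
            continue
        result.setdefault(key.strip(), []).append(value.strip())
    return result, malformed

# io.StringIO stands in for an open file in this example.
sample = io.StringIO(
    u"dfs ghj hge\nabc def ghj\njkl def jkk\n"
    u"uio def asd\ndfs def qwe\ndfs def ewq\n"
)
values, bad_lines = file_to_multidict(sample, "def")
for key in sorted(values):
    for item in values[key]:
        print("%s: %s" % (key, item))
print("Malformed lines: %s" % bad_lines)
```

Returning the malformed line numbers instead of printing them leaves it to the caller to decide whether to report, skip, or fail on bad input.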
Mar 14 '09 #6
