file to dictionary

31
I am given an open file whose lines have the form key separator value. There is not just one line but many lines in that form. The objective is to convert the open file into a dictionary of the form {key: value}, where the separator marks the point at which each line is split into the key and the value.

for example:

File:
abc def ghj
jkl def jkk
uio def asd

The output should be: {"abc": "ghj", "jkl": "jkk", "uio": "asd"}

and the parameters should be:

def file_to_dict(file, seperator)

How would I write a function like this? Any hints?
Mar 11 '09 #1
bvdet
2,851 Expert Mod 2GB
Have you tried to code this yourself? Maybe this will give you some hints:
s = '''
abc def ghj
jkl def jkk
uio def asd
'''

# Keep only the non-empty lines.
sList = [item for item in s.split('\n') if item]

sep = "def"
dd = {}
for item in sList:
    # Split each line into key and value around the separator.
    a, b = item.split(sep)
    dd[a.strip()] = b.strip()
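
The same idea can be wrapped in the function signature asked for in the first post. This is only a sketch in Python 3 syntax (the thread itself uses Python 2). Using `io.StringIO` to stand in for an open file is an assumption made for the demo, and every non-blank line is assumed to contain the separator exactly once.

```python
import io

def file_to_dict(file, separator):
    """Build {key: value} from lines of the form 'key separator value'."""
    result = {}
    for line in file:              # an open file yields one line at a time
        if not line.strip():       # skip blank lines
            continue
        key, value = line.split(separator)
        result[key.strip()] = value.strip()
    return result

# io.StringIO stands in for a real open file here (assumption for the demo).
demo_file = io.StringIO("abc def ghj\njkl def jkk\nuio def asd\n")
simple_result = file_to_dict(demo_file, "def")
print(simple_result)
```

Iterating over the file object directly avoids reading the whole file into one string and splitting on '\n' by hand.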
Mar 11 '09 #2
v13tn1g
31
Yep, I've tried to code it myself. What I have so far is that I've basically read the file into a string and then split it using the separator. Now I'm stuck on making the word before the separator the key and the word after the separator the value.
Mar 11 '09 #3
v13tn1g
31
import urllib

def file_to_dict(a, s):
    '''Return a dictionary that contains the contents of each line in a given 
    open file as a key-value pair. The file contains lines of the form key 
    separator value, where separator is the second parameter of the function. 
    There may be more than one occurrence of the separator in the line; 
    the key is before the first one - all others are part of the value string.
    Both key and value should be without leading or trailing whitespace. '''

    f = urllib.urlopen(a)
    for c in f:
        g = c.rstrip()

    sList = [item for item in g.split('\n') if item]

    sep = s
    dd = {}
    for item in sList:
        a,b = item.split(s)
        dd[a.strip()] = b.strip()
    print dd

I test it with:

file_to_dict("http://www.utsc.utoronto.ca/~szamosi/a20/lectures/w2/price_1.py", "price")

but I get this error:

File "y:\<string>", line 1, in <module>
  File "y:\<string>", line 34, in file_to_dict
ValueError: too many values to unpack
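
For what it is worth, `a,b = item.split(s)` raises "too many values to unpack" whenever the separator occurs more than once in a line, because `split` then returns three or more pieces. The docstring asks for the key to be everything before the first separator, which `split(s, 1)` provides. Note also that the `for c in f` loop overwrites `g` on every pass, so only the last line of the file reaches the split. Below is a small Python 3 sketch of the first point; the sample line and the helper name `split_key_value` are made up for illustration.

```python
def split_key_value(line, separator):
    """Split on the first separator only; later ones stay in the value."""
    key, value = line.split(separator, 1)   # maxsplit=1: at most two pieces
    return key.strip(), value.strip()

line = "apple price 3 price per kg"

# Without maxsplit the line breaks into three pieces, so unpacking the
# result into two names would fail.
pieces = line.split("price")
print(len(pieces))

pair = split_key_value(line, "price")
print(pair)
```

A line with no separator at all still yields a single piece, so the two-name unpacking in the helper would raise ValueError for it too.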
Mar 14 '09 #4
v13tn1g
31
@bvdet
I think I know why I got that error: it's because I tested a file such as

s = '''
dfs ghj hge
abc def ghj
jkl def jkk
uio def asd
'''

The first line did not have the separator, hence the error.

Another thing: if the test file were

s = '''
dfs ghj hge
abc def ghj
jkl def jkk
uio def asd
dfs def qwe
'''

then, since there are two occurrences of dfs, only one of them ends up in the dictionary. How should I write this program so that both "dfs" keys and their values are printed?
Mar 14 '09 #5
bvdet
2,851 Expert Mod 2GB
The first line has no separator, therefore the ValueError. We can trap that with a try/except block. Since we can have more than one data element with the same key, create a list for the values.
s = '''

dfs ghj hge
abc def ghj
jkl def jkk
uio def asd
dfs def qwe
dfs def ewq


'''

# Drop blank and whitespace-only lines.
sList = [item for item in s.split('\n') if item.strip()]

sep = "def"
dd = {}
for i, item in enumerate(sList):
    try:
        # Unpacking raises ValueError unless the separator occurs exactly once.
        a, b = [part.strip() for part in item.split(sep)]
        # Collect every value seen for a key in a list.
        dd.setdefault(a, []).append(b)
    except ValueError:
        print "Malformed data line number %d" % (i+1)

for key in dd:
    for item in dd[key]:
        print "%s: %s" % (key, item)
Output:
>>> Malformed data line number 1
jkl: jkk
abc: ghj
dfs: qwe
dfs: ewq
uio: asd
>>> 
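
Putting the pieces together, here is one way the whole thing might look as a function that takes an open file. It is a Python 3 sketch, not the code from the posts above: `collections.defaultdict` takes the place of `setdefault`, `str.partition` takes the place of `split` so that only the first separator counts, and the returned list of malformed line numbers is an addition for illustration. `io.StringIO` stands in for an open file.

```python
import io
from collections import defaultdict

def file_to_dict(file, separator):
    """Return ({key: [values]}, [numbers of lines lacking the separator])."""
    result = defaultdict(list)
    malformed = []
    # Line numbers count every physical line, including blank ones.
    for number, line in enumerate(file, start=1):
        if not line.strip():
            continue                       # ignore blank lines
        # partition splits on the first separator only; found is '' if absent
        key, found, value = line.partition(separator)
        if not found:
            malformed.append(number)
            continue
        result[key.strip()].append(value.strip())
    return dict(result), malformed

sample = io.StringIO(
    "dfs ghj hge\n"
    "abc def ghj\n"
    "jkl def jkk\n"
    "uio def asd\n"
    "dfs def qwe\n"
    "dfs def ewq\n"
)
grouped, bad_lines = file_to_dict(sample, "def")
print(grouped)
print(bad_lines)
```

Returning the malformed line numbers instead of printing them leaves it to the caller to decide whether to report, skip, or abort.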
Mar 14 '09 #6
