Convert list to dictionary problem

GTXY20
29
Hi,

I have the following in a text file:

1,a
1,b
1,b
2,a
2,c
2,a
2,c
etc....

I have the following code to open the text file and create a list from the data inside. I am trying to create a dictionary like:

{[1:a], [1:b], [1:b], [2:a], [2:c], [2:a], [2:c]}

I am using the following:
    infile = open('input.txt', 'r')
    records = infile.readlines()
    infile.close()
    records = [s.replace('\n', '') for s in records]
    finalrecords = map(string.split() ,records)

However I keep getting the following error:

"pythontest.py", line 5, in <module>
finalrecords = map(string.split() ,records)
NameError: name 'string' is not defined

Any advice? Also, moving forward, I would like to create from the dictionary a count associated with each unique instance of a key:value relationship. Using the above data, I would write to a file:

KEY UNIQUE INSTANCES
1 2 (sum for unique key value instance 1:a and 1:b)
2 2 (sum for unique key value instance 2:a and 2:c)

I can do this in SQL but would prefer to do it in Python for speed and flexibility with computations.

Any advice is greatly appreciated.

GTXY20
Oct 1 '07 #1
ghostdog74
511 Expert 256MB
string.split is deprecated. Use <string>.split() instead, e.g.:
    s = "test , test1"
    s.split()
By the way, you can't create a dictionary with the same key more than once. Dictionary keys must be unique.
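A note on the traceback in the question: the NameError most likely arises because the name string was never imported or assigned in the script, so Python cannot look up string.split at all. Below is a minimal sketch of the method-based approach in Python 3 syntax. The sample records list is made up for illustration, and splitting on ',' is an assumption based on the file layout shown in the question.

```python
# Sample data standing in for the result of infile.readlines() (illustrative only).
records = ['1,a\n', '1,b\n', '2,c\n']

# Call split() on each string itself; no 'string' module is needed.
# strip() removes the trailing newline before splitting on the comma.
pairs = [line.strip().split(',') for line in records]
print(pairs)
```

Each element of pairs is a two-item list holding the key and the value as strings.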
Oct 1 '07 #2
GTXY20
29
Thanks - if they need to be unique, how do I import the data so that I keep the unique key but assign the multiple associated values, so that I get:

{[1:a,b], [2:a,c]}

thanks again..
Oct 1 '07 #3
bartonc
6,596 Expert 4TB
My friend ghostdog74 is correct. Given that data, you'd end up with a very small dictionary:
    >>> records = '1,a\n1,b\n1,b\n2,a\n2,c\n2,a\n2,c' # often missing the last newline
    >>> lines = records.split()
    >>> lines
    ['1,a', '1,b', '1,b', '2,a', '2,c', '2,a', '2,c']
    >>> dd = dict((key, value) for key, value in (line.split(',') for line in lines))
    >>> dd
    {'1': 'b', '2': 'c'}
Oct 1 '07 #4
GTXY20
29
OK - if I just run the first part:

infile = open('input.txt', 'r')
records = infile.readlines()
infile.close()
records

['1,a\n', '1,b\n', '1,c\n', '1,a\n', '1,c\n', '1,a\n', '1,b\n', '2,a\n', '2,b\n', '2,c\n', '2,a\n', '3,c\n', '3,a\n', '3,b\n', '4,a\n', '4,a\n', '4,c\n', '4,c\n']

so when I try and:

lines = records.split

I am thrown:

Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
AttributeError: 'list' object has no attribute 'split'

I think it is because records is:

['1,a\n', '1,b\n', '1,c\n', '1,a\n', '1,c\n', '1,a\n', '1,b\n', '2,a\n', '2,b\n', '2,c\n', '2,a\n', '3,c\n', '3,a\n', '3,b\n', '4,a\n', '4,a\n', '4,c\n', '4,c\n']

and not:

'1,a\n1,b\n1,b\n2,a\n2,c\n2,a\n2,c'

How can I open a text file and store records as above without using readlines?

As for the dictionary, I would like to have it so that I get:

{[1:a,b], [2:a,c]}

Any ideas - sorry new to Python and used to just working in SQL.

G.
Oct 1 '07 #5
bartonc
6,596 Expert 4TB
    >>> records = '1,a\n1,b\n1,b\n2,a\n2,c\n2,a\n2,c' # often missing the last newline
    >>> lines = records.split()
    >>> lines
    ['1,a', '1,b', '1,b', '2,a', '2,c', '2,a', '2,c']
    >>> dd = {}
    >>> for line in lines:
    ...     key, value = line.split(',')
    ...     if key in dd:
    ...         oldvalue = dd[key]
    ...         if value not in oldvalue:
    ...             dd[key] = '%s,%s' %(oldvalue, value)
    ...     else:
    ...         dd[key] = value
    ...
    >>> dd
    {'1': 'a,b', '2': 'a,c'}
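One caveat with the string-joining approach above: when the stored value is a string, "value not in oldvalue" is a substring test. That works for the single-letter values in this thread, but it could silently drop values once they are longer than one character. Here is a small sketch of the difference in Python 3 syntax. The multi-character sample data is invented for illustration and is not from the original file.

```python
# Invented sample data with a multi-character value (not from the original file).
lines = ['1,ab', '1,a', '2,c']

# String-joining approach: 'a' is a substring of 'ab', so it is skipped.
joined = {}
for line in lines:
    key, value = line.split(',')
    if key in joined:
        if value not in joined[key]:  # substring test on a string
            joined[key] = '%s,%s' % (joined[key], value)
    else:
        joined[key] = value

# List-based approach: membership is tested per element, so 'a' is kept.
grouped = {}
for line in lines:
    key, value = line.split(',')
    values = grouped.setdefault(key, [])
    if value not in values:  # element test on a list
        values.append(value)

print(joined)
print(grouped)
```

Storing the values in a list or tuple avoids the problem, because membership is then checked element by element.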
Oct 1 '07 #6
bartonc
6,596 Expert 4TB
Use infile.read() instead.
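To make the difference concrete: read() returns the whole file as one string, which can then be split on whitespace, whereas readlines() returns a list with one string per line. A minimal sketch in Python 3 syntax follows. It writes a small temporary input.txt so it can run on its own; the file contents are illustrative, not the poster's actual data.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a small stand-in for input.txt (illustrative contents).
    path = os.path.join(tmp, 'input.txt')
    with open(path, 'w') as outfile:
        outfile.write('1,a\n1,b\n2,c\n')

    # read() returns one string; split() with no arguments splits on any
    # whitespace, including newlines, and ignores the trailing newline.
    with open(path, 'r') as infile:
        records = infile.read()
    lines = records.split()

print(lines)
```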
Oct 1 '07 #7
bartonc
6,596 Expert 4TB
Even better: Use a tuple in the value:
    >>> records = '1,a\n1,b\n1,b\n2,a\n2,c\n2,a\n2,c' # often missing the last newline
    >>> lines = records.split()
    >>> lines
    ['1,a', '1,b', '1,b', '2,a', '2,c', '2,a', '2,c']
    >>> dd = {}
    >>> for line in lines:
    ...     key, value = line.split(',')
    ...     if key in dd:
    ...         oldvalue = dd[key]
    ...         if value not in oldvalue:
    ...             dd[key] = oldvalue + (value,) # tuple addition
    ...     else:
    ...         dd[key] = (value,) # a tuple of one
    ...
    >>> dd
    {'1': ('a', 'b'), '2': ('a', 'c')}
This allows any type of conversion on the text before it is stored.
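As an illustration of such a conversion, the sketch below (Python 3 syntax) turns each key into an int before storing it. Converting the key to int is only an assumed example; the thread does not say which conversion, if any, is wanted.

```python
lines = ['1,a', '1,b', '1,b', '2,a', '2,c']

dd = {}
for line in lines:
    key, value = line.split(',')
    key = int(key)  # assumed conversion, for illustration only
    existing = dd.get(key, ())  # empty tuple if the key is new
    if value not in existing:
        dd[key] = existing + (value,)  # tuple addition

print(dd)
```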
Oct 1 '07 #8
GTXY20
29
This is perfect!!!

I assume you can also sort the values so that they would always appear in order, like a,b,c or a,c or a,b, depending on the values?

Finally I need to do two more things:

1. If I wanted to list the quantity of unique value combinations based on keys within a dictionary so for example I have the following dictionary:

{'1': 'a,b,c', '3': 'a,b,c', '2': 'a,b,c', '4': 'a,c'}

I would need:

QTY VALUE COMBINATION
3 a,b,c
1 a,c

2. Get the total number of values for a key:

{'1': 'a,b,c', '3': 'a,b,c', '2': 'a,b,c', '4': 'a,c'}

I would need:

KEY NUMBER OF VALUES
1 3
3 3
2 3
4 2

Thank you so much. This is so helpful and far more efficient than what I could come up with using SQL and VB. Do you know if there are any size limitations on a dictionary in Python? I am thinking I may eventually have 2 million keys with a variety of values (an average of about 5 values per key).

G.
Oct 1 '07 #9
bartonc
6,596 Expert 4TB
In order to sort, you'll need a list in the value:
    >>> records = '1,b\n1,a\n1,b\n2,c\n2,a\n2,a\n2,c' # reordered elements
    >>> lines = records.split()
    >>> lines
    ['1,b', '1,a', '1,b', '2,c', '2,a', '2,a', '2,c']
    >>> dd = {}
    >>> for line in lines:
    ...     key, value = line.split(',')
    ...     if key in dd:
    ...         valueList = dd[key]
    ...         if value not in valueList:
    ...             valueList.append(value)
    ...     else:
    ...         dd[key] = [value] # a list of one
    ...
    >>> dd
    {'1': ['b', 'a'], '2': ['c', 'a']}
    >>> for key, valueList in dd.items():
    ...     valueList.sort()
    ...
    >>> dd
    {'1': ['a', 'b'], '2': ['a', 'c']}
Since dictionaries are not ordered containers, you'll want to work with a sorted() list of its keys:
    >>> for key in sorted(dd.keys()):
    ...     print key, len(dd[key])
    ...
    1 2
    2 2
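For comparison, the grouping, sorting, and per-key counting steps above could also be sketched with collections.defaultdict and a set per key. This is an alternative in Python 3 syntax, not the approach used in this thread; the names groups and report are made up for the example.

```python
from collections import defaultdict

lines = ['1,b', '1,a', '1,b', '2,c', '2,a', '2,a', '2,c']

# A set per key drops duplicate values automatically.
groups = defaultdict(set)
for line in lines:
    key, value = line.split(',')
    groups[key].add(value)

# Build (key, sorted values, number of values) rows in key order.
report = [(key, sorted(groups[key]), len(groups[key])) for key in sorted(groups)]
for key, values, count in report:
    print(key, ','.join(values), count)
```

Sets do not keep insertion order, so the values are sorted only at reporting time.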
Size limit, huh? With Python, memory is usually the limiting factor (as in (L)ong integers, which can contain a single value large enough to fill available memory - try it sometime!).
Oct 1 '07 #10
GTXY20
29
I was able to sort by KEY with the following:

sorted(dd.items(), key=lambda(k,v):(v,k))
Oct 1 '07 #11
bartonc
6,596 Expert 4TB
I thought that
    sorted(dd.keys())
would be sufficient.

Your way:
    >>> sorted(dd.items(), key=lambda(k,v):(v,k))
    [('1', ['a', 'b']), ('2', ['a', 'c'])]
actually creates a list of tuples with one tuple for each entry in the dictionary.
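Two things are worth noting about that expression. The lambda(k,v) form with a tuple parameter is Python 2 only; it was removed in Python 3. Also, because the sort key is (v, k), the items are ordered by value list first and by key second. The sample data in this thread happens to give the same order either way, so here is a sketch in Python 3 syntax with invented data where the two orderings differ.

```python
# Invented data chosen so that key order and value order differ.
dd = {'1': ['b'], '2': ['a']}

# Sort the (key, value) pairs by key.
by_key = sorted(dd.items())

# Python 3 equivalent of key=lambda(k,v):(v,k): sort by value list, then key.
by_value_then_key = sorted(dd.items(), key=lambda item: (item[1], item[0]))

print(by_key)
print(by_value_then_key)
```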


PS: It's actually a rule on this site that you use the [code] tags around your code, as instructed on the right hand side of the page when posting or replying.
Oct 1 '07 #12
GTXY20
29
Thanks again - point taken about the code tags I will do this moving forward - too excited about this working out and got caught up with everything.
Oct 1 '07 #13
GTXY20
29
Hi there,

Sorry for all the questions - this is enlightening...

Is there any way to display the count of each value across the value lists? Here is my dictionary:

    {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4': ['a', 'c']}
I would like to display the counts as follows, without knowing in advance all the values in the value lists:

Value QTY
a 4
b 3
c 4

Also, is there any way to display the count of each value-list combination? Here again is my dictionary:

    {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4': ['a', 'c']}
And I would like to display it as follows:

QTY Value List Combination
3 a,b,c
1 a,c

Once again all help is much appreciated.

G.
Oct 1 '07 #14
bartonc
6,596 Expert 4TB
Here's a neat trick that will give you a place to start:
    >>> dd = {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4': ['a', 'c']}
    >>> uniques = set(tuple(value) for key, value in dd.items())
    >>> uniques
    set([('a', 'b', 'c'), ('a', 'c')])
Then, for the last part, use list.count() on a list of values:
    >>> all = [tuple(value) for key, value in dd.items()]
    >>> all
    [('a', 'b', 'c'), ('a', 'b', 'c'), ('a', 'b', 'c'), ('a', 'c')]
    >>> for item in uniques:
    ...     print item, all.count(item)
    ...
    ('a', 'b', 'c') 3
    ('a', 'c') 1
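The same combination counts could also be obtained in one pass with collections.Counter. This is a sketch of an alternative in Python 3 syntax, not the method shown above; the name combo_counts is made up for the example. Counter is available from Python 2.7 onward.

```python
from collections import Counter

dd = {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4': ['a', 'c']}

# Lists are not hashable, so each value list is converted to a tuple first.
combo_counts = Counter(tuple(values) for values in dd.values())

# most_common() yields (combination, count) pairs, highest count first.
for combo, qty in combo_counts.most_common():
    print(qty, ','.join(combo))
```

Unlike calling list.count() once per unique item, this walks the values only once, which may matter with millions of keys.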
Oct 1 '07 #15
bartonc
6,596 Expert 4TB
And this may just do the first part nicely:
    >>> uniques = list(uniques)
    >>> uniques
    [('a', 'b', 'c'), ('a', 'c')]
    >>> # Assumes only two results above! Needs work for a longer list!
    >>> bits = set.union(set(uniques[0]), set(uniques[1]))
    >>> bits
    set(['a', 'c', 'b'])
    >>> counts = [0 for i in range(len(bits))]
    >>> counts
    [0, 0, 0]
    >>> for item in all:
    ...     for i, bit in enumerate(bits):
    ...         if bit in item:
    ...             counts[i] += 1
    ...
    >>> zip(bits, counts)
    [('a', 4), ('c', 4), ('b', 3)]
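The session above is limited to two unique combinations, as its comment says. One way to lift that limit is to count every value across all lists with collections.Counter. This is a sketch of an alternative in Python 3 syntax, assuming each value appears at most once per list, as in the dictionary above; the name value_counts is made up for the example.

```python
from collections import Counter

dd = {'1': ['a', 'b', 'c'], '3': ['a', 'b', 'c'], '2': ['a', 'b', 'c'], '4': ['a', 'c']}

# Flatten all value lists and count how many keys each value appears under.
value_counts = Counter(value for values in dd.values() for value in values)

# Print in sorted value order, since the values are not known in advance.
for value in sorted(value_counts):
    print(value, value_counts[value])
```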
Oct 1 '07 #16
