Pickling dictionaries containing dictionaries: failing,recursion-style!

I'm having great fun playing with Markov chains. I am making a
dictionary of all the words in a given string, getting a count of how
many appearances word1 makes in the string, getting a list of all the
word2s that follow each appearance of word1 and a count of how many
times word2 appears in the string as well. (I know I should probably
be only counting how many times word2 actually follows word1, but as I
said, I'm having great fun playing ...)

Printed output of the dictionary looks like so:

{'and': [1, {'to': 1}], 'down': [1, {'upon': 1}], 'them': [1, {'down':
1}], 'no': [1, {'others': 1}], 'this': [1, {'it': 1}], 'is': [2, {'a':
2}], 'upon': [1, {'a': 1}], 'it': [2, {'is': 2}], 'think': [2, {'and':
1, 'words': 1}], 'write': [1, {'this': 1}], 'to': [3, {'write': 1,
'put': 1, 'think': 1}], 'words': [1, {'no': 1}], 'others': [1,
{'think': 1}], 'put': [1, {'them': 1}], 'sin': [2, {'to': 2}]}

Here's the actual function.

    def assembleVocab(self):
        self.wordDB = {}
        for word in self.words:
            try:
                if not word in self.wordDB.keys():
                    wordsWeights = {}
                    afterwords = [self.words[i + 1] for i, e in
                                  enumerate(self.words) if e == word]
                    for aw in afterwords:
                        if not aw in wordsWeights.keys():
                            wordsWeights[aw] = afterwords.count(aw)
                    self.wordDB[word] = [self.words.count(word), wordsWeights]
            except:
                pass
        out = open("mchain.pkl",'wb')
        pickle.dump(self.wordDB, out, -1)
        out.close()
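
The parenthetical above mentions counting only how often word2 actually follows word1. A minimal sketch of that idea in Python 3 follows. It is not the poster's code: the function name build_word_db is hypothetical, and it assumes the input is a plain list of words. It also avoids the bare except, which in the original would silently skip any word whose last occurrence is the final element (self.words[i + 1] raises IndexError there, assuming self.words is a list).

```python
from collections import Counter, defaultdict


def build_word_db(words):
    """Map each word to [occurrence count, {following word: times it follows}]."""
    totals = Counter(words)
    followers = defaultdict(Counter)
    # Pair every word with its successor; the final word has no successor.
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    # Convert the inner Counters to plain dicts to match the printed layout.
    return {word: [totals[word], dict(followers[word])] for word in totals}


db = build_word_db("to be or not to be".split())
```

A single pass over adjacent pairs replaces the repeated list.count calls, so the work grows linearly with the number of words.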

My problem is, I can't seem to get it to unpickle. When I attempt to
load the saved data, I get:

AttributeError: 'tuple' object has no attribute 'readline'

with pickle, and

TypeError: argument must have 'read' and 'readline' attributes

Looking at the pickle pages on docs.python.org, I see that I am
indeed supposed to be able to pickle ``tuples, lists, sets, and
dictionaries containing only picklable objects''.

I'm sure I'm missing something obvious. Clues?
Dec 1 '07 #1
lysdexia <do**********@gmail.com> writes:
> self.wordDB[word] = [self.words.count(word), wordsWeights]

what is self.words.count? Could it be an iterator? I don't think you
can pickle those.
Dec 1 '07 #2
Are you opening the file in binary mode ("rb") before doing pickle.load on it?

On 01 Dec 2007 14:13:33 -0800, Paul Rubin
<"http://phr.cx"@nospam.invalid> wrote:
> lysdexia <do**********@gmail.com> writes:
> > self.wordDB[word] = [self.words.count(word), wordsWeights]
>
> what is self.words.count? Could it be an iterator? I don't think you
> can pickle those.

--
-David
Dec 1 '07 #3
On Dec 2, 9:13 am, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
> lysdexia <doug.shaw...@gmail.com> writes:
> > self.wordDB[word] = [self.words.count(word), wordsWeights]
>
> what is self.words.count? Could it be an iterator? I don't think you
> can pickle those.

Whaaaat??

self.words is obviously an iterable (can you see "for word in
self.words" in his code?), probably just a list.
self.words.count looks like a standard sequence method to me.
self.words.count(word) will return an int -- can you see all those
"[1,", "[2," etc in his printed dict output?
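
The point can be checked directly. The sketch below (Python 3, with a made-up word list) assumes self.words is an ordinary list, as argued above: list.count returns a plain int, and a list holding that int and a dict survives a pickle round trip.

```python
import pickle

# Hypothetical stand-in for self.words.
words = "it is a sin to write this".split()

n = words.count("sin")          # list.count returns an int, not an iterator
entry = [n, {"to": 1}]          # same shape as one value in the printed dict

# Round trip through an in-memory pickle using the highest protocol (-1).
roundtrip = pickle.loads(pickle.dumps(entry, -1))
```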
Dec 1 '07 #4
John Machin <sj******@lexicon.net> writes:
> self.words is obviously an iterable (can you see "for word in
> self.words" in his code?), probably just a list.

It could be a file, in which case its iterator method would read lines
from the file and cause that error message. But I think the answer is
that the pickle itself needs to be opened in binary mode, as someone
else posted.
Dec 1 '07 #5
On Dec 2, 8:59 am, lysdexia <doug.shaw...@gmail.com> wrote:
> My problem is, I can't seem to get it to unpickle. When I attempt to
> load the saved data, I get:
>
> AttributeError: 'tuple' object has no attribute 'readline'
>
> with pickle, and
>
> TypeError: argument must have 'read' and 'readline' attributes

The code that created the dictionary is interesting, but not very
relevant. Please consider posting the code that is actually giving the
error!

> Looking at the pickle pages on docs.python.org, I see that I am
> indeed supposed to be able to pickle ``tuples, lists, sets, and
> dictionaries containing only picklable objects''.
>
> I'm sure I'm missing something obvious. Clues?

The docs for pickle.load(file) say """
Read a string from the open file object file and interpret it as a
pickle data stream, reconstructing and returning the original object
hierarchy. This is equivalent to Unpickler(file).load().

file must have two methods, a read() method that takes an integer
argument, and a readline() method that requires no arguments. Both
methods should return a string. Thus file can be a file object opened
for reading, a StringIO object, or any other custom object that meets
this interface.
"""

The error message(s) [plural??] that you are getting suggest(s) that
the argument that you supplied was *not* an open file object nor
anything else with both a read and readline method. Open the file in
binary mode ('rb') and pass the result to pickle.load.
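
A minimal sketch of that advice, written for Python 3 (the thread itself uses Python 2.5). The helper names save_pickle and load_pickle and the temporary path are illustrative assumptions. The last part passes a tuple instead of a file object, which is only a guess at what produced the 'tuple' error message quoted above.

```python
import os
import pickle
import tempfile


def save_pickle(obj, path):
    # Write in binary mode; the with-block closes the file even on error.
    with open(path, "wb") as out:
        pickle.dump(obj, out, pickle.HIGHEST_PROTOCOL)


def load_pickle(path):
    # pickle.load wants an open file object, opened in binary mode.
    with open(path, "rb") as f:
        return pickle.load(f)


word_db = {"is": [2, {"a": 2}], "to": [3, {"write": 1, "put": 1, "think": 1}]}
path = os.path.join(tempfile.mkdtemp(), "mchain.pkl")
save_pickle(word_db, path)
restored = load_pickle(path)

# Passing something that is not a file object (here a tuple) is rejected.
wrong_argument_error = None
try:
    pickle.load((path, "rb"))
except (TypeError, AttributeError) as exc:
    wrong_argument_error = exc
```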
Dec 1 '07 #6
On Dec 2, 9:49 am, Paul Rubin <http://phr...@NOSPAM.invalid> wrote:
> John Machin <sjmac...@lexicon.net> writes:
> > self.words is obviously an iterable (can you see "for word in
> > self.words" in his code?), probably just a list.
>
> It could be a file, in which case its iterator method would read lines
> from the file and cause that error message.

Impossible:
(1) in "for word in words:" each word would end in "\n" and he'd have
to strip those and there's no evidence of that.
(2) Look at the line """afterwords = [self.words[i + 1] for i, e in
enumerate(self.words) if e == word]"""
and tell me how that works if self.words is a file!
(3) "self.words.count(word)" -- AttributeError: 'file' object has no
attribute 'count'

> But I think the answer is
> that the pickle itself needs to be opened in binary mode, as someone
> else posted.

The answer is (1) he needs to supply a file of any kind for a start
[read the error messages that he got!!]
(2) despite the silence of the docs, it is necessary to have opened
the file in binary mode on systems where it makes a difference
(notably Windows)

[If the OP is still reading this thread, here's an example of how to
show a problem, with minimal code that reproduces the problem, and all
the output including the stack trace]

C:\junk>type dpkl.py
import pickle

d = {'and': [1, {'to': 1}], 'down': [1, {'upon': 1}], 'them': [1, {'down':
1}], 'no': [1, {'others': 1}], 'this': [1, {'it': 1}], 'is': [2, {'a':
2}], 'upon': [1, {'a': 1}], 'it': [2, {'is': 2}], 'think': [2, {'and':
1, 'words': 1}], 'write': [1, {'this': 1}], 'to': [3, {'write': 1,
'put': 1, 'think': 1}], 'words': [1, {'no': 1}], 'others': [1,
{'think': 1}], 'put': [1, {'them': 1}], 'sin': [2, {'to': 2}]}

s = pickle.dumps(d, -1)
dnews = pickle.loads(s)
print "string", dnews == d

out = open("mchain.pkl",'wb')
pickle.dump(d, out, -1)
out.close()

f = open("mchain.pkl", "rb")
dnewb = pickle.load(f)
f.close()
print "load binary", dnewb == d

f = open("mchain.pkl", "r")
dnewa = pickle.load(f)
f.close()
print "load text", dnewa == d

C:\junk>python dpkl.py
string True
load binary True
Traceback (most recent call last):
  File "dpkl.py", line 24, in <module>
    dnewa = pickle.load(f)
  File "c:\python25\lib\pickle.py", line 1370, in load
    return Unpickler(file).load()
  File "c:\python25\lib\pickle.py", line 858, in load
    dispatch[key](self)
  File "c:\python25\lib\pickle.py", line 1169, in load_binput
    i = ord(self.read(1))
TypeError: ord() expected a character, but string of length 0 found

Changing the first line to
import cPickle as pickle
gives this:

C:\junk>python dpkl.py
string True
load binary True
Traceback (most recent call last):
  File "dpkl.py", line 24, in <module>
    dnewa = pickle.load(f)
EOFError

Each of the two different errors indicates that reading was terminated
prematurely by the presence of the good ol' ^Z aka CPMEOF in the file:

>>> s = open('mchain.pkl', 'rb').read()
>>> s.find(chr(26))
179
>>> len(s)
363

HTH,
John
Dec 2 '07 #7
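
The ^Z diagnosis in the last reply can be illustrated without a Windows text-mode file. The Python 3 sketch below is an assumption-laden simulation, not a reproduction of the 2.5 session: the helper ctrl_z_offset is hypothetical, and slicing the bytes at the first Ctrl-Z stands in for a reader that treats that byte as end of file.

```python
import pickle

CTRL_Z = b"\x1a"  # chr(26), the old CP/M end-of-file marker


def ctrl_z_offset(data):
    """Return the offset of the first Ctrl-Z byte in data, or -1 if absent."""
    return data.find(CTRL_Z)


# Protocol 2 stores the small integer 26 as one raw byte, so this binary
# pickle contains a Ctrl-Z byte.
payload = pickle.dumps({"count": 26}, protocol=2)
offset = ctrl_z_offset(payload)

# Simulate a reader that stops at Ctrl-Z: only the bytes before it arrive.
truncated = payload[:offset]
try:
    pickle.loads(truncated)
    load_failed = False
except (EOFError, pickle.UnpicklingError):
    load_failed = True
```

Any binary pickle may contain byte 26 by chance, which is why reading in binary mode matters on platforms that distinguish text from binary files.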
