How to write temporary data to file?

Hi folks,

I have a data structure that looks like this:

d = {
    'url1': {
        'emails': ['a', 'b', 'c', ...],
        'matches': ['d', 'e', 'f', ...]
    },
    'url2': {...}
}

This dictionary will get _very_ big, so I want to write it somehow to a
file after it has grown to a certain size.

How would I achieve that?

Thanks,
Thomas
Jan 9 '07 #1
5 Replies


Pickle/cPickle are standard library modules that can persist data.
But in this case, I would recommend ZODB/Durus.
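
A minimal sketch of the pickle approach, assuming Python 3 (where cPickle has been merged into `pickle`). The threshold `MAX_URLS`, the file name `harvest.pkl`, and the helpers `flush`, `load_all` and `demo` are hypothetical names chosen for illustration; they are not from the original post. Each time the dict reaches the threshold, it is appended to the file as one pickle record and then emptied.

```python
import os
import pickle
import tempfile

MAX_URLS = 2  # hypothetical threshold: flush once the dict holds this many URLs


def flush(d, fileobj):
    """Append the current dict to the file as one pickle record, then empty it."""
    pickle.dump(d, fileobj, protocol=pickle.HIGHEST_PROTOCOL)
    d.clear()


def load_all(path):
    """Yield every dict chunk that flush() wrote to the file, in order."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # No more pickle records in the file.
                break


def demo():
    """Fill a dict with made-up entries, flushing to disk as it grows."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "harvest.pkl")
        d = {}
        with open(path, "ab") as out:
            for i in range(5):
                d["url%d" % i] = {"emails": ["a"], "matches": ["d"]}
                if len(d) >= MAX_URLS:
                    flush(d, out)
            if d:
                # Write whatever is left below the threshold.
                flush(d, out)
        # Read the chunks back and merge them into one dict.
        merged = {}
        for chunk in load_all(path):
            merged.update(chunk)
        return merged
```

Writing several small records keeps memory use bounded while collecting, but reading a single URL back still means scanning the file. Note that unpickling data from an untrusted source is unsafe, so this only suits files the program wrote itself.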

(Your code example scares me. I hope you have benevolent purposes for
that application.)

Ravi Teja.

Jan 9 '07 #2

Thanks, but why is this code example scaring you?

Thomas
Jan 9 '07 #3

The code indicates that you are trying to harvest a _very_ (as you put
it) large set of email addresses from web pages. With my limited
imagination, I can think of only one group of people who would need to
do that. But considering that you write good English, you must not be
one of those mean people who made me get a new email account just
for posting to Usenet :-).

Ravi Teja.

Jan 9 '07 #4

Oh, well, yes, you are right that this application is able to harvest
email addresses. But it can do much more than that. It has a text
matching engine that decides, based on the given meta keywords, whether
to scan documents on the web, and it can harvest all kinds of
information. It can also be fed with callbacks for each of the
Content-Types. I know that the email matching is kind of a 'grey zone',
and I asked myself whether it needs the email stuff. But you could
easily add an email regex to the text matching engine yourself, so I
decided to include this functionality (it is 'OFF' by default :-) ).

Thomas

P.S.: No, I am a good person.

Jan 9 '07 #5

If you want easy access to single 'url' keys then `shelve` might be an
alternative to pickling the whole thing as one big object.
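
A rough sketch of what that could look like with the standard library `shelve` module, assuming Python 3.4 or later (for the `with` support). The file name `harvest_shelf`, the sample values and the `demo` helper are made up for illustration. Each URL key is stored as its own pickled record, so a single entry can be read or rewritten without loading the whole dictionary.

```python
import os
import shelve
import tempfile


def demo():
    """Store per-URL records in a shelf and read one back after reopening."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "harvest_shelf")

        with shelve.open(path) as db:
            # Keys must be strings; values can be any picklable object.
            db["url1"] = {"emails": ["a", "b"], "matches": ["d"]}
            db["url2"] = {"emails": [], "matches": ["e", "f"]}

            # Without writeback=True, in-place changes to a stored value are
            # not saved, so fetch the entry, modify it, and assign it back.
            entry = db["url1"]
            entry["emails"].append("c")
            db["url1"] = entry

        # Reopen the shelf to show that the data was persisted to disk.
        with shelve.open(path) as db:
            return db["url1"], sorted(db.keys())
```

Opening the shelf with `writeback=True` would allow in-place mutation, at the cost of caching every accessed entry in memory until the shelf is synced or closed, which may defeat the purpose for a very large data set.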

Ciao,
Marc 'BlackJack' Rintsch
Jan 9 '07 #6
