How to write temporary data to file?

Thomas Ploch

Hi folks,

I have a data structure that looks like this:

d = {
    'url1': {
        'emails': ['a', 'b', 'c', ...],
        'matches': ['d', 'e', 'f', ...]
    },
    'url2': {...}
}

This dictionary will get _very_ big, so I want to write it somehow to a
file after it has grown to a certain size.

How would I achieve that?

Thanks,
Thomas

Jan 9 '07 #1

Ravi Teja

Pickle/cPickle are standard library modules that can persist data.
But in this case, I would recommend ZODB/Durus.
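
As a rough illustration of the pickle route, here is a minimal sketch that appends the dictionary to a file whenever it reaches a size limit and then empties it. It assumes Python 3, where cPickle has been folded into pickle. The names MAX_URLS, flush, load_all and the file name harvest.pkl are made up for this example and do not come from the thread.

```python
import os
import pickle
import tempfile

MAX_URLS = 2  # hypothetical threshold; pick one that suits the available memory


def flush(d, path):
    """Append the current dictionary to `path` as one pickle record, then empty it."""
    with open(path, "ab") as f:
        pickle.dump(d, f, protocol=pickle.HIGHEST_PROTOCOL)
    d.clear()


def load_all(path):
    """Yield every dictionary chunk that was appended to `path`, oldest first."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # No more records in the file.
                break


# Hypothetical scratch location for the example.
path = os.path.join(tempfile.mkdtemp(), "harvest.pkl")

d = {}
for i in range(5):
    d["url%d" % i] = {"emails": ["a", "b"], "matches": ["d"]}
    if len(d) >= MAX_URLS:
        flush(d, path)
if d:
    flush(d, path)  # write whatever is left at the end

chunks = list(load_all(path))
```

Each flush writes one self-contained record, so reading back means calling pickle.load repeatedly until the file is exhausted. Only unpickle files you wrote yourself, since loading a pickle can execute arbitrary code.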

(Your code example scares me. I hope you have benevolent purposes for
that application.)

Ravi Teja.

Jan 9 '07 #2

Thomas Ploch

Thanks, but why is this code example scaring you?

Thomas

Jan 9 '07 #3

Ravi Teja

The code indicates that you are trying to harvest a _very_ (as you put
it) large set of email addresses from web pages. With my limited
imagination, I can think of only one group of people who would need to
do that. But considering that you write good English, you must not be
one of those mean people that needed me to get a new email account just
for posting to Usenet :-).

Ravi Teja.

Jan 9 '07 #4

Thomas Ploch

Oh, well, yes, you are right that this application is able to harvest
email addresses. But it can do much more than that. It has a
text-matching engine that, depending on the given meta keywords, decides
whether or not to scan documents on the web, and it can harvest all
kinds of information. It can also be fed callbacks for each of the
Content-Types. I know that the email matching engine is a bit of a
'grey zone', and I asked myself whether it needs the email stuff. But
you could easily add the email regex to the text-matching engine
yourself, so I decided to include this functionality (it is 'OFF' by
default :-) ).

Thomas

P.S.: No, I am a good person.

Jan 9 '07 #5

Marc 'BlackJack' Rintsch

Thomas Ploch wrote:

> d = {
>     'url1': {
>         'emails': ['a', 'b', 'c', ...],
>         'matches': ['d', 'e', 'f', ...]
>     },
>     'url2': {...}
> }
>
> This dictionary will get _very_ big, so I want to write it somehow to a
> file after it has grown to a certain size.
>
> How would I achieve that?

If you want easy access to single 'url' keys then `shelve` might be an
alternative to pickling the whole thing as one big object.
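
A minimal sketch of what that could look like, assuming Python 3.4 or later (where a shelf can be used in a `with` statement). The path and the sample records are invented for the example.

```python
import os
import shelve
import tempfile

# Hypothetical scratch location; shelve may add its own file extension(s).
path = os.path.join(tempfile.mkdtemp(), "harvest_shelf")

with shelve.open(path) as db:
    # Each URL is stored as its own pickled record under a string key.
    db["url1"] = {"emails": ["a", "b", "c"], "matches": ["d", "e", "f"]}
    db["url2"] = {"emails": [], "matches": ["g"]}

    # With the default writeback=False, changing a fetched value in place
    # is not saved, so fetch, modify, and assign the record back.
    record = db["url1"]
    record["emails"].append("x")
    db["url1"] = record

# Reopen later and read a single URL without loading the others.
with shelve.open(path) as db:
    url1_emails = db["url1"]["emails"]
    keys = sorted(db.keys())
```

Because the shelf lives on disk, the in-memory dictionary never has to hold every URL at once, which may remove the need for a size threshold altogether. Keys must be strings, and values can be anything pickle can handle.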

Ciao,
Marc 'BlackJack' Rintsch

Jan 9 '07 #6
