Recursion limit of pickle?

Hi,

I have run into a problem with pickle.
I downloaded an HTML page from:

http://www.amazon.com/Magellan-Maest...2541889&sr=1-2

and parsed it with BeautifulSoup.
The page is very large.
When I use pickle to dump the parsed object, I get "RuntimeError: maximum
recursion depth exceeded".
At first I thought it was caused by this bug:

http://bugs.python.org/issue1757062

But I no longer think so, because I logged pickle's recursive calls to a
file and found that the recursion limit is exceeded partway through walking
the whole BeautifulSoup object, not because some method is being called
over and over.

Here is the test code:

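# Python 2 with BeautifulSoup 3, as used in this thread.
import pickle
import sys
import urllib

from BeautifulSoup import BeautifulSoup

# The URL has to stay on one line inside the string literal.
doc = urllib.urlopen('http://www.amazon.com/Magellan-Maestro-4040-Widescreen-Navigator/dp/B000NMKHW6/ref=sr_1_2?ie=UTF8&s=electronics&qid=1202541889&sr=1-2')

# Raise the interpreter's recursion limit before pickling.
sys.setrecursionlimit(40000)

soup = BeautifulSoup(doc)
print pickle.dumps(soup)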

-------------------
What I want to ask is: is this caused by the recursion limit
and the stack size?

I tried cPickle first and then pickle. With cPickle the program just stops
running without any message.
I think cPickle is also implemented recursively, so it also overflows the
stack when dumping the soup.

Is there any version of pickle that is implemented without recursion?

Thanks.

Victor Lin.
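
A rough sketch related to the question above, written for current Python 3 rather than the Python 2 used in the post. The pure-Python pickler (reached here through the private name `pickle._dumps`) descends one nesting level at a time, so a deep enough structure exhausts the recursion limit. Running the dump in a worker thread with a larger stack and a temporarily raised limit is one possible workaround. The helper names and the 20000 / 64 MiB defaults are illustrative assumptions, not tuned values, and a nested list only stands in for a deep parse tree.

```python
import pickle
import sys
import threading


def nest(depth):
    """Build a list nested `depth` levels deep without using recursion."""
    obj = []
    for _ in range(depth):
        obj = [obj]
    return obj


def nesting_depth(obj):
    """Count nesting levels iteratively (comparing with == would recurse)."""
    depth = 0
    while obj:
        obj = obj[0]
        depth += 1
    return depth


def dumps_in_big_thread(obj, recursion_limit=20000,
                        stack_bytes=64 * 1024 * 1024):
    """Pickle `obj` in a worker thread that has a larger stack and a
    temporarily raised recursion limit. Both defaults are guesses."""
    outcome = {}

    def worker():
        old_limit = sys.getrecursionlimit()
        sys.setrecursionlimit(recursion_limit)
        try:
            # pickle._dumps is the pure-Python pickler (a private name).
            outcome["data"] = pickle._dumps(obj)
        except RecursionError as exc:
            outcome["error"] = exc
        finally:
            sys.setrecursionlimit(old_limit)

    # The new stack size applies to threads created after this call.
    old_stack = threading.stack_size(stack_bytes)
    try:
        thread = threading.Thread(target=worker)
        thread.start()
        thread.join()
    finally:
        threading.stack_size(old_stack)

    if "error" in outcome:
        raise outcome["error"]
    return outcome["data"]
```

Calling `dumps_in_big_thread(nest(2000))` should succeed where the same call with `recursion_limit=1000` fails. The C-accelerated pickler may apply its own depth checks, so this sketch says nothing about how it behaves.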
Feb 9 '08 #1
On Sat, 09 Feb 2008 09:49:46 -0200, Victor Lin <Bo******@gmail.com>
wrote:
> I have run into a problem with pickle.
> I downloaded an HTML page from:
>
> http://www.amazon.com/Magellan-Maest...2541889&sr=1-2
>
> and parsed it with BeautifulSoup.
> The page is very large.
> When I use pickle to dump the parsed object, I get "RuntimeError: maximum
> recursion depth exceeded".

BeautifulSoup objects usually aren't pickleable, independently of your
recursion error.

py> import pickle
py> import BeautifulSoup
py> soup = BeautifulSoup.BeautifulSoup("<html><body>Hello, world!</html>")
py> print pickle.dumps(soup)
Traceback (most recent call last):
....
TypeError: 'NoneType' object is not callable
py>

Why do you want to pickle it? Store the downloaded page instead, and
rebuild the BeautifulSoup object later when needed.

--
Gabriel Genellina
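
A minimal sketch of the "store the downloaded page" suggestion above, in current Python 3. The function name `cached_html`, the injected `fetch` callable, and the SHA-256 file naming are assumptions made for illustration; nothing here comes from BeautifulSoup or from the code in this thread.

```python
import hashlib
from pathlib import Path


def cached_html(url, fetch, cache_dir):
    """Return the page body for `url`, downloading it only on a cache miss.

    `fetch` is any callable that takes a URL and returns the body as bytes.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so query strings and slashes yield a safe file name.
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = cache_dir / name
    if path.exists():
        return path.read_bytes()
    body = fetch(url)
    path.write_bytes(body)
    return body
```

The parser can then be run on the returned bytes whenever the tree is needed. This avoids repeated downloads but not repeated parsing.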

Feb 10 '08 #2
On Feb 10, 11:42 am, "Gabriel Genellina" <gagsl-...@yahoo.com.ar>
wrote:
> On Sat, 09 Feb 2008 09:49:46 -0200, Victor Lin <Borns...@gmail.com>
> wrote:
> > I have run into a problem with pickle.
> > I downloaded an HTML page from:
> > http://www.amazon.com/Magellan-Maest...Navigator/dp/B...
> > and parsed it with BeautifulSoup.
> > The page is very large.
> > When I use pickle to dump the parsed object, I get "RuntimeError:
> > maximum recursion depth exceeded".
>
> BeautifulSoup objects usually aren't pickleable, independently of your
> recursion error.

But I have pickled and unpickled other soup objects successfully.
Only this object seems to be too deep to pickle.

> py> import pickle
> py> import BeautifulSoup
> py> soup = BeautifulSoup.BeautifulSoup("<html><body>Hello, world!</html>")
> py> print pickle.dumps(soup)
> Traceback (most recent call last):
> ...
> TypeError: 'NoneType' object is not callable
> py>
>
> Why do you want to pickle it? Store the downloaded page instead, and
> rebuild the BeautifulSoup object later when needed.
>
> --
> Gabriel Genellina

Because parsing HTML costs a lot of CPU time, I want to cache the soup
object in a file. If I need the same page again, I can get it from the
cache file, including the parsed soup. My program's bottleneck is HTML
parsing, so if I can parse once and unpickle the result later, it could
save a lot of time.
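
One possible way to cache parsed results without deep recursion, sketched in current Python 3: walk the tree with an explicit stack, store it as a flat list of plain tuples, and pickle that list. The `Node` class is a hypothetical stand-in for a parsed tag and is not BeautifulSoup's own class. Whether loading such a cache is actually faster than reparsing is not measured here.

```python
import pickle


class Node:
    """Hypothetical stand-in for a parsed tag: a name, text, and children."""

    def __init__(self, name, text="", children=None):
        self.name = name
        self.text = text
        self.children = children if children is not None else []


def flatten(root):
    """Turn a tree into a flat list of (parent_index, name, text) tuples.

    An explicit stack replaces recursion, so tree depth does not matter.
    """
    records = []
    stack = [(root, -1)]  # (node, index of its parent record; -1 for root)
    while stack:
        node, parent = stack.pop()
        index = len(records)
        records.append((parent, node.name, node.text))
        # Push children in reverse so they are popped in document order.
        for child in reversed(node.children):
            stack.append((child, index))
    return records


def rebuild(records):
    """Recreate the tree from flatten() output.

    Each parent record always precedes the records of its children.
    """
    nodes = []
    for parent, name, text in records:
        node = Node(name, text)
        nodes.append(node)
        if parent >= 0:
            nodes[parent].children.append(node)
    return nodes[0]
```

A cache built this way would be written with `pickle.dumps(flatten(root))` and read back with `rebuild(pickle.loads(data))`. The pickled object is a flat list, so the pickler never has to descend more than a couple of levels.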
Feb 10 '08 #3
