Unpythonic Python

I started having some weird problems with Python recently; they're so
weird that I can't begin to explain them. All I can do is describe
the symptoms and hope someone else has a clue. So here goes:

FreeBSD 4.2, Python 2.2.2.

I have a nightly cron job that downloads the boost cvs tarball from
SourceForge and bunzip2s it. For about a year everything worked with
no problems. About a month ago the download started getting truncated
with no error reported. Then bunzip2 reports corruption, of course.

I took the salient part of the download script, and added a reporthook
(undocumented in urllib, BTW) to the urlretrieve call:
--
import urllib
import os

def dump(*args):
    # reporthook: called with (block count, block size, total size)
    print args

#print 'downloading...'
os.chdir('/tmp')
urllib.urlretrieve('http://cvs.sourceforge.net/cvstarballs/boost-cvsroot.tar.bz2',
                   'boost-cvsroot.tar.bz2', dump)
---

When a recent download was truncated, the last lines of the dump were:

(1014, 8192, 34441987)
(1015, 8192, 34441987)
(1016, 8192, 34441987)
(1017, 8192, 34441987)
(1018, 8192, 34441987)
(1019, 8192, 34441987)
(1020, 8192, 34441987)
(1021, 8192, 34441987)
(1022, 8192, 34441987)
(1023, 8192, 34441987)

Is 1023 a coincidence? Maybe; here's the tail of another failure:

(2439, 8192, 34455413)
(2440, 8192, 34455413)
(2441, 8192, 34455413)
(2442, 8192, 34455413)
(2443, 8192, 34455413)
(2444, 8192, 34455413)
(2445, 8192, 34455413)
(2446, 8192, 34455413)
(2447, 8192, 34455413)
(2448, 8192, 34455413)
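
A quick way to read these tuples: the hook receives the number of blocks transferred so far, the block size, and the total size taken from Content-Length, so count times size is an upper bound on the bytes received. The sketch below (Python 3 syntax; `summarize` is an illustrative name, not part of the original script) applies that to the two final tuples shown above.

```python
def summarize(last_call):
    """Return (max_bytes_received, fraction_of_total) for one reporthook tuple."""
    block_count, block_size, total_size = last_call
    max_bytes = block_count * block_size  # the last block may be shorter
    return max_bytes, max_bytes / total_size


for last_call in [(1023, 8192, 34441987), (2448, 8192, 34455413)]:
    max_bytes, fraction = summarize(last_call)
    print(f"{last_call}: at most {max_bytes} bytes, {fraction:.0%} of the file")
```

The first failure stopped at roughly a quarter of the file and the second at a little over half, which suggests that 1023 is not a meaningful limit.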

So I figured maybe we needed a newer version of Python. I asked my
sysadmin at stlport.com to upgrade Python to the most recent release,
and all of a sudden my incoming mail started looping (see below).

I am classifying spam with SpamBayes and on my system the only way to
get it sorted into IMAP folders after classification is to send it to
myself. Only messages lacking an X-Spambayes-Classification get
classified and sent back out, so I guess when Python was upgraded the
classification stopped adding the headers? My sysadmin rolled Python
back to 2.2.2 and the mail problems stopped. But I still have the
truncated download problem.
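
The loop follows from the re-injection rule described above: a message is classified and re-sent unless it already carries the classification header. A minimal sketch of that rule using the standard `email` module (Python 3; this is not SpamBayes code, the helper names are hypothetical, and the sample headers are abbreviated):

```python
from email import message_from_string

CLASSIFICATION_HEADER = "X-Spambayes-Classification"


def needs_classification(msg):
    """True when the message has not been through the classifier yet."""
    return CLASSIFICATION_HEADER not in msg


def hop_count(msg):
    """Number of Received: headers, which the MTA counts to detect loops."""
    return len(msg.get_all("Received", []))


raw = (
    "Received: by stlport.com with PIPE id 817189\r\n"
    "Received: by stlport.com with PIPE id 817183\r\n"
    "Subject: RE: What's wrong with this?\r\n"
    "\r\n"
    "body\r\n"
)
msg = message_from_string(raw)
print(needs_classification(msg), hop_count(msg))

# Once the header is present the message is no longer re-sent. If the
# classifier stops adding it, every pass adds another Received: line
# until the MTA rejects the message for too many hops.
msg[CLASSIFICATION_HEADER] = "ham"
print(needs_classification(msg))
```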

Any clues?
Thanks in advance!

-Dave

--

From: <MA***********@stlport.com>
Subject: Undeliverable mail: RE: What's wrong with this?
To: <ag*******@meta-comm.com>
Date: Mon, 23 Aug 2004 17:08:45 -0700

Failed to deliver to 'dave'
mail loop: too many hops (too many 'Received:' header fields)
Reporting-MTA: dns; stlport.com

Original-Recipient: rfc822;<dave>
Final-Recipient: system;<dave>
Action: failed
Status: 5.0.0
[3. text/rfc822-headers]

Received: by stlport.com (CommuniGate Pro PIPE 4.2)
with PIPE id 817189; Mon, 23 Aug 2004 17:08:45 -0700
Received: by stlport.com (CommuniGate Pro PIPE 4.2)
with PIPE id 817183; Mon, 23 Aug 2004 17:08:26 -0700
<snip>
Received: from [12.163.41.8] (HELO expressmail.office.meta)
by stlport.com (CommuniGate Pro SMTP 4.2)
with SMTP id 817122 for da**@boost-consulting.com; Mon, 23 Aug 2004
17:04:16 -0700
Received-SPF: error
receiver=stlport.com; client-ip=12.163.41.8;
envelope-from=ag*******@meta-comm.com
Received: by expressmail.office.meta with Internet Mail Service
(5.5.2653.19)
id <RGXKFLXQ>; Mon, 23 Aug 2004 19:03:42 -0500
Message-ID: <83************************************@expressmail.office.meta>
From: Aleksey Gurtovoy <ag*******@meta-comm.com>
To: 'David Abrahams' <da**@boost-consulting.com>
Subject: RE: What's wrong with this?
Date: Mon, 23 Aug 2004 19:03:42 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
charset="iso-8859-1"
Jul 18 '05 #1
da**@boost-consulting.com (David Abrahams) writes:

> I started having some weird problems with Python recently; they're so
> weird that I can't begin to explain them. All I can do is describe
> the symptoms and hope someone else has a clue. So here goes:
>
> FreeBSD 4.2, Python 2.2.2.
>
> I have a nightly cron job that downloads the boost cvs tarball from
> SourceForge and bunzip2s it. For about a year everything worked with
> no problems. About a month ago the download started getting truncated
> with no error reported.

There were some problems with anonymous CVS on sourceforge, which also
affected the nightly CVS tarballs. Could it have to do with this?
I also had problems downloading the CVS tarball for ctypes - but it
seems to be repaired now.

http://sourceforge.net/docman/displa...d=1#1093021394

> Then bunzip2 reports corruption, of course.

Maybe you don't get a bz2 file, but an HTML error message instead?

Only speculating,

Thomas
Jul 18 '05 #2
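
Thomas's guess above (an HTML error page saved under a .bz2 name) can be checked by looking at the first bytes of the file: a bzip2 stream starts with the signature `BZh` followed by a block-size digit from 1 to 9. A small Python 3 sketch, with `looks_like_bzip2` as an illustrative name:

```python
def looks_like_bzip2(path):
    """True if the file starts with the bzip2 signature 'BZh' plus a level digit."""
    with open(path, "rb") as f:
        head = f.read(4)
    return len(head) == 4 and head[:3] == b"BZh" and head[3:4] in b"123456789"
```

This only inspects the header, so it separates "wrong kind of file" from "right kind of file" but says nothing about whether the stream is complete.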
Thomas Heller <th*****@python.net> wrote in message news:<zn**********@python.net>...

> da**@boost-consulting.com (David Abrahams) writes:
>
>> I started having some weird problems with Python recently; they're so
>> weird that I can't begin to explain them. All I can do is describe
>> the symptoms and hope someone else has a clue. So here goes:
>>
>> FreeBSD 4.2, Python 2.2.2.
>>
>> I have a nightly cron job that downloads the boost cvs tarball from
>> SourceForge and bunzip2s it. For about a year everything worked with
>> no problems. About a month ago the download started getting truncated
>> with no error reported.
>
> There were some problems with anonymous CVS on sourceforge, which also
> affected the nightly CVS tarballs. Can it have to do with this?
> I also had problems downloading the CVS tarball for ctypes - but it
> seems now repaired.
>
> http://sourceforge.net/docman/displa...d=1#1093021394

That's not the problem. I can download the file reliably from other machines.

>> Then bunzip2 reports corruption, of course.
>
> Maybe you don't get a bz2 file, but an HTML error message instead?

No, it's a truncated bz2. I can use bzip2recover and get some of the contents back.

Thanks, though.
--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
Jul 18 '05 #3
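
To tell a truncated bz2 from a complete one inside the download script itself, one option is to stream the file through Python's `bz2` module and check whether the end-of-stream marker was reached. This is a Python 3 sketch (the `eof` attribute is not available in the Python 2.2 discussed here), and `bzip2_is_complete` is an illustrative name:

```python
import bz2


def bzip2_is_complete(path, chunk_size=64 * 1024):
    """Feed the file to a decompressor and report whether the stream ended cleanly."""
    decomp = bz2.BZ2Decompressor()
    try:
        with open(path, "rb") as f:
            while not decomp.eof:
                chunk = f.read(chunk_size)
                if not chunk:
                    break  # input ran out before the end-of-stream marker
                decomp.decompress(chunk)
    except OSError:
        return False  # data is corrupt rather than merely short
    return decomp.eof
```

The decompressed output is discarded, so memory use stays near the chunk size times the compression ratio rather than the size of the whole tarball.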
da**@boost-consulting.com (David Abrahams) writes:

> Thomas Heller <th*****@python.net> wrote in message news:<zn**********@python.net>...
>
>> da**@boost-consulting.com (David Abrahams) writes:
>>
>>> I started having some weird problems with Python recently; they're so
>>> weird that I can't begin to explain them. All I can do is describe
>>> the symptoms and hope someone else has a clue. So here goes:
>>>
>>> FreeBSD 4.2, Python 2.2.2.
>>>
>>> I have a nightly cron job that downloads the boost cvs tarball from
>>> SourceForge and bunzip2s it. For about a year everything worked with
>>> no problems. About a month ago the download started getting truncated
>>> with no error reported.
>>
>> There were some problems with anonymous CVS on sourceforge, which also
>> affected the nightly CVS tarballs. Can it have to do with this?
>> I also had problems downloading the CVS tarball for ctypes - but it
>> seems now repaired.
>>
>> http://sourceforge.net/docman/displa...d=1#1093021394
>
> That's not the problem. I can download the file reliably from other machines.

Actually it appears that urllib is having some problem on Unix, at
least the one from Python-2.2.x. This fails on both FreeBSD and
Linux:

urllib.urlretrieve(
    'http://cvs.sourceforge.net/cvstarballs/boost-cvsroot.tar.bz2',
    'boost-cvsroot.tar.bz2')

Trying again with Python 2.3 on Cygwin.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
Jul 18 '05 #4
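
Since the failure in this thread is a short download with no error, a defensive variant of the script can count the bytes it writes and compare them with Content-Length. A Python 3 sketch under those assumptions (`copy_stream` and `download_checked` are hypothetical helper names, not urllib functions):

```python
import urllib.request


def copy_stream(src, dst, chunk_size=8192):
    """Copy src to dst in chunks and return the number of bytes written."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return total
        dst.write(chunk)
        total += len(chunk)


def download_checked(url, dest):
    """Fetch url into dest; raise OSError if fewer bytes arrive than announced."""
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        expected = resp.headers.get("Content-Length")
        received = copy_stream(resp, out)
    if expected is not None and received != int(expected):
        raise OSError(f"got {received} bytes, expected {expected}")
    return received
```

A cron job could call `download_checked` in a retry loop and only run bunzip2 once it returns without raising.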
David Abrahams wrote in news:uz***********@boost-consulting.com in
comp.lang.python:

> That's not the problem. I can download the file reliably from other
> machines.

At the same time, using HTTP?

> Actually it appears that urllib is having some problem on Unix, at
> least the one from Python-2.2.x. This fails on Both FreeBSD and
> Linux:
>
> urllib.urlretrieve(
>     'http://cvs.sourceforge.net/cvstarballs/boost-cvsroot.tar.bz2',
>     'boost-cvsroot.tar.bz2')
>
> Trying again with Python 2.3 on Cygwin.

Is it possible the file is being (re)uploaded (via cvs) during your
cron job's download, thus truncating your download?

Perhaps you should change to cvs:

os.system( 'cvs ... ' )

FWIW, I tried downloading with IE using the link above; I got a
truncated 6 and a bit MB (16:15 BST (UTC +0100)).

Rob.
--
http://www.victim-prime.dsl.pipex.com/
Jul 18 '05 #5
Rob Williscroft <rt*@freenet.co.uk> writes:

> David Abrahams wrote in news:uz***********@boost-consulting.com in
> comp.lang.python:
>
>> That's not the problem. I can download the file reliably from other
>> machines.
>
> At the same time, using HTTP?

I can download the file reliably using IE from my WinXP box.

I can download the file reliably using urllib from Cygwin Python 2.3.2.

The 2nd element returned by urlretrieve is

'Date: Wed, 25 Aug 2004 14:50:17 GMT\r\nServer: Apache/2.0.40 (Red Hat Linux)\r\nLast-Modified: Wed, 25 Aug 20
2 GMT\r\nETag: "b63d5b-20ec84b-18057e80"\r\nAccept-Ranges: bytes\r\nContent-Length: 34523211\r\nContent-Type:
n/x-bzip2\r\nConnection: close\r\n'

>> Actually it appears that urllib is having some problem on Unix, at
>> least the one from Python-2.2.x. This fails on Both FreeBSD and
>> Linux:
>>
>> urllib.urlretrieve(
>>     'http://cvs.sourceforge.net/cvstarballs/boost-cvsroot.tar.bz2',
>>     'boost-cvsroot.tar.bz2')
>>
>> Trying again with Python 2.3 on Cygwin.

As you can see from the above, it works. Is there a known urllib bug
in earlier Pythons?

> Is it possible the file is being (re)uploaded (via cvs) during your
> cron job's download, thus truncating your download?

I don't think so.

> Perhaps you should change to cvs:
>
> os.system( 'cvs ... ' )

The problem with that is that I want to capture the whole CVS
history, not just today's state.

> FWIW, I tried downloading with IE using the link above; I got a
> truncated 6 and a bit MB (16:15 BST (UTC +0100)).
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sorry, what does that mean? Did it show that message in a dialog,
or...?

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
Jul 18 '05 #6
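
The header block returned as the second element can be put to work: its Content-Length says how large the saved file should be. A Python 3 sketch that parses a raw header string with the standard `email` parser and compares it with the file on disk (`expected_length` and `file_is_complete` are illustrative names; the sample string uses only the fields that came through intact in the post above):

```python
import os
from email import message_from_string


def expected_length(raw_headers):
    """Return Content-Length as an int, or None if the header is absent."""
    value = message_from_string(raw_headers)["Content-Length"]
    return int(value) if value is not None else None


def file_is_complete(path, raw_headers):
    """True if the file size matches Content-Length (or no length was sent)."""
    expected = expected_length(raw_headers)
    return expected is None or os.path.getsize(path) == expected


raw = (
    "Date: Wed, 25 Aug 2004 14:50:17 GMT\r\n"
    "Server: Apache/2.0.40 (Red Hat Linux)\r\n"
    'ETag: "b63d5b-20ec84b-18057e80"\r\n'
    "Accept-Ranges: bytes\r\n"
    "Content-Length: 34523211\r\n"
    "Connection: close\r\n"
)
print(expected_length(raw))
```

In Python 3 the second element is already a parsed message object, so the header can be read from it directly without going through a string.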
David Abrahams wrote in news:uy***********@boost-consulting.com in
comp.lang.python:

> Rob Williscroft <rt*@freenet.co.uk> writes:
>
>> David Abrahams wrote in news:uz***********@boost-consulting.com in
>> comp.lang.python:
>>
>>> That's not the problem. I can download the file reliably from
>>> other machines.
>>
>> At the same time, using HTTP?
>
> I can download the file reliably using IE from my WinXP box.
>
> I can download the file reliably using urllib from Cygwin Python 2.3.2
>
> The 2nd element returned by urlretrieve is

Which version, the one that works or the one that doesn't?

> 'Date: Wed, 25 Aug 2004 14:50:17 GMT\r\nServer: Apache/2.0.40 (Red
> Hat Linux)\r\nLast-Modified: Wed, 25 Aug 20 2 GMT\r\nETag:

Something is missing here:

Last-Modified: Wed, 25 Aug 20 2 GMT

Contrast:

Wed, 25 Aug 2004 14:50:17 GMT

> "b63d5b-20ec84b-18057e80"\r\nAccept-Ranges: bytes\r\nContent-Length:
> 34523211\r\nContent-Type: n/x-bzip2\r\nConnection: close\r\n'

34 MB (I got 6 MB)

>>> Trying again with Python 2.3 on Cygwin.
>
> As you can see from the above, it works. Is there a known urllib bug
> in earlier Pythons?

Sorry, I don't know, but I've seen the same truncation with no Python
and no Unix.

>> Is it possible the file is being (re)uploaded (via cvs) during your
>> cron job's download, thus truncating your download?
>
> I don't think so.

Can you test whether or not this is happening? I.e., if you don't
get the full 34523211 bytes, re-download and compare the above
Length, ETag and Last-Modified.

>> Perhaps you should change to cvs:
>>
>> os.system( 'cvs ... ' )
>
> The problem with that is that I want to capture the whole CVS
> history, not just today's state.

I was suggesting you get the tarball via cvs, though presumably
sourceforge don't give you the option. HTTP has the problem that
the server will just truncate the download if the source file
gets replaced.

>> FWIW, I tried downloading with IE using the link above; I got a
>> truncated 6 and a bit MB (16:15 BST (UTC +0100)).
>
>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> Sorry, what does that mean? Did it show that message in a dialog,
> or...?

No, I got a "download complete", but the file was only 6 MB; bzip2 -t
told me the file was truncated. The (16:15 ...) is the time I tried
downloading; BST = British Summer Time, though you wouldn't know it
from the weather :).

Further, I just ran:

import urllib

filename, headers = \
    urllib.urlretrieve(
        'http://cvs.sourceforge.net/cvstarballs/boost-cvsroot.tar.bz2',
        'boost-cvsroot.tar.bz2')

print filename

print headers

boost-cvsroot.tar.bz2
Date: Wed, 25 Aug 2004 16:53:20 GMT
Server: Apache/2.0.40 (Red Hat Linux)
Last-Modified: Wed, 25 Aug 2004 14:14:02 GMT
ETag: "b63d5b-20ec84b-18057e80"
Accept-Ranges: bytes
Content-Length: 34523211
Content-Type: application/x-bzip2
Connection: close

The script ended at 17:59 BST. Note the difference between the two
times in the headers, suggesting the file was modified 1:45 ago,
about the same time my attempted download with IE failed.

Rob.
--
http://www.victim-prime.dsl.pipex.com/
Jul 18 '05 #7
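
The gap between the two header times in Rob's output can be computed rather than eyeballed. The Python 3 sketch below parses the Date and Last-Modified values exactly as posted (`header_age` is an illustrative name). Both header values are in GMT, while the clock times quoted in the post are BST (UTC+1), which is worth keeping in mind when lining the figures up.

```python
from email.utils import parsedate_to_datetime


def header_age(date_header, last_modified_header):
    """Time elapsed between the Last-Modified stamp and the server's Date."""
    return (parsedate_to_datetime(date_header)
            - parsedate_to_datetime(last_modified_header))


age = header_age("Wed, 25 Aug 2004 16:53:20 GMT",
                 "Wed, 25 Aug 2004 14:14:02 GMT")
print(age)  # 2:39:18
```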
Rob Williscroft <rt*@freenet.co.uk> writes:

> David Abrahams wrote in news:uy***********@boost-consulting.com in
> comp.lang.python:
>
>> Rob Williscroft <rt*@freenet.co.uk> writes:
>>
>>> David Abrahams wrote in news:uz***********@boost-consulting.com in
>>> comp.lang.python:
>>>
>>>> That's not the problem. I can download the file reliably from
>>>> other machines.
>>>
>>> At the same time, using HTTP?
>>
>> I can download the file reliably using IE from my WinXP box.
>>
>> I can download the file reliably using urllib from Cygwin Python 2.3.2
>>
>> The 2nd element returned by urlretrieve is
>
> Which version, the one that works or the one that doesn't?
>
>> 'Date: Wed, 25 Aug 2004 14:50:17 GMT\r\nServer: Apache/2.0.40 (Red
>> Hat Linux)\r\nLast-Modified: Wed, 25 Aug 20 2 GMT\r\nETag:

The one that works.

> Something is missing here:
>
> Last-Modified: Wed, 25 Aug 20 2 GMT
>
> Contrast:
>
> Wed, 25 Aug 2004 14:50:17 GMT

Where did that come from, what do you think is missing, and why?

>> "b63d5b-20ec84b-18057e80"\r\nAccept-Ranges: bytes\r\nContent-Length:
>> 34523211\r\nContent-Type: n/x-bzip2\r\nConnection: close\r\n'
>
> 34 MB (I got 6 MB)

It's 34MB.

>>>> Trying again with Python 2.3 on Cygwin.
>>
>> As you can see from the above, it works. Is there a known urllib bug
>> in earlier Pythons?
>
> Sorry, I don't know, but I've seen the same truncation with no Python
> and no Unix.

Argh.

>>> Is it possible the file is being (re)uploaded (via cvs) during your
>>> cron job's download, thus truncating your download?
>>
>> I don't think so.
>
> Can you test whether or not this is happening? I.e., if you don't
> get the full 34523211 bytes, re-download and compare the above
> Length, ETag and Last-Modified.

I did some tests, but didn't come up with anything conclusive. I set
my cron job to start 3 hours later. We'll see.

>>> Perhaps you should change to cvs:
>>>
>>> os.system( 'cvs ... ' )
>>
>> The problem with that is that I want to capture the whole CVS
>> history, not just today's state.
>
> I was suggesting you get the tarball via cvs, though presumably
> sourceforge don't give you the option.

No they don't.

> HTTP has the problem that
> the server will just truncate the download if the source file
> gets replaced.
>
>>> FWIW, I tried downloading with IE using the link above; I got a
>>> truncated 6 and a bit MB (16:15 BST (UTC +0100)).
>>
>>     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>
>> Sorry, what does that mean? Did it show that message in a dialog,
>> or...?
>
> No, I got a "download complete", but the file was only 6 MB; bzip2 -t
> told me the file was truncated. The (16:15 ...) is the time I tried
> downloading; BST = British Summer Time, though you wouldn't know it
> from the weather :).
>
> Further, I just ran:
>
> import urllib
>
> filename, headers = \
>     urllib.urlretrieve(
>         'http://cvs.sourceforge.net/cvstarballs/boost-cvsroot.tar.bz2',
>         'boost-cvsroot.tar.bz2')
>
> print filename
>
> print headers
>
> boost-cvsroot.tar.bz2
> Date: Wed, 25 Aug 2004 16:53:20 GMT
> Server: Apache/2.0.40 (Red Hat Linux)
> Last-Modified: Wed, 25 Aug 2004 14:14:02 GMT
> ETag: "b63d5b-20ec84b-18057e80"
> Accept-Ranges: bytes
> Content-Length: 34523211
> Content-Type: application/x-bzip2
> Connection: close
>
> The script ended at 17:59 BST. Note the difference between the two
> times in the headers, suggesting the file was modified 1:45 ago,
> about the same time my attempted download with IE failed.

That's odd! Your (failed) download modified the file being
downloaded??

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com
Jul 18 '05 #8
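
Rob's suggested test (compare Length, ETag and Last-Modified between a failed attempt and a retry) can be scripted. The Python 3 sketch below snapshots those three headers and lists which ones differ; `snapshot`, `changed_between` and `head` are hypothetical helper names, and `head` assumes the server answers HEAD requests.

```python
import urllib.request

VALIDATORS = ("ETag", "Last-Modified", "Content-Length")


def snapshot(headers):
    """Keep only the headers that identify one version of the file."""
    return {name: headers.get(name) for name in VALIDATORS}


def changed_between(before, after):
    """Names of the validator headers that differ between two snapshots."""
    return [name for name in VALIDATORS if before.get(name) != after.get(name)]


def head(url):
    """Issue a HEAD request and return a snapshot of the validators."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return snapshot(response.headers)
```

Taking one snapshot before the download and one after a short read would show whether the tarball was replaced in between: an empty list from `changed_between` points away from the "file replaced mid-download" theory.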
