473,388 Members | 1,020 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,388 software developers and data experts.

urllib problem (maybe bugs?)

Hi,

I'm trying to fill the form on page
http://www.cbs.dtu.dk/services/TMHMM/ using urllib.

There are two peculiarities. First of all, I am filling in incorrect
key/value pairs in the parameters on purpose because that's the only
way I can get it to work.. For "version" I am suppose to leave it
unchecked, having value of empty string. And for name "outform" I am
suppose to assign it a value of "-short". Instead, I left out
"outform" all together and fill in "-short" for version. I discovered
the method my accident.

After I've done that it works fine for small SEQ values. Then, when I
try to send large amount of data (1.4MB), it fails miserably with
AttributeError exception.

I highly suspect the two problems I have are the result of some bugs
in the urllib module. Any suggestions?

This is my code:

================================================== ==========
fd = open('secretory0_1.txt', 'r')
txt = fd.read()
fd.close()

params = urllib.urlencode({'SEQ': txt, 'configfile':
'/usr/opt/www/pub/CBS/services/TMHMM-2.0/TMHMM2.cf', 'version':
'-short'})
f = urllib.urlopen("http://www.cbs.dtu.dk/cgi-bin/nph-webface", params)
data = f.read()

start = data.find('follow <a href="h') + 16
end = data.find('">This link</a>')
secondurl = data[start:end]

f = urllib.urlopen(secondurl)
print f.read()
================================================== ==========
The value pairs I am suppose to fill in are:

SEQ => some sequence here
configfile => '/usr/opt/www/pub/CBS/services/TMHMM-2.0/TMHMM2.cf'
version => ''
outform => '-short'
The exception I get when sending secretory0_1.txt is:

C:\Documents and Settings\thw\*>python testhttp.py
Traceback (most recent call last):
File "testhttp.py", line 11, in ?
f = urllib.urlopen("http://www.cbs.dtu.dk/cgi-bin/nph-webface", params)
File "C:\Python24\lib\urllib.py", line 79, in urlopen
return opener.open(url, data)
File "C:\Python24\lib\urllib.py", line 182, in open
return getattr(self, name)(url, data)
File "C:\Python24\lib\urllib.py", line 307, in open_http
return self.http_error(url, fp, errcode, errmsg, headers, data)
File "C:\Python24\lib\urllib.py", line 322, in http_error
return self.http_error_default(url, fp, errcode, errmsg, headers)
File "C:\Python24\lib\urllib.py", line 550, in http_error_default
return addinfourl(fp, headers, "http:" + url)
File "C:\Python24\lib\urllib.py", line 836, in __init__
addbase.__init__(self, fp)
File "C:\Python24\lib\urllib.py", line 786, in __init__
self.read = self.fp.read
AttributeError: 'NoneType' object has no attribute 'read'

Timothy
Jul 18 '05 #1
1 2046
Timothy Wu <2h*****@gmail.com> wrote:

I'm trying to fill the form on page
http://www.cbs.dtu.dk/services/TMHMM/ using urllib.

There are two peculiarities. First of all, I am filling in incorrect
key/value pairs in the parameters on purpose because that's the only
way I can get it to work.. For "version" I am suppose to leave it
unchecked, having value of empty string. And for name "outform" I am
suppose to assign it a value of "-short". Instead, I left out
"outform" all together and fill in "-short" for version. I discovered
the method my accident.

After I've done that it works fine for small SEQ values. Then, when I
try to send large amount of data (1.4MB), it fails miserably with
AttributeError exception.

I highly suspect the two problems I have are the result of some bugs
in the urllib module. Any suggestions?


The <form> on that page wants multipart/form-data encoding, where each
parameter is embedded in a separate MIME section. That is common with very
large data, and is _required_ when the form includes a file upload. urllib
doesn't do that. It always sends plain old
application/x-www-form-urlencoded data.

I think you're going to have to roll your own sender.
--
- Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Jul 18 '05 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
by: Gary Feldman | last post by:
I think I've found a deficiency in the design of urllib related to https. In order to complete an https connection, it appears that URLOpener and hence FancyURLOpener require the key and cert...
3
by: Michael | last post by:
How do you change the user agent reported by urllib? I need to access a resource that rejects anything but IE.
3
by: Haim Ashkenazi | last post by:
Hi I'm writing a script that uses urllib on win98. until now I used python 2.3.x (x < 4) and it worked ok. I re-installed windows and installed python 2.3.4 and now I get an error when trying to...
7
by: Stuart McGraw | last post by:
I just spent a $*#@!*&^&% hour registering at ^$#@#%^ Sourceforce and trying to submit a Python bug report but it still won't let me. I give up. Maybe someone who cares will see this post, or...
0
by: Ali.Sabil | last post by:
hello all, I just maybe hit a bug in both urllib and urllib2, actually urllib doesn't support proxy authentication, and if you setup the http_proxy env var to...
1
by: evanpmeth | last post by:
I have tried multiple ways of posting information to a website and have failed. I have seen this problem on other forums can someone explain or point me to information on how POST works through...
4
by: John Nagle | last post by:
There's no way to set a timeout if you use "urllib" to open a URL. "HTTP", which "urllib" uses, supports this, but the functionality is lost at the "urllib" level. It's not available via "class...
2
by: shirish | last post by:
Hi all, Before I start, please know that I'm no developer, just a user. This is also a cross-post as I have tried to post the same at python- bugs mailing list, just don't know if it gets in or...
5
by: John Nagle | last post by:
I thought I had all the timeout problems with urllib worked around, but no. socket.setdefaulttimeout is useful, but not always effective. I'm setting that to 15 seconds. If the host end won't...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.