urlopen() error

Hello. I am getting an error and it has gotten me stuck. I think the
best thing I can do is post my code and the error message, and thank
everybody in advance for any help you can give on this issue. Thank you.

#############
Here's the code:
#############

import urllib2
import re
import xlrd
from BeautifulSoup import BeautifulSoup

book = xlrd.open_workbook("ige_virtualMoney.xls")
sh = book.sheet_by_index(0)
rx = 1
for rx in range(sh.nrows):
    u = sh.cell_value(rx, 0)
    page = urllib2.urlopen(u)
    soup = BeautifulSoup(page)
    p = soup.findAll('span', "sale")
    p = str(p)
    p2 = re.findall('\$\d+\.\d\d', p)
    for price in p2:
        print price

######################
Here are the error messages:
######################

Traceback (most recent call last):
  File "E:\Python24\scraper.py", line 16, in -toplevel-
    page = urllib2.urlopen(u)
  File "E:\Python24\lib\urllib2.py", line 130, in urlopen
    return _opener.open(url, data)
  File "E:\Python24\lib\urllib2.py", line 350, in open
    protocol = req.get_type()
  File "E:\Python24\lib\urllib2.py", line 233, in get_type
    raise ValueError, "unknown url type: %s" % self.__original
ValueError: unknown url type: List

Sep 8 '06 #1
Tempo wrote:
ValueError: unknown url type: List
^^^^^^^^^^^^^^^^^^^^^^

I don't know xlrd, but:
http://docs.python.org/lib/module-urllib2.html
urlopen( url[, data])
Open the URL url, which can be either a string or a Request object.
data should be a string, which specifies additional data to send to the
server. In HTTP requests, which are the only ones that support data, it
should be a buffer in the format of application/x-www-form-urlencoded, for
example one returned from urllib.urlencode().

What is your _u_?
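
To make the quoted documentation concrete: a string passed to urlopen() has to start with a scheme such as "http:". The sketch below is written for Python 3, where urllib2 became urllib.request, which is an assumption on my part since the thread uses Python 2.4. It only builds a Request object, so nothing is fetched; the helper name url_type_error is made up for illustration.

```python
from urllib.request import Request


def url_type_error(value):
    """Return the ValueError message for a URL without a scheme, or None if it parses."""
    try:
        Request(value)  # only parses the URL; no connection is made here
    except ValueError as exc:
        return str(exc)
    return None


print(url_type_error("List"))               # a bare word has no scheme
print(url_type_error("http://google.com"))  # has a scheme, so no error
```

If the first call reports an "unknown url type" message, the value in u is most likely plain text from the sheet rather than a URL.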
--
Rafał Zawadzki [jid/mail: bl*****@jabberpl.org, skype: blvszcz]
http://glam.pl - used clothes, vintage, secondhand
http://bluszcz.net - my home page
Sep 8 '06 #2
Tempo wrote:
ValueError: unknown url type: List
You were expecting u to be a url string like "http://google.com", but it
looks like it is actually a list. I'm not familiar with package xlrd but
cell_value() must be returning a list and not a cell value. Presumably,
the list contains the cell value probably in element 0. Put in a print
statement before your call to urlopen() like:

print u

You'll likely discover your error.

--
Paul McNett
http://paulmcnett.com
http://dabodev.com

Sep 8 '06 #3

Paul McNett wrote:
Tempo wrote:
rx = 1
for rx in range(sh.nrows):
The above 2 lines should probably be:
    for rx in range(1, sh.nrows):
otherwise the likelihood is that a column heading will be treated as
data.
Now read on ;-)

You were expecting u to be a url string like "http://google.com", but it
looks like it is actually a list. I'm not familiar with package xlrd but
cell_value() must be returning a list and not a cell value. Presumably,
the list contains the cell value probably in element 0. Put in a print
statement before your call to urlopen() like:

print u
Sage advice. print repr(u) is in general even better advice.
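
A small sketch of why repr() tends to be more revealing, written with Python 3's print() function rather than the Python 2 print statement used in this thread. The cell values are invented stand-ins for whatever sh.cell_value(rx, 0) might return.

```python
# Hypothetical cell values standing in for what sh.cell_value(rx, 0) might return.
cells = ["List", " http://example.com/item ", 42.0]

for u in cells:
    # repr() shows the quotes, any stray whitespace, and whether the value
    # is a string or a number; a plain print would hide most of that.
    print(repr(u))
```

A heading such as 'List', a padded URL, and a float all look different in this output, which usually points straight at the offending row.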
> You'll likely discover your error.

Just for the record:

1. The xlrd package's Book.Sheet.cell_value() does *not* return lists.
As its docs say, it returns scalars, of the following types: unicode,
int, float, str.

2. The error has nothing to do with Python lists; it's all about
malformed URLs. "unknown url type" means it's not one of http, ftp,
file, data, gopher, ...
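
Putting the two points together, one way to guard the loop is to check each cell before handing it to urlopen(). This is only a sketch under a few assumptions: it uses Python 3's urllib.parse, it accepts only http and https, and it works on a plain list standing in for column 0 of the sheet. The names looks_like_url and urls_from_column are hypothetical.

```python
from urllib.parse import urlparse


def looks_like_url(value):
    """True only for strings with an http/https scheme and a host part."""
    if not isinstance(value, str):
        return False
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def urls_from_column(values, skip_header=True):
    """Return the usable URLs from a column of cell values, optionally dropping row 0."""
    start = 1 if skip_header else 0
    return [v.strip() for v in values[start:] if looks_like_url(v)]


# Invented sample data: a heading, two URLs, an empty cell and a number.
column = ["List", "http://example.com/a", "", 3.0, " http://example.com/b "]
print(urls_from_column(column))
```

Skipping row 0 mirrors the range(1, sh.nrows) suggestion, and the scheme check catches any other cell that is not a URL.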

|>>> x = urllib2.urlopen('List')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\Python24\lib\urllib2.py", line 130, in urlopen
    return _opener.open(url, data)
  File "C:\Python24\lib\urllib2.py", line 350, in open
    protocol = req.get_type()
  File "C:\Python24\lib\urllib2.py", line 233, in get_type
    raise ValueError, "unknown url type: %s" % self.__original
ValueError: unknown url type: List

|>>> x = urllib2.urlopen('GOTCHA')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\Python24\lib\urllib2.py", line 130, in urlopen
    return _opener.open(url, data)
  File "C:\Python24\lib\urllib2.py", line 350, in open
    protocol = req.get_type()
  File "C:\Python24\lib\urllib2.py", line 233, in get_type
    raise ValueError, "unknown url type: %s" % self.__original
ValueError: unknown url type: GOTCHA
|>>>

HTH,
John

Sep 15 '06 #4
