
raise UnicodeError, "label too long"

Hi, I am having a problem with urllib2.urlopen.

I get this error when I try to pass a unicode string to it:

raise UnicodeError, "label too long"

Is this problem avoidable? No browser or program such as wget seems to
have a problem with these strings.

Jan 24 '07 #1
What exactly are you doing? What does a (unicode?) string that
triggers this exception look like?

Ciao,
Marc 'BlackJack' Rintsch

Jan 24 '07 #2
What I am doing is very simple:

I fetch a URL (an HTML page), parse it using BeautifulSoup, extract the
links, and try to open each of the links, repeating the cycle.
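
A minimal sketch of the "extract the links" step described above, written for Python 3 and using the standard library's `html.parser` in place of BeautifulSoup so that it runs without third-party packages. The class name `LinkCollector`, the helper `extract_links`, and the example.org URLs are hypothetical and exist only for illustration; the real code in this thread is not shown.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin turns relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all absolute link targets found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content, used only for illustration.
page = '<p><a href="/wiki/Ammonia">Ammonia</a> <a href="http://example.org/x">x</a></p>'
links = extract_links(page, "http://example.org/start/page.html")
```

Printing each extracted link before opening it is a cheap way to spot a malformed host name early.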

BeautifulSoup converts the HTML to unicode. That's why I get this error
when I try to open the links extracted from the page.

This is bad, since some links do contain strings with non-ASCII
characters.
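
One possible way to handle links that contain non-ASCII characters is to convert them to an ASCII-only form before opening them. The following is a sketch under assumptions, written for Python 3 (where `urllib2` has become `urllib.request` and `urllib.parse`): the helper name `to_ascii_url` is hypothetical, the path is assumed to be UTF-8, and any user name or password in the URL is dropped. It is not a complete IRI-to-URI converter.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url):
    """Return an ASCII-only version of url (a sketch, not a full IRI converter)."""
    parts = urlsplit(url)
    # IDNA-encode the host name. This raises UnicodeError if a label is
    # empty or too long. Note: user:password@ information is dropped here.
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)
    # Percent-encode non-ASCII characters as UTF-8. "%" is kept as safe so
    # that escapes already present in the URL are not encoded twice.
    path = quote(parts.path, safe="/%:@!$&'()*+,;=~")
    query = quote(parts.query, safe="/%:@!$&'()*+,;=?~")
    fragment = quote(parts.fragment, safe="/%:@!$&'()*+,;=?~")
    return urlunsplit((parts.scheme, host, path, query, fragment))
```

A URL that is already ASCII, such as one containing `%28II%29`, passes through unchanged, while a URL whose host name has an empty label is rejected with a `UnicodeError` instead of being sent to the network.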

thanks,

Flávio
Jan 24 '07 #3
Please try answering the exact question that Marc asked:
what is an example of a unicode string that triggers the
exception?

Regards,
Martin
Jan 24 '07 #4

Something like this, for instance:
http://.wikipedia.org/wiki/Copper%28II%29_hydroxide

But even URLs without any non-ASCII characters, such as this one:

http://.wikipedia.org/wiki/Ammonia

also fail when passed to urlopen:
  File "/usr/lib/python2.4/encodings/idna.py", line 72, in ToASCII
    raise UnicodeError, "label too long"
UnicodeError: label too long

Very strange, because I tried other unicode URLs from the Python
console, like this:

urllib2.urlopen(u'www.google.com')

and it works normally.

Jan 25 '07 #5
It's the host name starting with a dot that makes it fail:

py> u".wikipedia.org".encode("idna")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "encodings/idna.py", line 163, in encode
  File "encodings/idna.py", line 72, in ToASCII
UnicodeError: label too long
py> u"wikipedia.org".encode("idna")
'wikipedia.org'

The exception is certainly misleading; I'll have to find out
whether there is a bug beyond that (i.e. whether host names
with empty labels should be accepted).
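
The session above is from Python 2.4. As a hedged illustration, the same check can be sketched for Python 3, where the `idna` codec still rejects a host name with an empty leading label, although the wording of the error message may differ between versions. The helper names `idna_problem` and `has_empty_label` are hypothetical and not part of any library.

```python
def idna_problem(host):
    """Return None if host encodes with the idna codec, else the error text."""
    try:
        host.encode("idna")
    except UnicodeError as exc:
        return str(exc)
    return None


def has_empty_label(host):
    """True if any label, apart from an optional trailing root label, is empty."""
    labels = host.split(".")
    if labels and labels[-1] == "":
        labels = labels[:-1]  # a single trailing dot denotes the DNS root
    return any(label == "" for label in labels)


for name in (".wikipedia.org", "wikipedia.org"):
    print(name, "->", idna_problem(name), "| empty label:", has_empty_label(name))
```

Checking for an empty label explicitly gives a clearer diagnosis than the codec's message, which is the misleading part noted above.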

Regards,
martin
Jan 25 '07 #6
Guys, I am sorry, I wrote these messages very late at night.

Naturally, what came before the dot is the two-letter language code
that is usual in Wikipedia URLs.

Something in my code is obviously gobbling that up. Thanks for pointing
that out, and my apologies again for not seeing this obvious bug.

On Jan 25, 4:39 am, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
On 24 Jan 2007 16:25:19 -0800, "Flavio" <fccoe...@gmail.com> declaimed
the following in comp.lang.python:

something like this, for instance:
http://.wikipedia.org/wiki/Copper%28II%29_hydroxide

Was there some text between the // and .wikipedia? As written, this
and the next one both lock up Firefox. Take out the . and they work (or
put www before the . ).

--
Wulfraed Dennis Lee Bieber KD6MOG
wlfr...@ix.netcom.com wulfr...@bestiaria.com
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: web-a...@bestiaria.com)
HTTP://www.bestiaria.com/
Jan 25 '07 #7
