472,354 Members | 2,171 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,354 software developers and data experts.

[2.5.1] "UnicodeDecodeError: 'ascii' codec can't decode byte"?

Hello

I'm getting this error while downloading and parsing web pages:

=====
title = m.group(1)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position
48: ordinal not in range(128)
=====

From what I understand, it's because some strings are Unicode, and
hence contain characters that are illegal in ASCII.

Does someone know how to solve this error?

Thank you.
Oct 29 '08 #1
3 14525
Gilles Ganault wrote:
I'm getting this error while downloading and parsing web pages:

=====
title = m.group(1)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position
48: ordinal not in range(128)
=====

From what I understand, it's because some strings are Unicode, and
hence contain characters that are illegal in ASCII.
You just need to use a codec according to the encoding of the webpage. Take
a look at
http://wiki.python.org/moin/Python3UnicodeDecodeError
It is about Python 3, but the principles apply nonetheless. In any case,
throwing the error at a websearch will turn up lots of solutions.

Uli

--
Sator Laser GmbH
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932

Oct 29 '08 #2
Ulrich Eckhardt wrote:
Gilles Ganault wrote:
>I'm getting this error while downloading and parsing web pages:

=====
title = m.group(1)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position
48: ordinal not in range(128)
=====

From what I understand, it's because some strings are Unicode, and
hence contain characters that are illegal in ASCII.

You just need to use a codec according to the encoding of the webpage. Take
a look at
http://wiki.python.org/moin/Python3UnicodeDecodeError
It is about Python 3, but the principles apply nonetheless. In any case,
throwing the error at a websearch will turn up lots of solutions.
I won't believe that statement is producing the error until I see a
traceback. As far as I'm aware the re module can handle Unicode. Getting
a UnicodeDecodeError in an assignment would be unusual to say the least.
Though it's not, I suppose, impossible that calling the .group() method
of a match object might, it seems unlikely.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Oct 29 '08 #3
Ulrich Eckhardt wrote:
Gilles Ganault wrote:
>I'm getting this error while downloading and parsing web pages:

=====
title = m.group(1)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position
48: ordinal not in range(128)
=====

From what I understand, it's because some strings are Unicode, and
hence contain characters that are illegal in ASCII.

You just need to use a codec according to the encoding of the webpage. Take
a look at
http://wiki.python.org/moin/Python3UnicodeDecodeError
It is about Python 3, but the principles apply nonetheless. In any case,
throwing the error at a websearch will turn up lots of solutions.
I won't believe that statement is producing the error until I see a
traceback. As far as I'm aware the re module can handle Unicode. Getting
a UnicodeDecodeError in an assignment would be unusual to say the least.
Though it's not, I suppose, impossible that calling the .group() method
of a match object might, it seems unlikely.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
Holden Web LLC http://www.holdenweb.com/

Oct 29 '08 #4

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

13
by: - Steve - | last post by:
I've been given a school assignment and while everything else is easy there's one topic I'm completley lost on. I've been given an ASCII file that looks like this. During start-up, the program...
22
by: bq | last post by:
Hello, Two questions related to floating point support: What C compilers for the wintel (MS Windows + x86) platform are C99 compliant as far as <math.h> and <tgmath.h> are concerned? What...
13
by: baumann.Pan | last post by:
when define char *p = " can not modify"; p ='b' ;is not allowed, but if you declare p as char p = "can modify"; p = 'b'; is ok? why?
6
by: Kai Bøhli | last post by:
Hi all ! I've got a lot of feedback from (the always helpful) Jon Skeet on this subject before. Dispite this I'm still not there - due to my own lack of knowledge of course. Anyway, I'm...
5
by: _BNC | last post by:
I've converted " byte" to "byte *" at times, using 'unsafe' and fixed { .... }, but the reverse does not seem to work. In this case, a C++ DLL returns a byte * and a length. What is the best...
2
by: Chris Wood | last post by:
In C#, I am calling a method implemented in Managed C++ that returns an array of booleans. This method in turn calls unto unmanaged C++ code that returns an unsigned byte array, which is...
3
by: mr | last post by:
How can i 'force' c++ to interprete "blabla" strings as unicode string instead of ascii string (i just don't want to add 'L' before the thousands strings that are on my projects...), as all my...
5
by: Achim Domma | last post by:
Hi, I have to convert a string to its "best possible" ascii representation. It's clear to me that this is not possible or sense full for all unicode characters. But for most European characters...
8
by: jeffpierce12 | last post by:
Hello, I am trying to send some characters to a scanner that I have hooked up to the COM 1 port on my PC. I am running Linux operating system, and I have the following sample program: ...
2
by: Kemmylinns12 | last post by:
Blockchain technology has emerged as a transformative force in the business world, offering unprecedented opportunities for innovation and efficiency. While initially associated with cryptocurrencies...
0
jalbright99669
by: jalbright99669 | last post by:
Am having a bit of a time with URL Rewrite. I need to incorporate http to https redirect with a reverse proxy. I have the URL Rewrite rules made but the http to https rule only works for...
0
by: antdb | last post by:
Ⅰ. Advantage of AntDB: hyper-convergence + streaming processing engine In the overall architecture, a new "hyper-convergence" concept was proposed, which integrated multiple engines and...
2
by: Matthew3360 | last post by:
Hi, I have a python app that i want to be able to get variables from a php page on my webserver. My python app is on my computer. How would I make it so the python app could use a http request to get...
0
by: AndyPSV | last post by:
HOW CAN I CREATE AN AI with an .executable file that would suck all files in the folder and on my computerHOW CAN I CREATE AN AI with an .executable file that would suck all files in the folder and...
0
by: Arjunsri | last post by:
I have a Redshift database that I need to use as an import data source. I have configured the DSN connection using the server, port, database, and credentials and received a successful connection...
0
Oralloy
by: Oralloy | last post by:
Hello Folks, I am trying to hook up a CPU which I designed using SystemC to I/O pins on an FPGA. My problem (spelled failure) is with the synthesis of my design into a bitstream, not the C++...
0
by: Carina712 | last post by:
Setting background colors for Excel documents can help to improve the visual appeal of the document and make it easier to read and understand. Background colors can be used to highlight important...
0
BLUEPANDA
by: BLUEPANDA | last post by:
At BluePanda Dev, we're passionate about building high-quality software and sharing our knowledge with the community. That's why we've created a SaaS starter kit that's not only easy to use but also...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.