playing with pyGoogle - strange codec error

Hello,

I am playing around with pyGoogle and encountered an error that I have
never seen before, and I am unsure how to correct it. Here is a code
snippet:

for r in data.results:
    print 'Title: ',r.title
    print 'URL: ',r.URL
    print 'Summary: ',r.snippet
    print

Everything works fine until I get to r.snippet. Here is the error:

UnicodeEncodeError: 'ascii' codec can't encode character '\ua9' in
position 119: ordinal not in range(128)

Any help is appreciated.

Thanks,
Brian
--
Nail a post to the Spalted Board. Free WW'ing software and forums.
Regular freebies! http://www.spaltedboard.com

Jul 18 '05 #1

"Brian Blazer" <br***@brianandkate.com> wrote in message
news:2005040410062316807%brian@brianandkatecom...
> Everything works fine until I get to r.snippet. Here is the error:
>
> UnicodeEncodeError: 'ascii' codec can't encode character '\ua9' in
> position 119: ordinal not in range(128)


You have a character there (the copyright sign) that isn't in the ASCII
set. If you have anything other than plain ASCII, you need to consider
encoding. This reference might help:

http://diveintopython.org/xml_processing/unicode.html
Jul 18 '05 #2
On 2005-04-05 07:32:12 -0500, "Richard Brodie" <R.******@rl.ac.uk> said:

> "Brian Blazer" <br***@brianandkate.com> wrote in message
> news:2005040410062316807%brian@brianandkatecom...
>> Everything works fine until I get to r.snippet. Here is the error:
>>
>> UnicodeEncodeError: 'ascii' codec can't encode character '\ua9' in
>> position 119: ordinal not in range(128)
>
> You have a character there (the copyright sign) that isn't in the ASCII
> set. If you have anything other than plain ASCII, you need to consider
> encoding. This reference might help:
>
> http://diveintopython.org/xml_processing/unicode.html


Thank you. That is a great reference.

Brian


Jul 18 '05 #3
On 2005-04-04 10:06:23 -0500, Brian Blazer <br***@brianandkate.com> said:

<snip>

You know, I am beginning to think that I MAY have stumbled on a bug
here. At first I was thinking that this issue was related to the
offending character being out of range for the Mac. Then I tried it on
an MS machine and a Linux box, all with the same error.

This did not happen when I wrote the same script in Java. This is
making me wonder if there is an issue with the wrapper for the Google
API, which was originally done in Java.

For the sake of it, here is the full code (minus my Google key). It is
going to look weird, but those print statements are there so that I
don't have to open the file it is writing to every time I want to see
stuff. It has my name hard-coded into the search query. The commented-out
r.snippet.encode('mac_roman') was there to see if by changing the
encoding, I could make it work (no luck). I also tried putting
#-*- coding: utf-8 -*-

right after the shebang (as listed here:
http://www.python.org/peps/pep-0263.html). Again, no help.

Anyway, here is the code ------------------------>

import google

google.LICENSE_KEY = 'insertKeyHere'
#print google.doSpellingSuggestion('helllo')
data = google.doGoogleSearch('Brian Blazer')
print 'Found %d results' % len(data.results)

searchData = open('searchData.txt','w')

for r in data.results:
    # r.snippet.encode('mac_roman')
    searchData.write ('Title: ' + r.title + '\n' + '\n')
    searchData.write ('URL: ' + r.URL + '\n' + '\n')
    searchData.write ('Snippet: ' + r.snippet + '\n' + '\n'+'\n')
    print r.URL
    print r.title
    print r.snippet
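
For the file-writing part of a script like this, here is a minimal sketch of one way to avoid the ASCII codec altogether. It assumes a Python newer than the one used in this thread (2.6 or later, where io.open accepts an encoding argument). The Result tuple and the results list are hypothetical stand-ins for data.results, the temporary path is only for illustration, and UTF-8 is an assumed choice of file encoding:

```python
import io
import os
import tempfile
from collections import namedtuple

# Hypothetical stand-in for the objects in data.results.
Result = namedtuple('Result', ['title', 'URL', 'snippet'])
results = [
    Result(u'Example', u'http://example.com/', u'\xa9 2005 Example'),
]

# Illustrative location; the original script writes to 'searchData.txt'.
path = os.path.join(tempfile.mkdtemp(), 'searchData.txt')

# Opening the file with an explicit encoding lets the file object
# convert Unicode text to bytes, so no implicit ASCII step is involved.
with io.open(path, 'w', encoding='utf-8') as search_data:
    for r in results:
        search_data.write(u'Title: ' + r.title + u'\n\n')
        search_data.write(u'URL: ' + r.URL + u'\n\n')
        search_data.write(u'Snippet: ' + r.snippet + u'\n\n\n')

# Read the file back with the same encoding to check the round trip.
with io.open(path, 'r', encoding='utf-8') as check:
    written = check.read()
```

The with block also closes the file, which the script above never does explicitly.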


Jul 18 '05 #4
Brian Blazer wrote:
> You know, I am beginning to think that I MAY have stumbled on a bug
> here. At first I was thinking that this issue was related to the
> offending character being out of range for the Mac. Then I tried it on
> an MS machine and a Linux box, all with the same error.

The problem, common to all three, is that you're using a terminal whose
default encoding cannot represent the copyright character (in the first
case, the default encoding is 'ascii'; it is likely the case for the
others, as well).

When you print a Unicode string, by default it is encoded to your
default encoding. The problem is that this cannot be done faithfully
with a string containing a non-ASCII symbol (like the copyright
character, which is what is actually triggering it for you), so the
encoding fails with an error.
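
The failure can be reproduced without pyGoogle. In this sketch the snippet value is a made-up string that places the copyright sign at position 119, to match the error message in this thread; it is not the actual search result:

```python
# Made-up text: 119 ASCII characters followed by the copyright sign.
snippet = u'x' * 119 + u'\xa9 2005'

try:
    # This is the same conversion an ASCII-only terminal forces on print.
    snippet.encode('ascii')
    failed_position = None
    failed_char = None
except UnicodeEncodeError as exc:
    # The exception records which codec failed and where.
    failed_codec = exc.encoding
    failed_position = exc.start
    failed_char = exc.object[exc.start]
```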

What you probably want here is either to use another encoding, or to
specify what to do in the case that the encoding is not possible.
Either encode to a different encoding (one which you know your terminal
supports even though it is not detected, e.g., 'latin-1'), or specify
what to do with errors in the encoding (e.g., 'ignore', which removes
the offending characters, or 'replace', which replaces them with
question marks):

aUnicodeString.encode('latin-1')
aUnicodeString.encode('ascii', 'replace')
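
A small sketch comparing those options side by side. The snippet value is a hypothetical stand-in for r.snippet; each call returns a byte string in the chosen encoding:

```python
# Hypothetical stand-in for r.snippet.
snippet = u'Copyright \xa9 2005 Example'

# Strict ASCII encoding is what fails in this thread.
try:
    snippet.encode('ascii')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Latin-1 has a code point for the copyright sign, so this succeeds.
as_latin1 = snippet.encode('latin-1')

# 'ignore' drops characters that ASCII cannot represent.
as_ignored = snippet.encode('ascii', 'ignore')

# 'replace' substitutes a question mark for each such character.
as_replaced = snippet.encode('ascii', 'replace')
```

Encoding to 'latin-1' only helps if the terminal really does display Latin-1; the two error-handler variants are lossy but safe on any ASCII terminal.
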
> This did not happen when I wrote the same script in Java. This is
> making me wonder if there is an issue with the wrapper for the Google
> API, which was originally done in Java.


Java does not handle Unicode the same way.

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
Drifting from woman-who-tries misconstrued / Shifting to woman-wise
-- Lamya
Jul 18 '05 #5
On 2005-04-05 13:55:48 -0500, Erik Max Francis <ma*@alcyone.com> said:
<snip>

Thank you, that worked.

Brian


Jul 18 '05 #6
