Hi
Using Python 2.3.4 + Feedparser 3.3 (a library to parse XML documents)
I'm trying to parse a UTF-8 document with special characters like
acute-accent vowels:
--------
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
....
-------
But I get this error message:
-------
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in
position 122: ordinal not in range(128)
-------
when trying to execute a MySQL query:
----
query = "UPDATE blogs_news SET text = '" + text_extrated + "' WHERE id='" + id + "'"
cursor.execute(query) #<--- error line
----
I tried with:
-------
text_extrated = text_extrated.encode('iso-8859-1') #<--- error line
query = "UPDATE blogs_news SET text = '" + text_extrated + "' WHERE id='" + id + "'"
cursor.execute(query)
-------
But I get this error:
------
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2013'
in position 92: ordinal not in range(256)
-----
I also tried with:
----
text_extrated = re.sub(u'\u2013', '-', text_extrated)
query = "UPDATE blogs_news SET text = '" + text_extrated + "' WHERE id='" + id + "'"
cursor.execute(query)
-----
It works, but I don't want to substitute each special character,
because there will always be some I have forgotten that can break the program.
Any suggestions for fixing it? Thank you very much.
thomas Armstrong wrote:
(...)
when trying to execute a MySQL query:
----
query = "UPDATE blogs_news SET text = '" + text_extrated + "' WHERE id='" + id + "'"
cursor.execute(query) #<--- error line
----
well, to start, that's not the best way to do an update;
try this instead:
query = "UPDATE blogs_news SET text = %s WHERE id=%s"
cursor.execute(query, (text_extrated, id))
so mysqldb will take care of quoting text_extrated automatically. this
may not be your problem, but it's considered "good style" when dealing
with dbs.
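A minimal sketch of the placeholder style described above, written for current Python 3 rather than the Python 2.3 used in the thread. As an assumption so that it runs without a MySQL server, it uses the standard-library sqlite3 module as a stand-in; sqlite3 spells the placeholder ? where MySQLdb uses %s. The table and column names follow the thread; the sample text and the names text_extracted and news_id are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table from the thread.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE blogs_news (id TEXT PRIMARY KEY, text TEXT)")
cursor.execute("INSERT INTO blogs_news (id, text) VALUES (?, ?)", ("1", ""))

# Hypothetical sample value: contains both a single quote and an en dash (U+2013).
text_extracted = "It's 2004\u20132005"
news_id = "1"

# The driver binds the values itself: no manual quoting, no string concatenation.
cursor.execute("UPDATE blogs_news SET text = ? WHERE id = ?",
               (text_extracted, news_id))
conn.commit()

# Read the row back to confirm the text was stored unchanged.
stored = cursor.execute("SELECT text FROM blogs_news WHERE id = ?",
                        (news_id,)).fetchone()[0]
print(stored == text_extracted)
```

With string concatenation, the single quote in the sample text would have ended the SQL literal early; with bound parameters it is stored as ordinary data.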
apart from this, IIRC feedparser returns text as unicode strings, and
you correctly tried to encode those as latin-1 str objects before
passing them to mysql, but not all characters in the original utf-8 feed
can be translated to latin-1. the character set of latin-1 is very small
(256 code points) compared to what utf-8 can represent.
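The difference can be checked directly. This is a small sketch in current Python 3, where every str is a Unicode string (the equivalent of the unicode objects feedparser returned under Python 2); the variable names are illustrative only.

```python
dash = "\u2013"  # EN DASH, the character named in the error messages

# Latin-1 only covers code points 0-255, so U+2013 cannot be encoded.
try:
    dash.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# UTF-8 can encode every Unicode code point; the en dash takes three bytes.
utf8_bytes = dash.encode("utf-8")

print(latin1_ok)   # False
print(utf8_bytes)  # b'\xe2\x80\x93'
```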
you have to decide:
* switch your mysql db to utf-8 and encode the text as UTF-8 before
insertion
* lose those characters that cannot be mapped into latin-1,
using:
text_extrated.encode('latin-1', 'replace')
so characters that cannot be encoded will be replaced by ?
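The second option can be sketched as follows, again in current Python 3. The sample string is hypothetical: its accented vowels exist in Latin-1, the en dash does not.

```python
# Hypothetical sample: the accented letters fit in Latin-1, the en dash (U+2013) does not.
text = "caf\u00e9 \u2013 men\u00fa"

# The 'replace' error handler substitutes '?' for anything Latin-1 cannot represent.
lossy = text.encode("latin-1", "replace")
print(lossy)  # b'caf\xe9 ? men\xfa'

# Decoding shows what survived: the accents are kept, the dash is gone for good.
round_trip = lossy.decode("latin-1")
print(round_trip)
```

The replacement is one-way: once the text is stored this way, there is no telling which character each ? used to be.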
also, mysqldb has some support to manage unicode objects directly, but
things changed a bit during recent releases so i cannot be precise in
this regard.
HTH.
--
deelan, #1 fan of adriana lima!
<http://www.deelan.com/>
Hi.
Thank you both for your answers.
Finally I changed my MySQL table to UTF-8 and changed the structure
of the query (with '%s').
It works. Thank you very much.
2005/9/30, deelan <gg*@zzz.it> wrote:
[...]
deelan <gg*@zzz.it> writes:
[...]
query = "UPDATE blogs_news SET text = %s WHERE id=%s"
cursor.execute(query, (text_extrated, id))
so mysqldb will take care of quoting text_extrated automatically. this may not be your problem, but it's considered "good style" when dealing with dbs.
[...]
More than just good style: it prevents SQL injection attacks that
could otherwise allow people to do bad things to your databases.
John
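To illustrate the injection risk mentioned in the last reply, here is a hedged sketch in current Python 3. It assumes the standard-library sqlite3 module as a stand-in for MySQLdb (placeholder ? instead of %s); the helper make_db, the sample rows, and the hostile id value are all invented for the example.

```python
import sqlite3

def make_db():
    """Create a throwaway in-memory table with two rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE blogs_news (id TEXT PRIMARY KEY, text TEXT)")
    conn.executemany("INSERT INTO blogs_news (id, text) VALUES (?, ?)",
                     [("1", "first"), ("2", "second")])
    return conn

# Hypothetical hostile input supplied where an id was expected.
malicious_id = "1' OR '1'='1"

# Unsafe: concatenation lets the input rewrite the WHERE clause,
# producing ... WHERE id='1' OR '1'='1', which matches every row.
unsafe = make_db()
query = "UPDATE blogs_news SET text = 'hacked' WHERE id='" + malicious_id + "'"
unsafe.execute(query)
unsafe_rows = unsafe.execute("SELECT text FROM blogs_news ORDER BY id").fetchall()

# Safe: the bound parameter is compared as a plain value and matches no row.
safe = make_db()
safe.execute("UPDATE blogs_news SET text = ? WHERE id = ?",
             ("hacked", malicious_id))
safe_rows = safe.execute("SELECT text FROM blogs_news ORDER BY id").fetchall()

print(unsafe_rows)  # [('hacked',), ('hacked',)]
print(safe_rows)    # [('first',), ('second',)]
```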