So I've got a problem.
I've got a database of information that is encoded in Windows/CP1252.
What I want to do is dump this to a UTF-8 encoded text file (an RSS
feed).
While the overall problem seems to be related to the conversion, the
only error I'm getting is:
UnicodeDecodeError: 'ascii' codec can't decode byte 0x92 in position 163: ordinal not in range(128)
So somewhere I'm missing an implicit conversion to ASCII, which is
completely aggravating my brain.
So, what fundamental issue am I completely overlooking?
Code follows.
def GenerateNoticeRSS():
    output = codecs.open(FILEBASE + 'noticeboard.xml','w','utf-8')
    conn = psycopg.connect(DSN)
    curs = conn.cursor()
    sql_query = "select story.subject as subject, story.content as content, story.summary as summary, story.sid as sid, posts.bid as board, posts.date_to_publish as date from story$
    curs.execute(sql_query)
    rows = curs.fetchall()
    output.write('<?xml version="1.0" encoding="utf-8"?>\n')
    output.write('<rss version="2.0">\n')
    output.write('<channel>\n')
    output.write('<title>U of L Notice Board</title>\n')
    output.write('<link>http://www.uleth.ca/notice</link>\n')
    output.write('<description>University of Lethbridge News and Events</description>\n')
    for each in rows:
        output.write('<item>\n')
        output.write('<title>' + rssTitlePrefix(each[4]) +
                     unicode(each[0]) + '</title>\n')
        output.write('<link>http://www.uleth.ca/notice/display.html?b=' +
                     str(each[4]) + '&s=' + str(each[3]) + '</link>\n')
        output.write('<guid>http://www.uleth.ca/notice/display.html?b=' +
                     str(each[4]) + '&s=' + str(each[3]) + '</guid>\n')
        descript = each[2] + '<BR><BR>' + each[1]
        output.write(u'<description>' + unicode(descript) +
                     u'</description>\n') # this is the line that causes the error.
        output.write('</item>\n')
    output.write('</channel>\n')
    output.write('</rss>\n')
    output.close()
    return 0
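
For reference, the error can be reproduced outside the feed code. This is only a minimal sketch, assuming the database driver hands back plain CP1252 byte strings; the sample bytes and the variable names (raw, text, ascii_error, utf8_bytes) are made up for illustration and are not part of the code above. In Python 2, calling unicode() on a byte string without naming an encoding falls back to the default 'ascii' codec, and byte 0x92 (a curly apostrophe in CP1252) is outside that range. Decoding explicitly with 'cp1252' avoids the implicit conversion.

```python
# Hypothetical sample: CP1252 bytes as they might come back from the database.
# 0x92 is the right single quotation mark in CP1252.
raw = b"It\x92s a notice"

# Decoding as ASCII is what an implicit conversion does, and it fails on 0x92.
try:
    raw.decode('ascii')
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

# Naming the real source encoding gives a proper unicode string.
text = raw.decode('cp1252')

# A UTF-8 writer (such as the one returned by codecs.open) would then
# encode that string like this when writing it out.
utf8_bytes = text.encode('utf-8')
```

If that assumption holds, the same idea would apply to the failing line: decode each column with 'cp1252' instead of passing the raw byte string to unicode() with no encoding.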