encoding misunderstanding

Hi, I'm beginning to understand the encode/decode string methods, but
I'd like confirmation that I'm still thinking in the right direction:

I have a file of latin1 encoded text. Let's say I put one line of that
into a string variable 'tocline', as follows:
tocline = 'Ficha Datos de p\xe9rdida AND acci\xf3n'

import codecs
tocFile = codecs.open('mytoc.htm','wb',encoding='utf8',errors='replace')
tocline = tocline.decode('latin1','replace')
tocFile.write(tocline)
tocFile.close()

What I think is that tocFile is wrapped to ensure that anything
written to it is in utf8.
I decode the latin1 string into Python's internal unicode representation,
and that gets written out as utf8.
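
A minimal sketch of that flow, for checking what actually ends up on disk. It is written so it should also run on Python 3, which is an assumption on my part: the byte string is spelled with a b'' prefix, and the temporary path is a hypothetical stand-in for 'mytoc.htm'.

```python
import codecs
import os
import tempfile

# The raw latin1 bytes from the post; the b'' prefix makes the byte-string type explicit.
raw = b'Ficha Datos de p\xe9rdida AND acci\xf3n'

# Decode the latin1 bytes into a unicode string.
text = raw.decode('latin1')

# Hypothetical output path standing in for 'mytoc.htm'.
out_path = os.path.join(tempfile.mkdtemp(), 'mytoc.htm')

# The codecs wrapper encodes every unicode string written to it as utf8.
with codecs.open(out_path, 'w', encoding='utf8') as toc_file:
    toc_file.write(text)

# Read the raw bytes back to inspect the encoded result.
with open(out_path, 'rb') as f:
    written = f.read()
```

If the reasoning in the post holds, each accented character should occupy two bytes in `written` instead of one, since \xe9 and \xf3 fall outside ASCII.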

What exactly is tocline when it's read in with that \xe9 and \xf3
in the string? A latin1 encoded string?
Is my method the right way to write such a line out to a file with

If I read in the latin1 file using
codecs.open(filename,encoding='latin1') and write out the utf8 file
opening with
codecs.open(othername,encoding='utf8'), would I no longer have a
problem -- I could just read in latin1 and write out utf8 with no
more worries about
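
A rough sketch of that read-latin1, write-utf8 approach, under a few assumptions: the file names are hypothetical, the input file is created on the spot so the example is self-contained, and the syntax is chosen to run on Python 3 as well.

```python
import codecs
import os
import tempfile

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, 'source_latin1.htm')   # hypothetical name
dst_path = os.path.join(workdir, 'converted_utf8.htm')  # hypothetical name

# Create a small latin1 encoded file to convert.
with open(src_path, 'wb') as f:
    f.write(b'Ficha Datos de p\xe9rdida AND acci\xf3n\n')

# Reading decodes latin1 into unicode; writing encodes unicode as utf8.
with codecs.open(src_path, 'r', encoding='latin1') as src:
    with codecs.open(dst_path, 'w', encoding='utf8') as dst:
        for line in src:
            dst.write(line)

# Inspect the converted bytes.
with open(dst_path, 'rb') as f:
    converted = f.read()
```

With both files wrapped this way, no explicit decode() or encode() call appears in the loop; the conversion happens at the file boundaries.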

p.s. sorry if you see this twice--my newsreader is flaky right now.

Jul 27 '07 #1