encoding misunderstanding

Tim Arnold

Hi, I'm beginning to understand the encode/decode string methods, but
I'd like confirmation that I'm still thinking in the right direction:

I have a file of latin1 encoded text. Let's say I put one line of that
file
into a string variable 'tocline', as follows:
tocline = 'Ficha Datos de p\xe9rdida AND acci\xf3n'

import codecs
tocFile =
codecs.open('mytoc.htm','wb',encoding='utf8',error s='replace')
tocline = tocline.decode('latin1','replace')
tocFile.write(tocline)
tocFile.close()

What I think is that tocFile is wrapped to insure that anything
written to it is in utf8
I decode the latin1 string into python's internal unicode encoding and
that gets written out as utf8.

Questions:
what exactly is the tocline when it's read in with that \xe9 and \xed
in the string? A latin1 encoded string?
Is my method the right way to write such a line out to a file with
utf8
encoding?

If I read in the latin1 file using
codecs.open(filename,encoding='latin1') and write out the utf8 file
by
opening with
codecs.open(othername,encoding='utf8'), would I no longer have a
problem -- I could just read in latin1 and write out utf8 with no
more worries about
encoding?

thanks,
--Tim
p.s. sorry if you see this twice--my newsreader is flaky right now.

Jul 27 '07 #1

Subscribe Post Reply

966

Similar topics

Misunderstanding about closures

by: Alexander May | last post by:

When I define a function in the body of a loop, why doesn't the function "close" on the loop vairable? See example below. Thanks, Alex C:\Documents and Settings\Alexander May>python Python...

Python

Encoding/Codepage: Can't Get There From Here

by: Christopher H. Laco | last post by:

Long story longer. I need to get web user input into a backend system that a) only grocks single byte encoding, b) expectes the data transer to be 1 bytes = 1 character, and c) uses the HP Roman-6...

.NET Framework

xml, character encoding, asp question

by: Mark | last post by:

Hi... I've been doing a lot of work both creating and consuming web services, and I notice there seems to be a discontinuity between a number of the different cogs in the wheel centering around...

ASP / Active Server Pages

Saving XmlDocument with Danish Encoding????

by: M O J O | last post by:

Hi, I'm trying to update a vbproj xml file programatically, but when I try to open the updated vbproj file in Vs.Net, all entries ("Files") with danish chars are wrong. How do I force the...

.NET Framework

Encoding problem

by: Demon News | last post by:

I'm trying to do a transform (Using XmlTransform class in c#) and in the Transform I'm specifying the the output xsl below: <xsl:output method="xml" encoding="UTF-8" indent="no"/> the...

.NET Framework

Encoding XML troubles

by: fitsch | last post by:

Hi, I am trying to write a generic RSS/Atom/OPML feed client. The problem is, that those xml feeds may have different encodings: - <?xml version="1.0" encoding="ISO-8859-1" ?>... - <?xml...

C# / C Sharp

encoding during elementtree serialization

by: Chris McDonough | last post by:

ElementTree's XML serialization routine implied by tree._write(file, node, encoding, namespaces looks like this (elided): def _write(self, file, node, encoding, namespaces): # write XML to file...

Python

ISO-8859-1 encoding of an xml string

by: Christina | last post by:

Hey Guys, Currently, I am using the below code: Dim oReqDoc as XmlDocument Dim requiredBytes As Byte() requiredBytes = System.Text.UTF8Encoding.UTF8.GetBytes(oReqDoc.InnerXml). Here, I am...

Visual Basic .NET

How to create a .txt file with unicode encoding

by: ujjwaltrivedi | last post by:

Hey guys, Can anyone tell me how to create a text file with Unicode Encoding. In am using FileStream Finalfile = new FileStream("finalfile.txt", FileMode.Append, FileAccess.Write); ...

C# / C Sharp

Easy Steps to Fix "Canon Printer Won't Connect to WiFi Network"

by: taylorcarr | last post by:

A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...

General

Basic Javascript concepts

by: aa123db | last post by:

Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...

Javascript

Batch import of multiple excel files into the database

by: ryjfgjl | last post by:

If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...

Data Management

Merging data from multiple Excel files

by: ryjfgjl | last post by:

In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...

Data Management

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Navigating the Data Structures and Algorithms (DSA)

by: BarryA | last post by:

What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...

Algorithms / Advanced Math

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

Changing the language in Windows 10

by: Hystou | last post by:

Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...

Windows Server

Problem With Comparison Operator <=> in G++

by: Oralloy | last post by:

Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

C / C++