Non-unicode strings & Python.

Jonathon Blake

All:

Question

Python is currently Unicode Compliant.

What happens when strings are read in from text files that were
created using GB 2312-1980, or KPS 9566-2003, or other, equally
obscure code ranges?

The idea is to read text in the file format, and replace it with the
appropriate Unicode character, then write it out as a new text file.
[Trivial to program, but incredibly time consuming to actually code]

xan

jonathon
--
Go to http://graphology.meetup.com for information about International
Graphology Meetup Day

Jul 18 '05 #1


Martin v. Löwis

Jonathon Blake wrote:

> What happens when strings are read in from text files that were
> created using GB 2312-1980, or KPS 9566-2003, or other, equally
> obscure code ranges?

Python has two kinds of strings: byte strings, and Unicode strings.
If you read data from a file, you get byte strings - i.e. a sequence
of bytes representing literally the encoded contents of the file.
If you want Unicode strings, you need to use codecs.open.
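
To make the byte string / Unicode string distinction concrete, here is a minimal sketch. It assumes Python 3, where a file opened in binary mode returns bytes and decoding is an explicit step; the sample text and the temporary file name are made up for illustration.

```python
import os
import tempfile

# Hypothetical sample: two Chinese characters, stored on disk as GB 2312.
sample_text = "\u4e2d\u6587"
path = os.path.join(tempfile.mkdtemp(), "sample_gb2312.txt")
with open(path, "wb") as f:
    f.write(sample_text.encode("gb2312"))

# Binary read: the raw encoded bytes, exactly as they sit in the file.
with open(path, "rb") as f:
    raw = f.read()

# Decoding turns those bytes into a Unicode string.
text = raw.decode("gb2312")

print(type(raw).__name__, len(raw))    # byte string: two bytes per character here
print(type(text).__name__, len(text))  # Unicode string: one item per character
```

codecs.open does the same decoding for you while reading, so you never handle the raw bytes yourself.
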

> The idea is to read text in the file format, and replace it with the
> appropriate Unicode character, then write it out as a new text file.
> [Trivial to program, but incredibly time consuming to actually code]

Not at all:

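import codecs
# read() returns the decoded Unicode text; the file object itself cannot be written.
data = codecs.open(filename, "r", encoding="gb2312").read()
codecs.open(newfile, "w", encoding="utf-8").write(data)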

assuming that by "appropriate Unicode character" you actually mean
"I want to write the file encoded as UTF-8".
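
For larger files, a streaming variant avoids holding the whole text in memory. This is only a sketch under assumptions: it targets Python 3, where the built-in open takes an encoding argument, and the transcode function name and the sample file names are hypothetical, not part of any library.

```python
import os
import tempfile


def transcode(src_path, dst_path, src_encoding="gb2312", dst_encoding="utf-8"):
    """Copy a text file line by line, decoding from one encoding and writing another.

    Hypothetical helper for illustration. newline="" keeps the original
    line endings instead of translating them.
    """
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        for line in src:
            dst.write(line)


# Usage with a throwaway GB 2312 file.
workdir = tempfile.mkdtemp()
src_file = os.path.join(workdir, "in_gb2312.txt")
dst_file = os.path.join(workdir, "out_utf8.txt")
with open(src_file, "wb") as f:
    f.write("\u4e2d\u6587\n".encode("gb2312"))

transcode(src_file, dst_file)

with open(dst_file, "rb") as f:
    result = f.read()
```

This only works for encodings Python ships a codec for. gb2312 is one of them; anything else needs a third-party or hand-written codec. A byte sequence that is not valid in the source encoding raises UnicodeDecodeError, which is usually what you want when converting files of uncertain origin.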

Regards,
Martin

Jul 18 '05 #2
