
Interpreting string containing \u000a

Hi,

I have an ISO-8859-1 file containing things like
"Hello\u000d\u000aWorld", i.e. the character '\', followed by the
character 'u' and then '0', etc.

What is the easiest way to automatically translate these codes into
Unicode characters?

Thank you

Francis Girard
Jun 27 '08 #1
"Francis Girard" <fr**************@gmail.com> wrote:
> I have an ISO-8859-1 file containing things like
> "Hello\u000d\u000aWorld", i.e. the character '\', followed by the
> character 'u' and then '0', etc.
>
> What is the easiest way to automatically translate these codes into
> Unicode characters?
>>> s = r"Hello\u000d\u000aWorld"
>>> print s
Hello\u000d\u000aWorld
>>> s.decode('iso-8859-1').decode('unicode-escape')
u'Hello\r\nWorld'
>>>
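
For readers on Python 3, where `str` no longer has a `.decode()` method, the same two-step translation might look roughly like the sketch below. It assumes the file content has been read as `bytes`; the variable names are illustrative and not from the original post.

```python
# Bytes as they would come from the ISO-8859-1 file: a literal backslash,
# 'u', and four hex digits, not real CR/LF characters.
raw = b"Hello\\u000d\\u000aWorld"

# Step 1: ISO-8859-1 bytes -> str (every byte maps to one code point).
text = raw.decode("iso-8859-1")

# Step 2: interpret the escapes. The unicode_escape codec decodes bytes,
# so go back through latin-1, which is lossless for code points below 256.
decoded = text.encode("latin-1").decode("unicode_escape")

print(repr(decoded))
```

Since the `unicode_escape` codec treats non-escape bytes as Latin-1, `raw.decode("unicode_escape")` should give the same result in a single step.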
--
Duncan Booth http://kupuguy.blogspot.com
Jun 27 '08 #2
Francis Girard wrote:
> I have an ISO-8859-1 file containing things like
> "Hello\u000d\u000aWorld", i.e. the character '\', followed by the
> character 'u' and then '0', etc.
>
> What is the easiest way to automatically translate these codes into
> Unicode characters?
If the file really contains the escape sequences, use "unicode-escape" as the
encoding:
>>> "Hello\\u000d\\u000aWorld".decode("unicode-escape")
u'Hello\r\nWorld'

If it contains the raw bytes, use "iso-8859-1":
>>> "Hello\x0d\x0aWorld".decode("iso-8859-1")
u'Hello\r\nWorld'
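
One caveat with "unicode-escape": it interprets every backslash escape, not just `\uXXXX`, so text such as a Windows path would be altered as well. If only the `\uXXXX` codes should be translated, a regular expression is one possible alternative. The following is a minimal Python 3 sketch; the helper name `translate_u_escapes` is made up for illustration.

```python
import re

# Matches a literal backslash, 'u', and exactly four hex digits.
_U_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")

def translate_u_escapes(text):
    """Replace each \\uXXXX sequence in text with the character it names."""
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

# Other backslashes (here, a Windows-style path) are left untouched.
sample = "C:\\temp\\new" + "\\u000d\\u000a"
result = translate_u_escapes(sample)
print(repr(result))
```

This sketch does not combine surrogate pairs and does not treat a doubled backslash before `u` specially, so it only fits input where neither occurs.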

Open the file with

codecs.open(filename, encoding=encoding_as_determined_above)

instead of the builtin open().
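
On Python 3 the builtin open() accepts an encoding argument itself, so codecs.open() is less often needed. For the escape-sequence case, one simple approach is to read the bytes and decode them in a single call. The sketch below creates its own temporary sample file so that it is self-contained; the file content is invented for illustration.

```python
import os
import tempfile

# Create a small ISO-8859-1 sample file containing literal \uXXXX escapes
# plus one non-ASCII byte (0xE9, "é" in ISO-8859-1).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"Caf\xe9\\u000d\\u000aWorld")

try:
    # Read the raw bytes and decode them in one go; unicode_escape treats
    # non-escape bytes as Latin-1, so the 0xE9 byte survives as "é".
    with open(path, "rb") as f:
        content = f.read().decode("unicode_escape")
finally:
    os.remove(path)

print(repr(content))
```

Decoding the whole content at once avoids any concern about an escape sequence being split across read buffers, at the cost of holding the file in memory.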

Peter
Jun 27 '08 #3
