How to decode a string

Lad
To decode a string successfully, I need to know which encoding
it is in.
The string may be encoded in UTF-8, in windows-1250, or in some other
encoding.
Is there a way to find out a string's encoding?
Thank you for your help,
L.

Aug 21 '06 #1
Lad wrote:
> To decode a string successfully, I need to know which encoding
> it is in.

ask whoever provided the string.

> The string may be encoded in UTF-8, in windows-1250, or in some other
> encoding. Is there a way to find out a string's encoding?

in general, no. if you have enough text, you may guess, but the right
approach for that depends on the application.

</F>

Aug 21 '06 #2
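
A note on the guessing mentioned in #2: when the possible encodings are known in advance, one common heuristic is to try strict UTF-8 first and fall back to a single-byte codec. The following is only a sketch under that assumption, written in Python 3 syntax (the thread itself uses Python 2). The `guess_decode` helper and its candidate list are hypothetical names for illustration, not part of any library.

```python
def guess_decode(raw, candidates=("utf-8", "windows-1250")):
    """Return (text, encoding) for the first candidate that decodes raw cleanly."""
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            # This candidate cannot represent the bytes; try the next one.
            continue
    raise ValueError("none of the candidate encodings fit")


# Sample data (assumed for illustration): the same name in two encodings.
utf8_bytes = "Přibylová".encode("utf-8")
cp1250_bytes = "Přibylová".encode("windows-1250")

print(guess_decode(utf8_bytes))
print(guess_decode(cp1250_bytes))
```

The order of the candidates matters: single-byte codecs accept almost any byte sequence, so the stricter UTF-8 check has to come first. A successful decode still does not prove the guess is right, which is why asking the data provider remains the better option.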
Lad

Fredrik Lundh wrote:
> Lad wrote:
>> To decode a string successfully, I need to know which encoding
>> it is in.
>
> ask whoever provided the string.
>
>> The string may be encoded in UTF-8, in windows-1250, or in some other
>> encoding. Is there a way to find out a string's encoding?
>
> in general, no. if you have enough text, you may guess, but the right
> approach for that depends on the application.
>
> </F>

Fredrik,
Thank you for your reply.
The text comes from a MySQL table field that uses the utf8_czech_ci collation,
but when I try
`RealName`.decode('utf8'), where RealName is that MySQL field,

I get:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)

Can you please suggest a solution?
Thank you,
L.

Aug 21 '06 #3
In <11**********************@m73g2000cwd.googlegroups.com>, Lad wrote:
> The text comes from a MySQL table field that uses the utf8_czech_ci collation,
> but when I try
> `RealName`.decode('utf8'), where RealName is that MySQL field,
>
> I get:
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
>
> Can you please suggest a solution?

Do you get this from converting the value from the database or from trying
to print the unicode string? Can you give us the output of

print repr(RealName)

Ciao,
Marc 'BlackJack' Rintsch
Aug 21 '06 #4
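
A note on the error text in #3: the message can be reproduced in isolation, which may help when deciding which step raised it. This sketch uses Python 3 syntax rather than the thread's Python 2, and the sample character is an assumption for illustration. It shows that 0xc3 is the first byte of a UTF-8 encoded "á" and that the ASCII codec rejects it. In Python 2, the ASCII codec is typically applied implicitly when byte strings and unicode strings are mixed.

```python
# "á" (U+00E1) takes two bytes in UTF-8, and the first of them is 0xc3.
raw = "á".encode("utf-8")
print(raw)

message = None
try:
    # Decoding with the ASCII codec fails on any byte above 0x7f.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)
    print(message)

# Decoding with the codec that actually produced the bytes succeeds.
print(raw.decode("utf-8"))
```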
Lad

Marc 'BlackJack' Rintsch wrote:
> In <11**********************@m73g2000cwd.googlegroups.com>, Lad wrote:
>> The text comes from a MySQL table field that uses the utf8_czech_ci collation,
>> but when I try
>> `RealName`.decode('utf8'), where RealName is that MySQL field,
>>
>> I get:
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
>>
>> Can you please suggest a solution?
>
> Do you get this from converting the value from the database or from trying
> to print the unicode string? Can you give us the output of
>
> print repr(RealName)
>
> Ciao,
> Marc 'BlackJack' Rintsch

For
print repr(RealName)
I get

P?ibylov\xe1 Ludmila
where the ? should also be a character.
Thank you for your help,
L.

Aug 22 '06 #5
Lad wrote:
> For
> print repr(RealName)
> I get
>
> P?ibylov\xe1 Ludmila
> where the ? should also be a character.

that's not very likely; repr() always includes quotes, always escapes
non-ASCII characters, and optionally includes a Unicode prefix.

please try this

print "*", repr(RealName), type(RealName), "*"

and post the entire output; that is, *everything* between the asterisks.

</F>

Aug 22 '06 #6
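
A note on the diagnostic requested in #6: the same check carries over to current Python with small changes. This is a sketch in Python 3 syntax, where `ascii()` plays the role that `repr()` had for Python 2 strings. The `describe` helper is a hypothetical name, and the sample values are assumptions based on the names quoted in the thread.

```python
def describe(value):
    """Return an ASCII-only representation of value followed by its type name."""
    return "%s %s" % (ascii(value), type(value).__name__)


# A byte string as a database driver might return it (assumed sample).
print("*", describe(b"Fritschov\xe1 Laura"), "*")

# A text string; non-ASCII characters show up as escapes, never as "?".
print("*", describe("Přibylová Ludmila"), "*")
```

Because the output is pure ASCII and includes the quotes and the type, it survives terminals and mail clients that would otherwise mangle the characters.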
Lad
Fredrik Lundh wrote:
> Lad wrote:
>> For
>> print repr(RealName)
>> I get
>>
>> P?ibylov\xe1 Ludmila
>> where the ? should also be a character.
>
> that's not very likely; repr() always includes quotes, always escapes
> non-ASCII characters, and optionally includes a Unicode prefix.
>
> please try this
>
> print "*", repr(RealName), type(RealName), "*"
>
> and post the entire output; that is, *everything* between the asterisks.

The result of print "*", repr(RealName), type(RealName), "*" is

* 'Fritschov\xe1 Laura' <type 'str'> *

Best regards,
L

Aug 22 '06 #7
"Lad" wrote:
> The result of print "*", repr(RealName), type(RealName), "*" is
>
> * 'Fritschov\xe1 Laura' <type 'str'> *

looks like the MySQL interface is returning 8-bit strings using ISO-8859-1
encoding (or some variation of that; \xE1 is "LATIN SMALL LETTER A
WITH ACUTE" in 8859-1).

have you tried passing "use_unicode=True" to the connect() call ?

</F>

Aug 22 '06 #8
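
A note on the observation in #8: it can be checked directly. This is a sketch in Python 3 syntax; the byte string is copied from #7, and the list of codecs tried is an assumption for illustration. It shows that 0xE1 maps to the same character in ISO-8859-1 and in two Central European codecs, and that the bytes are not valid UTF-8.

```python
import unicodedata

raw = b"Fritschov\xe1 Laura"

# Look up what the byte 0xE1 means in several single-byte encodings.
for codec in ("iso-8859-1", "iso-8859-2", "windows-1250"):
    char = b"\xe1".decode(codec)
    print(codec, unicodedata.name(char))

# In UTF-8, 0xE1 starts a three-byte sequence, so a following space is invalid.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print("valid UTF-8:", valid_utf8)

# Decoding with a single-byte codec recovers the name.
text = raw.decode("iso-8859-1")
print(text)
```

This is consistent with the diagnosis that the driver is handing back single-byte encoded strings rather than UTF-8, although the bytes alone cannot say which of the single-byte codecs was used.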
Lad

Fredrik Lundh wrote:
> "Lad" wrote:
>> The result of print "*", repr(RealName), type(RealName), "*" is
>>
>> * 'Fritschov\xe1 Laura' <type 'str'> *
>
> looks like the MySQL interface is returning 8-bit strings using ISO-8859-1
> encoding (or some variation of that; \xE1 is "LATIN SMALL LETTER A
> WITH ACUTE" in 8859-1).
>
> have you tried passing "use_unicode=True" to the connect() call ?
>
> </F>

Fredrik,
Thank you for your reply.
I found out that if I do not decode the string at all, it looks
correct, but I do not know why it is OK without decoding.
I use Django, and I do not pass use_unicode=True to the connect() call.

Aug 22 '06 #9
