To decode a string successfully, I need to know which encoding
it is in.
The string may be encoded in utf8, in windows-1250, or in some other
encoding.
Is there a way to find out a string's encoding?
Thank you for your help.
L.
Lad wrote:
> To decode a string successfully, I need to know which encoding
> it is in.
ask whoever provided the string.
> The string may be encoded in utf8, in windows-1250, or in some other
> encoding. Is there a way to find out a string's encoding?
in general, no. if you have enough text, you may guess, but the right
approach for that depends on the application.
</F>
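One simple way to "guess" is to try a short list of candidate encodings in order and keep the first one that decodes without error. The sketch below assumes Python 3 (the thread itself uses Python 2); the function name guess_decode and the candidate list are hypothetical and only illustrate the idea. A successful decode does not prove the guess is right, because many single-byte encodings accept almost any byte sequence, so the strict one (UTF-8) is tried first.

```python
def guess_decode(data, candidates=("utf-8", "windows-1250")):
    """Return (text, encoding) for the first candidate that decodes cleanly.

    This is a heuristic: it only reports the first encoding that does not
    raise, which is not necessarily the encoding the data was written in.
    """
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # this candidate does not fit, try the next one
    raise ValueError("none of the candidate encodings fit the data")


name = "P\u0159ibylov\u00e1"  # a Czech surname with two non-ASCII letters

# Bytes produced as UTF-8 are recognised by the first candidate.
print(guess_decode(name.encode("utf-8")))

# Bytes produced as windows-1250 are not valid UTF-8, so the second
# candidate is used instead.
print(guess_decode(name.encode("windows-1250")))
```

The order of the candidates matters: putting a single-byte encoding first would make it "win" even for UTF-8 input and produce mojibake.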
Fredrik,
Thank you for your reply.
The text comes from a MySQL table field that uses the utf8_czech_ci collation,
but when I try
`RealName`.decode('utf8'), where RealName is the value of that MySQL field,
I get:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
Can you please suggest a solution?
Thank you
L.
Do you get this from converting the value from the database or from trying
to print the unicode string? Can you give us the output of
print repr(RealName)
Ciao,
Marc 'BlackJack' Rintsch
For the command
print repr(RealName)
I get
P?ibylov\xe1 Ludmila
where there should be another character in place of the ?.
Thank you for your help.
L.
that's not very likely; repr() always includes quotes, always escapes
non-ASCII characters, and optionally includes a Unicode prefix.
please try this
print "*", repr(RealName), type(RealName), "*"
and post the entire output; that is, *everything* between the asterisks.
</F>
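For readers on Python 3, a rough equivalent of the diagnostic above could look like the sketch below. It is not part of the original advice: the helper name describe is hypothetical, and ascii() is used because in Python 3 it is ascii(), not repr(), that escapes every non-ASCII character.

```python
def describe(value):
    """Return an ASCII-only description of a value and its type name."""
    return "* %s %s *" % (ascii(value), type(value).__name__)


# A byte string: the b prefix and the \xe1 escape show the raw bytes.
print(describe(b"Fritschov\xe1 Laura"))

# A text string: no b prefix, and \xe1 here is a code point, not a byte.
print(describe("Fritschov\xe1 Laura"))
```

Posting the full line, including the quotes, any prefix, and the type name, removes the guesswork about whether the value is already text or still encoded bytes.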
The result of print "*", repr(RealName), type(RealName), "*" is
* 'Fritschov\xe1 Laura' <type 'str'> *
Best regards,
L
looks like the MySQL interface is returning 8-bit strings using ISO-8859-1
encoding (or some variation of that; \xE1 is "LATIN SMALL LETTER A
WITH ACUTE" in 8859-1).
have you tried passing "use_unicode=True" to the connect() call ?
</F>
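The reasoning above can be checked with a short sketch, assuming Python 3 and using the byte string from the posted output. The variable names are illustrative only. It also shows why "some variation" is a fair caveat: windows-1250 maps the byte 0xE1 to the same character.

```python
import unicodedata

raw = b"Fritschov\xe1 Laura"  # the bytes shown in the repr() output

# Interpreted as ISO-8859-1, byte 0xE1 becomes U+00E1.
text = raw.decode("iso-8859-1")
print(text)
print(unicodedata.name(text[9]))  # the character right after "Fritschov"

# windows-1250 happens to map 0xE1 to the same character, so this
# sample alone cannot tell the two encodings apart.
print(raw.decode("windows-1250") == text)

# The same bytes are not valid UTF-8: 0xE1 announces a multi-byte
# sequence, but the next byte is a plain space.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("not valid UTF-8:", exc.reason)
```

So a lone 0xE1 byte followed by an ASCII character rules out UTF-8, while a single-byte Latin encoding fits the data.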
Fredrik,
Thank you for your reply.
I found out that if I do not decode the string at all, it looks
correct. But I do not know why it is OK without decoding.
I use Django, and I do not pass use_unicode=True to the connect() call.