Re: Unicode chr(150) en dash

> "C:\Python24\Lib\site-packages\MySQLdb\cursors.py", line 149, in execute
>     query = query.encode(charset)
> UnicodeEncodeError: 'latin-1' codec can't encode character u'\u2013'
> in position 52: ordinal not in range(256)

Here it complains that it deals with the character U+2013, which
is "EN DASH"; it complains that the encoding called "latin-1" does
not support that character.

That is a fact - Latin-1 does not support EN DASH.
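
To make that concrete, here is a minimal sketch that reproduces the same UnicodeEncodeError outside of MySQLdb and shows two encodings that do have a place for EN DASH. The helper name try_encode is made up for this illustration, and the code is written so that it should run on both the Python 2 of the original post and a current Python 3. Whether your database connection can actually be switched to another charset is a separate question that this sketch does not answer.

```python
EN_DASH = u"\u2013"

def try_encode(text, charset):
    """Return the encoded bytes, or None if charset cannot represent text.

    Hypothetical helper for illustration only.
    """
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None

latin1_result = try_encode(EN_DASH, "latin-1")  # fails: Latin-1 has no EN DASH
cp1252_result = try_encode(EN_DASH, "cp1252")   # succeeds: single byte 0x96
utf8_result = try_encode(EN_DASH, "utf-8")      # succeeds: three bytes

for label, result in (("latin-1", latin1_result),
                      ("cp1252", cp1252_result),
                      ("utf-8", utf8_result)):
    print("%-8s %r" % (label, result))
```

Note that the cp1252 result is exactly the byte 150 that the rest of this thread is about.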

> When I type 'print chr(150)' into a python command line window I get
> a LATIN SMALL LETTER U WITH CIRCUMFLEX
> (http://www.fileformat.info/info/unic...0fb/index.htm),

That's because your console uses code page 437:

py> import unicodedata
py> chr(150).decode("cp437")
u'\xfb'
py> unicodedata.name(_)
'LATIN SMALL LETTER U WITH CIRCUMFLEX'

Code page 437, on your system, is the "OEM code page".

> but when I do so in an IDLE window I get a hyphen (chr(45)).

That's because IDLE uses the "ANSI code page" of your system,
which is Windows code page 1252.

py> chr(150).decode("windows-1252")
u'\u2013'
py> unicodedata.name(_)
'EN DASH'
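
The two sessions above can be put side by side in a short script. This is only a sketch: the function name describe and the placeholder string for unnamed characters are made up for illustration, and b"\x96" stands in for chr(150) so that the code should run on both Python 2 and Python 3. Latin-1 is included as a third case, because there the byte 150 maps to a C1 control character that has no Unicode name at all.

```python
import unicodedata

RAW = b"\x96"  # the single byte with value 150

def describe(raw, charset):
    """Decode one byte under charset; return (code point, character name)."""
    ch = raw.decode(charset)
    # unicodedata.name() raises ValueError for unnamed characters
    # unless a default is supplied as the second argument.
    name = unicodedata.name(ch, "<unnamed control character>")
    return "U+%04X" % ord(ch), name

for charset in ("cp437", "cp1252", "latin-1"):
    print("%-8s %s %s" % ((charset,) + describe(RAW, charset)))
```

The same byte gives three different characters; which one you see depends only on the encoding the displaying program assumes.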

You actually *don't* get the character U+002D, HYPHEN-MINUS,
displayed - just a character that has, in your font, a glyph
which looks similar to the glyph for HYPHEN-MINUS.
However, HYPHEN-MINUS and EN DASH are different characters, and
IDLE displays the latter, not the former.
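
If you want to check this for a character you see on screen, a small sketch like the following shows that the two are distinct code points with distinct names. The function dash_to_hyphen at the end is a hypothetical helper, not something from MySQLdb or the standard library: it shows one possible way to force text into Latin-1 by substituting a hyphen, and it loses information, so whether it is acceptable depends on your data.

```python
import unicodedata

HYPHEN_MINUS = u"\u002d"
EN_DASH = u"\u2013"

names = [unicodedata.name(HYPHEN_MINUS), unicodedata.name(EN_DASH)]
same_character = (HYPHEN_MINUS == EN_DASH)

def dash_to_hyphen(text):
    """Replace every EN DASH by HYPHEN-MINUS (lossy; illustration only)."""
    return text.replace(EN_DASH, HYPHEN_MINUS)

sample = u"pages 10\u201312"
encoded = dash_to_hyphen(sample).encode("latin-1")  # no UnicodeEncodeError now

print(names)
print(same_character)
print(repr(encoded))
```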

> I tried searching for "en dash" or even "dash" in the encodings folder
> of the Python Lib directory, but I couldn't find anything.

You didn't ask a specific question, so I assume you are primarily
after an explanation.

HTH,
Martin
Jun 27 '08 #1