Dr. Dobb's Python-URL! - weekly Python news and links (Dec 30)

QOTW: "I found the discussion of unicode, in any python book I have,
insufficient." -- Thomas Heller

"If you develop on a Mac, ... Objective-C could come in handy. . . .
PyObjC makes mixing the two languages dead easy and more convenient than
indoor plumbing." -- Robert Kern
Among other activities, the PSF aggregates donors with dollars
destined to do good Python works, and developers expert in
obscure corners of Pythonia.
http://groups-beta.google.com/group/...5bfe05419aa0b3
http://groups-beta.google.com/group/...22f3e14752ce5/

Yippee! The martellibot promises to explain Unicode for Pythoneers.
http://groups-beta.google.com/group/...15a5a05c206712

The glorious SciPy project supports *multiple* worthwhile Wikis.
http://www.scipy.org/wikis

Good style in Python does not generally include "in-place"
operations on lists. Several cleaner idioms are possible.
http://groups-beta.google.com/group/...4559f53d25474e

Assume you're comfortable with tuples' semantics, immutability,
and so on. Do you correctly understand the basics of their
syntax, though? This is another opportunity to think about
Unicode, by the way.
http://groups-beta.google.com/group/...0049d7adb1bcce

Robert Kern, Paul Rubin, Mike Meyer, Alex Martelli, and others
provide disproportionately high-quality advice (and tangents!)
on the subject of languages which complement Python.
http://groups-beta.google.com/group/...c1c6d9d87049b6
========================================================================
Everything Python-related you want is probably one or two clicks away in
these pages:

Python.org's Python Language Website is the traditional
center of Pythonia
http://www.python.org
Notice especially the master FAQ
http://www.python.org/doc/FAQ.html

PythonWare complements the digest you're reading with the
marvelous daily python url
http://www.pythonware.com/daily
Mygale is a news-gathering webcrawler that specializes in (new)
World-Wide Web articles related to Python.
http://www.awaretek.com/nowak/mygale.html
While cosmetically similar, Mygale and the Daily Python-URL
are utterly different in their technologies and generally in
their results.

comp.lang.python.announce announces new Python software. Be
sure to scan this newsgroup weekly.
http://groups.google.com/groups?oi=d...ython.announce

Brett Cannon continues the marvelous tradition established by
Andrew Kuchling and Michael Hudson of intelligently summarizing
action on the python-dev mailing list once every other week.
http://www.python.org/dev/summary/

The Python Package Index catalogues packages.
http://www.python.org/pypi/

The somewhat older Vaults of Parnassus ambitiously collects references
to all sorts of Python resources.
http://www.vex.net/~x/parnassus/

Much of Python's real work takes place on Special-Interest Group
mailing lists
http://www.python.org/sigs/

The Python Business Forum "further[s] the interests of companies
that base their business on ... Python."
http://www.python-in-business.org

Python Success Stories--from air-traffic control to on-line
match-making--can inspire you or decision-makers to whom you're
subject with a vision of what the language makes practical.
http://www.pythonology.com/success

The Python Software Foundation (PSF) has replaced the Python
Consortium as an independent nexus of activity. It has official
responsibility for Python's development and maintenance.
http://www.python.org/psf/
Among the ways you can support PSF is with a donation.
http://www.python.org/psf/donate.html

Kurt B. Kaiser publishes a weekly report on faults and patches.
http://www.google.com/groups?as_usub...python%20patch

Cetus collects Python hyperlinks.
http://www.cetus-links.org/oo_python.html

Python FAQTS
http://python.faqts.com/

The Cookbook is a collaborative effort to capture useful and
interesting recipes.
http://aspn.activestate.com/ASPN/Cookbook/Python

Among several Python-oriented RSS/RDF feeds available are
http://www.python.org/channews.rdf
http://bootleg-rss.g-blog.net/pythonware_com_daily.pcgi
http://python.de/backend.php
For more, see
http://www.syndic8.com/feedlist.php?...ShowStatus=all
The old Python "To-Do List" now lives principally in a
SourceForge reincarnation.
http://sourceforge.net/tracker/?atid...70&func=browse
http://python.sourceforge.net/peps/pep-0042.html

The online Python Journal is posted at pythonjournal.cognizor.com.
ed****@pythonjournal.com and ed****@pythonjournal.cognizor.com
welcome submission of material that helps people's understanding
of Python use, and offer Web presentation of your work.

del.icio.us presents an intriguing approach to reference commentary.
It already aggregates quite a bit of Python intelligence.
http://del.icio.us/tag/python

*Py: the Journal of the Python Language*
http://www.pyzine.com

Archive probing tricks of the trade:
http://groups.google.com/groups?oi=d...python&num=100
http://groups.google.com/groups?meta....lang.python.*

Previous - (U)se the (R)esource, (L)uke! - messages are listed here:
http://www.ddj.com/topics/pythonurl/
http://purl.org/thecliff/python/url.html (dormant)
or
http://groups.google.com/groups?oi=djq&as_q=+Python-URL!&as_ugroup=comp.lang.python
Suggestions/corrections for next week's posting are always welcome.
E-mail to <Py********@phaseit.net> should get through.

To receive a new issue of this posting in e-mail each Monday morning
(approximately), ask <cl****@phaseit.net> to subscribe. Mention
"Python-URL!".
-- The Python-URL! Team--

Dr. Dobb's Journal (http://www.ddj.com) is pleased to participate in and
sponsor the "Python-URL!" project.
Jul 18 '05 #1
Cameron Laird <py********@phaseit.net> wrote:
...
Yippee! The martellibot promises to explain Unicode for Pythoneers.
http://groups-beta.google.com/group/...15a5a05c206712


Uh -- _did_ I? Eeep... I guess I did... mostly, I was pointing to
Holger Krekel's very nice recipe (not sure he posted it to the site as
well as submitting it for the printed edition, but, lobby _HIM_ about
that;-).
Alex
Jul 18 '05 #2
On Fri, Dec 31, 2004 at 19:18 +0100, Alex Martelli wrote:
Cameron Laird <py********@phaseit.net> wrote:
...
Yippee! The martellibot promises to explain Unicode for Pythoneers.
http://groups-beta.google.com/group/...15a5a05c206712


Uh -- _did_ I? Eeep... I guess I did... mostly, I was pointing to
Holger Krekel's very nice recipe (not sure he posted it to the site as
well as submitting it for the printed edition, but, lobby _HIM_ about
that;-).


FWIW, i added the recipe back to the online cookbook. It's not perfectly
formatted but still useful, i hope.

http://aspn.activestate.com/ASPN/Coo.../Recipe/361742

cheers,

holger

P.S: happy new year.
Jul 18 '05 #3
Holger:
FWIW, i added the recipe back to the online cookbook. It's not perfectly
formatted but still useful, i hope.

http://aspn.activestate.com/ASPN/Coo.../Recipe/361742


Uhm... on my system I get:
>>> german_ae = unicode('\xc3\xa4', 'utf8')
>>> print german_ae # dunno if it will appear right on Google groups
ä
>>> german_ae.decode('latin1')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
?? What's wrong?

Michele Simionato

Jul 18 '05 #4
On Tue, 04 Jan 2005 05:43:32 -0800, michele.simionato wrote:
Holger:
FWIW, i added the recipe back to the online cookbook. It's not perfectly
formatted but still useful, i hope.

http://aspn.activestate.com/ASPN/Coo.../Recipe/361742


Uhm... on my system I get:
>>> german_ae = unicode('\xc3\xa4', 'utf8')
>>> print german_ae # dunno if it will appear right on Google groups
ä
>>> german_ae.decode('latin1')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)
?? What's wrong?


I'd rather use german_ae.encode('latin1')
                         ^^^^^^

which returns '\xe4'.
Michele Simionato


Jul 18 '05 #5
Stephan:
I'd rather use german_ae.encode('latin1')
                         ^^^^^^
which returns '\xe4'.


uhm ... then there is a misprint in the discussion of the recipe;
BTW what's the difference between .encode and .decode ?
(yes, I have been living in happy ASCII-land until now ... ;)
I should probably ask for a Unicode primer; I have found the
one by Marc André Lemburg
http://www.reportlab.com/i18n/python..._tutorial.html
and I am reading it right now.
Michele Simionato

Jul 18 '05 #6
In article <11*********************@c13g2000cwb.googlegroups.com>,
<mi***************@gmail.com> wrote:

BTW what's the difference between .encode and .decode ?
(yes, I have been living in happy ASCII-land until now ... ;)


Here's the stark simple recipe: when you use Unicode, you *MUST* switch
to a Unicode-centric view of the universe. Therefore you encode *FROM*
Unicode and you decode *TO* Unicode. Period. It's similar to the way
floating point contaminates ints.
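
A minimal sketch of that rule, written for Python 3 rather than the Python 2 the thread is using (an assumption: in Python 3 the text type is `str` and the encoded type is `bytes`, where the thread says `unicode` and `str`). The variable names are only for illustration:

```python
# Python 3 sketch: str is the Unicode text type, bytes is the encoded form.
text = "\xe4"  # LATIN SMALL LETTER A WITH DIAERESIS

# encode FROM Unicode: one character, two different byte forms
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

print(utf8_bytes)    # b'\xc3\xa4'
print(latin1_bytes)  # b'\xe4'

# decode TO Unicode: both byte forms map back to the same text
assert utf8_bytes.decode("utf-8") == text
assert latin1_bytes.decode("latin-1") == text
```

The byte forms differ, but the text they decode to is the same, which is why it pays to keep Unicode at the center and treat bytes as something that exists only at the edges.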
--
Aahz (aa**@pythoncraft.com) <*> http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing." --Alan Perlis
Jul 18 '05 #7

michele> BTW what's the difference between .encode and .decode ?

I started to answer, then got confused when I read the docstrings for
unicode.encode and unicode.decode:
>>> help(u"\xe4".decode)
Help on built-in function decode:

decode(...)
S.decode([encoding[,errors]]) -> string or unicode

Decodes S using the codec registered for encoding. encoding defaults
to the default encoding. errors may be given to set a different error
handling scheme. Default is 'strict' meaning that encoding errors raise
a UnicodeDecodeError. Other possible values are 'ignore' and 'replace'
as well as any other name registerd with codecs.register_error that is
able to handle UnicodeDecodeErrors.
>>> help(u"\xe4".encode)

Help on built-in function encode:

encode(...)
S.encode([encoding[,errors]]) -> string or unicode

Encodes S using the codec registered for encoding. encoding defaults
to the default encoding. errors may be given to set a different error
handling scheme. Default is 'strict' meaning that encoding errors raise
a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
'xmlcharrefreplace' as well as any other name registered with
codecs.register_error that can handle UnicodeEncodeErrors.

It probably makes sense to one who knows, but for the feeble-minded like
myself, they seem about the same.

I'd be happy to add a couple examples to the string methods section of the
docs if someone will produce something simple that makes the distinction
clear.
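
One possible illustration of the `errors` argument that both docstrings describe, sketched for Python 3 (an assumption: the thread is on Python 2, where the text type is `unicode`). The handler names are the ones listed in the docstrings above; the variable names are made up for the example:

```python
text = "\xe6\xf8\xe5"  # the Danish letters æøå

# 'strict' (the default) raises when a character has no ASCII form
try:
    text.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# The other handlers trade the exception for a lossy or escaped result
replaced = text.encode("ascii", "replace")            # b'???'
ignored = text.encode("ascii", "ignore")              # b''
xml_refs = text.encode("ascii", "xmlcharrefreplace")  # b'&#230;&#248;&#229;'

# Decoding takes handlers too; 'replace' substitutes U+FFFD for bad bytes
decoded = b"abc\xe6".decode("ascii", "replace")       # 'abc\ufffd'

print(strict_failed, replaced, ignored, xml_refs, repr(decoded))
```

Note that `xmlcharrefreplace` only makes sense when encoding, which matches the docstrings: it appears in the `encode` text but not in the `decode` text.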

Skip

Jul 18 '05 #8
Yep, I did the same and got confused :-/

Michele

Jul 18 '05 #9
aahz> Here's the stark simple recipe: when you use Unicode, you *MUST*
aahz> switch to a Unicode-centric view of the universe. Therefore you
aahz> encode *FROM* Unicode and you decode *TO* Unicode. Period. It's
aahz> similar to the way floating point contaminates ints.

That's what I do in my code. Why do Unicode objects have a decode method
then?

Skip

Jul 18 '05 #10
Skip Montanaro <sk**@pobox.com> writes:
michele> BTW what's the difference between .encode and .decode ?

I started to answer, then got confused when I read the docstrings for
unicode.encode and unicode.decode:
>>> help(u"\xe4".decode)
Help on built-in function decode:

decode(...)
S.decode([encoding[,errors]]) -> string or unicode

Decodes S using the codec registered for encoding. encoding defaults
to the default encoding. errors may be given to set a different error
handling scheme. Default is 'strict' meaning that encoding errors raise
a UnicodeDecodeError. Other possible values are 'ignore' and 'replace'
as well as any other name registerd with codecs.register_error that is
able to handle UnicodeDecodeErrors.
>>> help(u"\xe4".encode)
Help on built-in function encode:

encode(...)
S.encode([encoding[,errors]]) -> string or unicode

Encodes S using the codec registered for encoding. encoding defaults
to the default encoding. errors may be given to set a different error
handling scheme. Default is 'strict' meaning that encoding errors raise
a UnicodeEncodeError. Other possible values are 'ignore', 'replace' and
'xmlcharrefreplace' as well as any other name registered with
codecs.register_error that can handle UnicodeEncodeErrors.

It probably makes sense to one who knows, but for the feeble-minded like
myself, they seem about the same.


It seems also the error messages aren't too helpful:
>>> "ä".encode("latin-1")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x84 in position 0: ordinal not in range(128)
Hm, why does the 'encode' call complain about decoding?

Why do string objects have an encode method, and why do unicode objects
have a decode method, and what does this error message want to tell me:
>>> u"ä".decode("latin-1")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 0: ordinal not in range(128)


Thomas
Jul 18 '05 #11
mi***************@gmail.com wrote:
uhm ... then there is a misprint in the discussion of the recipe;
BTW what's the difference between .encode and .decode ?
(yes, I have been living in happy ASCII-land until now ... ;)

# -*- coding: latin-1 -*-
# here i make a unicode string
unicode_file = u'Some danish characters æøå' #.encode('hex')
print type(unicode_file)
print repr(unicode_file)
print ''
# I can convert this unicode string to an ordinary string.
# because æøå are in the latin-1 charmap it can be understood as
# a latin-1 string
# the æøå characters even have the same value in both
latin1_file = unicode_file.encode('latin-1')
print type(latin1_file)
print repr(latin1_file)
print latin1_file
print ''
## I can *not* convert it to ascii
#ascii_file = unicode_file.encode('ascii')
#print ''
# I can also convert it to utf-8
utf8_file = unicode_file.encode('utf-8')
print type(utf8_file)
print repr(utf8_file)
print utf8_file
print ''
# utf8_file is now an ordinary string. Again, it can help to think of it
# as a file format.
#
# I can convert this file/string back to unicode again by using the
# decode method. It tells python to decode this "file format" as utf-8
# when it loads it onto a unicode string. And we are back where we started.
unicode_file = utf8_file.decode('utf-8')
print type(unicode_file)
print repr(unicode_file)
print ''
# So basically you can encode a unicode string into a special
# string/file format, and you can decode a string from a special
# string/file format back into unicode.
###################################
<type 'unicode'>
u'Some danish characters \xe6\xf8\xe5'

<type 'str'>
'Some danish characters \xe6\xf8\xe5'
Some danish characters æøå

<type 'str'>
'Some danish characters \xc3\xa6\xc3\xb8\xc3\xa5'
Some danish characters æøå

<type 'unicode'>
u'Some danish characters \xe6\xf8\xe5'

--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science
Jul 18 '05 #12
Thomas Heller wrote:
It seems also the error messages aren't too helpful:
>>> "ä".encode("latin-1")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x84 in position 0: ordinal not in range(128)

Hm, why does the 'encode' call complain about decoding?


Because it tries to print it out to your console and fails. While writing
to the console it tries to convert to ascii.

Besides, you should write:

u"ä".encode("latin-1") to get a latin-1 encoded string.
--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science
Jul 18 '05 #13
Max M <ma**@mxm.dk> writes:
Thomas Heller wrote:
It seems also the error messages aren't too helpful:
>>> "ä".encode("latin-1")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x84 in position 0: ordinal not in range(128)
Hm, why does the 'encode' call complain about decoding?


Because it tries to print it out to your console and fails. While
writing to the console it tries to convert to ascii.


Wrong, same error without trying to print something:
>>> x = "ä".encode("latin-1")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0x84 in position 0: ordinal not in range(128)


Besides, you should write:

u"ä".encode("latin-1") to get a latin-1 encoded string.


I know, but the question was: why does a unicode string have an encode
method, and why does it complain about decoding (which has already been
answered in the meantime).

Thomas
Jul 18 '05 #14
Skip Montanaro wrote:
aahz> Here's the stark simple recipe: when you use Unicode, you *MUST*
aahz> switch to a Unicode-centric view of the universe. Therefore you
aahz> encode *FROM* Unicode and you decode *TO* Unicode. Period. It's
aahz> similar to the way floating point contaminates ints.

That's what I do in my code. Why do Unicode objects have a decode method
then?


Because MAL implemented it! >;->

It first encodes in the default encoding and then decodes the result
with the specified encoding, so if u is a unicode object
u.decode("utf-16")
is an abbreviation of
u.encode().decode("utf-16")

In the same way str has an encode method, so
s.encode("utf-16")
is an abbreviation of
s.decode().encode("utf-16")
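
The two abbreviations above can be imitated in Python 3 to see where the surprising exceptions in this thread come from. This is only a sketch under assumptions: Python 3 has no `str.decode` or `bytes.encode`, so the two helpers below are hypothetical stand-ins, and the default encoding is taken to be ASCII as in the tracebacks quoted earlier.

```python
def py2_unicode_decode(u, encoding, default="ascii"):
    """Hypothetical stand-in for Python 2's unicode.decode(encoding)."""
    # hidden first step: encode with the default encoding
    return u.encode(default).decode(encoding)


def py2_str_encode(s, encoding, default="ascii"):
    """Hypothetical stand-in for Python 2's str.encode(encoding)."""
    # hidden first step: decode with the default encoding
    return s.decode(default).encode(encoding)


def error_name(func, *args):
    """Return the name of the UnicodeError raised by func, or None."""
    try:
        func(*args)
    except UnicodeError as exc:
        return type(exc).__name__
    return None


# Pure ASCII survives the hidden step, so the call appears to work
assert py2_unicode_decode("abc", "latin-1") == "abc"
assert py2_str_encode(b"abc", "latin-1") == b"abc"

# Non-ASCII input fails in the hidden step, with the "opposite" error
print(error_name(py2_unicode_decode, "\xe4", "latin-1"))  # UnicodeEncodeError
print(error_name(py2_str_encode, b"\x84", "latin-1"))     # UnicodeDecodeError
```

Seen this way, a `decode` call that reports an encode error (and vice versa) is the hidden default-encoding step failing before the requested codec is ever used.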

Bye,
Walter Dörwald
Jul 18 '05 #15
Skip Montanaro wrote:
I started to answer, then got confused when I read the docstrings for
unicode.encode and unicode.decode:

[snip]
It certainly is confusing. When I first started Unicoding, I pretty
much stuck to Aahz's rule of thumb, without understanding the details,
and I still do that. But now I do understand it.

Although encodings are bijective (i.e., equivalent one-to-one
mappings), they are not apolar. One side of the encoding is
arbitrarily labeled the encoded form; the other is arbitrarily labeled
the decoded form. (This is not a relativistic system, here.) The
encode method maps from the decoded to the encoded set. The decode
method does the inverse.

That's it. The only real technical difference between encode and
decode is the direction they map in.

By convention, the decoded form is a Python unicode string, and the
encoded form is the byte string.

I believe it's technically possible (but very rude) to write an
"inverse encoding", where the "encoded" form is a unicode string, and
the decoded form is a UTF-8 byte string.

Also, note that there are some encodings unrelated to Unicode. For
example, try this:

>>> "abcd".encode("base64")
This is an encoding between two byte strings.
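
For readers on Python 3, a rough equivalent is sketched below. It assumes the `codecs` module functions are used, because the method form shown in the post is Python 2 spelling and `str.encode` in Python 3 is restricted to text encodings:

```python
import codecs

# bytes -> bytes: no Unicode text is involved on either side
encoded = codecs.encode(b"abcd", "base64")
decoded = codecs.decode(encoded, "base64")

print(encoded)  # b'YWJjZA==\n'
print(decoded)  # b'abcd'
```

The trailing newline is part of what the `base64` codec emits; it is not added by `print`.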
--
CARL BANKS

Jul 18 '05 #16
Carl Banks wrote:
Also, note that there are some encodings unrelated to Unicode. For
example, try this:

>>> "abcd".encode("base64")
This is an encoding between two byte strings.


Yes. This can be especially nice when you need to use restricted charsets.

I needed to use unicode objects as Zope ids. But Zope only accepts a
subset of ascii as ids.

So I used:
>>> hex_id = u'INBOX'.encode('utf-8').encode('hex')
>>> print hex_id
494e424f58

And I can get the unicode representation back with:

>>> unicode_id = hex_id.decode('hex').decode('utf-8')
>>> unicode_id
u'INBOX'

In that case hex_id.decode('hex') doesn't return a unicode, but a utf-8
encoded string.
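
The same round trip can be sketched in Python 3, where the `hex` codec step is replaced by `bytes.hex()` and `bytes.fromhex()`. The function names `to_ascii_id` and `from_ascii_id` are hypothetical, chosen for this example only, and nothing here is specific to Zope:

```python
def to_ascii_id(name):
    """Turn any text into an id made only of the characters 0-9 and a-f."""
    return name.encode("utf-8").hex()


def from_ascii_id(hex_id):
    """Invert to_ascii_id: hex digits -> UTF-8 bytes -> text."""
    return bytes.fromhex(hex_id).decode("utf-8")


print(to_ascii_id("INBOX"))  # 494e424f58

# Non-ASCII names survive the round trip as well
name = "Indbakke \xe6\xf8\xe5"
safe = to_ascii_id(name)
assert safe.isascii()
assert from_ascii_id(safe) == name
```

The intermediate value is a UTF-8 byte string, as the post says; only the final `decode("utf-8")` brings it back to text.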

--

hilsen/regards Max M, Denmark

http://www.mxm.dk/
IT's Mad Science
Jul 18 '05 #17
