
PEP 3131: Supporting Non-ASCII Identifiers

PEP 1 specifies that PEP authors need to collect feedback from the
community. As the author of PEP 3131, I'd like to encourage comments
to the PEP included below, either here (comp.lang.python), or to
py*********@python.org

In summary, this PEP proposes to allow non-ASCII letters as
identifiers in Python. If the PEP is accepted, the following
identifiers would also become valid as class, function, or
variable names: Löffelstiel, changé, ошибка, or 売り場
(hoping that the latter one means "counter").

I believe this PEP differs from other Py3k PEPs in that it really
requires feedback from people with different cultural background
to evaluate it fully - most other PEPs are culture-neutral.

So, please provide feedback, e.g. perhaps by answering these
questions:
- should non-ASCII identifiers be supported? why?
- would you use them if it was possible to do so? in what cases?

Regards,
Martin
PEP: 3131
Title: Supporting Non-ASCII Identifiers
Version: $Revision: 55059 $
Last-Modified: $Date: 2007-05-01 22:34:25 +0200 (Di, 01 Mai 2007) $
Author: Martin v. Löwis <ma****@v.loewis.de>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 1-May-2007
Python-Version: 3.0
Post-History:

Abstract
========

This PEP proposes supporting non-ASCII letters (such as accented
characters, Cyrillic, Greek, Kanji, etc.) in Python identifiers.

Rationale
=========

Python code is written by many people in the world who are not familiar
with the English language, or even well-acquainted with the Latin
writing system. Such developers often desire to define classes and
functions with names in their native languages, rather than having to
come up with an (often incorrect) English translation of the concept
they want to name.

For some languages, common transliteration systems exist (in particular,
for the Latin-based writing systems). For other languages, users have
greater difficulty using Latin to write their native words.

Common Objections
=================

Some objections are often raised against proposals similar to this one.

People claim that they will not be able to use a library if to do so
they have to use characters they cannot type on their keyboards.
However, it is the choice of the designer of the library to decide on
various constraints for using the library: people may not be able to use
the library because they cannot get physical access to the source code
(because it is not published), or because licensing prohibits usage, or
because the documentation is in a language they cannot understand. A
developer wishing to make a library widely available needs to make a
number of explicit choices (such as publication, licensing, language
of documentation, and language of identifiers). It should always be the
choice of the author to make these decisions - not the choice of the
language designers.

In particular, projects wishing to have wide usage may want
to establish a policy that all identifiers, comments, and documentation
are written in English (see the GNU coding style guide for an example of
such a policy). Restricting the language to ASCII-only identifiers does
not enforce comments and documentation to be English, or the identifiers
actually to be English words, so an additional policy is necessary,
anyway.

Specification of Language Changes
=================================

The syntax of identifiers in Python will be based on the Unicode
standard annex UAX-31 [1]_, with elaboration and changes as defined
below.

Within the ASCII range (U+0001..U+007F), the valid characters for
identifiers are the same as in Python 2.5. This specification only
introduces additional characters from outside the ASCII range. For
other characters, the classification uses the version of the Unicode
Character Database as included in the ``unicodedata`` module.

The identifier syntax is ``<ID_Start> <ID_Continue>*``.

``ID_Start`` is defined as all characters having one of the general
categories uppercase letters (Lu), lowercase letters (Ll), titlecase
letters (Lt), modifier letters (Lm), other letters (Lo), letter numbers
(Nl), plus the underscore (XXX what are "stability extensions" listed in
UAX 31).

``ID_Continue`` is defined as all characters in ``ID_Start``, plus
nonspacing marks (Mn), spacing combining marks (Mc), decimal number
(Nd), and connector punctuations (Pc).
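
Editorial sketch (not part of the PEP text): the two definitions above can be tried out with the ``unicodedata`` module. The category sets below are transcribed from the draft wording; the function names are hypothetical, and the "stability extensions" that the draft leaves open are not modelled.

```python
import unicodedata

# Assumption: category sets copied from the draft text above.
ID_START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
ID_CONTINUE_EXTRA = {"Mn", "Mc", "Nd", "Pc"}


def is_id_start(ch):
    """True if ch may begin an identifier under the draft rule."""
    return ch == "_" or unicodedata.category(ch) in ID_START_CATEGORIES


def is_id_continue(ch):
    """True if ch may appear after the first character."""
    return is_id_start(ch) or unicodedata.category(ch) in ID_CONTINUE_EXTRA


def matches_draft_syntax(name):
    """True if name has the form <ID_Start> <ID_Continue>*."""
    return (
        bool(name)
        and is_id_start(name[0])
        and all(is_id_continue(ch) for ch in name[1:])
    )
```

Under these assumptions, ``matches_draft_syntax("ошибка")`` is accepted and ``matches_draft_syntax("2fast")`` is rejected, because a decimal number (Nd) is only allowed in continuation position.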

All identifiers are converted into the normal form NFC while parsing;
comparison of identifiers is based on NFC.
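
Editorial sketch (not part of the PEP text): the reason for normalizing is that the same visible name can be spelled with different code point sequences. The helper name ``same_identifier`` is hypothetical; ``unicodedata.normalize`` is the standard-library call.

```python
import unicodedata


def same_identifier(a, b):
    """Compare two spellings the way the draft prescribes: after NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


composed = "chang\u00e9"      # 'é' as a single code point
decomposed = "change\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# The raw strings differ, but they name the same identifier after NFC.
raw_equal = composed == decomposed
nfc_equal = same_identifier(composed, decomposed)
```

Without the normalization step, two source files typed on different systems could define what looks like the same name and still end up with two distinct dictionary keys.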

Policy Specification
====================

As an addition to the Python Coding style, the following policy is
prescribed: All identifiers in the Python standard library MUST use
ASCII-only identifiers, and SHOULD use English words wherever feasible.

As an option, this specification can be applied to Python 2.x. In that
case, ASCII-only identifiers would continue to be represented as byte
string objects in namespace dictionaries; identifiers with non-ASCII
characters would be represented as Unicode strings.

Implementation
==============

The following changes will need to be made to the parser:

1. If a non-ASCII character is found in the UTF-8 representation of the
source code, a forward scan is made to find the first ASCII
non-identifier character (e.g. a space or punctuation character)

2. The entire UTF-8 string is passed to a function to normalize the
string to NFC, and then verify that it follows the identifier syntax.
No such callout is made for pure-ASCII identifiers, which continue to
be parsed the way they are today.

3. If this specification is implemented for 2.x, reflective libraries
(such as pydoc) must be verified to continue to work when Unicode
strings appear in ``__dict__`` slots as keys.
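
Editorial sketch (not part of the PEP text): steps 1 and 2 can be modelled in a few lines of Python operating on UTF-8 bytes. The function name ``scan_identifier`` is hypothetical, the sketch assumes the tokenizer has already decided that an identifier starts at ``start``, and ``str.isidentifier()`` stands in for the syntax check (it applies the running interpreter's rule, which may differ in detail from this draft).

```python
import unicodedata

# ASCII bytes that may appear inside an identifier.
ASCII_IDENT = frozenset(
    b"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_"
)


def scan_identifier(source, start):
    """Return (name, end) for the identifier beginning at byte offset start."""
    end = start
    # Step 1: scan forward to the first ASCII non-identifier byte.
    # Bytes >= 0x80 belong to multi-byte UTF-8 sequences and are kept.
    while end < len(source) and (source[end] >= 0x80 or source[end] in ASCII_IDENT):
        end += 1
    raw = source[start:end]
    if raw.isascii():
        # Pure-ASCII names take the existing path; no callout is made.
        return raw.decode("ascii"), end
    # Step 2: normalize the whole string to NFC, then verify the syntax.
    name = unicodedata.normalize("NFC", raw.decode("utf-8"))
    if not name.isidentifier():
        raise SyntaxError("invalid identifier: %r" % name)
    return name, end
```

For example, scanning ``"Löffelstiel = 1"`` encoded as UTF-8 from offset 0 stops at the space, twelve bytes in, because ``ö`` occupies two bytes.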

References
==========

.. [1] http://www.unicode.org/reports/tr31/

Copyright
=========

This document has been placed in the public domain.
May 13 '07
After reading the whole thread, and based on my experience (I'm Italian;
English is not my native language):

Martin v. Löwis wrote:
- should non-ASCII identifiers be supported?
yes
- why?
Years ago I read C code written by a Turkish guy, and all the identifiers
were transliterations of Arabic (Persian? I don't know) words.
What did I understand of this code? Nothing. 0 (zero ;) ). Not a word.
Would it have been different if it had used Unicode identifiers? Not at all.
- would you use them if it was possible to do so?
yes

--
()_() | NN KAPISCO XK' CELLHAVETE T'ANNTO CN ME SL | +----
(o.o) | XK' SKRIVO 1 P'HO VELLOCE MA HALL'ORA DITTELO | +---+
'm m' | KE SIETE VOI K CI HAVVETE PROBBLEMI NO PENSATECI | O |
(___) | HE SENZA RANKORI CIAOOOO |
raffaele punto salmaso at gmail punto com
May 16 '07 #251
Gregor Horvath schrieb:
René Fleschenberg schrieb:
>today, to the best of my knowledge. And "in some form or another"
basically means that the PEP would create more possibilities for things
to go wrong. That things can already go wrong today does not mean that
it does not matter if we create more occasions where things can go wrong
even worse.

Following this logic we should not add any new features at all, because
all of them can go wrong and can be used the wrong way.
No, that does not follow from my logic. What I say is: When thinking
about whether to add a new feature, the potential benefits should be
weighed against the potential problems. I see some potential problems
with this PEP and very little potential benefits.
I love Python because it does not dictate how to do things.
I do not need a ASCII-Dictator, I can judge myself when to use this
feature and when to avoid it, like any other feature.
*That* logic can be used to justify the introduction of *any* feature.

--
René
May 16 '07 #252
On Tue, 15 May 2007 17:35:11 +0200, Stefan Behnel
<st******************@web.de> wrote:
Eric Brunel wrote:
>On Tue, 15 May 2007 15:57:32 +0200, Stefan Behnel
>>In-house developers are rather for this PEP as they see the advantage of
expressing concepts in the way the "non-techies" talk about it.

No: I *am* an "in-house" developer. The argument is not
public/open-source against private/industrial. As I said in some of my
earlier posts, any code can pass through many people in its life, people
not having the same language. I dare to say that starting a project
today in any other language than english is almost irresponsible: the
chances that it will get at least read by people not talking the same
language as the original coders are very close to 100%, even if it
always stays "private".

Ok, so I'm an Open-Source guy who happens to work in-house. And I'm a
supporter of PEP 3131. I admit that I was simplifying in my round-up. :)

But I would say that "irresponsible" is a pretty self-centered word in this
context. Can't you imagine that those who take the "irresponsible" decisions
of working on (and starting) projects in "another language than English" are
maybe as responsible as you are when you take the decision of starting a
project in English, but in a different context? It all depends on the specific
constraints of the project, i.e. environment, developer skills, domain, ...

The more complex an application domain, the more important is clear and
correct domain terminology. And software developers just don't have that. They
know their own domain (software development with all those concepts, languages
and keywords), but there is a reason why they develop software for those who
know the complex professional domain in detail but do not know how to develop
software. And it's a good idea to name things in a way that is consistent with
those who know the professional domain.

That's why keywords are taken from the domain of software development and
identifiers are taken (mostly) from the application domain. And that's why I
support PEP 3131.
You keep eluding the question: even if the decisions made at the project
start seem quite sensible *at that time*, if the project ends up
maintained in Korea, you *will have* to translate all your identifiers to
something displayable, understandable and typable by (almost) anyone,
a.k.a ASCII-English... Since - as I already said - I'm quite convinced
that any application bigger than the average quick-n-dirty throwable
script is highly likely to end up in a different country than its original
coders', you'll end up losing the time you appeared to have gained in the
beginning. That's what I called "irresponsible" (even if I admit that the
word was a bit strong...).

Anyway, concerning the PEP, I've finally "put some water in my wine" as we
say in French, and I'm not so strongly against it now... Not for the
reasons you give (so we can continue our flame war on this ;-) ), but
mainly considering Python's usage in a learning context: this is a valid
reason why non-ASCII identifiers should be supported. I just wish I'll get
a '--ascii-only' switch on my Python interpreter (or any other means to
forbid non-ASCII identifiers and/or strings and/or comments).
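
Editorial sketch: no such interpreter switch is described in the PEP draft above, but the "any other means" part of this wish can be approximated outside the interpreter. The helper below is hypothetical and relies only on the standard ``tokenize`` module; it assumes a Python 3 interpreter that already accepts non-ASCII names and reports every NAME token that contains a non-ASCII character.

```python
import io
import tokenize


def non_ascii_names(source):
    """Return (name, line, column) for each NAME token with non-ASCII characters."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and not tok.string.isascii():
            found.append((tok.string, tok.start[0], tok.start[1]))
    return found
```

A project that wants ASCII-only identifiers could run such a check before commit and reject any file for which the returned list is not empty. Strings and comments are different token types, so they are not flagged by this sketch.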
--
python -c "print ''.join([chr(154 - ord(c)) for c in
'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-'])"
May 16 '07 #253
René Fleschenberg wrote:
ru***@yahoo.com schrieb:
>I'm not sure how you conclude that no problem exists.
- Meaningful identifiers are critical in creating good code.

I agree.
>- Non-english speakers can not create or understand
english identifiers hence can't create good code nor
easily grok existing code.

I agree that this is a problem, but please understand that this problem is
_not_ solved by allowing non-ASCII identifiers!
Well, as I said before, there are three major differences between the stdlib
and keywords on one hand and identifiers on the other hand. Ignoring arguments
does not make them any less true.

So, the problem is partly tackled by the people who face it by writing
degenerated transliterations and language mix in identifiers, but it would be
*solved* by means of the language if Unicode identifiers were available.

Stefan
May 16 '07 #254
In <46**************@web.de>, Stefan Behnel wrote:
René Fleschenberg wrote:
>We all know what the PEP is about (we can read). The point is: If we do
not *need* non-English/ASCII identifiers, we do not need the PEP. If the
PEP does not solve an actual *problem* and still introduces some
potential for *new* problems, it should be rejected. So far, the
"problem" seems to just not exist. The burden of proof is on those who
support the PEP.

The main problem here seems to be proving the need of something to people who
do not need it themselves. So, if a simple "but I need it because a, b, c" is
not enough, what good is any further proof?
Maybe all the (potential) programmers that can't understand english and
would benefit from the ability to use non-ASCII characters in identifiers
could step up and take part in this debate. In an english speaking
newsgroup… =:o)

There are potential users of Python who don't know much english or no
english at all. This includes kids, old people, people from countries
that have "letters" that are not that easy to transliterate like european
languages, people who just want to learn Python for fun or to customize
their applications like office suites or GIS software with a Python
scripting option.

Some people here seem to think the user base is or should be only from the
computer science domain. Yes, if you are a programming professional it
may be mandatory to be able to write english identifiers, comments and
documentation, but there are not just programming professionals out there.

Ciao,
Marc 'BlackJack' Rintsch
May 16 '07 #255
René Fleschenberg wrote:
Gregor Horvath schrieb:
>If comments are allowed to be non-English, then why are identifiers not?

I don't need to be able to type in the exact characters of a comment in
order to properly change the code, and if a comment does not display on
my screen correctly, I am not as fscked as badly as when an identifier
does not display (e.g. in a traceback).
Then get tools that match your working environment.

Stefan
May 16 '07 #256
Martin v. Lowis wrote:
Lorenzo Gatti wrote:
>Not providing an explicit listing of allowed characters is inexcusable
sloppiness.
That is a deliberate part of the specification. It is intentional that
it does *not* specify a precise list, but instead defers that list
to the version of the Unicode standard used (in the unicodedata
module).
Ok, maybe you considered listing characters but you earnestly decided
to follow an authority; but this reliance on the Unicode standard is
not a merit: it defers to an external entity (UAX 31 and the Unicode
database) a foundation of Python syntax.
The obvious purpose of Unicode Annex 31 is defining a framework for
parsing the identifiers of arbitrary programming languages, it's only,
in its own words, "specifications for recommended defaults for the use
of Unicode in the definitions of identifiers and in pattern-based
syntax". It suggests an orderly way to add tens of thousands of exotic
characters to programming language grammars, but it doesn't prove it
would be wise to do so.

You seem to like Unicode Annex 31, but keep in mind that:
- it has very limited resources (only the Unicode standard, i.e. lists
and properties of characters, and not sensible programming language
design, software design, etc.)
- it is culturally biased in favour of supporting as much of the
Unicode character set as possible, disregarding the practical
consequences and assuming without discussion that programming language
designers want to do so
- it is also culturally biased towards the typical Unicode patterns of
providing well explained general algorithms, ensuring forward
compatibility, and relying on existing Unicode standards (in this
case, character types) rather than introducing new data (but the
character list of Table 3 is unavoidable); the net result is caring
even less for actual usage.
>The XML standard is an example of how listings of large parts of the
Unicode character set can be provided clearly, exactly and (almost)
concisely.
And, indeed, this is now recognized as one of the bigger mistakes
of the XML recommendation: they provide an explicit list, and fail
to consider characters that are unassigned. In XML 1.1, they try
to address this issue, by now allowing unassigned characters in
XML names even though it's not certain yet what those characters
mean (until they are assigned).
XML 1.1 is, for practical purposes, not used except by mistake. I
challenge you to show me XML languages or documents of some importance
that need XML 1.1 because they use non-ASCII names.
XML 1.1 is supported by many tools and standards because of buzzword
compliance, enthusiastic obedience to the W3C and low cost of
implementation, but this doesn't mean that its features are an
improvement over XML 1.0.
>>``ID_Continue`` is defined as all characters in ``ID_Start``, plus
nonspacing marks (Mn), spacing combining marks (Mc), decimal number
(Nd), and connector punctuations (Pc).

Am I the first to notice how unsuitable these characters are?
Probably. Nobody in the Unicode consortium noticed, but what
do they know about suitability of Unicode characters...
Don't be silly. These characters are suitable for writing text, not
for use in identifiers; the fact that UAX 31 allows them merely proves
how disconnected from actual programming language needs that document
is.

In typical word processing, what characters are used is the editor's
problem and the only thing that matters is the correctness of the
printed result; program code is much more demanding, as it needs to do
more (exact comparisons, easy reading...) with less (straightforward
keyboard inputs and monospaced fonts instead of complex input systems
and WYSIWYG graphical text). The only way to work with program text
successfully is limiting its complexity.
Hard to input characters, hard to see characters, ambiguities and
uncertainty in the sequence of characters, sets of hard to distinguish
glyphs and similar problems are unacceptable.

It seems I'm not the first to notice a lot of Unicode characters that
are unsuitable for identifiers. Appendix I of the XML 1.1 standard
recommends avoiding variation selectors, interlinear annotations (I
missed them...), various decomposable characters, and "names which are
nonsensical, unpronounceable, hard to read, or easily confusable with
other names".
The whole appendix I is a clear admission of self-defeat, probably the
result of committee compromises. Do you think you could do better?

Regards,
Lorenzo Gatti

May 16 '07 #257
sj*******@yahoo.com a écrit :
Steven D'Aprano wrote:
>I would find it useful to be able to use non-ASCII characters for heavily
mathematical programs. There would be a closer correspondence between the
code and the mathematical equations if one could write D(u*p) instead of
delta(mu*pi).

Just as one risk here:
When reading the above on Google groups, it showed up as "if one could
write ?(u*p)..."
When quoting it for response, it showed up as "could write D(u*p)".

I'm sure that the symbol you used was neither a capital letter d nor a
question mark.

Using identifiers that are so prone to corruption when posting in a
rather popular forum seems dangerous to me--and I'd guess that a lot
of source code highlighters, email lists, etc have similar problems.
I'd even be surprised if some programming tools didn't have similar
problems.
So, it was google groups that continuously corrupted the good UTF-8
posts by force converting them to ISO-8859-1?

Of course, there's also the possibility that it is a problem on *your*
side so, to be fair I've launched google groups and looked for this
thread. And of course the result was that Steven's post displayed
perfectly. I didn't try to reply to it of course, no need to clutter
that thread anymore than it is.

--
Δ(µ*π)
May 16 '07 #258
On Tue, 15 May 2007 21:07:30 +0200, Pierre Hanser
<ha****@club-internet.fr> wrote:
hello

i work for a large phone maker, and for a long time
we thought, very arrogantly, our phones would be ok
for the whole world.

After all, using a phone uses so few words, and
some of them were even replaced with pictograms!
everybody should be able to understand appel, bis,
renvoi, mévo, ...

nowadays we make Chinese, Korean, Japanese talking
phones.

because we can do it, because graphics are cheaper
than they were, because it augments our market.
(also because some markets require it)

see the analogy?
Absolutely not: you're talking about internationalization of the
user-interface here, not about the code. There are quite simple ways to
ensure users will see the displays in their own language, even if the
source code is the same for everyone. But your source code will not
automagically translate itself to the language of the guy who'll have to
maintain it or make it evolve. So the analogy actually seems to work
backwards: if you want any coder to be able to read/understand/edit your
code, just don't write it in your own language...
--
python -c "print ''.join([chr(154 - ord(c)) for c in
'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-'])"
May 16 '07 #259
Stefan Behnel schrieb:
Then get tools that match your working environment.
Integration with existing tools *is* something that a PEP should
consider. This one does not do that sufficiently, IMO.

--
René
May 16 '07 #260
sj*******@yahoo.com wrote:
I even sometimes
read code snippets on email lists and websites from my handheld, which
is sadly still memory-limited enough that I'm really unlikely to
install anything approaching a full set of Unicode fonts.
One of the arguments against this PEP was that it seemed to be impossible to
find either transliterated identifiers in code or native identifiers in Java
code using a web search. So it is very unlikely that you will need to upgrade
your handheld as it is very unlikely for you to stumble into such code.

Stefan
May 16 '07 #261
René Fleschenberg schrieb:
Gregor Horvath schrieb:
>René Fleschenberg schrieb:
>>today, to the best of my knowledge. And "in some form or another"
basically means that the PEP would create more possibilities for things
to go wrong. That things can already go wrong today does not mean that
it does not matter if we create more occasions where things can go wrong
even worse.
Following this logic we should not add any new features at all, because
all of them can go wrong and can be used the wrong way.

No, that does not follow from my logic. What I say is: When thinking
about whether to add a new feature, the potential benefits should be
weighed against the potential problems. I see some potential problems
with this PEP and very little potential benefits.
>I love Python because it does not dictate how to do things.
I do not need a ASCII-Dictator, I can judge myself when to use this
feature and when to avoid it, like any other feature.

*That* logic can be used to justify the introduction of *any* feature.
*Your* logic can be used to justify dropping *any* feature.

Stefan
May 16 '07 #262
Stefan Behnel schrieb:
>>- Non-english speakers can not create or understand
english identifiers hence can't create good code nor
easily grok existing code.
I agree that this is a problem, but please understand that this problem is
_not_ solved by allowing non-ASCII identifiers!

Well, as I said before, there are three major differences between the stdlib
and keywords on one hand and identifiers on the other hand. Ignoring arguments
does not make them any less true.
BTW: Please stop replying to my postings by e-mail (in Thunderbird, use
"Reply" instead of "Reply to all").

I agree that keywords are a different matter in many respects, but the
only difference between stdlib interfaces and other interfaces is that
the stdlib interfaces are part of the stdlib. That's it. You are still
ignoring the fact that, contrary to what has been suggested in this
thread, it is _not_ possible to write "German" or "Chinese" Python
without cluttering it up with many many English terms. It's not only the
stdlib, but also many many third party libraries. Show me one real
Python program that is feasibly written without throwing in tons of
English terms.

Now, very special environments (what I called "rare and isolated"
earlier) like special learning environments for children are a different
matter. It should be ok if you have to use a specially patched Python
branch there, or have to use an interpreter option that enables the
suggested behaviour. For general programming, it IMO is a bad idea.

--
René
May 16 '07 #263
Marc 'BlackJack' Rintsch schrieb:
There are potential users of Python who don't know much english or no
english at all. This includes kids, old people, people from countries
that have "letters" that are not that easy to transliterate like european
languages, people who just want to learn Python for fun or to customize
their applications like office suites or GIS software with a Python
scripting option.
Make it an interpreter option that can be turned on for those cases.

--
René
May 16 '07 #264
Eric Brunel wrote:
reason why non-ASCII identifiers should be supported. I just wish I'll
get a '--ascii-only' switch on my Python interpreter (or any other means
to forbid non-ASCII identifiers and/or strings and/or comments).
I could certainly live with that as it would be the right way around. Support
Unicode by default, but allow those who require the lowest common denominator
to enforce it.

Stefan
May 16 '07 #265
Stefan Behnel schrieb:
*Your* logic can be used to justify dropping *any* feature.
No. I am considering both the benefits and the problems. You just happen
to not like the outcome of my considerations [again, please don't reply
by E-Mail, I read the NG].

--
René
May 16 '07 #266
On Wed, 16 May 2007 02:14:58 +0200, Steven D'Aprano
<st****@REMOVE.THIS.cybersource.com.au> wrote:
On Tue, 15 May 2007 09:09:30 +0200, Eric Brunel wrote:
>Joke aside, this just means that I won't ever be able to program math in
ADA, because I have absolutely no idea on how to do a 'pi' character on
my keyboard.

Maybe you should find out then? Personal ignorance is never an excuse for
rejecting technology.
My "personal ignorance" is fine, thank you; how is yours?: there is no
keyboard *on Earth* that allows typing *all* characters in the whole Unicode
set. So my keyboard may just happen to provide no means at all to type a
Greek 'pi', as it doesn't provide any to type Chinese, Japanese, Korean,
Russian, Hebrew, or whatever character set is not in use in my
country. And the same goes for all keyboards all over the world.

Have I made my point clear or do you require some more explanations?
--
python -c "print ''.join([chr(154 - ord(c)) for c in
'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-'])"
May 16 '07 #267
René Fleschenberg wrote:
Stefan Behnel schrieb:
>>>- Non-english speakers can not create or understand
english identifiers hence can't create good code nor
easily grok existing code.
I agree that this is a problem, but please understand that this problem is
_not_ solved by allowing non-ASCII identifiers!
Well, as I said before, there are three major differences between the stdlib
and keywords on one hand and identifiers on the other hand. Ignoring arguments
does not make them any less true.

I agree that keywords are a different matter in many respects, but the
only difference between stdlib interfaces and other interfaces is that
the stdlib interfaces are part of the stdlib. That's it. You are still
ignoring the fact that, contrary to what has been suggested in this
thread, it is _not_ possible to write "German" or "Chinese" Python
without cluttering it up with many many English terms. It's not only the
stdlib, but also many many third party libraries. Show me one real
Python program that is feasibly written without throwing in tons of
English terms.

Now, very special environments (what I called "rare and isolated"
earlier) like special learning environments for children are a different
matter. It should be ok if you have to use a specially patched Python
branch there, or have to use an interpreter option that enables the
suggested behaviour. For general programming, it IMO is a bad idea.
Ok, let me put it differently.

You *do not* design Python's keywords. You *do not* design the stdlib. You *do
not* design the concepts behind all that. You *use* them as they are. So you
can simply take the identifiers they define and use them the way the docs say.
You do not have to understand these names, they don't have to be words, they
don't have to mean anything to you. They are just tools. Even if you do not
understand English, they will not get in your way. You just learn them.

But you *do* design your own software. You *do* design its concepts. You *do*
design its APIs. You *do* choose its identifiers. And you want them to be
clear and telling. You want them to match your (or your clients) view of the
application. You do not care about the naming of the tools you use inside. But
you do care about clarity and readability in *your own software*.

See the little difference here?

Stefan
May 16 '07 #268
René Fleschenberg wrote:
Marc 'BlackJack' Rintsch schrieb:
>There are potential users of Python who don't know much english or no
english at all. This includes kids, old people, people from countries
that have "letters" that are not that easy to transliterate like european
languages, people who just want to learn Python for fun or to customize
their applications like office suites or GIS software with a Python
scripting option.

Make it an interpreter option that can be turned on for those cases.
No. Make "ASCII-only" an interpreter option that can be turned on for the
cases where it is really required.

Stefan
May 16 '07 #269
Ben
On May 15, 11:25 pm, Stefan Behnel <stefan.behnel-n05...@web.de>
wrote:
René Fleschenberg wrote:
Javier Bezos schrieb:
>But having, for example, things like open() from the stdlib in your code
and then öffnen() as a name for functions/methods written by yourself is
just plain silly. It makes the code inconsistent and ugly without
significantly improving the readability for someone who speaks German
but not English.
Agreed. I always use English names (more or
less :-)), but this is not the PEP is about.
We all know what the PEP is about (we can read). The point is: If we do
not *need* non-English/ASCII identifiers, we do not need the PEP. If the
PEP does not solve an actual *problem* and still introduces some
potential for *new* problems, it should be rejected. So far, the
"problem" seems to just not exist. The burden of proof is on those who
support the PEP.

The main problem here seems to be proving the need of something to people who
do not need it themselves. So, if a simple "but I need it because a, b, c" is
not enough, what good is any further proof?

Stefan
For what it's worth, I can only speak English (bad English schooling!)
and I'm definitely +1 on the PEP. Anyone using tools from the last 5
years can handle UTF-8

Cheers,
Ben

May 16 '07 #270
René Fleschenberg schrieb:
>I love Python because it does not dictate how to do things.
I do not need a ASCII-Dictator, I can judge myself when to use this
feature and when to avoid it, like any other feature.

*That* logic can be used to justify the introduction of *any* feature.
No. That logic can only be used to justify the introduction of a feature
that brings freedom.

Who are we to dictate the whole python world how to spell an identifier?

Gregor
May 16 '07 #271
Ben wrote:
On May 15, 11:25 pm, Stefan Behnel <stefan.behnel-n05...@web.de>
wrote:
Rene Fleschenberg wrote:
Javier Bezos schrieb:
>>But having, for example, things like open() from the stdlib in your code
>>and then o:ffnen() as a name for functions/methods written by yourself is
>>just plain silly. It makes the code inconsistent and ugly without
>>significantly improving the readability for someone who speaks German
>>but not English.
>Agreed. I always use English names (more or
>less :-)), but this is not what the PEP is about.
We all know what the PEP is about (we can read). The point is: If we do
not *need* non-English/ASCII identifiers, we do not need the PEP. If the
PEP does not solve an actual *problem* and still introduces some
potential for *new* problems, it should be rejected. So far, the
"problem" seems to just not exist. The burden of proof is on those who
support the PEP.
The main problem here seems to be proving the need of something to people who
do not need it themselves. So, if a simple "but I need it because a, b, c" is
not enough, what good is any further proof?

Stefan

For what it's worth, I can only speak English (bad English schooling!)
and I'm definitely +1 on the PEP. Anyone using tools from the last 5
years can handle UTF-8.
The falsehood of the last sentence is why I'm moderately against this
PEP. Even examples within this thread don't display correctly on
several of the machines I have access to (all of which have OS/browser
environments less than 5 years old). It strikes me as similar to the
arguments for quoted-printable in the early 1990s, claiming that
everyone can view it or will be able to soon--and even a decade
_after_ "everyone can deal with latin1 just fine" it was still causing
massive headaches.

May 16 '07 #272
Christophe wrote:
sj*******@yahoo.com wrote:
Steven D'Aprano wrote:
I would find it useful to be able to use non-ASCII characters for heavily
mathematical programs. There would be a closer correspondence between the
code and the mathematical equations if one could write D(u*p) instead of
delta(mu*pi).
Just as one risk here:
When reading the above on Google groups, it showed up as "if one could
write ?(u*p)..."
When quoting it for response, it showed up as "could write D(u*p)".

I'm sure that the symbol you used was neither a capital letter d nor a
question mark.

Using identifiers that are so prone to corruption when posting in a
rather popular forum seems dangerous to me--and I'd guess that a lot
of source code highlighters, email lists, etc have similar problems.
I'd even be surprised if some programming tools didn't have similar
problems.

So, it was google groups that continuously corrupted the good UTF-8
posts by force converting them to ISO-8859-1?

Of course, there's also the possibility that it is a problem on *your*
side
Well, that's part of the point isn't it? It seems incredibly naive to
me to think that you could use whatever symbol was intended and have
it show up, and the "well fix your machine!" argument doesn't fly. A
lot of the time programmers have to look at stack traces on end-user's
machines (whatever they may be) to help debug. They have to look at
code on the (GUI-less) production servers over a terminal link. They
have to use all kinds of environments where they can't install the
latest and greatest fonts. Promoting code that becomes very hard to
read and debug in real situations seems like a sound negative to me.

May 16 '07 #273
Stefan Behnel wrote:
sj*******@yahoo.com wrote:
I even sometimes
read code snippets on email lists and websites from my handheld, which
is sadly still memory-limited enough that I'm really unlikely to
install anything approaching a full set of Unicode fonts.

One of the arguments against this PEP was that it seemed to be impossible to
find either transliterated identifiers in code or native identifiers in Java
code using a web search. So it is very unlikely that you will need to upgrade
your handheld as it is very unlikely for you to stumble into such code.
Sure, if the feature isn't going to be used then it won't present
problems. I can't really see much of an argument for a PEP that isn't
going to be used, though, and if it is used then it's worthwhile to
think about the implications of having code that many common systems
simply can't deal with (either displaying it incorrectly or actually
corrupting files that pass through them).

May 16 '07 #274
Steven D'Aprano wrote:
>Unless you are 150% sure that there will *never* be the need for a
person who does not know your language of choice to be able to read or
modify your code, the language that "fits the environment best" is
English.

Just a touch of hyperbole perhaps?

You know, it may come as a surprise to some people that English is not
the only common language. In fact, it only ranks third, behind Mandarin
and Spanish, and just above Arabic. Although the exact number of speakers
varies according to the source you consult, the rankings are quite stable:
Mandarin, Spanish, then English. Any of those languages could equally
have claim to be the world's lingua franca.
For a language to be a (or the) lingua franca, the sheer number of
people who speak it is actually not as important as you seem to think.
Its use as an international exchange language is the decisive criterion
-- definitely not true for Mandarin, and for Spanish not nearly as much
as for English.

Also, there can be different "linguae francae" for different fields.
English definitely is the lingua franca of programming. But that is
actually off topic. Programming languages are not the same as natural
languages. I was talking about program code, not about works of literature.

--
René
May 16 '07 #275
Lorenzo Gatti:
Ok, maybe you considered listing characters but you earnestly decided
to follow an authority; but this reliance on the Unicode standard is
not a merit: it defers to an external entity (UAX 31 and the Unicode
database) a foundation of Python syntax.
PEP 3131 uses a similar definition to C# except that PEP 3131
disallows formatting characters (category Cf). See section 9.4.2 of
http://www.ecma-international.org/pu...s/Ecma-334.htm
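
As a rough illustration of the definition described above: Python 3, where PEP 3131 was eventually implemented, exposes the identifier check as `str.isidentifier()`. The sketch below assumes a Python 3 interpreter; the helper name `describe` and the sample strings are made up for this example. It shows that a name containing a formatting character (category Cf, here U+00AD SOFT HYPHEN) is rejected, while accented and Cyrillic names pass.

```python
import unicodedata


def describe(name):
    """Return (is_identifier, categories) for a candidate name.

    Hypothetical helper for illustration only; not part of the PEP.
    """
    categories = [unicodedata.category(ch) for ch in name]
    return name.isidentifier(), categories


# Sample names: three from the PEP announcement, one containing a
# Cf (format) character, and one starting with a digit.
samples = ["Löffelstiel", "changé", "ошибка", "soft\u00adhyphen", "2fast"]
for sample in samples:
    ok, categories = describe(sample)
    print(ascii(sample), ok, sorted(set(categories)))
```

The per-character categories make it easier to see which character caused a rejection.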

Neil
May 16 '07 #276
"Méta-MCI" <enl...XmcX@Xm..uX.com> wrote:

Hi!

- should non-ASCII identifiers be supported? why?
- would you use them if it was possible to do so? in what cases?

Yes.

JScript can use letters with accents in identifiers
XML (1.1) can use letters with accents in tags
C# can use letters with accents in variables
SQL: MySQL/MS-SQL/Oracle/etc. can use accents in fields or requests
etc.
etc.

Python MUST make up for its lost time.
All those lemmings are jumping over a cliff!
I must hurry to keep up!

- Hendrik

May 16 '07 #277
"Eric Brunel" <e..l@pr...ev.com> wrote:
>So what? Does it mean that it's acceptable for the standard library and
keywords to be in English only, but the very same restriction on
user-defined identifiers is out of the question? Why? If I can use my own
language in my identifiers, why can't I write:

classe MaClasse:
    définir __init__(moi_même, maListe):
        moi_même.monDictionnaire = {}
        pour i dans maListe:
            moi_même.monDictionnaire[i] = Rien

For a French-speaking person, this is far more readable than:

class MaClasse:
    def __init__(self, maListe):
        self.monDictionnaire = {}
        for i in maListe:
            self.monDictionnaire[i] = None

Now, *this* is mixing apples and peaches... And this would look even
weirder with a non-indo-european language...
I don't have any French, but I support this point absolutely - having
native identifiers is NFG if you can't also have native reserved words.

You may be stuck with English sentence construction though. - Would
be hard, I would imagine, to let the programmer change the word order,
or to incorporate something as weird as the standard double negative
in Afrikaans...

We say things that translate literally to: "I am not a big man not.", and it
is completely natural, so the if statements should follow the pattern.

- Hendrik

May 16 '07 #278
"Stefan Behnel" <ste...l-******@web.de> wrote:

..:) This is not about "technical" English, this is about domain specific
>English. How big is your knowledge about, say, biological terms or banking
terms in English? Would you say you're capable of modelling an application
from the domain of biology, well specified in a large German document, in
perfect English terms?

And: why would you want to do that?
Possibly because it looks better and reads easier than
a dog ugly mix of perfectly good German words
all mixed up with English keywords in an English
style of sentence construction?

- Hendrik

--
"Hier sind wir unter uns" ;-)

May 16 '07 #279
"HYRY" <ru..88@gmail.com> wrote:

If non-ASCII identifiers become a reality, I think it will be the best
gift for children who do not know English.
How do you feel about the mix of English keywords and Chinese?
How do the English-like "sentences" look to a Chinese reader?

Would you support the extension of this PEP to include Chinese
Keywords?

Would that be a lesser or greater gift?

- Hendrik

May 16 '07 #280
<ru***@yahoo.com> wrote:

>
"Hendrik van Rooyen" <m...l@m,,,.co.za> wrote in message
news:ma***************************************@python.org...
<ru***@yahoo.com> wrote:

[I fixed the broken attribution in your quote]
Sorry about that - I deliberately fudge email addys...

First "while" is a keyword and will remain "while" so
that has nothing to do with anything.
I think this cuts right down to why I oppose the PEP.
It is not so much for technical reasons as for aesthetic
ones - I find reading a mix of languages horrible, and I am
kind of surprised by the strength of my own reaction.

If I try to analyse my feelings, I think that really the PEP
does not go far enough, in a sense, and from memory
it seems to me that only E Brunel, R Fleschenberg and
to a lesser extent the Martellibot seem to somehow think
in a similar way as I do, but I seem to have an extreme
case of the disease...

And the summaries of reasons for and against have left
out objections based on this feeling of ugliness of mixed
language.

Interestingly, the people who seem to think a bit like that all
seem to be non native English speakers who are fluent in
English.

While the support seems to come from people whose English
is perfectly adequate, but who are unsure to the extent that they
apologise for their "bad" English.

Is this a pattern that you have identified? - I don't know.

I still don't like the thought of the horrible mix of "foreign"
identifiers and English keywords, coupled with the English
sentence construction. And that, in a nutshell, is the main
reason for my rather vehement opposition to this PEP.

The other stuff about sharing and my inability to even type
the OP's name correctly with the umlaut is kind of secondary
to this feeling of revulsion.

"Beautiful is better than ugly"

- Hendrik

May 16 '07 #281
sj*******@yahoo.com wrote:
Christophe wrote:
>sj*******@yahoo.com wrote:
>>Steven D'Aprano wrote:
I would find it useful to be able to use non-ASCII characters for heavily
mathematical programs. There would be a closer correspondence between the
code and the mathematical equations if one could write D(u*p) instead of
delta(mu*pi).
Just as one risk here:
When reading the above on Google groups, it showed up as "if one could
write ?(u*p)..."
When quoting it for response, it showed up as "could write D(u*p)".

I'm sure that the symbol you used was neither a capital letter d nor a
question mark.

Using identifiers that are so prone to corruption when posting in a
rather popular forum seems dangerous to me--and I'd guess that a lot
of source code highlighters, email lists, etc have similar problems.
I'd even be surprised if some programming tools didn't have similar
problems.
So, it was google groups that continuously corrupted the good UTF-8
posts by force converting them to ISO-8859-1?

Of course, there's also the possibility that it is a problem on *your*
side

Well, that's part of the point isn't it? It seems incredibly naive to
me to think that you could use whatever symbol was intended and have
it show up, and the "well fix your machine!" argument doesn't fly. A
lot of the time programmers have to look at stack traces on end-user's
machines (whatever they may be) to help debug. They have to look at
code on the (GUI-less) production servers over a terminal link. They
have to use all kinds of environments where they can't install the
latest and greatest fonts. Promoting code that becomes very hard to
read and debug in real situations seems like a sound negative to me.
Who displays stack frames? Your code. Whose code includes unicode
identifiers? Your code. Whose fault is it to create a stack trace
display procedure that cannot handle unicode? You. Even if you don't
make use of them, you still have to fix the stack trace display
procedure because the exception error message can include unicode text
*today*

You should know that displaying and editing UTF-8 text as if it was
latin-1 works very very well.

Also, Terminals have support for UTF-8 encodings already. Or you could
always use kate+fish to edit your script on the distant server without
such problems (fish is a KDE protocol used to access a computer with ssh
as if it was a hard disk and kate is the standard text/code editor) It's
a matter of tools.
May 16 '07 #282
Hendrik van Rooyen wrote:
"Beautiful is better than ugly"
Good point. Today's transliteration of German words into ASCII identifiers
definitely looks ugly. Time for this PEP to be accepted.

Stefan
May 16 '07 #283
sj*******@yahoo.com wrote:
code on the (GUI-less) production servers over a terminal link. They
have to use all kinds of environments where they can't install the
latest and greatest fonts. Promoting code that becomes very hard to
read and debug in real situations seems like a sound negative to me.
If someone wants to debug a Chinese program, he has in almost all cases
obviously already installed the correct fonts and his machine can handle
unicode.

Maybe yours and mine not, but I doubt that we are going to debug a
Chinese program.

I have debugged German programs (not Python) with Unicode characters in
them for years and had no problem at all, because all my customers and I
obviously have German machines.

Gregor
May 16 '07 #284
Gregor Horvath wrote:
>*That* logic can be used to justify the introduction of *any* feature.

No. That logic can only be used to justify the introduction of a feature
that brings freedom.
That is any feature that you are not forced to use. So let's get gotos
and the like. Every programming language dictates some things. This is
not a bad thing.

--
René
May 16 '07 #285
sj*******@yahoo.com wrote:
Stefan Behnel wrote:
>sj*******@yahoo.com wrote:
>>I even sometimes
read code snippets on email lists and websites from my handheld, which
is sadly still memory-limited enough that I'm really unlikely to
install anything approaching a full set of Unicode fonts.
One of the arguments against this PEP was that it seemed to be impossible to
find either transliterated identifiers in code or native identifiers in Java
code using a web search. So it is very unlikely that you will need to upgrade
your handheld as it is very unlikely for you to stumble into such code.

Sure, if the feature isn't going to be used then it won't present
problems.
Thing is, this feature *is* going to be used. Just not by projects that you
are likely to stumble into. Most OpenSource projects will continue to stick to
English-only, and posts to English-speaking newsgroups will also stick to
English. But Closed-Source programs and posts to non-English newsgroups *can*
use this feature if their developers want. And you still wouldn't even notice.

Stefan
May 16 '07 #286
Hendrik van Rooyen wrote:
It is not so much for technical reasons as for aesthetic
ones - I find reading a mix of languages horrible, and I am
kind of surprised by the strength of my own reaction.
This is a matter of taste.
In some programs I use German identifiers (not unicode). I and others
like the mix. My customers can understand the code better. (They are
only reading it)
>
"Beautiful is better than ugly"
Correct.
But why do you think you should impose your taste on all of us?

With this logic you should all drive Alfa Romeos!

Gregor
May 16 '07 #287
How do you feel about the mix of English keywords and Chinese?
How do the English-like "sentences" look to a Chinese reader?

Would you support the extension of this PEP to include Chinese
Keywords?

Would that be a lesser or greater gift?
Because the students can remember some English words, mixing
characters is not a problem. But it's difficult for them to express their
own thoughts or logic in English or Pinyin (which only marks the
pronunciation of the Chinese characters).
In my experience, code that mixes Chinese-character identifiers with
English keywords is very easy to read.
Because of the large difference between Chinese characters and ASCII
characters, I can quickly distinguish my identifiers from keywords and
library words.
May 16 '07 #288
Eric Brunel:
... there is no
keyboard *on Earth* allowing to type *all* characters in the whole
Unicode set.
My keyboard in conjunction with the operating system (US English
keyboard on a Windows XP system) allows me to type characters from any
language. I haven't learned how to type these all quickly but I can get
through a session of testing Japanese input by myself. It's a matter of
turning on different keyboard layouts through the "Text Services and
Input Languages" control panel. Then there are small windows called
Input Method Editors that provide a mapping from your input to the
target language. Other platforms provide similar services.

Neil
May 16 '07 #289
"Stefan Behnel" <st******************@web.de> wrote:

Hendrik van Rooyen wrote:
"Beautiful is better than ugly"

Good point. Today's transliteration of German words into ASCII identifiers
definitely looks ugly. Time for this PEP to be accepted.
Nice out of context quote. :-)

Now look me in the eye and tell me that you find
the mix of proper German and English keywords
beautiful.

And I will call you a liar.

- Hendrik

May 16 '07 #290
Stefan Behnel wrote:
>Now, very special environments (what I called "rare and isolated"
earlier) like special learning environments for children are a different
matter. It should be ok if you have to use a specially patched Python
branch there, or have to use an interpreter option that enables the
suggested behaviour. For general programming, it IMO is a bad idea.

Ok, let me put it differently.

You *do not* design Python's keywords. You *do not* design the stdlib. You *do
not* design the concepts behind all that. You *use* them as they are. So you
can simply take the identifiers they define and use them the way the docs say.
You do not have to understand these names, they don't have to be words, they
don't have to mean anything to you. They are just tools. Even if you do not
understand English, they will not get in your way. You just learn them.
I claim that this is *completely unrealistic*. When learning Python, you
*do* learn the actual meanings of English terms like "open",
"exception", "if" and so on if you did not know them before. It would be
extremely foolish not to do so. You do care about these names and you do
want to know their meaning if you want to write anything more in your
life than a 10-line throw-away script.
But you *do* design your own software. You *do* design its concepts. You *do*
design its APIs. You *do* choose its identifiers. And you want them to be
clear and telling. You want them to match your (or your clients) view of the
application. You do not care about the naming of the tools you use inside. But
you do care about clarity and readability in *your own software*.
I do care about the naming of my tools. I care a lot. Part of why I like
Python is that it resisted the temptation to clutter the syntax up with
strange symbols like Perl. And I do dislike the decorator syntax, for
example.

Also, your distinction between "inside" and "your own" is nonsense,
because the "inside" does heavily leak into the "own". It is impossible
to write "your own software" with clarity and readability by your
definition (i.e. in your native language). Any real Python program is a
mix of identifiers you designed yourself and identifiers you did not
design yourself. And I think the ones chosen by yourself are even often
in the minority. It is not feasible in practice to just learn what the
"other" identifiers do without understanding their names. Not for
general programming. The standard library is already too big for that,
and most real programs use not only the standard library, but also third
party libraries that have English APIs.

--
René
May 16 '07 #291
Christophe wrote:
Who displays stack frames? Your code.
Wrong.
Whose code includes unicode
identifiers? Your code.
Wrong.
Whose fault is it to create a stack trace
display procedure that cannot handle unicode? You.
Wrong. If you never have to deal with other people's code,
congratulations to you. Many other people have to. And no, I usually
cannot just tell the person to fix his code. I need to deal with it.
Even if you don't
make use of them, you still have to fix the stack trace display
procedure because the exception error message can include unicode text
*today*
The error message can, but at least the function names and other
identifiers can not.
You should know that displaying and editing UTF-8 text as if it was
latin-1 works very very well.
No, this only works for those characters that are in the ASCII range.
For all the other characters it does not work well at all.
Also, Terminals have support for UTF-8 encodings already.
Some have, some have not. And you not only need a terminal that can
handle UTF-8 data, you also need a font that has a glyph for all the
characters you need to handle, and you may also need a way to actually
enter those characters with your keyboard.

--
René
May 16 '07 #292
René Fleschenberg wrote:
Christophe wrote:
>You should know that displaying and editing UTF-8 text as if it was
latin-1 works very very well.

No, this only works for those characters that are in the ASCII range.
For all the other characters it does not work well at all.
This alone shows you don't know enough about UTF-8 to talk about it.
UTF-8 will NEVER use < 128 chars to describe multibyte chars. When you
parse a UTF-8 file, each space is a space, each \n is an end of line and
each 'Z' is a 'Z'.
>Also, Terminals have support for UTF-8 encodings already.

Some have, some have not. And you not only need a terminal that can
handle UTF-8 data, you also need a font that has a glyph for all the
characters you need to handle, and you may also need a way to actually
enter those characters with your keyboard.
Ever heard of the famous "cut/paste"? I use it all the time, even when
handling standard ASCII english code. It greatly cuts down my ability to
make some typo while writing code.
May 16 '07 #293
Christophe wrote:
René Fleschenberg wrote:
>Christophe wrote:
>>You should know that displaying and editing UTF-8 text as if it was
latin-1 works very very well.

No, this only works for those characters that are in the ASCII range.
For all the other characters it does not work well at all.

This alone shows you don't know enough about UTF-8 to talk about it.
UTF-8 will NEVER use < 128 chars to describe multibyte chars. When you
parse a UTF-8 file, each space is a space, each \n is an end of line and
each 'Z' is a 'Z'.
So? Does that mean that you can just display UTF-8 "as if it was
Latin-1"? No, it does not. It means you can do that for exactly those
characters that are in the ASCII range. For all the others, you can not.
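
Both halves of this exchange can be checked with a few lines of Python 3. This is a minimal sketch with a made-up sample string: it shows that the bytes of a multibyte UTF-8 sequence are all >= 0x80 (so ASCII characters are never affected), that the same bytes read as Latin-1 turn "ö" into the two characters "Ã¶", and that nothing is lost as long as the bytes are written back unchanged.

```python
source = "def öffnen(): pass"
utf8_bytes = source.encode("utf-8")

# Every byte of a multibyte UTF-8 sequence is >= 0x80, so ASCII bytes
# such as space, newline or 'Z' never appear inside one.
non_ascii_bytes = "ö".encode("utf-8")
assert all(b >= 0x80 for b in non_ascii_bytes)

# Viewing the UTF-8 bytes as Latin-1 keeps the ASCII part readable,
# but the single character "ö" shows up as two unrelated characters.
misread = utf8_bytes.decode("latin-1")
print(misread)

# If the bytes are written back unchanged, the original text survives.
restored = misread.encode("latin-1").decode("utf-8")
assert restored == source
```

So the file stays intact and its ASCII parts stay legible, but any non-ASCII identifier is displayed as mojibake.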

--
René
May 16 '07 #294
René Fleschenberg wrote:
Stefan Behnel wrote:
>>Now, very special environments (what I called "rare and isolated"
earlier) like special learning environments for children are a different
matter. It should be ok if you have to use a specially patched Python
branch there, or have to use an interpreter option that enables the
suggested behaviour. For general programming, it IMO is a bad idea.
Ok, let me put it differently.

You *do not* design Python's keywords. You *do not* design the stdlib. You *do
not* design the concepts behind all that. You *use* them as they are. So you
can simply take the identifiers they define and use them the way the docs say.
You do not have to understand these names, they don't have to be words, they
don't have to mean anything to you. They are just tools. Even if you do not
understand English, they will not get in your way. You just learn them.

I claim that this is *completely unrealistic*. When learning Python, you
*do* learn the actual meanings of English terms like "open",
Fine, then go ahead and learn their actual meaning in two languages (Python
and English). My point is: you don't have to. You only need to understand
their meaning in Python. Whether or not English can help here or can be useful
in your later life is completely off-topic.

>But you *do* design your own software. You *do* design its concepts. You *do*
design its APIs. You *do* choose its identifiers. And you want them to be
clear and telling. You want them to match your (or your clients) view of the
application. You do not care about the naming of the tools you use inside. But
you do care about clarity and readability in *your own software*.

I do care about the naming of my tools. I care a lot. Part of why I like
Python is that it resisted the temptation to clutter the syntax up with
strange symbols like Perl. And I do dislike the decorator syntax, for
example.

Also, your distinction between "inside" and "your own" is nonsense,
because the "inside" does heavily leak into the "own". It is impossible
to write "your own software" with clarity and readability by your
definition (i.e. in your native language). Any real Python program is a
mix of identifiers you designed yourself and identifiers you did not
design yourself. And I think the ones chosen by yourself are even often
in the minority. It is not feasible in practice to just learn what the
"other" identifiers do without understanding their names. Not for
general programming. The standard library is already too big for that,
and most real programs use not only the standard library, but also third
party libraries that have English APIs.
Ok, I think the difference here is that I have practical experience with
developing that way and I am missing native identifiers in my daily work. You
don't have that experience and therefore do not feel that need. And you know
what? That's perfectly fine. I'm not criticising that at all. All I'm
criticising is that people without need for this feature are trying to prevent
those who need it and want to use it *where it is appropriate* from actually
getting this feature into the language.

Stefan
May 16 '07 #295
"Years ago", I wrote RUR-PLE (a Python learning environment based on
Karel the Robot).
Someone mentioned using RUR-PLE to teach programming in Chinese to
kids. Here's a little text extracted from the English lessons (and an
even smaller one from the Turkish one). I believe that this is
relevant to this discussion.
==========
While the creators of Reeborg designed him so that he obeys
instructions in English, they realised that not everyone understands
English. So, they gave him the ability to easily learn a second
language. For example, if we want to tell someone to "move forward" in
French, we would say "avance". We can tell Reeborg that "avance" is a
synonym of "move" simply by writing
avance = move.
The order here is important; the known command has to be on the right,
and the new one has to be on the left. Note that we don't have any
parentheses "()" appearing since the parentheses would tell Reeborg
that we want him to obey an instruction; here, we are simply teaching
him a new word. When we want Reeborg to follow the new instruction, we
will use avance().
[snip]

If you want, you can also teach Reeborg a synonym for turn_off. Or,
you may give synonyms in a language other than French if you prefer,
even creating your own language. Then, watch Reeborg as he obeys
instructions written in your language.
[snip]
Note that, if English is not your favourite language, you can always
create a synonym in your language, as long as you define it first,
before using it. However, the synonym you introduce must use the
English alphabet (letters without any accents). For example, in
French, one might define vire_a_gauche = turn_left and use
vire_a_gauche() to instruct the robot to turn left.

----------(this last paragraph, now translated in Turkish)

Eğer İngilizce sizin favori diliniz değilse komutları her zaman kendi
dilinizde de tanımlayabilirsiniz, ancak kendi dilinizde tanımladığınız
komutları oluştururken yalnızca İngiliz alfabesindeki 26 harfi
kullanabilirsiniz. Örneğin Türkçede sola dönüş için sola_don =
turn_left kullanılmalıdır (ö yerine o kullanılmış dikkat ediniz). Bu
tanımlamayı yaptıktan sonra Reeborg'u sola döndürmek için sola_don()
komutunu kullanabilirsiniz.
=================
I don't read Turkish, but I notice the number 26 there (plus many
accented letters in the text), so I suspect it refers to the small English
alphabet. It always bugged me that I could not have proper robot
commands in French.
While I would not use any non-ascii characters in my coding project
(because I like to be able to get bug reports [and patch!] from
others), I would love to be able to rewrite the lessons for RUR-PLE
using commands in proper French, rather than the bastardized purely
ascii based version. And I suspect it would be even more important in
Chinese...
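
For illustration, here is roughly what the lesson's synonym trick could look like with accented names, assuming a Python 3 interpreter (where non-ASCII identifiers are accepted). The functions `move` and `turn_left` below are simple stand-ins invented for this sketch, not RUR-PLE's real robot commands.

```python
def move():
    """Stand-in for the robot's built-in 'move' command (hypothetical)."""
    return "move"


def turn_left():
    """Stand-in for the robot's built-in 'turn_left' command (hypothetical)."""
    return "turn_left"


# Teach the robot synonyms: new name on the left, known command on the
# right, and no parentheses, because nothing is being called yet.
avance = move
vire_à_gauche = turn_left   # French, with the accent the lesson had to drop
sola_dön = turn_left        # Turkish, with "ö" instead of the ASCII "o"

# Parentheses make the robot obey the instruction.
actions = [avance(), vire_à_gauche(), sola_dön()]
print(actions)
```

The assignment only binds a second name to the same function object, which is why the synonym has to be defined before it is used.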

André

May 16 '07 #296
On Wed, 16 May 2007 12:22:01 +0200, Neil Hodgson
<ny*****************@gmail.com> wrote:
Eric Brunel:
>... there is no keyboard *on Earth* allowing to type *all* characters
in the whole Unicode set.

My keyboard in conjunction with the operating system (US English
keyboard on a Windows XP system) allows me to type characters from any
language. I haven't learned how to type these all quickly but I can get
through a session of testing Japanese input by myself. It's a matter of
turning on different keyboard layouts through the "Text Services and
Input Languages" control panel. Then there are small windows called
Input Method Editors that provide a mapping from your input to the
target language. Other platforms provide similar services.
Funny you talk about Japanese, a language I'm a bit familiar with and for
which I actually know some input methods. The thing is, these only work if
you know the transcription to the latin alphabet of the word you want to
type, which closely matches its pronunciation. So if you don't know that 売り場
is pronounced "uriba" for example, you have absolutely no way of
entering the word. Even if you could choose among a list of characters,
are you aware that there are almost 2000 "basic" Chinese characters used
in the Japanese language? And if I'm not mistaken, there are several tens
of thousands characters in the Chinese language itself. This makes typing
them virtually impossible if you don't know the language and/or have the
correct keyboard.
--
python -c "print ''.join([chr(154 - ord(c)) for c in
'U(17zX(%,5.zmz5(17l8(%,5.Z*(93-965$l7+-'])"
May 16 '07 #297
On Wed, 16 May 2007 09:12:40 +0200, René Fleschenberg wrote
The X people who speak "no English" and program in Python. I
think X actually is very low (close to zero), because programming in
Python virtually does require you to know some English, whether you
can use non-ASCII characters in identifiers or not. It is naive to
believe that you can program in Python without understanding any
English once you can use your native characters in identifiers. That
will not happen. Please understand that: You basically *must* know
some English to program in Python, and the reason for that is not
that you cannot use non-ASCII identifiers.
There is evidence against your assertions that knowing some English is a
prerequisite for programming in Python and that people won't use non-ASCII
identifiers if they could. Go read the posts by "HYRY" on this thread, a
teacher from China, who teaches his students programming in Python, and they
don't know any English. They *do* use non-ASCII identifiers, and then they use
a cleanup script the teacher wrote to replace the identifiers with ASCII
identifiers so that they can actually run their programs. This disproves your
assertion on both counts.

-Carsten
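
For illustration, here is a minimal sketch of how a cleanup script of the
kind described above might work. It is not HYRY's actual script, which is
not shown in this thread. The function name asciify_identifiers and the
"u_<hex>" naming scheme are assumptions made up for this example. It relies
on the standard tokenize module, so only identifiers are renamed while
strings and comments are left alone, and it assumes Python 3.7 or later
(for str.isascii).

```python
import io
import tokenize


def ascii_name(name):
    # Hypothetical naming scheme: spell out each code point in hex.
    return "u_" + "_".join(format(ord(ch), "x") for ch in name)


def asciify_identifiers(source):
    """Return (new_source, mapping) with non-ASCII names made ASCII."""
    mapping = {}
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        # Only NAME tokens are touched; strings and comments pass through.
        if tok.type == tokenize.NAME and not text.isascii():
            if text not in mapping:
                mapping[text] = ascii_name(text)
            text = mapping[text]
        result.append((tok.type, text))
    # Two-element tuples make untokenize rebuild the spacing itself.
    return tokenize.untokenize(result), mapping


if __name__ == "__main__":
    demo = "价格 = 3\n数量 = 4\n总计 = 价格 * 数量\n"
    cleaned, names = asciify_identifiers(demo)
    print(cleaned)
    print(names)
```

Note that the spacing of the output is normalized by untokenize rather than
preserved, and that names reached only through strings (getattr, or imports
of module files with non-ASCII names) are not handled. A generated name
could also clash with an existing ASCII name, which this sketch does not
check.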

May 16 '07 #298
Ross Ridge wrote:
non-ASCII identifiers. While it's easy to find code where comments use
non-ASCII characters, I was never able to find a non-made up example
that used them in identifiers.
Gregor Horvath <gh@gregor-horvath.com> wrote:
>If comments are allowed to be non-English, then why are identifiers not?
In the code I was looking at, identifiers were allowed to use non-ASCII
characters. For whatever reason, the programmers chose not to use non-ASCII
identifiers even though they had no problem using non-ASCII characters
in comments.

Ross Ridge

--
l/ // Ross Ridge -- The Great HTMU
[oo][oo] rr****@csclub.uwaterloo.ca
-()-/()/ http://www.csclub.uwaterloo.ca/~rridge/
db //
May 16 '07 #299
You have misread my statements.

Carsten Haese wrote:
There is evidence against your assertions that knowing some English is a
prerequisite for programming
I think it is a prerequisite for "real" programming. Yes, I can imagine
that if you use Python as a teaching tool for Chinese 12-year-olds, then
it might be nice to be able to spell identifiers with Chinese
characters. However, IMO this is such a special use-case that it is
justified to require the people who need this to explicitly enable it,
by using a patched interpreter or by enabling an interpreter option for
example.
in Python and that people won't use non-ASCII
identifiers if they could.
I did not assert that at all; where did you get the impression that I
did? If I were convinced that no one would use it, I would not have such a
big problem with it. I fear that it *will* be used "in the wild" if the
PEP in its current form is accepted and that I personally *will* have to
deal with such code.

--
René
May 16 '07 #300
