
Using non-ascii symbols

On the page http://wiki.python.org/moin/Python3%2e0Suggestions
I noticed an interesting suggestion:

"These operators ≤ ≥ ≠ should be added to the language having the
following meaning:

<= >= !=

this should improve readability (and make language more accessible to
beginners).

This should be an evolution similar to the digraphs and trigraphs
(digramme et trigramme) from C and C++ languages."

How do people on this group feel about this suggestion?

The symbols above are not even latin-1, you need utf-8.

(There are not many useful symbols in Latin-1. Maybe one could use ×
for Cartesian products...)
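
A quick way to check which of these symbols fit in Latin-1 is to try encoding them. This is only a sketch in present-day Python 3 (not the Python of this thread), and `fits_latin1` is a made-up helper name:

```python
def fits_latin1(symbol):
    """Return True if the symbol can be encoded in ISO-8859-1 (Latin-1)."""
    try:
        symbol.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# Map each candidate symbol to (fits in Latin-1?, size in UTF-8 bytes).
symbols = "≤≥≠×"
report = {s: (fits_latin1(s), len(s.encode("utf-8"))) for s in symbols}
```

Under these assumptions only × (U+00D7) fits in Latin-1; the three comparison symbols each need a multi-byte encoding such as UTF-8.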

And while they are more readable, they are not easier to type (at
least with most current editors).

Is this idea absurd or will one day our children think that restricting
to 7-bit ascii was absurd?

Are there similar attempts in other languages? I can only think of APL,
but that was a long time ago.

Once you open your mind to using non-ascii symbols, I'm sure one can
find a bunch of useful applications. Variable names could be allowed to
be non-ascii, as in XML. Think class names in Arabic... Or you could
use Greek letters if you run out of one-letter variable names, just as
mathematicians do. Would this be desirable or rather a horror scenario?
Opinions?

-- Christoph
Jan 24 '06
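
For reference, Python 3 did later allow non-ASCII letters in identifiers (PEP 3131), while operator symbols such as ≤ remain outside the grammar. A minimal sketch, assuming Python 3; all names here (`circle_area`, `slope`, `operator_accepted`) are invented for illustration:

```python
import math

# Greek letters are Unicode letters, so Python 3 accepts them as identifiers.
π = math.pi
Δx = 0.5


def circle_area(r):
    """Area of a circle of radius r, written with a Greek-letter constant."""
    return π * r ** 2


# Finite-difference slope of the area around r = 1, using the Δx name.
slope = (circle_area(1 + Δx) - circle_area(1)) / Δx

# Operator symbols are a different matter: "1 ≤ 2" does not compile.
try:
    compile("1 ≤ 2", "<sketch>", "eval")
    operator_accepted = True
except SyntaxError:
    operator_accepted = False
```

So the "Greek letters as variable names" half of the idea works in Python 3, while the "≤ as an operator" half does not.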
61 Replies


Having a bit of a play with some of my spam reduction code.

Original:

    def isMostlyCyrillic(u):
        if type(u) != type(u""):
            u = unicode(u, "UTF-8")
        cnt = float(sum(0x400 <= ord(c) < 0x500 for c in u))
        return (cnt > 1) and ((cnt / len(u)) > 0.5)

Using more mathematical operators:

    def isMostlyCyrillic(u):
        if type(u) ≠ type(u""):
            u ← unicode(u, "UTF-8")
        cnt ← float(∑(0x400 ≤ ord(c) < 0x500 ∀ c ∈ u))
        return (cnt > 1) ∧ ((cnt ÷ len(u)) > 0.5)

The biggest win for me is "≠", with "←" also an improvement. I'm so
used to "/" for division that "÷" now looks strange.

Neil
Jan 27 '06 #51
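
The original function above relies on Python 2's `unicode`. A rough Python 3 equivalent might look like the sketch below; the snake_case name and the bytes handling are assumptions for illustration, not the poster's code:

```python
def is_mostly_cyrillic(text):
    """Return True if more than half the characters fall in U+0400..U+04FF."""
    # Python 3 strings are already Unicode; only raw bytes need decoding.
    if isinstance(text, bytes):
        text = text.decode("utf-8")
    # Booleans sum as 0/1, so this counts characters in the Cyrillic block.
    count = sum(0x400 <= ord(c) < 0x500 for c in text)
    # "count > 1" short-circuits first, so an empty string never divides by zero.
    return count > 1 and count / len(text) > 0.5
```

In Python 3, `/` is true division, so the explicit `float(...)` conversion of the original is no longer needed.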

On Fri, 27 Jan 2006 12:50:03 -0600
Runsun Pan <py********@gmail.com> wrote:
On 1/27/06, Magnus Lycka <ly***@carmen.se> wrote:
Actually, it seems that the recent habit of sending text
messages via mobile phones is the prime driver for
reformed spelling these days.

OMG ru kdng?

Make it stop!

Well, let's just say, I think there should be different
standards for "write once / read once" versus "write once /
read many". The mere use of written language once implied
the latter, but I suppose text messaging breaks that rule.
Well, to solve the problem you can either (1) reform the
spelling of a language to meet the limitations of mobile
phones, or (2) advance the input devices on mobile
phones so that they can input the language of your
choice. For most Asian languages, (1) is certainly out of the
question.


IIRC, back in the 1990s there was a *lot* of work in Japan
on optical character recognition, and especially "digital
ink" or "stroke" recognition. With all the pen tablets out
these days, it seems like that would be an awfully good way
to handle ideograms.

First of all, they are, much more than Western alphabets,
strict about stroke order and direction (technically the
Roman alphabet is supposed to be drawn a certain way, but
many people "cheat" -- I think that's harder to get away
with with Asian characters, because they tend not to look
right when drawn wrong). And when you have the actual
stroke sequence data as input, recognition is easier and
more reliable (I think that was the point behind the
"graffiti" system for the Palm Pilot).

--
Terry Hancock (ha*****@AnansiSpaceworks.com)
Anansi Spaceworks http://www.AnansiSpaceworks.com

Jan 28 '06 #52

On Fri, 27 Jan 2006 22:29:20 GMT,
Neil Hodgson <ny*****************@gmail.com> wrote:
... I'm so used to "/" for division that "÷" now looks strange.


Strange, indeed, and too close to + for me (at least within my
newsreader).

Regards,
Dan

--
Dan Sommers
<http://www.tombstonezero.net/dan/>
Jan 28 '06 #53

On 1/27/06, Terry Hancock <ha*****@anansispaceworks.com> wrote:
Well, let's just say, I think there should be different
standards for "write once / read once" versus "write once /
read many". The mere use of written language once implied
the latter, but I suppose text messaging breaks that rule.
Since we are on this, let me share with you guys a little 'tip of the
iceberg' of how the younger generation in Taiwan communicates:

A: why did you tell av8d that I am a bmw ?
B: Well, you are just like one of those ogs or obs ...
A: oic, you think you are much q than I ?
B: ...
A: I would 3q if you stop doing so.
B: ok.
A: Orz
B: 88
A: 881

Can you guys figure out the details ?

Here is the decoded version:

A: why did you tell av8d that I am a bmw ?
[8 in our language is pronounced as "ba", so av8d = everybody]

B: Well, you are just like one of those ogs or obs ...
[ogs = oh-ji-sang, obs = oh-ba-sang; from Japanese, meaning old guy and
old woman, respectively]

A: oic, you think you are much q than I ?
[oic=Oh I see; q = cute]

A: I would 3q if you stop doing so.
[ 3q = thank you ]

B: ok.

A: Orz
[ appreciate very much --- it looks like a guy kneeling down before an emperor ]

B: 88
[ bye-bye ]

A: 881
[ bye-bye with a tone, sometimes 886 = bye-bye-loh ]

The above example is just an extremely simple one. In the real world,
they combine all sorts of language sources --- Mandarin, Japanese,
English, Taiwanese ... as well as "shapes" like Orz.

This kind of mixture-of-everything is widely used by the younger
generation; it is sometimes called "net terms", sometimes "Martian
words". It facilitates online activities among young people, but
creates huge 'generation gaps' --- some dictionaries have been published
so that high school teachers can study them in order to talk with and
understand their students.

IMO, a language is a living organism, it has its own life and often
evolves with unexpected turns. Maybe in the future some of those
Martian Words will become part of formal Taiwanese, who knows ? :)
First of all, they are, much more than Western alphabets,
strict about stroke order and direction (technically the
Roman alphabet is supposed to be drawn a certain way, but
many people "cheat" -- I think that's harder to get away
with with Asian characters, because they tend not to look
right when drawn wrong). And when you have the actual
stroke sequence data as input, recognition is easier and
more reliable (I think that was the point behind the
"graffiti" system for the Palm Pilot).


But ... to my knowledge, all of the input tablets that use OCR have a
training feature. You can teach the program to recognize your own
order of strokes. The ability to train (be trained) is a key
element of such an input device.

--
~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~
Runsun Pan, PhD
py********@gmail.com
Nat'l Center for Macromolecular Imaging
http://ncmi.bcm.tmc.edu/ncmi/
~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~
Jan 28 '06 #54

Rocco Moretti wrote:
Could it be APL?


No, it was much newer... someone did it as a hobby language.
Jan 28 '06 #55

Runsun Pan <py********@gmail.com> writes:
Can you guys figure out the details ?

Here is the decoded version:
It looks like, at 26 years old, I'm already too old to understand something like
that... All I can say is OMG... :-)
IMO, a language is a living organism, it has its own life and often
evolves with unexpected turns. Maybe in the future some of those
Martian Words will become part of formal Taiwanese, who knows ? :)


I am extremely against that for pt_BR (Brazilian Portuguese). There's a TV
channel here that has some movies with "net terms" instead of pt_BR for the
translation...

--
Jorge Godoy <go***@ieee.org>

"Quidquid latine dictum sit, altum sonatur."
- Qualquer coisa dita em latim soa profundo.
- Anything said in Latin sounds smart.
Jan 28 '06 #56

Runsun Pan wrote:
The simplified chinese exists due to the call for modernization of
language decades ago. That involved the 'upside-down' of almost
entire culture

This is in some ways quite the opposite compared to Nynorsk
in Norway, which was an attempt to revive the old and pure
Norwegian, after being dominated (in politics as well as in
grammar) by Denmark from 1387-1814. (I guess it was a
complicating factor that the end of the union with Denmark
led to a union with Sweden. The Norwegians probably had some
difficulties deciding what neighbour they disliked most. When
they broke out of the union with Sweden in 1905, they actually
elected a Danish prince to be their king.) Anyway, only a
fraction of the Norwegians use Nynorsk today, and the majority
still speak the Danish-based bokmål. On the other hand, the
spelling of bokmål has also been modernized a lot, with a
series of spelling reforms of both languages.
Jan 29 '06 #57

On Fri, 27 Jan 2006 23:57:16 -0600
Runsun Pan <py********@gmail.com> wrote:
But ... to my knowledge, all of the input tablets that
use OCR have a training feature. You can teach the
program to recognize your own order of strokes. The
ability to train (be trained) is a key element of
such an input device.


Yeah, but I would think that would be a real drawback when
there's something like 2000 to 10,000 characters to train
on! I think you'd need some kind of short cut (maybe you
could share radical information between characters?).

But I guess I assumed this would already be a solved problem
by now. Maybe it was a lot harder than expected.

--
Terry Hancock (ha*****@AnansiSpaceworks.com)
Anansi Spaceworks http://www.AnansiSpaceworks.com

Jan 30 '06 #58

On Fri, 27 Jan 2006 20:27:40 -0500 in comp.lang.python, Dan Sommers
<me@privacy.net> wrote:
On Fri, 27 Jan 2006 22:29:20 GMT,
Neil Hodgson <ny*****************@gmail.com> wrote:
... I'm so used to "/" for division that "÷" now looks strange.

Indeed, I don't think I've used ÷ for division since about 7th grade,
when I first started taking Algebra (over 30 years ago).

Strange, indeed, and too close to + for me (at least within my
newsreader).


FWIW, it looks closer to - than + in mine. And as you say, _too_
close. IMHO.

Regards,
-=Dave

--
Change is inevitable, progress is not.
Jan 30 '06 #59

Dave Hansen wrote:
On Fri, 27 Jan 2006 20:27:40 -0500 in comp.lang.python, Dan Sommers
<me@privacy.net> wrote:
On Fri, 27 Jan 2006 22:29:20 GMT,
Neil Hodgson <ny*****************@gmail.com> wrote:
... I'm so used to "/" for division that "÷" now looks strange.


Indeed, I don't think I've used ÷ for division since about 7th grade,
when I first started taking Algebra (over 30 years ago).


I have never even used it, except that it's printed on calculators. In
school we used ":" and afterwards "/".

--
If I have been able to see further, it was only because I stood
on the shoulders of giants. -- Isaac Newton

Roel Schroeven
Jan 30 '06 #60

Dave Hansen <id**@hotmail.com> wrote:
On Fri, 27 Jan 2006 20:27:40 -0500 in comp.lang.python, Dan Sommers
<me@privacy.net> wrote:
On Fri, 27 Jan 2006 22:29:20 GMT,
Neil Hodgson <ny*****************@gmail.com> wrote:
... I'm so used to "/" for division that "÷" now looks strange.


Indeed, I don't think I've used ÷ for division since about 7th grade,
when I first started taking Algebra (over 30 years ago).


I used it in APL, and the last time was less than 20 years ago;-).
Alex
Jan 31 '06 #61

From 1387 to 1814, a period of about 430 years; that's quite a long time.
About the total recountable history of Taiwan... :)

In her 400-some years of history, Taiwan has been occupied by several
foreign powers, including the Dutch, the Tsing Dynasty from China, Japan,
and the KMT party from China again. The long fights against foreign
powers were all futile, and resulted in a 'macro-personality' of getting used
to being slaves.

The mentality of being slaves is that when you have the chance to
play master yourself, you still look up to the old master to either get
approval or beg for mercy. This has resulted in a bizarre situation in current
Taiwan: even though a local-based, democratic government was elected,
the old foreign power is still the underground power that truly controls
all aspects of Taiwan. They reject whatever policies the democratic
government plans. Even for many nationwide construction projects that the
old power planned and supported when they were in power, they have reversed
their positions and now reject them.

The slave mentality of the public is something that helps the old power
to paralyze society. With that, a language reform to reduce the cultural
influence of the foreign power is therefore hopeless in Taiwan (at least
currently).

Maybe Norwegians have some sort of that mentality too ? Considering
that they would rather elect people from the old foreign power ...
On 1/29/06, Magnus Lycka <ly***@carmen.se> wrote:
Runsun Pan wrote:
The simplified chinese exists due to the call for modernization of
language decades ago. That involved the 'upside-down' of almost
entire culture

This is in some ways quite the opposite compared to Nynorsk
in Norway, which was an attempt to revive the old and pure
Norwegian, after being dominated (in politics as well as in
grammar) by Denmark from 1387-1814. (I guess it was a
complicating factor that the end of the union with Denmark
led to a union with Sweden. The Norwegians probably had some
difficulties deciding what neighbour they disliked most. When
they broke out of the union with Sweden in 1905, they actually
elected a Danish prince to be their king.) Anyway, only a
fraction of the Norwegians use Nynorsk today, and the majority
still speak the Danish-based bokmål. On the other hand, the
spelling of bokmål has also been modernized a lot, with a
series of spelling reforms of both languages.

--
~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~
Runsun Pan, PhD
py********@gmail.com
Nat'l Center for Macromolecular Imaging
http://ncmi.bcm.tmc.edu/ncmi/
~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~
Jan 31 '06 #62
