unicodedata name for \u000a

Newbie question: on unicodedata.name
If I do

import unicodedata
unicodedata.name(u"a")
or
unicodedata.name(u"\u0061")

I get
'LATIN SMALL LETTER A'

as expected; but when I follow that with

unicodedata.name(u"\u000a")

I get

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: no such name

There is, of course, a Unicode name for \u000a,
which is 'LINE FEED' or perhaps 'LINE FEED (A)'.

Is there a gap in unicodedata? or in my understanding?

Thanks,

Ken


Jul 18 '05 #1
On Sat, 21 Aug 2004 21:24:04 +0200, rumours say that Ken Beesley
<ke*********@xrce.xerox.com> might have written:

[snip]
> unicodedata.name(u"\u000a")
>
> I get
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> ValueError: no such name
>
> There is, of course, a Unicode name for \u000a,
> which is 'LINE FEED' or perhaps 'LINE FEED (A)'.
>
> Is there a gap in unicodedata? or in my understanding?


It seems that all control characters (u"\u0000" to u"\u001f") have no
names in unicodedata. Don't know if this is an omission (ie bug) or
intentional.
--
TZOTZIOY, I speak England very best,
"Tssss!" --Brad Pitt as Achilles in unprecedented Ancient Greek
Jul 18 '05 #2
Ken Beesley wrote:

> There is, of course, a Unicode name for \u000a,


No, there isn't. Check
http://www.unicode.org/charts/PDF/U0000.pdf
--
Peter Kleiweg L:NL,af,da,de,en,ia,nds,no,sv,(fr,it) S:NL,de,en,(da,ia)
info: http://www.let.rug.nl/~kleiweg/ls.html

Jul 18 '05 #3
Peter Kleiweg <in*************@nl.invalid> writes:
> No, there isn't. Check
> http://www.unicode.org/charts/PDF/U0000.pdf


Quoting that document:

Alias names are those for ISO/IEC 6429:1992.
Commonly used alternative aliases are also shown.

000A LF <control>
= LINE FEED (LF)

So the authors of unicodedata.name() could have picked either
'<control>', the ASCII name 'LF' or the alternative 'LINE FEED (LF)'.
Not picking any of them seems strange, and as the OP pointed out,
leads to an error even though the "C0 Controls" part of that page *is*
part of Unicode.
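
For readers on a current interpreter, here is a minimal sketch of how the alias in that chart relates to the module today. It assumes Python 3.3 or later, where unicodedata.lookup() is documented as accepting name aliases; the variable names has_name and lf are illustrative only. unicodedata.name() still has no formal name to return for U+000A.

```python
import unicodedata

# name() has no formal character name to return for U+000A,
# so without a default argument it raises ValueError.
try:
    unicodedata.name("\n")
    has_name = True
except ValueError:
    has_name = False

# lookup() goes the other way (name -> character) and, on
# Python 3.3+, also resolves name aliases such as "LINE FEED".
lf = unicodedata.lookup("LINE FEED")

print(has_name)
print(repr(lf))
```

The asymmetry is the point: an alias can be used to find the character, but it is not reported back as the character's name.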
Jul 18 '05 #4
Tor Iver Wilhelmsen wrote:
> 000A LF <control>
> = LINE FEED (LF)
>
> So the authors of unicodedata.name() could have picked either
> '<control>', the ASCII name 'LF' or the alternative 'LINE FEED (LF)'.

No. <control> is not a character name. The unicodedata.name function
returns the official character name, so it MUST NOT return an alias
(which rules out your second alternative).

> Not picking any of them seems strange, and as the OP pointed out,
> leads to an error even though the "C0 Controls" part of that page *is*
> part of Unicode.


Yes. However, this strangeness originates from the Unicode
specification. Control characters simply do not have a name.

If you want to know whether a code point is an unassigned character,
check whether unicodedata.category returns "Cn".
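
A minimal sketch of that check, assuming Python 3 and using U+0378 as an example of a code point that is unassigned at the time of writing. The helper name describe is made up for illustration. It shows that a control character and an unassigned code point both lack a name but fall into different general categories.

```python
import unicodedata

def describe(ch):
    """Return (general category, name or None) for one character."""
    return unicodedata.category(ch), unicodedata.name(ch, None)

# An ordinary letter has both a category and a name.
print(describe("a"))
# A C0 control character is assigned (category "Cc") but unnamed.
print(describe("\n"))
# An unassigned code point reports category "Cn" and is also unnamed.
print(describe("\u0378"))
```

So a missing name alone does not tell you that a code point is unassigned; the category does.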

Regards,
Martin
Jul 18 '05 #5
"Martin v. Löwis" <ma****@v.loewis.de> writes:
> No. <control> is not a character name. The unicodedata.name function
> returns the official character name, so it MUST NOT return an alias
> (which rules out your second alternative).


Then why not return None or the empty string instead of raising an
exception?
Jul 18 '05 #6
Tor Iver Wilhelmsen wrote:
> Then why not return None or the empty string instead of raising an
> exception?


Why does a dictionary lookup raise a KeyError instead of returning
None or an empty string? It's easy enough to add a function that
does what you want:

import unicodedata

def name(c):
    # Return the character's Unicode name, or None if it has none.
    try:
        return unicodedata.name(c)
    except ValueError:
        return None

Python reports failures through exceptions, not through special
return values. It might have been an option initially to return
None. Now, it cannot be changed for backwards compatibility.

Regards,
Martin
Jul 18 '05 #7
Tor Iver Wilhelmsen wrote:
> "Martin v. Löwis" <ma****@v.loewis.de> writes:
>> No. <control> is not a character name. The unicodedata.name function
>> returns the official character name, so it MUST NOT return an alias
>> (which rules out your second alternative).
>
> Then why not return None or the empty string instead of raising an
> exception?


What's wrong with

>>> import unicodedata
>>> unicodedata.name(u"\u000a", "my default value")
'my default value'
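
As a slightly larger sketch of the same idea, the optional second argument can drive a small labelling helper. The function name label and the "<no name>" placeholder are invented for this example and are not part of unicodedata.

```python
import unicodedata

def label(ch, fallback="<no name>"):
    """Format a character as 'U+XXXX NAME', using fallback when unnamed."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch, fallback))

for ch in u"a\n\u00e9":
    print(label(ch))
# U+0061 LATIN SMALL LETTER A
# U+000A <no name>
# U+00E9 LATIN SMALL LETTER E WITH ACUTE
```

Passing a default keeps the loop free of try/except while still making the unnamed characters visible in the output.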

Peter

Jul 18 '05 #8
