Combining diacritical marks and HTML+CSS

Greetings.

Is it possible using HTML and CSS to represent a combining diacritical mark
in a different style from the letter it modifies? For example, say I want
to render ő (Latin small letter o with a double acute accent), but with
the o in black and the double acute accent in green. Are either of the
following valid?

1. <span style="color: black;">o</span><span style="color: green;">&#x030B;</span>

2. <span style="color: black;">o<span style="color: green;">&#x030B;</span></span>

Neither of the two browsers I tested (SeaMonkey 1.1.6 and Konqueror 3.5.8,
both on GNU/Linux) render the examples as intended. Is there some part of
the HTML, CSS, or Unicode standards which says that combining diacritical
marks can't be styled independently, or are my browsers buggy?

Regards,
Tristan

--
_
_V.-o Tristan Miller [en,(fr,de,ia)] >< Space is limited
/ |`-' -=-=-=-=-=-=-=-=-=-=-=-=-=-=-= < In a haiku, so it's hard
(7_\\ http://www.nothingisreal.com/ >< To finish what you
Nov 12 '07 #1


On 11/12/2007 11:07 AM, Tristan Miller wrote:
> Greetings.
>
> Is it possible using HTML and CSS to represent a combining diacritical mark
> in a different style from the letter it modifies? For example, say I want
> to render ő (Latin small letter o with a double acute accent), but with
> the o in black and the double acute accent in green. Are either of the
> following valid?
>
> 1. <span style="color: black;">o</span><span style="color: green;">&#x030B;</span>
>
> 2. <span style="color: black;">o<span style="color: green;">&#x030B;</span></span>
>
> Neither of the two browsers I tested (SeaMonkey 1.1.6 and Konqueror 3.5.8,
> both on GNU/Linux) render the examples as intended. Is there some part of
> the HTML, CSS, or Unicode standards which says that combining diacritical
> marks can't be styled independently, or are my browsers buggy?
>
> Regards,
> Tristan

In general, the mark is actually part of the character and not separate.
That is, an "ñ" is not merely an "n" with a tilde. Instead, it's quite
distinct (at least in Spanish) from an "n".

Thus, having separate colors would be inappropriate. What you want
would be the same as having an "i" with the dot a different color than
the stroke or having an "A" with the cross-bar a different color than
the two diagonals.

--
David E. Ross
<http://www.rossde.com/>

Natural foods can be harmful: Look at all the
people who die of natural causes.
Nov 12 '07 #2

Greetings.

In article <rk********************@reader1.news.saunalahti.fi >, Jukka K.
Korpela wrote:
> And if you do process combining diacritic marks by Unicode rules, then o
> with double acute is to be treated as compatibility equivalent to the
> single character U+0151 (Latin small letter o with double acute). In
> general, programs should not be expected to treat compatibility
> equivalent characters differently; they may do so, but they are surely
> not required to do so.
>
> In particular, a browser may well internally map the combination to
> U+0151 at the character level. It's nothing that complicated, though.
> They probably just ignore the styles you set for a combining diacritic
> mark.
Yes; this much is obvious, since the result is the same with other
combinations of characters and combining diacritical marks for which there
is no equivalent single character -- say, U+0040 U+030B (@̋).
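
The difference between the two cases can be checked directly. Below is a minimal sketch using Python's standard `unicodedata` module; the helper name `compose` is made up for illustration. NFC normalization should fold o + U+030B into the single character U+0151, while @ + U+030B has nothing to fold into and stays as two code points.

```python
import unicodedata

COMBINING_DOUBLE_ACUTE = "\u030B"

def compose(base: str, mark: str) -> str:
    """Return base + mark after NFC normalization (hypothetical helper)."""
    return unicodedata.normalize("NFC", base + mark)

o_form = compose("o", COMBINING_DOUBLE_ACUTE)
at_form = compose("@", COMBINING_DOUBLE_ACUTE)

# List the code points that survive normalization in each case.
print([f"U+{ord(c):04X}" for c in o_form])
print([f"U+{ord(c):04X}" for c in at_form])

# Decomposition mapping recorded for U+0151; a compatibility mapping
# would start with a tag in angle brackets such as <compat>.
print(unicodedata.decomposition("\u0151"))
```

The mapping for U+0151 carries no such tag, which suggests that in Unicode's own terminology the relationship is canonical rather than compatibility equivalence. Either way, it says nothing about how a renderer must style the two forms.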
> You're not the first one to ask for the feature. It has been discussed at
> length in the Unicode mailing list. See e.g. the discussion "Coloured
> diacritics",
> http://www.unicode.org/mail-arch/uni...-m12/0379.html
That page is password-protected. :(
> The bottom line is that no, you can't expect to be able to do such
> things.
I suspected as much, but (except in the case of compatibility equivalent
characters) you haven't provided any normative reason for this. Are you
saying that this "bottom line" is simply an implementation choice?

Regards,
Tristan

Nov 13 '07 #3

Scripsit Tristan Miller:
> http://www.unicode.org/mail-arch/uni...-m12/0379.html
>
> That page is password-protected. :(
It's pseudo-protection: the password is announced on the Unicode pages, and
it is "unicode".
> The bottom line is that no, you can't expect to be able to do such
> things.
>
> I suspected as much, but (except in the case of compatibility
> equivalent characters) you haven't provided any normative reason for
> this.
There is no normative reason, even for compatibility equivalent characters.
Programs are allowed to treat a precomposed character differently from its
decomposed form, though they should generally not be expected to do so.

Consider the display issue. A non-supporting (though conforming)
implementation can just ignore combining diacritic marks, showing a generic
glyph for an unrepresentable character. A simplistic implementation effectively
just overprints the diacritic, as taken from a font, on the base character.
A better implementation takes into account the shape of the base character
and positions, for example, a diacritic on "O" differently from the same
diacritic on "o". An even better implementation additionally checks whether
the combination exists as a precomposed character (or just as a glyph in a
font) and uses it when possible, since a glyph designed by a font designer
should be expected to be as good as or better than a combination of glyphs
generated by software. This is the intended behavior - but not required.
> Are you saying that this "bottom line" is simply an
> implementation choice?
Yes, but not necessarily just a matter of laziness. It might appear to be
simple to let diacritics be colored separately, but then the effects
would depend on the use of precomposed forms (where no such coloring would
be applied). Moreover, treating colors in an ad hoc manner would not be that
natural, and letting _any_ font formatting apply to diacritics would open a
few cans of worms. For example, font size and weight changes might have
rather odd effects and would generally ruin the work of a sophisticated
algorithm that tries to place a diacritic optimally.

If you want to play with colored diacritics, you could use a spacing
diacritic (either as a separately coded character or as a no-break space
followed by a combining diacritic), which can be colored, and use some piece
of CSS to make it overprint the preceding character. This gets tricky of
course, and nasty - you would effectively imitate the simplistic
implementation of combining diacritics (as described above). There's no sure
way of getting even the horizontal position right, since there is no CSS
unit for the width of a character.
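
For anyone who wants to experiment, the overprinting trick might look something like the markup generated below. This is a sketch under loose assumptions: the helper `overprint_html` is hypothetical, a negative left margin is only one possible way to pull the mark back, and the default shift of -0.4em is a guess that needs tuning per font, for exactly the reason given above.

```python
from html import escape

def overprint_html(base: str, mark_cp: int, color: str, shift_em: float = -0.4) -> str:
    """Build HTML that colors a diacritic separately and shifts it over the base.

    The combining mark is attached to a no-break space (U+00A0), which turns
    it into a spacing sequence that can be styled independently of the base.
    """
    mark = f"&#x00A0;&#x{mark_cp:04X};"
    # shift_em is an assumed, font-dependent horizontal correction.
    style = f"color: {color}; margin-left: {shift_em}em;"
    return (
        f'<span style="color: black;">{escape(base)}</span>'
        f'<span style="{style}">{mark}</span>'
    )

# Black "o" with a green combining double acute (U+030B)
print(overprint_html("o", 0x030B, "green"))
```

Vertical placement is left entirely to the font here, so the result is likely to look worse than a precomposed ő in most typefaces.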

--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/

Nov 13 '07 #4

On Mon, 12 Nov 2007, Tristan Miller wrote:
> Is it possible using HTML and CSS to represent a combining diacritical mark
> in a different style from the letter it modifies?
This question is not specific to HTML or CSS. In fact, you could try it
with any word processor. It is rather a question of font technology.
Even if you write, say, an ASCII i followed by a non-spacing,
combining acute, OpenType fonts and similar font formats will
use a single glyph for display. This is usually desired - think of
capital and small letters! There is only one non-spacing, combining
acute for both capital and small letters. And you don't want
the acute to mess with the dot on the i .

Only when no single glyph is available (say, b with acute)
are the two glyphs for b and acute combined. See
http://www.unics.uni-hannover.de/nht...ombimarks.html
for some examples.

The situation is different for Arabic and Indic scripts where
no precomposed glyphs for letter and vowel sign are available.
Browsers behave differently here. See
http://www.unics.uni-hannover.de/nht...arks-indic.htm
for some examples.

--
In memoriam Alan J. Flavell
http://groups.google.com/groups/sear...Alan.J.Flavell
Nov 13 '07 #5

On Tue, 13 Nov 2007, I wrote:
> http://www.unics.uni-hannover.de/nht...ombimarks.html
This address is correct.
> http://www.unics.uni-hannover.de/nht...arks-indic.htm
Sorry, this address is wrong. The correct one is
http://www.unics.uni-hannover.de/nht...rks-indic.html

--
Bugs in Internet Explorer 7
http://www.unics.uni-hannover.de/nhtcapri/ie7-bugs
Nov 14 '07 #6
