
Converting Case - Umlauts?


Question posted by: jose.jeria@gmail.com (Guest) on October 26th, 2005 10:25 AM
I use the following to convert uppercase to lowercase:

translate($queryString, 'ABCDE...', 'abcde...')

But how can I convert the case for umlauts (ö, å, ä, etc.)?

10 Answers Posted
Martin Honnen
#2: Re: Converting Case - Umlauts?



jose.jeria@gmail.com wrote:

> I use the following to convert uppercase to lowercase:
>
> translate($queryString, 'ABCDE...', 'abcde...')
>
> But how can I convert the case for umlauts (ö, å, ä, etc.)?

Pretty much the same: each character in the second argument to translate
is replaced by the character at the same index in the third argument, so
you simply need to make sure you have all the upper-case characters you
care about in the second argument and the corresponding lower-case
characters, in the same order, in the third argument, e.g. as global variables:

<xsl:variable
name="iso88591UpperCaseLetters"
select="ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝ" />
<xsl:variable
name="iso88591LowerCaseLetters"
select="abcdefghijklmnopqrstuvwxyzàáâãäåæçèéêëìíîïðñòóôõö×øùúûüý" />

then use e.g.

translate($queryString, $iso88591UpperCaseLetters,
$iso88591LowerCaseLetters)

--

Martin Honnen
http://JavaScript.FAQTs.com/
jose.jeria@gmail.com October 26th, 2005 01:55 PM
#3: Re: Converting Case - Umlauts?

This doesn't work; I am using UTF-8.

http://www.jeria.net/XSLT/

Type in "ägy" and press submit, and you will get a "Sablotron error on line
11: XML parser error 4: not well-formed (invalid token)" error.

The XML and XSLT files can be found here:
http://www.jeria.net/XSLT/xml/

jose.jeria@gmail.com October 26th, 2005 01:55 PM
#4: Re: Converting Case - Umlauts?

Oh, sorry, it works now; I changed the encoding to ISO-8859-1.

Thanks

Martin Honnen
#5: Re: Converting Case - Umlauts?



Martin Honnen wrote:

> <xsl:variable
> name="iso88591UpperCaseLetters"
> select="ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝ" />
> <xsl:variable
> name="iso88591LowerCaseLetters"
> select="abcdefghijklmnopqrstuvwxyzàáâãäåæçèéêëìíîïðñòóôõö×øùúûüý" />

Should be

<xsl:variable
name="iso88591UpperCaseLetters"
select="'ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝ'" />
<xsl:variable
name="iso88591LowerCaseLetters"
select="'abcdefghijklmnopqrstuvwxyzàáâãäåæçèéêëìíîïðñòóôõö×øùúûüý'" />

of course.
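
For illustration, the corrected variables could be dropped into a complete XSLT 1.0 stylesheet roughly as follows. This is only a minimal sketch: the default value of the `queryString` parameter is made up, the variable names are shortened, and the letter lists leave out the × sign so that they contain real case pairs only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The encoding declared above must match how this file is actually saved. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical input; in the thread it comes from the submitted form. -->
  <xsl:param name="queryString" select="'ÄGY Über'"/>

  <!-- Note the inner single quotes: without them the select value is
       parsed as an XPath expression (an element name), not as a string. -->
  <xsl:variable name="upperCaseLetters"
      select="'ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝ'"/>
  <xsl:variable name="lowerCaseLetters"
      select="'abcdefghijklmnopqrstuvwxyzàáâãäåæçèéêëìíîïðñòóôõöøùúûüý'"/>

  <xsl:template match="/">
    <!-- Each character found in the second argument is replaced by the
         character at the same position in the third argument. -->
    <xsl:value-of
        select="translate($queryString, $upperCaseLetters, $lowerCaseLetters)"/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to any well-formed input document, this should output "ägy über"; characters that do not occur in the second argument (here the space) pass through unchanged.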

--

Martin Honnen
http://JavaScript.FAQTs.com/
Andreas Prilop
#6: Re: Converting Case - Umlauts?

On Wed, 26 Oct 2005, Martin Honnen wrote:
> name="iso88591UpperCaseLetters"
> select="'ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝ'" />
                                                            ^
> name="iso88591LowerCaseLetters"
> select="'abcdefghijklmnopqrstuvwxyzàáâãäåæçèéêëìíîïðñòóôõö×øùúûüý'" />
                                                            ^

The multiplication sign (×) isn't exactly a letter.
However, "sharp s" and "y with diaeresis" are.

--
Netscape 3.04 does everything I need, and it's utterly reliable.
Why should I switch? Peter T. Daniels in <news:sci.lang>

Alan J. Flavell October 26th, 2005 06:15 PM
#7: Re: Converting Case - Umlauts?

On Wed, 26 Oct 2005, Andreas Prilop wrote:
> On Wed, 26 Oct 2005, Martin Honnen wrote:
>
> > name="iso88591UpperCaseLetters"
> > select="'ABCDEFGHIJKLMNOPQRSTUVWXYZÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝ'" />
>                                                             ^
> > name="iso88591LowerCaseLetters"
> > select="'abcdefghijklmnopqrstuvwxyzàáâãäåæçèéêëìíîïðñòóôõö×øùúûüý'" />
>
> The multiplication sign (×) isn't exactly a letter.

Granted...

> However, "sharp s" and "y with diaeresis" are.

What are you going to do with them then, in an ISO-8859-1 context? ;-)
Andreas Prilop
#8: Re: Converting Case - Umlauts?

On Wed, 26 Oct 2005, Alan J. Flavell wrote:
>> However, "sharp s" and "y with diaeresis" are.
>
> What are you going to do with them then, in an ISO-8859-1 context? ;-)

When converting from lower-case to upper-case, "ß" becomes "SS".
"ÿ" might become "Y" without accents in ISO-8859-1.

But this leads me to a more interesting ... err ... case:

In Greek, there are no accents when a word is written in capitals.
For example (I use romanization here):
"Ellás" has an accent on "alpha", whereas
"ELLAS" has no accent on "Alpha".
Therefore "Alpha" might be considered as an upper-case form
of "alpha with tonos".

Even the proper name "Álan" converts to "ALAN" in caps.
Therefore "Alpha" might be considered as an upper-case form
of "Alpha with tonos". Strange? Yes.

I cannot find anything about this in
http://www.unicode.org/Public/UNIDATA/CaseFolding.txt
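
As far as the translate approach from earlier in the thread is concerned, a many-to-one mapping like this is not a problem, because the third argument may repeat a character. The sketch below is an illustration only: the parameter name `word` is made up, the letter lists cover just the letters of this one word, and "Ελλάς" is the Greek spelling of the romanized "Ellás" above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical input word, with a tonos on the alpha. -->
  <xsl:param name="word" select="'Ελλάς'"/>

  <xsl:template match="/">
    <!-- Both plain alpha and alpha with tonos map to capital Alpha,
         and the final sigma maps to capital Sigma. -->
    <xsl:value-of select="translate($word, 'αάλς', 'ΑΑΛΣ')"/>
  </xsl:template>

</xsl:stylesheet>
```

This should output "ΕΛΛΑΣ". The mapping only works in one direction, of course: going back from capitals, translate has no way of knowing where the accent belongs.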

--
Netscape 3.04 does everything I need, and it's utterly reliable.
Why should I switch? Peter T. Daniels in <news:sci.lang>

jose.jeria@gmail.com November 1st, 2005 12:35 PM
#9: Re: Converting Case - Umlauts?

Would it be possible to solve this issue using UTF-8? When using UTF-8,
these characters appear as question marks.

Martin Honnen
#10: Re: Converting Case - Umlauts?



Andreas Prilop wrote:
> On Wed, 26 Oct 2005, Martin Honnen wrote:
>
>> name="iso88591LowerCaseLetters"
>> select="'abcdefghijklmnopqrstuvwxyzàáâãäåæçèéêëìíîïðñòóôõö×øùúûüý'" />
>                                                            ^
>
> The multiplication sign (×) isn't exactly a letter.

Right, I was simply too lazy to copy anything by hand from a list of
defined letters and generated those strings programmatically from
character codes. For use with the XPath translate function it does
not matter semantically: as long as the second and third arguments
have the same length and the × sign is at the same position in both,
it is mapped to itself, so no conversion/translation happens.

> However, "sharp s" and "y with diaeresis" are.

But with XPath 1.0 translate it is only possible to translate one
character into another, not one into a sequence of others, so for the ß to
SS translation the suggested approach with translate is not going to work.
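
One possible workaround in XSLT 1.0, sketched here under assumptions, is to expand the one-to-many cases first with a recursive named template and then use translate for the one-to-one pairs. The template name `replace-all`, its parameter names, and the sample input are all made up for this illustration, and the letter list is shortened.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical input string used only for this illustration. -->
  <xsl:param name="queryString" select="'straße'"/>

  <!-- Hypothetical helper: replaces every occurrence of $from in $text
       with $to. $from must not be the empty string. -->
  <xsl:template name="replace-all">
    <xsl:param name="text"/>
    <xsl:param name="from"/>
    <xsl:param name="to"/>
    <xsl:choose>
      <xsl:when test="contains($text, $from)">
        <xsl:value-of select="substring-before($text, $from)"/>
        <xsl:value-of select="$to"/>
        <xsl:call-template name="replace-all">
          <xsl:with-param name="text" select="substring-after($text, $from)"/>
          <xsl:with-param name="from" select="$from"/>
          <xsl:with-param name="to" select="$to"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="/">
    <!-- Step 1: expand the one-to-many case (sharp s to SS). -->
    <xsl:variable name="expanded">
      <xsl:call-template name="replace-all">
        <xsl:with-param name="text" select="$queryString"/>
        <xsl:with-param name="from" select="'ß'"/>
        <xsl:with-param name="to" select="'SS'"/>
      </xsl:call-template>
    </xsl:variable>
    <!-- Step 2: one-to-one pairs via translate (letter list shortened). -->
    <xsl:value-of
        select="translate($expanded,
                          'abcdefghijklmnopqrstuvwxyzäöü',
                          'ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÜ')"/>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input this should output "STRASSE". The reverse direction is not covered: nothing here can tell which "SS" in upper-case text was originally a ß.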

I guess I just need to be more careful to name my variables and not have
them reference a standard when the variable use is not quite up to the
standard :).

--

Martin Honnen
http://JavaScript.FAQTs.com/
Shmuel (Seymour J.) Metz November 1st, 2005 04:05 PM
#11: Re: Converting Case - Umlauts?

In <1130847841.687354.212360@g49g2000cwa.googlegroups.com>, on
11/01/2005
at 04:24 AM, jose.jeria@gmail.com said:

>Would it be possible to solve this issue using UTF-8? When using UTF-8,
>these characters appear as question marks.

Are you sure that you are using the correct octets for UTF-8? If each
character only takes one octet then you're probably storing the data
as ISO-8859-1 or -15, e.g.,

ASCII  Char  ISO-8859-1 octet (hex)
a"     ä     E4
e"     ë     EB
i"     ï     EF
o"     ö     F6
u"     ü     FC
A"     Ä     C4
E"     Ë     CB
I"     Ï     CF
O"     Ö     D6
U"     Ü     DC
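
In UTF-8 the same ä takes two octets (C3 A4) rather than the single octet E4, so a file saved in one encoding but declared as the other will not read correctly. One way to take the stylesheet's own encoding out of the picture, sketched here as a suggestion rather than a diagnosis of the actual files, is to write the non-ASCII letters as numeric character references, which name Unicode code points regardless of how the file is stored. The parameter default and the shortened letter lists are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Only ASCII octets appear in this file, so it reads the same whether a
     parser treats it as UTF-8 or as ISO-8859-1. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The serializer produces the octets of the result in this encoding. -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical input: A-umlaut (U+00C4) followed by GY. -->
  <xsl:param name="queryString" select="'&#xC4;GY'"/>

  <!-- A-umlaut, A-ring, O-umlaut, U-umlaut written as code points. -->
  <xsl:variable name="upperCaseLetters"
      select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ&#xC4;&#xC5;&#xD6;&#xDC;'"/>
  <xsl:variable name="lowerCaseLetters"
      select="'abcdefghijklmnopqrstuvwxyz&#xE4;&#xE5;&#xF6;&#xFC;'"/>

  <xsl:template match="/">
    <xsl:value-of
        select="translate($queryString, $upperCaseLetters, $lowerCaseLetters)"/>
  </xsl:template>

</xsl:stylesheet>
```

This should output "ägy", serialized as UTF-8. It only fixes the stylesheet side: the source XML and any parameter passed in from outside still have to arrive in the encoding the processor expects, and the page displaying the result has to announce the same encoding as the output.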

--
Shmuel (Seymour J.) Metz, SysProg and JOAT
<http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action. I reserve the right
to publicly post or ridicule any abusive E-mail. Reply to domain
Patriot dot net user shmuel+news to contact me. Do not reply to
Join Bytes!

 