Counting UTF-8 characters - special characters

majna

I have a character counter for a textarea which counts the characters.
A special character takes the same space as two normal characters because of
16-bit encoding.
The counter decrements by 2 when a special character, such as a
language-specific one, is added.

How can I count special characters as 1 char?
tnx

Sep 19 '07 #1


Thomas 'PointedEars' Lahn

Thomas 'PointedEars' Lahn wrote:

majna wrote:
>I have a character counter for a textarea which counts the characters.
>A special character takes the same space as two normal characters because of
>16-bit encoding.

It doesn't.

>The counter decrements by 2 when a special character, such as a
>language-specific one, is added.

Should have been -1. But even if most implementations were not
UTF-16 safe, that would not have sufficed. UTF-16 does not mean that
the representation of a glyph in that encoding always requires only
16 bits:

http://www.unicode.org/faq/utf_bom.html#6

"€".length === 1

Windows(-1252). Hmpf. Make that "€" any Unicode glyph (such as "â‚*")
and it is still true.
PointedEars
--
var bugRiddenCrashPronePieceOfJunk = (
navigator.userAgent.indexOf('MSIE 5') != -1
&& navigator.userAgent.indexOf('Mac') != -1
) // Plone, register_function.js:16

Sep 19 '07 #2
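
The point about UTF-16 not always needing only 16 bits can be checked with a short sketch. The helper names below (countCodeUnits, countCodePoints) and the G clef test character are illustrative assumptions, not something from the posts above. The sketch assumes a conforming implementation where length reports UTF-16 code units.

```javascript
// What String.prototype.length reports: UTF-16 code units.
function countCodeUnits(s) {
  return s.length;
}

// Hypothetical helper: count code points by treating each
// high+low surrogate pair as a single character.
function countCodePoints(s) {
  var count = 0;
  for (var i = 0; i < s.length; i++) {
    var unit = s.charCodeAt(i);
    // A high surrogate followed by a low surrogate is one code point.
    if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < s.length) {
      var next = s.charCodeAt(i + 1);
      if (next >= 0xDC00 && next <= 0xDFFF) {
        i++; // skip the low surrogate
      }
    }
    count++;
  }
  return count;
}

var euro = "\u20AC";        // U+20AC EURO SIGN, inside the BMP
var gClef = "\uD834\uDD1E"; // U+1D11E MUSICAL SYMBOL G CLEF, outside the BMP

console.log(countCodeUnits(euro), countCodePoints(euro));   // 1 1
console.log(countCodeUnits(gClef), countCodePoints(gClef)); // 2 1
```

So the euro sign counts as 1 either way, while a character outside the Basic Multilingual Plane needs two 16-bit units and only the second function counts it as 1.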

Johannes Baagoe

Thomas 'PointedEars' Lahn :

"€".length === 1

Should be, since '€' (U+20AC) is represented as a single UTF-16 code
unit, but it is not, e.g., in SpiderMonkey, which obviously uses UTF-8:

js> e = "€"
€
js> e.length
3
js> for (i = 0; i < e.length; i++) {print(e.charCodeAt(i).toString(16))}
e2
82
ac

But then, OP mentions UTF-8 in the subject line.
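
The three values printed in that session (e2, 82, ac) are the UTF-8 encoding of U+20AC. A minimal sketch of how to see the same bytes in an implementation whose strings are UTF-16, assuming the TextEncoder API is available (it did not exist when this thread was written); utf8Hex is a hypothetical helper name:

```javascript
// Hypothetical helper: return the UTF-8 bytes of a string as hex strings.
// Assumes a runtime that provides the TextEncoder global.
function utf8Hex(s) {
  var bytes = new TextEncoder().encode(s); // Uint8Array of UTF-8 octets
  return Array.from(bytes, function (b) {
    return b.toString(16);
  });
}

var e = "\u20AC"; // EURO SIGN

console.log(e.length);          // 1 (one UTF-16 code unit)
console.log(utf8Hex(e));        // [ 'e2', '82', 'ac' ]
console.log(utf8Hex(e).length); // 3 (three UTF-8 octets)
```

A length of 3 for this string therefore suggests the shell stored the source bytes as-is, one octet per string element, rather than decoding them to UTF-16.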

>How to count specials like 1 char?

The same way. ECMAScript 3 implementations use UTF-16 encoded strings.
RTFM.

Hmmm. Is there *any* implementation that actually respects the requirement
of UTF-16?

Besides, even assuming UTF-16, some "language specific" characters (whatever
that means...) take up more than one code unit. Some characters may even
use one or more code points according to whether one uses decomposition
or not, e.g., 'é' is either U+00E9 or U+0065 U+0301.

Short of testing each successive octet (if the implementation uses UTF-8)
or code unit (if the implementation is correct according to the specs)
to see what kind of character it is, I have so far been unable to answer
the OP's question.

--
Johannes
"Quand on dit c'est un Johannes, cela vaut autant que ce que maintenant
on appelle un pédant" (H. Estienne, in É. Littré, /Dictionnaire de la
langue française/, art. PÉDANT)

Sep 19 '07 #3
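
One possible approach to the OP's question, sketched under assumptions: normalize the string so that decomposed sequences such as U+0065 U+0301 collapse where a precomposed form exists, iterate by code point, and skip any remaining combining marks. This relies on String.prototype.normalize, for...of iteration, and Unicode property escapes in regular expressions, none of which existed in ECMAScript 3. countCharacters is a hypothetical name, and the result only approximates user-perceived characters (it does not implement full grapheme cluster segmentation).

```javascript
// Hypothetical helper: approximate the number of user-perceived characters.
function countCharacters(s) {
  var count = 0;
  // NFC composes base + combining mark where a precomposed form exists.
  // for...of walks the string by code point, so a surrogate pair is one step.
  for (var ch of s.normalize("NFC")) {
    // Skip combining marks (Unicode general category M) left after NFC.
    if (!/\p{M}/u.test(ch)) {
      count++;
    }
  }
  return count;
}

console.log(countCharacters("\u00E9"));       // 1 (precomposed é)
console.log(countCharacters("e\u0301"));      // 1 (e + combining acute)
console.log(countCharacters("\uD834\uDD1E")); // 1 (surrogate pair)
console.log(countCharacters("a\u20ACb"));     // 3
```

For a textarea counter, the function would be called on the element's value instead of reading its length directly.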

Johannes Baagoe

Thomas 'PointedEars' Lahn :

[My version of SpiderMonkey uses UTF-8]

Probably due to your SpiderMonkey build. It works just fine since
Mozilla/4.0.

It does indeed in my version of Firefox. Serves me right for sticking with
obsolete command-line tools :-) It would appear that if I want a good
stand-alone ECMAScript interpreter, I have to compile it myself.

http://www.unicode.org/faq/char_combmark.html#2

Excellent, thanks a lot.

--
Johannes
"Quand on dit c'est un Johannes, cela vaut autant que ce que maintenant
on appelle un pédant" (H. Estienne, in É. Littré, /Dictionnaire de la
langue française/, art. PÉDANT)

Sep 19 '07 #4
