
jscript + charCodeAt, how to set js encoding?

Hi, I would like to use the charCodeAt function, but it returns the wrong
decimal numbers. The question is: how do I set the character encoding of the
.js file being executed? I cannot use any kind of tags (<script>, <meta>, etc.)
since the .js file is executed directly on the client, so there is no
HTML. Is there any solution? I need to display UTF-8 text in
RTF1, so I need to convert the text to Unicode values with charCodeAt; that
is why I need it.

Here is the function I would like to use:

function toCharRef(str){
    var charRefs = [], codePoint, i;
    for(i = 0; i < str.length; ++i){
        codePoint = str.charCodeAt(i);
        if(0xD800 <= codePoint && codePoint <= 0xDBFF){
            i++;
            codePoint = 0x2400 + ((codePoint - 0xD800) << 10) +
                str.charCodeAt(i);
        }
        charRefs.push('\\u' + codePoint);
    }
    return charRefs.join('');
};
Feb 3 '08 #1
10 Replies


VK
On Feb 3, 7:01 pm, czechboy <oldrich.s...@gmail.com> wrote:
> Hi, I would like to use the charCodeAt function, but it returns the wrong
> decimal numbers. The question is: how do I set the character encoding of the
> .js file being executed?

JavaScript operates only and exclusively with Unicode (not UTF-8, but
Unicode itself). Even on, say, an ISO-8859-1 page, the text is seen as
Unicode from within JavaScript. If the user types some text into a form
field on that page and you read it in a JavaScript function, it is no
longer ISO-8859-1 but Unicode. My guess (possibly wrong) is that you are
encoding twice: a string that is already Unicode is being encoded again
to make a Unicode string, and that is where the wrong results come from.
Feb 3 '08 #2

czechboy <ol**********@gmail.com> writes:
> Hi, I would like to use the charCodeAt function, but it returns the wrong
> decimal numbers.

Really? Please demonstrate that.

> The question is: how do I set the character encoding of the
> .js file being executed? I cannot use any kind of tags (<script>, <meta>, etc.)
> since the .js file is executed directly on the client, so there is no
> HTML. Is there any solution?

I don't understand what you're saying.

> I need to display UTF-8 text in
> RTF1, so I need to convert the text to Unicode values with charCodeAt; that
> is why I need it.

charCodeAt already returns the Unicode value of the character (strictly,
its UTF-16 code unit), and strings in JavaScript are Unicode. The
conversion to and from other encodings is presumably handled by the
scripting host (i.e. the browser).

> function toCharRef(str){
>     var charRefs = [], codePoint, i;
>     for(i = 0; i < str.length; ++i){
>         codePoint = str.charCodeAt(i);
>         if(0xD800 <= codePoint && codePoint <= 0xDBFF){
>             i++;
>             codePoint = 0x2400 + ((codePoint - 0xD800) << 10) +
>                 str.charCodeAt(i);
>         }
>         charRefs.push('\\u' + codePoint);
>     }
>     return charRefs.join('');
> };

codePoint here is a number, so it is appended in decimal; it is not a
4-character hexadecimal string.

Is there any reason you're doing this at all?

Joost.
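
A minimal sketch of the two remarks above, assuming any ECMAScript 3 or later engine: charCodeAt returns a number (a UTF-16 code unit), and appending that number to '\\u' produces decimal digits, not the four hex digits of a JavaScript-style escape. The helper name toHex4 is made up for this illustration.

```javascript
// Hypothetical helper: format a 16-bit code unit as exactly four uppercase hex digits.
function toHex4(codeUnit) {
  var hex = codeUnit.toString(16).toUpperCase();
  while (hex.length < 4) {
    hex = '0' + hex;
  }
  return hex;
}

var ch = '\u010D';                   // "č", written as an escape so the file encoding cannot matter
var code = ch.charCodeAt(0);         // a Number, here 269
var decimalForm = '\\u' + code;      // backslash, "u", then decimal digits: \u269
var hexForm = '\\u' + toHex4(code);  // backslash, "u", then four hex digits: \u010D
```

Which form is wanted depends on the target format. As far as I know, RTF's \uN control word takes a decimal value, so the decimal form is not necessarily wrong for the stated goal.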
Feb 3 '08 #3

To explain it in more detail: there is a JavaScript SDK plug-in for
FARR ( http://www.donationcoder.com/Forums/...?topic=11804.0
). It uses the Microsoft scripting host to interpret JavaScript. I would
like to display Unicode results (Russian, Greek, etc.) as RTF1, which
means that I have to convert ěščřž etc. to their decimal representation
with charCodeAt. When I call charCodeAt for the letter "č", the
SDK displays 356, but it should be 269. Do you think there might be an
error in the JavaScript SDK plug-in?

As for the function: it is something I found on the internet.
I am a JavaScript newbie ;)
Feb 3 '08 #4

VK
>> JavaScript operates only and exclusively with Unicode (not UTF-8, but
>> Unicode itself).
>
> That is incoherent gibberish, and qualifies as nonsense.

Oh, come on, these are really the basics. Don't make yourself look
foolish.

Feb 3 '08 #5

VK
On Feb 3, 8:16 pm, Thomas 'PointedEars' Lahn <PointedE...@web.de>
wrote:
> VK wrote:
>>>> JavaScript operates only and exclusively with Unicode (not UTF-8, but
>>>> Unicode itself).
>>> That is incoherent gibberish, and qualifies as nonsense.
>> Oh, come on, these are really the basics.
>
> *These* are not, because what you said is nonsense at best.

What exactly did you not understand in my explanations?
Feb 3 '08 #6

VK wrote:
> On Feb 3, 8:16 pm, Thomas 'PointedEars' Lahn <PointedE...@web.de>
> wrote:
>> VK wrote:
>>>>> JavaScript operates only and exclusively with Unicode (not UTF-8, but
>>>>> Unicode itself).
>>>> That is incoherent gibberish, and qualifies as nonsense.
>>> Oh, come on, these are really the basics.
>> *These* are not, because what you said is nonsense at best.
>
> What exactly did you not understand in my explanations?

There is nothing to be understood where there is no meaning.
Someone should force you to read the nonsense that you post.

PointedEars
--
var bugRiddenCrashPronePieceOfJunk = (
navigator.userAgent.indexOf('MSIE 5') != -1
&& navigator.userAgent.indexOf('Mac') != -1
) // Plone, register_function.js:16
Feb 3 '08 #7

czechboy <ol**********@gmail.com> writes:
> To explain it in more detail: there is a JavaScript SDK plug-in for
> FARR ( http://www.donationcoder.com/Forums/...?topic=11804.0
> ). It uses the Microsoft scripting host to interpret JavaScript. I would
> like to display Unicode results (Russian, Greek, etc.) as RTF1, which
> means that I have to convert ěščřž etc. to their decimal representation
> with charCodeAt. When I call charCodeAt for the letter "č", the
> SDK displays 356, but it should be 269.

I would expect MS JScript to do something as basic as charCodeAt
correctly. My implementation (Firefox) correctly gives 268 (0x10C) for
"Č". In any case it's more likely that the text is wrongly converted
somewhere before it reaches the script (i.e. decoded from an encoding
that it's not in fact in).

> Do you think there might be an
> error in the JavaScript SDK plug-in?

Could be. From your URL it appears that the host isn't Unicode
aware.

> As for the function: it is something I found on the internet.
> I am a JavaScript newbie ;)

Don't use it. It's incorrect.

Joost.
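
One way the 356 could arise, offered as a guess and not a diagnosis: if the script file is saved as UTF-8 but the host decodes it as Windows-1250, the two UTF-8 bytes of "č" (0xC4 0x8D) turn into two characters, and the second one is U+0164, which is 356. The sketch below reproduces that arithmetic. The function names and the two-entry lookup table are made up for the illustration, and the Windows-1250 assumption is not something stated in the thread.

```javascript
// Encode one code point below U+0800 as UTF-8 bytes (enough for this illustration).
function utf8Bytes(codePoint) {
  if (codePoint < 0x80) {
    return [codePoint];
  }
  return [0xC0 | (codePoint >> 6), 0x80 | (codePoint & 0x3F)];
}

// Deliberately tiny excerpt of the Windows-1250 table: only the two bytes needed here.
var CP1250_EXCERPT = { 0xC4: 0x00C4, 0x8D: 0x0164 };

// Decode bytes as if they were Windows-1250 (the assumed host code page).
function misreadAsCp1250(bytes) {
  var codes = [];
  for (var i = 0; i < bytes.length; i++) {
    codes.push(CP1250_EXCERPT[bytes[i]]);
  }
  return codes;
}

var bytes = utf8Bytes(0x010D);      // [0xC4, 0x8D] for "č" (269)
var seen = misreadAsCp1250(bytes);  // [196, 356]: "Ä" followed by "Ť"
```

If that is what happens, saving the file in the encoding the host expects, or writing non-ASCII characters as \uXXXX escapes in the source, would probably avoid the problem.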
Feb 3 '08 #8

On Feb 3, 19:15, Joost Diepenmaat <jo...@zeekat.nl> wrote:
> czechboy <oldrich.s...@gmail.com> writes:
>> To explain it in more detail: there is a JavaScript SDK plug-in for
>> FARR (http://www.donationcoder.com/Forums/...?topic=11804.0
>> ). It uses the Microsoft scripting host to interpret JavaScript. I would
>> like to display Unicode results (Russian, Greek, etc.) as RTF1, which
>> means that I have to convert ěščřž etc. to their decimal representation
>> with charCodeAt. When I call charCodeAt for the letter "č", the
>> SDK displays 356, but it should be 269.
>
> I would expect MS JScript to do something as basic as charCodeAt
> correctly. My implementation (Firefox) correctly gives 268 (0x10C) for
> "Č". In any case it's more likely that the text is wrongly converted
> somewhere before it reaches the script (i.e. decoded from an encoding
> that it's not in fact in).
>
>> Do you think there might be an
>> error in the JavaScript SDK plug-in?
>
> Could be. From your URL it appears that the host isn't Unicode
> aware.
>
>> As for the function: it is something I found on the internet.
>> I am a JavaScript newbie ;)
>
> Don't use it. It's incorrect.
>
> Joost.

Thank you for your help. Now it is working. Could you please post the
correct function? I am not skilled enough to write one myself ;) Thank
you.
Feb 4 '08 #9

czechboy <ol**********@gmail.com> writes:
> Thank you for your help. Now it is working. Could you please post the
> correct function? I am not skilled enough to write one myself ;) Thank
> you.

I think Thomas already posted a correction to the function in this
thread. Look it up.

Joost.
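
The correction mentioned above does not appear in the text of this thread. What follows is a separate sketch and not that correction: one possible way to escape a JavaScript string for RTF. It rests on my understanding of the RTF specification, namely that \uN takes a signed 16-bit decimal value followed by one fallback character, and that characters outside the Basic Multilingual Plane are written as two \uN words, one per surrogate. The name toRtfUnicode is hypothetical.

```javascript
// Hypothetical helper, not from the thread: escape a string for the body of an RTF document.
function toRtfUnicode(str) {
  var out = [];
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);  // UTF-16 code unit, 0..65535
    var ch = str.charAt(i);
    if (ch === '\\' || ch === '{' || ch === '}') {
      out.push('\\' + ch);         // RTF's own special characters need a backslash
    } else if (unit < 0x80) {
      out.push(ch);                // plain ASCII passes through unchanged
    } else {
      // Assumption: \uN is a signed 16-bit decimal, so values above 32767 wrap negative.
      var n = unit > 0x7FFF ? unit - 0x10000 : unit;
      out.push('\\u' + n + '?');   // "?" is the fallback shown by readers without Unicode support
    }
  }
  return out.join('');
}

// "č", a braced "a", and one character beyond U+FFFF (given as its surrogate pair).
var sample = toRtfUnicode('\u010D{a}\uD83D\uDE00');
// sample is: \u269?\{a\}\u-10179?\u-8704?
```

Line breaks and other control characters are passed through untouched here. A real document would likely need them mapped to RTF control words such as \par.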
Feb 4 '08 #10

On Feb 4, 11:16, Joost Diepenmaat <jo...@zeekat.nl> wrote:
> czechboy <oldrich.s...@gmail.com> writes:
>> Thank you for your help. Now it is working. Could you please post the
>> correct function? I am not skilled enough to write one myself ;) Thank
>> you.
>
> I think Thomas already posted a correction to the function in this
> thread. Look it up.
>
> Joost.

Thanks. Meanwhile I have found another function which seems to work
fine. Is it the correct function?

(function(){

var unicode = {

/**
*
*
*/
'dec2hex' : function(ts)
{
return (ts+0).toString(16).toUpperCase();
},
/**
*
*
*/
'dec2hex2' : function(ts)
{
var hexequiv = new Array ("0", "1", "2", "3", "4", "5", "6", "7",
"8", "9", "A", "B", "C", "D", "E", "F");
return hexequiv[(ts >> 4) & 0xF] + hexequiv[ts & 0xF];
},
/**
*
*
*/
'dec2hex4' : function(ts)
{
var hexequiv = new Array ("0", "1", "2", "3", "4", "5", "6", "7",
"8", "9", "A", "B", "C", "D", "E", "F");
return hexequiv[(ts >> 12) & 0xF] + hexequiv[(ts >> 8) & 0xF] +
hexequiv[(ts >> 4) & 0xF] + hexequiv[ts & 0xF];
},
/**
*
*
*/
'convertCP2Char' : function(ts)
{
var outputString = '';
ts = ts.replace(/^\s+/, '');
if(ts.length == 0)
return "";
ts = ts.replace(/\s+/g, ' ');
var listArray = ts.split(' ');
for(var i = 0; i < listArray.length; i++)
{
var n = parseInt(listArray[i], 16);
if(n <= 0xFFFF)
outputString += String.fromCharCode(n);
else if (n <= 0x10FFFF)
{
n -= 0x10000;
outputString += String.fromCharCode(0xD800 | (n >> 10)) +
String.fromCharCode(0xDC00 | (n & 0x3FF));
}
else
outputString += '!erreur ' + unicode.dec2hex(n) +'!';
}
return( outputString );
},
/**
*
*
*/
'convertCP2DecNCR' : function(ts)
{
var outputString = "";
ts = ts.replace(/^\s+/, '');
if(ts.length == 0)
return "";
ts = ts.replace(/\s+/g, ' ');
var listArray = ts.split(' ');
for(var i = 0; i < listArray.length; i++)
{
var n = parseInt(listArray[i], 16);
outputString += ('{\\u' + n + '}');
}
return(outputString);
},
/**
*
*
*/
'convertChar2CP' : function(ts)
{
var outputString = "", haut = 0, n = 0;
for(var i = 0; i < ts.length; i++)
{
var b = ts.charCodeAt(i);
if(b < 0 || b > 0xFFFF)
outputString += '!erreur ' + unicode.dec2hex(b) + '!';

if(haut != 0)
{
if(0xDC00 <= b && b <= 0xDFFF)
{
outputString += unicode.dec2hex(0x10000 + ((haut - 0xD800) <<
10) + (b - 0xDC00)) + ' ';
haut = 0;
continue;
}
else
{
outputString += '!erreur ' + unicode.dec2hex(haut) + '!';
haut = 0;
}
}

if(0xD800 <= b && b <= 0xDBFF)
haut = b;
else
outputString += unicode.dec2hex(b) + ' ';
}
return( outputString.replace(/ $/, '') );
},
/**
*
*
*/
'convertDecNCR2CP' : function(ts)
{
var outputString = '';
ts = ts.replace(/\s/g, '');
var listArray = ts.split(';');
for (var i = 0; i < listArray.length-1; i++)
{
if(i > 0)
outputString += ' ';
var n = parseInt(listArray[i].substring(2, listArray[i].length),
10);
outputString += unicode.dec2hex(n);
}
return( outputString );
}

};
/**
* Convert Character to Decimal.
*
* @example "JavaScript".char2dec();
* @result "{\u74}{\u97}{\u118}{\u97}{\u83}{\u99}{\u114}{\u105}{\u112}{\u116}"
*
* @name char2dec
* @return String
*/
if(!String.prototype.char2dec)
String.prototype.char2dec = function()
{
return unicode.convertCP2DecNCR(unicode.convertChar2CP(this));
};
/**
* Convert Decimal to Character.
*
* @example "&#74;&#97;&#118;&#97;&#83;&#99;&#114;&#105;&#112;&#116;".dec2char();
* @result "JavaScript"
*
* @name dec2char
* @return String
*/
if(!String.prototype.dec2char)
String.prototype.dec2char = function()
{
return unicode.convertCP2Char(unicode.convertDecNCR2CP(this));
};

})();
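
The surrogate-pair arithmetic that convertChar2CP and convertCP2Char rely on can be checked in isolation. This is a minimal sketch with made-up function names, not part of the posted library:

```javascript
// Combine a high and a low surrogate into one code point.
function fromSurrogates(high, low) {
  return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
}

// Split a code point above U+FFFF into its surrogate pair.
function toSurrogates(codePoint) {
  var n = codePoint - 0x10000;
  return [0xD800 | (n >> 10), 0xDC00 | (n & 0x3FF)];
}

var cp = fromSurrogates(0xD83D, 0xDE00);  // 0x1F600
var pair = toSurrogates(cp);              // [0xD83D, 0xDE00]

// The first function in this thread uses 0x2400 + ((high - 0xD800) << 10) + low.
// Since 0x2400 equals 0x10000 - 0xDC00, that formula yields the same number.
var viaFirstFormula = 0x2400 + ((0xD83D - 0xD800) << 10) + 0xDE00;
```

For such a character the posted char2dec would emit {\u128512}. As far as I know, that is outside the signed 16-bit range RTF's \uN is specified to take, so characters beyond U+FFFF may need separate handling when the target is RTF.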
Feb 4 '08 #11
