
Generic innerHTML functionality and other minor questions...

P: n/a
Welcome.
I have read about the generic DynWrite in the jibbering FAQ, but I
also realized that it uses only the innerHTML feature of HTML objects.
(1) Is there a DOM function which is very similar to the innerHTML
property, e.g. (my guess) setInnerNodeAsText or something like that?

I want to write a function which will dynamically update some of my
select boxes. My questions are:
(2.1) Can I use the innerHTML property of SELECT (or even incorporate
the DynWrite function) to update some select boxes on some event? (I ask
because I saw the approach which uses the Option object, and assumed it
is not as portable as the innerHTML solution.)

(2.2) Should I use the XMLHttpRequest object (and its ActiveX equivalent
in IE, in the same way as it is used on the beta Google search site), or
must I consider using something like jsrs:
http://www.ashleyit.com/rs/main.htm (but I saw it uses browser sniffing
by testing for objects like document.all or document.layers, which in my
opinion should not be done; it is best to use object detection)?

(3) Does anyone know a project which beautifies obfuscated JavaScript
code (like as.js from the beta Google site mentioned above, which uses a
browser-independent autocomplete solution for text boxes)?

(4) I have heard about 'this' keyword troubles, so generally my
question is: do I have to use e.g.
function MyConstructor() {
  var myRef = this;
  this.myMethod = function() {
    /* use myRef instead of keyword 'this' */
  };
}

MyConstructor.myStaticPublicMethod = function() { /* method tied to the
                                   MyConstructor function object */
  var myRef = this;
  /* use myRef instead of keyword 'this' */
};

I think that's all my questions. I would appreciate an answer (I tried
to make this post as legible as I could).
BR
Luke M.

Nov 23 '05 #1
38 Replies


P: n/a
> (1) Is there a DOM function which is very similar to innerHTML property

No, you have to create each element yourself.

> (2.1) Can i use innerHTML property of SELECT (or even incorporate
> DynWrite function) and in such form update some selects boxes on some
> event ? (i ask because i saw the way which uses Option object, and
> assumed is it not as portable as innerHTML solution).

The Option object and innerHTML are both portable and implemented by
almost every browser. The Option object predates innerHTML by a year or
two, but both are available in every JS browser implementation that I
have heard of.

> (2.2) Should i use XmlHttpRequest object (and it equivalent in IE from
> ActiveX - in the same way as it is used in Beta Google search site) or
> i must consider using something like jsrs:
> http://www.ashleyit.com/rs/main.htm (but i saw it uses browser sniffing
> in a way of testing objects like document.all or document.layers which
> in my opinion should not be this way - the best is to use object
> detection system).

I would use XMLHttpRequest, simply because it is pretty much the same
everywhere (once you have created the object).

> (3) Does anyone know a project which beautifies harshed JavaScript code
> (like as.js from mentioned above beta google site which uses browser
> independent autocomplete solution for text boxes).

Run it through indent(1), then run a few :%s/gvar/realVarName/g filters
on it in Vim.

> (4) I have heard about 'this' keyword troubles, so generally my
> question is - do i have to use eg
> funtion MyConstructor() {
> var myRef = this;
> this.myMethod = function() {
> /* use myRef instead of keyword 'this' */
> }
> }
>
> MyConstructor.myStaticPublicMethod() { /* method tied up to
> MyConstructor function object */
> var myRef = this;
> /* use myRef instead of keyword 'this' */
> }

Neither... "this" is the instance of the function that calls it. For
example, if your constructor was called thus:
newObject = new MyConstructor();
then newObject.this == newObject. "this" is most useful for passing
objects as arguments; for example, with a form validator you may use
<form onsubmit="validate(this)">, and then your validation function
could read
function validate(f) {
  var firstFormElement = f.elements[0];
  ...
}

> I think thats all my questions, would be appreciate for answer (i tried
> to make this post as legible as i could).
> BR
> Luke M.


Nov 23 '05 #2

P: n/a
On 22/11/2005 08:55, Luke Matuszewski wrote:

[snip]
> (1) Is there a DOM function which is very similar to innerHTML
> property eg. (my guess) setInnerNodeAsText or sth... ?

There is no W3C DOM method that takes a string containing HTML and has
it parsed and inserted into the document tree. However, text is
represented in the tree as text nodes.
For example,

<p id="myP">Some <em>emphasised</em> text.</p>

would create the following structure:

p (element)
+- 'Some ' (#text)
+- em (element)
| +- 'emphasised' (#text)
+- ' text.' (#text)

Text nodes have a data property and this can be modified:

var p = document.getElementById('myP'),
text = p.firstChild;

text.data = 'Some different ';

resulting in the equivalent of:

<p id="myP">Some different <em>emphasised</em> text.</p>

It's also possible to create additional text nodes, using the
document.createTextNode method.
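
A minimal sketch of the two techniques just described: modifying the data of an existing text node, and falling back to document.createTextNode when the element has no leading text node. The helper name setLeadingText and its doc parameter are made up for illustration; they are not part of any DOM API.

```javascript
// Hypothetical helper: replaces the text at the start of an element.
// `doc` is the owner document (pass `document` in a browser).
function setLeadingText(element, text, doc) {
  var first = element.firstChild;
  if (first && first.nodeType === 3) {   // 3 is the nodeType of a text node
    first.data = text;                   // modify the existing text node
  } else {
    // No leading text node: create one and insert it before the first
    // child (insertBefore with a null reference node appends).
    element.insertBefore(doc.createTextNode(text), first);
  }
}

// In a browser, for the paragraph shown above:
// setLeadingText(document.getElementById('myP'), 'Some different ', document);
```

Passing the document in as an argument is only a convenience so that the function can be exercised outside a browser.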

[snip]
> (2.1) Can i use innerHTML property of SELECT (or even incorporate
> DynWrite function) and in such form update some selects boxes on some
> event ?

Undoubtedly, but it's unlikely that I'd use it. Then again, you haven't
explained what you're trying to do.

> (i ask because i saw the way which uses Option object, and assumed is
> it not as portable as innerHTML solution).

On the contrary. Using the innerHTML property is less portable.

The Option constructor is a de facto function originating from NN4, and
has gathered quite widespread support. Though W3C DOM methods can take
its place, the Option constructor is still preferred by some due to
greater support.
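
As an illustration of the Option constructor approach, here is a rough sketch that replaces the contents of a SELECT element. The function name replaceOptions and the shape of the items array ({ text, value } objects) are assumptions made for this example only.

```javascript
// Hypothetical helper: replaces all options of a SELECT element.
// `items` is an array of { text: ..., value: ... } objects.
function replaceOptions(select, items) {
  var options = select.options;
  options.length = 0;                       // remove the existing options
  for (var i = 0; i < items.length; i++) {
    // Assigning one past the end of the collection appends a new OPTION.
    options[options.length] = new Option(items[i].text, items[i].value);
  }
}

// In a browser (the form and control names here are placeholders):
// replaceOptions(document.forms['myForm'].elements['mySelect'],
//                [ { text: 'Option 1', value: '1' } ]);
```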
> (2.2) Should i use XmlHttpRequest object [...] or i must consider
> using something like jsrs:

To do what? You really should stop asking questions without stating the
purpose behind them.

[snip]
> (4) I have heard about 'this' keyword troubles, [...]


Did you actually read the follow-ups I posted to your previous threads?
There are /no/ 'troubles'.

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
Nov 23 '05 #4

P: n/a
VK

Michael Winter wrote:
>> (4) I have heard about 'this' keyword troubles, [...]
>
> Did you actually read the follow-ups I posted to your previous threads?
> There are /no/ 'troubles'.


He's mentioning the recently discussed issue with broken
(non-canonical) "this" context behavior in inner object classes. It is
well analysed here:
<http://www.crockford.com/javascript/private.html>
(Look for "workaround for an error in the ECMAScript Language
Specification which causes this to be set incorrectly for inner
functions").

Nov 23 '05 #5

P: n/a
VK
> "this" context behavior in inner object classes

"this" context behavior in inner object *methods*

I'm getting again lexdyslic ...damn... dyslexic :-)

Nov 23 '05 #6

P: n/a
On 22/11/2005 10:02, Joshie Surber wrote:

[snip]
> [...] "this" is the instance of the function that calls it.


Presumably that's a typo. If not, it's completely incorrect and I refer
you to a recent post of mine; another follow-up to Luke, as it happens.

Subject: Object oriented stuff and browsers related thing
Date: 2005-11-20 00:05
Message-Id: V4*****************@text.news.blueyonder.co.uk

<http://groups.google.co.uk/group/comp.lang.javascript/browse_frm/thread/27beecb64c504859/1edf3029a9bed649#1edf3029a9bed649>

Specifically the part that begins, "The value of the this operator...".

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
Nov 23 '05 #7

P: n/a

Michael Winter wrote:
> On 22/11/2005 08:55, Luke Matuszewski wrote:
>
>> (2.1) Can i use innerHTML property of SELECT (or even incorporate
>> DynWrite function) and in such form update some selects boxes on some
>> event ?
>
> Undoubtedly, but it's unlikely that I'd use it. Then again, you haven't
> explained what you're trying to do.
>
> The Option constructor is a de facto function originating from NN4, and
> has gathered quite widespread support. Though W3C DOM methods can take
> its place, the Option constructor is still preferred by some due to
> greater support.
>
>> (2.2) Should i use XmlHttpRequest object [...] or i must consider
>> using something like jsrs:
>
> To do what? You really should stop asking questions without stating the
> purpose behind them.


I would like to have a page with a few <select> elements in a form. When
the selected value of one select changes, I would like to make a request
to the server-side program (a Struts Action object, a PHP script or some
other CGI-based program) with a request parameter holding the value
selected by the user (so it would probably be triggered by the onchange
event of the select element).
(sample code)
/* x is a reference to XmlHttpRequest */
/* 'this' is a reference to the <select> element <select ...
onchange="MakeRequest(this)" */
function getMyXmlHttpRequestObject() {
var x = null;
try {
x=new ActiveXObject("Msxml2.XMLHTTP")
} catch(e) {
try {
x=new ActiveXObject("Microsoft.XMLHTTP")
} catch(sc) {
x=null
}
}
if( !x && typeof XMLHttpRequest != "undefined") {
x=new XMLHttpRequest()
}
return x;
}

function MakeRequest(selectElem) {
var x = getMyXmlHttpRequestObject();
if(x) {
x.onreadystatechange = changeOptionsInAllSelect(selectElem.form,
x);
/* [1] */
x.open("GET", "/bw/My_Strust_Action/SelectChanged.do?"+
selectElem.name+"="+

selectElem.options[selectElem.selectedIndex].value, true);
x.send(null);
/* [2] */
/* or via POST
x.open("POST", "/bw/My_Strust_Action/SelectChanged.do", true);
x.send(selectElem.name+"="+
selectElem.options[selectElem.selectedIndex].value);
*/
}
}

function changeOptionsInAllSelect(thisForm, x) {
if( x.readyState == 4 && x.status == 200 ) {

var myXmlData = x.responseXML;
/* make changes... */
}

}

While I was at my university I realized that I really have to ask one
more question:

(1) When I send a request parameter via the XMLHttpRequest object, which
encoding is used for the request parameter value? For example:

1. If the above script is on a page which was served with Content-Type
text/html and charset UTF-8, and furthermore the <meta> element with
Content-Type suggests that the charset is UTF-8, then I assume the
encoding of the request parameter sent (the value of the selected
element) is UTF-8.

2. If the above script is included in the page (by pointing to it with
the src attribute of a <script> element whose charset attribute is set
to UTF-8), the page was served with Content-Type text/html and charset
UTF-8, and furthermore the <meta> is <meta
http-equiv="content-type" content="text/html; charset=UTF-8">:
[2] what encoding would be used if I send a DOM object, and how is it
sent then (as XML?)

Nov 23 '05 #8

P: n/a
I ask these questions about encoding because in most browsers the
request parameters are encoded in the same encoding as the page that
was served. But I don't really know what encoding will be used when
sending parameters via XMLHttpRequest (or whether I should pass an
already escaped version of the URL to the open() call, or to send() for
POST, on object x).

Nov 23 '05 #10

P: n/a
VK

Luke Matuszewski wrote:
> I ask this questions about encoding because in most kind of browsers
> the request parameters are encoded in the same encoding as the encoding
> of the page served. But i don't really know what encoding will be used
> when serving paramteres in XmlHttpRequest (or i should use aleready
> escaped version of url in open (or send if POST) function call on
> object x.

The best answer to such questions is *experiment*. Create a test case
and run it through the described conditions on different browsers. See
whether there is any consistency (which is not guaranteed at all, as
XMLHttpRequest is a rather new add-on implemented differently by
different vendors). If you get any interesting results - or any results
at all - do not hesitate to post them here! ;-)

Nov 23 '05 #11

P: n/a
As a final note, I realized that I can encode the parameter names and
their values via the escape() function, so I would pass something like
this:
x.open("GET",
"/bw/My_Strust_Action/SelectChanged.do?"+escape(selectElem.name+"="+
selectElem.options[selectElem.selectedIndex].value), true);
Furthermore, if I know that the value and name properties are taken
from my <select> element on a page with UTF-8 encoding, then they
should be escaped according to that encoding, so e.g. a character like
the Polish l with a line crossing it would be encoded as %C5%82 for
UTF-8 (and as %B3 in the Windows-1250 encoding).

Are my assumptions right?

Nov 23 '05 #12

P: n/a
VK

Luke Matuszewski wrote:
> As a final note i realized that i can encode the parameter names and
> its values via encode() function - so if i will pass something like
> that
> x.open("GET",
> "/bw/My_Strust_Action/SelectChanged.do?"+escape(selectElem.name+"="+
> selectElem.options[selectElem.selectedIndex].value), true);
> /* and further more if i know that the value and name property is taken
> from my <select> element used on a page with UTF-8 encoding then it
> should be escaped due to that encoding so eg latin characters like
> polish l and a line crossing it would be encoded as %C5%82 for UTF-8
> and %B3 in Windows-1250 encoding).
>
> Does my assumptions are right ?


Not exactly.

1) As I explained before, UTF-8 is a *transport encoding*, so UTF-8
byte sequences exist only while travelling from server to browser and
from browser to server. By the time you are able to operate on strings
using JavaScript, UTF-8's mission is already completed and all byte
sequences have been converted to the relevant Unicode characters. So
you never need to bother with UTF-8 transformations unless you decide
to emulate an HTTP server in JavaScript.

2) The escape / unescape methods work only with ASCII characters.
For Unicode transformations you have to use the encodeURIComponent /
decodeURIComponent methods.

Nov 23 '05 #13

P: n/a
When a user enters some text in a form field, it is typed using the
default charset set by the operating system, but the values entered are
then transformed to the encoding specified via the Content-Type header
of the HTTP response, e.g. Content-Type: text/html;
charset=windows-1250. That is the charset the browser uses to decode
the page contents, so that it recognizes the characters properly and
can display them. So the Polish l with a line crossing it would be
encoded as %C5%82 when charset=UTF-8 (it takes 2 bytes) and as %B3 when
charset=windows-1250 (it takes only one byte).
When the submit action occurs, the values entered by the user in the
form fields are sent via POST or GET. With POST they are sent in the
message body of the HTTP request, using the encoding specified in the
Content-Type mentioned above. So the HTTP request looks like this:
[Browser makes HTTP request via POST]
POST /bw/My_Strust_Action/SelectChanged.do HTTP/1.1
Content-Type: application/x-www-form-urlencoded; charset=UTF-8 (or
windows-1250/other)

my_param_name=my_param_value&my_param_name1=my_param_value2

or

[Browser makes HTTP request via GET (params and values are sent escaped
in the URL part)]
GET /bw/My_Strust_Action/SelectChanged.do?escaped_param_name=escaped_param_value&escaped_param_name1=escaped_param_value2 HTTP/1.1
Content-Type: application/x-www-form-urlencoded; charset=UTF-8 (or
windows-1250/other)

For the precise rules you must read
http://www.w3.org/TR/REC-html40/inte...html#h-17.13.4 (and the
RFC documents it refers to).

Then at the server side, the program which parses the parameters and
their values can decode them properly because it knows the encoding of
the parameters (first it replaces the %xx parts with their original
form).

Perhaps the biggest difference between the encodeURI() and escape()
functions (and their decodeURI() and unescape() counterparts) is that
the more modern versions do not encode a wide range of symbols that are
perfectly acceptable as URI characters according to the syntax
recommended in RFC2396 (http://www.ietf.org/rfc/rfc2396.txt). Thus, the
following characters are not encoded via the encodeURI() function:
; / ? : @ & = + $ , - _ . ! ~ * ' ( ) #
Use the encodeURI() and decodeURI() functions only on complete URIs.
Applicable URIs can be relative or absolute, but these two functions
are wired especially so that symbols that are part of the protocol
(://), search string (? and =, for instance), and directory-level
delimiters (/) are not encoded.

But, as I assumed, the string passed to those functions is treated as a
Unicode string, so the Polish l with a line crossing it would be encoded
as %C5%82, and therefore the names and their values will be sent in
their escaped UTF-8 form.
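
One practical consequence of the list above, as a small sketch: because encodeURI() leaves & and = alone, individual parameter names and values are better escaped with encodeURIComponent(). The helper name buildQuery and the parameter name q are invented for this example; the URL path is the one used earlier in this thread.

```javascript
// Hypothetical helper: builds "name=value&name2=value2" from an object,
// escaping each name and each value separately.
function buildQuery(params) {
  var pairs = [];
  for (var name in params) {
    if (Object.prototype.hasOwnProperty.call(params, name)) {
      pairs.push(encodeURIComponent(name) + '=' +
                 encodeURIComponent(params[name]));
    }
  }
  return pairs.join('&');
}

var raw = 'a&b=c';
var whole = encodeURI(raw);           // "a&b=c"      (& and = are left alone)
var part = encodeURIComponent(raw);   // "a%26b%3Dc"  (safe inside a value)

var url = '/bw/My_Strust_Action/SelectChanged.do?' + buildQuery({ q: raw });
// url ends with "?q=a%26b%3Dc"
```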

Nov 23 '05 #14

P: n/a
It is true that the escape() function produces %u0142 for the Polish l
with a line crossing it when read from a field via
forms["myForm"].elements["myFormField"].value (not what it should be,
but escape was probably introduced only for US-ASCII characters).
encodeURI() and encodeURIComponent() work very well and produce %C5%82.
This happens every time, no matter what charset was used in the
Content-Type of the HTTP response
(because every string in JavaScript is in Unicode format, even the
forms["myForm"].elements["myFormField"].value string).
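
The results reported above can be reproduced with a few lines; this is only a sketch, with the character written as a \u escape so that the encoding of the source file does not matter.

```javascript
var l = '\u0142';                  // Polish "l with stroke", U+0142

var legacy = escape(l);            // "%u0142"  non-standard %uXXXX form
var utf8 = encodeURIComponent(l);  // "%C5%82"  percent-encoded UTF-8 bytes

// decodeURIComponent reverses encodeURIComponent.
var back = decodeURIComponent(utf8);   // the same one-character string as l
```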

So it is wise to write and serve pages in UTF-8 when we want to deal
with i18n, and then at the server side to treat request params and
their values as UTF-8 strings.

Anyway, if the open() function of XMLHttpRequest does not itself escape
the given URL (which probably contains parameters), it is best to use
encodeURI on the whole URL or encodeURIComponent on each URL fragment,
and at the server side to treat request param names and values as
encoded in UTF-8.

ECMAScript v1 defines the String type as the set of all finite ordered
sequences of zero or more Unicode characters, but
encodeURI/encodeURIComponent and their decode counterparts have only
been defined since ECMAScript v3 (IE 5.5 on Windows and N6; IE Mac ?).

Nov 23 '05 #15

P: n/a
VK

Luke Matuszewski wrote:
> escape was introduced probably only for US-ASCII characters).

escape() and unescape() were introduced for the Common Gateway
Interface (CGI) to encode / decode ASCII characters which are not
directly allowed in CGI transmission. These are all extended ASCII
chars (>127) and many symbols from the lower part of the ASCII table.

These methods have no idea about the double (triple... and counting)
character units which constitute Unicode. So they cannot handle Unicode
characters properly, but they can still be used (though this is not
recommended) to handle traditional national encoding systems.

> encodeURI
> or encodeURIComponent() works very well and producing %C5%82. It
> happens everytime no matter what charset in Content-Type of HTTP
> response was used.
> (becouse every string in JavaScript is in Unicode format - even
> forms["myForm"].elements["myFormField"].value string).

You've got it! From within JavaScript everything is Unicode, no matter
what encoding is used. The only complication I can think of is the
document.location parts (if, say, the .search field contains some
non-base-ASCII string and you want to read it; I encourage you to
experiment).

> Then it is wise to write/serve pages in UTF-8 when we want to make play
> with i18n. Then at server side treat request params and its values as
> UTF-8 strings.

Again: you do not have to worry about it. Your own task is to save your
document in Unicode format (not UTF-8! Otherwise you'll get double
encoding on transmission) or in ASCII format with all non-base-ASCII
chars transformed into \uFFFF escape sequences. And then you serve it
with a UTF-8 Content-Type.

[ Content-Type *doesn't* define the actual encoding *on* the page.
Content-Type defines what encoding the browser should expect from
the server so that it can parse the received chars properly. ]

Nov 23 '05 #16

P: n/a
On 22/11/2005 11:43, VK wrote:
> Michael Winter wrote:
>>> (4) I have heard about 'this' keyword troubles, [...]
>>
>> Did you actually read the follow-ups I posted to your previous
>> threads? There are /no/ 'troubles'.
>
> He's mentioning the recently discussed issue with broken
> (non-canonical) "this" context behavior in inner object classes.


No, that's how the language works. Different behaviour might have been
preferred, but that doesn't constitute 'broken'.

[snip]

...an error in the ECMAScript Language Specification which
causes this to be set incorrectly for inner functions.
-- Douglas Crockford,
<http://www.crockford.com/javascript/private.html>

As much as I respect Douglas, I don't agree with that statement. The
caller is supposed to provide a value for the this operator, and that
situation is just fine. The main problem is that because IE only
relatively recently implemented the means to provide that value, it's
forced the current solution.
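
To make the point about the caller providing the this value concrete, here is a small sketch. The names Widget, viaThis and viaSelf are invented for the example; Function.prototype.call is used as the means of supplying the value explicitly.

```javascript
function Widget() {
  var self = this;                               // captured at construction time
  this.viaThis = function () { return this; };   // depends on how it is called
  this.viaSelf = function () { return self; };   // always the constructed object
}

var w = new Widget();
var other = {};

var asMethod = w.viaThis() === w;                // true: called as a method of w
var viaCall = w.viaThis.call(other) === other;   // true: the caller supplied `other`
var viaClosure = w.viaSelf.call(other) === w;    // true: the closure ignores the
                                                 // call-time this value
```

The captured variable is the workaround discussed on the page cited above; call() is the alternative when the caller is in a position to pass the value itself.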

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
Nov 23 '05 #17

P: n/a

VK wrote:
> Again: you do not have to worry about it. Your own tack is to save your
> document in Unicode format (not UTF-8 ! - otherwise you'll get double
> encoding on the trasmission) or in ASCII format with all non-base ASCII
> chars transformed into \uFFFF escape sequences. And then you serve it
> with UTF-8 Content-Type


What I mean is to write the page and save it using UTF-8 (so that the
Unicode characters get transformed the way the UTF-8 transformation
format requires), so:
- characters from US-ASCII take one byte in the UTF-8 encoding;
- characters outside it take two or more bytes in the UTF-8 encoding
(while in UTF-16 one char, even one from US-ASCII, takes 2 bytes in the
file, and there are some chars that take 4 bytes).
Once the file is saved it can be transmitted by the server, and the
server should know what charset to put in the Content-Type HTTP header,
so that the browser will decode the document properly and identify the
characters (such as those making up the HTML elements).
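
The byte counts above can be checked with a short sketch. It assumes an environment that provides TextEncoder (current browsers and Node.js; it did not exist when this thread was written), and the helper names utf8Length and utf16Length are invented for the example.

```javascript
// Number of bytes a string occupies when encoded as UTF-8.
function utf8Length(str) {
  return new TextEncoder().encode(str).length;
}

// Number of bytes in UTF-16 (without a byte order mark): JavaScript's
// length property counts 16-bit code units, so each unit is two bytes.
function utf16Length(str) {
  return str.length * 2;
}

var asciiBytes = utf8Length('a');        // 1
var polishBytes = utf8Length('\u0142');  // 2  (bytes C5 82)
var asciiWide = utf16Length('a');        // 2
var astral = utf16Length('\ud83d\ude00'); // 4  (a surrogate pair)
```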
http://www.w3.org/TR/REC-html40/charset.html#h-5.1
[ The document character set, however, does not suffice to allow user
agents to correctly interpret HTML documents as they are typically
exchanged -- encoded as a sequence of bytes in a file or during a
network transmission. User agents must also know the specific character
encoding that was used to transform the document character stream into
a byte stream. ]
http://www.w3.org/TR/REC-html40/charset.html#h-5.2
[ The "charset" parameter identifies a character encoding, which is a
method of converting a sequence of bytes into a sequence of characters.
This conversion fits naturally with the scheme of Web activity: servers
send HTML documents to user agents as a stream of bytes; user agents
interpret them as a sequence of characters. The conversion method can
range from simple one-to-one correspondence to complex switching
schemes or algorithms. ]
http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2
[ How does a server determine which character encoding applies for a
document it serves? Some servers examine the first few bytes of the
document, or check against a database of known files and encodings.
Many modern servers give Web masters more control over charset
configuration than old servers do. Web masters should use these
mechanisms to send out a "charset" parameter whenever possible, but
should take care not to identify a document with the wrong "charset"
parameter value.
] and [
How does a user agent know which character encoding has been used? The
server should provide this information. The most straightforward way
for a server to inform the user agent about the character encoding of
the document is to use the "charset" parameter of the "Content-Type"
header field of the HTTP protocol ([RFC2616], sections 3.4 and 14.17)
For example, the following HTTP header announces that the character
encoding is EUC-JP:

Content-Type: text/html; charset=EUC-JP
]
http://www.w3.org/TR/REC-html40/conform.html#h-4.3
[ The optional parameter "charset" refers to the character encoding
used to represent the HTML document as a sequence of bytes. Legal
values for this parameter are defined in the section on character
encodings. Although this parameter is optional, we recommend that it
always be present. ]

Whether a document which lies on some server is transformed when it is
actually served depends on the configuration of the server (as quoted
above). So if I use UTF-8 to save the files and set the server to serve
the documents with charset=UTF-8, no further transformation of a
document which is a file on the server should take place.

Nov 23 '05 #18

P: n/a
On 22/11/2005 15:00, Luke Matuszewski wrote:

[snip]
> (2.2) Should i use XmlHttpRequest object [...] or i must consider
> using something like jsrs:

[snip]

> I would like to have a page with few <select> elements in form. When
> one select changes its selected value i would like to make a request to
> the server CGI [...]

Presumably to get data with which you can change the other SELECT
elements? Yes, using XMLHttpRequest is fine for that. However, later in
your code you seem to be using the XML response data. I would have
thought it easier to return data formatted using JSON. An array of
objects containing the text/value pairs for the OPTION elements:

[ { text : 'Option 1', value : '1' },
{ text : 'Option 2', value : '2' },
...
{ text : 'Option n', value : 'n' } ]

You could simply loop through each element.
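
A possible sketch of the receiving side, assuming the server sends strict JSON (double-quoted keys and strings, unlike the object-literal notation shown above) and that JSON.parse is available, which it is in current browsers but was not at the time of this thread. The function name parseOptionData is made up for the example.

```javascript
// Hypothetical helper: turns a JSON response body into a clean array of
// { text, value } pairs, skipping malformed entries.
function parseOptionData(responseText) {
  var data = JSON.parse(responseText);   // throws SyntaxError on invalid JSON
  var items = [];
  if (!Array.isArray(data)) {
    return items;
  }
  for (var i = 0; i < data.length; i++) {
    var entry = data[i];
    if (entry && typeof entry.text === 'string' &&
        typeof entry.value === 'string') {
      items.push({ text: entry.text, value: entry.value });
    }
  }
  return items;
}

var sample = parseOptionData(
  '[{"text":"Option 1","value":"1"},{"text":"Option 2","value":"2"}]');
// sample.length is 2 and sample[1].text is "Option 2"
```

Each resulting pair can then be turned into an OPTION element, for instance with the Option constructor mentioned earlier in the thread.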

If I presumed wrong, XML might be the better format.

[snip]
> function getMyXmlHttpRequestObject() {
>   var x = null;
>   try {
>     x=new ActiveXObject("Msxml2.XMLHTTP")
>   } catch(e) {
>     try {
>       x=new ActiveXObject("Microsoft.XMLHTTP")
>     } catch(sc) {
>       x=null
>     }
>   }
>   if( !x && typeof XMLHttpRequest != "undefined") {
>     x=new XMLHttpRequest()
>   }
>   return x;
> }

I'd go with something more along the lines of:

function getRequestObject() {
  var request = null;

  if( ('function' == typeof XMLHttpRequest)
    || ('object' == typeof XMLHttpRequest))
  {
    request = new XMLHttpRequest();
  } else if('function' == typeof ActiveXObject) {
    /*@cc_on @*/
    /*@if(@_jscript_version >= 5)
    try {
      request = new ActiveXObject('Msxml2.XMLHTTP');
    } catch(e) {
      try {
        request = new ActiveXObject('Microsoft.XMLHTTP');
      } catch(e) {
        request = null;
      }
    }
    @end @*/
  }
  return request;
}

[snip]
> x.onreadystatechange = changeOptionsInAllSelect(
>     selectElem.form,
>     x);

You do realise that you're assigning the /result/ of calling that
function to the onreadystatechange property, not assigning the function
itself with some arguments, right?
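
One way to assign a function while still giving it access to its arguments is to return a closure. The following is a sketch only; makeReadyStateHandler and onSuccess are names invented for this example.

```javascript
// Hypothetical helper: returns a handler function instead of calling it.
// The returned function closes over `request` and `onSuccess`.
function makeReadyStateHandler(request, onSuccess) {
  return function () {
    if (request.readyState === 4 && request.status === 200) {
      onSuccess(request);
    }
  };
}

// Usage with a request object such as the one returned by
// getRequestObject() above:
// x.onreadystatechange = makeReadyStateHandler(x, function (req) {
//   /* make changes using req.responseText or req.responseXML */
// });
```

Here the call to makeReadyStateHandler does run immediately, but what it returns, and what ends up in onreadystatechange, is the inner function.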
> 1. (if the above script is on the page which was served in Content-Type
> as text/html and with charset UTF-8, furthermore the <meta> element
> with Content-Type suggest that charset is UTF-8)

You shouldn't be using META elements to indicate either MIME type or
encoding. That's what the HTTP Content-Type header is for.

> The encoding of request parameter send (value of selected element) is
> UTF-8.


As far as I'm aware, some implementations force UTF-8, irrespective of
the character encoding used by the document. Microsoft's (rather patchy)
documentation certainly seems to indicate this.

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
Nov 23 '05 #19

P: n/a
VK
You may keep your document in UTF-8 format and you may ask your server
admin to turn off the encoding mechanics so such a document would not be
double encoded while sending. But why would you want to do this? And
it definitely would not improve your text portability, because you
would have to make the same agreement with each server admin. The HTTP
protocol just doesn't work this way, and I have great doubts that you
will manage to change HTTP standards on all servers across the globe
:-)

Effectively your idea is similar to keeping all executables on your
computer in base64 text format and turning off your browser's base64
encoder so you could upload your files onto the server manually. It is also
doable, but what for? ;-)

Coming back to JavaScript internationalization portability: the
really safe way is to keep your code in basic ASCII with all
non-ASCII characters represented as \uFFFF Unicode escape
sequences. This way the encoding cannot be corrupted along the
way.
As development with \u all around is not convenient, you may want to
develop your JavaScript application in your national encoding and run
some converter (escaper?) over it in the final phase.
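
One way such a converter could look, as a rough sketch rather than a finished tool. The name escapeNonAscii is made up here, and the function blindly rewrites every non-ASCII code unit in the text it is given:

```javascript
// Replace every non-ASCII UTF-16 code unit in a piece of source text
// with its \uXXXX escape sequence. escapeNonAscii is an illustrative name.
function escapeNonAscii(source) {
  return source.replace(/[^\x00-\x7F]/g, function (ch) {
    var hex = ch.charCodeAt(0).toString(16).toUpperCase();
    // Pad to four hex digits, as the \u syntax requires.
    while (hex.length < 4) {
      hex = '0' + hex;
    }
    return '\\u' + hex;
  });
}
```

Feeding it the text of a script before upload leaves plain ASCII untouched and turns, for example, the Polish crossed l into \u0142.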

Hope this helps.

Nov 23 '05 #20

P: n/a

VK napisal(a):
You may keep your document in UTF-8 format and you may ask your server
admin to turn off the encoding mechanics so such a document would not be
double encoded while sending. But why would you want to do this? And
it definitely would not improve your text portability, because you
would have to make the same agreement with each server admin. The HTTP
protocol just doesn't work this way, and I have great doubts that you
will manage to change HTTP standards on all servers across the globe
:-)

Effectively your idea is similar to keeping all executables on your
computer in base64 text format and turning off your browser's base64
encoder so you could upload your files onto the server manually. It is also
doable, but what for? ;-)


You really don't understand me, and i doubt you will. I provided some
relevant text from the W3C, which says that the server of a document
tries to 'detect' the encoding of the document (or it is set manually
by the admin, see my previous post). So if it can somehow detect that
my document (e.g. an HTML page) is in UTF-8, then no transformation is
needed when it sends the data using Content-Type: text/html; charset=UTF-8.
As far as i know, a standard HTTP server tries to detect the encoding
of my document, and then all it needs to do is set the charset in the
Content-Type header to the detected encoding - NO TRANSFORMATION (or double
transformation).

Aside from servers like Apache httpd, there is a (historical)
convention that if the server cannot somehow detect the encoding of the
served file, it will use ISO-8859-1.
Again, if you are writing a PHP page or a JSP page (or pages in some
other language) you CAN EXPLICITLY SET THE CONTENT-TYPE HEADER. In JSP
you can do that via the <%@ page directive at the top of the page.

Unicode is a standard - its main role is to assign an integer value to
each character of every national script, while a UTF specifies how to
transform that character value (which may need up to 4 bytes) into a
byte stream, e.g. to produce a raw file.
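
To make the distinction concrete, here is a small sketch of the UTF-8 transformation for a single code point. utf8Bytes is an illustrative name, and the function does no validation (surrogate values and values above 0x10FFFF are not rejected):

```javascript
// Turn one Unicode code point (an integer) into its UTF-8 byte sequence.
// utf8Bytes is an illustrative name; input is not validated.
function utf8Bytes(cp) {
  if (cp < 0x80) {
    // 7 bits: a single byte, identical to US-ASCII
    return [cp];
  }
  if (cp < 0x800) {
    // up to 11 bits: two bytes
    return [0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)];
  }
  if (cp < 0x10000) {
    // up to 16 bits: three bytes
    return [0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F)];
  }
  // up to 21 bits: four bytes
  return [0xF0 | (cp >> 18),
          0x80 | ((cp >> 12) & 0x3F),
          0x80 | ((cp >> 6) & 0x3F),
          0x80 | (cp & 0x3F)];
}
```

For the Polish crossed l (code point 0x142) this gives the two bytes 0xC5 0x82, which is where the %C5%82 seen in URLs comes from.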

BR
Luke.

Nov 23 '05 #21

P: n/a
VK
Luke,

You asked for the most secure format to distribute JavaScript files
containing non-ASCII characters.

You've got the answer: \uFFFF Unicode escape sequences. If you think
that UTF-8 pre-coded text is more secure in this concern you are
welcome to try it across the servers - it's a free world, man! :-)

If you look at my posts around here you'll see that I'm the last man on
the block to prove anything by quoting documents. As I have said
several times: if you have a question, run an experiment. Make a
JavaScript file with Polish text in it, convert it into UTF-8, create a
document-holder with a UTF-8 Content-Type declaration and distribute it
to free hosting providers across the globe. Get the text into your
browser, check the result. Do the same with the \u escaped variant, check
the result. You see - it is rather easy and it doesn't involve any
quotations, cross-references, document version comparisons and other
academic toys. And most importantly - it gives some *practical*
answers.

If you don't want to experiment, then this particular topic has been
explored to its limits, I guess.

Nov 23 '05 #22

P: n/a

VK napisal(a):
Luke,

You asked for the most secure format to distribute JavaScript files
containing non-ASCII characters.

No, i asked about the escaping mechanism - especially how (or whether) it
works with the XmlHttpRequest object when using GET. My confusion was about a
particular case:
If i have a page which the browser/User Agent treats as having
the encoding e.g. Windows-1250, then when i take a value from this page in
JavaScript by e.g.

var str = selectElem.options[selectElem.selectedIndex].value;

and use it as a request parameter, e.g.

x.open("GET", "/someUrl?MyRequestParameter="+str, true);

then in what encoding would that MyRequestParameter parameter value be
at the server side?
Now i know that all strings in JavaScript 1.0 are in Unicode and
particularly my parameter would be in Unicode (UTF-8 as i presume) ...
so if sent by GET it needs to be escaped (because when using the GET method
only limited characters are permitted in the URL string, e.g.
even space ' ' is escaped as %20 and @ is escaped as %40 ... (%xx - xx
hex value)).
The escape() method properly escapes ' ' and @ and other basic (probably
taken from US-ASCII) characters. But when i want to send some latin
character like the Polish crossed l i have to first encode it via
encodeURLComponent (added in ECMAScript v3).
When i use POST this is not a problem since URL and POSTed data need no
escaping.
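
For comparison, a short sketch that runs both functions on the characters mentioned above. compareEncoders is an illustrative name; it is worth running this in your own browser, because escape() as specified leaves '@' alone, so the two functions already differ within US-ASCII:

```javascript
// Run the legacy escape() and encodeURIComponent() on the same value.
// compareEncoders is an illustrative name.
function compareEncoders(value) {
  return {
    escaped: escape(value),             // legacy function, not UTF-8 based
    encoded: encodeURIComponent(value)  // percent-encodes the UTF-8 bytes
  };
}

var space = compareEncoders(' ');        // both produce %20
var at = compareEncoders('@');           // only encodeURIComponent produces %40
var lStroke = compareEncoders('\u0142'); // %u0142 versus %C5%82
```

The %u0142 form produced by escape() is not a valid URL escape, which is one more reason to prefer encodeURIComponent() for request parameters.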

To see escaping mechanism in GET method of HTTP protocol try this ->
open your browser and in Location (URL) field type:

http://www.google.com/?myRequestParam=Some Value

and you will see

http://www.google.com/?myRequestParam=Some%20Value

and a page with text:

Not Found
The requested URL /?myRequestParam=Some%20Value was not found on this
server.

Got it ?

The only restriction here is that all request parameters sent via
open() (GET) or the send() method (POST) are sent in UTF-8 (so at the
server side my script must be aware of it, e.g. in a servlet i must call
request.setCharacterEncoding("UTF-8"); before reading any
parameters from the HTTP request).
(Personally i use a filter which always calls
request.setCharacterEncoding("UTF-8"); so i don't have to do it in every
servlet.)

It may be a limitation that the XmlHttpRequest object does GET using
the UTF-8 transformation format.

I have done some testing on Apache httpd server - from httpd.conf i
learned that its default charset is set by

#
# Specify a default charset for all pages sent out. This is
# always a good idea and opens the door for future internationalisation
# of your web site, should you ever want it. Specifying it as
# a default does little harm; as the standard dictates that a page
# is in iso-8859-1 (latin1) unless specified otherwise i.e. you
# are merely stating the obvious. There are also some security
# reasons in browsers, related to javascript and URL parsing
# which encourage you to always set a default char set.
#
AddDefaultCharset UTF-8

so (default charset in Content-Type header) is UTF-8

....so i have made 2 files
- one saved in Windows-1250 encoding (Central Europe);
- one saved in UTF-8 encoding;
and opened them in my browser...only file saved as UTF-8 was displayed
correctly.
Both files were served with Content-Type header set to UTF-8 (and
browser View->Encoding was set to that value to -> UTF-8) - so again no
transformation on server side and so no file charset detection was used
(as stated in w3c.org html part).

Nov 23 '05 #23

P: n/a
Luke Matuszewski wrote:
I have read the in the faq from jibbering about the generic
DynWrite, but i also realized that is uses only innerHTML feature of
HTML objects.
(1) Is there a DOM function which is very similar to innerHTML property
eg. (my guess) setInnerNodeAsText or sth... ?


`innerHTML' is a DOM feature, and I think one that can be considered a
feature of "DOM Level 0" (IE3+/NN3+). It is just not a feature of the
_W3C_ DOM, and AFAIK the latter DOM provides no equivalent for it. I think
that is because including such would include describing how the method
should handle invalid markup as it would be nonsensical to provide a method
that allows to destroy the DOM tree, the very thing it operates on. So
far, assignments to `innerHTML' are not checked for well-formedness, hence
the Gecko DOM disallows write access to it for documents served with an XML
document type, such as XHTML served as application/xhtml+xml; in that case,
you must do as Michael described.

You can use W3C DOM Level 3 Core's `textContent' attribute/property for
objects that implement the Node interface (includes all HTML element
objects) but that is restricted to plain text content, and I think that
is a Good Thing after all. `textContent' is supported in more recent
Mozilla/5.0 based UAs. It is on the wish list for Opera 9.
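
A hedged sketch of how one might set plain text with feature detection, falling back where `textContent' is missing. setText is an illustrative name; the `innerText' branch relies on the proprietary IE property, and the last branch uses only DOM Level 1 methods:

```javascript
// Set the plain-text content of an element, using feature detection.
// setText is an illustrative name, not part of any DOM specification.
function setText(el, text) {
  if (typeof el.textContent == 'string') {
    // W3C DOM Level 3 Core
    el.textContent = text;
  } else if (typeof el.innerText == 'string') {
    // proprietary alternative found in IE
    el.innerText = text;
  } else {
    // DOM Level 1 fallback: replace all children with one text node
    while (el.firstChild) {
      el.removeChild(el.firstChild);
    }
    el.appendChild(el.ownerDocument.createTextNode(text));
  }
  return el;
}
```

Testing with typeof rather than for a true value matters here, because an element with empty content has a `textContent' of '' and would otherwise be treated as unsupported.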

<URL:http://www.w3.org/TR/DOM-Level-3-Core/>
<URL:http://developer.mozilla.org/en/docs/DOM:element.textContent>
<URL:http://www.mozilla.org/docs/dom/reference/levels.html>
HTH

PointedEars
Nov 23 '05 #24

P: n/a
VK wrote:
Luke Matuszewski wrote:
As a final note i realized that i can encode the parameter names and
its values via encode() function
You SHOULD encode all reserved characters unless they serve their intended
purpose. You SHOULD NOT encode unreserved characters. See RFC3986.
- so if i will pass something like
that
x.open("GET",
"/bw/My_Strust_Action/SelectChanged.do?"+escape(selectElem.name+"="+

selectElem.options[selectElem.selectedIndex].value), true);
/* and further more if i know that the value and name property is taken
from my <select> element used on a page with UTF-8 encoding then it
should be escaped due to that encoding so eg latin characters like
polish l and a line crossing it would be encoded as %C5%82 for UTF-8
and %B3 in Windows-1250 encoding).

Does my assumptions are right ?
Not exactly.

1) As I explained before UTF-8 is a *transport encoding* thus UTF-8
char sequences are existing only during their travel time from server
to browser and from browser to server.


This is utter nonsense again. UTF means Unicode _Transformation_ Format,
its use is not restricted to a certain transport channel nor has UTF
anything to do with client-server communication.
At the moment you are able to operate with strings using JavaScript,
UTF-8 mission is already completed and all char sequences are converted
to the relevant Unicode chars.
Nonsense! A transformation format is needed to encode Unicode chars at code
points up to U+10FFFF in 8-bit code, including script code string literals.
It is only that this happens completely transparent to the user of string
literals as in a supporting implementation _all_ characters in the string
literal are encoded using UTF-16. Which is why "\uABCD" is a string of
length 1 there although two bytes are required to store it.
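
The point about length can be checked directly. A small sketch, with codeUnits as an illustrative name, that lists the UTF-16 code units a string consists of:

```javascript
// List the UTF-16 code units of a string as 4-digit hex values.
// codeUnits is an illustrative name.
function codeUnits(str) {
  var units = [];
  for (var i = 0; i < str.length; ++i) {
    var hex = str.charCodeAt(i).toString(16).toUpperCase();
    while (hex.length < 4) {
      hex = '0' + hex;
    }
    units.push(hex);
  }
  return units;
}

var bmp = codeUnits('\uABCD');          // one code unit
var astral = codeUnits('\uD83D\uDE00'); // U+1F600 written as a surrogate pair
```

A character beyond U+FFFF therefore counts as two in `length', while everything up to U+FFFF counts as one.
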
So you never need to bother with UTF-8 transformations unless
you decided to emulate an HTTP server using JavaScript.
Nonsense.
2) escape / unescape methods are working only with ASCII characters.
No, that also works for ISO-8859-xx characters, however it is not specified.
For Unicode transformations you have to use encodeURIComponent /
decodeURIComponent methods.


This was true if one replaced "transformations" with "percent encoding".
PointedEars
Nov 23 '05 #25

P: n/a
VK wrote:
Luke Matuszewski wrote:
encodeURI or encodeURIComponent() works very well and producing %C5%82.
It happens everytime no matter what charset in Content-Type of HTTP
response was used.
(becouse every string in JavaScript is in Unicode format - even
forms["myForm"].elements["myFormField"].value string).


You've got it! From withing JavaScript everything is Unicode, no matter
what encoding is used. [...]


You meant that even though you gave him the worst _opposite_ advice, he
managed to see through it. And now you are confirming that he is right
without admitting that you were utterly wrong. There is a word for that:
hypocrisy.
PointedEars
Nov 23 '05 #26

P: n/a
Luke Matuszewski wrote:
No, i asked about the escaping mechanism - especially how (or whether) it
works with the XmlHttpRequest object when using GET. [...]
Now i know that all strings in JavaScript 1.0 are in Unicode
No, Unicode support was not included before JavaScript 1.3 (NN 4.06). Since
then, strings are encoded using UTF-16 in accordance with ECMAScript. The
current JavaScript version is JavaScript 1.6 (in Mozilla/5.0 rv:1.8b+,
hence Firefox 1.5). Unicode support for Internet Explorer was probably not
included before version 5.5/Win which was the only to support JScript 5.5
which was the first JScript version to support encodeURIComponent().

<URL:http://docs.sun.com/source/816-6408-10/whatsnew.htm>
<URL:http://msdn.microsoft.com/library/default.asp?url=/library/en-us/script56/html/js56jsmthencodeuricomponent.asp>
and particularly my parameter would be in Unicode (UTF-8 as i
presume) ...
My expectation is instead that ASCII percent-encoding (as described in
RFC 3986) will be used for characters below code point 0x80 and UTF-8
percent-encoding will be used for the rest. I tried `ł]' (where ł is
fortunately AltGr+l here :)) and submitted it -- %C5%82%5D was used if
UTF-8 was set as Character Encoding in View menu before.[1] I assume
this will be triggered by the default response header you configured.
However, I suggest that you use server-side scripting instead to set

Content-Type: text/html; charset=UTF-8

only if you need it. Serving all, even non-UTF-encoded documents as
UTF-8 encoded is probably harmful.

[1] Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922
Firefox/1.0.7 (Debian package 1.0.7-1) Mnenhy/0.7.2.0
so if sent by GET it needs to be escaped (because when using the GET method
only limited characters are permitted in the URL string, e.g.
even space ' ' is escaped as %20 and @ is escaped as %40 ... (%xx - xx
hex value)).
The escape() method properly escapes ' ' and @ and other basic (probably
taken from US-ASCII) characters. But when i want to send some latin
character like the Polish crossed l i have to first encode it via
encodeURLComponent (added in ECMAScript v3).
encodeURIComponent(), you are right on the rest.
When i use POST this is not a problem since URL and POSTed data need no
escaping.


That is not entirely true. I think if your POST request would include
Unicode characters, it would be necessary to declare them as such, probably
with

Accept-Charset: UTF-8,*
PointedEars
Nov 23 '05 #27

P: n/a
In article <24****************@PointedEars.de>,
Thomas 'PointedEars' Lahn <Po*********@web.de> wrote:

<URL:http://developer.mozilla.org/en/docs/DOM:element.textContent>


hey, that's interesting. I tried it though in Firefox/Gecko. a
window.alert( text) gets me the text I wanted to insert, but the text on
the html page does not change.

How do I get a text change to take effect?

thanks
Nov 23 '05 #29

P: n/a
one man army wrote:
In article <24****************@PointedEars.de>,
Thomas 'PointedEars' Lahn <Po*********@web.de> wrote:
<URL:http://developer.mozilla.org/en/docs/DOM:element.textContent>

hey, that's interesting. I tried it though in Firefox/Gecko. a
window.alert( text) gets me the text I wanted to insert, but the text on
the html page does not change.

How do I get a text change to take effect?


<p onclick="changeTextContent(this);">Click on me...</p>

<script type="text/javascript">

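function changeTextContent(el){
  // Test for support with typeof: a plain truth test would also fail
  // for a supported but empty ('') textContent.
  if (typeof el.textContent == 'string'){
    // prompt() returns null on Cancel and '' for empty input; in both
    // cases the current text is kept.
    el.textContent = (
      prompt('Current text is: ' + el.textContent + '\n'
        + 'Enter new text or click \'Cancel\' to '
        + 'keep current text')
      || el.textContent);
  }
}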
</script>

--
Rob
Nov 23 '05 #30

P: n/a

Thomas 'PointedEars' Lahn napisal(a):
2) escape / unescape methods are working only with ASCII characters.


No, that also works for ISO-8859-xx characters, however it is not specified.
For Unicode transformations you have to use encodeURIComponent /
decodeURIComponent methods.


This was true if one replaced "transformations" with "percent encoding".

But this "percent encoding" is called the escaping mechanism in the RFC
documents. Also there is one thing to remember: when using
encodeURIComponent/decodeURIComponent, the argument is a
Unicode string (held in memory as UTF-16, as the spec says) and it is
transformed using UTF-8. This is a limitation, because the escaping
mechanism in GET forms (forms with method="GET") depends on the
charset value in the Content-Type HTTP header of the document served by
the server - so
- if the charset is windows-1250 then the Polish crossed l is encoded
as %B3
- if the charset is utf-8 then the Polish crossed l is encoded as
%C5%82 (as produced by encodeURIComponent);

Far better would be an encodeURIComponent function that takes a
second argument, charset, which would specify the encoding to
use when escaping, but i don't really know if there is
one (provided by IE or the SpiderMonkey engine in Mozilla based browsers).

BR.
Luke.

Nov 23 '05 #31

P: n/a
On 23/11/2005 09:16, Luke Matuszewski wrote:

[snip]
[The] escaping mechanism in GET forms (forms with method="GET") works
dependent on charset value in Content-Type HTTP header of a document
served by server [...]


There should be no escaping at all. If the GET transfer method is in
use, data should be limited to 7-bit ASCII. Anything else is undefined.
In practice, user agents do encode data, but the problem is that, unlike
with POST, there is no way to specify a charset parameter.

The most sensible approach for user agents would be to /always/ use
UTF-8, particularly as RFC 3986 (URI Generic Syntax) requires it for
certain URI components, creating consistent behaviour. Unfortunately,
they don't. Alternatively, avoid GET when transmitting multilingual data.
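
For the XMLHttpRequest case, sending the data by POST with an explicit charset could be sketched as follows. postForm is an illustrative name, `request' is assumed to behave like an XMLHttpRequest object, and whether the server honours the charset parameter on this media type is an assumption that needs checking on the server side:

```javascript
// Send name/value pairs in a POST body, declaring the encoding explicitly.
// postForm is an illustrative name; `request` is expected to behave like
// an XMLHttpRequest object (for example, one from getRequestObject()).
function postForm(request, url, params, onDone) {
  var pairs = [];
  for (var name in params) {
    if (Object.prototype.hasOwnProperty.call(params, name)) {
      // encodeURIComponent percent-encodes the UTF-8 bytes of each part.
      pairs.push(encodeURIComponent(name) + '=' +
                 encodeURIComponent(params[name]));
    }
  }
  request.open('POST', url, true);
  request.setRequestHeader('Content-Type',
    'application/x-www-form-urlencoded; charset=UTF-8');
  request.onreadystatechange = function () {
    if (request.readyState == 4) {
      onDone(request);
    }
  };
  request.send(pairs.join('&'));
}
```

The body is still percent-encoded, but the encoding behind the escapes is now stated in the request instead of being left to guesswork.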

I refer you to <http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html>.

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
Nov 23 '05 #32

P: n/a
one man army wrote:
Thomas 'PointedEars' Lahn <Po*********@web.de> wrote:
<URL:http://developer.mozilla.org/en/docs/DOM:element.textContent>


hey, that's interesting. I tried it though in Firefox/Gecko. a
window.alert( text) gets me the text I wanted to insert, but the
text on the html page does not change.

How do I get a text change to take effect?


WFM, Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922
Firefox/1.0.7 (Debian package 1.0.7-1) Mnenhy/0.7.2.0.

Why it does not work for you is impossible to say without you showing
the error message received, or the User-Agent and source code used.

<URL:http://validator.w3.org/>
<URL:http://diveintomark.org/archives/2003/05/05/why_we_wont_help_you>
<URL:http://jibbering.com/faq/#FAQ4_43>
PointedEars
Nov 23 '05 #33

P: n/a

Thomas 'PointedEars' Lahn napisal(a):
That is not entirely true. I think if your POST request would include
Unicode characters, it would be necessary to declare them as such, probably
with

Accept-Charset: UTF-8,*

No, Accept-Charset only influences the message body of the response.
The Status-Line and HTTP headers of the response are all constructed from
ISO-8859-1.
A general HTTP response stream consists of:
I Status-Line
II Message-Headers (optional)
III Blank Line
IV Message Body

Nov 23 '05 #34

P: n/a

Michael Winter napisal(a):

There should be no escaping at all.
Yes, but here theory and practice are evidently not the same - the escaping
mechanism is used even on US-ASCII characters like space ' ' (%20) or @
(%40)
http://czyborra.com/charsets/iso646-us.gif

but as a consequence browser implementations and newer server side
scripts have extended the escaping mechanism to all encodings they
support (so e.g. the Polish l is translated in UTF-8 as %C5%82).

I refer you to <http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html>.


Nice article ;)

Nov 23 '05 #35

P: n/a
On 2005-11-22, Luke Matuszewski <ma****************@gmail.com> wrote:
I ask this questions about encoding because in most kind of browsers
the request parameters are encoded in the same encoding as the encoding
of the page served.

No, GET parameters are urlencode()d; it's not like HTML escaping at all.
But i don't really know what encoding will be used
when sending parameters in XmlHttpRequest (or whether i should use an already
escaped version of the url in the open (or send if POST) function call on
object x).


It depends what the server/cgi at the other end is expecting.
--

Bye.
Jasen
Nov 24 '05 #36

P: n/a
You can use a .htaccess file to specify an HTTP encoding header;
you don't need admin privileges, only write (i.e. upload) privileges
to the directory where your web pages are.

--

Bye.
Jasen
Nov 24 '05 #37

P: n/a
On 23/11/2005 18:00, Luke Matuszewski wrote:
Michael Winter napisal(a):
There should be no escaping at all.


Yes, but here theory and practice are evidently not the same - the escaping
mechanism is used even on US-ASCII characters like space ' ' (%20) or @
(%40)


That's entirely different, and not what I was referring to. Certain
US-ASCII characters are reserved within URI components, and the URI
syntax RFCs specify how they are to be treated. Characters from outside
this repertoire, including Unicode characters, are not specified, nor is
there universal agreement in practice. As Flavell concludes, "all other
things being equal, this form submission content-type should be avoided
for serious i18n work."

This part of the thread started due to your concern using
XMLHttpRequest, so idempotence isn't really an issue, and XMLHttpRequest
objects are known (from what I've read) to always transform data using
UTF-8.

Anyway, as far as the HTML side of things are concerned, you should ask
in comp.infosystems.www.authoring.html and consider reviewing archived
material from that group (as they've no doubt discussed it all before).

[snip]

Mike

--
Michael Winter
Prefix subject with [News] before replying by e-mail.
Nov 24 '05 #38

P: n/a

Michael Winter napisal(a):
Anyway, as far as the HTML side of things are concerned, you should ask
in comp.infosystems.www.authoring.html and consider reviewing archived
material from that group (as they've no doubt discussed it all before).


Yep sure.

The best way to use HTML forms with i18n is to serve them with a
Content-Type specifying charset=UTF-8. Then, when it is a form without an
accept-charset attribute like:

<form action="url" method="get">

</form>

its contents (values of fields) are encoded in UTF-8 and then escaped
(%xx, as XmlHttpRequest does - as you suggested).
The implementation of XmlHttpRequest may differ in some user agents -
life is full of such situations - so to be safe we should always
encode parameters using encodeURI or encodeURIComponent (which are
fully Unicode compliant, unlike their old counterpart escape())
when using the GET HTTP method.
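
A short sketch of how that could look for a single parameter. appendParam is an illustrative name and /someUrl is the placeholder URL used earlier in the thread. For individual names and values encodeURIComponent seems the safer of the two, since encodeURI leaves '&' and '=' untouched:

```javascript
// Append one name/value pair to a URL for use with a GET request.
// appendParam is an illustrative name.
function appendParam(url, name, value) {
  var separator = url.indexOf('?') == -1 ? '?' : '&';
  return url + separator +
    encodeURIComponent(name) + '=' + encodeURIComponent(value);
}

var safeUrl = appendParam('/someUrl', 'MyRequestParameter', 'a&b=\u0142');

// encodeURI keeps '&' and '=' as they are, so a value containing them
// would be read by the server as additional parameters.
var leaky = encodeURI('a&b=\u0142');
```

The result can be passed straight to x.open("GET", safeUrl, true).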

Best Regards.
Luke.

Nov 24 '05 #39
