Bizarre JS brackets bug - mystery solved!

Afternoon,

In an earlier thread (http://tinyurl.com/5v4aa), I described a
problem I was having which was rather bizarrely solved by
changing the line:
"inputbox.value = numq+ag-cw-cc;"
to:
"inputbox.value = numq+(ag)-(cw)-(cc);"

This was needed in IE6 but not in any other browser I tried.
I have now solved the mystery of why inserting the brackets
removed the problem.

I used the age-old technique of removing everything else until
only the error remains. If you're interested in the two files
which eventually helped me to see the error, look at:
http://www.ex.ac.uk/cimt/dev/oddity/...ty-working.htm
http://www.ex.ac.uk/cimt/dev/oddity/...ity-faulty.htm

I will, however, explain the solution here.

IE6 is, I believe, the first version of the IE browser to have
"Auto-Select" for text encoding (character set) turned on by
default. When it loads the first of the above pages, it decides
that the encoding is "Western European (Windows)". When it
loads the second of the above pages, it decides that the
encoding is "Unicode (UTF-7)".

This process (and its arbitrary nature) is rather nicely illustrated
by the three examples below, which are all short. For full effect,
make sure you have Auto-Select turned on for text encoding if
you look at any of the web pages.

(1) http://www.ex.ac.uk/cimt/dev/oddity/...s-oddity-1.htm

<HTML>
<HEAD><TITLE>plus minus oddity 1</TITLE></HEAD>
<BODY>
foo+stuff-bar
</BODY>
</HTML>

This displays:
foo<oriental symbol>bar.
IE has decided that the document is Unicode (UTF-7).

(2) http://www.ex.ac.uk/cimt/dev/oddity/...s-oddity-2.htm

<HTML>
<HEAD><TITLE>plus minus oddity 2</TITLE></HEAD>
<BODY>
foo+stuff-bar<BR>
foo+ stuff -bar
</BODY>
</HTML>

This displays:
foo+stuff-bar
foo+ stuff -bar
IE has decided that this document is Western European (Windows).
How it has decided this is unclear to me. It contains the same first
line as example (1), but something in the second line makes it change
its mind. Perhaps it is the appearance of "stuff" without the "+"
directly in front?

(3) http://www.ex.ac.uk/cimt/dev/oddity/...s-oddity-3.htm

<HTML>
<HEAD><TITLE>plus minus oddity</TITLE></HEAD>
<META HTTP-EQUIV="Content-Type"
CONTENT="text/html; CHARSET=iso-8859-1">
<BODY>
foo+stuff-bar
</BODY>
</HTML>

This displays:
foo+stuff-bar
IE has correctly responded to my suggestion that this document is in
Western European (ISO) as specified in the META tag.

I'm sure that some of you will tell me that I should have always set
the character set for every HTML page I have ever written. If I had
done so, I might never have discovered this IE6 "feature".

Anyway, I have learnt my lesson.

I can see two potential ongoing problems. Firstly, it seems odd (to
me) that the text-encoding has also been used to process the script
within the page. There will be plenty of occasions where a variable
is enclosed between a "+" and a "-", and each of these could
potentially lead to an error. Do people script in non-latin charsets?

What makes the problem worse is that the way in which IE decides
the encoding depends fairly arbitrarily on things which appear *later*
in the code and/or page. Removing a working section of code might
remove the problem, but not because there was a fault in that section
of code.

Anyway, there is an easy solution.
Make sure the text-encoding is specified on every page.
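
A rough way to audit existing pages for this is sketched below: it checks whether an HTML source string contains a META charset declaration. The function name declaresCharset and the regular expression are illustrative assumptions, not part of IE or of any standard tool, and the check ignores a charset sent in the HTTP Content-Type header.

```javascript
// Hypothetical helper: true if the HTML source contains a <meta> tag that
// declares a charset (either the HTTP-EQUIV form or the short charset= form).
function declaresCharset(html) {
  return /<meta\b[^>]*charset\s*=\s*["']?[\w-]+/i.test(html);
}

// Example (1) from this post: no declaration, so IE6 is left to guess.
const example1 =
  "<HTML>\n<HEAD><TITLE>plus minus oddity 1</TITLE></HEAD>\n" +
  "<BODY>\nfoo+stuff-bar\n</BODY>\n</HTML>";

// Example (3) from this post: charset given in a META tag.
const example3 =
  "<HTML>\n<HEAD><TITLE>plus minus oddity</TITLE></HEAD>\n" +
  '<META HTTP-EQUIV="Content-Type"\nCONTENT="text/html; CHARSET=iso-8859-1">\n' +
  "<BODY>\nfoo+stuff-bar\n</BODY>\n</HTML>";

console.log(declaresCharset(example1)); // false
console.log(declaresCharset(example3)); // true
```

A text search like this only catches declarations inside the page itself, so a page that relies on the server's header would be reported as undeclared.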

Al

Jul 23 '05 #1
On Thu, 30 Sep 2004 16:00:28 +0100, Al Reynolds <aj******@bat400.com>
wrote:

[snip]
> Do people script in non-latin charsets?


I don't know if they do, but I presume that the potential is there.
Identifiers can legally contain Unicode characters from certain code
groups, and string literals can contain any Unicode character (and I'm not
referring to escape sequences). For them to be properly processed, I
assume that the character set must be set correctly.

[snip]

Mike

--
Michael Winter
Replace ".invalid" with ".uk" to reply by e-mail.
Jul 23 '05 #2
Al Reynolds wrote:
> I can see two potential ongoing problems. Firstly, it seems odd (to
> me) that the text-encoding has also been used to process the script
> within the page.

The script within the page is just part of the page. If the page is
encoded a specific way, then the text between the <script> and </script>
tags will be encoded the same way.

> Anyway, there is an easy solution.
> Make sure the text-encoding is specified on every page.


Indeed.
Anyway, this may be of passing interest to you: <url:
http://zsigri.tripod.com/fontboard/cjk/utf7.html />

Using some guesswork and the URL above, I've arrived at a partial
answer to your question about why IE sometimes decides to Auto-Select
UTF-7 and sometimes does not. Here it is:

If all "+" characters on a page are only followed by characters from the
Base64 alphabet up to the next "-" character, the page is assumed to be
UTF-7. If even a single "+" character on the page is followed by a
character not from the Base64 alphabet, the page is assumed to not be
UTF-7. As a result:

abc ++++- def would be UTF-7; but
abc +<b>+++</b>- would not

However, this does not explain everything; otherwise, for (var i = 0; i <
length; ++i-b) { ... } would cause problems (assuming no other occurrences
of "+" on the page), but it does not.
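
The guess above can be written down as a small sketch. To be clear, this encodes only the heuristic as I have described it, not IE's actual detection algorithm, and the name looksLikeUtf7 is made up for illustration. A page with no "+" at all is treated as not UTF-7, which is an extra assumption.

```javascript
// Characters of the Base64 alphabet (note that "+" itself is one of them).
const BASE64_CHAR = /[A-Za-z0-9+\/]/;

// Hypothetical heuristic: guess UTF-7 only if every "+" is followed by
// nothing but Base64 characters up to the next "-".
function looksLikeUtf7(text) {
  let sawPlus = false;
  let i = 0;
  while (i < text.length) {
    if (text[i] !== "+") { i++; continue; }
    sawPlus = true;
    let j = i + 1;
    while (j < text.length && BASE64_CHAR.test(text[j])) j++;
    if (text[j] !== "-") return false; // run ended on something other than "-"
    i = j + 1;
  }
  return sawPlus;
}

console.log(looksLikeUtf7("abc ++++- def"));                    // true
console.log(looksLikeUtf7("abc +<b>+++</b>-"));                 // false
console.log(looksLikeUtf7("foo+stuff-bar"));                    // true
console.log(looksLikeUtf7("foo+stuff-bar<BR>\nfoo+ stuff -bar")); // false
console.log(looksLikeUtf7("for (var i = 0; i < length; ++i-b) { ... }")); // true
```

The first four results agree with what was observed in IE6, including Al's examples (1) and (2). The last one is the counter-example: the heuristic says UTF-7, yet IE does not misbehave on it, so the real rule must involve something more.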

--
Grant Wagner <gw*****@agricoreunited.com>
comp.lang.javascript FAQ - http://jibbering.com/faq

Jul 23 '05 #3
VK

> Anyway, there is an easy solution.
> Make sure the text-encoding is specified on every page.


I don't think it always helps. How about situations when you really need a
script-powered page in Unicode? - Online dictionaries and language lessons
just to name the first.

Also I'm out of any ideas how the "+stuff-" literal might be interpreted as
a Korean syllabic symbol (Unicode value B2DB).

I think this is a bug ("+stuff-" = \u45787) and this is so called "unwanted
behavior" for the whole situation.

IMHO this should be definitely reported to Washington (I mean to the state
of, not DC :-)
Jul 23 '05 #4
On Fri, 1 Oct 2004 15:12:34 +0200, "VK" <sc**********@yahoo.com>
wrote:

>> Anyway, there is an easy solution.
>> Make sure the text-encoding is specified on every page.
>
> I don't think it always helps. How about situations when you really need a
> script-powered page in Unicode? - Online dictionaries and language lessons
> just to name the first.


There is no problem with scripting in UTF-8 in IE or Mozilla; even
scripts using UTF-8 characters in variable names work fine. Older Opera
and others have problems with those, but none with literals.

If the encoding is specified there's no problem at all. Just ensure you
specify an encoding and don't let it be guessed, as IE will guess wrong.

> I think this is a bug ("+stuff-" = \u45787) and this is so called "unwanted
> behavior" for the whole situation.

No. Whatever the browser does in response to an invalid document that
it has to fix up, it is down to luck whether it works. Don't leave it to
luck and you won't have a problem. For your bug above, a legitimate UTF-7
document would have a complementary bug; you can't deal with both.

Just include a proper charset!

Jim.
Jul 23 '05 #5
On Fri, 1 Oct 2004 15:12:34 +0200, VK <sc**********@yahoo.com> wrote:

>> Anyway, there is an easy solution.
>> Make sure the text-encoding is specified on every page.
>
> I don't think it always helps. How about situations when you really need
> a script-powered page in Unicode? - Online dictionaries and language
> lessons just to name the first.


[Theory]
Declare the document with its correct character set and place the script
in a separate file. If necessary, specify the charset attribute on the
SCRIPT element.
[/Theory]

Not having written documents in other character sets, I don't know how
effective that will be. However, it seems to be the technically correct
approach.

> Also I'm out of any ideas how the "+stuff-" literal might be interpreted
> as a Korean syllabic symbol (Unicode value B2DB).

"+stuff-" literal? What are you referring to?

> [...] \u45787 [...]


Unicode escape sequences use hexadecimal, not decimal.

[snip]

Mike

--
Michael Winter
Replace ".invalid" with ".uk" to reply by e-mail.
Jul 23 '05 #6
VK

> [Theory]
> Declare the document with its correct character set and place the script
> in a separate file. If necessary, specify the charset attribute on the
> SCRIPT element.
> [/Theory]

The theory is good and it's the first thing that came into my head too. But how to
deal with all this inline little onEvent stuff? (like
"...onChange=update(this.form, this.form)")
It looks like in Unicode it may be transformed in an unpredictable way.

> "+stuff-" literal? What are you referring to?


I'm referring to http://www.ex.ac.uk/cimt/dev/oddity/...ty-working.htm
from the original posting.
The character sequence (let's stick to this term) "foo+stuff-bar" has been
transformed into "foo[Korean symbol]bar".
Why? And what else may happen with your script on a unicode page? Maybe
"x+y=z" can become a Japanese text in some circumstances?

>> [...] \u45787 [...]
>
> Unicode escape sequences use hexadecimal, not decimal.


It depends. Unicode consortium publish all its tables in hex values.
Nevertheless if you need to use Unicode chars in non-unicode document (for
scripting for example), you have to use \u-sequences (\u+digital code
value).
Again - I'm not saying it's a crucial default, but it is definitely an issue
to be addressed in new IE releases.
Jul 23 '05 #7
On Fri, 1 Oct 2004 16:29:12 +0200, VK <sc**********@yahoo.com> wrote:

>> [Theory]
>> Declare the document with its correct character set and place the
>> script in a separate file. If necessary, specify the charset attribute
>> on the SCRIPT element.
>> [/Theory]
>
> The theory is good and it's the first thing that came into my head too. But how
> to deal with all this inline little onEvent stuff? (like
> "...onChange=update(this.form, this.form)")
> It looks like in Unicode it may be transformed in an unpredictable way.


That is a possibility. However, you could add the listeners through the
script itself. The only problem here is that old browsers won't be able to
use such pages as getting a reference to anything other than form controls
depends on getElementById (or similar).
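
A minimal sketch of attaching the handler from script rather than through an inline onChange attribute follows. The helper name attachChangeHandler, the control name "quantity" and the update function are placeholders invented for illustration; the form control is reached through document.forms, which does not depend on getElementById.

```javascript
// Hypothetical helper: assign the change handler from script, so that no
// script text has to live inside the HTML attributes of the page.
function attachChangeHandler(control, update) {
  control.onchange = function () {
    // "this" is the control when the browser invokes the handler.
    update(this.form, this.form);
  };
}

// In a page this might be called once the form has been parsed, e.g.:
//   attachChangeHandler(document.forms[0].elements["quantity"], update);
// ("quantity" and update are assumed names, not taken from the original page.)
```

With this arrangement the handler code sits in the script file, whose encoding can be declared separately from that of the page.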
"+stuff-" literal? What are you referring to?


> I'm referring to
> http://www.ex.ac.uk/cimt/dev/oddity/...ty-working.htm
> from the original posting.
> The character sequence (let's stick to this term) "foo+stuff-bar" has
> been transformed into "foo[Korean symbol]bar".


Oh, I see. I thought you were referring to some strange non-standard
character entity.

> Why?

From the UTF-7 definition, RFC 2152 - UTF-7 A Mail-Safe Transformation Format
of Unicode:

The "+" signals that subsequent octets are to be interpreted as
elements of the Modified Base64 alphabet until a character not in
that alphabet is encountered. Such characters include control
characters such as carriage returns and line feeds; thus, a Unicode
shifted sequence always terminates at the of a line [sic]. As a
special case, if the sequence terminates with the character "-"
(US-ASCII decimal 45) then that character is absorbed; other
terminating characters are not absorbed and are processed normally.

So the entire sequence +...- is replaced by the characters that "..."
decodes to in the Modified Base64 alphabet. The question is why IE decides
the page is UTF-7.
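
To make the rule concrete, here is a rough sketch of a lenient decoder for shifted sequences. The function name decodeUtf7Lenient is invented for illustration, and silently dropping leftover bits is an assumption about how a forgiving decoder such as IE6's might behave; as I read RFC 2152, "+stuff-" is strictly ill-formed because the bits left over after the first 16 are not zero.

```javascript
// Modified Base64 alphabet used inside UTF-7 shifted sequences.
const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical lenient decoder: every 16 bits of a shifted run become one
// UTF-16 code unit; leftover bits are dropped without complaint (assumption).
function decodeUtf7Lenient(text) {
  let out = "";
  let i = 0;
  while (i < text.length) {
    if (text[i] !== "+") { out += text[i]; i++; continue; }
    let j = i + 1;
    let bits = "";
    while (j < text.length && B64.indexOf(text[j]) >= 0) {
      bits += B64.indexOf(text[j]).toString(2).padStart(6, "0");
      j++;
    }
    if (j === i + 1) {
      // Empty run: "+-" encodes a literal "+"; otherwise keep the "+" as is.
      out += "+";
      i = text[j] === "-" ? j + 1 : j;
      continue;
    }
    for (let k = 0; k + 16 <= bits.length; k += 16) {
      out += String.fromCharCode(parseInt(bits.slice(k, k + 16), 2));
    }
    // A terminating "-" is absorbed; any other terminator is processed normally.
    i = text[j] === "-" ? j + 1 : j;
  }
  return out;
}

const decoded = decodeUtf7Lenient("foo+stuff-bar");
console.log(decoded.charCodeAt(3).toString(16).toUpperCase()); // B2DB
console.log(decoded === "foo\uB2DBbar");                       // true
```

The five characters of "stuff" carry 30 bits; the first 16 are 0xB2DB, which is the Korean syllable seen in the browser, and the remaining 14 are thrown away.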

[snip]

>>> [...] \u45787 [...]
>>
>> Unicode escape sequences use hexadecimal, not decimal.
>
> It depends. Unicode consortium publish all its tables in hex values.
> Nevertheless if you need to use Unicode chars in non-unicode document
> (for scripting for example), you have to use
> \u-sequences (\u+digital code value).


A script can be a Unicode document. Though identifiers must come from a
limited alphabet, string literals can contain any Unicode character.

Unicode escape sequences in string literals within scripts *do* require
hexadecimal characters. HTML entity references can use either decimal or
hexadecimal (decimal is probably safer).
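
A short illustration of the difference, using the character from this thread. The variable names are arbitrary; the point is only that 45787 is the decimal form of hexadecimal B2DB, and that a \u escape reads exactly four hexadecimal digits.

```javascript
// 45787 (decimal) and B2DB (hexadecimal) are the same code point.
const decimal = 45787;
console.log(decimal.toString(16).toUpperCase()); // B2DB

// Correct escape: four hexadecimal digits.
const korean = "\uB2DB";
console.log(korean.charCodeAt(0) === decimal);        // true
console.log(String.fromCharCode(decimal) === korean); // true

// "\u45787" is read as the escape \u4578 followed by a literal "7".
const misread = "\u45787";
console.log(misread.length);                     // 2
console.log(misread.charCodeAt(0).toString(16)); // 4578
console.log(misread[1]);                         // 7

// In HTML, by contrast, &#45787; (decimal) and &#xB2DB; (hexadecimal)
// both refer to this same character.
```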

> Again - I'm not saying it's a crucial default, but it is definitely an
> issue to be addressed in new IE releases.


However, Microsoft only seem to be issuing security updates. The next full
release will only be available in Longhorn, or so I've read.

Mike

--
Michael Winter
Replace ".invalid" with ".uk" to reply by e-mail.
Jul 23 '05 #8
On Fri, 1 Oct 2004 16:29:12 +0200, "VK" <sc**********@yahoo.com>
wrote:

>> [Theory]
>> Declare the document with its correct character set and place the script
>> in a separate file. If necessary, specify the charset attribute on the
>> SCRIPT element.
>> [/Theory]
>
> The theory is good and it's the first thing that came into my head too. But how to
> deal with all this inline little onEvent stuff? (like
> "...onChange=update(this.form, this.form)")
> It looks like in Unicode it may be transformed in an unpredictable way.


It's not; current browsers have excellent Unicode support. You've just
got to declare the character set so the browser knows!

> Why? And what else may happen with your script on a unicode page? Maybe
> "x+y=z" can become a Japanese text in some circumstances?

No. If you correctly declare the encoding, it simply cannot
happen.

> It depends. Unicode consortium publish all its tables in hex values.
> Nevertheless if you need to use Unicode chars in non-unicode document (for
> scripting for example), you have to use \u-sequences (\u+digital code
> value).

Please read the specifications; Michael was entirely correct:

\uhhhh - Unicode character represented by the four-digit hexadecimal
number hhhh.

> Again - I'm not saying it's a crucial default, but it is definitely an issue
> to be addressed in new IE releases.


There's no bug, the bug is in your code.

Jim.
Jul 23 '05 #9
