Rasmus Kromann-Larsen wrote:
The With Conundrum
I'm currently writing a master's thesis on (preparations for)
static analysis of JavaScript, and after investigating the
with statement, it has only become more evident to me that the
with statement is indeed bad.
Bad idea or not, the - with - statement exists in javascript and
even if officially deprecated now it will be around for a very
long time to come, for back-compatibility with existing code. It
is also (very) occasionally useful to add a specific object to the
scope chain of a function.
My initial thoughts on with were based on:
http://yuiblog.com/blog/2006/04/11/w...dered-harmful/
I built some examples to investigate - and later try to
"eliminate" the with statement from the code I'm analysing.
All examples besides the first one were evaluated in Rhino.
-----------------------------------------------------
Example 1 (pseudo). Readability.
with(x) {
a = b
}
Could be evaluated as:
x.a = x.b
x.a = b
a = x.b
a = b
(And this only gets worse if you add more variables -
exponentially worse actually :-)
With the possibility of deleting a or b doing nothing to make
the situation simpler.
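A minimal sketch of those four readings, in case it helps to see them run. The names run, both, onlyA, onlyB and neither are made up for illustration, and the body is built with the Function constructor only so that the sketch also loads where a literal with statement would be rejected (strict mode, modules). The outer a and b are passed in as parameters.

```javascript
// The with body is plain (non-strict) code created at runtime.
var run = new Function('x', 'a', 'b',
  'with (x) { a = b; }\n' +
  'return a;');

var both = { a: 1, b: 2 };
var onlyA = { a: 1 };
var onlyB = { b: 2 };
var neither = {};

// Each call returns the outer a as it stands after the with block.
var r1 = run(both, 'outer a', 'outer b');    // behaves as x.a = x.b
var r2 = run(onlyA, 'outer a', 'outer b');   // behaves as x.a = b
var r3 = run(onlyB, 'outer a', 'outer b');   // behaves as a = x.b
var r4 = run(neither, 'outer a', 'outer b'); // behaves as a = b
```

Which of the four happens depends only on which properties x has at the moment the statement runs, which is what makes the source text ambiguous to a reader.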
-----------------------------------------------------
Example 2 (rhino): Assignments.
with(x) {
var z = 42; // Or just z = 42 (without var)
}
Where does z end up? If x.z already exists, it will be
overwritten; if not, it will be added to the global scope.
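A small sketch of the var case, with the caveat that it wraps the code in a function (via the Function constructor, so that it loads in strict or module contexts too). Inside a function the var declaration belongs to that function rather than to the global scope, but the assignment still goes to x.z whenever x has a z. The names run, hasZ and noZ are made up for illustration.

```javascript
var run = new Function('x', [
  'with (x) {',
  '  var z = 42;',
  '}',
  'return z;'
].join('\n'));

var hasZ = { z: 1 };
var noZ = {};

// The returned value is the function-level z declared by var.
var localWhenPresent = run(hasZ); // the assignment went to hasZ.z instead
var localWhenAbsent = run(noZ);   // the assignment went to the declared z
```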
-----------------------------------------------------
Example 3 (rhino): Assignments continued (functions).
--- Example 3.1 ---
var obj = { x:42 };
with(obj) {
function x() { print(x); }
By strict ECMAScript rules that is a syntax error: no statement
may commence with the - function - keyword and only statements
may appear inside a Block Statement (FunctionDeclarations and
Statements being the two syntactic units from which javascript
programs are constructed). JavaScript(tm), and so Rhino, has a
syntax extension that provides a Function Statement, which is
what you have here in Rhino. Other ECMAScript implementations
(including JScript(tm) and that in the Opera browser) error-correct
this syntax error and (more or less) interpret it as a Function
Declaration regardless of its incorrect context.
This introduces the question of what language you are planning
as the subject of your static analysis. If it is JavaScript(tm)
then the above is fine, but the end result will have limited
applicability in the real world (where cross-browser, or at
least multi-browser scripts would be the useful subject). If
the subject was ECMAScript (which is the common sub-set of
implementations without any extensions) then it would be a
good idea to recognise its syntax errors.
On the other hand, if you are taking general source code that
may or may not include - with - statements and then converting
it into a form that does not use them then that form can be
JavaScript(tm) only, and so itself employ as many JavaScript(tm)
extensions as it likes.
}
<snip>
Since I'm interested in doing static analysis on JavaScript code --
and I don't want to deal with all these abnormalities, I decided to
try and normalize my way out of it. That is, take any given code
containing a with statement and automatically output valid JavaScript
with the exact same semantics, but without the with statement.
My first wild attempt was to try and introduce temporary variables, so
that:
-----------------------------------------------------
with(x) {
a = b;
}
-----------------------------------------------------
would become 3 blocks:
-----------------------------------------------------
// With start. (save all variables)
var evaled_x = x;
var temp_a = (evaled_x.a || a);
var temp_b = (evaled_x.b || b);
That won't work. Here you are using the value of the - b -
property of - evaled_x - to make the decision. If - evaled_x -
(and its prototype, and their respective prototypes) do not
have a - b - property then the value of the - evaled_x.b -
expression will be the undefined value, which will type-convert
to boolean false, and the - (evaled_x.b || b) - expression will
work. But if - evaled_x - has a - b - property but the value of
that property is boolean false, an empty string, the null value,
numeric zero or the undefined value (as may be explicitly
assigned to a property) then the right hand side of the logical
OR expression will still be used, and that would be wrong, and
potentially a runtime error.
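A minimal illustration of that failure, assuming an object whose - b - property holds numeric zero (the values are made up for the example):

```javascript
var b = 'outer b';
var evaled_x = { b: 0 };

// Inside with(evaled_x), the identifier b would resolve to evaled_x.b
// (numeric zero), because the property exists. The || test looks at
// the value instead of the existence and picks the outer variable.
var temp_b = (evaled_x.b || b);

var propertyExists = 'b' in evaled_x;
```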
What you need to do in order to determine which of - evaled_x.b
- or - b - to use is find out if - evaled_x -, or any object on
its prototype chain, actually has a - b - property. For the
object itself that is easy, as ECMAScript defines a -
hasOwnProperty - method, which is inherited by all objects. The
problem is you also need to call it on every object on the -
evaled_x - object's prototype chain, and ECMAScript keeps the
object's prototype chain internal.
However, if your subject really is JavaScript(tm), or you are
converting ECMAScript source into JavaScript(tm) code for analysis,
then you can use its (JavaScript(tm)'s) - __proto__ - extension,
which is a property of objects that refers to the first object
on the object's prototype chain (so you can work along the
whole prototype chain).
In that case a better test may be a function like:-
function hasProperty(obj, propertyName){
    return (
        (obj.hasOwnProperty(propertyName))||
        (
            Boolean(obj.__proto__)&&
            (hasProperty(obj.__proto__, propertyName))
        )
    );
}
- with:-
var temp_b = hasProperty(evaled_x, 'b')?evaled_x.b:b;
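As a possible alternative that does not need - __proto__ -: the - in - operator (part of ECMAScript 3rd edition) also consults the prototype chain, so where it is available the same test can be sketched as below. The names hasPropertyIn, Ctor and proto are made up for illustration.

```javascript
function hasPropertyIn(obj, propertyName) {
  // True if the object or anything on its prototype chain has the property.
  return propertyName in obj;
}

// An object that only inherits b, and with a falsy value at that.
var proto = { b: 0 };
function Ctor() {}
Ctor.prototype = proto;
var evaled_x = new Ctor();

var b = 'outer b';
var temp_b = hasPropertyIn(evaled_x, 'b') ? evaled_x.b : b;
```

Here temp_b takes the inherited zero, where a test on the value alone, or on hasOwnProperty alone, would have picked the outer b.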
// With block. (do replaced evaluation)
temp_a = temp_b;
// With end. (restore variables to newly calculated)
if(evaled_x.a)
evaled_x.a = temp_a;
else
a = temp_a;
if(evaled_x.b)
evaled_x.b = temp_b;
else
b = temp_b;
-----------------------------------------------------
But as far as I can figure, this won't do. I reached the conclusion
that you'd have to introduce temporary variables for each statement,
since each statement could potentially either directly or by side-
effects change the presence of properties on the evaluated object
(delete or assignments).
Also, more factors might even apply that I haven't even begun to
consider yet.
The - eval - function and the - Function - constructor turning string
data into executable code?
So, I was wondering, does anyone have a better solution (or more
insights) in the glorious quest of trying to eliminate a with
statement?
It is possible to implement ECMAScript in ECMAScript, and do so
without using the - with - statement. The implication of this is
that it must be possible to re-code any ECMAScript that uses the
- with - statement without it. However, the implications of doing
so are massive. The - with - statement manipulates the scope chain,
so the resulting code would have to dispense with the implicit scope
chains and explicitly implement them and Identifier resolution, and so
also implement its own objects with their prototype inheritance,
prototype chain and property name resolution, and substitute that
alternative mechanism for _all_ of the subject code.
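To give a feel for the scale of that, here is a deliberately tiny sketch of an explicit scope chain. It is nowhere near a full implementation: it still leans on native objects and the native - in - operator for property lookup, which a complete re-implementation would also have to replace. The names makeScope and resolve are made up for the example.

```javascript
// A scope is a bindings object plus a link to the enclosing scope.
function makeScope(bindings, parent) {
  return { bindings: bindings, parent: parent || null };
}

// Identifier resolution: walk the chain, return the object holding the name.
function resolve(scope, name) {
  for (var s = scope; s !== null; s = s.parent) {
    if (name in s.bindings) {
      return s.bindings;
    }
  }
  return null; // unresolvable reference
}

var globalScope = makeScope({ a: 'global a', b: 'global b' });
var x = { b: 'x.b' };

// Entering with(x) pushes x onto the front of the chain.
var withScope = makeScope(x, globalScope);

// Translation of:  a = b;  inside the with block.
resolve(withScope, 'a').a = resolve(withScope, 'b').b;
```

Every identifier in the subject code would need to go through something like resolve, which is why the result is closer to an interpreter than to the original source.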
It can be done, but you would then not be analysing the original
source code but instead the equivalent of the executable structure
'compiled' from the original source code. In which case it would
be as valid to use an ECMAScript implementation that compiled a
discrete 'bytecode' from its source code and make that 'bytecode'
the subject of analysis.
Richard.