On Sat, 27 Aug 2005, Tim wrote:
> On Fri, 26 Aug 2005 12:42:49 +0100, Alan J. Flavell sent:
> > (Of course, /real/ form submissions will continue to use ampersand as
> > their separator - the use of semicolon is specifically intended for
> > precomposed URLs, in order to avoid this problem).
> Which shouldn't be a problem, shouldn't it?
I'm not sure what potential problem you're meaning to rule out here,
but no, technically there isn't a problem at all - the only problem is
incomprehension by some HTML authors. :-}
> The submitted data isn't HTML,
The submitted data isn't HTML at all, right; but if a *precomposed*
URL is used, then the way to use it in the source HTML is as the value
of an attribute (href="..." for a link, src="..." for an image, etc.).
And in forming the value of that attribute, if there's a "&" in the
URL, then it has to be expressed in &-notation (&amp; or &#38; - or
&#x26; if you feel you must) in the HTML attribute value.
When HTML parses that &amp; in the attribute value in the HTML source,
it'll leave just the "&" character itself to be used in the target
URL. But you'd understood that already, right?
Nothing special, in other words - the same rule that applies to /all/
HTML attribute values which happen to have an ampersand in them (an
image whose alt text was coded as alt="Mom&Pop" would trigger just
such a validation error, too, and would need to be corrected to
alt="Mom&amp;Pop" or equivalent).
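As a rough illustration of that round trip, here is a small Python sketch
using only the standard library. The example.com URL and the HrefCollector
class are made up for the purpose; nothing here comes from any particular
page or tool. It escapes a precomposed URL for use in an href attribute,
then lets an HTML parser read it back:

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical precomposed URL with an ampersand-separated query string.
url = "http://example.com/search?q=html&lang=en"

# Express the URL as an HTML attribute value: "&" becomes "&amp;".
link = '<a href="{}">search</a>'.format(escape(url, quote=True))


class HrefCollector(HTMLParser):
    """Collects href values as the HTML parser hands them over (decoded)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "href":
                self.hrefs.append(value)


collector = HrefCollector()
collector.feed(link)
collector.close()

print(link)             # the HTML source carries &amp;
print(collector.hrefs)  # the parsed attribute value carries a bare &
```

The source text contains &amp;, but the value the parser passes on (and
hence the URL that would be requested) contains just the "&".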
Whereas, if the alternative semicolon is used for precomposed URLs, as
recommended in the HTML specifications, no such fiddling and bloat is
necessary.
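A minimal sketch of the difference, again in Python: precomposed_query is
a hypothetical helper written for this example (the standard urlencode
function has no separator option, as far as I know, so the pairs are
joined by hand).

```python
from html import escape
from urllib.parse import quote_plus


def precomposed_query(params, sep=";"):
    """Join name=value pairs with sep, %xx-encoding each name and value."""
    return sep.join(
        "{}={}".format(quote_plus(name), quote_plus(value))
        for name, value in params
    )


params = [("q", "html"), ("lang", "en")]

semi = precomposed_query(params)          # semicolon-separated
amp = precomposed_query(params, sep="&")  # ampersand-separated

# The semicolon form passes through HTML escaping unchanged;
# the ampersand form has to be rewritten before it goes into the HTML.
print(escape(semi) == semi)
print(escape(amp) == amp)
```

Any "&" or ";" that occurs inside a name or value comes out of quote_plus
as %26 or %3B, so it cannot be mistaken for a delimiter.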
It's perfectly OK to implement the server-side process to treat both
the "&" and the ";" as delimiters, since, if either or both of these
characters are meant to be interpreted as data rather than as
delimiters, the rules call for them to be sent in %xx-encoded format,
not as bare characters. So the same server-side parser can be used,
regardless of which kind of submission is being sent.
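Such a parser could be sketched along these lines. This is a minimal
illustration under the assumptions above, not anyone's production code;
parse_query is a made-up name. It splits on either bare delimiter first
and only then decodes the %xx sequences:

```python
import re
from urllib.parse import unquote_plus


def parse_query(query):
    """Split on bare '&' or ';', then %xx-decode each name and value."""
    pairs = []
    for field in re.split(r"[&;]", query):
        if not field:
            continue  # skip empty fields, e.g. from a trailing delimiter
        name, _, value = field.partition("=")
        pairs.append((unquote_plus(name), unquote_plus(value)))
    return pairs


# Both separator styles give the same result.
print(parse_query("q=html&lang=en"))
print(parse_query("q=html;lang=en"))

# Encoded delimiters are treated as data, not as separators.
print(parse_query("shop=Mom%26Pop%3BCo;x=1"))
```

Because the split happens before the decoding, a %26 or %3B in the data
never gets confused with a real delimiter.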
As I said, the respected Perl module CGI.pm parses both variants:
there's a switch to determine whether its *generated HTML* will use
semicolon (the default) or ampersand (on request), but that switch
changes only its generated HTML: it doesn't change CGI.pm's parsing of
submitted forms data at all - that code respects both of them as
delimiters, all the time.
h t h