Stan Brown <the_stan_brown@fastmail.fm> wrote:
> But when I replace the "i" in a mailto: with %6c, neither Mozilla
> 1.4 nor MSIE 4 recognizes it as a mailto link.
That's understandable, and correct, because it's not a mailto link any
more, or even a syntactically correct URL.
This is a tricky issue, since RFC 2396 is not exactly crystal clear, and
the prose seems to say that you can encode anything. (Clause 2.3 says:
"Unreserved characters can be escaped without changing the semantics of
the URI, but this should not be done unless the URI is being used in a
context that does not allow the unescaped character to appear.")
But if we read the BNF, available at
http://www.cs.tut.fi/~jkorpela/rfc/2396/full.html#A
in a hopefully more readable form than the original plain text, then it
seems that e.g. the scheme part must _not_ be encoded in any way.
It shall consist of a letter followed by letters, digits, and the
characters "+", "-" and ".", not including the percent sign, whereas
the BNF for "uric" (a URI character) explicitly covers escaped octets.
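To make the BNF argument concrete, here is a minimal sketch of a scheme check, assuming the production scheme = alpha *( alpha | digit | "+" | "-" | "." ) from RFC 2396. The function name split_scheme and the example address are made up for illustration, and real browsers need not work this way.

```python
import re

# scheme = alpha *( alpha | digit | "+" | "-" | "." )
# Note that "%" is not among the permitted characters.
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")

def split_scheme(uri):
    """Return the lowercased scheme if the text before the first ':'
    matches the scheme production, otherwise None.
    (Hypothetical helper, not part of any library.)"""
    head, sep, _rest = uri.partition(":")
    if sep and SCHEME_RE.fullmatch(head):
        return head.lower()
    return None

plain = split_scheme("mailto:user@example.com")          # scheme written literally
escaped = split_scheme("ma%69lto:user@example.com")      # "i" escaped inside the scheme
after_colon = split_scheme("mailto:user@examp%6ce.com")  # escape after the scheme
```

Under this reading, only the second string lacks a valid scheme: an escape after the colon falls in a part of the URI whose syntax allows escaped octets, but the scheme itself has no such allowance.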
> When I replace the
> "i" with &#108;, both browsers recognize it.
Naturally, because character references operate at the HTML parsing
level.
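The difference between the two layers can be sketched as follows, assuming a simplified model in which the HTML parser decodes character references in the attribute value before anything looks at it as a URI. The function href_scheme and the address are hypothetical; html.unescape is from the Python standard library.

```python
import html
import re

# Same scheme production as in RFC 2396: a letter, then letters,
# digits, "+", "-" or "."
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")

def href_scheme(attr_value):
    """Decode HTML character references first, then look for a scheme.
    (Hypothetical helper modelling the two processing levels.)"""
    uri = html.unescape(attr_value)      # HTML parsing level
    head, sep, _rest = uri.partition(":")
    # URI level: it only ever sees the already decoded text
    if sep and SCHEME_RE.fullmatch(head):
        return head.lower()
    return None

via_reference = href_scheme("ma&#105;lto:user@example.com")
via_percent = href_scheme("ma%69lto:user@example.com")
```

In this model the character reference has disappeared by the time the URI is examined, so the scheme is an ordinary "mailto", while the percent escape survives HTML parsing and leaves the scheme invalid.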
(The obfuscation of URLs is a fairly ineffective weapon against spam, by
the way. And we are, and will remain, under heavy attack from worms that
use addresses in people's address books, so anything you do in order to
"spam protect" addresses on Web pages or in Usenet postings is becoming
even less relevant than it was.)
--
Yucca,
http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring:
http://www.cs.tut.fi/~jkorpela/www.html