
Web browser control, Unicode

Mark Johnson
#1: Nov 13 '05
I wonder if anyone has a solution? I wanted to use the web browser
control as a 'zoom' box for a smaller textbox. I can format in the
control, and save whatever formatting as HTML code back to the textbox
when the web browser is closed. The only problem comes in the use of
numeric entities to specify Unicode. The web browser control is fine
when it comes to named entities, like &nbsp; . And there are a lot of
named entities, just insufficient for the purpose here (which is a
Greek set of composited Unicode taking up an entire page). But give it
a numeric entity, and it immediately converts to the actual character
which cannot then be read back to the textbox (which is limited to
ASCII), except as a question mark, "?". The Unicode character is lost.
All you have is an ASCII question mark as placeholder.
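
The loss described here can be reproduced outside the browser control. What follows is a small Python sketch, not the VB code used in the application. It assumes the textbox behaves like an ASCII-only encoder that substitutes "?" for anything it cannot represent, and it uses html.unescape as a stand-in for the browser control resolving the entity.

```python
import html

# The numeric entity as typed into the ASCII-only textbox.
entity = "&#x390;"

# The browser control resolves the entity to the real character (U+0390).
# html.unescape stands in for that step here.
character = html.unescape(entity)

# Assumption: storing the text in the textbox acts like an ASCII encode
# that replaces unrepresentable characters with "?".
stored = character.encode("ascii", errors="replace").decode("ascii")

print(len(character))  # 1
print(stored)          # ?
```

Once the text is down to "?", nothing in it says which character used to be there, which is why some marker has to be carried alongside.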

I couldn't see that setting the enable_utf8 flag would help, nor
another for 'reparsing' or something. These are probably default
anyway. But please, someone correct me if that's wrong.

The best I could come up with is an on-the-fly replacement. The
numeric entity displays as it should in the browser control, but is
forever lost to any attempt to read it back through the innerHTML
property, so it is copied to a dummy element named with the same number
but without the leading "&#" and trailing semicolon. So where &#x390;
would disappear into a "?" when the web browser control is first
opened, it is copied just before that into a "<x390 />" element. Then
when it comes time to save the result and close the web browser
control, the "<x390 />" is transformed back into &#x390; , in both
cases by VB code, using Regexp or whatever. Obviously, aside from being
kludgy, the method only works when the Unicode is first entered into
the textbox before opening the web browser control. The Unicode cannot
be entered directly into the browser control, as might seem more
convenient (cut n paste from some character map app, for example).

Hope there's a simpler way. Anyone know?

Mark Johnson
#2: Nov 13 '05

re: Web browser control, Unicode


Mark Johnson <102334.12@compuserve.com> wrote:
>I wonder if anyone has a solution? I wanted to use the web browser
>control as a 'zoom' box for a smaller textbox. I can format in the
>control, and save whatever formatting as HTML code back to the textbox
>when the web browser is closed. The only problem comes in the use of
>numeric entities to specify Unicode. The web browser control is fine
>when it comes to named entities, like &nbsp; . And there are a lot of
>named entities, just insufficient for the purpose here (which is a
>Greek set of composited Unicode taking up an entire page). But give it
>a numeric entity, and it immediately converts to the actual character
>which cannot then be read back to the textbox (which is limited to
>ASCII), except as a question mark, "?". The Unicode character is lost.
>All you have is an ASCII question mark as placeholder.

So for anyone interested, here are the simple regexp patterns for the
substitutions required.

The textbox control is being 'zoomed' in a popup which uses a web
browser control. As soon as any numeric entity gets put into the
browser control, it's lost. It will display properly. But it can't be
then read back out with document.body.innerHTML (or outerHTML). It's
just gone.

So just before the zoom popup is shown, one modifies the text in the
textbox control itself.

textbox control = fRegexp(textbox control, "&#([^;]*);", "<A$1/>&#$1;", True)

where fRegexp is some code you place in a module as a cover for the WSH
or whatever regexp function.

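Public Function fRegExp(ByVal strFull As String, strPattern As String, _
        strReplace As String, boolGlobal As Boolean) As String
    ' Assumes a project reference to Microsoft VBScript Regular Expressions
    Dim regexp1 As New RegExp

    regexp1.Pattern = strPattern      ' pattern to search for
    regexp1.Global = boolGlobal       ' True = replace every match
    fRegExp = regexp1.Replace(strFull, strReplace)

End Function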
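
To see what the first substitution produces, here is the same pattern run through Python's re module. This is only a stand-in for the VB RegExp object: Python writes the backreference as \1 where the VB replacement string uses $1, and the function name and sample text are made up for illustration.

```python
import re

def add_markers(text: str) -> str:
    """Put a dummy <A.../> element in front of every numeric entity."""
    # Same pattern as in the VB call: capture everything between "&#" and ";".
    return re.sub(r"&#([^;]*);", r"<A\1/>&#\1;", text)

sample = "alpha &#x390; beta &#945; gamma&nbsp;"   # hypothetical textbox content
print(add_markers(sample))
# alpha <Ax390/>&#x390; beta <A945/>&#945; gamma&nbsp;
```

Named entities such as &nbsp; are left alone, since the pattern only matches text that starts with "&#".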


And when it comes time to close the popup, and save the final result,
the textbox control again is written directly as:

textbox control = fRegexp(textbox control, "<A([^\?]*) />\?", "&#$1;", True)
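
The closing substitution can be checked the same way. This Python sketch assumes the browser control hands the text back with each marker written as <A... /> (with a space before the slash) and each converted character collapsed to "?", which is what the pattern expects. The readback string is invented for illustration, and restore_entities_strict is a hypothetical tightening that is not part of the original code.

```python
import re

def restore_entities(readback: str) -> str:
    """Turn each '<Axxxx />?' pair back into the numeric entity '&#xxxx;'."""
    # Same pattern as the VB call: the marker, then the "?" left behind
    # by the lost Unicode character.
    return re.sub(r"<A([^\?]*) />\?", r"&#\1;", readback)

def restore_entities_strict(readback: str) -> str:
    """Hypothetical tighter variant: the marker name may not contain
    a space, slash, or question mark, so it cannot run across other tags."""
    return re.sub(r"<A([^ /?]*) />\?", r"&#\1;", readback)

readback = "alpha <Ax390 />? beta <A945 />?"   # invented readback text
print(restore_entities(readback))         # alpha &#x390; beta &#945;
print(restore_entities_strict(readback))  # alpha &#x390; beta &#945;
```

One thing to watch with the original pattern: [^\?]* will happily run across other markup, so an ordinary <A href=...> link sitting before a marker, with no "?" in between, gets swallowed into the match. The stricter character class is one possible way around that, if the formatted text can contain links.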


But, once again, the problem is that this keeps any Unicode editing
out of the zoom box. It seems to me it would be convenient to cut n
paste Unicode values from some charmap-type app directly into the
zoom, rather than into the underlying textbox after the zoom is
closed. I guess the fix there would be to add a button to the zoom
box button bar (assuming one uses such a custom bar) to insertHTML
for each Unicode value, with the dummy "<a . . />" elements and all,
so that these too would be included in the closing call to fRegexp.
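
A rough sketch of what such a button handler might generate for one Unicode value, written in Python for illustration only. The function name is hypothetical, and the marker format simply mirrors the <A.../> convention used in the first fRegexp call; how the result gets inserted into the browser control is left out.

```python
def marked_entity(code_point: int) -> str:
    """Build the dummy element plus numeric entity for one Unicode value."""
    hex_digits = format(code_point, "x")   # e.g. 0x390 -> "390"
    # Marker first, entity second, matching the order used before the zoom opens.
    return f"<Ax{hex_digits}/>&#x{hex_digits};"

print(marked_entity(0x390))  # <Ax390/>&#x390;
```

Text inserted this way looks the same as text that went through the opening substitution, so the closing call should not need to tell the two apart.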

But - that is - there must be a better way. Ideally, one could simply
turn
