serious encoding problem

I want to save text to a file and then display that text file using
Internet Explorer.

If I am displaying "html text" everything is fine, but if I display plain
text, all characters from the extended ascii look weird: they are not
properly encoded! Using the options under View -> Encoding -> ... in
Internet Explorer I can switch to another encoding, and then the text is
displayed correctly. In the same way, I can make the "html text" look weird.

In my program I am using the AxSHDocVw.AxWebBrowser control to display the
text.
How can this problem be solved? Outlook Express, for instance, displays all
messages correctly, both HTML and plain text. How can I achieve that
behavior? Is there a way to change the encoding from code?

Thanks a lot in advance,
timtos.

Nov 13 '05 #1
timtos <ha*****@uni-koblenz.de> wrote:
> I want to save text in a file and after that I want to display this textfile
> using the internet explorer.
>
> If I am displaying "html text" everything is fine but if I want to display
> plain text all characters from the extended ascii are looking weird - are
> not properly encoded!

And what exactly is "the extended ascii"? ASCII is Unicode 0-127, and
nothing else. There are various encodings which have the same values
for 0-127, but they differ considerably from each other above that. You
need to know *exactly* what encoding you really want to use, and then use
the appropriate Encoding instance for output.

See http://www.pobox.com/~skeet/csharp/unicode.html
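
To make "use the appropriate Encoding instance" concrete, here is a minimal sketch. The class and method names (ExplicitEncodingDemo, WriteText) are made up for illustration, and ISO-8859-1 is only an example of a single-byte encoding, not necessarily the one needed in this case.

```csharp
using System;
using System.IO;
using System.Text;

public static class ExplicitEncodingDemo
{
    // Hypothetical helper: writes text using exactly the encoding the caller names.
    public static void WriteText(string path, string text, Encoding encoding)
    {
        // false = overwrite the file rather than append to it
        using (var writer = new StreamWriter(path, false, encoding))
        {
            writer.Write(text);
        }
    }

    public static void Main()
    {
        string text = "caf\u00E9"; // the last character is U+00E9, outside ASCII

        string latin1Path = Path.GetTempFileName();
        string utf8Path = Path.GetTempFileName();

        // ISO-8859-1 is used here purely as an example encoding (assumption).
        WriteText(latin1Path, text, Encoding.GetEncoding("iso-8859-1"));
        // UTF8Encoding(false) means UTF-8 without a byte order mark.
        WriteText(utf8Path, text, new UTF8Encoding(false));

        // Same characters, different bytes on disk.
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(latin1Path)));
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(utf8Path)));

        File.Delete(latin1Path);
        File.Delete(utf8Path);
    }
}
```

The first file should come out as four bytes and the second as five, because U+00E9 is one byte in ISO-8859-1 and two bytes in UTF-8.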

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 13 '05 #2
Thanks for answering! With "the extended ascii" I meant 128 and above!
I picked up the phrase "the extended ascii" from http://www.asciitable.com/,
which refers at the bottom of the page to the "Extended ASCII Codes", and
they mean by it what I meant by "the extended ascii" ;-) I'll have a look
at your link now. Thanks again for answering.

Greetings,
timtos.
Nov 13 '05 #3
timtos <ha*****@uni-koblenz.de> wrote:
> Thanks for answering! With "the extended ascii" I meant 128 and above!

But there *is* no single "extended ascii".

> I came to the phrase "the extended ascii" because at
> http://www.asciitable.com/ at the bottom, it is referred to the "Extended
> ASCII Codes" and they mean with it what I meant with "the extended ascii"
> ;-) I'll have a look at your link now. Thanks again for answering.

That page was written by someone who doesn't really understand what
ASCII is, I suspect. As I said, there are various "extensions" to
ASCII, none of which can uniquely be called "extended ascii". When
other people talk about "extended ascii" they mean different things...
and that's why it's a term which should never, IMO, be used.
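
A small sketch of why the term is ambiguous: the same byte value above 127 decodes to a different character under each "extension". The names here (CodePageDemo, DecodeByte) are hypothetical, and the three code pages are just examples. The RegisterProvider call assumes .NET Core or .NET 5+; on .NET Framework these code pages are built in and the static constructor can be removed.

```csharp
using System;
using System.Text;

public static class CodePageDemo
{
    static CodePageDemo()
    {
        // Assumption: running on .NET Core / .NET 5+, where code pages such as
        // 437 and 1252 must be registered before Encoding.GetEncoding finds them.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Hypothetical helper: decodes a single byte under the given code page and
    // returns the Unicode value of the resulting character.
    public static int DecodeByte(byte value, int codePage)
    {
        string decoded = Encoding.GetEncoding(codePage).GetString(new byte[] { value });
        return decoded[0];
    }

    public static void Main()
    {
        // 437 = original IBM PC, 1252 = Windows Western European, 28591 = ISO-8859-1
        int[] codePages = { 437, 1252, 28591 };
        foreach (int codePage in codePages)
        {
            Console.WriteLine("code page {0}: byte 0x80 -> U+{1:X4}",
                codePage, DecodeByte(0x80, codePage));
        }
    }
}
```

Byte 0x80 should come out as U+00C7 under code page 437, U+20AC under 1252 and U+0080 under ISO-8859-1: three "extended ASCII" tables, three different characters.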

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 13 '05 #4
> When other people talk about "extended ascii" they mean different
> things... and that's why it's a term which should never, IMO, be used.

And I agree with you ;-)
Thanks for clearing things up!

Greetings,
timtos.
Nov 13 '05 #5
timtos <ha*****@uni-koblenz.de> wrote:
> But I still got my initial problem :-(
> I think I understood the Unicode stuff but perhaps there is something out
> there concerning encoding that I _don't_ understand...

Okay. Do this in several stages. First, work out what the encoding of
the text file is. Then load it into a .NET program, and print out the
Unicode value (as an integer) of each character. That way you'll know
you've loaded it properly. Then work out exactly what encoding your
control wants (hopefully it'll be documented) and then you should be
able to encode it appropriately.
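
The "load it and print the Unicode values" stage could be sketched roughly as follows. DumpCodePoints and Describe are hypothetical names, and the program assumes you pass the file path and the name of the encoding you believe the file uses on the command line.

```csharp
using System;
using System.IO;
using System.Text;

public static class DumpCodePoints
{
    // Hypothetical helper: one line per char, showing its value in hex and decimal.
    // Note that a .NET char is a UTF-16 code unit, so a character outside the
    // Basic Multilingual Plane shows up as two surrogate values.
    public static string Describe(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            sb.AppendFormat("U+{0:X4} ({1})", (int)c, (int)c).AppendLine();
        }
        return sb.ToString();
    }

    public static void Main(string[] args)
    {
        // args[0]: path of the text file
        // args[1]: name of the encoding you believe it uses, e.g. "utf-8"
        string path = args[0];
        Encoding encoding = Encoding.GetEncoding(args[1]);

        // Decode the file with the stated encoding, then dump the values.
        string text = File.ReadAllText(path, encoding);
        Console.Write(Describe(text));
    }
}
```

Printing numbers rather than the characters themselves keeps the console's own encoding out of the picture. If the values match what you expect for the text, the guess about the file's encoding was right.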

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 13 '05 #6
First of all, thanks a lot for your help Jon Skeet. You helped a lot!
But a few little questions remain although the encoding is working now:

I created the file with a StreamWriter and used the following call to
initialize it:
sw = fit.CreateText();

So the default encoding was used - UTF-8. I thought that was OK because
MSDN says:
"UTF-8 handles all Unicode characters correctly and gives consistent results
on localized versions of the operating system."

But when I tried to display that file in the AxWebBrowser control, the
characters were messed up because this control uses the Western European
(Windows) encoding. Now I use the following call to initialize the
StreamWriter object:
sw = new StreamWriter(path, false, System.Text.Encoding.Default);

Now it is working!
But why is UTF-8 not the right way here?
Any thoughts about this problem? Is the way I am going now OK?

Thanks for sharing your thoughts,
timtos.
Nov 13 '05 #7
timtos <ha*****@uni-koblenz.de> wrote:
> First of all, thanks a lot for your help Jon Skeet. You helped a lot!
> But a few little questions remain although now the encoding is working:
>
> I created the file with a StreamWriter and used the following call to
> initialize it:
> sw = fit.CreateText();
>
> So the default encoding was used - UTF8. I thought that was ok because in
> the msdn it is written:
> "UTF-8 handles all Unicode characters correctly and gives consistent results
> on localized versions of the operating system."

UTF-8 itself does indeed handle all Unicode characters.

> But trying to display that file in the AxWebBrowser control, the characters
> were messed up because this control uses West-European (Windows) encoding.

Right.

> Now I use the following call to initialize the StreamWriter object:
> sw = new StreamWriter(path, false, System.Text.Encoding.Default);
>
> Now it is working!
> But why is UTF-8 not the right way here?

Because as you say, the control doesn't use UTF-8.

> Any thoughts about this problem? Is the way I am going now ok?

If you could find some way to make the AxWebBrowser control use UTF-8,
that would be the best solution. UTF-8 is a generally nice encoding.

If you can't change what encoding a control will understand, however,
you must "feed" it stuff encoded with what it *does* understand.
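
As a rough illustration of what goes wrong when the writer and the reader disagree, the sketch below encodes a string one way and decodes it another. MismatchDemo and Reinterpret are hypothetical names, and ISO-8859-1 merely stands in for the control's Western European encoding (an assumption; the two agree for the character used here).

```csharp
using System;
using System.Text;

public static class MismatchDemo
{
    // Hypothetical helper: encode text with one encoding, then decode the
    // resulting bytes with another, as a mismatched reader would.
    public static string Reinterpret(string text, Encoding writtenAs, Encoding readAs)
    {
        byte[] bytes = writtenAs.GetBytes(text);
        return readAs.GetString(bytes);
    }

    // Prints each char of a string as a Unicode value, avoiding console
    // encoding issues with the characters themselves.
    private static void Print(string label, string text)
    {
        Console.Write(label + ":");
        foreach (char c in text)
        {
            Console.Write(" U+{0:X4}", (int)c);
        }
        Console.WriteLine();
    }

    public static void Main()
    {
        Encoding utf8 = new UTF8Encoding(false);
        // Stand-in for the encoding the control expects (assumption).
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");

        string original = "\u00E9";

        Print("written as UTF-8, read as Latin-1", Reinterpret(original, utf8, latin1));
        Print("written as Latin-1, read as Latin-1", Reinterpret(original, latin1, latin1));
    }
}
```

In the mismatched case the single character U+00E9 comes back as the two characters U+00C3 U+00A9, which is the familiar "weird characters" effect. When both sides use the same encoding, it survives unchanged.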

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 13 '05 #8
"Michael (michka) Kaplan [MS]" <mi*****@online.microsoft.com> wrote:
> Everything Jon said is true. But note that the method of using "Default" as
> the encoding type will not work well on CJK platforms since some of the byte
> combinations that will be produced will be illegal in the default system
> code page.

I don't understand that. How can using Encoding.Default produce illegal
byte sequences in the default system code page? I thought the whole
point of Encoding.Default was that it *was* the default system code
page.

> The way it is working now.... the data is improperly translated to the wrong
> code page, but then later you improperly convert it back using the same
> encoding. So it is a good example of "two wrongs making a right!"

Not entirely sure about this, either. As far as I can see the OP is
*only* encoding text to a file, not decoding it at all. The browser
control is doing that, and so long as the file is encoded with the same
code page that the browser control is decoding with, how is either of
them "wrong" as such?

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 13 '05 #9
