Determining Charset used by system or software

Question:
How can you determine the character set used by a webpage you built?

My understanding of the issue is that the character set used by an HTML
file (or any other file, for that matter) depends on your own system,
and the encoding used by it; you cannot randomly insert a

<META HTTP-EQUIV="Content-Type" content="text/html;
charset=xxx-1234-567">

entry and expect it to work. I.e., the web server doesn't re-encode your
page when serving it according to the charset specified in a meta tag.

But how do you know? Can anyone provide me with pointers as to how I
might determine this on a given machine?

Regards,
Remi.

Jan 19 '06 #1


Rémi wrote:
How can you determine the character set used by a webpage you built?
It depends on what you understand "character set" to mean. HTML 4
defines that term here:
<http://www.w3.org/TR/html4/charset.html#h-5.1>
and defines the "document character set" as the "Universal Character
Set" (UCS), which is "character-by-character equivalent to Unicode",
for all HTML documents.
My understanding of the issue is that the character set used by an HTML
file (or any other file, for that matter) depends on your own system,
and the encoding used by it; you cannot randomly insert a

<META HTTP-EQUIV="Content-Type" content="text/html;
charset=xxx-1234-567">


The "character encoding" is usually a choice your text/html editor
offers. If yours does not offer that then consider getting an editor
that does.

--

Martin Honnen
http://JavaScript.FAQTs.com/
Jan 19 '06 #2
Rémi wrote :
Question:
How can you determine the character set used by a webpage you built?

My understanding of the issue is that the character set used by an HTML
file (or any other file, for that matter) depends on your own system,
The character set used by an HTML file does not depend on your own
operating system, though there must be a *_font_* installed on the
user's operating system capable of rendering the characters used in
the HTML document.
and the encoding used by it; you cannot randomly insert a

<META HTTP-EQUIV="Content-Type" content="text/html;
charset=xxx-1234-567">

entry and expect it to work.
The HTML document must be written and saved in that
charset=xxx-1234-567 encoding to begin with: that's the web author's job.
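
A minimal Python sketch of this point, assuming a hypothetical one-line page and two encodings picked only for illustration (neither comes from the thread). The bytes are fixed when the file is saved; a label that disagrees with them does not convert anything, it only changes how the bytes get misread:

```python
# Hypothetical page text; the accented letters are what make the encoding matter.
text = "<p>Rémi, Gérard</p>"

# Saving the document is what fixes the bytes on disk.
saved_as_latin1 = text.encode("iso-8859-1")
saved_as_utf8 = text.encode("utf-8")


def read_with_label(data, declared_charset):
    """Decode bytes strictly according to the declared label."""
    try:
        return data.decode(declared_charset)
    except UnicodeDecodeError:
        return None  # the label does not fit the bytes at all


# Label matches how the file was saved: the text round-trips.
matching = read_with_label(saved_as_utf8, "utf-8")

# UTF-8 bytes labelled as ISO-8859-1: decoding succeeds but yields mojibake.
mislabelled = read_with_label(saved_as_utf8, "iso-8859-1")

# ISO-8859-1 bytes labelled as UTF-8: a strict decoder rejects them.
invalid = read_with_label(saved_as_latin1, "utf-8")
```

A real browser is more forgiving than the strict decoder used here and would typically show replacement characters rather than fail, but the underlying mismatch is the same.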

I.e., the web server doesn't re-encode your page when serving it according to the charset specified in a meta tag.

The server will serve the document according to its own settings.

"How to make the server send out appropriate 'charset' information
depends on the server.
For Apache, this can be done via the AddCharset (Apache 1.3.10 and
later) or AddType directives, for directories or individual resources
(files). With AddDefaultCharset (Apache 1.3.12 and later), it is
possible to set the default 'charset' for a whole server."
http://www.w3.org/International/O-HTTP-charset.html

"If you are serving static files, this information can be associated
with the files by the server. The method of setting up a server to pass
character encoding information in this way will vary from server to
server. You should check with the server administrator.
As an example, Apache servers typically provide a default encoding,
which can usually be overridden by user settings. For example, a user
might add the following line to a .htaccess file to serve all files with
a .html extension as UTF-8 in this and all child directories (...)"
http://www.w3.org/International/tuto...Slide0280.html

But how do you know?
Live HTTP headers is one good tool for this.

http://livehttpheaders.mozdev.org/

http://dotavery.com/blog/archive/2004/07/23/1717.aspx
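
If no browser extension is at hand, the same check can be sketched in a few lines of Python using only the standard library. The function names below are hypothetical, and the second function assumes network access to the page being checked:

```python
from email.message import Message
import urllib.request


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the value and strips any quotes.
    return msg.get_content_charset()


def served_charset(url):
    """Fetch url and report the charset advertised in the real HTTP header."""
    with urllib.request.urlopen(url) as response:
        # response.headers is an email.message.Message subclass.
        return response.headers.get_content_charset()
```

charset_from_content_type is handy for a header value copied out of a tool such as Live HTTP headers; served_charset asks the server directly. A result of None means the server sent no charset parameter at all.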

Can anyone provide me with pointers as to how I might determine this on a given machine?

Regards,
Remi.


The web author should be the one writing and saving the HTML document in
the proper character encoding. Then he should set up his web server (or
ask his web server admin) to make sure that his HTML document will be
served with the correct character encoding.

Gérard
--
remove blah to email me
Jan 20 '06 #3
On Thu, 19 Jan 2006, Rémi wrote:
How can you determine the character set used by a webpage you built?
You need to understand the character model of HTML: it's not evident
from your question that you do, and, until you do, any answer that you
get to your question is likely to be unhelpful.

The relevant section of the HTML4 specification is reasonably clearly
set out, provided one reads it without any preconceived earlier
notions from other fields (e.g. word processing).
http://www.w3.org/TR/html401/charset.html

In HTML4 the "document character set" is ISO 10646/Unicode: that's
firmly defined and not open to negotiation.

The other important issue is what's nowadays accurately known as the
"character encoding scheme", which (for historical reasons) is defined
by that misleading MIME parameter "charset=".
My understanding of the issue is that the character set used by an
HTML file (or any other file, for that matter) depends on your own
system, and the encoding used by it;
That's basically wrong: the "document character set" in HTML4 and
afterwards is Unicode. The character encoding which is served out by
the server is very often the same as is used on the system, but that
isn't necessarily so - it depends.
you cannot randomly insert a

<META HTTP-EQUIV="Content-Type" content="text/html;
charset=xxx-1234-567">

entry and expect it to work.
Correct - you can't.
I.e., the web server doesn't re-encode your page when serving it
according to the charset specified in a meta tag.
In theory, that very much depends on the server. Russian Apache can
transcode "on the fly" to any of the Cyrillic encodings which it
supports, and it will then advertise that encoding in its real HTTP
header (in that unfortunately-named "charset=" attribute), which is
the final arbiter of the matter. Your "meta http-equiv" is only a
nuisance when that happens...
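
The precedence described here (the real HTTP header first, the meta element only as a fallback) can be sketched roughly as follows. This is an illustration, not a browser's actual algorithm: the function names, the loose regular expression, and the two Cyrillic encodings in the usage note are all assumptions.

```python
import re
from email.message import Message

# Deliberately loose pattern: finds charset=... inside a <meta ...> element.
META_CHARSET = re.compile(
    rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE
)


def header_charset(content_type):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def effective_charset(content_type, body):
    """Return (charset, source), letting the HTTP header win over meta."""
    from_header = header_charset(content_type) if content_type else None
    if from_header:
        return from_header, "http-header"
    # Only look near the start of the document, where a meta element belongs.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower(), "meta"
    return None, "undeclared"
```

For a page whose meta element says windows-1251 but which a transcoding server sends with "text/html; charset=koi8-r", this returns ("koi8-r", "http-header"): the meta declaration is simply overruled.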

Servers which run on platforms whose native encoding is EBCDIC will
also want to transcode the EBCDIC content into an appropriate
ASCII-based encoding for the web.

But for most of the cases which simple-minded folk come into contact
with, it's true that the content is stored with the same encoding as
is served-out. Just how that encoding is "known", and gets into the
real HTTP header, is a matter for server configuration.
But how do you know?


It's a property which needs to be maintained alongside every text file
(not only HTML), and appropriately advertised when the document is
served-out. Just how that's done is a question of server
configuration etc.
Jan 20 '06 #4
On 19 Jan 2006, Rémi wrote:
X-HTTP-UserAgent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.12)

How can you determine the character set used by a webpage you built?
[ ... ]
But how do you know? Can anyone provide me with pointers as to how I
might determine this on a given machine?


What do you mean by "given machine"? Each and every computer on earth?
Or do you want to know for your own "Windows NT 5.0"?

Your editor should tell you. On MS Windows, you can generally use
UTF-8, UTF-16, or the Microsoft-specific code pages from
http://www.unicode.org/Public/MAPPIN...ICSFT/WINDOWS/
In Mozilla Composer, for example, you can choose
File > Save and Change Character Encoding
and save your document in MacBelgian if you like.
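
If the editor will not say, a rough Python sketch can at least narrow things down for a file you already have. This is only a heuristic, and the function names are hypothetical: a byte order mark or a clean UTF-8 decode is good evidence, but the many 8-bit code pages cannot be told apart from the bytes alone.

```python
import codecs
import locale

# Longer BOMs first, so UTF-32 LE (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(data):
    """Guess the encoding of raw file bytes; None means 'some 8-bit code page'."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return None  # bytes alone cannot say which legacy code page this is


def system_default_encoding():
    """The encoding this machine uses by default for text files."""
    return locale.getpreferredencoding(False)
```

When guess_encoding returns None, system_default_encoding is a reasonable first candidate for a file that was created on the same machine, but it remains a guess.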

--
Netscape 3.04 does everything I need, and it's utterly reliable.
Why should I switch? Peter T. Daniels in <news:sci.lang>

Jan 20 '06 #5
