Scripsit
ge********@gmail.com:
On some of my course pages, I quote (with attribution)
small sections of Wikipedia and the like. E.g., the top
of
http://en.wiktionary.org/wiki/entropy
has "entropia" in Greek font,
Technically, it has the word in Greek _characters_ (letters). This is
the key issue; fonts are secondary. The page has a style sheet that
makes special suggestions on the font of such words, in a most confusing
and tricky way.
What is the correct --maybe "coding
system" is the term?-- so that I could quote all three of
these on the same HTML page?
The proper _character encoding_ is UTF-8 in such cases. As soon as you
have Japanese, Greek, and umlaut Latin letters on one page, that's
definitely the best option. If there were just a few "special"
characters, you could present them using entity references like &ouml;
or character references like &#261;, but this gets clumsy (or requires
suitable software for generating them) if you have full sentences that
consist of "special" characters.
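To make the two notations concrete, here is a small Python sketch (an illustration only, not taken from the page under discussion). It uses just the standard library: it writes non-ASCII characters as numeric character references such as &#261;, and it decodes both entity references such as &ouml; and numeric references back into characters.

```python
import html

text = "ö ą"

# Replace every non-ASCII character with a numeric character reference.
ascii_only = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_only)

# Entity references and numeric references both decode to plain characters.
decoded = html.unescape("&ouml; &#261;")
print(decoded)
```

This is fine for a handful of characters, but a whole Greek or Japanese sentence written this way becomes unreadable in the source.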
It's not possible (in practice on web pages) to switch the character
encoding in the middle of an HTML document.
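A rough Python sketch of why a single UTF-8 document is enough: the sample string below is made up for illustration (a Greek word, a Japanese word, and German with an umlaut), and the helper name fits() is hypothetical. UTF-8 can represent the whole string, while the single-script 8-bit encodings cannot.

```python
# Illustrative sample only: Greek, Japanese and umlaut Latin letters together.
sample = "εντροπία / エントロピー / größer"

def fits(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

for name in ("utf-8", "iso-8859-1", "iso-8859-7"):
    print(name, fits(sample, name))

# UTF-8 round-trips the mixed text without loss.
assert sample.encode("utf-8").decode("utf-8") == sample
```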
In the past I've cut&pasted
a snippet from, say, wiki/entropy, into an Emacs buffer,
adjoined a "From Wiktionary http://..." and attempted to
save the buffer. Sometimes Emacs asked me for what coding
system to use --and I don't know how to placate it.
UTF-8, if Emacs can really produce it. The version of Emacs I've been
using does not deal with "special" characters, but I recently looked at
the newest version of Emacs for Windows, and it seems to have
impressive support for "special" characters.
Note that the server should be configured to send an appropriate HTTP
header. You normally do this by adding something to your .htaccess file,
and in practice you need to use the same encoding for all ".html" files
in a directory (folder), though you could use, for example, ISO-8859-1
for ".html" and UTF-8 for ".htm" files.
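On Apache with mod_mime enabled, the "something" is commonly a directive such as AddCharset UTF-8 .html, though what your server allows in .htaccess is an assumption you need to check locally. What matters to the browser is the charset parameter in the Content-Type header. The Python sketch below only shows how that parameter is read from a header value; the helper name charset_of() is hypothetical, and the header strings are examples, not output from any real server.

```python
from email.message import Message

def charset_of(content_type):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # Returns the charset in lower case, or None if no charset is declared.
    return msg.get_content_charset()

print(charset_of("text/html; charset=UTF-8"))
print(charset_of("text/html"))
```

If the second case is what your server sends, the browser has to guess the encoding or rely on a declaration inside the document.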
If I'm using multiple coding systems on the same webpage,
do I have to save the different snippets in different files
stored with different coding systems, and then
<!--#include ... -->
each of them into one webpage?
No, it won't work that way, even if your server supports SSI.
Includes result in a single document, which can have one encoding only. (I
won't dwell on <iframe>, because it's really a poor hack for things like
this, but it performs a sort of include where the included document is
displayed "autonomously" inside the main canvas and may have a different
encoding.)
FWIW, my home OS is MacOSX and I need to upload my webpages
to school. The math dept. server is probably running
Unix; when I manipulate the html files (when at work), I'm
using Emacs running on a Solaris (unix) system.
A nice mess :-) but it should be manageable when using UTF-8. When
uploading with FTP, use binary (not ASCII) mode, since no character
conversion should be performed - the data is already in a
system-independent encoding.
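If you script the upload instead of using an interactive FTP client, the same rule applies. Here is a minimal sketch using Python's standard ftplib; the function name upload_binary() and all argument values are hypothetical, and your server may well require something other than plain FTP.

```python
from ftplib import FTP

def upload_binary(host, user, password, local_path, remote_name):
    """Upload a file byte for byte, with no newline or character conversion."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        # Open in "rb" and use storbinary so the UTF-8 bytes arrive unchanged.
        with open(local_path, "rb") as source:
            ftp.storbinary("STOR " + remote_name, source)

# Hypothetical usage (host, account and file names are placeholders):
# upload_binary("ftp.example.edu", "me", "secret", "entropy.html", "entropy.html")
```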
--
Jukka K. Korpela ("Yucca")
http://www.cs.tut.fi/~jkorpela/