Bytes | Software Development & Data Engineering Community

Character set

I'm new here; got here because suddenly the question came up: is html a
7-bit or an 8-bit language? Officially, I mean.
I seem to consistently suffer from character set issues. Of course, I can
specify a specific character set - but that doesn't guarantee the receiving
computer will have that set on board.
Can anyone tell me more? Where to find guidelines, and real-world info?

Hans
Jul 20 '05 #1
"Hans Mabelis" <ha**@mabelis.nl> wrote:

> I'm new here;

Checking the FAQ is advisable then. It's a bit dusty, but checking it
is better than starting from scratch in every thread. You might start
from http://www.htmlhelp.com/faq/html/bas...l#special-char

> is html a 7-bit or an 8-bit language?

Yes. And no. You can use a 7-bit encoding, or an 8-bit encoding, or any
other encoding for an HTML document.

> I seem to consistently suffer from character set issues.

Then please specify them, with URLs, after checking the basic
resources.

> Of course, I can specify a specific character set

I'm afraid that could mean rather different things,

> - but that doesn't guarantee
> the receiving computer will have that set on board.


Indeed. The safest bet in practice is ASCII. The second-safest in
theory (and pretty much in practice too, in worldwide considerations)
is UTF-8, if you know how to produce and announce it. But I'm not sure
whether you mean character encoding, character repertoire, or font.
Three different beasts.
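
To make the distinction between repertoire and encoding concrete, here is a minimal Python sketch. It is an illustration added to the thread, not something from the original posts; the helper names `encode_or_none` and `is_seven_bit` are hypothetical. It takes one character and shows that it is outside ASCII's repertoire, takes one 8-bit byte in ISO-8859-1, and takes two bytes in UTF-8. Fonts are a separate matter that the sketch does not touch.

```python
# Illustrative sketch: one character, three encodings.
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, i.e. "é"


def encode_or_none(s, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent s."""
    try:
        return s.encode(encoding)
    except UnicodeEncodeError:
        return None


def is_seven_bit(data):
    """True if every byte fits in 7 bits (no byte has the high bit set)."""
    return all(b < 0x80 for b in data)


ascii_bytes = encode_or_none(text, "ascii")        # not in ASCII's repertoire
latin1_bytes = encode_or_none(text, "iso-8859-1")  # a single 8-bit byte
utf8_bytes = encode_or_none(text, "utf-8")         # a two-byte sequence

print("ascii:     ", ascii_bytes)
print("iso-8859-1:", latin1_bytes, "7-bit safe:", is_seven_bit(latin1_bytes))
print("utf-8:     ", utf8_bytes, "7-bit safe:", is_seven_bit(utf8_bytes))
```

The same abstract character yields different byte sequences, or none at all, depending on the encoding chosen, which is why "7-bit or 8-bit" is a property of the encoding rather than of HTML.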

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #2
On Sun, 7 Mar 2004, Hans Mabelis wrote:
I'm new here; got here because suddenly the question came up: is html a
7-bit or an 8-bit language?
No. Not since RFC2070 and HTML4.*
I seem to consistently suffer from character set issues.
That's a bit vague. Do you want to understand the underlying
principles (which is what I would recommend) or are you experiencing
specific problems (in which case you'd need to say a bit more about
what they are, and preferably put some of the problematic materials
online so that people can see for themselves what's going on).
Of course, I can specify a specific character set


Actually no. The Document Character Set is always ISO 10646/Unicode.
What you _can_ specify is the character encoding, which in MIME
terminology is confusingly called "charset". Until you understand the
difference, none of this stuff is likely to make much sense, I'm
afraid.
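
A small Python sketch may help illustrate that difference. It is an added illustration under stated assumptions, not part of the original post: the sample markup and the helper name `resolve` are hypothetical, and the standard-library `html.unescape` stands in for what a browser does with numeric character references. The point is that a reference such as `&#8364;` names a code point in the document character set (Unicode), so it resolves to the same character no matter which encoding ("charset") carried the bytes.

```python
import html

# Hypothetical sample markup; it contains only ASCII characters, so every
# encoding below can carry it unchanged.
markup = "Price: &#8364;5 and caf&#233;"


def resolve(source, charset):
    """Round-trip the markup through the given encoding, then resolve
    numeric character references against Unicode code points."""
    raw = source.encode(charset)    # bytes as they would travel on the wire
    decoded = raw.decode(charset)   # the receiver applies the announced charset
    return html.unescape(decoded)   # &#8364; -> U+20AC, &#233; -> U+00E9


results = {cs: resolve(markup, cs) for cs in ("ascii", "iso-8859-1", "utf-8")}
for charset, text in results.items():
    print(charset, "->", text)
```

All three encodings give the same resolved text, because the references are interpreted in terms of Unicode rather than in terms of the transfer encoding. The encoding itself is typically announced in the HTTP header, for example `Content-Type: text/html; charset=utf-8`.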

Some people have found the materials in my area
http://ppewww.ph.gla.ac.uk/~flavell/charset/ to be of use.

But RFC2070 itself isn't bad, even if it's somewhat dated. The
description of the character representation model in HTML/4.01 is also
reasonably clear. The hardest part is often un-learning things that
the student is convinced that they already understand.

Good luck.
Jul 20 '05 #3
