preferred charset?

I have been using the charset windows-1252 for a while, but it was
pointed out to someone else in this group recently that it's a
Microsoft creation (I'm sure I'm getting my facts wrong or skewed) and
therefore not good for cross-platform browsing.
Anyway, I am beginning my road to recovery (ie, breaking my addiction
to authoring only for IE) and I would like to know what is the
preferred charset?
I have tried a search and only find immense lists that make me
cross-eyed without ever telling me which to use to utilize a full
range of characters and have them display the way I intend on
English-speaking machines.
I'm not sure of the proper term, but I always use the & character
substitutes for anything that doesn't show up on my keyboard so,
ideally, the charset should display those, right? (For instance, if I
want to display Montréal, I would input Montr&eacute;al.)
Thanks!
Jul 20 '05
In article <MP************************@news.odyssey.net> in
comp.infosystems.www.authoring.html, Stan Brown
<th************@fastmail.fm> wrote:
> In article <in********************************@4ax.com> in
> comp.infosystems.www.authoring.html, Jane Withnolastname
>> Anyway, here's a sorta related question: is it acceptable to write
>> ASCII codes into html?
>
> Yes, though it's unnecessary excel;t for > and &.

Hmm -- I'm not sure how that got mangled. It should have read
"except for < > and &."

ASCII codes run 0 to 127; of them numbers 32 to 126 are displayable
(though 32 "displays" as a space).
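
Those ranges can be checked with a short Python sketch. Python and the names used here (printable, is_ascii) are assumptions chosen for illustration; nothing like this was posted in the thread.

```python
# Illustrative sketch: build the displayable ASCII characters (codes 32 to 126).
printable = "".join(chr(code) for code in range(32, 127))


def is_ascii(text: str) -> bool:
    """Return True if every character has a code in the range 0 to 127."""
    return all(ord(ch) < 128 for ch in text)


print(len(printable))        # 95 characters, the first being the space
print(is_ascii("Montreal"))  # True
print(is_ascii("Montréal"))  # False: é is code 233, outside ASCII
```

Anything that fails a check like is_ascii needs either a charset that covers it or a character reference.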

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://OakRoadSystems.com/
HTML 4.01 spec: http://www.w3.org/TR/html401/
validator: http://validator.w3.org/
CSS 2 spec: http://www.w3.org/TR/REC-CSS2/
validator: http://jigsaw.w3.org/css-validator/
Jul 20 '05 #11
Stan Brown <th************@fastmail.fm> writes:
> In article <87************@dinopsis.dur.ac.uk> in
> comp.infosystems.www.authoring.html, Chris Morris
> <c.********@durham.ac.uk> wrote:
>> Stan Brown <th************@fastmail.fm> writes:
>>> There is no reason to use &quot; ever, that I am aware of.
>> <img src="quotechar.jpg" alt="&quot;">
>
> <img src="quotechar.jpg" alt='"'> -- even aside from the fact that
> the example is extremely unlikely to occur in practice. :-)

<img src="quoteandapos.png" alt="&quot '">

Even more unlikely, yes, but user input could potentially contain
both. More realistically on the image:

<img src="quotation.png" alt="&quot;Quotation&quot; - John O'Name">

> "Single quote marks can be included within the attribute value when
> the value is delimited by double quote marks, and vice versa."
> http://www.w3.org/TR/html401/intro/sgmltut.html#h-3.2.2

Doing both at once remains a bit more difficult.
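
One way to handle user input that contains both kinds of quote mark is to escape it before building the attribute. The sketch below uses Python's standard html module; the alt_text value and the variable names are hypothetical, chosen only to mirror the example above.

```python
import html

# Hypothetical alt text containing both a double quote and an apostrophe.
alt_text = '"Quotation" - John O\'Name'

# With quote=True (the default), html.escape replaces " with &quot; and
# ' with &#x27;, in addition to escaping &, < and >.
escaped = html.escape(alt_text, quote=True)

tag = f'<img src="quotation.png" alt="{escaped}">'
print(tag)
```

Escaping both quote characters means the value is safe whichever delimiter the attribute uses.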

--
Chris
Jul 20 '05 #12
On Thu, 28 Aug 2003, Stan Brown wrote:
>> <img src="quotechar.jpg" alt="&quot;">
>
> <img src="quotechar.jpg" alt='"'> -- even aside from the fact that
> the example is extremely unlikely to occur in practice. :-)


http://www.cl.cam.ac.uk/~mgk25/ucs/quotes.html
http://www.cl.cam.ac.uk/~mgk25/ucs/apostrophe.html

Jul 20 '05 #13
On Thu, 28 Aug 2003, Jane Withnolastname wrote:
> Anyway, I am beginning my road to recovery (ie, breaking my addiction
> to authoring only for IE) and I would like to know what is the
> preferred charset?

There is none. It depends on *your* special situation.
http://ppewww.ph.gla.ac.uk/~flavell/...checklist.html

> Or would I be better advised to stick with regular quotes and never
> mind special ASCII-only characters?

It might be preferable to use only ASCII quotes (" '). See
http://ppewww.ph.gla.ac.uk/~flavell/...cklist.html#s3

> Thanks again. I'm feeling quite stupid right now :)

No, no. Perfectly valid questions.

> Is there a list somewhere of all the alternate characters?

You probably don't need all. Take
http://www.unics.uni-hannover.de/nht...2.html#symbols
as a starting point.

Jul 20 '05 #14
On Thu, 28 Aug 2003 08:24:25 +0100, Headless <me@privacy.net> wrote:
> Jane Withnolastname wrote:
>> I have been using the charset windows-1252 for a while, but it was
>> pointed out to someone else in this group recently that it's a
>> Microsoft creation (I'm sure I'm getting my facts wrong or skewed) and
>> therefore not good for cross-platform browsing.
>> Anyway, I am beginning my road to recovery (ie, breaking my addiction
>> to authoring only for IE) and I would like to know what is the
>> preferred charset?
>> I have tried a search and only find immense lists that make me
>> cross-eyed without ever telling me which to use to utilize a full
>> range of characters and have them display the way I intend on
>> English-speaking machines.
>> I'm not sure of the proper term, but I always use the & character
>> substitutes for anything that doesn't show up on my keyboard so,
>> ideally, the charset should display those, right? (For instance, if I
>> want to display Montréal, I would input Montr&eacute;al.)
>
> I use ISO-8859-1 because it allows me to dispense with character
> references like &eacute;. The source is much more readable without
> those codes.
> Headless


So I've got one vote for utf-8 and one vote for iso-8859-1 and
everybody else just wants to argue about quotes, which was so not the
point, to begin with.
Can I get a consensus?
It depends on what I'm using it for? OK, it's a general-use site aimed
at an English-speaking audience that may, at some time or another,
need to use non-English characters, such as é or ç. I need it to
display on all browsers, and it would be nice (but not necessary) if it
were printable on most printers.

If I understand correctly, this ISO charset will allow me to simply
input é and it will display correctly in all browsers?

Someone questioned my saying that entity rather than number was the
preferred method. Well, it's what I read on this newsgroup only a few
days ago, when someone was asking about the Euro character. The person
had said that it was written with the numerical identifier and was
advised to change it to the entity.

I apologize for apparently having no idea that ASCII stopped at 127. I
learned everything I know about ASCII in high school, something like
15 years ago. Some of it may have been wrong and some may have meshed
with what I *thought* was fact.... Anyway, thanks for straightening me
out on that.

Thanks!
Jul 20 '05 #15
In article <28*************************@rrzn-user.uni-hannover.de>
in comp.infosystems.www.authoring.html, Andreas Prilop
<nh******@rrzn-user.uni-hannover.de> wrote:
> Stan Brown <th************@fastmail.fm> wrote:
>> I didn't write the above -- in fact I disagree with it. PLEASE be
>> careful with attributions!
>
> You quoted it. Therefore the line has an additional quote mark (>).
> Some newsreaders and Google
> http://groups.google.com/groups?th=375726c4206f6e49
> even display different quoting levels in different colours.
> This is an elementary fact of Usenet quoting.


It's an elementary fact of how Usenet quoting is _supposed_ to be.
So many people in fact screw up the quote widgets that the mere
presence or absence of an extra widget is no guide to who said what.

How would you like having elementary errors attributed to you --
especially given that those attributions stay around for all time in
the Google archives?

I must confess I am surprised to see you defending misquoting
someone. By your "logic", the psalmist who wrote "The fool says in
his heart, 'there is no god'" would not object if someone claimed
that he himself said "there is no god".

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
Jul 20 '05 #16
In article <d2********************************@4ax.com> in
comp.infosystems.www.authoring.html, Jane Withnolastname
<Ja**********************@yahoo.com> wrote:
> So I've got one vote for utf-8 and one vote for iso-8859-1 and
> everybody else just wants to argue about quotes, which was so not the
> point, to begin with.
> Can I get a consensus?


I understand and sympathize with your wish to ask questions on (what
look like) small unrelated issues and get simple unambiguous
answers.

The problem is that things don't work that way. Your questions are
in fact related, and the right answer to them depends on other facts
which you have not told us. That is why some of us have posted
references by which you can educate yourself on these issues.

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
Jul 20 '05 #17
Jane Withnolastname wrote:
> If I understand correctly, this ISO charset will allow me to simply
> input é and it will display correctly in all browsers?

There are several variables that go into choosing a charset and
associated choices; read the supplied references if you're interested.

If you're not particularly interested, ISO-8859-1 should work fine,
screen and print, all browsers.
Headless

--
Email and usenet filter list: http://www.headless.dna.ie/usenet.htm
Jul 20 '05 #18
Jane Withnolastname wrote:
> So I've got one vote for utf-8 and one vote for iso-8859-1 and
> everybody else just wants to argue about quotes, which was so not the
> point, to begin with.
> Can I get a consensus?

If you're mostly going to need characters from Western European
languages, ISO-8859-1 is a reasonable choice. It would let you enter
characters directly from the normal Windows character set (as usually
configured in the U.S. and Western Europe), as long as you avoid the
range #128-#159. In ISO-8859-1 those positions are control characters
not permitted in HTML, even though the proprietary Windows character
set has printable characters there. Characters outside ISO-8859-1
have to be added via numeric references or entity names (as noted
elsewhere in this thread regarding "curly quotes"; this is also true of
characters from other languages such as Hebrew or Chinese).
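
The difference in the #128-#159 range can be seen by decoding the same byte under both encodings. This is a minimal Python sketch for illustration only; the variable names are assumptions, and the curly quote is just one example character from that range.

```python
# é is the single byte 0xE9 in both ISO-8859-1 and windows-1252.
e_acute = "é".encode("iso-8859-1")
assert e_acute == "é".encode("windows-1252") == b"\xe9"

# Byte 0x93 (decimal 147) is a left curly quote in windows-1252 ...
curly = b"\x93".decode("windows-1252")
# ... but a control character (U+0093) in ISO-8859-1.
control = b"\x93".decode("iso-8859-1")
print(repr(curly), repr(control))

# The curly quote has no ISO-8859-1 byte, so it is written as a numeric
# reference built from its Unicode number rather than its Windows position.
reference = "&#%d;" % ord(curly)
print(reference)  # &#8220;
```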

UTF-8 would permit the direct inclusion of the full range of Unicode
characters, but would require you to use an editing program that knows
how to generate data in this encoding (which requires multiple bytes for
characters outside the US-ASCII 7-bit range). In a UTF-8 document, you
wouldn't be able to paste in a character such as é or ç directly unless
your editor converted it appropriately; the single 8-bit ISO-8859-1
byte for that character wouldn't be valid UTF-8. If you just used
US-ASCII, with all other characters represented as numeric or entity
references, that would be valid under either charset, since the
US-ASCII range is represented identically in ISO-8859-1 and UTF-8.
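
A minimal Python sketch of the byte-level difference, for illustration only; the helper is_valid_utf8 and the variable names are assumptions, not anything from the thread.

```python
word = "Montréal"

latin1_bytes = word.encode("iso-8859-1")  # é becomes the single byte 0xE9
utf8_bytes = word.encode("utf-8")         # é becomes the two bytes 0xC3 0xA9


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(latin1_bytes))  # False: a lone 0xE9 is not valid UTF-8
print(is_valid_utf8(utf8_bytes))    # True

# Pure US-ASCII source with a character reference gives the same bytes
# under either encoding.
ascii_source = "Montr&eacute;al"
assert ascii_source.encode("iso-8859-1") == ascii_source.encode("utf-8")
```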
> Someone questioned my saying that entity rather than number was the
> preferred method. Well, it's what I read on this newsgroup only a few
> days ago, when someone was asking about the Euro character. The person
> had said that it was written with the numerical identifier and was
> advised to change it to the entity.

That's because that person was using an invalid numeric reference for
the Euro character; I think they were using the number of its position
in (some versions of) the proprietary Windows encoding, rather than its
proper Unicode number. The Euro character is especially problematic
because it was added to character sets only recently compared with
other special characters. It is not in ISO-8859-1, or even in early
versions of the proprietary Windows character set, and the position it
occupies in the current Windows set is a control character in
ISO-8859-1 and Unicode.
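
The Euro case can be checked the same way. This is a hedged Python sketch using only the standard library; the variable names are assumptions for illustration.

```python
import html

euro = "\u20ac"  # the Euro sign

print(ord(euro))                    # 8364, its Unicode number
print(euro.encode("windows-1252"))  # b'\x80', position 128 in the Windows set

# Position 128 in ISO-8859-1 is a control character, not the Euro sign.
assert b"\x80".decode("iso-8859-1") != euro

# ISO-8859-1 cannot represent the Euro sign at all.
try:
    euro.encode("iso-8859-1")
    encodable = True
except UnicodeEncodeError:
    encodable = False
print(encodable)                    # False

# The entity and the Unicode-based numeric reference both give the Euro sign.
assert html.unescape("&euro;") == euro
assert html.unescape("&#8364;") == euro
```

So the safe spellings in an ISO-8859-1 page are &euro; or &#8364;, never the Windows position number.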
> I apologize for apparently having no idea that ASCII stopped at 127. I
> learned everything I know about ASCII in high school, something like
> 15 years ago. Some of it may have been wrong and some may have meshed
> with what I *thought* was fact.... Anyway, thanks for straightening me
> out on that.


More character set info:
http://webtips.dan.info/char.html
http://mailformat.dan.info/body/charsets.html

--
== Dan ==
Dan's Mail Format Site: http://mailformat.dan.info/
Dan's Web Tips: http://webtips.dan.info/
Dan's Domain Site: http://domains.dan.info/

Jul 20 '05 #19
Stan Brown wrote:
> In article <28*************************@rrzn-user.uni-hannover.de>
> in comp.infosystems.www.authoring.html, Andreas Prilop
> <nh******@rrzn-user.uni-hannover.de> wrote:
>> Stan Brown <th************@fastmail.fm> wrote:
>>> I didn't write the above -- in fact I disagree with it. PLEASE be
>>> careful with attributions!
>>
>> You quoted it. Therefore the line has an additional quote mark (>).
>> Some newsreaders and Google
>> http://groups.google.com/groups?th=375726c4206f6e49
>> even display different quoting levels in different colours.
>> This is an elementary fact of Usenet quoting.
>
> It's an elementary fact of how Usenet quoting is _supposed_ to be.
> So many people in fact screw up the quote widgets that the mere
> presence or absence of an extra widget is no guide to who said what.


But how many layers of attributions should be left there? In long
threads, I sometimes see 4 or more layers of attributions at the top,
then various levels of quoting. It's too much for me to sort through.
I normally look only at the first attribution. And when I reply, I
generally trim the extra ones to keep the reply readable. (I left
them here out of deference to the immediate topic.)

--
Brian
follow the directions in my address to email me

Jul 20 '05 #20
