That long line, – or — or ...

I don't know the English word, but I'm referring to the double-dash
which is used to separate parts of a sentence. I'm using &#151; so far.
Now I saw &#8211; which is slightly shorter. Some sites use --.

Is there anything I should know to make a good decision on which to use,
other than what looks best? I think the W3C validator is always handing
out errors, even when I go through the different charsets.

I remember a time when the W3C validator would validate my sites even
though it warned me about the charset... these days, it refuses to do
anything. I don't really control the HTTP charset sent in the header.
And I never use meta definitions...
Jul 20 '05 #1
Philipp Lenssen <ph************ *@bb-k.com> wrote:
> I don't know the English word, but I'm referring to the double-dash
> which is used to separate parts of a sentence.

There are two dashes in English typography: em-dash and en-dash.

> I'm using &#151; so far.

&#151; is undefined.

> Now I saw &#8211; which is slightly shorter.

&#8211; is an en-dash, &#8212; is an em-dash.

> Some sites use --.

A surrogate from the typewriter age.

> Is there anything I should know to make a good decision on which to use,
> other than what looks best?

http://www.cs.tut.fi/~jkorpela/www/windows-chars.html
http://ppewww.ph.gla.ac.uk/~flavell/...cklist.html#s3

> I don't really control the HTTP charset sent in the header.
> And I never use meta definitions...

See http://www.w3.org/International/O-HTTP-charset
http://ppewww.ph.gla.ac.uk/~flavell/...t/ns-burp.html
for how to set the encoding ("charset") of your pages.

--
But thats what FP puts in to the page, so i asume thats correct
Harry H. Arends in microsoft.public.frontpage.client
Jul 20 '05 #2
Philipp Lenssen wrote:
> I don't know the English word, but I'm referring to the double-dash
> which is used to separate parts of a sentence. I'm using &#151; so far.

That is a control character, specifically, END OF GUARDED AREA. It's not a
dash.

> Now I saw &#8211; which is slightly shorter.

That is EN DASH, which seems much more appropriate.

> Some sites use --.

Two HYPHEN-MINUS characters. Unicode says it's "used for either hyphen or
minus sign."

It's an ambiguous character, but as it's part of ASCII, more compatible in
some circumstances. It's hard to imagine a situation in which that would
break (aural browsers?), although it's definitely a non-optimal solution.

Unicode suggests the following alternatives:

HYPHEN (U+2010)
NON-BREAKING HYPHEN (U+2011)
FIGURE DASH (U+2012)
EN DASH (U+2013)
MINUS SIGN (U+2212)

I'm not sure, but I think you are describing the EN DASH character, so you
could use &#8211; in an HTML document. EM DASH (U+2014) is another
possibility ("May be used in pairs to offset parenthetical text"), which
you could use in an HTML document as &#8212; (or any of the other ways of
specifying characters in HTML).
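
The names and decimal references for the characters listed above can be checked with a few lines of Python. This is only a sketch, assuming Python 3 and its standard `unicodedata` module; the helper name `describe` is made up for illustration and is not part of any library.

```python
import unicodedata

# Code points from the list above, plus HYPHEN-MINUS and EM DASH.
CANDIDATES = [0x002D, 0x2010, 0x2011, 0x2012, 0x2013, 0x2014, 0x2212]


def describe(code_point):
    """Return (U+XXXX label, Unicode name, decimal HTML reference)."""
    char = chr(code_point)
    return (
        f"U+{code_point:04X}",
        unicodedata.name(char),
        f"&#{code_point};",
    )


if __name__ == "__main__":
    for cp in CANDIDATES:
        print(*describe(cp))
```

The decimal form is simply the code point written in base 10, which is why U+2013 and U+2014 come out as the references 8211 and 8212.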

> Is there anything I should know to make a good decision on which to use,
> other than what looks best?

<URL:http://ppewww.ph.gla.ac.uk/~flavell/charset/> includes lots of good
information on character sets, some of which will be useful.

Obviously, the Unicode charts are of some help as well:

<URL:http://www.unicode.org/charts/>

> I think the W3C validator is always handing out errors, even when I go
> through the different charsets.

Without knowing those errors, nobody can really comment on that.

> I remember a time when the W3C validator would validate my sites even
> though it warned me about the charset... these days, it refuses to do
> anything. I don't really control the HTTP charset sent in the header.


As Alan says, "you don't have the tools necessary to do your job as a web
author".
--
Jim Dabell

Jul 20 '05 #3
In article <MP************ ************@News.Individual.DE> in
comp.infosystems.www.authoring.html, Philipp Lenssen
<ph************ *@bb-k.com> wrote:
> I don't know the English word, but I'm referring to the double-dash
> which is used to separate parts of a sentence. I'm using &#151; so far.
> Now I saw &#8211; which is slightly shorter. Some sites use --.


It's called an em dash, because it's supposed to be one em wide.
(The en dash is called that because ... well, you can guess. One en
is half of one em, though this is false for widths of dashes in some
fonts.)

The best way to put an em dash in your documents is &#8211; -- use
&#8210; for an en dash. (There are also named character entities,
but browser support is not quite as good.)

&#151; and &#150; for em and en dash are just wrong. Any numeric
character references from 128 through 159 are just wrong. They mean
different things on different machines, sometimes even in different
fonts on the same machine. In many fonts on Microsoft Windows
machines, they mean em and en dashes, and that is why a lot of
people use them. But a large minority of visitors to those Web pages
see either nothing or garbage characters in place of the dashes.
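
The two readings of 150 and 151 can be seen side by side in Python. This is a rough sketch, assuming Python 3 and that the standard `cp1252` codec is an acceptable stand-in for "what fonts on Windows show"; the function name `interpretations` is hypothetical.

```python
import unicodedata


def interpretations(n):
    """Return (Unicode category of code point n, name of byte n in windows-1252)."""
    # As a numeric character reference, n denotes Unicode code point n.
    category = unicodedata.category(chr(n))  # 'Cc' means control character
    # As a single byte in windows-1252, n denotes a different character.
    cp1252_char = bytes([n]).decode("cp1252")
    return category, unicodedata.name(cp1252_char)


if __name__ == "__main__":
    for n in (150, 151):
        print(n, *interpretations(n))
```

In Unicode both numbers fall in the control-character range, while the Windows code page assigns them to the en dash and the em dash, which is where the habit comes from.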

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://OakRoadSystems.com/
HTML 4.01 spec: http://www.w3.org/TR/html401/
validator: http://validator.w3.org/
CSS 2 spec: http://www.w3.org/TR/REC-CSS2/
validator: http://jigsaw.w3.org/css-validator/
Jul 20 '05 #4
In article <MP************ ************@News.Individual.DE> in
comp.infosystems.www.authoring.html, Philipp Lenssen
<ph************ *@bb-k.com> wrote:
> I remember a time when the W3C validator would validate my sites even
> though it warned me about the charset... these days, it refuses to do
> anything. I don't really control the HTTP charset sent in the header.
> And I never use meta definitions...

So what you're saying is that your server doesn't emit a charset,
and you refuse(*) to use the only available backup method; yet you
expect the validator to just guess. Why would you expect it to guess
right? More to the point, why would you expect your visitors'
browsers to guess right?

(*) Maybe this is just a matter of language, and you don't actually
mean you REFUSE to use them. If you mean you have not YET used meta
definitions, it's easy enough to add, between <head> and </head>:

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

If you can't get your server administrator to fix the headers sent
by the server, give thanks that there's a decent workaround in the
shape of a META header.
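
To see which charset a page actually declares, something like the following Python sketch could be used. It assumes Python 3 and only standard-library modules; `charset_from_header`, `MetaCharsetFinder` and `declared_charset` are names invented for this example, and the logic only mirrors the order described here (HTTP header first, META as the fallback), not what any particular browser or validator does.

```python
from email.message import Message
from html.parser import HTMLParser


def charset_from_header(content_type):
    """Extract the charset parameter from a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


class MetaCharsetFinder(HTMLParser):
    """Records the charset of the first <meta http-equiv="Content-Type">."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        http_equiv = (attrs.get("http-equiv") or "").lower()
        if tag == "meta" and self.charset is None and http_equiv == "content-type":
            self.charset = charset_from_header(attrs.get("content") or "")


def declared_charset(http_content_type, html_text):
    """Prefer the HTTP header; fall back to the META element."""
    from_header = charset_from_header(http_content_type)
    if from_header:
        return from_header
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    return finder.charset


if __name__ == "__main__":
    page = '<head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head>'
    print(declared_charset("text/html", page))
```

If this returns None for a page, neither the server nor the document says anything about the encoding, and every client is left to guess.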

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://OakRoadSystems.com/
Jul 20 '05 #5
In article <MP************ ************@News.Individual.DE> in
comp.infosystems.www.authoring.html, Philipp Lenssen
<ph************ *@bb-k.com> wrote:
> I don't know the English word, but I'm referring to the double-dash
> which is used to separate parts of a sentence. I'm using &#151; so far.
> Now I saw &#8211; which is slightly shorter. Some sites use --.


In my previous follow-up (which I've cancelled, for all the good
that does), I stupidly listed &#8211; and &#8210; for the dashes. It
should be
    em dash: &#8212;
    en dash: &#8211;
The reason you see &#8211; as "slightly shorter" than &#151; is that
they're different dashes. &#151; is a Microsoftism for the em dash,
which should be &#8212; -- the Microsoftism for the &#8211; en dash
is &#150;.

As I said in the earlier article, _never_ use &#128; through &#159;
in your pages. A significant minority of your visitors will not see
those characters as you intended.
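
For pages that already contain such references, a small script can rewrite them. The following is a minimal sketch, assuming Python 3 and assuming the author meant the windows-1252 characters; `fix_c1_references` is a made-up name, and only decimal references are handled.

```python
import re

# Matches decimal numeric character references such as &#151;
NUMERIC_REF = re.compile(r"&#(\d+);")


def fix_c1_references(html_text):
    """Rewrite &#128;..&#159; to the code points windows-1252 assigns to them."""

    def replace(match):
        n = int(match.group(1))
        if not 128 <= n <= 159:
            return match.group(0)  # outside the problem range: keep as is
        try:
            char = bytes([n]).decode("cp1252")
        except UnicodeDecodeError:
            return match.group(0)  # byte has no windows-1252 meaning: keep as is
        return f"&#{ord(char)};"

    return NUMERIC_REF.sub(replace, html_text)


if __name__ == "__main__":
    print(fix_c1_references("wrong &#151; and &#150; dashes"))
```

A few numbers in the range, 129 for example, have no windows-1252 character at all, so the sketch leaves those untouched for manual review. Hexadecimal references would need a second pattern.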

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://OakRoadSystems.com/
Jul 20 '05 #6
In article <Xn************ *************** **@193.229.0.31> in
comp.infosystems.www.authoring.html, Jukka K. Korpela
<jk******@cs.tut.fi> wrote:
> Most (though not all) authors who say that they don't control HTTP headers
> actually have control over the basic headers for their pages.


Is that true with Microsoft IIS servers such as the one that holds
<http://www.acad.sunytccc.edu/instruct/sbrown/calc26/default.htm>? I
spent a _very_ unproductive couple of hours at Microsoft's site
trying to figure out how to specify the charset.

--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://OakRoadSystems.com/
Jul 20 '05 #7
Stan Brown wrote:
> In article <MP************ ************@News.Individual.DE> in
> comp.infosystems.www.authoring.html, Philipp Lenssen
> <ph************ *@bb-k.com> wrote:
>> I remember a time when the W3C validator would validate my sites even
>> though it warned me about the charset... these days, it refuses to do
>> anything. I don't really control the HTTP charset sent in the header.
>> And I never use meta definitions...
>
> So what you're saying is that your server doesn't emit a charset,
> and you refuse(*) to use the only available backup method; yet you
> expect the validator to just guess. Why would you expect it to guess
> right? More to the point, why would you expect your visitors'
> browsers to guess right?


One thing I forgot to mention:

"In addition, web pages should explicitly set a character set to an
appropriate value in all dynamically generated pages."

-- <URL:http://www.cert.org/advisories/CA-2000-02.html>

--
Jim Dabell

Jul 20 '05 #8
In article <13************ *************@rrzn-user.uni-hannover.de>
in comp.infosystems.www.authoring.html, Andreas Prilop
<nh******@rrzn-user.uni-hannover.de> wrote:
> Stan Brown <th************ @fastmail.fm> wrote:
>>> Most (though not all) authors who say that they don't control HTTP headers
>>> actually have control over the basic headers for their pages.
>>
>> Is that true with Microsoft IIS servers
>
> Yes, see http://www.w3.org/International/O-HTTP-charset


Thanks, but that's pretty much what I found on Microsoft's site.
Unfortunately it doesn't help me as an author. Only server
administrators can do all that right-clicking stuff. (I should have
made it more clear in my article that I'm an author with no admin
privileges on the IIS server in question.)
--
Stan Brown, Oak Road Systems, Cortland County, New York, USA
http://OakRoadSystems.com/
Jul 20 '05 #9
Philipp Lenssen <ph************ *@bb-k.com> wrote:
> Now what's the German version of Em-Dash as described by you --
> "used e.g. to make a break in the flow of a sentence"?

I'm not familiar with the exact rules of German punctuation - I studied
German for only two years at school - and questions like this
mostly fall outside the scope of HTML authoring. The question "how do I
present such-and-such a character in HTML" is an HTML question; but
"should I use this or that character in my text" is about orthography
of the relevant language.

Well, maybe the questions are somewhat coupled. The official-looking
http://www.neue-rechtschreibung.de/r...rk_zeichen.htm
uses the undefined reference &#150; as "Gedankenstrich"! Quis custodiat
ipsos custodes? They even have, for their page that specifically
describes the norms for German orthography, a title element that
grossly violates that orthography:
<title>deutsche rechtschreibung</title>
(and doesn't describe the specific content of that page).

What the &#150; is apparently meant to mean is an EN DASH. But we
really cannot be sure. If the person who created that Web page did not
know the meanings of character references, can we know that he knows
the difference between EN DASH and EM DASH and made the correct
decision when describing the official rule? Without knowing the
original decisions, we cannot even know that they make such a
distinction either. I know that for Finnish, the official rules did not
originally make the distinction, and the current rules explicitly say
that you can choose whether you use EN DASH or EM DASH, which
corresponds to the actual usage - it varies, and sometimes it is
impossible to tell when you only see a printed text
whether there's a long variant of an EN DASH or a short variant of an
EM DASH. After all, there are no strict rules on the lengths of those
dashes, though EN DASH tends to have the width of "n" and EM DASH tends
to have the width of "m".

> The old site is actually using a single dash for the purpose of
> breaking a sentence. It looks really wrong to me.

It's definitely wrong typographically and orthographically. Yet it has
been advisable for robustness, in situations where the character
repertoire that can be reliably used is limited to ASCII, ISO Latin 1,
or something else that does not contain the real dashes.

> On a side-note, is it really wrong to use Em-Dash with spacing left
> and right? 'Cause that's what I do so far.


Depends on the orthography rules. It seems that originally EN DASH and
EM DASH were variants of one character (and official orthography rules
may still treat them that way), used so that EM DASH touches the
surrounding words whereas EN DASH is separated from them with spaces,
to compensate for the difference in dash length. But according to the
Finnish rules, for example, the use of a dash in the function discussed
here _requires_ spaces around the dash, no matter whether you use
EN DASH or EM DASH.

When considering the choice between EN DASH and EM DASH (when you have
a choice), there's the technicality that by Unicode line breaking
rules, a line break is permitted before and after an EM DASH, whereas
EN DASH belongs to category "break opportunity after". But browsers are
not very consistent in applying those rules, and IE seems to break only
after the dash character in either case, even if the dash is preceded
by a space. This is good behavior of course.

--
Yucca, http://www.cs.tut.fi/~jkorpela/
Pages about Web authoring: http://www.cs.tut.fi/~jkorpela/www.html

Jul 20 '05 #10