Bytes | Software Development & Data Engineering Community

Named vs. numerical entities

I recently read the claim somewhere that numerical entities (such as
&#8212;) have a speed advantage over the equivalent named entities
(such as &mdash;) because the numerical entity requires just a single
byte to be downloaded to the browser, while the named entity requires
one byte for each letter. (So in this case, it would presumably be one
byte vs. seven bytes.) I found this claim a little surprising -- I
would have thought *each* numeral in the numerical entity would require
one byte. Does the Web server really send the entire numerical entity
as a single... character or whatever... I don't even know how to phrase
this question correctly!

Also, which form of the entity enjoys wider browser support? They both
seem to work with modern browsers... but what about older or very buggy
browsers?
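(For the record, the byte counts in question can be checked directly; a quick Python sketch, added here purely as an illustration:)

```python
# Both entity forms travel as plain ASCII text: one byte per character.
numeric = "&#8212;"   # numeric character reference for em dash
named = "&mdash;"     # named character entity for em dash

print(len(numeric.encode("ascii")))   # 7 bytes
print(len(named.encode("ascii")))     # 7 bytes

# The literal em dash character, sent as-is instead of as an entity:
print(len("\u2014".encode("utf-8")))    # 3 bytes in UTF-8
print(len("\u2014".encode("cp1252")))   # 1 byte in windows-1252
```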
Jul 20 '05
On Sat, 17 Jul 2004, Leif K-Brooks wrote:
Yes and no. UTF-8 documents are the same size as iso-8859-1 documents,
Er, no. The characters in the upper half of iso-8859-1 need two bytes
per character in utf-8; only one in iso-8859-1.
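(This is easy to verify; a small Python check, added as an illustration only:)

```python
# Upper-half iso-8859-1 characters (U+00A0..U+00FF) cost two bytes in
# UTF-8 but only one in iso-8859-1; plain ASCII costs one byte in both.
for ch in ("a", "é", "ÿ"):
    print(ch, len(ch.encode("iso-8859-1")), len(ch.encode("utf-8")))
# "a" is 1 byte in both; "é" and "ÿ" are 1 byte in iso-8859-1, 2 in UTF-8
```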
My advice would be to replace your FTP client if it's that broken,


Cue A.Prilop and the anti-Pirard league (that's an in-joke, don't
worry about it). The FTP software is not "broken", it's got extra
functionality, for mapping between traditional MacRoman encoding and
iso-8859-1. That function needs to be off when the material isn't
encoded in MacRoman.
Jul 20 '05 #51
On Fri, 16 Jul 2004 17:50:35 GMT, "C A Upsdell"
<cupsdell0311XXX@-@-@XXXrogers.com> wrote:
I routinely worked with
various 7- and 8-bit ASCII character sets (in addition to EBCDIC and Gray
codes).


Gray codes are a red herring here. They've nothing to do with
character encodings.

Jul 20 '05 #52
C A Upsdell wrote:
"Alan J. Flavell" wrote...
C A Upsdell wrote:
Standards written later appear to have disassociated the term
ASCII from the national variants
Uh-uh, it's an international conspiracy to hide the origin of
these codes, is it? You don't seriously believe that the US
American national standards body would go making national
character codes for other countries, do you?


a paragraph like this is unworthy of you. International
conspiracy?


"Don't you know sarcasm when you hear it?!" [1]
ISO an American standards body?
Not *quite* what he was saying. ;-)
Standards being set by one national standards body without
consulting with other nations? You speak as if the US were the
only legitimate country in the world!


Hard to imagine how you could have misread that post more than you did.

--
Brian (remove ".invalid" to email me)
http://www.tsmchughs.com/
Jul 20 '05 #53
On Sat, 17 Jul 2004, Brian wrote [to C A Upsdell ]:
Hard to imagine how you could have misread that post more than you did.


It's comforting to know that someone could perceive the
discrepancy ;-)

I don't think it's worth my while to even start on responding to the
various non-sequiturs. Suffice it to say that I'm well near the front
in the crabby old b*gger stakes, I met my first computer in 1958 and
some of my early programs are for converting between different
character encodings. I've had an interest in character
representation, specifications, standards, usage and terminology in
this field ever since.

Oh, and ASCII is a 7-bit code.

all the best.
Jul 20 '05 #54
Brian wrote:
"Don't you know sarcasm when you hear it?!" [1]


That note marker was meant to be followed by a citation. Here it is: I
lifted that from one Charles Brown.

--
Brian (remove ".invalid" to email me)
http://www.tsmchughs.com/
Jul 20 '05 #55
Alan J. Flavell wrote:
On Sat, 17 Jul 2004, Leif K-Brooks wrote:

Yes and no. UTF-8 documents are the same size as iso-8859-1 documents,

Er, no. The characters in the upper half of iso-8859-1 need two bytes
per character in utf-8; only one in iso-8859-1.


Darn, you're right. Could've sworn I read that somewhere, even though it
doesn't make any sense; I guess this is why one shouldn't make Usenet
posts after midnight.
Jul 20 '05 #56
On Fri, 16 Jul 2004 19:57:52 GMT, Jonas Smithson
<sm************@REMOVETHISboardermail.com> wrote:
But I got the core information I needed: there's no
speed advantage of &#8212; over &mdash;.


...I've never understood encodings or entities either....
How portable is &mdash;, as a very general thing, relative to say,
&#160; ?

I'd always assumed that both were effectively portable, but just this
week I've been having trouble with a system (Vodafone's PartnerML)
that can't handle apostrophes from M$oft Word, that appear as &#146;

Jul 20 '05 #57
Well, thanks again to all of you; you've given me a good starting point
for figuring out at least the basics of this stuff, and you've been
very patient with me (although not, I think, with each other!).

I realize now that some of what I've been reading in books has been
misleading or simply wrong; it's odd that a Usenet newsgroup could be
more reliable than some books from reputable publishers, but that seems
to be the case... which makes it hard to know how to "filter"
information as I go forward. In fact, much as I dislike the combative
or sneering tone that many Usenetters adopt (unnecessarily, I think), I
see that the contentiousness does serve one useful purpose -- when I'm
reading a book that contains misinformation, it would be useful if a
critic could be there to step in with a demurral!

Jonas
Jul 20 '05 #58
On Sat, 17 Jul 2004, Andy Dingley wrote:
How portable is &mdash;, as a very general thing,
Portable? Utterly: it's a string of seven ASCII characters, after
all; they're unlikely to come to any harm in transit. Compatible with
all browsers and client agents? No, but it's been clear enough since
RFC1866/HTML2.0 that this was where HTML would be heading; RFC2070
actually codified it, and HTML4.0 put it into a W3C version of HTML.
That's quite a little while back now, as you may recall.
relative to say, &#160; ?
That notation is technically meaningless (in HTML) and AFAIK illegal
in XHTML. So by definition it's not compatible with anything. Sure,
it happens to pick out the displayable characters of the Windows-1252
code on a rather popular majority platform; and other browser makers
may have considered that they couldn't afford to not copy that
behaviour, no matter what the specifications said. So it gives the
visual result that the author intended; but to call that "working"
would be stretching things.
I'd always assumed that both were effectively portable,
But what do you really mean by "portable"? They are notations
constructed of strings of ASCII characters. They will certainly
-reach- every client agent in that form. If you really mean "will
client agents render them?" why not ask that question? Most will;
some won't. At least if you use &#8217; then by definition any client
agent which doesn't render them, doesn't support HTML4. If you use
numbers between 128 and 159 respectively, then you're not really
writing HTML, but some kind of quasi-MSHTML which even MS are weaning
themselves off now.
but just this week I've been having trouble with a system
(Vodafone's PartnerML) that can't handle apostrophes from M$oft
Word, that appear as &#146;


AFAIK, neither does WebTV. Works great in Lynx, of course.

If you would at least code them as 8-bit characters, instead of
&#number; notations, and send them as charset=windows-1252, then you
would at least be both (a) honest and (b) protocol-conforming. It's
not my top recommendation - far from it, but see the discussion:
http://ppewww.ph.gla.ac.uk/~flavell/...klist.html#s3a
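(The cp1252-vs-Latin-1 mismatch behind those broken apostrophes can be shown in a few lines of Python; an illustration added here, not part of the original post:)

```python
import unicodedata

# MS Word's "smart" apostrophe is byte 0x92 in windows-1252...
smart = b"\x92".decode("cp1252")
print(hex(ord(smart)), unicodedata.name(smart))
# 0x2019 RIGHT SINGLE QUOTATION MARK

# ...but in iso-8859-1, code position 0x92 is an unprintable C1 control
# code, which is why &#146; is not real HTML.
control = b"\x92".decode("iso-8859-1")
print(hex(ord(control)))   # 0x92, a C1 control, not a printable character
```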

hope that helps
Jul 20 '05 #59
On Sat, 17 Jul 2004, Alan J. Flavell wrote:
On Sat, 17 Jul 2004, Andy Dingley wrote:
How portable is &mdash;, as a very general thing, [...] relative to say, &#160; ?


That notation is technically meaningless (in HTML) and AFAIK illegal
in XHTML.


Hah! You caught me out well and truly there!!

There's nothing wrong with 160, it's a no-break space.

The windows-1252 code for your em dash would be 151. And there I was,
posting on autopilot, assuming that's what you had typed. Well, hit
me down with a clue by four...

But the rest of what I wrote was, at least, what I intended. Sorry
about that.
http://www.unicode.org/Public/MAPPIN...OWS/CP1252.TXT
Jul 20 '05 #60
Alan J. Flavell wrote:
relative to say, &#160; ?


That notation is technically meaningless (in HTML) and AFAIK illegal
in XHTML. So by definition it's not compatible with anything. Sure,
it happens to pick out the displayable characters of the Windows-1252
code on a rather popular majority platform; and other browser makers
may have considered that they couldn't afford to not copy that
behaviour, no matter what the specifications said. So it gives the
visual result that the author intended; but to call that "working"
would be stretching things.


I have no doubt I'll be proven wrong, but I don't see anything wrong
with &#160;. HTML defines the character set as UCS (which has the same
characters as Unicode), and Unicode defines character 160 as
non-breaking space: <http://www.unicode.org/charts/PDF/U0080.pdf>.
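(Indeed, this can be confirmed mechanically; a short Python check, added as an illustration:)

```python
import html
import unicodedata

# &#160; refers to Unicode code point 160, which is well defined:
ch = html.unescape("&#160;")
print(ord(ch), unicodedata.name(ch))   # 160 NO-BREAK SPACE

# It is the same character the named entity &nbsp; produces:
print(ch == html.unescape("&nbsp;"))   # True
```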
Jul 20 '05 #61
On Sat, 17 Jul 2004, Leif K-Brooks wrote:
I have no doubt I'll be proven wrong, but I don't see anything wrong
with &#160;.


Quite right. See my correction posted shortly afterwards.

Sorry again.
Jul 20 '05 #62
Jonas Smithson wrote:
Well, thanks again to all of you;
Please follow conventions and quote a few words (but not an entire
message) so that there's some context to your posts.
I realize now that some of what I've been reading in books has been
misleading or simply wrong; it's odd that a Usenet newsgroup could
be more reliable than some books from reputable publishers,
There's lots of nonsense out there. I just finished _Philip and Alex's
Guide to Web Publishing_. Well written, lots of useful general advice,
and his db skills seem pretty solid, but his web advice ranged from
lousy to uncontestably wrong. In one part, he writes, "I put in a
<HEAD> element mostly so that I can legally use the <TITLE> element...."

(sigh) Where to begin? The <TITLE> element is required for a "legal"
document (assuming "legal" means valid). And once there's a <TITLE>
element, the <HEAD> element is assumed with or without tags. In fact,
it is perfectly "legal" for a document to contain neither <HEAD> nor
<BODY> tags. Obviously, no one even remotely qualified edited the book
for technical accuracy. It's sad when a rank amateur such as myself
can pick out technical errors like that.
much as I dislike the combative or sneering tone that many
Usenetters adopt (unnecessarily, I think), I see that the
contentiousness does serve one useful purpose -- when I'm reading a
book that contains misinformation, it would be useful if a critic
could be there to step in with a demurral!


Others seem unable to grasp that concept. Have you read "How to ask
questions the smart way?" There's a section on that very topic.

--
Brian (remove ".invalid" to email me)
http://www.tsmchughs.com/
Jul 20 '05 #63
In article <10*************@corp.supernews.com>,
Brian <us*****@julietremblay.com.invalid> writes:
I realize now that some of what I've been reading in books has been
misleading or simply wrong; it's odd that a Usenet newsgroup could
be more reliable than some books from reputable publishers,
The obvious answer to that one has to be the phrase "Peer Review".
Usenet has its share of idiots, but they tend to get exposed here.
And experts sometimes get it wrong, but when that happens they'll
politely correct each other.

Another answer is the commercial pressures in the world of book
publishing. Your team (author(s) and tech reviewer(s), editor(s))
gives you a far smaller pool of expertise than the global Usenet.
That's another face of the same phenomenon that means Linux will
always be more reliable than Windows.
There's lots of nonsense out there. I just finished [chop]


That's a website I presume? Not at all the same. To get the kind
of peer review you have on Usenet needs two things: first a framework
for it (like mod_annot), second sufficient interest from a community
of experts.
much as I dislike the combative or sneering tone that many
Usenetters adopt (unnecessarily, I think),


It goes with the territory:-)

And it can serve a useful purpose. A well-aimed "don't talk nonsense"
message can help someone focus on not talking nonsense. True, there
are some incurable idiots who won't be corrected, but there's a far
greater number who can be helped into a more useful online existence.

--
Nick Kew
Jul 20 '05 #64
On Sun, 18 Jul 2004, Nick Kew wrote:
Another answer is the commercial pressures in the world of book
publishing. Your team (author(s) and tech reviewer(s), editor(s))
gives you a far smaller pool of expertise than the global Usenet.


It's unfortunate, though, that mistakes and misunderstandings can be
perpetuated in what appears to be an authoritative way by being
repeated in published books. Not quite as bad as the old books on
medicine and surgery, which repeated mumbo-jumbo from the ancient
sages with little attempt to verify its accuracy; but even so...

You can't of course ever take a Usenet posting on its own as an
authoritative answer (and some of the purported answers to - let's say
- HTTP questions on comp.lang.perl.misc have produced some answers
that were truly cringeworthy); but where it's found on an appropriate
group and has gathered a reasonable amount of discussion, the thread
as a whole can often throw light on a topic in a way that, as you say,
a single author and panel of publisher's reviewers could have missed.
Gene Spafford's elephant quote was indeed apt ;-)

I'm still figuratively beating myself with that cluestick which says
"160 is not in the range 128 to 159". Ouch.

all the best
Jul 20 '05 #65
An attribution was snipped by a certain Nick Kew. I'll reinsert it so
that the post makes more sense.

Jonas Smith wrote:
I realize now that some of what I've been reading in books has
been misleading or simply wrong; it's odd that a Usenet
newsgroup could be more reliable than some books from reputable
publishers,

Brian wrote: There's lots of nonsense out there. I just finished [chop]


Nick Kew wrote: That's a website I presume?


You presume wrong. And why did you chop the post there? You cut out
the title of the book, which I tried to indicate with faux
underlining. I suppose I should have included the author/publisher,
too. Here is the title -- again.

_Philip and Alex's Guide to Web Publishing_, by Philip Greenspun
(Morgan Kaufmann)

In any case, I was only responding to J. Smith.

--
Brian (remove ".invalid" to email me)
http://www.tsmchughs.com/
Jul 20 '05 #66
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> wrote in
comp.infosystems.www.authoring.html:
If you use
numbers between 128 and 159 respectively, then you're not really
writing HTML, but some kind of quasi-MSHTML which even MS are weaning
themselves off now.


That last part would be good news, but I must have missed it.
Microsoft are actually embracing Net standards?

--
Stan Brown, Oak Road Systems, Tompkins County, New York, USA
http://OakRoadSystems.com/
HTML 4.01 spec: http://www.w3.org/TR/html401/
validator: http://validator.w3.org/
CSS 2 spec: http://www.w3.org/TR/REC-CSS2/
2.1 changes: http://www.w3.org/TR/CSS21/changes.html
validator: http://jigsaw.w3.org/css-validator/
Jul 20 '05 #67
On Sat, 17 Jul 2004 21:13:16 -0400, Brian
<us*****@julietremblay.com.invalid> wrote:
There's lots of nonsense out there. I just finished _Philip and Alex's
Guide to Web Publishing_. Well written, lots of useful general advice,


Alex's contribution was by far the better,
Jul 20 '05 #68
On Sat, 17 Jul 2004, C A Upsdell wrote:
And as I said before, there were 8-bit ASCII sets, sometimes called extended
ASCII:
That's like calling every hamburger a "Big Mac". There are many different
hamburgers around the world - but there's only one Big Mac.
7 bits are not adequate to code characters for most European
languages, or for specialized character sets.


True - but those aren't called properly ASCII. They have other names
such as code page 850 (cp850).

--
Top-posting.
What's the most irritating thing on Usenet?

Jul 20 '05 #69
On Fri, 16 Jul 2004, C A Upsdell wrote:
Also, when I started developing software in the early 1970's -- before the
Internet, before PCs, before microprocessors -- I routinely worked with
various 7- and 8-bit ASCII character sets (in addition to EBCDIC and Gray
codes).
Perhaps _you_ could give us _your_ definition of ASCII? In your opinion,
when is a character set called "an ASCII"?
I find many Internet references denying the existence of 8-bit
ASCII,
Fine!
but I can attest that, in the early 1970s, multiple 7- and 8-bit sets
were alive and well.


No doubt - but they are not properly called ASCII.

--
Top-posting.
What's the most irritating thing on Usenet?

Jul 20 '05 #70
"Tim" <ti*@mail.localhost.invalid> a écrit dans le message de
news:1q*****************************@40tude.net
Not that long ago I tried utf-16 on several different (and *current*
versions of) web browsers. Only some could use it.

I know that's vague, and I'm not inclined to run all the tests before
I post this response. But it was enough to convince *me* that it was
a bad idea.


Good to know. Thanks.

Jul 20 '05 #71
"Andreas Prilop" <nh******@rrzn-user.uni-hannover.de> a écrit dans le
message de news:Pine.GSO.4.44.0407161529340.10170-100000@s5b003
If not, wouldn't the file get very large?


Not bigger than a simple image.


Do you mean that HTML file sizes don't matter, compared with the much
larger sizes of media files (images, JS, CSS, ...)?
If so, that's not entirely true. Keep in mind that for most websites, in
particular news websites, most of the hits are for HTML files. The other
files are in general sent with HTTP headers specifying a longer cache
period, so users' browsers just don't reload them as often as the HTML.
And when you have millions of hits a day, lowering an HTML file's size
by 1 or 2 KB is not useless at all!

Jul 20 '05 #72
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> a écrit dans le message de
news:Pi******************************@ppepc56.ph.g la.ac.uk
Greek, Cyrillic, Arabic, Hebrew are all represented
by 2 octets in utf-8. Armenian, Syriac and Coptic too, hmmm. The
cutoff (IINM) is U+07FF.


I didn't understand your last phrase (grrr, I'm really ashamed of my poor
understanding of English sometimes).
Did you mean that all the Unicode characters with a code point from 0 to
x07FF will be encoded with 2 octets only in UTF-8?

Anyway, thanks a lot for all this complementary information.

Jul 20 '05 #73
"Matt" <no******@spam.matt.blissett.me.uk> a écrit dans le message de
news:pa****************************@spam.matt.blis sett.me.uk
Set your text editor to UTF-8 encoding, and input the character. You
can copy/paste it from anywhere (e.g. character map, a handy web
page) or use your keyboard -- I edited my keyboard layout to give me
lots of useful symbols. For instance, ndash – and mdash — are AltGr +
hyphen and Shift+AltGr+hyphen now.[1]


Under Windows it can be done easily by using an editor provided by
Microsoft: the Microsoft Keyboard Layout Creator
=> http://www.microsoft.com/globaldev/tools/msklc.mspx

For French users you can find a very practical layout here:
http://home-14.tiscali-business.nl/~...35/kbdfrac.htm

Jul 20 '05 #74
On Wed, 21 Jul 2004, Pierre Goiffon wrote:
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> a écrit dans le message de
news:Pi******************************@ppepc56.ph.g la.ac.uk
Greek, Cyrillic, Arabic, Hebrew are all represented
by 2 octets in utf-8. Armenian, Syriac and Coptic too, hmmm. The
cutoff (IINM) is U+07FF.
I didn't understand your last phrase


I meant that two octets are sufficient, up to and including the
Unicode character x07FF.

See the table at e.g. http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
or http://www.ietf.org/rfc/rfc3629.txt
Did you mean that all the Unicode characters with a code point from 0 to
x07FF will be encoded with 2 octets only in UTF-8?


x0000 to x007f - one octet (ASCII)
x0080 to x07ff - two
x0800 to xffff - three

Characters above 16 bits need 4, 5 or 6 octets.
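(The boundaries in that table can be verified against any UTF-8 implementation; a Python sketch added for illustration:)

```python
# UTF-8 sequence length at each code-point boundary:
for cp in (0x007F, 0x0080, 0x07FF, 0x0800, 0xFFFF, 0x10000, 0x10FFFF):
    print(f"U+{cp:06X} -> {len(chr(cp).encode('utf-8'))} octet(s)")
# U+00007F -> 1, U+000080 -> 2, U+0007FF -> 2,
# U+000800 -> 3, U+00FFFF -> 3, U+010000 -> 4, U+10FFFF -> 4
```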
Jul 20 '05 #75
"Alan J. Flavell" <fl*****@ph.gla.ac.uk> a écrit dans le message de
news:Pi*******************************@ppepc56.ph. gla.ac.uk
I meant that two octets are sufficient, up to and including the
Unicode character x07FF.

See the table at e.g http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
or http://www.ietf.org/rfc/rfc3629.txt


I'm going to read this, thanks

Jul 20 '05 #76
On Wed, 21 Jul 2004, Alan J. Flavell wrote:
Did you mean that all the Unicode characters with a code point from 0 to
x07FF will be encoded with 2 octets only in UTF-8?


x0000 to x007f - one octet (ASCII)
x0080 to x07ff - two
x0800 to xffff - three

Characters above 16 bits need 4, 5 or 6 octets.


The maximum is 4 bytes; see version 4.0 of the Unicode Standard.
<http://www.unicode.org/versions/Unicode4.0.0/ch03.pdf>
UTF-8 sequences of 5 or 6 bytes are no longer valid.

--
Top-posting.
What's the most irritating thing on Usenet?

Jul 20 '05 #77
On Wed, 21 Jul 2004, Andreas Prilop wrote:
On Wed, 21 Jul 2004, Alan J. Flavell wrote:
Characters above 16 bits need 4, 5 or 6 octets.

OK, I considered saying "need 4 octets at present, but the mechanism
is defined for 5 or 6 to be used later". But then I didn't, in the
interests of brevity...
The maximum is 4 bytes; see version 4.0 of the Unicode Standard.
<http://www.unicode.org/versions/Unicode4.0.0/ch03.pdf>
You're referring me to section 3.9 around table 3-5, I take it?
UTF-8 sequences of 5 or 6 bytes are no longer valid.


There are no unicode characters defined yet which would require 5 or 6
bytes, agreed, and there is indeed now a rule which forbids
non-shortest encodings. So I agree: sequences of 5 or 6 bytes are not
currently valid, and the Unicode space currently extends to x10FFFF
inclusive.

But are you telling me that Unicode has now set that as a final limit,
such that 5- and 6-byte encodings will never be needed? If so, then
please show me where that is stated, and I will have learned something
new. But section 3.1 states that the Unicode repertoire "is
inherently open", which doesn't seem to me like a statement that it's
limited to x10FFFF different code points.

Jul 20 '05 #78
On Wed, 21 Jul 2004, Alan J. Flavell wrote:
The maximum is 4 bytes; see version 4.0 of the Unicode Standard.
<http://www.unicode.org/versions/Unicode4.0.0/ch03.pdf>
You're referring me to section 3.9 around table 3-5, I take it?


Yes.
So I agree: sequences of 5 or 6 bytes are not
currently valid, and the Unicode space currently extends to x10FFFF
inclusive.

But are you telling me that Unicode has now set that as a final limit,
such that 5- and 6-byte encodings will never be needed? If so, then
please show me where that is stated, and I will have learned something
new.
Section 2.8 "Unicode Allocation"
| The Unicode codespace consists of the numeric values from 0 to 10FFFF
<http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf>
<http://www.google.com/search?q=%2217+planes%22+Unicode>

Section 3.9 "Unicode Encoding Forms"
| Any UTF-32 code unit greater than 0010FFFF is ill-formed.
But section 3.1 states that the Unicode repertoire "is
inherently open", which doesn't seem to me like a statement that it's
limited to x10FFFF different code points.


Open to new characters _inside_ the range 0 to x10FFFF.

--
Top-posting.
What's the most irritating thing on Usenet?

Jul 20 '05 #79
On Wed, 21 Jul 2004, Andreas Prilop wrote:
Section 2.8 "Unicode Allocation"
Thanks.
| The Unicode codespace consists of the numeric values from 0 to 10FFFF
<http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf>
Sure, I see that. But where does it say "and is guaranteed to remain
so in all future versions of Unicode" ?
Open to new characters _inside_ the range 0 to x10FFFF.


Well, they have way too much unassigned space already for this to be a
likely issue in -my- lifetime, so I don't know why I'm being so
pedantic.

thanks
Jul 20 '05 #80
On Wed, 21 Jul 2004, Alan J. Flavell wrote:
| The Unicode codespace consists of the numeric values from 0 to 10FFFF
<http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf>


Sure, I see that. But where does it say "and is guaranteed to remain
so in all future versions of Unicode" ?


Here? <http://www.google.com/search?q=%22limit+future+code%22>

--
Top-posting.
What's the most irritating thing on Usenet?

Jul 20 '05 #81
On Thu, 22 Jul 2004, Andreas Prilop wrote:
On Wed, 21 Jul 2004, Alan J. Flavell wrote:
Sure, I see that. But where does it say "and is guaranteed to remain
so in all future versions of Unicode" ?


Here? <http://www.google.com/search?q=%22limit+future+code%22>


Uh-uh, that's an indirect address reference to
http://www.unicode.org/unicode/faq/utf_bom.html#9

If they're so definite about it, one wonders why they didn't see fit
to put it into the formal specification, rather than only in their
FAQ. But yes, you're right - although I -had- read through the FAQ on
an earlier occasion, I hadn't spotted that the FAQ addressed this
specific question. OK, EOT for me.
Jul 20 '05 #82
