Using XHTML entities in XML documents: Legal?


I have a need to include Greek letters in some of my XML documents (the
documents contain astronomical information and many stars are named using
Greek letters). Following some earlier postings on the subject of
entities, I did the following:

---- top of file ----
<?xml version="1.0"?>

<!-- I added this to an existing document. -->
<!DOCTYPE observation-set [
<!ENTITY % HTMLsymbol PUBLIC
"-//W3C//ENTITIES Symbols for XHTML//EN"
"xhtml-symbol.ent">
%HTMLsymbol;
]>

<?xml-stylesheet type="text/xsl" href="AOML.xsl" ?>

<!-- This is the existing document root. -->
<observation-set
xmlns="http://www.ecet.vtc.edu/~pchapin/AOML_0.0"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.ecet.vtc.edu/~pchapin/AOML_0.0
AOML.xsd">

<!-- Now I believe I can use &alpha;, &beta;, etc. here. -->

</observation-set>
---- end of file ----

I'm attempting to borrow the entity definitions that were created for
XHTML. I downloaded the file xhtml-symbol.ent from the W3C and have a
copy locally in the same folder as the XML document that references it.
My desire was to now be able to use things like &alpha; and &beta; in my
XML document.

This mostly works. In particular, it works fine with IEv6. My XML
documents also validate (no complaints about undefined entities) with XSV
and XMLSpy (using MSXML, I believe). Also if I use Xalan to style
the document, it generates appropriate HTML. In fact I was able to prove
that Xalan is reading the external file containing the entity
definitions: I temporarily changed the definition of &alpha; to be the
same as &beta;. When Xalan wrote its output it serialized the character I
had written as "&alpha;" in the XML document into "&beta;" in the output
HTML document. Very cool.

However, with Mozilla v1.3 I get "undefined entity" errors. Even if I
include in the internal subset an explicit definition of the entities I'm
using, Mozilla still doesn't seem to notice them. Is this a problem with
Mozilla or am I missing something in my document? It is my desire to
support Mozilla so disregarding this problem is not really an option.

On a possibly related note, the Xerces (v2.3.0) parser seems to notice
the entities but it produces errors of this sort:

[Error] AO-2003-06-16.xml:16:75: Element type "observation-set" must be
declared.

The (line, column) of the error points to the end of the opening
observation-set tag. This error does not occur if I remove the <!DOCTYPE
observation-set [...]>. It almost seems as if Xerces sees the DOCTYPE
declaration and commits itself to the idea that a DTD is being used when,
in fact, the document uses an XML Schema. (It complains about all the
other elements as well, not just the document element). However, neither
XSV nor MSXML seemed to have that problem. Is this an issue with Xerces
or is mixing DOCTYPE and XML Schemas a bad thing?

Thanks for any clarification you can provide.

Peter

Jul 20 '05 #1


Peter C. Chapin wrote:
> I have a need to include Greek letters in some of my XML documents (the
> documents contain astronomical information and many stars are named using
> Greek letters). Following some earlier postings on the subject of
> entities, I did the following:
>
> ---- top of file ----
> <?xml version="1.0"?>


Since you don't specify an encoding, your XML document is read as UTF-8
or UTF-16, both of which can encode Greek letters directly without any
need for entities.
So why do you need to use entities?
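
To illustrate the point, here is a minimal sketch. Python and its standard xml.etree.ElementTree module are used purely for illustration and are not part of the original exchange, and the <star> element is a made-up stand-in for the real vocabulary. A document with no encoding declaration and no byte order mark is read as UTF-8, so a literal Greek letter needs no entity at all:

```python
import xml.etree.ElementTree as ET

# No encoding declaration and no byte order mark, so the parser assumes UTF-8.
# The <star> element is hypothetical; "\u03b1" is GREEK SMALL LETTER ALPHA.
document = '<?xml version="1.0"?>\n<star>\u03b1 Centauri</star>'.encode("utf-8")

star = ET.fromstring(document)

# Print the code point rather than the glyph so the output is plain ASCII.
print(f"U+{ord(star.text[0]):04X}")
```

The catch, of course, is that the editor used to write the file has to be able to save UTF-8 and display the letter.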


--

Martin Honnen
http://JavaScript.FAQTs.com/

Jul 20 '05 #2
In article <3F************ **@t-online.de>, Ma***********@t-online.de
says...
> Since you don't specify an encoding, your XML document is read as UTF-8
> or UTF-16, both of which can encode Greek letters directly without any
> need for entities.
> So why do you need to use entities?


Well, I don't have an editor that allows me to easily enter or view Greek
letters. I have been meaning to look into the matter of editing "Unicode"
files (that is, files that use characters above U+007F to a non-trivial
extent). I haven't walked that road as yet and I guess I figured the
entity solution would address the matter for the half dozen or so Greek
characters that I need per document in my current situation.

Since posting my original note I spent some time with the Mozilla bug
database. It turns out that Mozilla (at least in older versions) doesn't
read external entities (apparently non-validating parsers are not required
to do so). Furthermore, once it encounters a reference to an external
parameter entity that it does not read, it stops processing the entity
declarations that follow in the internal DTD subset. Apparently this is
according to the XML specification.

However, contrary to my earlier assertion, Mozilla does read the internal
DTD subset. The reason it didn't notice the Greek entity definitions that I
tried before was that I put them *after* the reference to the external
entity. When I remove the external entity entirely it works fine.
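
The ordering effect can be reproduced outside the browser. The following is only a sketch, assuming Python's standard xml.etree.ElementTree module, which is built on the expat parser and, as far as I can tell, never fetches external parameter entities. The helper parse_star and the <star> element are hypothetical names invented for the illustration.

```python
import xml.etree.ElementTree as ET


def parse_star(internal_subset):
    """Parse a tiny document using the given internal DTD subset.

    Returns the text of <star>, or None if the parser rejects the document.
    """
    document = (
        '<?xml version="1.0"?>\n'
        f"<!DOCTYPE star [\n{internal_subset}\n]>\n"
        "<star>&alpha; Centauri</star>"
    )
    try:
        return ET.fromstring(document).text
    except ET.ParseError:
        return None


# The external parameter entity from the original post. The parser used here
# does not read it, so the .ent file does not have to exist.
EXTERNAL = (
    "<!ENTITY % HTMLsymbol PUBLIC\n"
    '  "-//W3C//ENTITIES Symbols for XHTML//EN" "xhtml-symbol.ent">\n'
    "%HTMLsymbol;"
)
ALPHA = '<!ENTITY alpha "&#945;">'

before = parse_star(ALPHA + "\n" + EXTERNAL)  # declared before the reference
after = parse_star(EXTERNAL + "\n" + ALPHA)   # declared after the reference
alone = parse_star(ALPHA)                     # no external entity at all

# ascii() keeps the output printable on any terminal.
print(ascii(before), ascii(after), ascii(alone))
```

With this parser the declaration is honoured when it precedes %HTMLsymbol; or when the external entity is left out, and it is skipped when it follows the reference. That matches what I saw in Mozilla, though other parsers may well differ.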

Thus I can get the effect I want if I define all the Greek letter
entities in the internal DTD subset of each document that I produce. That
is not ideal but it is workable, I think.
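
Typing the same declarations into every document is tedious, so they could be generated. This is a rough sketch only, assuming Python and its standard html.entities table of XHTML entity names. The GREEK list and the internal_subset helper are hypothetical, and the namespace and schema attributes of the real document are left out for brevity.

```python
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Hypothetical selection; list whichever letters the observations need.
GREEK = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta"]


def internal_subset(names):
    """Return <!ENTITY ...> declarations for the given XHTML entity names."""
    return "\n".join(
        f'<!ENTITY {name} "&#{name2codepoint[name]};">' for name in names
    )


# Show the declarations that would be pasted into each document.
print(internal_subset(GREEK))

# Check that a document carrying the generated subset resolves the entities.
document = (
    '<?xml version="1.0"?>\n'
    f"<!DOCTYPE observation-set [\n{internal_subset(GREEK)}\n]>\n"
    "<observation-set>&alpha; Lyrae, &beta; Cygni</observation-set>"
)
root = ET.fromstring(document)
```

The printed lines can be pasted between the square brackets of the DOCTYPE declaration in place of the external entity reference.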

Peter

Jul 20 '05 #3
Peter C. Chapin <pc***********@ ecet.vtc.edu> wrote:
>> So why do you need to use entities?
>
> Well, I don't have an editor that allows me to easily enter or view Greek
> letters.


Then use &#number; references.
http://www.unics.uni-hannover.de/nht...al2.html#greek
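
As a small illustration (Python is used here only as a convenient checker, and the <star> element and the char_ref helper are made up), numeric character references need no DTD or entity declaration, and the decimal and hexadecimal forms are equivalent:

```python
import xml.etree.ElementTree as ET


def char_ref(character):
    """Return the decimal character reference for a single character."""
    return f"&#{ord(character)};"


# "\u03b1" is GREEK SMALL LETTER ALPHA; its decimal code point is 945.
alpha_ref = char_ref("\u03b1")
print(alpha_ref)

# The same letter written as a decimal and as a hexadecimal reference.
document = f"<star>{alpha_ref} Centauri / &#x3B1; Centauri</star>"
text = ET.fromstring(document).text
```

Both references come back from the parser as the same character.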

--
Top posting.
What's the most irritating thing on Usenet?
Jul 20 '05 #4

In article <05************ *************@r rzn-user.uni-hannover.de>,
nh******@rrzn-user.uni-hannover.de says...
>> Well, I don't have an editor that allows me to easily enter or view Greek
>> letters.
>
> Then use &#number; references.


The document is far more readable and writable using, for example,
"&alpha;" than it is using "&#945;". While the numeric references do work
they don't really seem like a very nice solution in this case. I read and
write these documents manually.

Peter

Jul 20 '05 #5
