Bytes | Software Development & Data Engineering Community

XML to SGML entities

Hello,

I was wondering if anybody could point me in the right direction
regarding this.

I have Unicode characters written as hexadecimal character references
in an XML file, and I need to convert them to ISO entities. Are there
facilities available to do this easily, or do I have to parse all the
text and convert everything manually? If that's what I have to do, is
there any code already available that would point me in the right
direction?

This is my XML snippet.

XML:

<?xml version = "1.0" encoding = "UTF-8"?>
<root>
<para>&#x212B; &#x00C5; &#x00E5; &#x00C3; &#x03B2; &#x03B5; &#x03F0;
&#x03BB; &#x03BC;</para>
</root>

I basically need to get something like this:

SGML:

<root>
<para>&angst; &Aring; &aring; &Atilde; &b.beta; &b.epsi; &b.kappav;
&b.lambda; &b.mu;</para>
</root>

Thanks

Regards
Jeff

Dec 4 '06 #1
Jean-François Michaud wrote:
I have unicode entities in an XML in hexadecimal format and I need to
be able to convert to ISO entities. [...]

One way is to use XSLT 2.0 character maps. If I save your file as
ent.xml, Saxon 8 gives the following output when run with the
stylesheet at the end. It's not quite the result you asked for, but I
think the bold Greek should map to the characters in Plane 1, so the
grk3 entity names are used rather than grk4. (It would be easy for you
to take a local copy and change that, though.)

David

$ saxon8 ent.xml ent.xsl
<?xml version="1.0" encoding="UTF-8"?><root>
<para>&angst; &Aring; &aring; &Atilde; &beta; &epsiv; &kappav;
&lambda; &mu;</para>
</root>

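<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<!-- Import the character map that defines the ISO 9573-2003 entity names -->
<xsl:import
href="http://www.w3.org/2003/entities/iso9573-2003/iso9573-2003map.xsl"/>
<!-- Apply that map when the result is serialized -->
<xsl:output use-character-maps="iso9573-2003"/>
<!-- Copy the whole input document through unchanged -->
<xsl:template match="/">
<xsl:copy-of select="/"/>
</xsl:template>

</xsl:stylesheet>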
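
The "take a local copy and change that" suggestion could look roughly
like the sketch below. It is only an illustration, not the W3C map
itself: the map name "local-iso" is made up for this example, and the
entity names are simply the ones requested in the original question
(including the b.* names for the plain Greek letters). It assumes an
XSLT 2.0 processor such as Saxon 8, as used above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical local character map ("local-iso" is an assumed name).
       Each character on the left is written out as the literal string
       on the right when the result tree is serialized. -->
  <xsl:character-map name="local-iso">
    <xsl:output-character character="&#x212B;" string="&amp;angst;"/>
    <xsl:output-character character="&#x00C5;" string="&amp;Aring;"/>
    <xsl:output-character character="&#x00E5;" string="&amp;aring;"/>
    <xsl:output-character character="&#x00C3;" string="&amp;Atilde;"/>
    <xsl:output-character character="&#x03B2;" string="&amp;b.beta;"/>
    <xsl:output-character character="&#x03B5;" string="&amp;b.epsi;"/>
    <xsl:output-character character="&#x03F0;" string="&amp;b.kappav;"/>
    <xsl:output-character character="&#x03BB;" string="&amp;b.lambda;"/>
    <xsl:output-character character="&#x03BC;" string="&amp;b.mu;"/>
  </xsl:character-map>

  <!-- Use the local map and drop the XML declaration,
       since the target here is SGML -->
  <xsl:output use-character-maps="local-iso" omit-xml-declaration="yes"/>

  <!-- Copy the whole input document through unchanged -->
  <xsl:template match="/">
    <xsl:copy-of select="/"/>
  </xsl:template>

</xsl:stylesheet>
```

Only the nine characters from the sample are covered here; any other
non-ASCII character would be written out as-is in the output encoding.
The result is also no longer well-formed XML unless the entities are
declared somewhere, which is fine if it is going straight into an SGML
toolchain that knows the ISO entity sets.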
Dec 4 '06 #2

David Carlisle wrote:
one way is to use xslt2 character maps [...]

Wow! More than I could ever ask for. This is exactly the kind of stuff
I was looking for. Thank you very much for your help!! I will look
into this more closely.

Warm regards
Jean-Francois Michaud

Dec 5 '06 #3
