translating LaTeX to XML

I have a LaTeX document describing a long list of items that I want to
translate to XML so that I can treat them as a database. I've written a
Perl script to do the basic translation, and a basic DTD file, but I am
stumped at translating LaTeX character encodings to something XML won't
choke on.

I found GNU recode to solve most of this, using

cat milestone.tex | recode -d tex..xml | itemdb -s xml -o milestone.xml

where itemdb is my perl script, and I've gotten rid of the diacritical
characters, but I'm getting errors with &s in URLs:

XML Parsing Error: not well-formed
Location: file:///home/friendly/SCS/Gallery/milestone/Private/milestone.xml
Line Number 2397, Column 101: <commentary
url="http://historical.library.cornell.edu/cgi-bin/cul.math/docviewer?did=00620001&seq=3"
text="Text of d'Ocagne'sbook on parallel coordinates" />
----------------------------------------------------------------------------------------------------^

(This is from the Mozilla browser, trying to load the milestone.xml file.)
I'm pretty much a newbie with XML, so I don't know whether this is a
problem with my DTD, or what tools are available (Debian Linux).

-Michael
--
Michael Friendly Email: fr******@yorku.ca
Professor, Psychology Dept.
York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT M3J 1P3 CANADA

Jul 20 '05 #1


In article <c4**********@sunburst.ccs.yorku.ca>,
Michael Friendly <fr******@yorku.ca> wrote:

% url="http://historical.library.cornell.edu/cgi-bin/cul.math/docviewer?did=00620001&seq=3"

You must encode & as &amp; and < as &lt; anywhere they appear as text in
any XML document (attribute values included), so this should be

url="http://historical.library.cornell.edu/cgi-bin/cul.math/docviewer?did=00620001&amp;seq=3"

("Must" might be a bit strong here: you could get away with &#x26; and &#x3c;
if you find that more aesthetically pleasing.)
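
The same fix can be applied where the XML is generated, rather than by hand.
What follows is only a minimal sketch in Python, not the itemdb script from
the question (which is Perl and isn't shown): the commentary_element helper
is a made-up name for illustration, and the url and text values are copied
from the error message above. It uses the standard library's quoteattr,
which escapes &, < and > and adds the surrounding quotes, and then parses
the result to confirm that it is well-formed.

```python
from xml.dom import minidom
from xml.sax.saxutils import quoteattr


def commentary_element(url, text):
    """Build a <commentary/> element with safely escaped attribute values.

    Hypothetical helper for illustration; the element and attribute
    names are taken from the error message in the question.
    """
    # quoteattr escapes &, < and > and wraps the value in quotes.
    return "<commentary url=%s text=%s />" % (quoteattr(url), quoteattr(text))


url = ("http://historical.library.cornell.edu/cgi-bin/cul.math/"
       "docviewer?did=00620001&seq=3")
text = "Text of d'Ocagne'sbook on parallel coordinates"

element = commentary_element(url, text)

# parseString raises xml.parsers.expat.ExpatError if the input
# is not well-formed, so reaching the next line means it parsed.
parsed = minidom.parseString(element)

# The parser turns &amp; back into a plain & when reading the attribute.
roundtrip_url = parsed.documentElement.getAttribute("url")

print(element)
print(roundtrip_url == url)
```

Escaping at generation time means every URL gets the same treatment, and
the parse step gives a quick well-formedness check without loading the
file in a browser.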

--

Patrick TJ McPhee
East York Canada
pt**@interlog.com
Jul 20 '05 #2
