XML for multilingual technical documentation - seven questions


As a technical author and translator, I am highly interested in single
source/multi format publishing. Meaning: I'd like to keep manuals,
technical specifications etc. in multiple languages (English, French)
in a *single* repository (<- files or database) and generate documents
in the various languages and target formats (XHTML, PDF, HTML Help,
Text) on demand.

I am not a programmer, though, and can't develop my own tools, but of
course I am willing to invest money and spend time learning.

I understand that I could use an existing XML Schema such as DocBook or
cook my own and then use XSLT to generate the various output formats.

Since I'm not keen on reinventing the wheel, I'd like to ask you what
would be a good (proven) way to achieve the following. I am looking for
a set of tools and technologies that will work together reliably, and I
assume others have solved these problems before. I'd be grateful if
someone could answer a few of the following questions.
1) Authoring tool?

I guess using a native XML editor from the start would be a better
approach than exporting from some proprietary format such as
FrameMaker. I have considered <oxygen/> and the XML Mind Editor. Are
these good editors for daily work on big, complex documents? What other
products would you recommend for a user fluent with plain text editors,
Frame and Dreamweaver? (A *cough* WYSIWYG environment (using some CSS)
would be appreciated.)
2) Appropriate XML Schema/DTD: DocBook or ..?

DocBook is impressive, but - forgive my blasphemy - seems a bit baroque
while missing pieces I would need for certain clients/technologies. Now
this may seem a bit megalomaniac, but if I wanted to build my own XML
Schema - what tools should I use? The Altova product suite seems
professional, but maybe overkill for a freelancer. What would you
suggest?
3) XSLT

I understand the XSLT processor does most of the magic that turns XML
into target formats. Assuming you'd want XHTML, pretty PDFs and HTML
Help - what would be my weapon of choice as a non-programmer? I'd like
to be able to modify PDF and HTML output, so a "black box" app is out of
the question.
4) Multilingual documents

To prevent version drift, I would like to keep the text for all
languages in the same file. I.e. the (imaginary) <head1> tag should
hold both the English "Introduction" and the French "Préliminaire".
What's the best approach to achieve this? I can hardly have two <head1
lang="FOO"> tags when my DTD/Schema allows only one. Namespaces?
5) Index/TOC/Document outline

A (multi-level) Index, Table of Contents and maybe a (collapsible)
outline view of a document - does XSLT take care of these? Are there
e.g. sample XSLT stylesheets that can generate a hyperlinked outline of
an XML document in HTML?
6) Conditional Text

What I mean here is text that can be filtered out when generating
target formats. Assuming I want to do something like "Only generate the
digest version of the manual" - does DocBook allow me to tag sections
as "Only for Digest Version"? What would be the generic approach to do
this in XML, and how can I combine them on rendering ("Only for PDF"
AND "Digest")?
7) CAT translation

Integration of Translation Memory Tools: Is there an easy way to feed
XML (e.g. DocBook) documents into CAT tools? Ideally, this would accept
<para lang="EN">Source</para> and generate <para
lang="FR">Target</para> from a TU database.
Thank you for helping.

Jul 20 '05 #1
David Winter wrote:

As a technical author and translator, I am highly interested in single
source/multi format publishing. Meaning: I'd like to keep manuals,
technical specifications etc. in multiple languages (English, French)
in a *single* repository (<- files or database) and generate documents
in the various languages and target formats (XHTML, PDF, HTML Help,
Text) on demand.
Yep. Common requirement.
I am not a programmer, though, and can't develop my own tools, but of
course I am willing to invest money and spend time learning.

I understand that I could use an existing XML Schema such as DocBook or
cook my own and then use XSLT to generate the various output formats.
DocBook is excellent for computer documentation. It may be overkill for
technical documentation in other fields (eg maintenance manuals for
washing machines) or may simply not provide what is needed in those
fields. It's a popular misconception that DocBook is for *any* technical
documentation, computing or not. And yes, XSLT can be used to transform
your XML.
Since I'm not keen on reinventing the wheel, I'd like to ask you what
would be a good (proven) way to achieve the following. I am looking for
a set of tools and technologies that will work together reliably, and I
assume others have solved these problems before. I'd be grateful if
someone could answer a few of the following questions.
1) Authoring tool?

I guess using a native XML editor from the start would be a better
Essential.
approach than exporting from some proprietary format such as
FrameMaker. I have considered <oxygen/> and the XML Mind Editor. Are
these good editors for daily work on big, complex documents? What other
products would you recommend for a user fluent with plain text editors,
Frame and Dreamweaver? (A *cough* WYSIWYG environment (using some CSS)
would be appreciated.)
Don't be fooled by WYSIWYG. Unless it provides *all* your formatting needs,
it may be more of a hindrance than a help. An editor sold on the spurious
basis that it can use fonts and colour does not IMHO qualify as WYSIWYG.

Plaintext: Emacs with psgmls and nsgmls is free and runs on all platforms.

High-end: XML Spy and EPIC are excellent but to do *all* your formatting
you will almost certainly need to start programming them internally.
2) Appropriate XML Schema/DTD: DocBook or ..?

DocBook is impressive, but - forgive my blasphemy - seems a bit baroque
Quod scripsi scripsi (ut supra).
while missing pieces I would need for certain clients/technologies. Now
this may seem a bit megalomaniac, but if I wanted to build my own XML
Schema - what tools should I use? The Altova product suite seems
professional, but maybe overkill for a freelancer. What would you
suggest?
I write DTDs in Emacs with tdtd-mode, and I'll let you into a secret:
most of the other DTD and Schema writers I know do the same -- eventually.
Graphical structure-design programs are excellent to get the thing up
and running in outline, though.
3) XSLT

I understand the XSLT processor does most of the magic that turns XML
into target formats. Assuming you'd want XHTML, pretty PDFs and HTML
Help - what would be my weapon of choice as a non-programmer? I'd like
to be able to modify PDF and HTML output, so a "black box" app is out of
the question.
Don't even think of trying to modify PDF. It's an end-of-line format and
is not designed to be modified, just recreated afresh. In fact, don't try
to modify the HTML either. Always fix the problem in the XSLT (or the XML,
depending on what the problem is) and then recreate the output.

XSL-FO (run through an FO formatter) will create PDF directly, but at the
expense of having to reinvent all the formatting wheels -- by hand. I prefer
to use XSLT to create LaTeX, and rely on LaTeX because it already knows more
about document formatting than anything else. But it does mean learning some
LaTeX (not hard, just different).
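
As a rough illustration of the XML-to-LaTeX route described above, here is a minimal sketch. It uses Python's standard library instead of XSLT only so that it runs on its own; the element names (section, title, para) and the function names are assumptions made up for the example, not taken from any particular DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical source fragment; the element names are illustrative only.
SOURCE = """<section><title>Safety &amp; care</title>
<para>Unplug the unit. Costs 5% less.</para></section>"""

# A few characters that LaTeX treats specially. Backslash, ~ and ^ are
# deliberately not handled in this sketch.
LATEX_SPECIALS = {"&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
                  "_": r"\_", "{": r"\{", "}": r"\}"}


def tex_escape(text):
    """Escape the special characters listed in LATEX_SPECIALS."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def to_latex(section):
    """Turn one <section> into a LaTeX \\section followed by its paragraphs."""
    title = tex_escape(section.findtext("title", default=""))
    lines = [r"\section{" + title + "}"]
    for para in section.findall("para"):
        lines.append("")  # a blank line starts a new paragraph in LaTeX
        lines.append(tex_escape(para.text or ""))
    return "\n".join(lines)


latex_source = to_latex(ET.fromstring(SOURCE))
print(latex_source)
```

The output is LaTeX source, which LaTeX then typesets with its own ToC, column and page-layout machinery. A stylesheet in XSLT would do the same mapping with one template per element.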
4) Multilingual documents

To prevent version drift, I would like to keep the text for all
languages in the same file. I.e. the (imaginary) <head1> tag should
hold both the English "Introduction" and the French "Préliminaire".
What's the best approach to achieve this? I can hardly have two <head1
lang="FOO"> tags when my DTD/Schema allows only one. Namespaces?
Possibly. Or maybe <head lang="fr">Préliminaire</head> and
<head lang="en">Introduction</head>. These are a form of "effectivities"
(ie they come into effect only when picked up by your XSLT when you
specify "use lang='fr' this time"). Many DTDs do allow precisely this
kind of thing, specifically for this purpose (and more commonly, text
applicable to related but different product lines).

The alternative is to use a translating editor, if you can find one. There
was a superb one put out by CITEC years ago, for SGML, which displayed your
source language in the top window, and in the bottom window it put the
exact same elements, only empty, ready to fill in the target language
(subelements in mixed content were omitted, of course, as they would
likely occur in different sequences in a target language). But this has
long since disappeared, alas, and I've never seen a replacement.
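
The lang-attribute approach above (both languages side by side, one picked at transform time) can be sketched as follows. This is a minimal illustration in Python's standard library, not an XSLT stylesheet; the sample document and the select_language function are hypothetical. The attribute is called lang as in the thread; XML itself reserves xml:lang for this purpose.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical bilingual fragment: sibling elements carry a lang attribute.
SOURCE = """<section>
  <head lang="en">Introduction</head>
  <head lang="fr">Préliminaire</head>
  <para lang="en">This manual describes the product.</para>
  <para lang="fr">Ce manuel décrit le produit.</para>
  <figure file="overview.png"/>
</section>"""


def select_language(root, lang):
    """Return a copy of the tree keeping only the elements for `lang`.

    Elements without a lang attribute are treated as language-neutral
    and are always kept.
    """
    result = copy.deepcopy(root)
    # Materialise the element list first, because children are removed
    # while the tree is being walked.
    for parent in list(result.iter()):
        for child in list(parent):
            child_lang = child.get("lang")
            if child_lang is not None and child_lang != lang:
                parent.remove(child)
    return result


french = select_language(ET.fromstring(SOURCE), "fr")
print([(el.tag, el.text) for el in french])
```

The DTD or schema has to allow the repeated elements, of course; the filter only decides which of them reach the output.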
5) Index/TOC/Document outline

A (multi-level) Index, Table of Contents and maybe a (collapsible)
outline view of a document - does XSLT take care of these? Are there
e.g. sample XSLT stylesheets that can generate a hyperlinked outline of
an XML document in HTML?
You can program these in XSLT very easily. There are indeed sample XSLT
stylesheets for (eg) DocBook doing exactly this.
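
To give an idea of how little is involved, here is a minimal sketch of a hyperlinked outline built from nested sections. It is written in Python's standard library so it runs as-is; the element names (book, section, title), the id attribute and the outline function are assumptions for the example, and a real DocBook stylesheet does considerably more.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical source: nested <section> elements with an id and a <title>.
SOURCE = """<book>
  <section id="intro"><title>Introduction</title>
    <section id="scope"><title>Scope &amp; audience</title></section>
  </section>
  <section id="install"><title>Installation</title></section>
</book>"""


def outline(parent):
    """Build a nested <ul> of links for every <section> below `parent`."""
    sections = parent.findall("section")
    if not sections:
        return ""
    items = []
    for sec in sections:
        title = escape(sec.findtext("title", default="(untitled)"))
        link = f'<a href="#{escape(sec.get("id", ""))}">{title}</a>'
        # Recurse so that subsections become a nested list inside the item.
        items.append(f"<li>{link}{outline(sec)}</li>")
    return "<ul>" + "".join(items) + "</ul>"


toc_html = outline(ET.fromstring(SOURCE))
print(toc_html)
```

The recursion is the whole trick: each section emits a link to itself and then asks its own subsections for their list.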
6) Conditional Text

What I mean here is text that can be filtered out when generating
target formats. Assuming I want to do something like "Only generate the
digest version of the manual" - does DocBook allow me to tag sections
as "Only for Digest Version"? What would be the generic approach to do
this in XML, and how can I combine them on rendering ("Only for PDF"
AND "Digest")?
These are effectivities as above. DocBook has attributes to identify
conditionality and many other metadata features. So do many other DTDs.

Combining them would be something you do in the XSLT.
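
A minimal sketch of combining two conditions ("Digest" AND "PDF") follows. It is plain Python rather than XSLT so that it is self-contained. The attribute names edition and output, the semicolon-separated value lists and both functions are assumptions for illustration; check the DocBook documentation for the attributes it actually provides.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical chapter; "edition" and "output" are made-up attribute names.
SOURCE = """<chapter>
  <para>Shown in every edition.</para>
  <para edition="digest">Digest only.</para>
  <para edition="full">Full manual only.</para>
  <para edition="digest" output="pdf">Digest, PDF only.</para>
  <para edition="digest;full" output="html">Any edition, HTML only.</para>
</chapter>"""


def matches(element, profile):
    """True if every condition the element carries is satisfied by `profile`.

    An element that does not carry a given attribute is unconditional
    with respect to it.
    """
    for attr, wanted in profile.items():
        value = element.get(attr)
        if value is not None and wanted not in value.split(";"):
            return False
    return True


def apply_profile(root, profile):
    """Return a copy of the tree without the elements excluded by `profile`."""
    result = copy.deepcopy(root)
    for parent in list(result.iter()):
        for child in list(parent):
            if not matches(child, profile):
                parent.remove(child)
    return result


digest_pdf = apply_profile(ET.fromstring(SOURCE),
                           {"edition": "digest", "output": "pdf"})
print([p.text for p in digest_pdf])
```

Each attribute is one filtering axis, and an element survives only if it passes on every axis it mentions, which is what gives the AND behaviour.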
7) CAT translation

Integration of Translation Memory Tools: Is there an easy way to feed
XML (e.g. DocBook) documents into CAT tools? Ideally, this would accept
<para lang="EN">Source</para> and generate <para
lang="FR">Target</para> from a TU database.


I don't know what tools exist in this area. The localisation business was
very slow to take up XML, but it is gathering speed now. The nexus of
knowledge in this area is probably Dublin, which has a huge localisation
industry.

///Peter
--
"The cat in the box is both a wave and a particle"
-- Terry Pratchett, introducing quantum physics in _The Authentic Cat_
Jul 20 '05 #2
Hello Peter,

thank you for your comments - highly appreciated!

Well, it seems I'll bite the bullet and finally learn Emacs. :/

Don't even think of trying to modify PDF.
Sorry; I didn't express myself correctly here. I do not want to fiddle
with the HTML and PDF output, but change the XSLT or - in the case of
PDF - the XSL-FO generating the output. I still have no concept of
XSL-FO, i.e. how to set up various templates for cover and TOC pages,
multi-column pages etc. I had hoped for a handy GUI, but I can live
with some code tweaking. I'll finally take a closer look at LaTeX,
too.

Or maybe <head lang="fr">Préliminaire</head> and
<head lang="en">Introduction</head>.
Many DTDs do allow precisely this kind of thing,
specifically for this purpose (and more commonly, text
applicable to related but different product lines).


What (DTD) would you personally suggest for this (= writing/maintaining
long technical manuals in various languages and product versions)?
So far, I keep separate documents for each language, but having to
apply structure changes several times is a PITA.

Thank you again.

Jul 20 '05 #3
David Winter wrote:
Hello Peter,

thank you for your comments - highly appreciated!

Well, it seems I'll bite the bullet and finally learn Emacs. :/
:-) It's a life skill. I can't count the number of times it's saved my neck
when other systems have failed to produce the goodies.
Don't even think of trying to modify PDF.


Sorry; I didn't express myself correctly here. I do not want to fiddle
with the HTML and PDF output, but change the XSLT or - in the case of
PDF - the XSL-FO generating the output. I still have no concept of
XSL-FO, i.e. how to set up various templates for cover and TOC pages,
multi-column pages etc. I had hoped for a handy GUI, but I can live
with some code tweaking. I'll finally take a closer look at LaTeX,
too.


There are several experiments ongoing at creating XSLT GUIs but none of
them do anything useful outside simple 1:1 transformations (eg <para> to
<p>).

Cover pages (unless purely typographic) are often done by a designer as
a separate job. I don't know how your organisation handles these.

The reason behind recommending LaTeX over FO is simply that LaTeX has
all the stuff for automation (eg ToC, multi-columns, etc) already
written. I hate reinventing wheels in a production job.
What (DTD) would you personally suggest for this (= writing/maintaining
long technical manuals in various languages and product versions)?
Are they computer manuals or some other technology? For computer doc
I would always recommend DocBook as I've never found anything to beat it,
but if it's some other area, there may be industry-specific DTDs already
available (ask the relevant industrial consortiums and representative
bodies). Otherwise you can always write your own, but it's easier to
steal^H^H^H^H^Hplagia^H^H^H^H^H^Hborrow from another DTD where possible.

Get a copy of Eve Maler and Jeanne El Andaloussi's "Developing SGML DTDs:
From Text to Model to Markup" (ignore the "SGML" in the title: 99% of
everything in the book applies to XML as well). This is THE book on
writing DTDs, and
it covers the non-technical side of consulting with users, colleagues, etc,
document modelling, document analysis, and all the organisational aspects.

Doing it yourself is not hard, but needs foresight and hindsight as well
as inside knowledge of the document type.
So far, I keep separate documents for each language, but having to
apply structure changes several times is a PITA.


All multilingual work is a PITA to keep in synch unless you have a large-
scale production publishing workflow system. Actually you probably could
do something like it in Cocoon, but that would be a BIG task.

My gut feeling is to use separate documents, and have a CVS or RCS or other
document check-out/check-in system that will do something sensible with
the "this paragraph changed last time" attributes when a document is
checked out for editing (ie zap them), and then do some kind of diff on
the document when it's checked back in, and see if the diffs have all
been flagged with the relevant "updated" or "deleted" attribute, and
then enforce an interlock on publishing it until corresponding language
versions have been brought up to date. That would be a little tricky to
write, but it would help keep stuff in synch.
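
A much-reduced sketch of the interlock idea: compare the element structure of two language versions and refuse to publish while they differ. This is not the check-in/check-out system described above, only the comparison step, and it assumes both versions use the same element names and id attributes. The sample documents and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical language versions; the English one has gained a paragraph.
ENGLISH = """<manual lang="en">
  <section id="intro"><para>Welcome.</para><para>New safety note.</para></section>
  <section id="setup"><para>Plug it in.</para></section>
</manual>"""

FRENCH = """<manual lang="fr">
  <section id="intro"><para>Bienvenue.</para></section>
  <section id="setup"><para>Branchez l'appareil.</para></section>
</manual>"""


def skeleton(element, path=""):
    """List the element paths of a document, ignoring all text content."""
    here = f"{path}/{element.tag}"
    if element.get("id"):
        here += f"[{element.get('id')}]"
    paths = [here]
    for child in element:
        paths.extend(skeleton(child, here))
    return paths


def structure_drift(source_xml, target_xml):
    """Return the paths that occur a different number of times in each document."""
    src = Counter(skeleton(ET.fromstring(source_xml)))
    tgt = Counter(skeleton(ET.fromstring(target_xml)))
    return sorted((src - tgt) + (tgt - src))


def can_publish(source_xml, target_xml):
    """Interlock: allow publishing only when the structures match."""
    return not structure_drift(source_xml, target_xml)


drift = structure_drift(ENGLISH, FRENCH)
print(drift, can_publish(ENGLISH, FRENCH))
```

This catches structural changes only. A paragraph whose wording changed but whose position did not would still need the "updated" flags described above.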

///Peter
--
"The cat in the box is both a wave and a particle"
-- Terry Pratchett, introducing quantum physics in _The Authentic Cat_
Jul 20 '05 #4
Peter,

once again thank you for your advice. The ideas on a multi-lingual
workflow sound interesting, but since I am a freelancer, I will have to
come up with some kind of home-cooked, affordable solution or wait for
an Open Source project (right now, everyone and their grandmother seems
to focus on building yet another generic CMS/Blog tool).

BTW, AuthorIT (http://www.authorit.com/) does what I have in mind (and
more), but at least the Localization Manager is out of my price range.
I guess I'll go with DocBook and use the opportunity to learn
something. :)

Jul 20 '05 #5
