Haines Brown wrote:
Michael Wojcik <mw*****@newsguy.com> writes:
>Andy Dingley just suggested TEI, though he proposed (and I concur)
that you store internally in TEI or DocBook but serve HTML. I'm not
sure whether that's what you were proposing above, or whether you were
thinking of serving XML + schema + style sheet to user agents. The
latter won't be handled properly by many UAs, and will confuse
non-technical users if they try to save content, etc.
Well, I _was_ toying with the idea of serving XML+schema+stylesheet. By
"UA" I presume you mean the average browser (IE).
I mean user agent: whatever is processing the data you send. (That's
standard terminology in the W3C specs, the HTTP RFCs, etc.) Doesn't
particularly matter to me whether it's "average" or exotic, though of
course you may decide not to worry about supporting less-common UAs.
(Do you expect people to read your journal on their iPhones? On other
mobile devices? On browsers embedded in appliances?)
However, I didn't
realize that browsers have problems with XML + public schema +
stylesheet. Would you be more specific about the kinds of problems and
the likelihood of their occurring?
I was over-hasty with that comment. I assumed that there were many UAs
that won't handle XML + schema + style sheet. (IE, for example,
doesn't even handle XHTML properly.) And I believe I've read more
substantial claims to that effect. But I realized when I read your
response that I had not actually verified that suspicion.
Personally, if I were building this application, I'd be reluctant to
serve XML + schema + style sheet, simply because I'd rather not do the
interoperability testing (or limit my content to a handful of common
UAs), when it's not at all difficult to serve HTML 4.01 Strict instead.
And why would a non-technical user
be confused? Wouldn't the user see on his browser the same thing if the
document were instead served as HTML?
Suppose you are a non-technical user. Suppose you are viewing a page
of this journal and decide to save a copy. You know, from prior
experience, that a saved web page is a file with an extension like
".htm" and possibly a folder containing some images and the like.
What's a ".xml" file? What's a ".xsd" file?
And whether the user sees "the same thing" is hard to say. Browsers
have built-in styles for HTML, which they will fall back on in various
circumstances. Some users have user style sheets, which select HTML
elements.
I'm unclear about just what is implied by "store internally". Do you
mean placing TEI or DocBook documents in a database on the server and
then processing them for display as HTML/XHTML for the user?
You have to store content, and you have to serve it. Sometimes content
is static - that is, the server simply sends the stored representation
(often just by reading a file from a local filesystem). Often it's
dynamic: server-side includes, ASP and JSP and PHP and other sorts of
scriptable pages, CGI scripts, server extensions that execute
application code, etc.
I don't care (well, for these purposes) how you store content. I'm
suggesting that you store it in a form that works well for your
production toolchain and for the applications that use it - so TEI or
DocBook might well be a good choice. And I'm suggesting that you serve
it in a form that the UA is likely to handle well; I'd suggest HTML
4.01 Strict with external CSS 2.1 style sheets.
To go from the stored representation to the presentation
representation, XSLT looks like the obvious mechanism. The server
could do that on the fly, if it has sufficient resources; or it could
cache the generated HTML; or the HTML could be generated whenever the
XML is updated and served statically.
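
As a rough illustration of the caching option, here is a minimal Python sketch. It makes several assumptions that are not part of anything above: the function names (render_html, cached_html), the toy <article>/<title>/<para> vocabulary (neither real TEI nor real DocBook), and the journal.css file name are all hypothetical. In particular, render_html is only a stand-in for the XSLT step; a real setup would run an actual XSLT processor there.

```python
import os
import xml.etree.ElementTree as ET
from html import escape

# Doctype for HTML 4.01 Strict, the served format suggested above.
DOCTYPE = ('<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
           '"http://www.w3.org/TR/html4/strict.dtd">')


def render_html(xml_path):
    """Stand-in for the XSLT step: turn a toy <article> into HTML.

    Assumes a made-up source vocabulary with one <title> and any
    number of <para> elements.
    """
    root = ET.parse(xml_path).getroot()
    title = escape(root.findtext("title", default=""))
    paras = "\n".join(
        "<p>{}</p>".format(escape(p.text or "")) for p in root.iter("para")
    )
    return (
        DOCTYPE + "\n<html>\n<head>\n"
        "<title>" + title + "</title>\n"
        # External style sheet; "journal.css" is a hypothetical name.
        '<link rel="stylesheet" type="text/css" href="journal.css">\n'
        "</head>\n<body>\n"
        "<h1>" + title + "</h1>\n" + paras + "\n</body>\n</html>\n"
    )


def cached_html(xml_path, cache_path):
    """Return HTML for xml_path, regenerating the cached copy only
    when it is missing or older than the XML source."""
    stale = (
        not os.path.exists(cache_path)
        or os.path.getmtime(cache_path) < os.path.getmtime(xml_path)
    )
    if stale:
        with open(cache_path, "w", encoding="utf-8") as out:
            out.write(render_html(xml_path))
    with open(cache_path, encoding="utf-8") as cached:
        return cached.read()
```

Calling cached_html on every request gives the "cache the generated HTML" behavior; calling render_html once per update from the production toolchain, and letting the server send the resulting file, gives the fully static variant. Comparing modification times is the simplest possible freshness check and would miss a change to the style sheet or to the transformation itself.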
>You might want to take a look at /Kairos/ [1]. They've been in the
online-humanities-journal biz for a while (about 12 years), so they
have a lot of experience with what works well for their authors and
readers.
I don't understand why you offered this as an example, and I probably miss
your point. The document I looked at from the Kairos site is just some
JavaScript that defines a framework and inserts into it an old-fashioned
(using tables for layout, for example) document.
I was unclear. I didn't mean /Kairos/ as an example of an implementation.
I suggested it because it's an online humanities journal of long
standing, relatively wide readership, and good reputation; because
they've had to deal with all of these issues, and these are the
compromises they arrived at; and because it demonstrates my other
point, which is that people writing for an online journal will want to
be able to use all the possible facilities. That means people will
want to submit articles with multimedia components, so you need to
think about how you'll handle non-text materials in your toolchain.
People will want to submit articles with dynamic content and scripting
- even applications, with any luck - so you'll need to handle that.
If I were to do this I'd
use SSI, XHTML, and CSS, but in any case, at least for the document I
viewed, the internally stored document is only HTML, not TEI or DocBook.
How can you tell how the document is stored internally? What you see
is what the server sent you. You don't know what it did in producing
that content.
--
Michael Wojcik
Micro Focus
Rhetoric & Writing, Michigan State University