Bytes | Developer Community

New file format design

Hello there,

I am looking for suggestions for designing a simple file format
based on XML. It will only contain text information (no binary data).
1. If I have a choice: Element or Attribute ?
2. Do I need to define my own file version (maybe as the first XML
element) ?
3. Do I need to provide a DTD or XML schema ?

Thanks for inputs,
Mathieu

Jun 15 '06 #1
mathieu wrote:
> 1. If I have a choice: Element or Attribute ?

This is a FAQ. What's the intent of the datum (modifier or content), and
will it ever in the future want to be structured (in which case it has
to be an element).
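
To make the modifier-versus-content distinction concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The element and attribute names (measurement, unit, value, author, given, family) are invented for illustration and do not come from this thread. The unit only qualifies the value, so it is an attribute; the author may need internal structure, so it is an element.

```python
import xml.etree.ElementTree as ET

# Hypothetical record, not from the thread.
# "unit" modifies the value, so it is an attribute.
# "author" may grow sub-structure, so it is an element.
DOC = """
<measurement unit="mm">
  <value>12.5</value>
  <author>
    <given>Jane</given>
    <family>Doe</family>
  </author>
</measurement>
"""

def describe(xml_text):
    root = ET.fromstring(xml_text)
    unit = root.get("unit")                 # attribute: a flat string only
    value = float(root.findtext("value"))   # element content
    author = root.find("author")
    # Because author is an element, it can carry child elements.
    name = " ".join(child.text for child in author)
    return value, unit, name

print(describe(DOC))
```

If the author had been stored as an attribute, splitting it into given and family names later would have meant changing the format rather than extending it.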

> 2. Do I need to define my own file version (maybe as the first XML
> element) ?

Up to you. Will you ever need to distinguish versions?

> 3. Do I need to provide a DTD or XML schema ?

Up to you. Do you want the parser to help confirm the data is reasonably
structured and contains plausible values? Do you need to mark some data
as having particular kinds of meanings (ID is the obvious one that has
to be defined at this level)? Do you want to define named entities
(supported only in DTDs, and *probably* best avoided these days although
folks still debate that)?
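
As a small illustration of the named-entity point, the sketch below declares an entity in an internal DTD subset and reads the document with Python's xml.etree.ElementTree, which expands internal entities but does not validate. The entity name (product) and the document content are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity "product" is declared in the
# internal DTD subset and then used in the content.
DOC = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY product "Example Format 1.0">
]>
<note>Written with &product;.</note>
"""

def read_note(xml_text):
    # The parser replaces &product; with the declared text.
    return ET.fromstring(xml_text).text

def parses(xml_text):
    # Without a declaration, the same reference is a parse error.
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

print(read_note(DOC))
print(parses("<note>Written with &product;.</note>"))
```

The second call shows the cost: a document that uses a named entity cannot be read at all without the declaration that defines it.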
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 15 '06 #2
On Thu, 15 Jun 2006 16:25:23 -0400, Joe Kesselman
<ke************@comcast.net> wrote:
> mathieu wrote:
>> 1. If I have a choice: Element or Attribute ?
>
> This is a FAQ.

Isn't this the only Q that's more FA'ed than,
"Why does SAX cut off my text" ? 8-)
Jun 15 '06 #3
Andy Dingley wrote:
>> 1. If I have a choice: Element or Attribute ?
>
> Isn't this the only Q that's more FA'ed than,
> "Why does SAX cut off my text" ? 8-)

I wouldn't like to try to guess which one wins. :-P

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 15 '06 #4

Joe Kesselman wrote:
> This is a FAQ. What's the intent of the datum (modifier or content), and
> will it ever in the future want to be structured (in which case it has
> to be an element).

Thanks for the reference. I am sorry I did not search for it first:
http://xml.silmaril.ie/developers/attributes/

>> 2. Do I need to define my own file version (maybe as the first XML
>> element) ?
>
> Up to you. Will you ever need to distinguish versions?

Well I disagree simply because I don't know. I was under the impression
that XML was designed exactly for this kind of 'I don't know': adding
attributes or elements later is still (by design) syntactically correct.
What I am unsure about is whether this mechanism is enough.

>> 3. Do I need to provide a DTD or XML schema ?
>
> Up to you. Do you want the parser to help confirm the data is reasonably
> structured and contains plausible values? Do you need to mark some data
> as having particular kinds of meanings (ID is the obvious one that has
> to be defined at this level)? Do you want to define named entities
> (supported only in DTDs, and *probably* best avoided these days although
> folks still debate that)?

Not really, I know what I am reading. My understanding was that DTD or
XML schema was much more explicit for a third party than if I were to
write down the file specification.

Thanks !
M

Jun 16 '06 #5
mathieu wrote:
> Joe Kesselman wrote:
>>> 2. Do I need to define my own file version (maybe as the first XML
>>> element) ?
>>
>> Up to you. Will you ever need to distinguish versions?
>
> Well I disagree simply because I don't know.

If you don't know, you can either treat the absence of the version mark
as indicating version 0.0, or you can go ahead and design it in now.
Either approach is defensible.

In general: If in doubt, it's wise to design for a version mark, even if
you make it optional.
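
A minimal sketch of the optional version mark, in Python with xml.etree.ElementTree. It assumes the mark is an attribute named version on the root element and that a missing mark means 0.0; both the attribute name and the root element name (doc) are hypothetical choices, and a first child element would work just as well.

```python
import xml.etree.ElementTree as ET

DEFAULT_VERSION = "0.0"  # assumed meaning of "no version mark present"

def file_version(xml_text):
    """Return the file version as a tuple of ints, e.g. (1, 2)."""
    root = ET.fromstring(xml_text)
    # Hypothetical convention: an optional "version" attribute on the root.
    text = root.get("version", DEFAULT_VERSION)
    return tuple(int(part) for part in text.split("."))

print(file_version("<doc/>"))
print(file_version('<doc version="1.2"/>'))
```

Returning a tuple lets the reader compare versions directly, for example `file_version(text) >= (1, 0)`, so files written before the mark existed sort below every marked version.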

> My understanding was that DTD or
> XML schema was much more explicit for a third party than if I were to
> write down the file specification.

Not entirely. The DTD/Schema may be useful for driving some tools. It
may provide some specific kinds of information that aren't expressed
directly in the instance document -- if your parser doesn't support
xml:id, and you don't have a DTD or schema, tools may not be able to
take advantage of some optimization potential. In fact, IBM has
demonstrated that a schema-aware parser can actually be made faster than
a non-validating parser, if you know which schema to expect and you do
some compilation ahead of time. (I think a paper on that topic appears
in the current issue of the IBM Systems Journal; I know the authors have
presented papers on this at conferences.)

If those issues don't concern you, you don't have to create a DTD or
schema immediately -- but the longer you wait, the more likely folks
will do things in their instance documents that you didn't expect. And
formalizing your document design is a good exercise even if you don't
enforce it.
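
As a rough illustration of writing the design down without a DTD or schema, the Python sketch below records which child elements each element may contain and reports anything else found in an instance document. It is a hand-written stand-in, not a validator, and the element names (doc, title, item, note) are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: which child elements each element may contain.
ALLOWED_CHILDREN = {
    "doc": {"title", "item"},
    "title": set(),
    "item": set(),
}

def unexpected_children(xml_text):
    """Return (parent, child) tag pairs the written-down design does not allow."""
    problems = []
    root = ET.fromstring(xml_text)
    for parent in root.iter():
        # Elements missing from the model are treated as allowing no children.
        allowed = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in allowed:
                problems.append((parent.tag, child.tag))
    return problems

print(unexpected_children("<doc><title>t</title><item/><note/></doc>"))
```

A real DTD or schema expresses the same table declaratively and also covers ordering, attributes, and data types, which this sketch ignores.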

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 16 '06 #6
Joe Kesselman wrote:
> mathieu wrote:
>> 2. Do I need to define my own file version (maybe as the first XML
>> element) ?
>
> If you don't know, you can either treat the absence of the version mark
> as indicating version 0.0, or you can go ahead and design it in now.

King numbering.
(Coinage is labelled 'George II' and 'George IV', but simply 'George'
for the first one)

Jun 16 '06 #7
Andy Dingley <di*****@codesmiths.com> wrote:
> King numbering.
> (Coinage is labelled 'George II' and 'George IV', but simply 'George'
> for the first one)

I like the term; thanks!
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 16 '06 #8
