Bytes | Software Development & Data Engineering Community

New file format design

Hello there,

I am looking for suggestions for designing a simple file format
based on XML. It will contain only text information (no binary data).

1. If I have a choice: element or attribute?
2. Do I need to define my own file version (maybe as the first XML
element)?
3. Do I need to provide a DTD or XML schema?

Thanks for your input,
Mathieu

Jun 15 '06 #1
mathieu wrote:
> 1. If I have a choice: element or attribute?

This is a FAQ. What's the intent of the datum (modifier or content), and
will it ever in the future need to be structured (in which case it has
to be an element)?
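
To make the modifier-versus-content distinction concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The element and attribute names (note, lang, author, given, family, body) are invented for illustration and do not come from any real format.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "lang" modifies the note, so it is an attribute;
# "author" may need internal structure, so it is an element.
SAMPLE = """\
<note lang="en">
  <author>
    <given>Ada</given>
    <family>Lovelace</family>
  </author>
  <body>Remember the milk.</body>
</note>
"""

root = ET.fromstring(SAMPLE)

# An attribute holds a single flat string value.
language = root.get("lang")

# An element can carry children, so it can grow structure over time.
author = root.find("author")
full_name = f"{author.findtext('given')} {author.findtext('family')}"

print(language, "|", full_name)
```

Had author started life as an attribute, splitting it into given and family names would have meant changing the format rather than extending it.
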
> 2. Do I need to define my own file version (maybe as the first XML
> element)?

Up to you. Will you ever need to distinguish versions?

> 3. Do I need to provide a DTD or XML schema?

Up to you. Do you want the parser to help confirm the data is reasonably
structured and contains plausible values? Do you need to mark some data
as having particular kinds of meanings (ID is the obvious one that has
to be defined at this level)? Do you want to define named entities
(supported only in DTDs, and *probably* best avoided these days although
folks still debate that)?
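
As a small illustration of the named-entity point, the sketch below declares an entity in an internal DTD subset and lets Python's xml.etree.ElementTree expand it. The names doc and co are made up for the example. ElementTree does not validate against the DTD; it only reads the declarations in the internal subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an internal DTD subset that declares
# one named entity, "co".
SAMPLE = """\
<!DOCTYPE doc [
  <!ENTITY co "Example Corp">
]>
<doc>Made by &co;.</doc>
"""

root = ET.fromstring(SAMPLE)

# The parser replaces &co; with the declared replacement text.
text = root.text
print(text)
```

The document only makes sense together with its entity declaration, which is one reason named entities are awkward to rely on in a format meant for exchange.
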
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 15 '06 #2
On Thu, 15 Jun 2006 16:25:23 -0400, Joe Kesselman wrote:
> mathieu wrote:
>> 1. If I have a choice: element or attribute?
>
> This is a FAQ.

Isn't this the only Q that's more FA'ed than
"Why does SAX cut off my text"? 8-)
Jun 15 '06 #3
Andy Dingley wrote:
>> 1. If I have a choice: element or attribute?
>
> Isn't this the only Q that's more FA'ed than
> "Why does SAX cut off my text"? 8-)

I wouldn't like to try to guess which one wins. :-P

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 15 '06 #4

Joe Kesselman wrote:
> This is a FAQ. What's the intent of the datum (modifier or content), and
> will it ever in the future want to be structured (in which case it has
> to be an element).

Thanks for the reference; I am sorry I did not take the step of searching
for it first.
http://xml.silmaril.ie/developers/attributes/

>> 2. Do I need to define my own file version (maybe as the first XML
>> element)?
>
> Up to you. Will you ever need to distinguish versions?

Well, I disagree, simply because I don't know. I was under the impression
that XML was designed for exactly this kind of "I don't know": adding
attributes or elements later is still (by design) syntactically correct.
What I am unsure about is: is this mechanism enough?
>> 3. Do I need to provide a DTD or XML schema?
>
> Up to you. Do you want the parser to help confirm the data is reasonably
> structured and contains plausible values? Do you need to mark some data
> as having particular kinds of meanings (ID is the obvious one that has
> to be defined at this level)? Do you want to define named entities
> (supported only in DTDs, and *probably* best avoided these days although
> folks still debate that)?

Not really, I know what I am reading. My understanding was that a DTD or
XML schema was much more explicit for a third party than if I were to
write down the file specification.

Thanks !
M

Jun 16 '06 #5
mathieu wrote:
> Joe Kesselman wrote:
>>> 2. Do I need to define my own file version (maybe as the first XML
>>> element)?
>> Up to you. Will you ever need to distinguish versions?
>
> Well I disagree simply because I don't know.

If you don't know, you can either treat the absence of the version mark
as indicating version 0.0, or you can go ahead and design it in now.
Either solution is defensible.

In general: If in doubt, it's wise to design for a version mark, even if
you make it optional.
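
A minimal sketch of the optional version mark, in Python with the standard library's xml.etree.ElementTree. The root element name notes, the attribute name version, and the major.minor layout are assumptions made for the example; nothing in the thread fixes them.

```python
import xml.etree.ElementTree as ET

DEFAULT_VERSION = "0.0"  # assumed meaning of "no version mark present"

def format_version(xml_text):
    """Return (major, minor) from the root's optional 'version' attribute."""
    root = ET.fromstring(xml_text)
    raw = root.get("version", DEFAULT_VERSION)
    major, minor = raw.split(".")
    return int(major), int(minor)

# Hypothetical files: one written before the version mark existed,
# one written after.
old_file = "<notes><note>unversioned</note></notes>"
new_file = '<notes version="1.2"><note>versioned</note></notes>'

print(format_version(old_file))
print(format_version(new_file))
```

Files written before the attribute existed keep working, and a reader can branch on the major number once an incompatible change is needed.
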

> My understanding was that DTD or
> XML schema was much more explicit for a third party than if I were to
> write down the file specification.


Not entirely. The DTD/Schema may be useful for driving some tools. It
may provide some specific kinds of information that aren't expressed
directly in the instance document -- if your parser doesn't support
xml:id, and you don't have a DTD or schema, tools may not be able to
take advantage of some optimization potential. In fact, IBM has
demonstrated that a schema-aware parser can actually be made faster than
a non-validating parser, if you know which schema to expect and you do
some compilation ahead of time. (I think a paper on that topic appears
in the current issue of the IBM Systems Journal; I know the authors have
presented papers on this at conferences.)

If those issues don't concern you, you don't have to create a DTD or
schema immediately -- but the longer you wait, the more likely folks
will do things in their instance documents that you didn't expect. And
formalizing your document design is a good exercise even if you don't
enforce it.
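
Even without a DTD or schema, the rules can be written down as data and checked by the application. The sketch below is one hypothetical way to do that in Python; the rule table and the element names (notes, note, body, id) are invented, and it covers far less than a real DTD or schema would.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: for each element, the child elements it must
# contain and the attributes it must carry.
RULES = {
    "notes": {"children": {"note"}, "attributes": set()},
    "note": {"children": {"body"}, "attributes": {"id"}},
}

def check(xml_text):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        rule = RULES.get(elem.tag)
        if rule is None:
            continue  # unknown elements are tolerated in this sketch
        present = {child.tag for child in elem}
        for name in sorted(rule["children"] - present):
            problems.append(f"<{elem.tag}> is missing child <{name}>")
        for name in sorted(rule["attributes"] - set(elem.attrib)):
            problems.append(f"<{elem.tag}> is missing attribute '{name}'")
    return problems

good = '<notes><note id="n1"><body>ok</body></note></notes>'
bad = "<notes><note><title>no body, no id</title></note></notes>"

print(check(good))
print(check(bad))
```

Writing the table out forces the same decisions a DTD would: which parts are required, and what happens to elements nobody planned for.
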

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 16 '06 #6
Joe Kesselman wrote:
> mathieu wrote:
>> 2. Do I need to define my own file version (maybe as the first XML
>> element)?
> If you don't know, you can either treat the absence of the version mark
> as indicating version 0.0, or you can go ahead and design it in now.

King numbering.
(Coinage is labelled 'George II' and 'George IV', but simply 'George'
for the first one.)

Jun 16 '06 #7
Andy Dingley wrote:
> King numbering.
> (Coinage is labelled 'George II' and 'George IV', but simply 'George'
> for the first one)


I like the term; thanks!
--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 16 '06 #8
