RFC: A Distributed Universal SGML/XML Catalogue Management System

Rationale
=========

Many applications today benefit from an SGML and/or XML Entity Catalogue
to dereference entities referenced by a Public Identifier. For a
validating SGML parser this is an absolute requirement. For any
SGML or XML parser it serves to enable entities such as DTDs and
modules to be resolved locally.
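
As a rough illustration of what such a catalogue does, the sketch below
parses entries written in the SGML Open catalogue style (the keyword PUBLIC,
a quoted Public Identifier, then a quoted local path) and resolves an
identifier to a local file. The catalogue text and paths are made up for
the example, and only double-quoted PUBLIC entries are handled.

```python
import re

# Matches one entry of the form: PUBLIC "public identifier" "local path"
# Only PUBLIC entries are handled; other keywords are ignored in this sketch.
_PUBLIC_ENTRY = re.compile(r'PUBLIC\s+"([^"]+)"\s+"([^"]+)"')

def parse_catalogue(text):
    """Return a dict mapping Public Identifiers to local paths."""
    return {pubid: path for pubid, path in _PUBLIC_ENTRY.findall(text)}

def resolve(catalogue, public_id):
    """Return the local path for public_id, or None if it is not catalogued."""
    return catalogue.get(public_id)

# Hypothetical catalogue contents, for illustration only.
EXAMPLE = '''
PUBLIC "-//W3C//DTD HTML 4.01//EN" "html4/strict.dtd"
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "xhtml1/xhtml1-strict.dtd"
'''

catalogue = parse_catalogue(EXAMPLE)
print(resolve(catalogue, "-//W3C//DTD HTML 4.01//EN"))
```

A parser that finds no entry for an identifier would normally fall back to
the SYSTEM id, which is why resolve returns None rather than raising.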

Hitherto, different packages and applications have distributed entity
catalogues. Examples are DocBook, HTML validators, the OpenSP parser,
and operating system distros. However, there is little coordination
between the distributors of these, and no common package distributors
can rely on. Even in tightly-controlled environments such as the
Debian packages, the W3C Validator includes its own Entity
Catalogue rather than relying on it being available as a dependency.

This situation should be rationalised to allow for an SGML and XML
catalogue to be a single package on which other packages can depend.
In this note, we propose a framework for managing such a package.

Goals
=====
* To maintain a Universal Catalogue
* To provide an automated process for generating local installations of
all or part of the Universal Catalogue.
* To minimise the effort and coordination required to ensure that the
universal catalogues and local installations remain up-to-date.
In particular, end-users should be offered a self-maintaining default
installation that eliminates effort on their part altogether.
* To enable control of different parts of the catalogue to be delegated
to the people/organisations responsible for them.

A loose analogy could be drawn to DNS. But since immediate lookup of
[SG|X]ML entities is dealt with by SYSTEM ids, we only have to deal with
efficient caching of local copies of PUBLIC ids. Entities are in
general long-lived, but by no means immutable (for example, the MathML 2
DTD modules have undergone several minor revisions).

Managing a Universal Catalogue
==============================

In principle, all organisations creating public identifiers should be
registered with ISO.
But this is not widely practised, and the present chaotic situation
indicates that it is not effectively meeting today's needs. We propose
that a distributed architecture for automating catalogue management
is both feasible and preferable.

#### ISO registry: availability???

Our proposal envisages a central registry, cooperating with a set of
recognised repositories each managing its own entity catalogue locally.
For example, the W3C, WAP Forum and OASIS each manage their own catalogues
independently. Likewise, different groups acting independently within
W3C are responsible for different areas such as HTML, MathML, SVG and
SMIL.
We propose that a universal catalogue will work best if responsibility
for each sub-catalogue is explicitly devolved to the working group
responsible for defining it. The central registry will serve merely
to reference the responsible groups, in a manner somewhat analogous to DNS.

This is broadly in line with the registry already run by the ISO but
not widely used. What our proposal adds is the availability of the
registry online in machine-readable format, and its integration with
catalogues maintained by each participating organisation. It is
possible that tying the registry in to distribution of Markup libraries
and catalogues may in itself be an incentive for organisations to
register.

#### Implications for naming conventions?

Implementation
==============

Since the Universal Catalogue serves SGML and XML applications, it is
appropriate that it should itself be capable of implementation as an
SGML or XML application. This is straightforward: all we need is a
DTD for declaring catalogues and catalogue entries, and a list of
entities defining catalogues maintained by the groups entrusted with
doing so. This is then implemented by a program to fetch the data
required and write the catalogues. Local installations may be
customised by selecting which entities to include, while package
maintainers can ship a standard configuration.
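
To make the structure concrete, here is a minimal sketch of a master
registry that delegates to sub-catalogues, and of turning one sub-catalogue
into local catalogue lines. The element and attribute names (registry,
catalogue, entity, owner, href, public, file) and the example.org URLs are
invented for this sketch; they are not the format used by the demonstrator.

```python
import xml.etree.ElementTree as ET

# Hypothetical master registry: each <catalogue> points at the group
# responsible for one sub-catalogue. Names are invented for this sketch.
MASTER = """
<registry>
  <catalogue owner="W3C HTML" href="http://example.org/html/catalogue.xml"/>
  <catalogue owner="W3C MathML" href="http://example.org/mathml/catalogue.xml"/>
</registry>
"""

# Hypothetical sub-catalogue as one group might publish it.
SUB = """
<catalogue>
  <entity public="-//W3C//DTD HTML 4.01//EN"
          href="http://example.org/html/strict.dtd"
          file="html4/strict.dtd"/>
</catalogue>
"""

def delegated_catalogues(master_xml, wanted=None):
    """List (owner, href) pairs, optionally restricted to selected owners."""
    root = ET.fromstring(master_xml)
    return [(c.get("owner"), c.get("href"))
            for c in root.findall("catalogue")
            if wanted is None or c.get("owner") in wanted]

def catalogue_lines(sub_xml):
    """Render a sub-catalogue as PUBLIC lines for a local catalogue file."""
    root = ET.fromstring(sub_xml)
    return ['PUBLIC "%s" "%s"' % (e.get("public"), e.get("file"))
            for e in root.findall("entity")]

# A local installation that only wants the HTML sub-catalogue:
print(delegated_catalogues(MASTER, wanted={"W3C HTML"}))
print(catalogue_lines(SUB))
```

The wanted argument stands in for local customisation: a package maintainer
could ship a default set of owners, and an administrator could narrow it.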

An implementation demonstrating the above is available at
<URL:http://valet.webthing.com/catalogue/>. It fetches the master
catalogue, DTD and Entities by HTTP. It updates all entries defined,
but uses the HTTP If-Modified-Since header to avoid the overhead of
re-fetching anything that is already up-to-date in the local installation.
It can therefore be run regularly (e.g. monthly) with minimal overhead.
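
The conditional fetch described above can be sketched with the Python
standard library as follows. This is an illustration of the technique, not
the demonstrator's own code; the function names are hypothetical, and the
local file's modification time is assumed to reflect when it was last
fetched.

```python
import os
import urllib.error
import urllib.request
from email.utils import formatdate

def conditional_request(url, local_path):
    """Build a GET request that asks for url only if newer than local_path."""
    request = urllib.request.Request(url)
    if os.path.exists(local_path):
        mtime = os.path.getmtime(local_path)
        # HTTP dates are always expressed in GMT.
        request.add_header("If-Modified-Since", formatdate(mtime, usegmt=True))
    return request

def update(url, local_path):
    """Fetch url into local_path; return False if the local copy is current."""
    try:
        with urllib.request.urlopen(conditional_request(url, local_path)) as response:
            data = response.read()
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: keep the local copy
            return False
        raise
    with open(local_path, "wb") as out:
        out.write(data)
    return True
```

When the server answers 304 Not Modified, no entity body is transferred, so
a periodic run costs little more than one small request per entry.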

CatalogueManager may be used as-is, but is intended as a proof-of-concept.
Non-technical issues such as how to delegate responsibility for different
sub-catalogues need to be addressed, and the file format used for
the demonstrator is likely to be subject to improvement.

Security
========

A package such as CatalogueManager that updates system files based on
third-party definitions has potential to introduce malicious files.
It is strongly recommended that standard system security be used to
avoid serious consequences in the event of any of the sub-catalogues
being compromised. CatalogueManager should run as a user with no
privilege to write to the local filesystem except within a designated
SGML/XML library area, such as /usr/local/share/sgmlib.
Distributors creating a package such as an RPM of CatalogueManager
should take care to protect their users' security.

A more inherently secure architecture would generate all local filenames
internally, and is probably preferable. The current implementation serves
for back-compatibility until the proposal can be considered stable.
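
One possible way to generate local filenames internally is to derive them
from the Public Identifier alone, so that nothing supplied by a remote
catalogue can influence where a file is written. The sketch below hashes
the identifier; the function names, the .ent suffix and the choice of
SHA-256 are assumptions made for illustration, not part of the proposal.

```python
import hashlib
import os

def local_name(public_id, suffix=".ent"):
    """Derive a local filename from the Public Identifier alone."""
    digest = hashlib.sha256(public_id.encode("utf-8")).hexdigest()
    return digest[:32] + suffix

def local_path(library_root, public_id):
    """Return the path under library_root where the entity should be stored."""
    root = os.path.realpath(library_root)
    path = os.path.realpath(os.path.join(root, local_name(public_id)))
    # Defensive check: the result must stay inside the library area.
    if os.path.commonpath([root, path]) != root:
        raise ValueError("path escapes the library area: %r" % path)
    return path

# Even a hostile identifier maps to a plain filename inside the library area.
print(local_path("/usr/local/share/sgmlib", "../../etc/passwd"))
```

The trade-off is that hashed names are not human-readable, so the generated
catalogue file becomes the only practical index to the library area.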

--
Nick Kew

In urgent need of paying work - see http://www.webthing.com/~nick/cv.html
Jul 20 '05 #1