DTD elements definition question

es
Hello there,

I'm trying to build what is basically a screen-scraper program that
takes a URL as input and produces an XML file as output. I wanted to
introduce something like a "document definition" for the source URL,
i.e.

<document id="some_news_site_without_rss"
url="http://www.example.com/news.html">
<news repeat="true">
<article>
<title begin="somehtml" end="somehtml"/>
</article>

</news>
</document>

would produce something like

<document>
<news>
<article>
<title>Some title 1</title>
</article>
<article>
<title>Some title 2</title>
</article>
<article>
<title>Some title 3</title>
</article>
</news>
</document>

I hope you get the idea.

My problem is that I've tried to describe this "definition language"
using a DTD, but as far as I can see a DTD doesn't support specifying
something like "I want to have one fixed parent element, document; all
the other elements are user-specified (unspecified), but they have to
be closed and have the following attributes...".

I'm not very deep into XML/SGML, so maybe I'm just missing something
basic.

Thanks,

Esad Hajdarevic
Jul 20 '05 #1


You are right: a DTD has to declare every element and attribute that
appears in the document, otherwise validation fails.
With a W3C XML Schema you can declare some elements and let the
validator skip others by using a wildcard (xs:any), see
http://www.w3.org/TR/xmlschema-0/
http://www.w3.org/TR/xmlschema-0/#any
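
As a rough illustration of the wildcard approach, a schema along the
following lines might fit the "definition language" from the question.
This is only a minimal sketch under assumptions: it treats the id and
url attributes on document as required (the question does not say
whether they are), and it picks processContents="skip" so that the
user-named children are not validated at all.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- The one fixed parent element: document -->
  <xs:element name="document">
    <xs:complexType>
      <xs:sequence>
        <!-- Any number of child elements with arbitrary names.
             "skip" means the validator does not look for declarations
             of these elements; they only have to be well-formed. -->
        <xs:any processContents="skip"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- Assumption: both attributes are mandatory on the root -->
      <xs:attribute name="id"  type="xs:string" use="required"/>
      <xs:attribute name="url" type="xs:anyURI" use="required"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Two caveats apply. The "have to be closed" requirement is already
enforced by XML well-formedness, so any parser rejects an unclosed
element before validation starts. And a wildcard cannot demand specific
attributes (such as repeat, begin or end) on elements whose names are
unknown; with processContents="lax" only those elements that do have a
declaration in the schema would be checked.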
--

Martin Honnen
http://JavaScript.FAQTs.com/

Jul 20 '05 #2
