Bytes | Software Development & Data Engineering Community
Can I use XML as an article database?

Hi all,

I am a newbie with XML.
I hope an expert can point me in the right
direction on this topic.

I have many articles, all of them text files.
They are stored in many directories, according to topic.

Using this method, I can easily classify the articles by topic.
But I cannot classify them by Author or by Date.
So 'directory' is not a good method.

If I put the articles into a database,
I can easily add additional columns (e.g. Author, Date of Publication,
etc.) to each article.

Then I can easily sort by Author or by Date.

But using a database seems to be quite troublesome.

I wonder whether I can convert each article text file into an XML file
with, for example,
the following tags:
<author>xxx</author>
<date>yyyy-mm-dd</date>
<essay>The original article contents</essay>

Then put all the XML files under a directory.
Then use 'something' to search this directory.
Then I can easily get a list sorted by Author, or by Date, or by anything else.
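
For a small collection, the idea above can be sketched in a few lines of Python using only the standard library. This is a minimal sketch under stated assumptions, not a recommended design: XML requires a single root element, so the three tags are wrapped in an `<article>` element (a name that is not in the original post), and the function names and sample data are hypothetical. Every file is re-parsed on each run, which is why this approach slows down as the collection grows.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def load_articles(directory):
    """Parse every *.xml file in `directory` and collect its metadata."""
    articles = []
    for path in sorted(Path(directory).glob("*.xml")):
        root = ET.parse(path).getroot()  # assumed root: <article>
        articles.append({
            "file": path.name,
            "author": root.findtext("author", default=""),
            "date": root.findtext("date", default=""),
        })
    return articles


def sorted_by(articles, key):
    """Return the articles sorted by one metadata field ('author' or 'date')."""
    # yyyy-mm-dd dates sort correctly as plain strings.
    return sorted(articles, key=lambda a: a[key])


def demo():
    """Write two sample files to a temporary directory and sort them both ways."""
    samples = {
        "a.xml": "<article><author>Smith</author><date>2007-05-28</date>"
                 "<essay>First text</essay></article>",
        "b.xml": "<article><author>Jones</author><date>2007-01-15</date>"
                 "<essay>Second text</essay></article>",
    }
    with tempfile.TemporaryDirectory() as tmp:
        for name, xml in samples.items():
            Path(tmp, name).write_text(xml, encoding="utf-8")
        articles = load_articles(tmp)
    return sorted_by(articles, "author"), sorted_by(articles, "date")
```

Calling `demo()` returns the same two articles ordered first by author and then by date.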

Now, my questions are:

Q1. Is this method feasible?

Q2. Is this a correct way of using XML?
(What I mean is: is XML designed for this use?)

Q3. Has anything in the world already done this?
If yes, please point me to it.

Q4. Is there anything related to this situation?
If yes, please give me some keywords
so that I can continue searching the net.
I used the keywords: XML +document +index
but could not find what I wanted.

Thanks for your expert advice in advance.
Alvin SIU

May 28 '07 #1

Alvin SIU <al*******@gmail.com> wrote in
<11**********************@j4g2000prf.googlegroups.com>:

> I have many articles, all of them text files.
> They are stored in many directories, according to
> topic.
>
> If I put the articles into a database,
> I can easily add additional columns (e.g. Author, Date of
> Publication, etc.) to each article.
>
> Then I can easily sort by Author or by Date.
>
> But using a database seems to be quite troublesome.

Troublesome? I'm not sure what you mean. A database seems
like the only sensible way to go, whether it's an XML
database, a more traditional tuple-based RDBMS or something
else that has 'database' in its name. Because, whether you
realize it or not, what you describe *is* a database.

> I wonder whether I can convert each article text file into
> an XML file with, for example,
> the following tags:
> <author>xxx</author>
> <date>yyyy-mm-dd</date>
> <essay>The original article contents</essay>
>
> Then put all the XML files under a directory.

Right. Concealing the databaseness of your task behind the
familiar concepts of a filesystem won't make The Database go
away. For that matter, any filesystem is a specialised
database.

> Then use 'something' to search this directory.

'Something' is called XQuery. You stuff your XML data into
an XML database, then use XPath/XQuery/XSLT/whatever else
to access it.
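
To give a feel for what such a query looks like, here is a minimal sketch that uses the small XPath subset built into Python's standard `xml.etree.ElementTree` module. It is not XQuery and not an XML database; it only illustrates selecting articles by a field value. The `<articles>`/`<article>` wrapper elements and the sample data are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: a wrapper around the tags from the question.
DOC = """
<articles>
  <article><author>Smith</author><date>2007-05-28</date><essay>First text</essay></article>
  <article><author>Jones</author><date>2007-01-15</date><essay>Second text</essay></article>
  <article><author>Smith</author><date>2006-11-02</date><essay>Third text</essay></article>
</articles>
"""


def dates_by_author(xml_text, author):
    """Return the sorted publication dates of all articles by `author`."""
    root = ET.fromstring(xml_text)
    # [author='...'] matches elements with an <author> child whose text
    # equals the given value. The name is interpolated directly, so this
    # sketch assumes it contains no quote characters.
    matches = root.findall(f"./article[author='{author}']")
    return sorted(m.findtext("date") for m in matches)
```

A real XML database would evaluate a comparable query against indexed storage instead of re-parsing the text each time.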

> Q1. Is this method feasible?

Not as you described. But if you replace 'directory'
with 'XML database' and 'something' with 'XQuery', it is.

> Q2. Is this a correct way of using XML?
> (What I mean is: is XML designed for this use?)

XML is designed to represent structured data. XML
databases are designed to store and access structured data
represented as XML. XQuery is designed to query structured
data represented as XML.

> Q3. Has anything in the world already done this?
> If yes, please point me to it.

IBM's DB2 9 Express-C. Alternatively, you might want to
google for XML databases.

--
Pavel Lepin
May 28 '07 #2
On 28 May, 07:54, Alvin SIU <alvin....@gmail.com> wrote:

> Q1. Is this method feasible?

As an example or as working code?

You can certainly do it, but performance for retrieving articles will
be terrible.

> Q2. Is this a correct way of using XML?
> (What I mean is: is XML designed for this use?)

XML is a data format primarily for exchanging documents. Once they're
retrieved, store them in some sort of database.

For your example here, the obvious technology to use is a SQL
database. It's not a perfect choice, but it's very accessible to you.
Anyone can easily get hold of MySQL or Access-like database engines.
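
As a rough illustration of the SQL route, the sketch below uses SQLite through Python's standard `sqlite3` module instead of MySQL or Access, purely because it needs no server or installation. The table layout, column names and sample rows are assumptions, not anything specified in the thread.

```python
import sqlite3

# Hypothetical sample rows: (topic, author, published, body)
SAMPLE_ROWS = [
    ("xml", "Smith", "2007-05-28", "First text"),
    ("sql", "Jones", "2007-01-15", "Second text"),
    ("xml", "Smith", "2006-11-02", "Third text"),
]

SORTABLE = ("topic", "author", "published")


def build_db(rows):
    """Create an in-memory database holding one row per article."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        "id INTEGER PRIMARY KEY, topic TEXT, author TEXT, "
        "published TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO articles (topic, author, published, body) "
        "VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def list_articles(conn, order_by):
    """Return (topic, author, published) tuples sorted by one column."""
    # Column names cannot be bound as parameters, so validate them first.
    if order_by not in SORTABLE:
        raise ValueError(f"cannot sort by {order_by!r}")
    query = f"SELECT topic, author, published FROM articles ORDER BY {order_by}"
    return conn.execute(query).fetchall()
```

With this layout, topic becomes just another column, so the same table can be listed by topic, author or date without moving any files around.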

> Q3. Has anything in the world already done this?
> If yes, please point me to it.

About a squillion things already!

You should probably read up on:

Dublin Core (especially on this)
Metadata
OAI
RSS 1.0 / Atom syndication formats
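
To make the Dublin Core suggestion concrete, here is a minimal sketch that writes article metadata using element names from the Dublin Core element set (`title`, `creator`, `date`, `subject`) in its standard namespace. The `<record>` wrapper element, the function name and the sample values are assumptions for illustration only; real applications usually embed these elements in a container format such as the ones listed above.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(title, creator, date, subject):
    """Serialise one article's metadata as Dublin Core elements."""
    record = ET.Element("record")  # hypothetical wrapper element
    fields = (
        ("title", title),
        ("creator", creator),  # Dublin Core's name for the author
        ("date", date),
        ("subject", subject),  # plays the role of the topic directory
    )
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(record, encoding="unicode")
```

Using a shared vocabulary like this means other tools can recognise the author and date fields without knowing anything about the home-made tag names.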

You can do this in XML, although XML has restrictions that become a
real nuisance for big systems.
One of your problems isn't the storage and querying of your data, it's
the issue of "vocabularies". As your system grows bigger and becomes more
interested in inter-working with other systems, you start to care
about identifying "authors" such that "Douglas Adams" is the guy who
wrote "Health Monitoring of Structural Materials and Components", not
the guy with the towel obsession (follow the link - even the mighty
Amazon have got this one wrong).
<http://www.amazon.co.uk/exec/obidos/ASIN/0470033134/codesmiths>

This itself is a big topic! (with much work going on within it). You
might find yourself using techniques like XML Schema or even OWL to
list these. It also starts to hit the limits of XML, and you might
find RDF more useful to you.

May 29 '07 #3
