Organizing and converting large number of XML files

We are converting the OPEN Process Framework Repository
(www.donald-firesmith.com) of over 1,100 free open source reusable
process components for building development methods for
software-intensive systems from html to xml. The current html files are
organized into a hierarchy of dozens of files based on the natural
metamodel of process components on which the framework is based. I have
the following questions:
1) What is the appropriate way to organize and store the xml files?
Along the same lines as now, placing each XML file in the same folder in
which the current html file is and the future generated xhtml file will
reside? We are a non-profit volunteer organization so we have little
money for databases. Is there a free XML database that we should use
instead?
2) Our website is heavily crosslinked so that each webpage (one per
reusable process component) links to all of the other process component
webpages that are mentioned in it. Currently, our html file hardwires
the location of these links to their current location, making it almost
impossible to change the file structure if the metamodel changes. How
can we make use of the fact that the url for the link should be an
attribute of the process component being linked to and therefore should
be stored in the xml file for the process component being linked to?
How can we make this work when we must incrementally transition to xml
given we are a volunteer organization and have over 1,100 xml files to
generate, not to mention dozens and dozens of xsl files and dtd files?

Any advice on how to practically make the transition and organize/store
the files given the limitations on resources and large numbers of files
would be greatly appreciated.

By the way, browse the website and let us know what you think. If you
have any need for process on your projects, it is a great resource.

Don Firesmith
Chair, OPEN Process Framework Repository Organization

Jul 20 '05 #1
Donald Firesmith wrote:
> We are converting the OPEN Process Framework Repository
> (www.donald-firesmith.com) of over 1,100 free open source reusable
> process components for building development methods for
> software-intensive systems from html to xml. The current html files are
> organized into a hierarchy of dozens of files based on the natural
> metamodel of process components on which the framework is based. I have
> the following questions:
> 1) What is the appropriate way to organize and store the xml files?
> Along the same lines as now, placing each XML file in the same folder in
> which the current html file is and the future generated xhtml file will
> reside? We are a non-profit volunteer organization so we have little
> money for databases. Is there a free XML database that we should use
> instead?

There are a few, but I have found that for *file* storage, the hierarchical
directory structure of the file system is perfectly adequate, and much,
much faster. You do need to take care and be rigorous about naming, though.
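
To make the point about rigorous naming concrete, here is a minimal sketch of a check you could run over the tree. The rule itself (lowercase words joined by hyphens, ending in .xml) and the function name `check_names` are assumptions for illustration only, not something the repository already uses. It also reports component names that appear in more than one folder, since those become ambiguous once files refer to each other by name.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical convention: lowercase letters/digits joined by hyphens, ".xml" suffix.
NAME_RULE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*\.xml$")


def check_names(paths):
    """Return (bad_names, duplicates) for a list of relative file paths.

    bad_names  -- paths whose file name breaks NAME_RULE
    duplicates -- {stem: [paths]} for stems used in more than one place
    """
    bad_names = []
    by_stem = defaultdict(list)
    for p in paths:
        path = PurePosixPath(p)
        if not NAME_RULE.match(path.name):
            bad_names.append(p)
        by_stem[path.stem.lower()].append(p)
    duplicates = {stem: ps for stem, ps in by_stem.items() if len(ps) > 1}
    return bad_names, duplicates
```

In practice the list of paths would come from walking the repository folder; the check takes plain strings so it can be tried on a handful of names first.
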

> 2) Our website is heavily crosslinked so that each webpage (one per
> reusable process component) links to all of the other process component
> webpages that are mentioned in it. Currently, our html file hardwires
> the location of these links to their current location, making it almost
> impossible to change the file structure if the metamodel changes. How
> can we make use of the fact that the url for the link should be an
> attribute of the process component being linked to and therefore should
> be stored in the xml file for the process component being linked to?

If the data is stored in XML, and the link data is kept as (for example)
attributes of some element (it could also be element content, depending
on your XML design), then it can be accessed by whatever transformation
engine you use when generating the HTML, and the appropriate URI generated.

But you're right, this is a case where a database may be the answer, simply
because it's easier to manage this kind of metadata in bulk (as for example
when your metamodel changes) rather than hand-editing the XML (even though
that would be easier than hand-editing the HTML source).
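
As a rough illustration of the attribute approach, the sketch below keeps each component's URL in that component's own XML file and looks it up when rendering links from other components. The element and attribute names (`component`, `ref`, `id`, `url`, `to`) and both function names are made up for the example; your own DTD would define the real ones, and an XSLT stylesheet could do the same lookup instead of Python.

```python
import html
import xml.etree.ElementTree as ET


def build_url_index(component_docs):
    """Map each component's id to the url stored on its own root element."""
    index = {}
    for xml_text in component_docs:
        root = ET.fromstring(xml_text)
        index[root.get("id")] = root.get("url")
    return index


def render_links(xml_text, index):
    """Return one HTML anchor per <ref> element in a component document."""
    root = ET.fromstring(xml_text)
    links = []
    for ref in root.iter("ref"):
        target = ref.get("to")
        url = index[target]  # a KeyError here exposes a dangling cross-reference
        label = ref.text or target
        links.append(f'<a href="{html.escape(url)}">{html.escape(label)}</a>')
    return links
```

Because referring files only name the target's id, moving a component means changing the `url` attribute in one file and regenerating the HTML. It also suits an incremental transition: a component that has not been converted yet only needs a stub file carrying its id and current URL.
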
> How can we make this work when we must incrementally transition to xml
> given we are a volunteer organization and have over 1,100 xml files to
> generate, not to mention dozens and dozens of xsl files and dtd files?


Without studying it in more detail it's hard to say, but my gut feeling
is to make sure your HTML is utterly rigorous and consistent, and then
transform it to XHTML first. This gives you the opportunity to continue
serving it as HTML while you do it, but provides you with files which can
be machine-handled afterwards, when it comes to making your target XML.
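
One way to track progress during that step is to test whether each converted page parses as XML at all, since a page that does not parse cannot be machine-handled later. The sketch below uses Python's standard-library parser; the function name `well_formedness_error` is invented for the example. It checks well-formedness only, not validity against the XHTML DTD.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xhtml_text):
    """Return None if the text parses as XML, otherwise the parser's message."""
    try:
        ET.fromstring(xhtml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None
```

Note that this parser does not read the XHTML DTD, so named entities such as `&nbsp;` may be reported as errors even in otherwise clean pages; numeric character references avoid that.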

///Peter
--
"The cat in the box is both a wave and a particle"
-- Terry Pratchett, introducing quantum physics in _The Authentic Cat_
Jul 20 '05 #2
