Looking for Ideas: Translating Document into Commands in XML

Hi guys

A bit of curve ball here ... I have a document (Word) that contains a series
of instructions in sections and subsections (and sub-subsections). There are
350 pages of them.

I need to translate these instructions into something that can be processed
automatically, so I have used the Command pattern to set up a set of
commands that correspond to the various instructions in the document.

I have started to enter the instructions into an xml file, which I can
deserialise into my command hierarchy. However, transcribing 350 pages into
an xml document is tedious, time-consuming and error prone. Because I have
sections and subsections, my xml file is quite wide, as well as very long. I
use XMLSpy to edit the file, but I am forever scrolling backwards and
forwards, up and down, cutting and pasting, and losing my place.

Does anyone have any thoughts on how I might improve the situation, make my
file more maintainable, and perhaps automate the process somehow?

My first thought is to write a simple program to maintain the xml file, but
that could take just as long as entering the data.

Any thoughts very welcome.

TIA

Charles
Jul 21 '05 #1
Charles,

This was supplied to this newsgroup some weeks ago:

\\\Eval by Nigel Amstrong
1. Create a file called: DynamicMath.js
2. Add this code to it:

class DynamicMath
{
    static function Eval(MathExpression : String) : double
    {
        return eval(MathExpression);
    }
}

3. Compile it with the command line jsc compiler: jsc /t:library
DynamicMath.js
4. Add a reference to DynamicMath.dll to your project (and to
Microsoft.JScript.dll as well)
5. Use from your favourite .NET language:
Dim d As Double = DynamicMath.Eval("2 + 3 + 4")
MessageBox.Show(d.ToString())
6. That's it..
///
Cor
Jul 21 '05 #2
Hi Cor

Thanks for the reply. I don't think it is going to help unfortunately,
unless I have missed something.

The problem is not so much how to implement the commands nor evaluate them.
Rather, I am trying to find some reliable means of translating a document
into a 'command script'. For example, the document might contain something
like this

1 Start Here
  1.1 First Group
    1.1.1 Instructions
      Do something
      Do something else
    1.1.2 More Instructions
      Do this other thing
      Do first thing again
  1.2 Second Group
    ...
  1.3 More of the Same

I need to translate this into something that can be deserialised into a
hierarchy of commands, in the style of the Command pattern, so that each
Command can be executed, in sequence. The result of the operation will be

Do something
Do something else
Do this other thing
Do first thing again
...
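
As a rough illustration of that flattening, here is a minimal Python sketch (not the .NET code under discussion). The XML layout, the `section`/`do` element names and the `Section`/`Instruction` classes are all assumptions made up for the example; the real schema was never posted. Sections act as composite commands that run their children in document order, so executing the root yields the flat sequence above.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout: element and attribute names are assumptions,
# not the schema actually used for the real command file.
SCRIPT = """
<section title="Start Here">
  <section title="First Group">
    <section title="Instructions">
      <do action="Do something"/>
      <do action="Do something else"/>
    </section>
    <section title="More Instructions">
      <do action="Do this other thing"/>
      <do action="Do first thing again"/>
    </section>
  </section>
</section>
"""


class Instruction:
    """Leaf command: a single instruction."""

    def __init__(self, action):
        self.action = action

    def execute(self, log):
        # A real command would do the work here; the sketch only records it.
        log.append(self.action)


class Section:
    """Composite command: executes its children in document order."""

    def __init__(self, title, children):
        self.title = title
        self.children = children

    def execute(self, log):
        for child in self.children:
            child.execute(log)


def build(element):
    """Recursively turn an XML element into a command object."""
    if element.tag == "do":
        return Instruction(element.get("action"))
    return Section(element.get("title"), [build(child) for child in element])


root = build(ET.fromstring(SCRIPT))
executed = []
root.execute(executed)
print("\n".join(executed))
```

The caller only ever sees `execute`, so sections can nest to any depth without the calling code changing.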

Charles
Jul 21 '05 #3
Charles,

If it were my problem and the document were well done, then I would look at what
Word automation could do for me, starting with the paragraph settings.

It is a long time since I did this, but you asked for ideas.

Cor
Jul 21 '05 #4
Hi Cor

You are right. I did ask for ideas, and the Word automation is a distinct
possibility.

Unfortunately, the document is not well done, that is to say it is not
consistent in its terminology, and so it would be difficult to create rules
around the written text; but nevertheless possible.

I think you have succinctly cut to the heart of the matter. I was hoping for
a panacea, when all the time fearing that there is no easy option: I either
have to create a sophisticated programme to process the document, or bite
the bullet and get typing.

[A special prize will be awarded to the person who correctly identifies all
the metaphors, and can accurately report the number used]

Charles
Jul 21 '05 #5
Charles,
Which version of Word?

Later versions of Word (XP, 2003, not sure about 2000) support saving as an
XML file.

I would then consider passing Word's XML file to an XSLT transform to
"simplify" the document, then read this "simplified" XML in my program...

Looking at the help for Word 2003, you might be able to define an XML Schema
that you could attach to your Word document and then replace parts of the Word
document with XML tags. I would think with some effort you might be able to
automate replacing parts of the document with tags, which may eliminate the
need for the XSLT transform.
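
To give a feel for what such a "simplify" pass might keep, here is a minimal sketch. It is not XSLT: it does the same kind of reduction with Python's standard `xml.etree.ElementTree`. The namespace URI and the `w:p` / `w:pPr` / `w:pStyle` / `w:t` element names are what I recall of Word 2003's XML format and should be checked against a real saved file; the sample document is invented.

```python
import xml.etree.ElementTree as ET

# Namespace of Word 2003's XML format as recalled; verify against a real file.
W = "http://schemas.microsoft.com/office/word/2003/wordml"

# Invented sample: one styled heading and one body paragraph split into runs.
SAMPLE = f"""
<w:wordDocument xmlns:w="{W}">
  <w:body>
    <w:p>
      <w:pPr><w:pStyle w:val="Heading1"/></w:pPr>
      <w:r><w:rPr><w:b/></w:rPr><w:t>Start Here</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:t>Do </w:t></w:r><w:r><w:t>something</w:t></w:r>
    </w:p>
  </w:body>
</w:wordDocument>
"""


def simplify(word_xml):
    """Return (style, text) pairs, one per paragraph, dropping all formatting."""
    paragraphs = []
    # iter() finds paragraphs at any nesting depth, so wrapper elements
    # around them do not matter.
    for p in ET.fromstring(word_xml).iter(f"{{{W}}}p"):
        style = p.find(f"{{{W}}}pPr/{{{W}}}pStyle")
        name = style.get(f"{{{W}}}val") if style is not None else None
        # A paragraph's text may be split across several runs; join them.
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t"))
        paragraphs.append((name, text))
    return paragraphs


pairs = simplify(SAMPLE)
for style, text in pairs:
    print(style, "|", text)
```

For a very large file, `ET.iterparse` could be used instead of loading the whole tree at once.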

Note: I've used Xml in Word very minimally.

Hope this helps
Jay
Jul 21 '05 #6
The example below ... is it laid out in a table? Do you have any kind
of formatting or anything else that makes the # ( 1.1.2 ) stand out
from the text ( more ins ) and then the stuff below it?

If you can tell the stuff apart, and there's a solid pattern, you
could write a VBA script to loop through the entire document,
line-by-line, and figure out where in the XML each line goes.

1 Start Here
  1.1 First Group
    1.1.1 Instructions
      Do something
      Do something else
    1.1.2 More Instructions
      Do this other thing
      Do first thing again
  1.2 Second Group
    ...
  1.3 More of the Same
Jul 21 '05 #7
Hi Jay

I noticed the Save As XML so tried it (I have just moved from Word XP to
2003). The resultant file, with no transform, was 9 Mb. I then tried to load
it into XMLSpy and after about 10 minutes of a blank window it GPF'ed on me
:-(

I think you have probably hit on something though, but I don't know XSLT
well enough to know how to start with transforming the file. From what I
could make of the file after loading it in Notepad, it contains a tremendous
amount of bloat. For example, formatting and layout information that I just
don't need. I really only want the structure, after the first pass anyway.
Then I could set about translating the text into something more formal.
Also, this translation process will be a one-off, or at most occasional when
the document changes. It will be the cut down, formal xml file that my
program will read at start-up.

Thanks for the suggestion. I will look into it further.

Cheers.

Charles
Jul 21 '05 #8
Hi (do I call you Thug?)

On the whole, tables are not used, although in places they are [sadly the
document structure is not as consistent as it might be]. And in another
document that I will have to process, tables are used extensively. However,
in this one heading styles are used for each section heading, so I could
glean the hierarchy from those. If I have to resort to a program to do the
processing, I guess this will have to be the way to do it.
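
A minimal sketch of gleaning the hierarchy from heading styles, in Python for illustration. It assumes the paragraphs have already been pulled out of Word as (style name, text) pairs by some other means, and that the headings use the built-in "Heading 1" to "Heading 9" style names of English-language Word. Because the document is not fully consistent, the sketch also reports places where a heading level is skipped; `number_sections` is a made-up helper name.

```python
import re

# Hypothetical input: (style name, text) pairs, one per paragraph.
# Any style that is not "Heading N" is treated as body text.
PARAGRAPHS = [
    ("Heading 1", "Start Here"),
    ("Heading 2", "First Group"),
    ("Heading 3", "Instructions"),
    ("Normal", "Do something"),
    ("Normal", "Do something else"),
    ("Heading 3", "More Instructions"),
    ("Normal", "Do this other thing"),
    ("Heading 2", "Second Group"),
]

HEADING_STYLE = re.compile(r"^Heading\s*(\d)$")


def number_sections(paragraphs):
    """Return (records, warnings): each record is (section number, kind, text)."""
    counters = []  # counters[i] is the current number at heading level i + 1
    records, warnings = [], []
    for style, text in paragraphs:
        match = HEADING_STYLE.match(style or "")
        if match:
            level = int(match.group(1))
            if level > len(counters) + 1:
                warnings.append(f"level jumps to {level} at {text!r}")
            # Pad skipped levels with 0, then drop counters below this level.
            counters += [0] * (level - len(counters))
            del counters[level:]
            counters[-1] += 1
            records.append((".".join(map(str, counters)), "heading", text))
        else:
            records.append((".".join(map(str, counters)), "instruction", text))
    return records, warnings


records, warnings = number_sections(PARAGRAPHS)
for number, kind, text in records:
    print(f"{number:8} {kind:12} {text}")
```

The warnings list gives a quick way to find the inconsistent parts of the document before any hand editing starts.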

Thanks for the suggestion.

Charles
Jul 21 '05 #9
Charles,
"The resultant file, with no transform, was 9 Mb."

That is where what Thug and I suggested comes in: first use a VBA script to
automate cleaning up the document, getting it closer to a "nicer" XML
format. Then save it, then possibly apply an XSLT, then process it...

Is this document a one time thing or is it going to be ongoing?

If it's ongoing I would seriously consider defining a template in Word that
helps enforce the format required.

Hope this helps
Jay
Jul 21 '05 #10
