Need Help with C program

I am a programmer (cobol, peoplesoft, sqr, etc.) so I am familiar
with programming logic, etc. but not very familiar with C. I need a C
program for a study I'm doing. The program is fairly simple, but since I'm
not familiar with C code it would take me some time to get it to work. A
good C programmer can probably give me the code in a few minutes.

Here are the program specs:

I'm doing a study on the italicized words in the King James Bible. The
italicized words are words not in the Greek and Hebrew text but were
added by the translators for sentence structure, etc.

I have a text file of the Old Testament and the New Testament. The
italicized words are surrounded by brackets []. If a verse contains
any italicized words I want to write that verse to a new flat-file
that I will use in Word to look at, get counts, and do some grammar
statistics, etc.

Here's a sample of the input flat-file with a few verses from Genesis.
Notice verses 2, 4, and 7 contain italicized words or brackets.

1 In the beginning God created the heaven and the earth.
2 And the earth was without form, and void; and darkness [was] upon
the face of the deep. And the Spirit of God moved upon the face of the
waters.
3 And God said, Let there be light: and there was light.
4 And God saw the light, that [it was] good: and God divided the light
from the darkness.
5 And God called the light Day, and the darkness he called Night. And
the evening and the morning were the first day.
6 And God said, Let there be a firmament in the midst of the waters,
and let it divide the waters from the waters.
7 And God made the firmament, and divided the waters which [were]
under the firmament from the waters which [were] above the firmament:
and it was so.
8 And God called the firmament Heaven. And the evening and the morning
were the second day.

So my output file would look like

2 And the earth was without form, and void; and darkness [was] upon
the face of the deep. And the Spirit of God moved upon the face of the
waters.
4 And God saw the light, that [it was] good: and God divided the light
from the darkness.
7 And God made the firmament, and divided the waters which [were]
under the firmament from the waters which [were] above the firmament:
and it was so.

The logic I came up with would be something like:

Read a character from the flat-file:
    Check for a number
    If number (indicates a new verse)
        If a "[" was found (found_flag) in the previous verse
            write the WORK AREA stored verse to the output file.
            Possibly need to write an eol character
            Clear out the work area
            Clear out the found_flag

        If a "[" was NOT found (found_flag) then
            clear the stored verse

    Move each character to a WORK AREA
    Check each character for a "["
        If found set the found_flag = y

    Go get another character
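
The logic above translates fairly directly into C. The following is a minimal sketch rather than a finished program: the function name filter_verses is made up for illustration, and it assumes that a verse begins wherever a line starts with a digit, as in the sample input. The work area grows as needed, so no maximum verse length has to be known in advance, and the last verse is flushed at end of file.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Copy every verse that contains '[' from in to out.
   Assumption: a verse starts where a line begins with a digit.
   Returns 0 on success, -1 if memory runs out. */
int filter_verses(FILE *in, FILE *out)
{
    char *work = NULL;          /* the WORK AREA holding the current verse */
    size_t len = 0, cap = 0;
    int found = 0;              /* the found_flag: has '[' been seen? */
    int at_line_start = 1;
    int ch;

    while ((ch = getc(in)) != EOF) {
        if (at_line_start && isdigit(ch)) {
            /* New verse: write out the previous one if it had a '['. */
            if (found)
                fwrite(work, 1, len, out);
            len = 0;
            found = 0;
        }
        if (len == cap) {
            /* Grow the work area when it is full. */
            size_t newcap = cap ? cap * 2 : 256;
            char *tmp = realloc(work, newcap);
            if (tmp == NULL) {
                free(work);
                return -1;
            }
            work = tmp;
            cap = newcap;
        }
        work[len++] = (char)ch;
        if (ch == '[')
            found = 1;
        at_line_start = (ch == '\n');
    }
    /* End of file: no following verse number triggers the last write. */
    if (found)
        fwrite(work, 1, len, out);
    free(work);
    return 0;
}
```

A caller would open the input and output with fopen and pass both streams to filter_verses. Because the newline characters are stored in the work area along with the text, no extra end-of-line character needs to be written.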

Another variation I would like to do is create an output file of JUST
the italicized words or bracketed words.

A couple of questions or issues:

How large a file can C read?
The Old Testament file is 3,282,275 characters (size is 3,342,336
bytes) If needed I could cut up the files.

Is there a better way?

I will probably use Borland C+ as the compiler.

If possible, please email any solutions to tw******@charter.net

Thank you very much for your time and expertise.
Nov 13 '05 #1
terry wrote:
> I am a programmer (cobol...

My condolences :)

> .... I need a C
> program for a study I'm doing. The program is fairly simple, but since I'm
> not familiar with C code it would take me some time to get it to work. A
> good C programmer can probably give me the code in a few minutes.

See below.

[examples and pseudocode snipped]

Your spec is well written and your logic looks like it would work,
bearing in mind you have to deal with the beginning and end of the
file.

> Another variation I would like to do is create an output file of JUST
> the italicized words or bracketed words.

Much easier, of course.

> How large a file can C read?
> The Old Testament file is 3,282,275 characters (size is 3,342,336
> bytes) If needed I could cut up the files.

If you slurp the whole thing in one go you'll need an array of that
many chars. Probably not difficult on any modern general-purpose
computer. I'd break the reads into 32k or so at a time, though.

If you read and process one line at a time there's no limit to
overall file size. IIRC there is an implementation-defined limit to
the maximum line length, which might cause problems if your input
only has newlines at the end of paragraphs.
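
To illustrate the line-at-a-time approach, here is a small sketch that counts the lines containing a '['. The function name count_bracket_lines and the buffer size are arbitrary choices for illustration. It does not assume that a whole line fits in the buffer: fgets returns an over-long line in pieces, and only a piece ending in a newline completes the line.

```c
#include <stdio.h>
#include <string.h>

/* Count the lines of `in` that contain a '['. The file can be any size,
   and a line can be longer than the buffer, because fgets hands back an
   over-long line in several pieces. */
long count_bracket_lines(FILE *in)
{
    char buf[128];              /* deliberately small; the size is arbitrary */
    long count = 0;
    int seen = 0;               /* has '[' been seen on the current line? */

    while (fgets(buf, sizeof buf, in) != NULL) {
        if (strchr(buf, '[') != NULL)
            seen = 1;
        /* A piece that ends in '\n' completes the current line. */
        if (strchr(buf, '\n') != NULL) {
            count += seen;
            seen = 0;
        }
    }
    return count + seen;        /* the last line may lack a newline */
}
```

Only one buffer's worth of text is in memory at any time, so the overall file size does not matter.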
> I will probably use Borland C+ as the compiler.

We don't worry about that here. We worry about code that will work
on any platform.

> If possible, please email any solutions to tw******@charter.net
> Thank you very much for your time and expertise.

We don't generally hand people their work on a silver platter here,
either. Some people will quote you their consulting rates.

--
Tom Zych
This email address will expire at some point to thwart spammers.
Permanent address: echo 'g******@cbobk.pbz' | rot13
Nov 13 '05 #2
Do you know how big the biggest verse is ? Assuming that numbers are verse
delimiters and CANNOT appear in the actual verse itself, this should be an
easy problem. Allocate a buffer for the biggest verse. Read the entire
verse into the buffer. Determine if there is a '[' in the verse with
strchr. If there is, write the whole buffer to your output/file where you
want it.
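
A sketch of this buffer-and-strchr approach might look like the following. The names filter_verses_fixed and flush_verse and the two size limits are assumptions made for illustration; the real limits would have to come from the actual text. It also assumes that a line starting with a digit begins a new verse and that no input line is longer than MAX_LINE.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_VERSE 4096  /* assumed upper bound on the longest verse */
#define MAX_LINE  512   /* assumed upper bound on one input line */

/* Write the buffered verse if it contains '[', then empty the buffer. */
static void flush_verse(char *verse, FILE *out)
{
    if (strchr(verse, '[') != NULL)
        fputs(verse, out);
    verse[0] = '\0';
}

/* Returns 0 on success, -1 if a verse does not fit in MAX_VERSE. */
int filter_verses_fixed(FILE *in, FILE *out)
{
    char verse[MAX_VERSE] = "";
    char line[MAX_LINE];

    while (fgets(line, sizeof line, in) != NULL) {
        if (isdigit((unsigned char)line[0]))
            flush_verse(verse, out);    /* a number starts a new verse */
        if (strlen(verse) + strlen(line) >= sizeof verse)
            return -1;
        strcat(verse, line);
    }
    flush_verse(verse, out);            /* do not forget the last verse */
    return 0;
}
```

The bounds check means a wrong guess for MAX_VERSE shows up as an error return instead of a buffer overrun.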
Nov 13 '05 #3
On Fri, 12 Sep 2003 11:08:44 -0400, "Kris Wempa"
<calmincents(NO_SPAM)@yahoo.com> wrote:
> Do you know how big the biggest verse is ? Assuming that numbers are verse
> delimiters and CANNOT appear in the actual verse itself, this should be an
> easy problem. Allocate a buffer for the biggest verse. Read the entire
> verse into the buffer. Determine if there is a '[' in the verse with
> strchr. If there is, write the whole buffer to your output/file where you
> want it.

The italic text may span more than one verse; at least it does in Dutch
Bibles. So you have to look for ']' too.
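
One way to handle italics that run across a verse boundary is to carry a flag from one verse to the next. This is only a sketch: verse_has_italics is a hypothetical helper, and it assumes brackets are never nested.

```c
/* Scan one verse. *open says whether an italic run was still open when
   the verse began; it is updated for the next verse. Returns 1 if the
   verse contains a '[' or started inside an open run, else 0. */
int verse_has_italics(const char *verse, int *open)
{
    int italic = *open;     /* a run carried over counts for this verse */
    const char *p;

    for (p = verse; *p != '\0'; p++) {
        if (*p == '[') {
            *open = 1;
            italic = 1;
        } else if (*p == ']') {
            *open = 0;
        }
    }
    return italic;
}
```

The caller would start with the flag at 0 and pass the same variable for every verse, writing the verse out whenever the function returns 1.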

Coos
Nov 13 '05 #4
terry wrote:
> I have a text file of the Old Testament and the New Testament. The
> italicized words are surrounded by brackets []. If a verse contains
> any italicized words I want to write that verse to a new flat-file
> that I will use in Word to look at, get counts, and do some grammar
> statistics, etc.

Make sure each verse is on a separate line. Then:

grep '\[' bible.txt > italics.txt

> Another variation I would like to do is create an output file of JUST
> the italicized words or bracketed words.
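/* fpin and fpout are assumed to be FILE pointers that have already
   been opened for reading and writing respectively. */
int ch;
int inword = 0;   /* nonzero while inside a bracketed word */
while((ch = getc(fpin)) != EOF)
{
    if(inword)
    {
        fputc(ch, fpout);
        if(ch == ']')
        {
            fputc('\n', fpout); /* one word per line */
            inword = 0;
        }
    }
    else
    {
        if(ch == '[')
        {
            fputc(ch, fpout);
            inword = 1;
        }
    }
}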

> A couple of questions or issues:
>
> How large a file can C read?

Big as you like, if you're prepared to read it in chunks.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton
Nov 13 '05 #5

On Fri, 12 Sep 2003, Coos Haak wrote:
> On Fri, 12 Sep 2003 11:08:44 -0400, "Kris Wempa" wrote:
> > Do you know how big the biggest verse is ? Assuming that numbers are verse
> > delimiters and CANNOT appear in the actual verse itself, this should be an
> > easy problem. Allocate a buffer for the biggest verse. Read the entire
> > verse into the buffer. Determine if there is a '[' in the verse with
> > strchr. If there is, write the whole buffer to your output/file where you
> > want it.
>
> The italic text may span more than one verse; at least it does in Dutch
> Bibles. So you have to look for ']' too.


Ouch. You mean literally, like this:

42 And the people did rejoice and did feast upon the
lambs and toads and tree-sloths and fruit-bats and
orangutans [and breakfast cereals.
43 Now] did the Lord say, "First thou pullest the
Holy Pin. Then thou must count to three. Three shall
be the number of the counting and the number of the
counting shall be three.

Or just that some verses end with italics, and are followed by
verses beginning with italics, like *this*:

42 And the people did rejoice and did feast upon the
lambs and toads and tree-sloths and fruit-bats and
orangutans [and breakfast cereals].
43 [Now] did the Lord say, "First thou pullest the
Holy Pin. Then thou must count to three. Three shall
be the number of the counting and the number of the
counting shall be three.

The second way is the easier way to parse, obviously.
So it all depends on how the text that the OP is using
is arranged.

-Arthur
Nov 13 '05 #6
Tom Zych wrote:
> > Another variation I would like to do is create an output file of JUST
> > the italicized words or bracketed words.
>
> Much easier, of course.

Not any more or less easier...

> > How large a file can C read?
> > The Old Testament file is 3,282,275 characters (size is 3,342,336
> > bytes) If needed I could cut up the files.
>
> If you slurp the whole thing in one go you'll need an array of that
> many chars. Probably not difficult on any modern general-purpose
> computer. I'd break the reads into 32k or so at a time, though.


Fortunately for you, your input is fixed, so you can tweak your program
by pre-processing the data. For example, you *know* how many characters
are in the longest verse. If you are using a particular copy of the
text, you even know how many individual lines are in the longest verse.
So, you just read lines (fgets) into a circular buffer looking first
for verse numbers, and then for '[' characters. The rest should follow
easily.
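
The pre-processing step mentioned here could be a one-off pass that measures the longest verse, so that a fixed buffer can then be sized with confidence. A rough sketch, assuming (as in the sample input) that a verse begins wherever a line starts with a digit; longest_verse is a name invented for this example.

```c
#include <ctype.h>
#include <stdio.h>

/* Return the length in characters of the longest verse in `in`,
   counting newlines. Assumption: a verse starts where a line begins
   with a digit. */
size_t longest_verse(FILE *in)
{
    size_t longest = 0, current = 0;
    int at_line_start = 1;
    int ch;

    while ((ch = getc(in)) != EOF) {
        if (at_line_start && isdigit(ch)) {
            /* A new verse begins: record the one that just ended. */
            if (current > longest)
                longest = current;
            current = 0;
        }
        current++;
        at_line_start = (ch == '\n');
    }
    return current > longest ? current : longest;
}
```

Running this once over the text gives the bound; a buffer one character larger leaves room for the terminating null character.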

/david

--
Andre, a simple peasant, had only one thing on his mind as he crept
along the East wall: 'Andre, creep... Andre, creep... Andre, creep.'
-- unknown
Nov 13 '05 #7

On Mon, 15 Sep 2003, David Rubin wrote:
> Tom Zych wrote:
> > > Another variation I would like to do is create an output file of JUST
> > > the italicized words or bracketed words.
> >
> > Much easier, of course.
>
> Not any more or less easier...


Then you must know a very easy way to do the first part (printing out
whole verses). And I doubt you do.
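int input_char;
int output_flag = 0;   /* nonzero while inside brackets */

while ((input_char = getchar()) != EOF)
{
    if (!output_flag && input_char == '[')
        output_flag = 1;
    else if (output_flag && input_char == ']') {
        output_flag = 0;
        putchar('\n');
    }
    else if (output_flag)   /* print only what is inside brackets */
        putchar(input_char);
}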
See how the second part doesn't require memory allocation;
in fact, it doesn't even require more than three bytes of
state information. The first part requires an arbitrarily
large amount of state information -- which *may* not be
feasible on some implementations with small memories and
big disks.

-Arthur
Nov 13 '05 #8
"Arthur J. O'Dwyer" wrote:

> On Mon, 15 Sep 2003, David Rubin wrote:
> > Tom Zych wrote:
> > > > Another variation I would like to do is create an output file of JUST
> > > > the italicized words or bracketed words.
> > >
> > > Much easier, of course.
> >
> > Not any more or less easier...
>
> Then you must know a very easy way to do the first part (printing out
> whole verses). And I doubt you do.
>
> [snip - code] See how the second part doesn't require memory allocation;
> in fact, it doesn't even require more than three bytes of
> state information. The first part requires an arbitrarily
> large amount of state information


Not true. As I pointed out, since the OP is working with a specific text
(and perhaps a specific version of that text), these variables are
bounded.

/david

--
Andre, a simple peasant, had only one thing on his mind as he crept
along the East wall: 'Andre, creep... Andre, creep... Andre, creep.'
-- unknown
Nov 13 '05 #9

On Thu, 18 Sep 2003, David Rubin wrote:
> "Arthur J. O'Dwyer" wrote:
> > On Mon, 15 Sep 2003, David Rubin wrote:
> > > Tom Zych wrote:
> > > > > Another variation I would like to do is create an output file
> > > > > of JUST the italicized words or bracketed words.
> > > >
> > > > Much easier, of course.
> > >
> > > Not any more or less easier...
> >
> > Then you must know a very easy way to do the first part (printing out
> > whole verses). And I doubt you do.
> >
> > [snip - code]
> > See how the second part doesn't require memory allocation;
> > in fact, it doesn't even require more than three bytes of
> > state information. The first part requires an arbitrarily
> > large amount of state information
>
> Not true. As I pointed out, since the OP is working with a specific text
> (and perhaps a specific version of that text), these variables are
> bounded.


Well, then why on earth did you bother to give any such complicated
algorithm at all? A simple "printf" would have solved the OP's
problem *much* faster and simpler.
#include <stdio.h>

int main(void)
{
    printf("The requested output is:\n");
    printf("2 And the earth was without form, and void; and darkness "
           "[was] upon the face of the deep. And the Spirit of God "
           "moved upon the face of the waters.\n");
    printf("4 And God saw the light, that [it was] good: and God "
           "divided the light from the darkness.\n");
    printf("7 And God made the firmament, and divided the waters which "
           "[were] under the firmament from the waters which [were] "
           "above the firmament: and it was so.\n");
    /* ... */
    printf("26 So Joseph died, [being] an hundred and ten years old: "
           "and they embalmed him, and he was put in a coffin in "
           "Egypt.\n");
    return 0;
}
Perhaps the OP wanted to find an algorithm that would be applicable
to *any* text he cared to examine, not merely a single transcription
of KJV.

-Arthur
Nov 13 '05 #10
