473,394 Members | 1,674 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,394 software developers and data experts.

K&R Ex 1-22

I'm trying to complete excercise 1-23 in the K&R book, which calls for
a program to wrap input lines at a given column, making sure to handle
lines that wrap twice or don't contain whitespace. This is what I have
so far*:
http://pastebin.com/f61c6cfc
*Insert is a function which shifts the contents of the string right by
one character and inserts another character in its place.

I'm sure there's probably a completely better way to do this, but I'm
too far into it to rewrite it a different way.

Anyway, what I really want to know is
why the variable "search_state" always equals zero (at least according
to the printf statement I put in there and the fact that it never
wraps) even though the if statement before "search_state = 1" resolves
to be true. It's almost as though search_state was losing its value at
the end of the for loop. If it doesn't get set to 1, no wrapping will
happen.

I feel really stupid for having this much
trouble with an excercise so early in the book, but this has me
baffled.
-
You can get everything in life you want, if you will help enough other
people get what they want.
Aug 19 '08 #1
17 1639
Andrew C. said:
I'm trying to complete excercise 1-23 in the K&R book, which calls for
a program to wrap input lines at a given column, making sure to handle
lines that wrap twice or don't contain whitespace. This is what I have
so far*:
http://pastebin.com/f61c6cfc
I haven't much time right now, but I will have, later on. For me (or indeed
for anyone else) to look at your problem in sufficient detail to help you,
it would be handy if you could post the complete, compilable source code
that you are using, preferably right here on Usenet rather than on a Web
site. Also, it would be very useful to know what test data you're using.

My first glance (and it *is* only a quick glance) suggests that, when you
pass a string of fewer than 80 characters, your value for 'next' will be
0, which looks like it could be problematic. (That might not be the
problem - it's just a first impression.)

No doubt someone else will be able to help you more, but I really do
recommend that you post the full source here, together with your test
data.

HTH. HAND.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 19 '08 #2
On 2008-08-19, Richard Heathfield <rj*@see.sig.invalidwrote:
Andrew C. said:
>I'm trying to complete excercise 1-23 in the K&R book, which calls for
a program to wrap input lines at a given column, making sure to handle
lines that wrap twice or don't contain whitespace. This is what I have
so far*:
http://pastebin.com/f61c6cfc

I haven't much time right now, but I will have, later on. For me (or indeed
for anyone else) to look at your problem in sufficient detail to help you,
it would be handy if you could post the complete, compilable source code
that you are using, preferably right here on Usenet rather than on a Web
site. Also, it would be very useful to know what test data you're using.

My first glance (and it *is* only a quick glance) suggests that, when you
pass a string of fewer than 80 characters, your value for 'next' will be
0, which looks like it could be problematic. (That might not be the
problem - it's just a first impression.)

No doubt someone else will be able to help you more, but I really do
recommend that you post the full source here, together with your test
data.

HTH. HAND.
You're right, I should have posted the full source code. I used
pastebin because I figured it would have been a bigger breach of
ettiquette to post a sizeable source file directly to usenet, long
lines and all, but here goes:

#include <stdio.h>
#include <string.h>

#define MAXCOLS 80 /*Column to wrap at*/
#define MAXLINE 1000 /*Maximum string length*/
#define NEXT(current) ((current/MAXCOLS+1)*MAXCOLS) /*macro
to calculate next*/

void wrap(char s[], int len);
void insert(char s[], int at, char c);
int getline(char s[], int lim);

main()
{
char line[MAXLINE];
int len;

int i;
for(i = 0; i<=80; ++i)
putchar(' ');
printf("|\n");

while((len = getline(line, MAXLINE)) 0) {
wrap(line,len);
printf("%s",line);
}

return 0;
}

/*wrap: word wrap a string at the desired length */
void wrap(char s[], int len)
{
int search_state = 0; /*0 = looking for next, 1 = looking for
whitespace*/
int i = len-1;
int next = ((i/MAXCOLS))*MAXCOLS; /*next column to wrap at*/

for(i = len-1; i MAXCOLS-1; --i) {
next = ((i/MAXCOLS))*MAXCOLS;
printf(" %d ",search_state); /*debug*/
if(search_state == 1) {
if(s[i] == ' ') { /*all is well*/
s[i] = '\n'; /*place a newline in the
whitespace*/
search_state = 0; /*reset the state*/
}
else if(i == next && s[i] != ' ') { /*there
wasn't any whitespace*/
insert(s,i,'\n'); /*insert a newline
at the wrap point*/
search_state = 0; /*reset the state*/
}
}
else
if(next >= MAXCOLS && i == next) /*if there is
a next, and i equals it, start wrapping*/
search_state = 1;
}
}

/*insert: insert a character into the middle of a character string*/
void insert(char s[], int at, char c)
{
int i;

for(i = strlen(s)-1; i>=at; --i)
s[i+1] = s[i];
s[i] = c;
}

/* getline: read a line into s, return length */
int getline(char s[], int lim)
{
int c, i;
for (i = 0; i<lim-1 && (c=getchar()) != EOF && c!='\n'; ++i)
s[i] = c;
if (c == '\n') {
s[i] = c;
++i;
}
if (i >= lim-1) {
while((c = getchar()) != EOF && c!='\n')
++i; /*increment length anyway in case of overflow*/
s[lim-1] = '\0';
}
else
s[i] = '\0';
return i;
}
By the way, with any input line less than 80 characters, absolutely
nothing should happen to it, which is indeed the correct behavior
because no word wrapping is needed.

As for test data, any input (I either mashed the keyboard or typed in
song lyrics) does the exact same thing: spit the input out unchanged.

Thanks for your help, and sorry for being a noob.
--
You can get everything in life you want, if you will help enough other
people get what they want.
Aug 19 '08 #3
Andrew C. said:
On 2008-08-19, Richard Heathfield <rj*@see.sig.invalidwrote:
>>
No doubt someone else will be able to help you more, but I really do
recommend that you post the full source here, together with your test
data.
You're right, I should have posted the full source code. I used
pastebin because I figured it would have been a bigger breach of
ettiquette to post a sizeable source file directly to usenet, long
lines and all,
Your concern for etiquette makes a welcome change! Nevertheless, a hundred
lines or so is no big deal in Usenet nowadays, at least not for most
people, and we can hope that the first couple of articles in this thread
will warn modem-users off. :-)

Okay, my first step was to compile it:

gcc -W -Wall -ansi -pedantic -Wformat-nonliteral -Wcast-align
-Wpointer-arith -Wbad-function-cast -Wmissing-prototypes
-Wstrict-prototypes -Wmissing-declarations -Winline -Wundef
-Wnested-externs -Wcast-qual -Wshadow -Wconversion -Wwrite-strings
-ffloat-store -fno-builtin -O2 -g -pg -c -o kr123.o kr123.c
kr123.c:14: warning: return-type defaults to `int'
kr123.c:14: warning: function declaration isn't a prototype
kr123.c: In function `wrap':
kr123.c:50: warning: passing arg 3 of `insert' with different width due to
prototype
kr123.c: In function `getline':
kr123.c:75: warning: `c' might be used uninitialized in this function

So let's go fix those things.

On line 14, I changed:

main()

to:

int main(void)

On line 50, '\n' is actually of type int, would you believe? But we know
its value must fit in a char (for several reasons, one of which is that it
is a mandatory member of the source character set, so it must have a value
in the range 1 to CHAR_MAX), so it's okay to pass it to a function
expecting a char. No fix required (although changing insert() to take int
rather than char would at least silence the warning).

By far the most serious warning is the one about 'c' possibly being used
uninitialised. But as far as I can see, this is actually impossible, given
that there is only one call to getline, and it specifies a large-ish value
for lim, so i<lim-1 must be true first time round, so the getchar is
definitely called.

So much for the compiler. Let's look at the behaviour.

To make matters a little less long-winded, I changed MAXCOLS to 12 and
recompiled. I gave your program an input of abcdefghijklmnop, and it
printed a bunch of spaces and a vertical bar, as expected. (I don't know
why you do that, but hey, it's your program!) It then spat out what look
like some debugging numbers, followed by abcdefghijklmnop again, with no
wrap.

Then I gave it lots and lots of junk, and (remembering that we changed
MAXCOLS to 12) look what happened!

fkjslvnfdlskjvnfsdkjlvnfdskjvnfsdkjvnfksldjnvjksld fnvjdskfvnfdjsvnjkdfls
jfks vjksfdvjkfsdhvjksdfvhjksdfv
0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 fkjslvnfdls
jjvnfsdkjlvnfdskjvnfsdkj
nnfksldjnvjksldfnvjdskfv
ffdjsvnjkdfls jfks vjksf
vvjkfsdhvjksdfvhjksdfv
#1 Wed Mar 27 13:57:05 UTC 2002

WOW! I have no idea where that timestamp came from... :-)

So it looks like you *are* getting wrap, but not how and when you want it.
It also looks like you have an unterminated "string". (That's the usual
cause of utterly off-the-wall output.)

Time to read the code.

The first few lines in main just print out a bunch of spaces and a |, for
no apparent reason - although no doubt you know why. Nothing illegal
there, so let's move on.

You have a nice simple loop, consisting of getline, wrap, and printf. The
printf prints 'line', so we have a good idea of which "string" isn't
getting terminated. I checked to see whether it is constructed properly,
by looking at getline, and getline seems fine.

So the problem of failure to terminate is either in wrap() or in insert(),
and my money is on insert(), so let's take a look:

void insert(char s[], int at, char c)
{
int i;

for(i = strlen(s)-1; i>=at; --i)
s[i+1] = s[i];
s[i] = c;
}

Consider the following string: "ABCDEF\n\0" (I've made the \0 explicitly
visible, but it isn't part of the count given by strlen.)

i = strlen(s) - 1; Since strlen(s) is 7 (A-F is 6, + \n), i takes the value
6.

s[i+1] = s[i];

so we're doing s[7] = s[6]. Now, s[7] /was/ the null terminator, but we
just overwrote that. Not good. Let's fix it.

Whilst strlen yields the length of a string (not including the null), we
can think of it in another way, i.e. as returning the offset of the null
terminator.

void insert(char s[], int at, char c)
{
size_t nullpos = strlen(s);
size_t pos = nullpos;
while(pos 0 && pos at)
{
s[pos] = s[pos - 1];
--pos;
}
s[pos] = c;
s[nullpos + 1] = '\0';
}

That's fixed our null termination problem. Now for the logic bug, which is
clearly in the wrap() function. Experimentation quickly revealed that your
code *is* wrapping, albeit rather bizarrely - for example, the first wrap
seems to occur at 2*MAXCOLS, rather than at MAXCOLS.

It appears that what you're trying to do is replace a space with a newline
if possible, but if necessary you'll shift a long string's tail over by
one character to insert a newline. A perfectly reasonable approach,
although of course you have to be careful with lines close to your
buffer's size limit.

It strikes me that it would be much easier to do this as follows:

void wrap(char s[], size_t len)
{
char *t = s;
while(len >= MAXCOLS)
{
t += MAXCOLS;
while(t s && *t != ' ')
{
--t;
}
if(*t == ' ')
{
*t = '\n';
}
else
{
t += MAXCOLS;
insert(t, 0, '\n');
}
len -= (t - s);
s = t;
}
}

Dropping that - and the insert() update - into your program (and making
appropriate type-changes from int to size_t, which is a more appropriate
type for describing buffer sizes), for input like this:

That's fixed our null termination problem. Now for the logic bug, which is
clearly in the wrap() function. Experimentation quickly revealed that your
code *is* wrapping, albeit rather bizarrely - for example, the first wrap
seems to occur at 2*MAXCOLS, rather than at MAXCOLS.

I get output like this:

That's fixed
our null
termination
problem.
Now for the
logic bug,
which is
clearly in
the wrap()
function.
Experimenta
tion
quickly
revealed
that your
code *is*
wrapping,
albeit
rather
bizarrely -
for
example,
the first
wrap seems
to occur at
2*MAXCOLS,
rather than
at
MAXCOLS.

and that seems to be working nicely.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 19 '08 #4
Thanks much for your help. First, let me go ahead and explain myself
on that one segment of code involving spaces and a pipe - that's just
a ruler I was printing to see if I had typed enough input to warrant
wrapping. I noticed it was hardcoded to use 80 columns so I fixed it
to use MAXCOLS instead. I guess I should have commented that.

Thank you for pointing out the problem with insert(), i probably
wouldn't have noticed it otherwise.

Your solution for wrap is amazingly more elegant than mine:

On 2008-08-19, Richard Heathfield <rj*@see.sig.invalidwrote:
It strikes me that it would be much easier to do this as follows:

void wrap(char s[], size_t len)
{
char *t = s;
while(len >= MAXCOLS)
{
t += MAXCOLS;
while(t s && *t != ' ')
{
--t;
}
if(*t == ' ')
{
*t = '\n';
}
else
{
t += MAXCOLS;
insert(t, 0, '\n');
}
len -= (t - s);
s = t;
}
}
However, I'm having trouble understanding it because I can see it uses
pointers, and having not reached that chapter in the book I don't really
understand what pointers do. I'm sure this section can be written
without pointers though, so I'll just set out to rewrite the function
using the logic I think I'm seeing above.

--
You can get everything in life you want, if you will help enough other
people get what they want.
Aug 19 '08 #5
On Tue, 19 Aug 2008 07:57:04 +0000, Richard Heathfield posted:
>http://pastebin.com/f61c6cfc

I haven't much time right now, but I will have, later on. For me (or indeed
for anyone else) to look at your problem in sufficient detail to help you,
it would be handy if you could post the complete, compilable source code
that you are using, preferably right here on Usenet rather than on a Web
site. Also, it would be very useful to know what test data you're using.
Richard errs that everyone else prefers source posting in clc as opposed to
a link. Are you solving 22 or 23?
--
When a new source of taxation is found it never means, in practice, that
the old source is abandoned. It merely means that the politicians have two
ways of milking the taxpayer where they had one before. 8
H. L. Mencken
Aug 20 '08 #6
On 2008-08-20, Ron Ford <ro*@example.invalidwrote:
On Tue, 19 Aug 2008 07:57:04 +0000, Richard Heathfield posted:
>>http://pastebin.com/f61c6cfc

I haven't much time right now, but I will have, later on. For me (or indeed
for anyone else) to look at your problem in sufficient detail to help you,
it would be handy if you could post the complete, compilable source code
that you are using, preferably right here on Usenet rather than on a Web
site. Also, it would be very useful to know what test data you're using.

Richard errs that everyone else prefers source posting in clc as opposed to
a link. Are you solving 22 or 23?
I'm solving 22. I already finished 23, it was actually quite easy.

--
Let us never negotiate out of fear, but let us never fear to negotiate.
-- John F. Kennedy
Aug 20 '08 #7
Andrew C. said:

<snip>
However, I'm having trouble understanding it because I can see it uses
pointers,
Oops! Sorry about that.

void wrap(char s[], size_t len)
{
size_t basepos = 0;
size_t curpos = 0;

while(len >= MAXCOLS)
{
curpos += MAXCOLS;
while(curpos basepos && s[curpos] != ' ')
{
--curpos;
}
if(s[curpos] == ' ')
{
s[curpos] = '\n';
}
else
{
curpos += MAXCOLS;
insert(s, curpos, '\n');
}
len -= (curpos - basepos);
basepos = curpos;
}
}

I haven't actually tested this, but I'm reasonably sure I got the
translation right.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 20 '08 #8
On 2008-08-20, Richard Heathfield <rj*@see.sig.invalidwrote:
Andrew C. said:

<snip>
>However, I'm having trouble understanding it because I can see it uses
pointers,

Oops! Sorry about that.

void wrap(char s[], size_t len)
{
size_t basepos = 0;
size_t curpos = 0;

while(len >= MAXCOLS)
{
curpos += MAXCOLS;
while(curpos basepos && s[curpos] != ' ')
{
--curpos;
}
if(s[curpos] == ' ')
{
s[curpos] = '\n';
}
else
{
curpos += MAXCOLS;
insert(s, curpos, '\n');
}
len -= (curpos - basepos);
basepos = curpos;
}
}

I haven't actually tested this, but I'm reasonably sure I got the
translation right.
Thanks for the clarification, it works like a charm. And I actually
understand this one. I only wish I could have thought of that myself.
--
Let us never negotiate out of fear, but let us never fear to negotiate.
-- John F. Kennedy
Aug 20 '08 #9
Andrew C. said:
On 2008-08-20, Richard Heathfield <rj*@see.sig.invalidwrote:
<snip>
>>
I haven't actually tested this, but I'm reasonably sure I got the
translation right.
Thanks for the clarification, it works like a charm. And I actually
understand this one. I only wish I could have thought of that myself.
In your original article in the thread, you said:

"I'm sure there's probably a completely better way to do this, but I'm
too far into it to rewrite it a different way."

Well, sometimes we get so far into things, and into such a tangle, that a
rewrite is the only sensible option. I spent about three minutes looking
at your function, and I still didn't understand how you intended it to
work. At that point, I thought "well, how hard can it be?" and spent less
time writing the replacement than I had spent trying to understand the
original!

When you find yourself in a tangle of loops and ifs, it's sometimes better
to stop, step back, and think about something else for a while. Then have
another go at writing the code from scratch - preferably in a different
source file, leaving the original well alone. You may well find that your
second attempt is rather more elegant than your first.

In fact, once you've been through all the Chapter 1 exercises and perhaps a
few from Chapters 2-4, you might want to start at Ex 1-8 again, WITHOUT
revisiting your earlier solutions. Work through to 1-24. Then go back and
compare your new solutions with your old ones. I think you'll notice a
dramatic improvement.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 20 '08 #10
On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
<rj*@see.sig.invalidwrote:
>Andrew C. said:
>On 2008-08-19, Richard Heathfield <rj*@see.sig.invalidwrote:
>>>
I hesitate to add this, but here goes:

I worked this exercise before looking at this thread and came up with a
quite different solution, which I include below. Would some of you
please comment on what I have done, especially in contrast to Andrew's
approach. I also am new to C, having previous experience with perl.
Here is my soloution:

#include <stdio.h>
#define MAXLINE 1000 /* input line max */
#define LINEMAX 40 /* fold point max */

char line[MAXLINE];
char newline[MAXLINE];

int getline(void);

int main(void)
{
int k, len, m, n;
int last_spa; /* index of
rightmost space before fold point max */

while((len = getline()) 0) { /* len is length of
input line or segment still unfolded */
while(len){
if(len LINEMAX) { /* folding needed */
last_spa = 0;
/* find a place to
fold */
for(k=0; k < LINEMAX; ++k) if(line[k] == ' ') last_spa =
k;
if (last_spa == 0) last_spa = k; /* can't find a
space - just split anyway */

for(m=0; m <= last_spa; ++m) { /* extract fold
segment */
newline[m] = line[0];
for(n=0; n < len; ++n) line[n] = line[n+1]; /*
shorten original line */
}
newline[m] = '\0'; /* terminate the new
line */
puts(newline); /* and output it */

len -= m; /* adjust remaining
line length */
}else{
puts(line); /* short (last)
segment out */
len = 0; /* force more input
*/
}
}
}
return 0;
}

int getline(void)
{
int c, i;
for(i = 0; i<MAXLINE-1 && (c=getchar())!= EOF && c != '\n'; ++i)
line[i] = c;
if(c == '\n') {
line[i] = '\0';
++i;
}
line[i] = '\0';
return i;
}
Sorry about the long lines ;-(

Aug 22 '08 #11
Pilcrow <pi*****@pp.infowrites:
On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
<rj*@see.sig.invalidwrote:
>>Andrew C. said:
>>On 2008-08-19, Richard Heathfield <rj*@see.sig.invalidwrote:

I hesitate to add this, but here goes:
I have only time to comment on a fragment anyway.
I worked this exercise before looking at this thread and came up with a
quite different solution, which I include below. Would some of you
please comment on what I have done, especially in contrast to Andrew's
approach. I also am new to C, having previous experience with perl.
Here is my soloution:

#include <stdio.h>
#define MAXLINE 1000 /* input line max */
#define LINEMAX 40 /* fold point max */

char line[MAXLINE];
char newline[MAXLINE];
I don't like solutions that have a second limit like this input line
length. What happens when the whole 2Mb file has had the newlines
filtered out by a rogue FTP program?
int getline(void);

int main(void)
{
int k, len, m, n;
int last_spa; /* index of
rightmost space before fold point max */

while((len = getline()) 0) { /* len is length of
input line or segment still unfolded */
while(len){
if(len LINEMAX) { /* folding needed */
last_spa = 0;
/* find a place to
fold */
for(k=0; k < LINEMAX; ++k) if(line[k] == ' ') last_spa =
k;
if (last_spa == 0) last_spa = k; /* can't find a
space - just split anyway */

for(m=0; m <= last_spa; ++m) { /* extract fold
segment */
newline[m] = line[0];
for(n=0; n < len; ++n) line[n] = line[n+1]; /*
shorten original line */
}
newline[m] = '\0'; /* terminate the new
line */
puts(newline); /* and output it */

len -= m; /* adjust remaining
line length */
}else{
puts(line); /* short (last)
segment out */
len = 0; /* force more input
*/
}
}
}
return 0;
}
Forgive me, but I did not even read this. The line wrap makes it very
"noisy" and I'd have to put in an editor and remove, by hand, the added
newlines. Please enjoy the irony that this is a line wrapping algorithm!
int getline(void)
If you had put: int getline(char *line, int maxline) this would be
usable in other programs (you'd need to change MAXLINE to maxline).
{
int c, i;
for(i = 0; i<MAXLINE-1 && (c=getchar())!= EOF && c != '\n'; ++i)
line[i] = c;
if(c == '\n') {
line[i] = '\0';
++i;
}
What does this if do? I think you just want to set line[i] = 0
regardless of what c has in it.
line[i] = '\0';
return i;
}
Sorry about the long lines ;-(
--
Ben.
Aug 22 '08 #12
Pilcrow said:

<snip>
I worked this exercise before looking at this thread and came up with a
quite different solution, which I include below. Would some of you
please comment on what I have done, especially in contrast to Andrew's
approach.
Sure.

Firstly, I should say that I've read Ben's reply. I'm not as concerned as
he is about the input line length issue - we are, after all, still in
Chapter 1.

I can understand why the code layout put Ben off from reading the code, as
it nearly put me off too. But I decided to run it through indent and see
if that made it any easier. It did (mostly because it unwrapped the long
lines). Well, a bit, anyway.

I'm not overly excited by your name choices. MAXLINE and LINEMAX are so
similar that I would not want to have to remember which was which in a
hurry. Your comments do better justice to the concepts than the names do.
In fact, why not use those descriptions instead? INPUT_LINE_MAX and
FOLD_POINT_MAX ? (Or something like that, anyway. Something that clearly
distinguishes between the two.)

Although line is okay, I wouldn't have used newline - since C programmers
tend to read newline as meaning '\n' rather than as an array of 1000
characters.

Next: k, m, and n aren't exactly flowing over with descriptive power, are
they?

As for last_spa, it makes me think of Cheltenham. Would it be too much to
ask you to make it last_space?

I've now spent four paragraphs slagging off your name choices, and it
probably sounds like I'm determined to find fault. Sorry about that. But
name choices *matter*. Not because they make *this* program hard to
understand, but because choosing good names is a skill that pays dividends
in programs large and small, a skill that should be cultivated and
practised at every opportunity.

On to your algorithm.

Given that you have the line length, why do you start looking for the last
space *from the front of the line*? Why not start at the back?

The nesting of m and n is a bit of a mystery. Why not just loop *once* to
copy all relevant characters from line to newline, and /then/ loop *once*
to close up the gap in line? This would improve your line processing from
O(n*n) to O(n).

That's about all I can think of to say about it, I'm afraid - except to
wonder out loud whether you realise that you are double-terminating the
vast majority of your input strings? (See getline.) No big deal - but I
just thought I'd mention it. Mostly to prove that I'd bothered to read it,
actually. :-)

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Aug 23 '08 #13
On Fri, 22 Aug 2008 23:50:00 +0100, Ben Bacarisse <be********@bsb.me.uk>
wrote:
>Pilcrow <pi*****@pp.infowrites:
>On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
<rj*@see.sig.invalidwrote:
>>>Andrew C. said:

On 2008-08-19, Richard Heathfield <rj*@see.sig.invalidwrote:
>

I hesitate to add this, but here goes:

I have only time to comment on a fragment anyway.
>#include <stdio.h>
#define MAXLINE 1000 /* input line max */
#define LINEMAX 40 /* fold point max */

char line[MAXLINE];
char newline[MAXLINE];

I don't like solutions that have a second limit like this input line
length. What happens when the whole 2Mb file has had the newlines
filtered out by a rogue FTP program?
Won't matter, getline will never deliver more than 999 chars
>
Forgive me, but I did not even read this. The line wrap makes it very
"noisy" and I'd have to put in an editor and remove, by hand, the added
newlines. Please enjoy the irony that this is a line wrapping algorithm!
I definitely agree about the messy line wraps. Here it is with shorter
lines:

#include <stdio.h>
#define MAXLINE 1000 /* input line max */
#define LINEMAX 40 /* fold point max */

char line[MAXLINE];
char newline[MAXLINE];

int getline(void);

int main(void)
{
int k, m, n;
int len; /* len is len of input line or segment still unfolded */
int last_spa; /* index of rightmost space before fold point */

while((len = getline()) 0) {fprintf(stderr,"%d\n",len);
while(len){
if(len LINEMAX) { /* folding needed */
last_spa = 0;
/* find a place to fold */
for(k=0; k < LINEMAX; ++k)
if(line[k] == ' ') last_spa = k;
/* can't find a space - just split anyway */
if (last_spa == 0) last_spa = k-1;

for(m=0; m <= last_spa; ++m) {/* extract segment */
newline[m] = line[0]; /* shorten line */
for(n=0; n < len; ++n) line[n] = line[n+1];
}
newline[m] = '\0'; /* terminate the new line */
puts(newline); /* and output it */

len -= m; /* adjust remaining line length */
}else{
puts(line); /* short (last) segment out */
len = 0; /* force more input */
}
}
}
return 0;
}

int getline(void)
{
int c, i;
for(i = 0; i<MAXLINE-1 && (c=getchar())!= EOF && c != '\n'; ++i)
line[i] = c;
line[i] = '\0'; /* add a null */
return i;
}

>int getline(void)

If you had put: int getline(char *line, int maxline) this would be
usable in other programs (you'd need to change MAXLINE to maxline).
good point
>
>{
int c, i;
for(i = 0; i<MAXLINE-1 && (c=getchar())!= EOF && c != '\n'; ++i)
line[i] = c;
if(c == '\n') {
line[i] = '\0';
++i;
}

What does this if do? I think you just want to set line[i] = 0
regardless of what c has in it.
right you are. fixed. and thanks for the comments
Aug 23 '08 #14
Pilcrow <pi*****@pp.infowrites:
On Fri, 22 Aug 2008 23:50:00 +0100, Ben Bacarisse <be********@bsb.me.uk>
wrote:
<snip>
>>I don't like solutions that have a second limit like this input line
length. What happens when the whole 2Mb file has had the newlines
filtered out by a rogue FTP program?

Won't matter, getline will never deliver more than 999 chars
Yes, I knew is was safe but it is likely (I admit I did not check)
that such a solution can't be used to fix the problem described
because words may end up across line buffer boundaries. I have now
checked and that is indeed a problem.
>>Forgive me, but I did not even read this. The line wrap makes it very
"noisy" and I'd have to put in an editor and remove, by hand, the added
newlines. Please enjoy the irony that this is a line wrapping algorithm!

I definitely agree about the messy line wraps. Here it is with shorter
lines:

#include <stdio.h>
#define MAXLINE 1000 /* input line max */
#define LINEMAX 40 /* fold point max */

char line[MAXLINE];
char newline[MAXLINE];

int getline(void);

int main(void)
{
int k, m, n;
int len; /* len is len of input line or segment still unfolded */
int last_spa; /* index of rightmost space before fold point */

while((len = getline()) 0) {fprintf(stderr,"%d\n",len);
while(len){
if(len LINEMAX) { /* folding needed */
last_spa = 0;
/* find a place to fold */
for(k=0; k < LINEMAX; ++k)
if(line[k] == ' ') last_spa = k;
/* can't find a space - just split anyway */
if (last_spa == 0) last_spa = k-1;
Is this really what you mean? The usual pattern would be to set
last_spa to LINEMAX-1 (the value k-1 if the loop ends without setting
it) rather the zero (to which it can be set by the loop). I am not
saying this is wrong, just messy.
>
for(m=0; m <= last_spa; ++m) {/* extract segment */
newline[m] = line[0]; /* shorten line */
for(n=0; n < len; ++n) line[n] = line[n+1];
}
newline[m] = '\0'; /* terminate the new line */
puts(newline); /* and output it */

len -= m; /* adjust remaining line length */
}else{
puts(line); /* short (last) segment out */
len = 0; /* force more input */
}
}
}
return 0;
}
--
Ben.
Aug 23 '08 #15
On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
<rj*@see.sig.invalidwrote:
On line 50, '\n' is actually of type int, would you believe? But we know
its value must fit in a char (for several reasons, one of which is that it
is a mandatory member of the source character set, so it must have a value
in the range 1 to CHAR_MAX), so it's okay to pass it to ... char ....
Nit: of the basic execution charset. Which is what matters here,
6.2.5p3. 5.1.1.2 does talk about newlines, but 5.2.1p3 more
specifically (and thus I consider controllingly) says an
(impl-dependent) indicator 'treat[ed] as' a newline.

(This is really angels-on-a-pinhead: something that acts like a
newline, as a source char which cannot be accessed by the program, and
if it is converted to an execution char only as an extension becomes a
newline, but isn't formally a newline.)

- formerly david.thompson1 || achar(64) || worldnet.att.net
Sep 1 '08 #16
On Mon, 01 Sep 2008 06:53:21 GMT, David Thompson posted:
On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
<rj*@see.sig.invalidwrote:
>On line 50, '\n' is actually of type int, would you believe? But we know
its value must fit in a char (for several reasons, one of which is that it
is a mandatory member of the source character set, so it must have a value
in the range 1 to CHAR_MAX), so it's okay to pass it to ... char ....

Nit: of the basic execution charset. Which is what matters here,
6.2.5p3. 5.1.1.2 does talk about newlines, but 5.2.1p3 more
specifically (and thus I consider controllingly) says an
(impl-dependent) indicator 'treat[ed] as' a newline.

(This is really angels-on-a-pinhead: something that acts like a
newline, as a source char which cannot be accessed by the program, and
if it is converted to an execution char only as an extension becomes a
newline, but isn't formally a newline.)
I'd like to see an illustrative snippet.

--
We must respect the other fellow's religion, but only in the sense and to
the extent that we respect his theory that his wife is beautiful and his
children smart. 5
H. L. Mencken
Sep 3 '08 #17
Ron Ford wrote:
On Mon, 01 Sep 2008 06:53:21 GMT, David Thompson posted:
>On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
<rj*@see.sig.invalidwrote:
>>On line 50, '\n' is actually of type int, would you believe? But we know
its value must fit in a char (for several reasons, one of which is that it
is a mandatory member of the source character set, so it must have a value
in the range 1 to CHAR_MAX), so it's okay to pass it to ... char ....
Nit: of the basic execution charset. Which is what matters here,
6.2.5p3. 5.1.1.2 does talk about newlines, but 5.2.1p3 more
specifically (and thus I consider controllingly) says an
(impl-dependent) indicator 'treat[ed] as' a newline.
In n1256.pdf, the wording is:

"In source files, there shall be some way of indicating the end of each
line of text; this International Standard treats such an end-of-line
indicator as if it were a single new-line character."
>(This is really angels-on-a-pinhead: something that acts like a
newline, as a source char which cannot be accessed by the program, and
if it is converted to an execution char only as an extension becomes a
newline, but isn't formally a newline.)

I'd like to see an illustrative snippet.
The basic idea is that the C standard allows the same type of freedom of
representation for C source code files that it allows for files written
or read by C programs in text mode. What is treated as a single '\n'
character for purposes of parsing the C code might be represented by any
of the following in the source code file (among many other possibilities):

* '\n' followed by '\r'
* '\r' followed by '\n'
* '\r'
* text is stored in fixed-length blocks. When a line ends inside a
block, instead of writing a '\n' character into the block, the block is
padded to the end with '\0' characters.
* Same as previous, but the padding is with blanks.
* Same as previous, but each block contains a count of the number of
characters actually stored in the block - the contents of the rest of
the block are irrelevant. If the number of characters stored is less
less than the capacity of the block, that indicates the end of a line.

Most of the possibilities listed above are in actual current use.
Sep 3 '08 #18

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

9
by: Collin VanDyck | last post by:
I have a basic understanding of this, so forgive me if I am overly simplistic in my explanation of my problem.. I am trying to get a Java/Xalan transform to pass through a numeric character...
1
by: DrTebi | last post by:
Hello, I have the following problem: I used to "encode" my email address within links, in order to avoid (most) email spiders. So I had a link like this: <a...
0
by: Thomas Scheffler | last post by:
Hi, I runned in trouble using XALAN for XSL-Transformation. The following snipplet show what I mean: <a href="http://blah.com/?test=test&amp;test2=test2">Test1&amp;</a> <a...
4
by: Luklrc | last post by:
Hi, I'm having to create a querysting with javascript. My problem is that javscript turns the "&" characher into "&amp;" when it gets used as a querystring in the url EG: ...
4
by: johkar | last post by:
When the output method is set to xml, even though I have CDATA around my JavaScript, the operaters of && and < are converted to XML character entities which causes errors in my JavaScript. I know...
6
by: Rich Wallace | last post by:
All, I receive an xml doc with an element which contains a &amp; in the data. I am attempting to insert the data into a SQL Server database and I want to convert the &amp; to a "&" character before I...
8
by: Nathan Sokalski | last post by:
I add a JavaScript event handler to some of my Webcontrols using the Attributes.Add() method as follows: Dim jscode as String = "return (event.keyCode>=65&&event.keyCode<=90);"...
11
by: Jeremy | last post by:
How can one stop a browser from converting &amp; to & ? We have a textarea in our system wehre a user can type in some html code and have it saved to the database. When the data is retireved...
14
by: Arne | last post by:
A lot of Firefox users I know, says they have problems with validation where the ampersand sign has to be written as &amp; to be valid. I don't have Firefox my self and don't wont to install it only...
12
by: InvalidLastName | last post by:
We have been used XslTransform. .NET 1.1, for transform XML document, Dataset with xsl to HTML. Some of these html contents contain javascript and links. For example: // javascript if (a &gt; b)...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.