
August 19th, 2008, 08:45 AM
| | | K&R Ex 1-22
I'm trying to complete excercise 1-23 in the K&R book, which calls for
a program to wrap input lines at a given column, making sure to handle
lines that wrap twice or don't contain whitespace. This is what I have
so far*: http://pastebin.com/f61c6cfc
*Insert is a function which shifts the contents of the string right by
one character and inserts another character in its place.
I'm sure there's probably a completely better way to do this, but I'm
too far into it to rewrite it a different way.
Anyway, what I really want to know is
why the variable "search_state" always equals zero (at least according
to the printf statement I put in there and the fact that it never
wraps) even though the if statement before "search_state = 1" resolves
to be true. It's almost as though search_state was losing its value at
the end of the for loop. If it doesn't get set to 1, no wrapping will
happen.
I feel really stupid for having this much
trouble with an excercise so early in the book, but this has me
baffled.
-
You can get everything in life you want, if you will help enough other
people get what they want. | 
August 19th, 2008, 08:55 AM
| | | Re: K&R Ex 1-22
Andrew C. said: Quote:
I'm trying to complete excercise 1-23 in the K&R book, which calls for
a program to wrap input lines at a given column, making sure to handle
lines that wrap twice or don't contain whitespace. This is what I have
so far*: http://pastebin.com/f61c6cfc | I haven't much time right now, but I will have, later on. For me (or indeed
for anyone else) to look at your problem in sufficient detail to help you,
it would be handy if you could post the complete, compilable source code
that you are using, preferably right here on Usenet rather than on a Web
site. Also, it would be very useful to know what test data you're using.
My first glance (and it *is* only a quick glance) suggests that, when you
pass a string of fewer than 80 characters, your value for 'next' will be
0, which looks like it could be problematic. (That might not be the
problem - it's just a first impression.)
No doubt someone else will be able to help you more, but I really do
recommend that you post the full source here, together with your test
data.
HTH. HAND.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999 | 
August 19th, 2008, 10:35 AM
| | | Re: K&R Ex 1-22
On 2008-08-19, Richard Heathfield <rjh@see.sig.invalidwrote: Quote:
Andrew C. said:
> Quote:
>I'm trying to complete excercise 1-23 in the K&R book, which calls for
>a program to wrap input lines at a given column, making sure to handle
>lines that wrap twice or don't contain whitespace. This is what I have
>so far*:
>http://pastebin.com/f61c6cfc | >
I haven't much time right now, but I will have, later on. For me (or indeed
for anyone else) to look at your problem in sufficient detail to help you,
it would be handy if you could post the complete, compilable source code
that you are using, preferably right here on Usenet rather than on a Web
site. Also, it would be very useful to know what test data you're using.
>
My first glance (and it *is* only a quick glance) suggests that, when you
pass a string of fewer than 80 characters, your value for 'next' will be
0, which looks like it could be problematic. (That might not be the
problem - it's just a first impression.)
>
No doubt someone else will be able to help you more, but I really do
recommend that you post the full source here, together with your test
data.
>
HTH. HAND.
>
| You're right, I should have posted the full source code. I used
pastebin because I figured it would have been a bigger breach of
ettiquette to post a sizeable source file directly to usenet, long
lines and all, but here goes:
#include <stdio.h>
#include <string.h>
#define MAXCOLS 80 /*Column to wrap at*/
#define MAXLINE 1000 /*Maximum string length*/
#define NEXT(current) ((current/MAXCOLS+1)*MAXCOLS) /*macro
to calculate next*/
void wrap(char s[], int len);
void insert(char s[], int at, char c);
int getline(char s[], int lim);
main()
{
char line[MAXLINE];
int len;
int i;
for(i = 0; i<=80; ++i)
putchar(' ');
printf("|\n");
while((len = getline(line, MAXLINE)) 0) {
wrap(line,len);
printf("%s",line);
}
return 0;
}
/*wrap: word wrap a string at the desired length */
void wrap(char s[], int len)
{
int search_state = 0; /*0 = looking for next, 1 = looking for
whitespace*/
int i = len-1;
int next = ((i/MAXCOLS))*MAXCOLS; /*next column to wrap at*/
for(i = len-1; i MAXCOLS-1; --i) {
next = ((i/MAXCOLS))*MAXCOLS;
printf(" %d ",search_state); /*debug*/
if(search_state == 1) {
if(s[i] == ' ') { /*all is well*/
s[i] = '\n'; /*place a newline in the
whitespace*/
search_state = 0; /*reset the state*/
}
else if(i == next && s[i] != ' ') { /*there
wasn't any whitespace*/
insert(s,i,'\n'); /*insert a newline
at the wrap point*/
search_state = 0; /*reset the state*/
}
}
else
if(next >= MAXCOLS && i == next) /*if there is
a next, and i equals it, start wrapping*/
search_state = 1;
}
}
/*insert: insert a character into the middle of a character string*/
void insert(char s[], int at, char c)
{
int i;
for(i = strlen(s)-1; i>=at; --i)
s[i+1] = s[i];
s[i] = c;
}
/* getline: read a line into s, return length */
int getline(char s[], int lim)
{
int c, i;
for (i = 0; i<lim-1 && (c=getchar()) != EOF && c!='\n'; ++i)
s[i] = c;
if (c == '\n') {
s[i] = c;
++i;
}
if (i >= lim-1) {
while((c = getchar()) != EOF && c!='\n')
++i; /*increment length anyway in case of overflow*/
s[lim-1] = '\0';
}
else
s[i] = '\0';
return i;
}
By the way, with any input line less than 80 characters, absolutely
nothing should happen to it, which is indeed the correct behavior
because no word wrapping is needed.
As for test data, any input (I either mashed the keyboard or typed in
song lyrics) does the exact same thing: spit the input out unchanged.
Thanks for your help, and sorry for being a noob.
--
You can get everything in life you want, if you will help enough other
people get what they want. | 
August 19th, 2008, 03:35 PM
| | | Re: K&R Ex 1-22
Andrew C. said: Quote:
On 2008-08-19, Richard Heathfield <rjh@see.sig.invalidwrote: Quote:
>>
>No doubt someone else will be able to help you more, but I really do
>recommend that you post the full source here, together with your test
>data.
>>
| You're right, I should have posted the full source code. I used
pastebin because I figured it would have been a bigger breach of
ettiquette to post a sizeable source file directly to usenet, long
lines and all,
| Your concern for etiquette makes a welcome change! Nevertheless, a hundred
lines or so is no big deal in Usenet nowadays, at least not for most
people, and we can hope that the first couple of articles in this thread
will warn modem-users off. :-)
Okay, my first step was to compile it:
gcc -W -Wall -ansi -pedantic -Wformat-nonliteral -Wcast-align
-Wpointer-arith -Wbad-function-cast -Wmissing-prototypes
-Wstrict-prototypes -Wmissing-declarations -Winline -Wundef
-Wnested-externs -Wcast-qual -Wshadow -Wconversion -Wwrite-strings
-ffloat-store -fno-builtin -O2 -g -pg -c -o kr123.o kr123.c
kr123.c:14: warning: return-type defaults to `int'
kr123.c:14: warning: function declaration isn't a prototype
kr123.c: In function `wrap':
kr123.c:50: warning: passing arg 3 of `insert' with different width due to
prototype
kr123.c: In function `getline':
kr123.c:75: warning: `c' might be used uninitialized in this function
So let's go fix those things.
On line 14, I changed:
main()
to:
int main(void)
On line 50, '\n' is actually of type int, would you believe? But we know
its value must fit in a char (for several reasons, one of which is that it
is a mandatory member of the source character set, so it must have a value
in the range 1 to CHAR_MAX), so it's okay to pass it to a function
expecting a char. No fix required (although changing insert() to take int
rather than char would at least silence the warning).
By far the most serious warning is the one about 'c' possibly being used
uninitialised. But as far as I can see, this is actually impossible, given
that there is only one call to getline, and it specifies a large-ish value
for lim, so i<lim-1 must be true first time round, so the getchar is
definitely called.
So much for the compiler. Let's look at the behaviour.
To make matters a little less long-winded, I changed MAXCOLS to 12 and
recompiled. I gave your program an input of abcdefghijklmnop, and it
printed a bunch of spaces and a vertical bar, as expected. (I don't know
why you do that, but hey, it's your program!) It then spat out what look
like some debugging numbers, followed by abcdefghijklmnop again, with no
wrap.
Then I gave it lots and lots of junk, and (remembering that we changed
MAXCOLS to 12) look what happened!
fkjslvnfdlskjvnfsdkjlvnfdskjvnfsdkjvnfksldjnvjksld fnvjdskfvnfdjsvnjkdfls
jfks vjksfdvjkfsdhvjksdfvhjksdfv
0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0
0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 fkjslvnfdls
jjvnfsdkjlvnfdskjvnfsdkj
nnfksldjnvjksldfnvjdskfv
ffdjsvnjkdfls jfks vjksf
vvjkfsdhvjksdfvhjksdfv
#1 Wed Mar 27 13:57:05 UTC 2002
WOW! I have no idea where that timestamp came from... :-)
So it looks like you *are* getting wrap, but not how and when you want it.
It also looks like you have an unterminated "string". (That's the usual
cause of utterly off-the-wall output.)
Time to read the code.
The first few lines in main just print out a bunch of spaces and a |, for
no apparent reason - although no doubt you know why. Nothing illegal
there, so let's move on.
You have a nice simple loop, consisting of getline, wrap, and printf. The
printf prints 'line', so we have a good idea of which "string" isn't
getting terminated. I checked to see whether it is constructed properly,
by looking at getline, and getline seems fine.
So the problem of failure to terminate is either in wrap() or in insert(),
and my money is on insert(), so let's take a look:
void insert(char s[], int at, char c)
{
int i;
for(i = strlen(s)-1; i>=at; --i)
s[i+1] = s[i];
s[i] = c;
}
Consider the following string: "ABCDEF\n\0" (I've made the \0 explicitly
visible, but it isn't part of the count given by strlen.)
i = strlen(s) - 1; Since strlen(s) is 7 (A-F is 6, + \n), i takes the value
6.
s[i+1] = s[i];
so we're doing s[7] = s[6]. Now, s[7] /was/ the null terminator, but we
just overwrote that. Not good. Let's fix it.
Whilst strlen yields the length of a string (not including the null), we
can think of it in another way, i.e. as returning the offset of the null
terminator.
void insert(char s[], int at, char c)
{
size_t nullpos = strlen(s);
size_t pos = nullpos;
while(pos 0 && pos at)
{
s[pos] = s[pos - 1];
--pos;
}
s[pos] = c;
s[nullpos + 1] = '\0';
}
That's fixed our null termination problem. Now for the logic bug, which is
clearly in the wrap() function. Experimentation quickly revealed that your
code *is* wrapping, albeit rather bizarrely - for example, the first wrap
seems to occur at 2*MAXCOLS, rather than at MAXCOLS.
It appears that what you're trying to do is replace a space with a newline
if possible, but if necessary you'll shift a long string's tail over by
one character to insert a newline. A perfectly reasonable approach,
although of course you have to be careful with lines close to your
buffer's size limit.
It strikes me that it would be much easier to do this as follows:
void wrap(char s[], size_t len)
{
char *t = s;
while(len >= MAXCOLS)
{
t += MAXCOLS;
while(t s && *t != ' ')
{
--t;
}
if(*t == ' ')
{
*t = '\n';
}
else
{
t += MAXCOLS;
insert(t, 0, '\n');
}
len -= (t - s);
s = t;
}
}
Dropping that - and the insert() update - into your program (and making
appropriate type-changes from int to size_t, which is a more appropriate
type for describing buffer sizes), for input like this:
That's fixed our null termination problem. Now for the logic bug, which is
clearly in the wrap() function. Experimentation quickly revealed that your
code *is* wrapping, albeit rather bizarrely - for example, the first wrap
seems to occur at 2*MAXCOLS, rather than at MAXCOLS.
I get output like this:
That's fixed
our null
termination
problem.
Now for the
logic bug,
which is
clearly in
the wrap()
function.
Experimenta
tion
quickly
revealed
that your
code *is*
wrapping,
albeit
rather
bizarrely -
for
example,
the first
wrap seems
to occur at
2*MAXCOLS,
rather than
at
MAXCOLS.
and that seems to be working nicely.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999 | 
August 19th, 2008, 09:23 PM
| | | Re: K&R Ex 1-22
Thanks much for your help. First, let me go ahead and explain myself
on that one segment of code involving spaces and a pipe - that's just
a ruler I was printing to see if I had typed enough input to warrant
wrapping. I noticed it was hardcoded to use 80 columns so I fixed it
to use MAXCOLS instead. I guess I should have commented that.
Thank you for pointing out the problem with insert(), i probably
wouldn't have noticed it otherwise.
Your solution for wrap is amazingly more elegant than mine:
On 2008-08-19, Richard Heathfield <rjh@see.sig.invalidwrote: Quote:
It strikes me that it would be much easier to do this as follows:
>
void wrap(char s[], size_t len)
{
char *t = s;
while(len >= MAXCOLS)
{
t += MAXCOLS;
while(t s && *t != ' ')
{
--t;
}
if(*t == ' ')
{
*t = '\n';
}
else
{
t += MAXCOLS;
insert(t, 0, '\n');
}
len -= (t - s);
s = t;
}
}
| However, I'm having trouble understanding it because I can see it uses
pointers, and having not reached that chapter in the book I don't really
understand what pointers do. I'm sure this section can be written
without pointers though, so I'll just set out to rewrite the function
using the logic I think I'm seeing above.
--
You can get everything in life you want, if you will help enough other
people get what they want. | 
August 20th, 2008, 02:35 AM
| | | Re: K&R Ex 1-22
On Tue, 19 Aug 2008 07:57:04 +0000, Richard Heathfield posted: Quote: >
I haven't much time right now, but I will have, later on. For me (or indeed
for anyone else) to look at your problem in sufficient detail to help you,
it would be handy if you could post the complete, compilable source code
that you are using, preferably right here on Usenet rather than on a Web
site. Also, it would be very useful to know what test data you're using.
| Richard errs that everyone else prefers source posting in clc as opposed to
a link. Are you solving 22 or 23?
--
When a new source of taxation is found it never means, in practice, that
the old source is abandoned. It merely means that the politicians have two
ways of milking the taxpayer where they had one before. 8
H. L. Mencken | 
August 20th, 2008, 02:45 AM
| | | Re: K&R Ex 1-22
On 2008-08-20, Ron Ford <ron@example.invalidwrote: Quote:
On Tue, 19 Aug 2008 07:57:04 +0000, Richard Heathfield posted:
> Quote: >>
>I haven't much time right now, but I will have, later on. For me (or indeed
>for anyone else) to look at your problem in sufficient detail to help you,
>it would be handy if you could post the complete, compilable source code
>that you are using, preferably right here on Usenet rather than on a Web
>site. Also, it would be very useful to know what test data you're using.
| >
Richard errs that everyone else prefers source posting in clc as opposed to
a link. Are you solving 22 or 23?
| I'm solving 22. I already finished 23, it was actually quite easy.
--
Let us never negotiate out of fear, but let us never fear to negotiate.
-- John F. Kennedy | 
August 20th, 2008, 05:25 AM
| | | Re: K&R Ex 1-22
Andrew C. said:
<snip> Quote:
However, I'm having trouble understanding it because I can see it uses
pointers,
| Oops! Sorry about that.
void wrap(char s[], size_t len)
{
size_t basepos = 0;
size_t curpos = 0;
while(len >= MAXCOLS)
{
curpos += MAXCOLS;
while(curpos basepos && s[curpos] != ' ')
{
--curpos;
}
if(s[curpos] == ' ')
{
s[curpos] = '\n';
}
else
{
curpos += MAXCOLS;
insert(s, curpos, '\n');
}
len -= (curpos - basepos);
basepos = curpos;
}
}
I haven't actually tested this, but I'm reasonably sure I got the
translation right.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999 | 
August 20th, 2008, 06:05 AM
| | | Re: K&R Ex 1-22
On 2008-08-20, Richard Heathfield <rjh@see.sig.invalidwrote: Quote:
Andrew C. said:
>
><snip>
> Quote:
>However, I'm having trouble understanding it because I can see it uses
>pointers,
| >
Oops! Sorry about that.
>
void wrap(char s[], size_t len)
{
size_t basepos = 0;
size_t curpos = 0;
>
while(len >= MAXCOLS)
{
curpos += MAXCOLS;
while(curpos basepos && s[curpos] != ' ')
{
--curpos;
}
if(s[curpos] == ' ')
{
s[curpos] = '\n';
}
else
{
curpos += MAXCOLS;
insert(s, curpos, '\n');
}
len -= (curpos - basepos);
basepos = curpos;
}
}
>
I haven't actually tested this, but I'm reasonably sure I got the
translation right.
| Thanks for the clarification, it works like a charm. And I actually
understand this one. I only wish I could have thought of that myself.
--
Let us never negotiate out of fear, but let us never fear to negotiate.
-- John F. Kennedy | 
August 20th, 2008, 06:15 AM
| | | Re: K&R Ex 1-22
Andrew C. said: Quote: |
On 2008-08-20, Richard Heathfield <rjh@see.sig.invalidwrote:
| <snip> Quote: Quote:
>>
>I haven't actually tested this, but I'm reasonably sure I got the
>translation right.
| Thanks for the clarification, it works like a charm. And I actually
understand this one. I only wish I could have thought of that myself.
| In your original article in the thread, you said:
"I'm sure there's probably a completely better way to do this, but I'm
too far into it to rewrite it a different way."
Well, sometimes we get so far into things, and into such a tangle, that a
rewrite is the only sensible option. I spent about three minutes looking
at your function, and I still didn't understand how you intended it to
work. At that point, I thought "well, how hard can it be?" and spent less
time writing the replacement than I had spent trying to understand the
original!
When you find yourself in a tangle of loops and ifs, it's sometimes better
to stop, step back, and think about something else for a while. Then have
another go at writing the code from scratch - preferably in a different
source file, leaving the original well alone. You may well find that your
second attempt is rather more elegant than your first.
In fact, once you've been through all the Chapter 1 exercises and perhaps a
few from Chapters 2-4, you might want to start at Ex 1-8 again, WITHOUT
revisiting your earlier solutions. Work through to 1-24. Then go back and
compare your new solutions with your old ones. I think you'll notice a
dramatic improvement.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999 | 
August 22nd, 2008, 10:35 PM
| | | Re: K&R Ex 1-22
On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
<rjh@see.sig.invalidwrote: Quote:
>Andrew C. said:
> Quote: |
>On 2008-08-19, Richard Heathfield <rjh@see.sig.invalidwrote: | | I hesitate to add this, but here goes:
I worked this exercise before looking at this thread and came up with a
quite different solution, which I include below. Would some of you
please comment on what I have done, especially in contrast to Andrew's
approach. I also am new to C, having previous experience with perl.
Here is my soloution:
#include <stdio.h>
#define MAXLINE 1000 /* input line max */
#define LINEMAX 40 /* fold point max */
char line[MAXLINE];
char newline[MAXLINE];
int getline(void);
int main(void)
{
int k, len, m, n;
int last_spa; /* index of
rightmost space before fold point max */
while((len = getline()) 0) { /* len is length of
input line or segment still unfolded */
while(len){
if(len LINEMAX) { /* folding needed */
last_spa = 0;
/* find a place to
fold */
for(k=0; k < LINEMAX; ++k) if(line[k] == ' ') last_spa =
k;
if (last_spa == 0) last_spa = k; /* can't find a
space - just split anyway */
for(m=0; m <= last_spa; ++m) { /* extract fold
segment */
newline[m] = line[0];
for(n=0; n < len; ++n) line[n] = line[n+1]; /*
shorten original line */
}
newline[m] = '\0'; /* terminate the new
line */
puts(newline); /* and output it */
len -= m; /* adjust remaining
line length */
}else{
puts(line); /* short (last)
segment out */
len = 0; /* force more input
*/
}
}
}
return 0;
}
int getline(void)
{
int c, i;
for(i = 0; i<MAXLINE-1 && (c=getchar())!= EOF && c != '\n'; ++i)
line[i] = c;
if(c == '\n') {
line[i] = '\0';
++i;
}
line[i] = '\0';
return i;
}
Sorry about the long lines ;-( | 
August 22nd, 2008, 11:55 PM
| | | Re: K&R Ex 1-22
Pilcrow <pilcrow@pp.infowrites: Quote:
On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
<rjh@see.sig.invalidwrote:
> Quote:
>>Andrew C. said:
>> Quote:
>>On 2008-08-19, Richard Heathfield <rjh@see.sig.invalidwrote:
>>>>
| | >
I hesitate to add this, but here goes:
| I have only time to comment on a fragment anyway. Quote:
I worked this exercise before looking at this thread and came up with a
quite different solution, which I include below. Would some of you
please comment on what I have done, especially in contrast to Andrew's
approach. I also am new to C, having previous experience with perl.
Here is my soloution:
>
#include <stdio.h>
#define MAXLINE 1000 /* input line max */
#define LINEMAX 40 /* fold point max */
>
char line[MAXLINE];
char newline[MAXLINE];
| I don't like solutions that have a second limit like this input line
length. What happens when the whole 2Mb file has had the newlines
filtered out by a rogue FTP program? Quote:
int getline(void);
>
int main(void)
{
int k, len, m, n;
int last_spa; /* index of
rightmost space before fold point max */
>
while((len = getline()) 0) { /* len is length of
input line or segment still unfolded */
while(len){
if(len LINEMAX) { /* folding needed */
last_spa = 0;
/* find a place to
fold */
for(k=0; k < LINEMAX; ++k) if(line[k] == ' ') last_spa =
k;
if (last_spa == 0) last_spa = k; /* can't find a
space - just split anyway */
>
for(m=0; m <= last_spa; ++m) { /* extract fold
segment */
newline[m] = line[0];
for(n=0; n < len; ++n) line[n] = line[n+1]; /*
shorten original line */
}
newline[m] = '\0'; /* terminate the new
line */
puts(newline); /* and output it */
>
len -= m; /* adjust remaining
line length */
}else{
puts(line); /* short (last)
segment out */
len = 0; /* force more input
*/
}
}
}
return 0;
}
| Forgive me, but I did not even read this. The line wrap makes it very
"noisy" and I'd have to put in an editor and remove, by hand, the added
newlines. Please enjoy the irony that this is a line wrapping algorithm! If you had put: int getline(char *line, int maxline) this would be
usable in other programs (you'd need to change MAXLINE to maxline). Quote:
{
int c, i;
for(i = 0; i<MAXLINE-1 && (c=getchar())!= EOF && c != '\n'; ++i)
line[i] = c;
if(c == '\n') {
line[i] = '\0';
++i;
}
| What does this if do? I think you just want to set line[i] = 0
regardless of what c has in it. Quote:
line[i] = '\0';
return i;
}
>
>
Sorry about the long lines ;-(
>
| --
Ben. | 
August 23rd, 2008, 06:55 AM
| | | Re: K&R Ex 1-22
Pilcrow said:
<snip> Quote:
I worked this exercise before looking at this thread and came up with a
quite different solution, which I include below. Would some of you
please comment on what I have done, especially in contrast to Andrew's
approach.
| Sure.
Firstly, I should say that I've read Ben's reply. I'm not as concerned as
he is about the input line length issue - we are, after all, still in
Chapter 1.
I can understand why the code layout put Ben off from reading the code, as
it nearly put me off too. But I decided to run it through indent and see
if that made it any easier. It did (mostly because it unwrapped the long
lines). Well, a bit, anyway.
I'm not overly excited by your name choices. MAXLINE and LINEMAX are so
similar that I would not want to have to remember which was which in a
hurry. Your comments do better justice to the concepts than the names do.
In fact, why not use those descriptions instead? INPUT_LINE_MAX and
FOLD_POINT_MAX ? (Or something like that, anyway. Something that clearly
distinguishes between the two.)
Although line is okay, I wouldn't have used newline - since C programmers
tend to read newline as meaning '\n' rather than as an array of 1000
characters.
Next: k, m, and n aren't exactly flowing over with descriptive power, are
they?
As for last_spa, it makes me think of Cheltenham. Would it be too much to
ask you to make it last_space?
I've now spent four paragraphs slagging off your name choices, and it
probably sounds like I'm determined to find fault. Sorry about that. But
name choices *matter*. Not because they make *this* program hard to
understand, but because choosing good names is a skill that pays dividends
in programs large and small, a skill that should be cultivated and
practised at every opportunity.
On to your algorithm.
Given that you have the line length, why do you start looking for the last
space *from the front of the line*? Why not start at the back?
The nesting of m and n is a bit of a mystery. Why not just loop *once* to
copy all relevant characters from line to newline, and /then/ loop *once*
to close up the gap in line? This would improve your line processing from
O(n*n) to O(n).
That's about all I can think of to say about it, I'm afraid - except to
wonder out loud whether you realise that you are double-terminating the
vast majority of your input strings? (See getline.) No big deal - but I
just thought I'd mention it. Mostly to prove that I'd bothered to read it,
actually. :-)
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999 | 
August 23rd, 2008, 08:05 AM
| | | Re: K&R Ex 1-22
On Fri, 22 Aug 2008 23:50:00 +0100, Ben Bacarisse <ben.usenet@bsb.me.uk>
wrote: Quote:
>Pilcrow <pilcrow@pp.infowrites:
> Quote:
>On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
><rjh@see.sig.invalidwrote:
>> Quote:
>>>Andrew C. said:
>>>
>>>On 2008-08-19, Richard Heathfield <rjh@see.sig.invalidwrote:
>>>>>
| >>
>I hesitate to add this, but here goes:
| >
>I have only time to comment on a fragment anyway.
>
| Quote: Quote:
>#include <stdio.h>
>#define MAXLINE 1000 /* input line max */
>#define LINEMAX 40 /* fold point max */
>>
>char line[MAXLINE];
>char newline[MAXLINE];
| >
>I don't like solutions that have a second limit like this input line
>length. What happens when the whole 2Mb file has had the newlines
>filtered out by a rogue FTP program?
| Won't matter, getline will never deliver more than 999 chars Quote:
>
>Forgive me, but I did not even read this. The line wrap makes it very
>"noisy" and I'd have to put in an editor and remove, by hand, the added
>newlines. Please enjoy the irony that this is a line wrapping algorithm!
>
| I definitely agree about the messy line wraps. Here it is with shorter
lines:
#include <stdio.h>
#define MAXLINE 1000 /* input line max */
#define LINEMAX 40 /* fold point max */
char line[MAXLINE];
char newline[MAXLINE];
int getline(void);
int main(void)
{
int k, m, n;
int len; /* len is len of input line or segment still unfolded */
int last_spa; /* index of rightmost space before fold point */
while((len = getline()) 0) {fprintf(stderr,"%d\n",len);
while(len){
if(len LINEMAX) { /* folding needed */
last_spa = 0;
/* find a place to fold */
for(k=0; k < LINEMAX; ++k)
if(line[k] == ' ') last_spa = k;
/* can't find a space - just split anyway */
if (last_spa == 0) last_spa = k-1;
for(m=0; m <= last_spa; ++m) {/* extract segment */
newline[m] = line[0]; /* shorten line */
for(n=0; n < len; ++n) line[n] = line[n+1];
}
newline[m] = '\0'; /* terminate the new line */
puts(newline); /* and output it */
len -= m; /* adjust remaining line length */
}else{
puts(line); /* short (last) segment out */
len = 0; /* force more input */
}
}
}
return 0;
}
int getline(void)
{
int c, i;
for(i = 0; i<MAXLINE-1 && (c=getchar())!= EOF && c != '\n'; ++i)
line[i] = c;
line[i] = '\0'; /* add a null */
return i;
} Quote: >
>If you had put: int getline(char *line, int maxline) this would be
>usable in other programs (you'd need to change MAXLINE to maxline).
| good point Quote:
> Quote:
>{
> int c, i;
> for(i = 0; i<MAXLINE-1 && (c=getchar())!= EOF && c != '\n'; ++i)
> line[i] = c;
> if(c == '\n') {
> line[i] = '\0';
> ++i;
> }
| >
>What does this if do? I think you just want to set line[i] = 0
>regardless of what c has in it.
| right you are. fixed. and thanks for the comments | 
August 23rd, 2008, 03:45 PM
| | | Re: K&R Ex 1-22
Pilcrow <pilcrow@pp.infowrites: Quote:
On Fri, 22 Aug 2008 23:50:00 +0100, Ben Bacarisse <ben.usenet@bsb.me.uk>
wrote:
| <snip> Quote: Quote:
>>I don't like solutions that have a second limit like this input line
>>length. What happens when the whole 2Mb file has had the newlines
>>filtered out by a rogue FTP program?
| >
Won't matter, getline will never deliver more than 999 chars
| Yes, I knew is was safe but it is likely (I admit I did not check)
that such a solution can't be used to fix the problem described
because words may end up across line buffer boundaries. I have now
checked and that is indeed a problem. Quote: Quote:
>>Forgive me, but I did not even read this. The line wrap makes it very
>>"noisy" and I'd have to put in an editor and remove, by hand, the added
>>newlines. Please enjoy the irony that this is a line wrapping algorithm!
>>
| >
I definitely agree about the messy line wraps. Here it is with shorter
lines:
>
#include <stdio.h>
#define MAXLINE 1000 /* input line max */
#define LINEMAX 40 /* fold point max */
>
char line[MAXLINE];
char newline[MAXLINE];
>
int getline(void);
>
int main(void)
{
int k, m, n;
int len; /* len is len of input line or segment still unfolded */
int last_spa; /* index of rightmost space before fold point */
>
while((len = getline()) 0) {fprintf(stderr,"%d\n",len);
while(len){
if(len LINEMAX) { /* folding needed */
last_spa = 0;
/* find a place to fold */
for(k=0; k < LINEMAX; ++k)
if(line[k] == ' ') last_spa = k;
/* can't find a space - just split anyway */
if (last_spa == 0) last_spa = k-1;
| Is this really what you mean? The usual pattern would be to set
last_spa to LINEMAX-1 (the value k-1 if the loop ends without setting
it) rather the zero (to which it can be set by the loop). I am not
saying this is wrong, just messy. Quote:
>
for(m=0; m <= last_spa; ++m) {/* extract segment */
newline[m] = line[0]; /* shorten line */
for(n=0; n < len; ++n) line[n] = line[n+1];
}
newline[m] = '\0'; /* terminate the new line */
puts(newline); /* and output it */
>
len -= m; /* adjust remaining line length */
}else{
puts(line); /* short (last) segment out */
len = 0; /* force more input */
}
}
}
return 0;
}
| --
Ben. | 
September 1st, 2008, 07:55 AM
| | | Re: K&R Ex 1-22
On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
<rjh@see.sig.invalidwrote: Quote:
On line 50, '\n' is actually of type int, would you believe? But we know
its value must fit in a char (for several reasons, one of which is that it
is a mandatory member of the source character set, so it must have a value
in the range 1 to CHAR_MAX), so it's okay to pass it to ... char ....
| Nit: of the basic execution charset. Which is what matters here,
6.2.5p3. 5.1.1.2 does talk about newlines, but 5.2.1p3 more
specifically (and thus I consider controllingly) says an
(impl-dependent) indicator 'treat[ed] as' a newline.
(This is really angels-on-a-pinhead: something that acts like a
newline, as a source char which cannot be accessed by the program, and
if it is converted to an execution char only as an extension becomes a
newline, but isn't formally a newline.)
- formerly david.thompson1 || achar(64) || worldnet.att.net | 
September 3rd, 2008, 08:05 AM
| | | Re: K&R Ex 1-22
On Mon, 01 Sep 2008 06:53:21 GMT, David Thompson posted: Quote:
On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
<rjh@see.sig.invalidwrote:
> Quote:
>On line 50, '\n' is actually of type int, would you believe? But we know
>its value must fit in a char (for several reasons, one of which is that it
>is a mandatory member of the source character set, so it must have a value
>in the range 1 to CHAR_MAX), so it's okay to pass it to ... char ....
| >
Nit: of the basic execution charset. Which is what matters here,
6.2.5p3. 5.1.1.2 does talk about newlines, but 5.2.1p3 more
specifically (and thus I consider controllingly) says an
(impl-dependent) indicator 'treat[ed] as' a newline.
>
(This is really angels-on-a-pinhead: something that acts like a
newline, as a source char which cannot be accessed by the program, and
if it is converted to an execution char only as an extension becomes a
newline, but isn't formally a newline.)
| I'd like to see an illustrative snippet.
--
We must respect the other fellow's religion, but only in the sense and to
the extent that we respect his theory that his wife is beautiful and his
children smart. 5
H. L. Mencken | 
September 3rd, 2008, 12:45 PM
| | | Re: K&R Ex 1-22
Ron Ford wrote: Quote:
On Mon, 01 Sep 2008 06:53:21 GMT, David Thompson posted:
> Quote:
>On Tue, 19 Aug 2008 14:30:25 +0000, Richard Heathfield
><rjh@see.sig.invalidwrote:
>> Quote:
>>On line 50, '\n' is actually of type int, would you believe? But we know
>>its value must fit in a char (for several reasons, one of which is that it
>>is a mandatory member of the source character set, so it must have a value
>>in the range 1 to CHAR_MAX), so it's okay to pass it to ... char ....
| >Nit: of the basic execution charset. Which is what matters here,
>6.2.5p3. 5.1.1.2 does talk about newlines, but 5.2.1p3 more
>specifically (and thus I consider controllingly) says an
>(impl-dependent) indicator 'treat[ed] as' a newline.
| | In n1256.pdf, the wording is:
"In source files, there shall be some way of indicating the end of each
line of text; this International Standard treats such an end-of-line
indicator as if it were a single new-line character." Quote: Quote:
>(This is really angels-on-a-pinhead: something that acts like a
>newline, as a source char which cannot be accessed by the program, and
>if it is converted to an execution char only as an extension becomes a
>newline, but isn't formally a newline.)
| >
I'd like to see an illustrative snippet.
| The basic idea is that the C standard allows the same type of freedom of
representation for C source code files that it allows for files written
or read by C programs in text mode. What is treated as a single '\n'
character for purposes of parsing the C code might be represented by any
of the following in the source code file (among many other possibilities):
* '\n' followed by '\r'
* '\r' followed by '\n'
* '\r'
* text is stored in fixed-length blocks. When a line ends inside a
block, instead of writing a '\n' character into the block, the block is
padded to the end with '\0' characters.
* Same as previous, but the padding is with blanks.
* Same as previous, but each block contains a count of the number of
characters actually stored in the block - the contents of the rest of
the block are irrelevant. If the number of characters stored is less
less than the capacity of the block, that indicates the end of a line.
Most of the possibilities listed above are in actual current use. | | Thread Tools | Search this Thread | | | |
Posting Rules
| You may not post new threads You may not post replies You may not post attachments You may not edit your posts HTML code is Off | | | | | | What is Bytes?
We are a network of experts and professionals in IT and software development that help one another with answers to tough questions and share insights.
Get the best answers to your questions from over 205,338 network members.
|