Any way to take a word as input from stdin?

arnuld <su*****@invalid.address> wrote:

> C++ has std::string for taking a word as input from stdin. C takes input
> in 2 ways:
>
> 1) as a character, getchar()
> 2) as a whole line, fgets()
>
> as C programmer, are we supposed to create a get_word function everytime
> when we need a words as input from stdin ( e.g. terminal)

There is no generic solution (mainly because there is no consensus on
what a "word" is), so yes.

Richard

Sep 10 '08 #3

Malcolm McLean

"arnuld" <su*****@invalid.address> wrote in message

> as C programmer, are we supposed to create a get_word function everytime
> when we need a words as input from stdin ( e.g. terminal)

Generally there will be a regular expression parser available. It's not part
of the standard library, unfortunately, so the details may vary.
You can specify exactly what you mean by a 'word', and extract with that.
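
To make the regular-expression suggestion concrete, here is a minimal sketch, assuming a POSIX system that provides <regex.h> (which is not part of ISO C). The function name find_word and the pattern used in the note below are illustrative assumptions only.

```c
#include <regex.h>   /* POSIX, not ISO C */
#include <stddef.h>

/* Hypothetical helper, not a library function.
 * Finds the first match of the extended regular expression `pattern`
 * in `text`. On success returns 1 and stores the match offset and
 * length; returns 0 if nothing matches or the pattern is invalid. */
int find_word(const char *text, const char *pattern,
              size_t *start, size_t *len)
{
    regex_t re;
    regmatch_t m;
    int found;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;
    found = (regexec(&re, text, 1, &m, 0) == 0);
    if (found) {
        *start = (size_t)m.rm_so;
        *len = (size_t)(m.rm_eo - m.rm_so);
    }
    regfree(&re);
    return found;
}
```

With the pattern "[A-Za-z']+", the text "  don't stop" yields a match at offset 2 with length 5: the apostrophe is part of the word only because the pattern says so.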

--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm

Sep 10 '08 #4

Pilcrow

On Wed, 10 Sep 2008 15:08:40 +0500, arnuld <su*****@invalid.address>
wrote:

> I searched the c.l.c archives provided by Google as Google Groups with
> "word input" as the key words and did not come up with anything good.
> C++ has std::string for taking a word as input from stdin. C takes input
> in 2 ways:
>
> 1) as a character, getchar()
> 2) as a whole line, fgets()
>
> as C programmer, are we supposed to create a get_word function everytime
> when we need a words as input from stdin ( e.g. terminal)

Try using fgets(), and strtok(). strtok() will allow you to define word
separators to your taste.

Here is sample code:

------------------------------------------------------------------------
#include <stdio.h>
#include <string.h>

#define MAXLINE 500

int main(void)
{
    char line[MAXLINE];
    char *tok;

    while (fgets(line, MAXLINE, stdin) != NULL) {
        /* first token on each line */
        if ((tok = strtok(line, " \n")) != NULL)
            puts(tok);
        /* subsequent tokens */
        while ((tok = strtok(NULL, " \n")) != NULL)
            puts(tok);
    }
    return 0;
}

Sep 10 '08 #5

Pilcrow said:

On Wed, 10 Sep 2008 15:08:40 +0500, arnuld <su*****@invalid.address>
wrote:

<snip>

>>as C programmer, are we supposed to create a get_word function everytime
when we need a words as input from stdin ( e.g. terminal)

Try using fgets(), and strtok(). strtok() will allow you to define word
separators to your taste.

This is poor advice for a beginner. Whilst strtok does have its uses, it
also has issues - traps for the unwary programmer. These derive from its
maintenance of significant state between calls, which makes it unsuitable
for use in library functions.
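
One way to avoid that hidden state is to let the caller hold the position. The sketch below only illustrates the idea, assuming a hypothetical helper named next_token built on the standard strspn() and strcspn(); it is not taken from any particular library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a standard function.
 * Finds the next token in *cursor, where tokens are separated by any of
 * the characters in delims. On success returns a pointer to the token's
 * first character, stores its length in *len and advances *cursor past
 * the token. Returns NULL when no tokens remain. The input is never
 * modified and all state lives in the caller's cursor. */
const char *next_token(const char **cursor, const char *delims, size_t *len)
{
    const char *start = *cursor + strspn(*cursor, delims); /* skip separators */

    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }
    *len = strcspn(start, delims);   /* length of the token itself */
    *cursor = start + *len;
    return start;
}
```

Because the token is reported as pointer plus length, two independent scans can be interleaved, which is exactly what strtok() cannot do safely.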

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 10 '08 #6

Gordon Burditt

> I searched the c.l.c archives provided by Google as Google Groups with
> "word input" as the key words and did not come up with anything good.
> C++ has std::string for taking a word as input from stdin. C takes input
> in 2 ways:
>
> 1) as a character, getchar()
> 2) as a whole line, fgets()
>
> as C programmer, are we supposed to create a get_word function everytime
> when we need a words as input from stdin ( e.g. terminal)

The first step is to define what a "word" is.

How many words are these:

1. don't
2. antidisestablish-
mentarianism
3. Joe,Bob,Sally, and Henry.
4. Joe, Bob, Sally, and Henry.
5. $1,416,383,583.20
6. ()@#$#^&*#^%%#^@*^$&*$
7. George W. Bush
8. slam-dunk
9. 15th-century vase
10. M.O.N.S.T.E.R., the computer chess-playing machine
11. lo**************************************@yahoo.com

Justify your answers.

Sep 10 '08 #7

Pilcrow

On Wed, 10 Sep 2008 22:05:46 +0000, Richard Heathfield
<rj*@see.sig.invalid> wrote:

>Pilcrow said:

>On Wed, 10 Sep 2008 15:08:40 +0500, arnuld <su*****@invalid.address>
wrote:

<snip>

>>>as C programmer, are we supposed to create a get_word function everytime
when we need a words as input from stdin ( e.g. terminal)

Try using fgets(), and strtok(). strtok() will allow you to define word
separators to your taste.

This is poor advice for a beginner. Whilst strtok does have its uses, it
also has issues - traps for the unwary programmer. These derive from its
maintenance of significant state between calls, which makes it unsuitable

I understood that, and I am a 'beginner'. It is very adequately covered
in textbooks (see 'C in a Nutshell', ISBN 0-596-00697-7, page 440),
somewhat less so in K&R2. And I gave the questioner an example to help
him. My dissatisfaction with strtok() is that repeated separation
characters are treated as one, making it difficult to present the user
with an intuitively understandable interface. It is not usually a good
idea to equate ignorance and stupidity.
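
If each separator should count on its own, so that an empty field between two separators is reported rather than skipped, the scan can be done with strcspn() instead of strtok(). A minimal sketch, assuming a hypothetical helper named next_field:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a standard function.
 * Splits on single separator characters, so "a,,b" yields "a", "" and "b".
 * Returns a pointer to the field and stores its length in *len, or
 * returns NULL once the input is exhausted. The input is not modified. */
const char *next_field(const char **cursor, const char *seps, size_t *len)
{
    const char *start = *cursor;

    if (start == NULL)
        return NULL;                 /* previous call consumed the last field */
    *len = strcspn(start, seps);     /* may be 0: an empty field */
    *cursor = (start[*len] != '\0') ? start + *len + 1 : NULL;
    return start;
}
```

The caller starts with a pointer to the whole string and keeps calling until NULL comes back; fields are reported as pointer plus length, so nothing is overwritten.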

>for use in library functions.

Sep 10 '08 #8

arnuld wrote:

>
I searched the c.l.c archives provided by Google as Google Groups
with "word input" as the key words and did not come up with
anything good.

C++ has std::string for taking a word as input from stdin. C takes
input in 2 ways:

1) as a character, getchar()
2) as a whole line, fgets()

as C programmer, are we supposed to create a get_word function
everytime when we need a words as input from stdin ( e.g. terminal)

Well, first you have to define a word. Does it terminate on
blanks, on blanks and non-print chars, on blanks and tabs, etc.? I
think you will find that the C++ mechanism terminates on blanks and
'\n' (but I could well be wrong). Having defined it, you just
write the code to extract such a beast from a stream (or from a
string). At that point both you and your code reader know exactly
what the function extracts.

Don't forget to preserve the exit char. Something else may need
it.

Note that, having written the function, you are allowed to keep its
source (and its object code) and reuse it as often as you wish,
with minimum effort. If you have taken the elementary precaution
of writing it in standard C, you can use it anywhere.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

Sep 11 '08 #9

William Pursell

On 10 Sep, 11:08, arnuld <sunr...@invalid.address> wrote:

>
as C programmer, are we supposed to create a get_word function everytime
when we need a words as input from stdin ( e.g. terminal)

No. You should either find a function that does what you want
or write it yourself, and once you have done that...don't
ever do it again. Put it in a library and use it.

If you are re-writing the same function repeatedly, then
you aren't a C-programmer. You aren't any kind of programmer.
Re-writing the same functionality can be a useful
exercise for the novice, but it is a silly waste of
time otherwise.

Sep 11 '08 #10

arnuld

On Wed, 10 Sep 2008 17:01:53 -0500, Gordon Burditt wrote:

> The first step is to define what a "word" is.

For *my* program, a word is a collection of letters, numbers or anything
separated by space, tab or newline.

> How many words are these:
>
> 1. don't

1 word

> 2. antidisestablish-mentarianism

1 word

> 3. Joe,Bob,Sally, and Henry.

3 words. Joe,Bob,Sally, makes one word, and makes second, Henry. makes
3rd ( notice that full stop with Henry.)

> 4. Joe, Bob, Sally, and Henry.

5 words

> 5. $1,416,383,583.20

all 1 word. There is no space in between them.

> 6. ()@#$#^&*#^%%#^@*^$&*$

1 word

> 7. George W. Bush

3 words

> 8. slam-dunk

1 word

> 9. 15th-century vase

2 words

> 10. M.O.N.S.T.E.R., the computer chess-playing machine

5 words

> 11. lo**************************************@yahoo.com

1 word, of course

> Justify your answers.

Any collection of letters, symbols or numbers separated by single or
multiple spaces or tab or newline. Therefore

comp.lang.c++  -- 1 word
Std. Lib       -- 2 words
Lov@389&om     -- 1 word

I think it is pretty much clear now what a word is.

--
www.lispmachine.wordpress.com
my email is @ the above blog.
Google Groups is Blocked. Reason: Excessive Spamming

Sep 11 '08 #11

Bartc

"arnuld" <su*****@invalid.address> wrote in message
news:pa****************************@invalid.address...

> On Wed, 10 Sep 2008 17:01:53 -0500, Gordon Burditt wrote:
>
>> The first step is to define what a "word" is.
>
> For *my* program, a word is a collection of letters, numbers or anything
> separated by space, tab or newline.
>
>> How many words are these:
>>
>> 1. don't
>
> 1 word
>
>> 2. antidisestablish-mentarianism
>
> 1 word
>
>> 3. Joe,Bob,Sally, and Henry.
>
> 3 words. Joe,Bob,Sally, makes one word, and makes second, Henry. makes
> 3rd ( notice that full stop with Henry.)

You have commas in the middle of words?

Ever heard of comma-delimited files? Comma is way up there with space and
tab.

--
Bartc

Sep 11 '08 #12

arnuld

On Thu, 11 Sep 2008 09:36:49 +0000, Bartc wrote:

> You have commas in the middle of words?
>
> Ever heard of comma-delimited files? Comma is way up there with space and
> tab.

yes, I know and @%$@programmimnng34 is not a word either. If I start to
differentiate these things then it will become very complex to define what
a word is and there could be lots of controversy over what should be (or
could be ?) a word. So I take a simple approach, the white space
(whether a newline or a tab or a single space) separates the words. simple ...
--
www.lispmachine.wordpress.com
my email is @ the above blog.
Google Groups is Blocked. Reason: Excessive Spamming

Sep 11 '08 #13

arnuld

On Wed, 10 Sep 2008 20:01:33 -0400, CBFalconer wrote:

> Well, first you have to define a word. Does it terminate on
> blanks, on blanks and non-print chars, on blanks and tabs, etc. I
> think you will find that the C++ mechanism terminates on blanks and
> '\n' (but I could well be wrong).

I have told this already in my last reply ( to BartC )

> Having defined it, you just
> write the code to extract such a beast from a stream (or from a
> string). At that point both you and your code reader know exactly
> what the function extracts.

Now there is a big problem in this. In C++ I don't have to care whether
users enter one word or 100s. Memory was being managed by std. lib.
vector. Now here I am thinking of using fgets() to store the input,
which has 2 problems:

1) extract words from each line.
2) fgets() uses an array to store data and I don't know how large
the input is, so I can't decide on the size of the array.

> Don't forget to preserve the exit char.
> Something else may need it.

You mean the null character?

> Note that, having written the function, you are allowed to keep its
> source (and its object code) and reuse it as often as you wish, with
> minimum effort. If you have taken the elementary precaution of writing
> it in standard C, you can use it anywhere.

That's what I want to do, write in ANSI C :)
--
www.lispmachine.wordpress.com
my email is @ the above blog.
Google Groups is Blocked. Reason: Excessive Spamming

Sep 11 '08 #14

arnuld said:

<snip>

> Now here I am thinking of using fgets() to store the input,
> which has 2 problems:
>
> 1) extract words from each line.
> 2) fgets() uses array to store data and I don't know how large is
> the input, so I can't decide on the size of the array.

This is a common problem - so common, in fact, that I wrote it up on the
Web. Take a look at http://www.cpax.org.uk/prg/writings/fgetdata.php which
looks at scanf, gets, and fgets, points out the difficulties with each,
and then discusses a possible solution to the problem of arbitrarily long
lines.

On that page, I present code for reading a word at a time, and for reading
a line at a time. In fact, since you supply your own delimiters, reading a
line is really just a special case of reading a word!
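
The general shape of such a reader can be sketched as follows. This is not the code from the page mentioned above, only a minimal illustration under stated assumptions: the name read_token, the starting capacity of 16 and the doubling policy are all hypothetical choices.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper; an illustration only.
 * Reads characters from fp until one of the characters in delims, or end
 * of input, is seen. Leading delimiters are skipped, and a '\0' byte in
 * the input is treated as a delimiter. The result is a malloc'd,
 * null-terminated string that the caller must free().
 * Returns NULL at end of input or if memory allocation fails. */
char *read_token(FILE *fp, const char *delims)
{
    size_t cap = 16, len = 0;
    char *buf, *tmp;
    int c;

    while ((c = getc(fp)) != EOF && strchr(delims, c) != NULL)
        ;                            /* skip leading delimiters */
    if (c == EOF)
        return NULL;

    buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    do {
        if (len + 1 >= cap) {        /* keep one byte for the terminator */
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    } while ((c = getc(fp)) != EOF && strchr(delims, c) == NULL);

    if (c != EOF)
        ungetc(c, fp);               /* leave the delimiter for the caller */
    buf[len] = '\0';
    return buf;
}
```

Calling read_token(stdin, " \t\n") reads one whitespace-delimited word, while read_token(stdin, "\n") reads one line (skipping blank lines, because leading delimiters are discarded in this sketch).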

I do not pretend that my code is perfect. For example, the return values
could have been better chosen (I must fix that one day).

It is not intended to be a plug-in solution to the problem (although some
people do actually use it that way and, as far as I'm aware, no harm has
come to them as a result). Rather, it is intended to demonstrate one
possible approach to the problem, in the hope that the reader will have an
"aha!" moment and perhaps come up with a solution that fits his own needs
much better than a generic solution is likely to be able to do.

Several other approaches apart from the one I chose to demonstrate are also
discussed (but not demonstrated), the intent being to give a wider view of
various ways to tackle this problem, depending on your priorities.

Finally, the page provides links to a few other people's demonstrations of
how to solve this problem, again with the intent of providing a wider
perspective on different approaches.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 11 '08 #15

pete

arnuld wrote:

>On Thu, 11 Sep 2008 09:36:49 +0000, Bartc wrote:

>You have commas in the middle of words?

Ever heard of comma-delimited files? Comma is way up there with space and
tab.

yes, I know and @%$@programmimnng34 is not a word either. If I start to
differentiate these things then it will become very complex to define what
a word is and there could be lots of controversy over what should be (or
could be ?) a word. So I take a simple approach, the white space
(whether a newline or a tab or a single space) separates the words. simple ...

Your original question was:
"as C programmer,
are we supposed to create a get_word function everytime
when we need a words as input from stdin"

The answer is
"Yes; every time that you define what you want 'word' to mean."

--
pete

Sep 11 '08 #16

arnuld

On Thu, 11 Sep 2008 12:00:01 +0000, Richard Heathfield wrote:

This is a common problem - so common, in fact, that I wrote it up on the
Web. Take a look at http://www.cpax.org.uk/prg/writings/fgetdata.php which
looks at scanf, gets, and fgets, points out the difficulties with each,
and then discusses a possible solution to the problem of arbitrarily long
lines.

...SNIP....

I have not checked it but will be doing it later. The only one question
that keeps on popping up into my mind is "Why C was not designed to have
this feature ? ". That reminds me of an article "Back to Basics" by Joel
Spolsky where he said that we have null terminated strings in C which
are much slower than PASCAL strings not by choice but by force, as C was
developed on PDP-7, which had ASCIZ table, which required strings to be Z
terminated ( Z means ZERO). Do we have same kind of thing here in my
problem ?

I am just curious and feel a little strange on having this "word problem"
in C.

--
www.lispmachine.wordpress.com
my email is @ the above blog.
Google Groups is Blocked. Reason: Excessive Spamming

Sep 11 '08 #17

arnuld

On Thu, 11 Sep 2008 08:03:45 -0400, pete wrote:

Your original question was:
"as C programmer,
are we supposed to create a get_word function everytime
when we need a words as input from stdin"

The answer is
"Yes; every time that you define what you want 'word' to mean."

Yes, I think CBFalconer also answered that, and now things are getting much
more fundamental as I am starting to write code.

--
www.lispmachine.wordpress.com
my email is @ the above blog.
Google Groups is Blocked. Reason: Excessive Spamming

Sep 11 '08 #18

arnuld said:

>On Thu, 11 Sep 2008 12:00:01 +0000, Richard Heathfield wrote:

>This is a common problem - so common, in fact, that I wrote it up on the
Web. Take a look at http://www.cpax.org.uk/prg/writings/fgetdata.php
which looks at scanf, gets, and fgets, points out the difficulties with
each, and then discusses a possible solution to the problem of
arbitrarily long lines.

...SNIP....

I have not checked it but will be doing it later. The only one question
that keeps on popping up into my mind is "Why C was not designed to have
this feature ? ".

I answered that question already (see the above link).

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 11 '08 #19

arnuld wrote:

>Bartc wrote:

>You have commas in the middle of words? Ever heard of
comma-delimited files? Comma is way up there with space and tab.

yes, I know and @%$@programmimnng34 is not a word either. If I
start to differentiate these things then it will become very
complex to define what a word is and there could be lots of
controversy over what should be (or could be ?) a word. So I
take a simple approach, the white space (whether a newline or a
tab or a single space) separates the words. simple ...

But that is the point. Chars and lines are easily defined. Words
depend on the usage to be applied. Therefore the code to separate
words depends on the usage. You have to write the parsing code to
suit the job. It just isn't black and white.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

Sep 11 '08 #20

arnuld wrote:

>CBFalconer wrote:

.... snip ...

>
Now there is a big problem in this. In C++ i don't have to care
whether users enter one word or 100s. Memory was being managed by
std. lib. vector. Now here I am thinking of using fgets() to
store the input, which has 2 problems:

1) extract words from each line.
2) fgets() uses array top store data and I don't know how large
is the input, so I can't decide on the size of the array.

My suggestion is to use ggets, available in std. C source code at:

<http://cbfalconer.home.att.net/download/ggets.zip>

>
>Don't forget to preserve the exit char.
Something else may need it.

you mean null character ?

No. I mean the char that doesn't belong to the word and signifies
the completion. If you are getting the word from a string, put the
char back by backing up the pointer (or index). If coming from a
stream you have ungetc available.
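
A minimal sketch of that advice for the stream case, assuming whitespace as the word separator and a hypothetical function name get_word:

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not a standard function.
 * Reads one whitespace-delimited word from fp into buf, whose capacity
 * `size` should be at least 2. Leading whitespace is skipped. The
 * character that ends the word is pushed back with ungetc() so the
 * caller can still see it. Returns the number of characters stored, or
 * 0 at end of input. A word longer than size - 1 characters is split
 * across calls (a policy chosen for this sketch). */
size_t get_word(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    if (size == 0)
        return 0;

    while ((c = getc(fp)) != EOF && isspace(c))
        ;                            /* skip leading whitespace */

    while (c != EOF && !isspace(c)) {
        if (n + 1 >= size)
            break;                   /* buffer full: leave c for the next call */
        buf[n++] = (char)c;
        c = getc(fp);
    }
    if (c != EOF)
        ungetc(c, fp);               /* preserve the exit char */
    buf[n] = '\0';
    return n;
}
```

After a call that reads a word from "hello world\n", the next getc() on the same stream returns the space that ended the word, so a caller that cares about line boundaries can still detect the '\n'.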

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

Sep 11 '08 #21

arnuld wrote:

>

.... snip ...

>
> I have not checked it but will be doing it later. The only one
> question that keeps on popping up into my mind is "Why C was not
> designed to have this feature ? ". That reminds me of an article
> "Back to Basics" by Joel Spolsky where he said that we have null
> terminated strings in C which are much slower than PASCAL
> strings not by choice but by force, as C was developed on PDP-7,
> which had ASCIZ table, which required strings to be Z terminated
> ( Z means ZERO). Do we have same kind of thing here in my problem?

Speed depends on use. Most string processing just processes until
you hit the end of the string, and there is then no slowdown from
nul termination. In addition most strings are short, and again
there is little effort in finding length. With a little care you
can often avoid finding string lengths in advance.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

Sep 11 '08 #22

Chris Dollin

arnuld wrote:

> I have not checked it but will be doing it later. The only one question
> that keeps on popping up into my mind is "Why C was not designed to have
> this feature ? ".

Because C was designed for /implementing/ this feature; as a bare-bones
systems programming language.

> That reminds me of an article "Back to Basics" by Joel
> Spolsky where he said that we have null terminated strings in C which
> are much slower than PASCAL strings

I'd be interested in real evidence for this claim. Real, as in, it
happened in these programs and couldn't be eliminated by straightforward
fixes, rather than contrived examples or beginners' gotchas.

> not by choice but by force, as C was
> developed on PDP-7, which had ASCIZ table, which required strings to be Z
> terminated ( Z means ZERO).

That seems ... unlikely ... to me. Just because one's assembler has
an ASCIZ directive doesn't mean one has to use it; even if one does,
one can perfectly well also associate a length with a string as well
as a null terminator.

> Do we have same kind of thing here in my
> problem ?
>
> I am just curious and feel a little strange on having this "word problem"
> in C.

You've picked a language deliberately sparse in built-in features;
don't be surprised if it doesn't have many.

--
'It changed the future .. and it changed us.' /Babylon 5/

Hewlett-Packard Limited     Cain Road, Bracknell,    registered no:
registered office:          Berks RG12 1HN           690597 England

Sep 11 '08 #23

Pilcrow

On Thu, 11 Sep 2008 12:00:01 +0000, Richard Heathfield
<rj*@see.sig.invalid> wrote:

>arnuld said:

<snip>

>Now here I am thinking of using fgets() to store the input,
which has 2 problems:

1) extract words from each line.
2) fgets() uses array top store data and I don't know how large is
the input, so I can't decide on the size of the array.

This is a common problem - so common, in fact, that I wrote it up on the
Web. Take a look at http://www.cpax.org.uk/prg/writings/fgetdata.php which
looks at scanf, gets, and fgets, points out the difficulties with each,
and then discusses a possible solution to the problem of arbitrarily long
lines.

On that page, I present code for reading a word at a time, and for reading
a line at a time. In fact, since you supply your own delimiters, reading a
line is really just a special case of reading a word!

I do not pretend that my code is perfect. For example, the return values
could have been better chosen (I must fix that one day).

It is not intended to be a plug-in solution to the problem (although some
people do actually use it that way and, as far as I'm aware, no harm has
come to them as a result). Rather, it is intended to demonstrate one
possible approach to the problem, in the hope that the reader will have an
"aha!" moment and perhaps come up with a solution that fits his own needs
much better than a generic solution is likely to be able to do.

Several other approaches apart from the one I chose to demonstrate are also
discussed (but not demonstrated), the intent being to give a wider view of
various ways to tackle this problem, depending on your priorities.

Finally, the page provides links to a few other people's demonstrations of
how to solve this problem, again with the intent of providing a wider
perspective on different approaches.

Thank you so much! This is much more the sort of thing I was hoping to
find when I started reading this group.

I much appreciate the excellent documentation in the function itself.

Is there at least an index to other similar solutions to general
problems? In comp.lang.perl.misc one often sees people scolded for not
using tested, robust solutions, rather than reinventing the wheel. CPAN
largely fills most people's needs. At the risk of making myself a
complete bore, I ask again: why doesn't the C community follow this
example?

Now, if you just followed the same indenting and bracketing style that
is used in K&R2, I would be *totally* happy. I have a lot of trouble
reading yours. Never mind, I'll just have to write a perl script to
convert from your style to theirs. Shouldn't be too hard.

Thank you again!!

Sep 11 '08 #24

Pilcrow said:

<snip>

Is there at least an index to other similar solutions to general
problems?

http://www.google.com :-)

<snip>

Now, if you just followed the same indenting and bracketting style that
is used in K&R2, I would be *totally* happy.

Yes, but I wouldn't.

I have a lot of trouble reading yours.

You may well be the first person ever to say that. People have made all
kinds of complaints about my code, but readability is not usually high on
the hit-list.

Nevermind, I'll just have to write a perl script to
convert from your style to theirs. Shouldn't be too hard.

man indent

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 11 '08 #25

Keith Thompson

Pilcrow <Pi******@gmail.com> writes:

On Wed, 10 Sep 2008 22:05:46 +0000, Richard Heathfield
<rj*@see.sig.invalid> wrote:
>>Pilcrow said:
>>On Wed, 10 Sep 2008 15:08:40 +0500, arnuld <su*****@invalid.address>
wrote:
<snip>

>>>>as C programmer, are we supposed to create a get_word function everytime
when we need a words as input from stdin ( e.g. terminal)

Try using fgets(), and strtok(). strtok() will allow you to define word
separators to your taste.

This is poor advice for a beginner. Whilst strtok does have its uses, it
also has issues - traps for the unwary programmer. These derive from its
maintenance of significant state between calls, which makes it unsuitable

I understood that, and I am a 'beginner'. It is very adequately covered
in textbooks (see 'C in a Nutshell', ISBN 0-596-00697-7, page 440),
somewhat less so in K&R2. And I gave the questioner an example to help
him. My dissatisfaction with strtok() is that repeated separation
characters are treated as one, making it difficult to present the user
with an intuitively understandable interface. It is not usually a good
idea to equate ignorance and stupidity.

Yes, it certianly is. Did someone do that?

>>for use in library functions.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 11 '08 #26

Keith Thompson

arnuld <su*****@invalid.address> writes:

> On Wed, 10 Sep 2008 17:01:53 -0500, Gordon Burditt wrote:
>
>> The first step is to define what a "word" is.
>
> For *my* program, a word is a collection of letters, numbers or anything
> separated by space, tab or newline.

As you know, that definition is fine for your program; others might
have different requirements.

Incidentally, the phrase "letters, numbers, or anything" seems
redundant. I think that a more precise rendering of what you meant
would be:

A "word" is a non-empty contiguous sequence of characters other
than space, tab, or newline, preceded or followed either by a
space, tab, or newline or by the start or end of the input.

It would also be good to specify whether the input is a string, a line
of text, or an entire text file.

If I take your definition literally, then in the following
"word"
the word "word" is not a word, because it's not separated by space,
tab, or newline.

It might be more convenient to treat anything for which isspace()
returns true (or for which isspace() returns true in the "C" locale)
as a separator; that includes several whitespace characters that you
didn't mention. But of course if your requirements call for only
space, tab, and newline to be treated as separators, then that trumps
convenience.
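
A sketch of the isspace() variant, assuming the input is a null-terminated string and using a hypothetical function name count_words:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not a standard function.
 * Counts maximal runs of non-whitespace characters in s, using
 * isspace() to decide what counts as a separator. */
size_t count_words(const char *s)
{
    size_t count = 0;
    int in_word = 0;

    for (; *s != '\0'; s++) {
        /* the cast avoids undefined behaviour where plain char is signed */
        if (isspace((unsigned char)*s)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;             /* first character of a new word */
            count++;
        }
    }
    return count;
}
```

With this definition, "Joe,Bob,Sally, and Henry." counts as 3 words and "Joe, Bob, Sally, and Henry." as 5, while "antidisestablish-" followed by a newline and "mentarianism" counts as 2.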

>How many words are these:

1. don't

1 word

>2. antidisestablish-mentarianism

1 word

In the previous article, "antidisestablish-" and "mentarianism" were
on two lines, so they'd be two words by your definition. (Gordon's
point was that it's reasonable to treat them as a single word, since
that's what the hyphen means in English text, but if they're two words
by your definition then they're two words by your definition.)

[snip]

Any collection of letters,symbols or numbers separated by single or
multiple spaces or tab or newline. Therefore

comp.lang.c++ -- 1 word
Std. Lib -- 2 words
Lov@389&om -- 1 word
I think it is pretty much clear now what a word is.

It's pretty much clear what your definition of a word is. It's still
not at all clear what a word is in general (and it can't be, since the
term is used inconsistently).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 11 '08 #27

Pilcrow

On Thu, 11 Sep 2008 13:24:52 -0700, Keith Thompson <ks***@mib.org>
wrote:

>Pilcrow <Pi******@gmail.com> writes:
>On Wed, 10 Sep 2008 22:05:46 +0000, Richard Heathfield
<rj*@see.sig.invalid> wrote:
>>>Pilcrow said:
On Wed, 10 Sep 2008 15:08:40 +0500, arnuld <su*****@invalid.address>
wrote:
<snip>

>as C programmer, are we supposed to create a get_word function everytime
>when we need a words as input from stdin ( e.g. terminal)

Try using fgets(), and strtok(). strtok() will allow you to define word
separators to your taste.

This is poor advice for a beginner. Whilst strtok does have its uses, it
also has issues - traps for the unwary programmer. These derive from its
maintenance of significant state between calls, which makes it unsuitable

I understood that, and I am a 'beginner'. It is very adequately covered
in textbooks (see 'C in a Nutshell', ISBN 0-596-00697-7, page 440),
somewhat less so in K&R2. And I gave the questioner an example to help
him. My dissatisfaction with strtok() is that repeated separation
characters are treated as one, making it difficult to present the user
with an intuitively understandable interface. It is not usually a good
idea to equate ignorance and stupidity.

Yes, it certianly is. Did someone do that?

How many times does someone here say, in effect, "this is too deep for a
beginner"?

Sep 12 '08 #28

Pilcrow

On Thu, 11 Sep 2008 16:42:51 +0000, Richard Heathfield
<rj*@see.sig.invalid> wrote:

>Pilcrow said:

<snip>

>Is there at least an index to other similar solutions to general
problems?

http://www.google.com :-)

<snip>

>Now, if you just followed the same indenting and bracketting style that
is used in K&R2, I would be *totally* happy.

Yes, but I wouldn't.

>I have a lot of trouble reading yours.

You may well be the first person ever to say that. People have made all
kinds of complaints about my code, but readability is not usually high on
the hit-list.

I apologize. It was not really a complaint, more an expression of my
frustration.

I am still digesting that code. I was especially taken with the memory
management. It should be provided for all the other situations where
one sees a caution that one should make sure that there is adequate
room for the result. After I have gotten more experience with C, I
think I'll try my hand at it.

Sep 12 '08 #29

Keith Thompson

Pilcrow <Pi******@gmail.com> writes:

On Thu, 11 Sep 2008 13:24:52 -0700, Keith Thompson <ks***@mib.org>
wrote:
>>Pilcrow <Pi******@gmail.com> writes:

[...]

>>It is not usually a good idea to equate ignorance and stupidity.

Yes, it certianly is. Did someone do that?

s/certianly/certainly/

How many times does someone here say, in effect, "this is too deep for a
beginner"?

That's not equating ignorance and stupidity; it's equating ignorance
and ignorance. And ignorance isn't necessarily an insult; it's
usually curable, after all.

Sorry, but some things really are too deep for a beginner.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Sep 12 '08 #30

Pilcrow said:

On Thu, 11 Sep 2008 16:42:51 +0000, Richard Heathfield
<rj*@see.sig.invalid> wrote:

>>Pilcrow said:

<snip>

>>I have a lot of trouble reading yours.

You may well be the first person ever to say that. People have made all
kinds of complaints about my code, but readability is not usually high on
the hit-list.

I apologize.

I wish you wouldn't. You have every right to say what you said. I wasn't
being "precious" about it - merely surprised! In fact, I'd be quite
curious to know more about *why* you have a lot of trouble reading my
code. Maybe there's something I can change to make it easier for you to
read without making it more difficult for myself and others.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 12 '08 #31

Pilcrow <Pi******@gmail.com> wrote:

Is there at least an index to other similar solutions to general
problems? In comp.lang.perl.misc one often sees people scolded for not
using tested, robust solutions, rather than reinventing the wheel. CPAN
largely fills most peoples' needs. At the risk of making myself a
complete bore, I ask again: why doesn't the C community follow this
example?

Mainly because for many things C is used for, someone else's "almost
good enough" solution is _not_ good enough. In PERL, well... you're
having to deal with PERL already. That your string library (or rather,
someone else's string library) is slightly tentacular doesn't matter
much when you're already up to your knees in Cthulhuspawn.

Richard

Sep 12 '08 #32

arnuld

On Thu, 11 Sep 2008 12:42:13 +0000, Richard Heathfield wrote:

I answered that question already (see the above link).

If that's C's way of doing things, I have to admit it is very messy :( . I
really can't see why it is better than:

std::vector<std::string> svec;

--
www.lispmachine.wordpress.com
my email is @ the above blog.
Google Groups is Blocked. Reason: Excessive Spamming

Sep 12 '08 #33

vippstar

On Sep 12, 10:12 am, arnuld <sunr...@invalid.address> wrote:

On Thu, 11 Sep 2008 12:42:13 +0000, Richard Heathfield wrote:
I answered that question already (see the above link).

If that's C's way of doing things, I have to admit it is very messy :( . I
really can't see why it is better than:

std::vector<std::string> svec;

Well, for starters, because it does compile.

Sep 12 '08 #34

arnuld said:

>On Thu, 11 Sep 2008 12:42:13 +0000, Richard Heathfield wrote:

>I answered that question already (see the above link).

If that's C's way of doing things, I have to admit it is very messy :( .

It seems you have misunderstood.

The question was: why doesn't C have this feature (the ability to read
arbitrarily long lines) already? My answer to that perfectly reasonable
question is quite simply that there are many ways to do this, and no one
of them stands out as being the universally "right" decision.
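As a rough illustration of just one of those many ways (a sketch, not code posted in this thread): the function name read_line(), the starting size of 16 and the doubling growth policy below are all assumptions made for the example. Each of them is a decision another programmer might reasonably make differently, which is rather the point.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): reads one line of any length
 * from fp into a malloc'd buffer, dropping the trailing newline.
 * Returns NULL at end of file with nothing read, or if memory runs out.
 * The caller must free() the result. */
char *read_line(FILE *fp)
{
    size_t cap = 16, len = 0;          /* assumed starting capacity */
    char *buf = malloc(cap);
    int ch;

    if (buf == NULL)
        return NULL;

    while ((ch = getc(fp)) != EOF && ch != '\n') {
        if (len + 1 == cap) {          /* keep one byte free for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {         /* policy choice: discard the partial line */
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }

    if (ch == EOF && len == 0) {       /* nothing left to read */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

A caller would loop with something like `while ((line = read_line(stdin)) != NULL) { ...; free(line); }`. Note the choices baked in: there is no upper limit on line length (so hostile input can exhaust memory), a partial line is thrown away on allocation failure, and a fresh buffer is allocated per line. None of these is obviously "right" for every program.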

I really can't see why it is better than:

std::vector<std::string> svec;

Here's one obvious problem with that: it won't compile. Here's another:
assuming it did, it doesn't appear to be a function, so it's hard to see
how it could read anything at all.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999

Sep 12 '08 #35

Richard

Richard Heathfield <rj*@see.sig.invalid> writes:

Pilcrow said:

>On Thu, 11 Sep 2008 16:42:51 +0000, Richard Heathfield
<rj*@see.sig.invalid> wrote:

>>>Pilcrow said:

<snip>

>>>I have a lot of trouble reading yours.

You may well be the first person ever to say that. People have made all
kinds of complaints about my code, but readability is not usually high on
the hit-list.

I apologize.

I wish you wouldn't. You have every right to say what you said. I wasn't
being "precious" about it - merely surprised! In fact, I'd be quite
curious to know more about *why* you have a lot of trouble reading my
code. Maybe there's something I can change to make it easier for you to
read without making it more difficult for myself and others.

Speaking for myself and knowingly only picking on style things:

I hate this move to putting braces on their own lines. It's a horrible
waste of vertical space and "3 levels" for one unit is not natural.

e.g

while(t--){
    doSomething(t);
    doSomething2(t);
}

is much preferable to

while(t--)
{
    doSomething(t);
    doSomething2(t);
}

The closing brace matches to the opening "while". Clean. Economical.

Also you adopt the non "standard" option of putting your values to
compare against on the left. While "clever" it does not read as traditional
"English" and is not adopted widely elsewhere.

e.g

while( 0 == getValue(t))
    doSomething(t);

is horrible. 0 is not the thing we are interested in manipulating or
looking at; the return from getValue(t) is. It is that which we compare
against a benchmark figure - therefore

while(getValue(t)==0)
    doSomething(t);

reads much better and is more traditional.

Yes, we could argue until the cows come home about it and it is purely
style. But I have tried to justify my traditional K&R preferences. At
least you do not seem to have adopted Falconer's horrific habit of
having conditional targets on the same line e.g

if(error(r))printf("I'm a pedantic nutter");

Sep 12 '08 #36

Richard

vi******@gmail.com writes:

On Sep 12, 10:12 am, arnuld <sunr...@invalid.address> wrote:

On Thu, 11 Sep 2008 12:42:13 +0000, Richard Heathfield wrote:
I answered that question already (see the above link).

If that's C's way of doing things, I have to admit it is very messy :( . I
really can't see why it is better than:

std::vector<std::string> svec;

Well, for starters, because it does compile.

Minus 3 for being too late on your attempt to get promoted into the
c.l.c "reg" upper echelon. But that attempt combined with your
"indeeds", your "Mr heathfields" and various nauseating attempts at
belittling nOObs should ensure at least a cushion at RH's feet in the
near future.

Hint : it was perfectly clear what arnuld meant. Pretending other
languages do not exist (especially one as rooted in C history as C++) in
this NG is simply pathetic.

Sep 12 '08 #37

arnuld

On Fri, 12 Sep 2008 08:52:03 +0000, Richard Heathfield wrote:

It seems you have misunderstood.

The question was: why doesn't C have this feature (the ability to read
arbitrarily long lines) already? My answer to that perfectly reasonable
question is quite simply that there are many ways to do this, and no one
of them stands out as being the universally "right" decision.

No, I got it. You have discussed several ways of accomplishing the task,
none of which fits every case; you have shown the pros and cons of each
very *clearly*, and on that basis you argue reasonably why we don't have
such a function in the standard library. I think you are technical,
unbiased and right about it.

Here's one obvious problem with that: it won't compile. Here's another:
assuming it did, it doesn't appear to be a function, so it's hard to see
how it could read anything at all.

I know you are playing here ;). It won't compile because it's from C++.

<OT>
If I have to use C then I have to use one of the options you have suggested,
or do it in another language, but that's personal. I have posted the code on
comp.lang.c++ under the title "sorting the input":

http://groups.google.com/group/comp....8e58b666d0eb1#
The only problem I was having with C is that my mind drifted badly
from *thinking-in-problem* to *thinking-about-language-issues*,
and so I lost my focus. C++ preserved that focus and, IMVVHO (I am not
sure), the C++ version will run as fast as the C version. But I don't think
C++ is better than C, because there are cases where C++ will not fit,
such as resource- and memory-constrained systems where no library is
available; there, even if you use a C++ compiler, you will have to
learn and use the *C Way*. You can't do anything else.

But since that's personal, why would I even use C++ at all? I would prefer
Common Lisp and avoid working on resource- and memory-constrained
systems. I think they disrupt and kill my ability to think in the
problem. </OT>

--
www.lispmachine.wordpress.com
my email is @ the above blog.
Google Groups is Blocked. Reason: Excessive Spamming

Sep 12 '08 #38

vippstar

On Sep 12, 1:18 pm, Richard <rgr...@gmail.com> wrote:

[replying to me]

Minus 3 for being too late on your attempt to get promoted into the
c.l.c "reg" upper echelon. But that attempt combined with your
"indeeds", your "Mr heathfields" and various nauseating attempts at
belittling nOObs should ensure at least a cushion at RH's feet in the
near future.

I'm tired of this. I don't give a crap what Heathfield thinks of me.
I'm here to learn C and help others do the same, not to socialize.

Sep 12 '08 #39

arnuld

On Fri, 12 Sep 2008 12:18:27 +0200, Richard wrote:

>vi******@gmail.com writes:

>On Sep 12, 10:12 am, arnuld <sunr...@invalid.address> wrote:
>On Thu, 11 Sep 2008 12:42:13 +0000, Richard Heathfield wrote:
I answered that question already (see the above link).

If that's C's way of doing things, I have to admit it is very messy :( . I
really can't see why it is better than:

std::vector<std::string> svec;

Well, for starters, because it does compile.

You must be using Google Groups; that's why I don't see your post. Anyway,
Richard is right. It won't compile (this is comp.lang.c).

--
www.lispmachine.wordpress.com
my email is @ the above blog.
Google Groups is Blocked. Reason: Excessive Spamming

Sep 12 '08 #40

vippstar

On Sep 12, 1:32 pm, arnuld <sunr...@invalid.address> wrote:

On Fri, 12 Sep 2008 12:18:27 +0200, Richard wrote:
vipps...@gmail.com writes:
On Sep 12, 10:12 am, arnuld <sunr...@invalid.address> wrote:
On Thu, 11 Sep 2008 12:42:13 +0000, Richard Heathfield wrote:
I answered that question already (see the above link).

>If that's C's way of doing things, I have to admit it is very messy :( . I
really can't see why it is better than:

> std::vector<std::string> svec;

Well, for starters, because it does compile.

You must be using Google Groups; that's why I don't see your post. Anyway,
Richard is right. It won't compile (this is comp.lang.c).

Yes I do. There are better spam filters than just blocking a service...
I'm also right; I meant that the C way is better than the C++ line
because the former *does* compile (unlike the latter, which does not).

Sep 12 '08 #41

Richard

vi******@gmail.com writes:

On Sep 12, 1:18 pm, Richard <rgr...@gmail.com> wrote:

[replying to me]

>Minus 3 for being too late on your attempt to get promoted into the
c.l.c "reg" upper echelon. But that attempt combined with your
"indeeds", your "Mr heathfields" and various nauseating attempts at
belittling nOObs should ensure at least a cushion at RH's feet in the
near future.

I'm tired of this. I don't give a crap what Heathfield thinks of me.
I'm here to learn C and help others do the same, not to socialize.

Hmmmm. Indeed. It's "Mr Heathfield" to you.

Sep 12 '08 #42

Pilcrow <Pi******@gmail.com> wrote:

On Wed, 10 Sep 2008 22:05:46 +0000, Richard Heathfield
Pilcrow said:

Try using fgets(), and strtok(). strtok() will allow you to define word
separators to your taste.
This is poor advice for a beginner. Whilst strtok does have its uses, it
also has issues - traps for the unwary programmer. These derive from its
maintenance of significant state between calls, which makes it unsuitable

I understood that, and I am a 'beginner'. It is very adequately covered
in textbooks (see 'C in a Nutshell', ISBN 0-596-00697-7, page 440),
somewhat less so in K&R2. And I gave the questioner an example to help
him. My dissatisfaction with strtok() is that repeated separation
characters are treated as one, making it difficult to present the user
with an intuitively understandable interface. It is not usually a good
idea to equate ignorance and stupidity.

There is also the catch that strtok() scribbles over its parameter,
meaning that you cannot use it to tokenise either a string literal, or
data you want to keep. This is something that catches out a lot of less
well-informed newbies.
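A minimal sketch of the usual workaround, assuming you are happy to pay for a copy: tokenise a private, writable duplicate and leave the original alone. The name count_tokens() is made up for this example and is not from any post in the thread.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the thread): counts the tokens in text
 * without modifying it, by running strtok() on a private copy.
 * Returns the number of tokens, or -1 if the copy cannot be allocated. */
int count_tokens(const char *text, const char *delims)
{
    size_t size = strlen(text) + 1;
    char *copy = malloc(size);
    char *tok;
    int count = 0;

    if (copy == NULL)
        return -1;
    memcpy(copy, text, size);

    /* strtok() writes '\0' over the delimiters it finds, so it must be
     * given writable storage: the copy, never a string literal. */
    for (tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;

    free(copy);
    return count;
}
```

Because only the copy is modified, count_tokens("one two three", " ") is safe even though the argument is a string literal. The same sketch also shows the behaviour Pilcrow complains about: count_tokens("a,,b", ",") gives 2, not 3, because strtok() treats a run of separators as one.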

Richard

Sep 12 '08 #43

arnuld <su*****@invalid.address> wrote:

If that's C's way of doing things, I have to admit it is very messy :( . I
really can't see why it is better than:

std::vector<std::string> svec;

Confucius, he says: "if you want C++, you know where to find it".

Richard

Sep 12 '08 #44

Richard

rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:

arnuld <su*****@invalid.address> wrote:

>If that's C's way of doing things, I have to admit it is very messy :( . I
really can't see why it is better than:

std::vector<std::string> svec;

Confucius, he says: "if you want C++, you know where to find it".

Richard

If I were looking at designing solutions in C for such things then it
would be remiss of me NOT to look to see how C++ has done it in the
meantime. It could lead to a lot of time saving. Sure you can not use
the C++ syntax but things are never there for "no reason". And in that
context mentioning the way C++ does it here is clearly topical and
possibly useful to C library designers.

Sep 12 '08 #45

James Kuyper

arnuld wrote:

I searched the c.l.c archives provided by Google as Google Groups with
"word input" as the key words and did not come up with anything good.
C++ has std::string for taking a word as input from stdin.

Could you identify the std::string feature that implements this? I
couldn't find any use of the word "word" anywhere in section 21 of the
C++ standard, which describes std::string.

Sep 12 '08 #46

James Kuyper

Richard wrote:

Richard Heathfield <rj*@see.sig.invalid> writes:

....

>I wish you wouldn't. You have every right to say what you said. I wasn't
being "precious" about it - merely surprised! In fact, I'd be quite
curious to know more about *why* you have a lot of trouble reading my
code. Maybe there's something I can change to make it easier for you to
read without making it more difficult for myself and others.

Speaking for myself and knowingly only picking on style things:

I hate this move to putting braces on their own lines. It's a horrible
waste of vertical space and "3 levels" for one unit is not natural.

Vertical space is not in short supply. Personally, I handle that issue
the same way Richard Heathfield does. My reason is that it makes it
easier to identify and move block statements around when there's a set
of lines which is used for the block, and only that block, including the
delimiting curly brackets.

Also you adopt the non "standard" option of putting your values to
compare against on the left. While "clever" it does not read as traditional
"English" and is not adopted widely elsewhere.

e.g

while( 0 == getValue(t))
    doSomething(t);

I don't personally use this style, for reasons similar to yours.
However, are you aware of the reason why some people do this? When a
literal is the left operand of a comparison, rather than the right,
there is no danger of your code being silently compiled if you
accidentally type "=" instead of "==". I'll grant you that this doesn't
make any difference when the right operand is also something which could
not be the left operand of an assignment, such as a function call.
However, this kind of rule is much more effective when used
consistently, rather than always asking yourself "is it needed here?". I
tried this style, but found it very hard to break old habits; but I
would not criticize people for adopting it.
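To make the point concrete, here is a small sketch; both function names are invented for illustration and the bodies are deliberately trivial. The two functions behave identically as written. The difference lies only in what happens if one "=" is dropped by accident.

```c
/* Hypothetical illustration: conventional operand order. */
int is_zero_conventional(int x)
{
    /* If this were mistyped as `x = 0` it would still be valid C:
     * x would be set to 0 and the test would always be false.
     * Many compilers warn about this only when asked to. */
    if (x == 0)
        return 1;
    return 0;
}

/* Hypothetical illustration: constant on the left. */
int is_zero_constant_first(int x)
{
    /* If this were mistyped as `0 = x` it would be a constraint
     * violation (0 is not a modifiable lvalue), so the compiler
     * is required to issue a diagnostic. */
    if (0 == x)
        return 1;
    return 0;
}
```

As noted above, the protection disappears when neither operand can be assigned to: `getValue(t) = 0` is rejected in either order.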

Sep 12 '08 #47

Richard

James Kuyper <ja*********@verizon.net> writes:

Richard wrote:
>Richard Heathfield <rj*@see.sig.invalid> writes:
...

>>I wish you wouldn't. You have every right to say what you said. I
wasn't being "precious" about it - merely surprised! In fact, I'd
be quite curious to know more about *why* you have a lot of trouble
reading my code. Maybe there's something I can change to make it
easier for you to read without making it more difficult for myself
and others.

Speaking for myself and knowingly only picking on style things:

I hate this move to putting braces on their own lines. It's a horrible
waste of vertical space and "3 levels" for one unit is not natural.

Vertical space is not in short supply. Personally, I handle that issue
the same way Richard Heathfield does. My reason is that it makes it
easier to identify and move block statements around when there's a set
of lines which is used for the block, and only that block, including
the delimiting curly brackets.

Moving blocks around sounds very rare to me. And even so, it's one
keystroke to realign, etc.; hardly worth adopting an entire new
layout to support. I have emacs set up so that a single key sequence
collects the entire scope into the clipboard anyway.

>Also you adopt the non "standard" option of putting your values to
compare against on the left. While "clever" it does not read as traditional
"English" and is not adopted widely elsewhere.

e.g

while( 0 == getValue(t))
    doSomething(t);

I don't personally use this style, for reasons similar to
yours. However, are you aware of the reason why some people do this?

Yes. Well, one reason. And it's not the same as the other one (which I
knew too :-;).

When a literal is the left operand of a comparison, rather than the
right, there is no danger of your code being silently compiled if you
accidentally type "=" instead of "==". I'll grant you that this

Fabricated and blown out of all proportion IMO. What about protecting
against someone typing "a==0" instead of "a=0", or one of a million
other errors? It's a coding error, and 2 seconds of testing or debugging
puts that right.

doesn't make any difference when the right operand is also something
which could not be the left operand of an assignment, such as a
function call. However, this kind of rule is much more effective when
used consistently, rather than always asking yourself "is it needed
here?". I tried this style, but found it very hard to break old
habits; but I would not criticize people for adopting it.

I would. The perceived benefits are more than offset by the non standard
"reading" of the code. In my opinion of course. I now expect the usual
sycophants to appear telling us how their productivity increased 30000%
when they adopted this notation ....

It's amusing that the people I know who use this "back to front" trend
are also some of the worst "team players" I have ever encountered and
tend to be jobsworth language lawyers rather than good, practical
programmers interested in contributing to a consistent and maintainable
code base.

Sep 12 '08 #48