473,396 Members | 1,810 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,396 software developers and data experts.

doubly-linked list & sorting

This program follows from the section 6.5 of K&R2 where authors created a
doubly-linked list using a binary-tree based approach. The only thing I
have rewritten myself is the getword function. I am using it because there
are many concepts involved in this program that I want to understand like:
1.) degree of efficiency of this program as compared to sort using hash-table
based approach.

2.) Keeping the program well structured so that anyone can have a look and
in some amount of time he can draw a map of the design and the
key-concepts involved in a program.

3.) Any other advice you think is important to be given to a beginning C
programmer :)


/* K&R2, exercise 6.4
*
* write a program that prints the distinct words in its input sorted
* alphabetically. Precede each word by its count.
*
* ( exercise follows the section on linked lists, so I think it is the general
* idea that I should use linked list to solve the problem )
*
*/


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

enum MAXSIZE { ARRSIZE = 1000, WORDSIZE = 30 };
int getword( char * , int );
struct tnode *add_to_tree( struct tnode *, char * );
void print_tree( struct tnode * );
struct tnode *tree_alloc( void );
char *strdup( char * );
int main( void )
{
struct tnode *root_node;
char word[ARRSIZE];

root_node = NULL;

while( getword( word, WORDSIZE ) )
{
root_node = add_to_tree( root_node, word );
}

print_tree( root_node );
return 0;
}
struct tnode {
char *word;
int count;
struct tnode *left;
struct tnode *right;
};

/* A program that takes a single word as input. If the input word
* contains more than 30 letters then only the extra letters will be
* discarded . For general purpose usage of English it does not make any
* sense to use a word larger than this size. Nearly every general purpose
* word can be expressed in a word with less than or equal to 30 letters.
*
* I am only considering pure words like "usenet". Words containing anything
* other than the 26 alphabets of English will simply be discarded e.g. like
* "usen@t" or "usenet3".
*
*/
int getword( char *word, int max_length )
{
int c;
char *w = word;
while( isspace( c = getchar() ) )
{
;
}

while( --max_length )
{
if( isalpha( c ) )
{
*w++ = c;
}
else if( isspace( c ) || c == EOF )
{
*w = '\0';
break;
}
else
{
return 0;
}

c = getchar();
}

/* I can simply ignore the if condition and directly write the '\0'
onto the last element because in worst case it will only rewrite
the '\n' that is put in there by else if clause.

or in else if clause, I could replace break with return word[0].

I thought these 2 ideas will be either inefficient or
a bad programming practice, so I did not do it.
*/
if( *w != '\0' )
{
*w = '\0';
}

return word[0];
}

/* adds the word onto the tree */
struct tnode *add_to_tree( struct tnode *tree, char *pc )
{
int match;

if( tree == NULL )
{
tree = tree_alloc();
tree->word = strdup( pc );
tree->count = 1;
tree->left = tree->right = NULL;
}
else if( 0 == (match = strcmp( pc, tree->word )) )
{
++tree->count;
}
else if( match 0 )
{
tree->right = add_to_tree( tree->right, pc );
}
else if( match < 0 )
{
tree->left = add_to_tree( tree->left, pc );
}

return tree;

}

/* prints the tree */
void print_tree( struct tnode *tree )
{
if ( tree )
{
print_tree(tree->left);
printf("%30s -- %4d\n", tree->word, tree->count);
print_tree(tree->right);
}

}
/* allocate memory for a node */
struct tnode *tree_alloc( void )
{
return malloc( sizeof( struct tnode ) );
}

char *strdup( char *pc )
{
char *pc2;

pc2 = malloc( strlen(pc) + 1 );

if( pc )
{
strcpy( pc2, pc );
}

return pc2;
}

============= OUTPUT ===============
[arnuld@raj C]$ gcc -ansi -pedantic -Wall -Wextra test.c
[arnuld@raj C]$ ./a.out
like this usenet and that of usenet is this
and -- 1
is -- 1
like -- 1
of -- 1
that -- 1
this -- 2
usenet -- 2
[arnuld@raj C]$

--
http://lispmachine.wordpress.com/
my email ID is @ the above address

Jun 27 '08 #1
4 2902
On Thu, 01 May 2008 16:19:47 +0500, arnuld <No****@NoPain.comwrote:
>This program follows from the section 6.5 of K&R2 where authors created a
doubly-linked list using a binary-tree based approach. The only thing I
have rewritten myself is the getword function. I am using it because there
are many concepts involved in this program that I want to understand like:
1.) degree of efficiency of this program as compared to sort using hash-table
based approach.

2.) Keeping the program well structured so that anyone can have a look and
in some amount of time he can draw a map of the design and the
key-concepts involved in a program.

3.) Any other advice you think is important to be given to a beginning C
programmer :)


/* K&R2, exercise 6.4
*
* write a program that prints the distinct words in its input sorted
* alphabetically. Precede each word by its count.
*
* ( exercise follows the section on linked lists, so I think it is the general
* idea that I should use linked list to solve the problem )
*
*/


#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

enum MAXSIZE { ARRSIZE = 1000, WORDSIZE = 30 };
int getword( char * , int );
struct tnode *add_to_tree( struct tnode *, char * );
void print_tree( struct tnode * );
struct tnode *tree_alloc( void );
char *strdup( char * );
int main( void )
{
struct tnode *root_node;
char word[ARRSIZE];
Here you define an array of 1000 char.
>
root_node = NULL;

while( getword( word, WORDSIZE ) )
Here you tell getword to only use the first 30 char in the 1000
element array. At this point your concerns about efficiency are
misplaced. You have been messing with this enum for at least a
half-dozen threads and you still don't get it right.
{
root_node = add_to_tree( root_node, word );
}

print_tree( root_node );
return 0;
}
struct tnode {
char *word;
int count;
struct tnode *left;
struct tnode *right;
};
How does your code compile if this is not before all the references to
the struct? I don't see an incomplete or tentative declaration for
this struct up near the top.
>
/* A program that takes a single word as input. If the input word
* contains more than 30 letters then only the extra letters will be
* discarded . For general purpose usage of English it does not make any
* sense to use a word larger than this size. Nearly every general purpose
* word can be expressed in a word with less than or equal to 30 letters.
*
* I am only considering pure words like "usenet". Words containing anything
* other than the 26 alphabets of English will simply be discarded e.g. like
* "usen@t" or "usenet3".
*
*/
int getword( char *word, int max_length )
{
int c;
char *w = word;
while( isspace( c = getchar() ) )
{
;
}

while( --max_length )
{
if( isalpha( c ) )
{
*w++ = c;
}
else if( isspace( c ) || c == EOF )
{
*w = '\0';
break;
}
else
{
return 0;
}

c = getchar();
}

/* I can simply ignore the if condition and directly write the '\0'
onto the last element because in worst case it will only rewrite
the '\n' that is put in there by else if clause.

or in else if clause, I could replace break with return word[0].

I thought these 2 ideas will be either inefficient or
a bad programming practice, so I did not do it.
*/
if( *w != '\0' )
{
*w = '\0';
}

return word[0];
What is the significance of returning the first character in the word?
>}

/* adds the word onto the tree */
struct tnode *add_to_tree( struct tnode *tree, char *pc )
{
int match;

if( tree == NULL )
{
tree = tree_alloc();
tree->word = strdup( pc );
tree->count = 1;
tree->left = tree->right = NULL;
}
else if( 0 == (match = strcmp( pc, tree->word )) )
{
++tree->count;
}
else if( match 0 )
{
tree->right = add_to_tree( tree->right, pc );
}
else if( match < 0 )
{
tree->left = add_to_tree( tree->left, pc );
}

return tree;

}

/* prints the tree */
void print_tree( struct tnode *tree )
{
if ( tree )
{
print_tree(tree->left);
printf("%30s -- %4d\n", tree->word, tree->count);
print_tree(tree->right);
}

}
/* allocate memory for a node */
struct tnode *tree_alloc( void )
{
return malloc( sizeof( struct tnode ) );
}

char *strdup( char *pc )
{
char *pc2;

pc2 = malloc( strlen(pc) + 1 );

if( pc )
I'm pretty sure you meant pc2 here.
{
strcpy( pc2, pc );
}

return pc2;
}

============= OUTPUT ===============
[arnuld@raj C]$ gcc -ansi -pedantic -Wall -Wextra test.c
[arnuld@raj C]$ ./a.out
like this usenet and that of usenet is this
and -- 1
is -- 1
like -- 1
of -- 1
that -- 1
this -- 2
usenet -- 2
[arnuld@raj C]$

Remove del for email
Jun 27 '08 #2
On May 1, 4:19 am, arnuld <NoS...@NoPain.comwrote:
This program follows from the section 6.5 of K&R2 where authors created a
doubly-linked list using a binary-tree based approach. The only thing I
have rewritten myself is the getword function. I am using it because there
are many concepts involved in this program that I want to understand like:

1.) degree of efficiency of this program as compared to sort using
hash-table
based approach.
Infinitely better than sorting using a hash table (assuming the sort works)
because you do not sort data with a hash table. The data elements are
randomly dispersed. The hash table structure is useful for
insert/update/delete/find in O(1) but you cannot pull out the data in an
ordered manner unless it is a very special hash (e.g. hash(int) = int).
2.) Keeping the program well structured so that anyone can have a look and
in some amount of time he can draw a map of the design and the
key-concepts involved in a program.
A good design is essential for good software. 80% of the cost of software
is maintenance, so a well designed product is a good product and a poorly
designed product is not as good.
3.) Any other advice you think is important to be given to a beginning C
programmer :)

/* K&R2, exercise 6.4
*
* write a program that prints the distinct words in its input sorted
* alphabetically. Precede each word by its count.
*
* ( exercise follows the section on linked lists, so I think it is the
general
* idea that I should use linked list to solve the problem )
*
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

enum MAXSIZE { ARRSIZE = 1000, WORDSIZE = 30 };

int getword( char * , int );

struct tnode *add_to_tree( struct tnode *, char * );
void print_tree( struct tnode * );
struct tnode *tree_alloc( void );
char *strdup( char * );

int main( void )
{
struct tnode *root_node;
char word[ARRSIZE];

root_node = NULL;

while( getword( word, WORDSIZE ) )
{
root_node = add_to_tree( root_node, word );
}

print_tree( root_node );

return 0;

}

struct tnode {
char *word;
int count;
struct tnode *left;
struct tnode *right;

};

/* A program that takes a single word as input. If the input word
* contains more than 30 letters then only the extra letters will be
* discarded . For general purpose usage of English it does not make any
* sense to use a word larger than this size. Nearly every general purpose
* word can be expressed in a word with less than or equal to 30 letters.
*
* I am only considering pure words like "usenet". Words containing
anything
* other than the 26 alphabets of English will simply be discarded e.g.
like
* "usen@t" or "usenet3".
*
*/
int getword( char *word, int max_length )
{
int c;
char *w = word;

while( isspace( c = getchar() ) )
{
;
}

while( --max_length )
{
if( isalpha( c ) )
{
*w++ = c;
}
else if( isspace( c ) || c == EOF )
{
*w = '\0';
break;
}
else
{
return 0;
}

c = getchar();
}

/* I can simply ignore the if condition and directly write the '\0'
onto the last element because in worst case it will only rewrite
the '\n' that is put in there by else if clause.

or in else if clause, I could replace break with return word[0].

I thought these 2 ideas will be either inefficient or
a bad programming practice, so I did not do it.
*/
if( *w != '\0' )
{
*w = '\0';
}

return word[0];

}

/* adds the word onto the tree */
struct tnode *add_to_tree( struct tnode *tree, char *pc )
{
int match;

if( tree == NULL )
{
tree = tree_alloc();
Assuming that tree_alloc() performs memory allocation, did the allocation
succeed?
tree->word = strdup( pc );
Unfortunately, strdup() is not standard, so this code is not maximally
portable. Secondly, did the allocation succeed?
tree->count = 1;
tree->left = tree->right = NULL;
}
else if( 0 == (match = strcmp( pc, tree->word )) )
{
++tree->count;
}
else if( match 0 )
{
tree->right = add_to_tree( tree->right, pc );
}
else if( match < 0 )
{
tree->left = add_to_tree( tree->left, pc );
}

return tree;

}

/* prints the tree */
void print_tree( struct tnode *tree )
{
if ( tree )
{
print_tree(tree->left);
printf("%30s -- %4d\n", tree->word, tree->count);
print_tree(tree->right);
}

}

/* allocate memory for a node */
struct tnode *tree_alloc( void )
{
return malloc( sizeof( struct tnode ) );

}

char *strdup( char *pc )
{
char *pc2;

pc2 = malloc( strlen(pc) + 1 );

if( pc )
{
strcpy( pc2, pc );
}

return pc2;

}

============= OUTPUT ===============
[arnuld@raj C]$ gcc -ansi -pedantic -Wall -Wextra test.c
[arnuld@raj C]$ ./a.out
like this usenet and that of usenet is this
and -- 1
is -- 1
like -- 1
of -- 1
that -- 1
this -- 2
usenet -- 2
[arnuld@raj C]$

** Posted from http://www.teranews.com **
Jun 27 '08 #3
On May 2, 8:45*pm, user923005 <dcor...@connx.comwrote:
Sample for Arnuld to compare:
Here are the exercises from Chapter 6 (except the hash table stuff
which seems out of place to me) as purloined from Richard Heathfield's
K&R2 solution site, and then made to work with arbitrary files, and
then the missing exercise was added and then I made some more icky
hacks and called it good. *That didn't make it good, but that's what I
called it.
[snip]
It also needs this header called ch6.h :

#ifndef KR2_CH6_PROTOTYPES_DEFINED
#define KR2_CH6_PROTOTYPES_DEFINED

#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct WORD {
char *Word;
size_t Count;
struct WORD *Left;
struct WORD *Right;
} WORD;

struct linelist {
struct linelist *next;
int line;
};
struct wordtree {
char *word;
struct linelist *firstline;
struct wordtree *left;
struct wordtree *right;
};
enum token {
TOK_ID = UCHAR_MAX + 1, /* Identifier. */
TOK_STRING, /* String constant. */
TOK_CHAR, /* Character constant. */
TOK_EOF /* End of file. */
};

/* Classification of an input character. */
enum class {
CLS_WS = 0, /* Whitespace. */
CLS_BEG_CMT, /* Slash-star beginning a comment. */
CLS_END_CMT, /* Star-slash ending a comment. */
CLS_OTHER, /* None of the above. */
CLS_IN_CMT = 4 /* Combined with one of the above,
indicates
* we're inside a comment. */
};

#define SUCCESS 0
#define CANNOT_MALLOC_WORDARRAY 1
#define NO_WORDS_ON_INPUT 2
#define NO_MEMORY_FOR_WORDNODE 3
#define NO_MEMORY_FOR_WORD 4
extern char *GetLine(char *, int, FILE *);
extern char *char_in_string(char *, int);
extern char *dupstr(char *);
extern char *tokenise(char **, char *);

extern int AddToTree(struct WORD **, size_t *, char *);
extern int CompareCounts(const void *, const void *);
extern int CompareWords(const void *, const void *);
extern int CompareWordStems(const void *, const void *);
extern int NoiseWord(char *);
extern int OutputWords(FILE *, size_t, struct WORD **);
extern int OutputStemmedWords(FILE *, size_t, struct WORD **);
extern int ReadInputToTree(struct WORD **, size_t *, FILE *);
extern int WalkTree(struct WORD **, struct WORD *);
extern int i_strcmp(const char *, const char *);
extern int i_strncmp(const char *s, const char *t, int n);

extern enum token getword(char *, int, FILE *);
extern struct linelist *addlink(int);
extern struct wordtree *addword(struct wordtree **, char *, int);

extern void FreeTree(struct WORD *);
extern void deletelist(struct linelist *);
extern void deleteword(struct wordtree **);
extern void printlist(struct linelist *, FILE *);
extern void printtree(struct wordtree *, FILE *);

#endif /* KR2_CH6_PROTOTYPES_DEFINED */

Jun 27 '08 #4
user923005 said:
Sample for Arnuld to compare:
Here are the exercises from Chapter 6 (except the hash table stuff
which seems out of place to me) as purloined from Richard Heathfield's
K&R2 solution site,
*******PLEASE******* don't go there. Go here instead:

http://clc-wiki.net/wiki/K%26R2_solutions

Could you please tell your cloth-eared master to update his links? Shout if
necessary. Thanks.

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Jun 27 '08 #5

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

11
by: Marc Ferry | last post by:
I already posted this mail in comp.sys.hp and comp.sys.hp.hpux but had no response. As this problem might be present on other OSes than HP-UX 10.20, I crosspost it here, in the hope of getting an...
12
by: DrBonzo | last post by:
Given public class A { virtual void foo() { ... } } public class B : A { public override void foo() { ... } public Bthingy() { ... base.foo() ...} }
4
by: paeez | last post by:
how can i add a node after ith node in a doubly linkedlist?(for example after the third node)
2
by: zubia | last post by:
hi how 2 deal with the rptr n lptr in doubly linklist
6
by: Sebastian Schack | last post by:
Hey everyone. Hope I'm in the right spot here... if not, please correct me. My .css starts with the definition of the body-part which looks as follows: body { background: #FFFFFF...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.