By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
435,573 Members | 3,008 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 435,573 IT Pros & Developers. It's quick & easy.

Binary Tree

P: n/a
Hi all,

I'm trying to do a binary search and collect some stats from a text
file in order to compare the processing times of this program (binary
searching) versus an old program using linked lists. I'm totally new
to binary searches by the way. Can anyone help me with the commented
sections below? Much of the code such as functions and printfs has
already been completed. Any help is greatly appreciated.

Thanks,
James

Expand|Select|Wrap|Line Numbers
  1. #include <stdio.h>
  2. #include <malloc.h>
  3. struct tnode {            // specify the "shape" of a tnode structure ...
  4. struct tnode *left;    // the left and right branch pointers
  5. struct tnode *right;
  6. int count;        // the word count as before
  7. char *word;        // a pointer to the word
  8. } *root;            // declare the root pointer variable
  9.  
  10. struct tnode **tree_search(struct tnode **, char *);
  11. void tree_stats(struct tnode *);
  12. int get_word(char *);
  13.  
  14. int total_nodes, total_words, high;
  15. struct tnode *most_frequent;
  16.  
  17. int main(int argc, char *argv[]) {
  18. struct tnode **tpp;
  19. char word_buff[100];    // the reusable word buffer
  20. int i;
  21. while(get_word(word_buff)) {
  22. tpp = tree_search(&root, word_buff);
  23. /****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
  24.  
  25.  
  26.  
  27.  
  28.  
  29.  
  30. }
  31. tree_stats(root);
  32. printf("total_nodes %d\n", total_nodes);
  33. printf("total_words %d\n", total_words);
  34. if(most_frequent)
  35. printf("most frequent word <%s> count is %d\n",
  36. most_frequent->word, most_frequent->count);
  37. for(i = 1; i < argc; i++) {
  38. tpp = tree_search(&root, argv[i]);
  39. if((*tpp) == NULL)
  40. printf("%s is NOT in the tree\n", argv[i]);
  41. else
  42. printf("<%s> count is %d\n", argv[i], (*tpp)->count);
  43. }
  44. return(0);
  45. }
  46.  
  47. // binary tree search returning a pointer to the pointer leading to a
  48. hit
  49. struct tnode **tree_search(struct tnode **tpp, char *w) {
  50. /***** CODE TO DO THE BINARY SRCH *****/
  51. return(tpp);
  52. }
  53.  
  54. // gather stats to check tree correctness
  55. void tree_stats(struct tnode *tp) {
  56. /***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/
  57.  
  58.  
  59.  
  60.  
  61.  
  62. }
  63.  
  64. #include <ctype.h>
  65. /* Leave this routine EXACTLY as it stands */
  66. int get_word(char *s) {
  67. int c;
  68. do {
  69. c = getchar();
  70. if(c == EOF)
  71. return(0);
  72. } while(!isalpha(c) && !isdigit(c));
  73. do {
  74. if(isupper(c))
  75. c = tolower(c);
  76. *s++ = c;
  77. c = getchar();
  78. } while(isalpha(c) || isdigit(c));
  79. *s = 0;
  80. return(1);
  81. }
  82.  
  83.  
Nov 15 '05 #1
Share this Question
Share on Google+
15 Replies


P: n/a

Foodbank schreef:
Hi all,

I'm trying to do a binary search and collect some stats from a text
file in order to compare the processing times of this program (binary
searching) versus an old program using linked lists.
A "binary search" does not involve any trees, it's a way of searching
through *sorted* sequential lists.
I'm totally new
to binary searches by the way. Can anyone help me with the commented
sections below? Much of the code such as functions and printfs has
already been completed.
Yes. All but the actual implementation of the tree-building and
searching algo's, which are left to the student. I bet that's what the
teacher gave you as an assignment.
Any help is greatly appreciated. /****CODE TO ADD NEW NODES AND COUNT REPEATS *****/ <snip>/***** CODE TO DO THE BINARY SRCH *****/ <snip>/***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/


Do your own homework, or at least let us see you trying.

Nov 15 '05 #2

P: n/a

Foodbank wrote:
Hi all,

I'm trying to do a binary search and collect some stats from a text
file in order to compare the processing times of this program (binary
searching) versus an old program using linked lists. I'm totally new
to binary searches by the way. Can anyone help me with the commented
sections below?
Here's a couple links to how binary trees work. It's about numbers
in a Collatz sequence instead of words but the same principles apply.
In fact, I learned how to do this from the O'Reilly book "Practical
C Programming" where the example processes a list of words. The
solution
I talk about uses a three pointer structure that allows me to create
a binary tree that is simultaneously a linked list. Ignore that third
pointer feature if it's not applicable to your problem.

This is a quickie tutorial on binary trees

<http://groups.google.com/group/Dymanic_Systems/browse_frm/thread/40b6e1f3d696be54/cb42098abc4cf84e?hl=en#cb42098abc4cf84e>

And this is how a binary tree is applicable to the problem of
finding a duplicate number in a Collatz sequence.

<http://groups.google.com/group/Dymanic_Systems/browse_frm/thread/43c11da057cfb30c/6f35e1f30e0a3b43?hl=en#6f35e1f30e0a3b43>

If none of this makes any sense, I can dig up my code examples,
but right now they are coded to use GMP unlimited precision integers
and won't directly plug into your program, but the principles should
apply.
Much of the code such as functions and printfs has
already been completed. Any help is greatly appreciated.

Thanks,
James

Expand|Select|Wrap|Line Numbers
  1.  #include <stdio.h>
  2.  #include <malloc.h>
  3.  struct tnode {            // specify the "shape" of a tnode structure ...
  4.      struct tnode *left;    // the left and right branch pointers
  5.      struct tnode *right;
  6.      int count;        // the word count as before
  7.      char *word;        // a pointer to the word
  8.  } *root;            // declare the root pointer variable
  9.  struct tnode **tree_search(struct tnode **, char *);
  10.  void tree_stats(struct tnode *);
  11.  int get_word(char *);
  12.  int total_nodes, total_words, high;
  13.  struct tnode *most_frequent;
  14.  int main(int argc, char *argv[]) {
  15.      struct tnode **tpp;
  16.      char word_buff[100];    // the reusable word buffer
  17.      int i;
  18.      while(get_word(word_buff)) {
  19.          tpp = tree_search(&root, word_buff);
  20.          /****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
  21.      }
  22.      tree_stats(root);
  23.      printf("total_nodes %d\n", total_nodes);
  24.      printf("total_words %d\n", total_words);
  25.      if(most_frequent)
  26.          printf("most frequent word <%s> count is %d\n",
  27.              most_frequent->word, most_frequent->count);
  28.      for(i = 1; i < argc; i++) {
  29.          tpp = tree_search(&root, argv[i]);
  30.          if((*tpp) == NULL)
  31.              printf("%s is NOT in the tree\n", argv[i]);
  32.          else
  33.              printf("<%s> count is %d\n", argv[i], (*tpp)->count);
  34.      }
  35.      return(0);
  36.  }
  37.  // binary tree search returning a pointer to the pointer leading to a
  38.  hit
  39.  struct tnode **tree_search(struct tnode **tpp, char *w) {
  40.      /***** CODE TO DO THE BINARY SRCH *****/
  41.      return(tpp);
  42.  }
  43.  // gather stats to check tree correctness
  44.  void tree_stats(struct tnode *tp) {
  45.      /***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/
  46.  }
  47.  #include <ctype.h>
  48.  /* Leave this routine EXACTLY as it stands */
  49.  int get_word(char *s) {
  50.      int c;
  51.      do {
  52.          c = getchar();
  53.          if(c == EOF)
  54.              return(0);
  55.      } while(!isalpha(c) && !isdigit(c));
  56.      do {
  57.          if(isupper(c))
  58.              c = tolower(c);
  59.          *s++ = c;
  60.          c = getchar();
  61.      } while(isalpha(c) || isdigit(c));
  62.      *s = 0;
  63.      return(1);
  64.  }
  65.  


Nov 15 '05 #3

P: n/a

me********@aol.com wrote:
Foodbank wrote:
Hi all,

I'm trying to do a binary search and collect some stats from a text
file in order to compare the processing times of this program (binary
searching) versus an old program using linked lists. I'm totally new
to binary searches by the way. Can anyone help me with the commented
sections below?
Here's a couple links to how binary trees work. It's about numbers
in a Collatz sequence instead of words but the same principles apply.
In fact, I learned how to do this from the O'Reilly book "Practical
C Programming" where the example processes a list of words. The
solution
I talk about uses a three pointer structure that allows me to create
a binary tree that is simultaneously a linked list. Ignore that third
pointer feature if it's not applicable to your problem.

This is a quickie tutorial on binary trees

<http://groups.google.com/group/Dymanic_Systems/browse_frm/thread/40b6e1f3d696be54/cb42098abc4cf84e?hl=en#cb42098abc4cf84e>

And this is how a binary tree is applicable to the problem of
finding a duplicate number in a Collatz sequence.

<http://groups.google.com/group/Dymanic_Systems/browse_frm/thread/43c11da057cfb30c/6f35e1f30e0a3b43?hl=en#6f35e1f30e0a3b43>


These links take you to the end of the threads. I meant for you to read
the
first message in each thread (the others drift away from the subject
although
you're certainly welcome to read them).


If none of this makes any sense, I can dig up my code examples,
but right now they are coded to use GMP unlimited precision integers
and won't directly plug into your program, but the principles should
apply.
Much of the code such as functions and printfs has
already been completed. Any help is greatly appreciated.

Thanks,
James

Expand|Select|Wrap|Line Numbers
  1.  > #include <stdio.h>
  2.  > #include <malloc.h>
  3.  > struct tnode {            // specify the "shape" of a tnode structure ...
  4.  >     struct tnode *left;    // the left and right branch pointers
  5.  >     struct tnode *right;
  6.  >     int count;        // the word count as before
  7.  >     char *word;        // a pointer to the word
  8.  > } *root;            // declare the root pointer variable
  9.  >
  10.  > struct tnode **tree_search(struct tnode **, char *);
  11.  > void tree_stats(struct tnode *);
  12.  > int get_word(char *);
  13.  >
  14.  > int total_nodes, total_words, high;
  15.  > struct tnode *most_frequent;
  16.  >
  17.  > int main(int argc, char *argv[]) {
  18.  >     struct tnode **tpp;
  19.  >     char word_buff[100];    // the reusable word buffer
  20.  >     int i;
  21.  >     while(get_word(word_buff)) {
  22.  >         tpp = tree_search(&root, word_buff);
  23.  >         /****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
  24.  >
  25.  >
  26.  >
  27.  >
  28.  >
  29.  >
  30.  >     }
  31.  >     tree_stats(root);
  32.  >     printf("total_nodes %d\n", total_nodes);
  33.  >     printf("total_words %d\n", total_words);
  34.  >     if(most_frequent)
  35.  >         printf("most frequent word <%s> count is %d\n",
  36.  >             most_frequent->word, most_frequent->count);
  37.  >     for(i = 1; i < argc; i++) {
  38.  >         tpp = tree_search(&root, argv[i]);
  39.  >         if((*tpp) == NULL)
  40.  >             printf("%s is NOT in the tree\n", argv[i]);
  41.  >         else
  42.  >             printf("<%s> count is %d\n", argv[i], (*tpp)->count);
  43.  >     }
  44.  >     return(0);
  45.  > }
  46.  >
  47.  > // binary tree search returning a pointer to the pointer leading to a
  48.  > hit
  49.  > struct tnode **tree_search(struct tnode **tpp, char *w) {
  50.  >     /***** CODE TO DO THE BINARY SRCH *****/
  51.  >     return(tpp);
  52.  > }
  53.  >
  54.  > // gather stats to check tree correctness
  55.  > void tree_stats(struct tnode *tp) {
  56.  >     /***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/
  57.  >
  58.  >
  59.  >
  60.  >
  61.  >
  62.  > }
  63.  >
  64.  > #include <ctype.h>
  65.  > /* Leave this routine EXACTLY as it stands */
  66.  > int get_word(char *s) {
  67.  >     int c;
  68.  >     do {
  69.  >         c = getchar();
  70.  >         if(c == EOF)
  71.  >             return(0);
  72.  >     } while(!isalpha(c) && !isdigit(c));
  73.  >     do {
  74.  >         if(isupper(c))
  75.  >             c = tolower(c);
  76.  >         *s++ = c;
  77.  >         c = getchar();
  78.  >     } while(isalpha(c) || isdigit(c));
  79.  >     *s = 0;
  80.  >     return(1);
  81.  > }
  82.  >
  83.  > 


Nov 15 '05 #4

P: n/a
Thanks guys for the links. That made me laugh when you said this was
homework. I wish it were and I was still in school. However, I'm
self-teaching myself C in order to switch from my electrical
engineering based job (my degree in) into more of a programming based
job (I know, I have a long way to go, haha). The program is an
exercise from a book I'm reading.

Thanks,
Jams

Nov 15 '05 #5

P: n/a

"Foodbank" <v8********@yahoo.com> wrote
I'm trying to do a binary search and collect some stats from a text
file in order to compare the processing times of this program (binary
searching) versus an old program using linked lists. I'm totally new
to binary searches by the way. Can anyone help me with the commented
sections below? Much of the code such as functions and printfs has
already been completed. Any help is greatly appreciated.

The idea of a binary search is this.

We have a telephone directory, and want to find an arbitrary name - Gubbins.
Open the directory in the middle. The middle entry will be a name like
"Marks". If it is equal to "Gubbins" you have found your entry, if not, you
know the name must be somewhere between the start of the directory and
"Marks". Now go a quarter way through. This name is "Henry". Still above
"Gubbins", so we know "Gubbins" is in the top quarter. When we check an
eighth of the way through, we find "Evans". So Gubbins is in the second
eighth of the directory. Split on the third sixteenth and you get "Groves".
Gubbins is in the fourth sixteenth, between "Groves" and "Henry". Continue
doing this until you have narrowed to a single entry.

You will see that you eliminate half the possibilities each time, so you
need log2(N) operations to complete the search. This is much faster than
going through the directory sequentially.

Unfortunately this algorithm only works if the directory entries are held in
a sorted array. If we want to insert and delete entries, we have to move the
array about to keep it sorted, which is expensive.

The solution is to use a tree structure. We select an arbitrary entry about
halfway along as the root. All the entries lower than it go in the left
sub-tree, all the entries higher go in the right sub-tree. leaves have null
children. We can easily add new members by growing new leaves, without
rebuilding the whole tree. (The tree doesn't have to be perfectly balanced
for the algorithm to work in reasonable time. Keeping the tree balanced is a
whole new story we won't go into).

The first thing to do is to write the routine that builds the tree. The
first word you encounter goes in the root, the second becomes the right or
the left leaf, and you just continue until you run out of leaves.
Test the tree on a tiny dataset with only half a dozen words, and print it
out so that you can see it is correct.

Then write the function to search the tree, starting at the top and going
down until it either hits a match or encounters a null pointer indicating it
has fallen off a leaf.


Nov 15 '05 #6

P: n/a
Hi,

Am I on the right path in the section of the code that adds new nodes?
I'm not finished, but trying to make an attempt. It's under the
/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/ section. I'm
pretty sure that it's supposed to check if it's null and if so,
allocates memory for a new node. Syntax is probably wrong, but was
hoping for some input.

Thanks,
James

[code]
#include <stdio.h>
#include <malloc.h>
struct tnode { // specify the "shape" of a tnode
structure ...
struct tnode *left; // the left and right branch pointers
struct tnode *right;
int count; // the word count as before
char *word; // a pointer to the word

} *root; // declare the root pointer variable

struct tnode **tree_search(struct tnode **, char *);
void tree_stats(struct tnode *);
int get_word(char *);

int total_nodes, total_words, high;
struct tnode *most_frequent;
int main(int argc, char *argv[]) {
struct tnode **tpp;
char word_buff[100]; // the reusable word buffer
int i;
while(get_word(word_buff)) {
tpp = tree_search(&root, word_buff);
/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
///new code below here
if(root==NULL)
if(*tpp==NULL){
tpp=malloc(sizeof(struct tnode));
tpp->word = strdup(word_buff);
tpp->*left = NULL;
tpp->*right = NULL;
}
else statement here if there's a node there, increments count I
think, not sure which variables to use

}
[code]

Nov 15 '05 #7

P: n/a
Foodbank wrote:
Hi,

Am I on the right path in the section of the code that adds new nodes?
I'm not finished, but trying to make an attempt. It's under the
/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/ section. I'm
pretty sure that it's supposed to check if it's null and if so,
allocates memory for a new node. Syntax is probably wrong, but was
hoping for some input.
I've include my version of the example binary tree program.
It simply builds a tree from the command line arguments.

A key difference is a recursive function to enter a word
onto the tree. It starts at the root and follows left/right
pointers until it either finds the word already on the tree
or finds a NULL. Calling enter() with a NULL will add a node
with count=1 since adding a node means this is the first
occurence.

When the word is found to be already on the tree, we just need
to increment the count.

Also in the program are a pair of print routines, one to print
all the words in the tree with their frequency and another
that shows how the path through the binary tree.

And you will, of course, want to add a routine to free the
allocated memory.

Sample output:

$ ./a the ice was here the ice was there the ice was all around
it cracked and groaned and shrieked and howled like noises in a
swound

a (1) all (1) and (3) around (1) cracked (1) groaned (1)
here (1) howled (1) ice (3) in (1) it (1) like (1) noises (1)
shrieked (1) swound (1) the (3) there (1) was (3)

L L L L L [a - 1]
R U [all - 1]
R L L [and - 3]
R U [around - 1]
R L [cracked - 1]
R L [groaned - 1]
R U U U U [here - 1]
R L [howled - 1]
R U U [ice - 3]
R L L [in - 1]
R U [it - 1]
R L L [like - 1]
R L [noises - 1]
R U U [shrieked - 1]
R L [swound - 1]
R U U U U [the - 3]
R L L [there - 1]
R U [was - 3]
R U U

Thanks,
James

[code]
#include <stdio.h>
#include <malloc.h>
struct tnode { // specify the "shape" of a tnode
structure ...
struct tnode *left; // the left and right branch pointers
struct tnode *right;
int count; // the word count as before
char *word; // a pointer to the word

} *root; // declare the root pointer variable

struct tnode **tree_search(struct tnode **, char *);
void tree_stats(struct tnode *);
int get_word(char *);

int total_nodes, total_words, high;
struct tnode *most_frequent;
int main(int argc, char *argv[]) {
struct tnode **tpp;
char word_buff[100]; // the reusable word buffer
int i;
while(get_word(word_buff)) {
tpp = tree_search(&root, word_buff);
/****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
///new code below here
if(root==NULL)
if(*tpp==NULL){
tpp=malloc(sizeof(struct tnode));
tpp->word = strdup(word_buff);
tpp->*left = NULL;
tpp->*right = NULL;
}
else statement here if there's a node there, increments count I
think, not sure which variables to use

}
[code]



#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

// gcc -g -I. c/foodbank.c

struct node {
struct node *left; // tree to the left
struct node *right; // tree to the right
unsigned int count; // # of occurences...
char *word; // ...of this word
};

static struct node *root = NULL;

void memory_error(void)
{
fprintf(stderr, "Error:Out of memory\n");
exit(8);
}

char *save_w(char *string) // copy string to heap
{
char *new_string;
new_string = malloc((unsigned) (strlen(string)+1));
if (new_string == NULL)
memory_error();
strcpy(new_string,string);
return (new_string);
}

void enter(struct node **node, char *word)
{
char *save_w();
int comp;

if ((*node) == NULL)
{
(*node) = malloc(sizeof(struct node));
if ((*node) == NULL)
memory_error();
(*node)->left = NULL;
(*node)->right = NULL;
(*node)->count = 1; // new node = first occurence
(*node)->word = save_w(word); // of this word
return;
}
comp = strcmp((*node)->word,word);
if (comp==0 )
{
(*node)->count++; // word already in tree
return;
}
if (comp<0)
enter(&(*node)->right,word); // recursively try to enter word
else // until a NULL is found
enter(&(*node)->left,word);
}

void print_tree(struct node *top)
{
if (top == NULL)
return;
print_tree(top->left);
printf("%s (%d) ", top->word,top->count);
print_tree(top->right);
}

void print_tree_debug(struct node *top)
{
if (top == NULL)
return;
printf("L "); // go left
print_tree_debug(top->left);
printf("[%s - %d]\n", top->word,top->count);
printf("R "); // go right
print_tree_debug(top->right);
printf("U "); // go up
}

int main(int argc, char *argv[])
{
unsigned int i;

for (i=1; i<argc; i++)
enter(&root,argv[i]);

printf("\n");
print_tree(root);
printf("\n");

printf("\n");
print_tree_debug(root);
printf("\n");

return (0);
}

Nov 15 '05 #8

P: n/a
On 11 Oct 2005 15:26:31 -0700, "me********@aol.com"
<me********@aol.com> wrote:

snip
I've include my version of the example binary tree program.
It simply builds a tree from the command line arguments.

A key difference is a recursive function to enter a word
onto the tree. It starts at the root and follows left/right
pointers until it either finds the word already on the tree
or finds a NULL. Calling enter() with a NULL will add a node
with count=1 since adding a node means this is the first
occurence.

When the word is found to be already on the tree, we just need
to increment the count.

Also in the program are a pair of print routines, one to print
all the words in the tree with their frequency and another
that shows how the path through the binary tree.

And you will, of course, want to add a routine to free the
allocated memory.
snip
[code]



#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

// gcc -g -I. c/foodbank.c

struct node {
struct node *left; // tree to the left
struct node *right; // tree to the right
unsigned int count; // # of occurences...
char *word; // ...of this word
};

static struct node *root = NULL;


Since you pass root to every function that needs it, there is no need
for it to be at file scope. This will prevent an unintended function
from modifying it.

void memory_error(void)
{
fprintf(stderr, "Error:Out of memory\n");
exit(8);
EXIT_FAILURE would be more portable than 8.}

char *save_w(char *string) // copy string to heap
{
char *new_string;
new_string = malloc((unsigned) (strlen(string)+1));
if (new_string == NULL)
memory_error();
strcpy(new_string,string);
return (new_string);
}

void enter(struct node **node, char *word)
{
char *save_w();
Marginally incorrect and error prone. save_w was defined above. That
will serve as the prototype of any subsequent calls to the function.
If you are going to repeat a declaration, repeat it exactly.
int comp;

if ((*node) == NULL)
{
(*node) = malloc(sizeof(struct node));
if ((*node) == NULL)
memory_error();
(*node)->left = NULL;
(*node)->right = NULL;
(*node)->count = 1; // new node = first occurence
(*node)->word = save_w(word); // of this word
return;
}
comp = strcmp((*node)->word,word);
if (comp==0 )
{
(*node)->count++; // word already in tree
return;
}
if (comp<0)
enter(&(*node)->right,word); // recursively try to enter word
else // until a NULL is found
enter(&(*node)->left,word);
}

void print_tree(struct node *top)
{
if (top == NULL)
return;
print_tree(top->left);
printf("%s (%d) ", top->word,top->count);
print_tree(top->right);
}

void print_tree_debug(struct node *top)
{
if (top == NULL)
return;
printf("L "); // go left
print_tree_debug(top->left);
printf("[%s - %d]\n", top->word,top->count);
printf("R "); // go right
print_tree_debug(top->right);
printf("U "); // go up
}

int main(int argc, char *argv[])
{
unsigned int i;

for (i=1; i<argc; i++)
enter(&root,argv[i]);

printf("\n");
print_tree(root);
printf("\n");

printf("\n");
print_tree_debug(root);
printf("\n");

return (0);
}

<<Remove the del for email>>
Nov 15 '05 #9

P: n/a

Barry Schwarz wrote:
On 11 Oct 2005 15:26:31 -0700, "me********@aol.com"
<me********@aol.com> wrote:

snip
I've include my version of the example binary tree program.
It simply builds a tree from the command line arguments.

A key difference is a recursive function to enter a word
onto the tree. It starts at the root and follows left/right
pointers until it either finds the word already on the tree
or finds a NULL. Calling enter() with a NULL will add a node
with count=1 since adding a node means this is the first
occurence.

When the word is found to be already on the tree, we just need
to increment the count.

Also in the program are a pair of print routines, one to print
all the words in the tree with their frequency and another
that shows how the path through the binary tree.

And you will, of course, want to add a routine to free the
allocated memory.
snip
[code]



#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <stdlib.h>

// gcc -g -I. c/foodbank.c

struct node {
struct node *left; // tree to the left
struct node *right; // tree to the right
unsigned int count; // # of occurences...
char *word; // ...of this word
};

static struct node *root = NULL;


Since you pass root to every function that needs it, there is no need
for it to be at file scope. This will prevent an unintended function
from modifying it.


Where does it go? FWIW, I copied (and modified) this code out
of the O'Reilly book "Practical C Programming" by Steve Oualline.
I don't know enough to make a mistake that sophisticated.

void memory_error(void)
{
fprintf(stderr, "Error:Out of memory\n");
exit(8);
EXIT_FAILURE would be more portable than 8.


Ditto, didn't know there was such a constant.
}

char *save_w(char *string) // copy string to heap
{
char *new_string;
new_string = malloc((unsigned) (strlen(string)+1));
if (new_string == NULL)
memory_error();
strcpy(new_string,string);
return (new_string);
}

void enter(struct node **node, char *word)
{
char *save_w();


Marginally incorrect and error prone. save_w was defined above. That
will serve as the prototype of any subsequent calls to the function.
If you are going to repeat a declaration, repeat it exactly.


That one's partially my fault. The prototype is in the book,
but it got inexplicably changed when I was modifying it.

Thanks for pointing those out. I'm not a C programmer but
occaisionally need to create C programs. I'm completely
dependant on examples from books. Good to know that even
the books aren't perfect.
int comp;

if ((*node) == NULL)
{
(*node) = malloc(sizeof(struct node));
if ((*node) == NULL)
memory_error();
(*node)->left = NULL;
(*node)->right = NULL;
(*node)->count = 1; // new node = first occurence
(*node)->word = save_w(word); // of this word
return;
}
comp = strcmp((*node)->word,word);
if (comp==0 )
{
(*node)->count++; // word already in tree
return;
}
if (comp<0)
enter(&(*node)->right,word); // recursively try to enter word
else // until a NULL is found
enter(&(*node)->left,word);
}

void print_tree(struct node *top)
{
if (top == NULL)
return;
print_tree(top->left);
printf("%s (%d) ", top->word,top->count);
print_tree(top->right);
}

void print_tree_debug(struct node *top)
{
if (top == NULL)
return;
printf("L "); // go left
print_tree_debug(top->left);
printf("[%s - %d]\n", top->word,top->count);
printf("R "); // go right
print_tree_debug(top->right);
printf("U "); // go up
}

int main(int argc, char *argv[])
{
unsigned int i;

for (i=1; i<argc; i++)
enter(&root,argv[i]);

printf("\n");
print_tree(root);
printf("\n");

printf("\n");
print_tree_debug(root);
printf("\n");

return (0);
}

<<Remove the del for email>>


Nov 15 '05 #10

P: n/a
Thanks mensanator and everyone else who helped. I appreciate it. I'll
try this out and see how things go.

Regards,
James

Nov 15 '05 #11

P: n/a
Hey guys, I'm almost there, but getting a seg. fault. I have no idea
why. I've commented where it's happening at the location that says:

//*** SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0 ********

As a refresher, this is the binary tree that counts words from a text
file on the command line.

Thank you in advance,
Jim

Expand|Select|Wrap|Line Numbers
  1.  
  2. #include <stdio.h>
  3. #include <malloc.h>
  4. struct tnode {            // specify the "shape" of a tnode structure
  5. struct tnode *left;    // the left and right branch pointers
  6. struct tnode *right;
  7. int count;        // the word count as before
  8. char *word;        // a pointer to the word
  9. } *root;            // declare the root pointer variable
  10.  
  11. struct tnode **tree_search(struct tnode **, char *);
  12. void tree_stats(struct tnode *);
  13. int get_word(char *);
  14.  
  15. int total_nodes, total_words, high;
  16. struct tnode *most_frequent;
  17.  
  18. int main(int argc, char *argv[]) {
  19. struct tnode **tpp;
  20. char word_buff[100];    // the reusable word buffer
  21. int i;
  22. while(get_word(word_buff)) {
  23. tpp = tree_search(&root, word_buff);
  24. /****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
  25.  
  26. if(*tpp!=NULL){
  27. (*tpp)->count++;
  28. }
  29. else{
  30. (*tpp)=malloc(sizeof(struct tnode));
  31. (*tpp)->left=NULL;
  32. (*tpp)->right=NULL;
  33. (*tpp)->count=1;
  34. (*tpp)->word=strdup(word_buff);
  35. }
  36.  
  37. }
  38. tree_stats(root);
  39. printf("total_nodes %d\n", total_nodes);
  40. printf("total_words %d\n", total_words);
  41. if(most_frequent)
  42. printf("most frequent word <%s> count is %d\n",
  43. most_frequent->word, most_frequent->count);
  44. for(i = 1; i < argc; i++) {
  45. tpp = tree_search(&root, argv[i]);
  46. if((*tpp) == NULL)
  47. printf("%s is NOT in the tree\n", argv[i]);
  48. else
  49. printf("<%s> count is %d\n", argv[i], (*tpp)->count);
  50. }
  51. return(0);
  52. }
  53.  
  54. // binary tree search returning a pointer to the pointer leading to a
  55. hit
  56. struct tnode **tree_search(struct tnode **tpp, char *w) {
  57. /***** CODE TO DO THE BINARY SRCH *****/
  58. int cmp;
  59.  
  60. while(*tpp){
  61. cmp=strcmp(w,(*tpp)->word);
  62. //*****SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0  ********
  63. if (cmp>0) {
  64. tpp=(*tpp)->right;
  65. }
  66. else if (cmp<0){
  67. tpp=(*tpp)->left;
  68. }
  69. else if (cmp==0){
  70. break;
  71. }
  72. }
  73.  
  74.  
  75. return(tpp);
  76. }
  77.  
  78. // gather stats to check tree correctness
  79. void tree_stats(struct tnode *tp) {
  80. /***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/
  81. if(tp=NULL)
  82. return;
  83. total_words+=tp->count;
  84. total_nodes++;
  85. if(tp->count > high){
  86. high=tp->count;
  87. most_frequent=tp;
  88. }
  89. }
  90.  
  91. #include <ctype.h>
  92. /* Leave this routine EXACTLY as it stands */
  93. int get_word(char *s) {
  94. int c;
  95. do {
  96. c = getchar();
  97. if(c == EOF)
  98. return(0);
  99. } while(!isalpha(c) && !isdigit(c));
  100. do {
  101. if(isupper(c))
  102. c = tolower(c);
  103. *s++ = c;
  104. c = getchar();
  105. } while(isalpha(c) || isdigit(c));
  106. *s = 0;
  107. return(1);
  108. }
  109.  
Nov 15 '05 #12

P: n/a

Foodbank wrote:
Hey guys, I'm almost there, but getting a seg. fault. I have no idea
why.
I think a seg. fault means you have a bad pointer.

I thought, perhaps, someone else would have posted the answer by now,
but since that hasn't happened, I'll make a stab at it. Keep in mind
I'm not an expert, so what follows are guesses (hopefully educated).
I've commented where it's happening at the location that says:

//*** SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0 ********

As a refresher, this is the binary tree that counts words from a text
file on the command line.

Thank you in advance,
Jim

Expand|Select|Wrap|Line Numbers
  1.  #include <stdio.h>
  2.  #include <malloc.h>
  3.  struct tnode {            // specify the "shape" of a tnode structure
  4.      struct tnode *left;    // the left and right branch pointers
  5.      struct tnode *right;
  6.      int count;        // the word count as before
  7.      char *word;        // a pointer to the word
  8.  } *root;            // declare the root pointer variable
  •  
  • Root HAS to start out as a NULL pointer. I don't know if the
  • preceeding initializes the pointer or not. I think you either
  • need to initialize it here by
  •  
  • } *root = NULL;
  •  
  • assuming that's correct usage or else set it to null at the
  • start of main()
  •  
  • root = NULL
  •  
  •  struct tnode **tree_search(struct tnode **, char *);
  •  void tree_stats(struct tnode *);
  •  int get_word(char *);
  •  int total_nodes, total_words, high;
  •  struct tnode *most_frequent;
  •  int main(int argc, char *argv[]) {
  •      struct tnode **tpp;
  •      char word_buff[100];    // the reusable word buffer
  •      int i;
  •      while(get_word(word_buff)) {
  •          tpp = tree_search(&root, word_buff);
  •          /****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
  •          if(*tpp!=NULL){
  •              (*tpp)->count++;
  •          }
  •          else{
  •              (*tpp)=malloc(sizeof(struct tnode));
  •              (*tpp)->left=NULL;
  •              (*tpp)->right=NULL;
  •              (*tpp)->count=1;
  •              (*tpp)->word=strdup(word_buff);
  •          }
  •      }
  •      tree_stats(root);
  •      printf("total_nodes %d\n", total_nodes);
  •      printf("total_words %d\n", total_words);
  •      if(most_frequent)
  •          printf("most frequent word <%s> count is %d\n",
  •              most_frequent->word, most_frequent->count);
  •      for(i = 1; i < argc; i++) {
  •          tpp = tree_search(&root, argv[i]);
  •          if((*tpp) == NULL)
  •              printf("%s is NOT in the tree\n", argv[i]);
  •          else
  •              printf("<%s> count is %d\n", argv[i], (*tpp)->count);
  •      }
  •      return(0);
  •  }
  •  // binary tree search returning a pointer to the pointer leading to a
  •  hit
  •  struct tnode **tree_search(struct tnode **tpp, char *w) {
  •      /***** CODE TO DO THE BINARY SRCH *****/
  •      int cmp;
  •      while(*tpp){
  •  
  • The other thing I don't know is whether NULL pointers evaluate
  • to FALSE. If not, then you need to explicitly test for NULL in
  • the while statement.
  •          cmp=strcmp(w,(*tpp)->word);
  •     //*****SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0  ********
  •              if (cmp>0) {
  •                  tpp=(*tpp)->right;
  •                  }
  •              else if (cmp<0){
  •                  tpp=(*tpp)->left;
  •                  }
  •              else if (cmp==0){
  •                  break;
  •              }
  •  }
  •  return(tpp);
  •  }
  •  // gather stats to check tree correctness
  •  void tree_stats(struct tnode *tp) {
  •      /***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/
  •  if(tp=NULL)
  •      return;
  •      total_words+=tp->count;
  •      total_nodes++;
  •      if(tp->count > high){
  •          high=tp->count;
  •          most_frequent=tp;
  •      }
  •  }
  •  #include <ctype.h>
  •  /* Leave this routine EXACTLY as it stands */
  •  int get_word(char *s) {
  •      int c;
  •      do {
  •          c = getchar();
  •          if(c == EOF)
  •              return(0);
  •      } while(!isalpha(c) && !isdigit(c));
  •      do {
  •          if(isupper(c))
  •              c = tolower(c);
  •          *s++ = c;
  •          c = getchar();
  •      } while(isalpha(c) || isdigit(c));
  •      *s = 0;
  •      return(1);
  •  }
  •  


  • Nov 15 '05 #13

    P: n/a
    me********@aol.com wrote:
    Foodbank wrote:
    Hey guys, I'm almost there, but getting a seg. fault. I have no idea
    why.
    I think a seg. fault means you have a bad pointer.


    Or overrunning the end of a buffer. Or a number of other things.
    I thought, perhaps, someone else would have posted the answer by now,
    I didn't because I could see it retained a number of errors I had
    previously (and not many days before) posted corrections to. I can't
    remember if it was the same person, but it was obviously based on the
    same homework question. However, since someone else is showing some
    interest I decided to provide you with some feedback on lots of problems
    with it to help you learn what to avoid. I am almost certain there are
    further problems I have not pointed out.
    but since that hasn't happened, I'll make a stab at it. Keep in mind
    I'm not an expert, so what follows are guesses (hopefully educated).
    I've commented where it's happening at the location that says:

    //*** SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0 ********

    As a refresher, this is the binary tree that counts words from a text
    file on the command line.

    Thank you in advance,
    Jim

    Expand|Select|Wrap|Line Numbers
    1. #include <stdio.h>
    2. #include <malloc.h>
  •  
  • malloc.h is not a standard header, the standard header is stdlib.h
  • struct tnode {            // specify the "shape" of a tnode structure
  •     struct tnode *left;    // the left and right branch pointers
  •     struct tnode *right;
  •     int count;        // the word count as before
  •     char *word;        // a pointer to the word
  • } *root;            // declare the root pointer variable
  •  Root HAS to start out as a NULL pointer. I don't know if the
  •  preceeding initializes the pointer or not. I think you either
  •  need to initialize it here by
  •  } *root = NULL;
  •  
  • It is declared at file scope and therefor will therefore be initialised
  • to a null pointer if no explicit initialisation is provided.
  •  assuming that's correct usage or else set it to null at the
  •  start of main()
  •  root = NULL
  •  
  • Not needed, your initialisation syntax was correct.
  • struct tnode **tree_search(struct tnode **, char *);
  •  
  • All this use of pointers to pointers is confusing and pointless.
  •  
  • struct tnode *tree_search(struct tnode *, char *);
  •  
  • void tree_stats(struct tnode *);
  • int get_word(char *);
  • int total_nodes, total_words, high;
  • struct tnode *most_frequent;
  • int main(int argc, char *argv[]) {
  •     struct tnode **tpp;
  •  
  • struct tnode *tpp;
  •     char word_buff[100];    // the reusable word buffer
  •  
  • Don't use // syle comments on news groups, they won't survive line wrapping.
  •     int i;
  •     while(get_word(word_buff)) {
  •         tpp = tree_search(&root, word_buff);
  •  
  • Don't use tabs on code posted to news groups. You have no idea how they
  • will be rendered and sometimes they even get stripped. I've replaces
  • them with spaces.
  •  
  • You are not trying to modify the value of root, so why pass it's address?
  • tpp = tree_search(root, word_buff);
  •         /****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
  •         if(*tpp!=NULL){
  •             (*tpp)->count++;
  •         }
  •         else{
  •             (*tpp)=malloc(sizeof(struct tnode));
  •  
  • Why the parenthesis? Also, why yous sizeof(type) when you can use sizeof
  • object and reduce the chance of error?
  • tpp=malloc(sizeof *tpp);
  • With my changes you also need to remove a number of other dereferences.
  •             (*tpp)->left=NULL;
  •             (*tpp)->right=NULL;
  •             (*tpp)->count=1;
  •             (*tpp)->word=strdup(word_buff);
  •  
  • strdup is not a standard function.
  •  
  • Also you need to add this node you've just created to your tree. It does
  • not happen by magic.
  •         }
  •     }
  •     tree_stats(root);
  •     printf("total_nodes %d\n", total_nodes);
  •     printf("total_words %d\n", total_words);
  •     if(most_frequent)
  •     printf("most frequent word <%s> count is %d\n",
  •            most_frequent->word, most_frequent->count);
  •     for(i = 1; i < argc; i++) {
  •         tpp = tree_search(&root, argv[i]);
  •         if((*tpp) == NULL)
  •             printf("%s is NOT in the tree\n", argv[i]);
  •         else
  •             printf("<%s> count is %d\n", argv[i], (*tpp)->count);
  •     }
  •     return(0);
  •  
  • You don't need parenthesis here, return is not a function.
  • return 0;
  • }
  • // binary tree search returning a pointer to the pointer leading to a
  • hit
  •  
  • See what I meant about // comments not surviving line wrapping?
  • struct tnode **tree_search(struct tnode **tpp, char *w) {
  •     /***** CODE TO DO THE BINARY SRCH *****/
  •     int cmp;
  •     while(*tpp){
  •  The other thing I don't know is whether NULL pointers evaluate
  •  to FALSE. If not, then you need to explicitly test for NULL in
  •  the while statement.
  •  
  • You don't need to compare against NULL, although some people consider it
  • to be better style.
  •         cmp=strcmp(w,(*tpp)->word);
  •    //*****SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0  ********
  •  
  • Then you have a problem building your tree.
  •         if (cmp>0) {
  •             tpp=(*tpp)->right;
  •         }
  •         else if (cmp<0){
  •             tpp=(*tpp)->left;
  •                 }
  •         else if (cmp==0){
  •             break;
  •         }
  •     }
  •     return(tpp);
  •  
  • Again, no need for the parenthesis.
  • return(tpp);
  • }
  • // gather stats to check tree correctness
  • void tree_stats(struct tnode *tp) {
  •     /***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/
  • if(tp=NULL)
  •     return;
  •     total_words+=tp->count;
  •     total_nodes++;
  •     if(tp->count > high){
  •         high=tp->count;
  •         most_frequent=tp;
  •     }
  •  
  • Functions modifying globals are horrible things. There are many far
  • better solutions.
  • }
  • #include <ctype.h>
  • /* Leave this routine EXACTLY as it stands */
  • int get_word(char *s) {
  •     int c;
  •     do {
  •         c = getchar();
  •         if(c == EOF)
  •             return(0);
  •     } while(!isalpha(c) && !isdigit(c));
  •     do {
  •         if(isupper(c))
  •             c = tolower(c);
  •         *s++ = c;
  •  
  • No protection against overflowing your buffer. Very bad.
  •         c = getchar();
  •     } while(isalpha(c) || isdigit(c));
  •     *s = 0;
  •     return(1);
  • }

  • --
    Flash Gordon
    Living in interesting times.
    Although my email address says spam, it is real and I read it.
    Nov 15 '05 #14

    P: n/a
    On 12 Oct 2005 19:46:00 -0700, "Foodbank" <v8********@yahoo.com>
    wrote:
    Hey guys, I'm almost there, but getting a seg. fault. I have no idea
    why. I've commented where it's happening at the location that says:
    This should not be your real code. It doesn't compile. Are you
    ignoring the warnings? If you don't see the warnings, up the warning
    level of your compiler.

    //*** SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0 ********

    As a refresher, this is the binary tree that counts words from a text
    file on the command line.

    Thank you in advance,
    Jim

    Expand|Select|Wrap|Line Numbers
    1. #include <stdio.h>
    2. #include <malloc.h>
  •  
  • Not a standard header.  malloc() is in stdlib.h.  You need string.h
  • and ctype.h.
  • struct tnode {            // specify the "shape" of a tnode structure
  •     struct tnode *left;    // the left and right branch pointers
  •     struct tnode *right;
  •     int count;        // the word count as before
  •     char *word;        // a pointer to the word
  • } *root;            // declare the root pointer variable
  •  
  • root is initialized by default to NULL.
  • struct tnode **tree_search(struct tnode **, char *);
  • void tree_stats(struct tnode *);
  • int get_word(char *);
  • int total_nodes, total_words, high;
  • struct tnode *most_frequent;
  • int main(int argc, char *argv[]) {
  •     struct tnode **tpp;
  •     char word_buff[100];    // the reusable word buffer
  •     int i;
  •     while(get_word(word_buff)) {
  •         tpp = tree_search(&root, word_buff);
  •         /****CODE TO ADD NEW NODES AND COUNT REPEATS *****/
  •         if(*tpp!=NULL){
  •             (*tpp)->count++;
  •         }
  •         else{
  •             (*tpp)=malloc(sizeof(struct tnode));
  •             (*tpp)->left=NULL;
  •             (*tpp)->right=NULL;
  •             (*tpp)->count=1;
  •             (*tpp)->word=strdup(word_buff);
  •  
  • Not a standard function.
  •         }
  •     }
  •     tree_stats(root);
  •     printf("total_nodes %d\n", total_nodes);
  •     printf("total_words %d\n", total_words);
  •     if(most_frequent)
  •         printf("most frequent word <%s> count is %d\n",
  •             most_frequent->word, most_frequent->count);
  •     for(i = 1; i < argc; i++) {
  •         tpp = tree_search(&root, argv[i]);
  •         if((*tpp) == NULL)
  •             printf("%s is NOT in the tree\n", argv[i]);
  •         else
  •             printf("<%s> count is %d\n", argv[i], (*tpp)->count);
  •     }
  •     return(0);
  • }
  • // binary tree search returning a pointer to the pointer leading to a
  • hit
  • struct tnode **tree_search(struct tnode **tpp, char *w) {
  •     /***** CODE TO DO THE BINARY SRCH *****/
  •     int cmp;
  •     while(*tpp){
  •         cmp=strcmp(w,(*tpp)->word);
  •    //*****SEG FAULT HAPPENING BELOW WHEN CMP IS <, >, OR == 0  ********
  •  
  • Do you expect your compiler to parse this as / /*... (divide followed
  • by start of comment) or // *... (one line comment)?  Don't use //
  • comments for posting.  If the comment overflows the line in your
  • newsreader, others cannot assemble your code.
  •             if (cmp>0) {
  •                 tpp=(*tpp)->right;
  •  
  • tpp is a pointer to pointer to struct.  (*tpp)->right is a pointer to
  • struct.  The two are not compatible for implicit conversion.
  •                 }
  •             else if (cmp<0){
  •                 tpp=(*tpp)->left;
  •                 }
  •             else if (cmp==0){
  •                 break;
  •             }
  • }
  • return(tpp);
  • }
  • // gather stats to check tree correctness
  • void tree_stats(struct tnode *tp) {
  •     /***** RECURSIVE CODE TO COLLECT ALL OF THE STATISTICS *****/
  • if(tp=NULL)
  •  
  • You probably meant == here.
  •     return;
  •     total_words+=tp->count;
  •     total_nodes++;
  •     if(tp->count > high){
  •         high=tp->count;
  •         most_frequent=tp;
  •     }
  • }
  • #include <ctype.h>
  • /* Leave this routine EXACTLY as it stands */
  • int get_word(char *s) {
  •     int c;
  •     do {
  •         c = getchar();
  •         if(c == EOF)
  •             return(0);
  •     } while(!isalpha(c) && !isdigit(c));
  •     do {
  •         if(isupper(c))
  •             c = tolower(c);
  •         *s++ = c;
  •         c = getchar();
  •     } while(isalpha(c) || isdigit(c));
  •     *s = 0;
  •     return(1);
  • }

  • <<Remove the del for email>>
    Nov 15 '05 #15

    P: n/a
    Hi everyone,

    Just wanted to say thanks for the help on this one. Finally got it
    working.

    Take care,
    James

    Nov 15 '05 #16

    This discussion thread is closed

    Replies have been disabled for this discussion.