473,498 Members | 1,648 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Memory Allocation Problem, please help

Hi,

I've written the code that follows, and I use the function add_word(),
it seems to work fine
*before* increase_arrays() is called that uses realloc() to allocate
more memory to words. But *after* calling increase_arrays(), I
received segmentation fault. I tried to step it through gdb, and I
found out that after calling increase_arrays(), words[0]'s original
value is modified, and if I tried to access it, I get <address 0x11
out of bound>. It's there something wrong with the way I used
realloc()? Please help me out, my limited knowledge of C can carry as
far as here, and I don't think I can figure this out by myself. (I've
checked the C Faq, but still can't figure it out, now I see why people
says that memory management is a pain.). Thanks in advance.

#define INC_SIZE 3 /* number of elements the arrays should
increase.*/

static int array_size; /* size of arrays words, word_counts */
static int total_words; /* total number of words found */
static char **words; /* an array to of words occured */
static int *word_counts; /* number of the corresponding word in arrays
words appeared */

/* Increases the storage for the arrays words, word_counts
* and adjust related variables
*/
static void increase_arrays(){
int i;
int new_size = array_size + INC_SIZE;
char **tmp = realloc(*words, sizeof(char*)*(new_size+1));
int *tmp1 = realloc(word_counts, sizeof(int)*new_size);
if(!(*tmp) || ! tmp1){
fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
exit(1);
}
memset(tmp1 + array_size, 0, sizeof(int)*INC_SIZE);
*words = *tmp;
for(i = array_size; i < new_size; i++)
words[i] = NULL;
word_counts = tmp1;
array_size = new_size;
}

/* return the index of word in the array words on success,
* -1 if word does not exist in the array.
*/
static int word_index(const char *word){
int index;
for(index = 0; index < total_words; index++){
if(!strcmp(words[index], word)) /* matches*/
return index;
}
/* word does not exist */
return -1;
}

/* add word to the array words if it does not exist,
* else words is not modified.
* adjust related the word_counts, and other related
* variables.
*/
static void add_word(const char *word){
int index;
int length;
index = word_index(word);
if(index == -1){
/* word does not exist*/
if(total_words >= array_size)
increase_arrays();
length = strlen(word)+1;
words[total_words]=calloc(length, sizeof(char*));
if(words[total_words]){
strcpy(words[total_words], word);
word_counts[total_words] = 1;
total_words++;
}
else{
fprintf(stderr, "Fatal Error:failed to allocate memory.\n");
exit(1);
}
}
else{
/* word exists */
word_counts[index]++;
}
}

May 12 '07 #1
9 2478
we********@gmail.com said:
Hi,

I've written the code that follows, and I use the function add_word(),
it seems to work fine
*before* increase_arrays() is called that uses realloc() to allocate
more memory to words. But *after* calling increase_arrays(), I
received segmentation fault.
I looked at your code for about half a minute, and couldn't understand
it. Clearly you don't understand it either (otherwise it would be
working). I conclude that you're trying to do too much in one go.

Modularise.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
May 12 '07 #2
On 12 May 2007 02:19:03 -0700, "we********@gmail.com"
<we********@gmail.comwrote:
>Hi,

I've written the code that follows, and I use the function add_word(),
it seems to work fine
*before* increase_arrays() is called that uses realloc() to allocate
more memory to words. But *after* calling increase_arrays(), I
received segmentation fault. I tried to step it through gdb, and I
found out that after calling increase_arrays(), words[0]'s original
value is modified, and if I tried to access it, I get <address 0x11
out of bound>. It's there something wrong with the way I used
realloc()? Please help me out, my limited knowledge of C can carry as
far as here, and I don't think I can figure this out by myself. (I've
checked the C Faq, but still can't figure it out, now I see why people
says that memory management is a pain.). Thanks in advance.

#define INC_SIZE 3 /* number of elements the arrays should
increase.*/

static int array_size; /* size of arrays words, word_counts */
static int total_words; /* total number of words found */
static char **words; /* an array to of words occured */
static int *word_counts; /* number of the corresponding word in arrays
words appeared */

/* Increases the storage for the arrays words, word_counts
* and adjust related variables
*/
static void increase_arrays(){
int i;
int new_size = array_size + INC_SIZE;
char **tmp = realloc(*words, sizeof(char*)*(new_size+1));
You want to realloc words, not *words. * words is the same as
words[0]. It is the pointer to the first word. You want to
reallocate the array of pointers.
> int *tmp1 = realloc(word_counts, sizeof(int)*new_size);
if(!(*tmp) || ! tmp1){
*tmp is the value of the first pointer that words used to point to.
Not the value returned by realloc. If realloc actually failed, tmp
would be NULL and *tmp would invoke undefined behavior. You want
if (!tmp || !tmp1)
> fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
exit(1);
Use EXIT_FAILURE, not 1.
> }
memset(tmp1 + array_size, 0, sizeof(int)*INC_SIZE);
*words = *tmp;
words = tmp;
> for(i = array_size; i < new_size; i++)
words[i] = NULL;
word_counts = tmp1;
array_size = new_size;
}

/* return the index of word in the array words on success,
* -1 if word does not exist in the array.
*/
static int word_index(const char *word){
int index;
for(index = 0; index < total_words; index++){
if(!strcmp(words[index], word)) /* matches*/
return index;
}
/* word does not exist */
return -1;
}

/* add word to the array words if it does not exist,
* else words is not modified.
* adjust related the word_counts, and other related
* variables.
*/
static void add_word(const char *word){
int index;
int length;
index = word_index(word);
if(index == -1){
/* word does not exist*/
if(total_words >= array_size)
increase_arrays();
length = strlen(word)+1;
words[total_words]=calloc(length, sizeof(char*));
if(words[total_words]){
strcpy(words[total_words], word);
word_counts[total_words] = 1;
total_words++;
}
else{
fprintf(stderr, "Fatal Error:failed to allocate memory.\n");
exit(1);
}
}
else{
/* word exists */
word_counts[index]++;
}
}

Remove del for email
May 12 '07 #3
<we********@gmail.comha scritto nel messaggio
news:11**********************@y80g2000hsf.googlegr oups.com...
I've written the code that follows, and I use the function add_word(),
it seems to work fine
*before* increase_arrays() is called that uses realloc() to allocate
more memory to words. But *after* calling increase_arrays(), I
received segmentation fault. I tried to step it through gdb, and I
found out that after calling increase_arrays(), words[0]'s original
value is modified, and if I tried to access it, I get <address 0x11
out of bound>. It's there something wrong with the way I used
realloc()? Please help me out, my limited knowledge of C can carry as
far as here, and I don't think I can figure this out by myself. (I've
checked the C Faq, but still can't figure it out, now I see why people
says that memory management is a pain.). Thanks in advance.
I won't bother to correct such a mess, I'm just signaling random
errors I'm finding.
#define INC_SIZE 3 /* number of elements the arrays should
increase.*/

static int array_size; /* size of arrays words, word_counts */
static int total_words; /* total number of words found */
static char **words; /* an array to of words occured */
static int *word_counts; /* number of the corresponding word in arrays
words appeared */

/* Increases the storage for the arrays words, word_counts
* and adjust related variables
*/
static void increase_arrays(){
int i;
int new_size = array_size + INC_SIZE;
char **tmp = realloc(*words, sizeof(char*)*(new_size+1));
what are you doing? tmp points to a char*, while *words points to
a char. I think you meant realloc(words, ...)
int *tmp1 = realloc(word_counts, sizeof(int)*new_size);
if(!(*tmp) || ! tmp1){
*tmp is uninitialized. You want to check tmp itself.
fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
exit(1);
Use exit(EXIT_FAILURE), it is portable.
}
memset(tmp1 + array_size, 0, sizeof(int)*INC_SIZE);
*words = *tmp;
for(i = array_size; i < new_size; i++)
words[i] = NULL;
word_counts = tmp1;
array_size = new_size;
}

/* return the index of word in the array words on success,
* -1 if word does not exist in the array.
*/
static int word_index(const char *word){
int index;
for(index = 0; index < total_words; index++){
if(!strcmp(words[index], word)) /* matches*/
return index;
}
/* word does not exist */
return -1;
}

/* add word to the array words if it does not exist,
* else words is not modified.
* adjust related the word_counts, and other related
* variables.
*/
static void add_word(const char *word){
int index;
int length;
index = word_index(word);
if(index == -1){
/* word does not exist*/
if(total_words >= array_size)
increase_arrays();
length = strlen(word)+1;
words[total_words]=calloc(length, sizeof(char*));
if(words[total_words]){
strcpy(words[total_words], word);
word_counts[total_words] = 1;
total_words++;
}
else{
fprintf(stderr, "Fatal Error:failed to allocate memory.\n");
exit(1);
}
}
else{
/* word exists */
word_counts[index]++;
}
}

May 12 '07 #4
"we********@gmail.com" <we********@gmail.comwrites:
Hi,

I've written the code that follows, and I use the function add_word(),
it seems to work fine
*before* increase_arrays() is called that uses realloc() to allocate
more memory to words. But *after* calling increase_arrays(), I
received segmentation fault. I tried to step it through gdb, and I
found out that after calling increase_arrays(), words[0]'s original
value is modified, and if I tried to access it, I get <address 0x11
out of bound>. It's there something wrong with the way I used
realloc()? Please help me out, my limited knowledge of C can carry as
far as here, and I don't think I can figure this out by myself. (I've
checked the C Faq, but still can't figure it out, now I see why people
says that memory management is a pain.). Thanks in advance.
Full marks for:
(1) Checking the FAQ
(2) Having a reasonable stab at finding the problem (you actually did
find it, you just did not see that you did!).
(3) Checking your allocation return values.
(4) Not casting the return from calloc/realloc,

I've added some comments unrelated to the original problem.
#define INC_SIZE 3 /* number of elements the arrays should
increase.*/

static int array_size; /* size of arrays words, word_counts */
static int total_words; /* total number of words found */
static char **words; /* an array to of words occured */
static int *word_counts; /* number of the corresponding word in arrays
words appeared */

/* Increases the storage for the arrays words, word_counts
* and adjust related variables
*/
When you have two arrays that grow in lock-step like this it is often
easier to have one array whose elements are a struct:

struct word_count {
char *word;
int count;
};

static struct word_count *word_counts;
static void increase_arrays(){
int i;
int new_size = array_size + INC_SIZE;
It is often better to use a multiplicative growth strategy (you need
to remember to deal with the initial size being zero, though).
char **tmp = realloc(*words, sizeof(char*)*(new_size+1));
I am puzzled by the +1 here. Are you being cautious? It is much
better to be *sure* what size you need a write that.
int *tmp1 = realloc(word_counts, sizeof(int)*new_size);
if(!(*tmp) || ! tmp1){
fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
exit(1);
}
memset(tmp1 + array_size, 0, sizeof(int)*INC_SIZE);
*words = *tmp;
Here is you primary problem. You wanted to say "words = tmp;".
*words is the same as word[0] which is why you saw it change
unexpectedly.
for(i = array_size; i < new_size; i++)
words[i] = NULL;
word_counts = tmp1;
array_size = new_size;
}

/* return the index of word in the array words on success,
* -1 if word does not exist in the array.
*/
static int word_index(const char *word){
int index;
for(index = 0; index < total_words; index++){
if(!strcmp(words[index], word)) /* matches*/
return index;
}
/* word does not exist */
return -1;
}

/* add word to the array words if it does not exist,
* else words is not modified.
* adjust related the word_counts, and other related
* variables.
*/
static void add_word(const char *word){
int index;
int length;
index = word_index(word);
if(index == -1){
/* word does not exist*/
if(total_words >= array_size)
increase_arrays();
length = strlen(word)+1;
words[total_words]=calloc(length, sizeof(char*));
You want to allocate length * sizeof char (not char *). By since char
* is at least as big as char you got away with it.

I'd write:

words[total_words] = malloc(length);

because you don't need to zero the data (you replace is at once) and
sizeof char is 1 *by definition*.

A tiny point: I bet 9 out of 10 C programmers would write length =
strlen(word) and then malloc(length + 1).
if(words[total_words]){
strcpy(words[total_words], word);
word_counts[total_words] = 1;
total_words++;
}
else{
fprintf(stderr, "Fatal Error:failed to allocate memory.\n");
exit(1);
}
}
else{
/* word exists */
word_counts[index]++;
}
}
A small quibble: you *should* have posted a whole program (with
headers and main).

--
Ben.
May 12 '07 #5
On May 12, 6:33 pm, Ben Bacarisse <ben.use...@bsb.me.ukwrote:
"weidong...@gmail.com" <weidong...@gmail.comwrites:
Hi,
I've written the code that follows, and I use the function add_word(),
it seems to work fine
*before* increase_arrays() is called that uses realloc() to allocate
more memory to words. But *after* calling increase_arrays(), I
received segmentation fault. I tried to step it through gdb, and I
found out that after calling increase_arrays(), words[0]'s original
value is modified, and if I tried to access it, I get <address 0x11
out of bound>. It's there something wrong with the way I used
realloc()? Please help me out, my limited knowledge of C can carry as
far as here, and I don't think I can figure this out by myself. (I've
checked the C Faq, but still can't figure it out, now I see why people
says that memory management is a pain.). Thanks in advance.

Full marks for:
(1) Checking the FAQ
(2) Having a reasonable stab at finding the problem (you actually did
find it, you just did not see that you did!).
(3) Checking your allocation return values.
(4) Not casting the return from calloc/realloc,

I've added some comments unrelated to the original problem.
#define INC_SIZE 3 /* number of elements the arrays should
increase.*/
static int array_size; /* size of arrays words, word_counts */
static int total_words; /* total number of words found */
static char **words; /* an array to of words occured */
static int *word_counts; /* number of the corresponding word in arrays
words appeared */
/* Increases the storage for the arrays words, word_counts
* and adjust related variables
*/

When you have two arrays that grow in lock-step like this it is often
easier to have one array whose elements are a struct:

struct word_count {
char *word;
int count;

};

static struct word_count *word_counts;
static void increase_arrays(){
int i;
int new_size = array_size + INC_SIZE;

It is often better to use a multiplicative growth strategy (you need
to remember to deal with the initial size being zero, though).
char **tmp = realloc(*words, sizeof(char*)*(new_size+1));

I am puzzled by the +1 here. Are you being cautious? It is much
better to be *sure* what size you need a write that.
int *tmp1 = realloc(word_counts, sizeof(int)*new_size);
if(!(*tmp) || ! tmp1){
fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
exit(1);
}
memset(tmp1 + array_size, 0, sizeof(int)*INC_SIZE);
*words = *tmp;

Here is you primary problem. You wanted to say "words = tmp;".
*words is the same as word[0] which is why you saw it change
unexpectedly.
for(i = array_size; i < new_size; i++)
words[i] = NULL;
word_counts = tmp1;
array_size = new_size;
}
/* return the index of word in the array words on success,
* -1 if word does not exist in the array.
*/
static int word_index(const char *word){
int index;
for(index = 0; index < total_words; index++){
if(!strcmp(words[index], word)) /* matches*/
return index;
}
/* word does not exist */
return -1;
}
/* add word to the array words if it does not exist,
* else words is not modified.
* adjust related the word_counts, and other related
* variables.
*/
static void add_word(const char *word){
int index;
int length;
index = word_index(word);
if(index == -1){
/* word does not exist*/
if(total_words >= array_size)
increase_arrays();
length = strlen(word)+1;
words[total_words]=calloc(length, sizeof(char*));

You want to allocate length * sizeof char (not char *). By since char
* is at least as big as char you got away with it.

I'd write:

words[total_words] = malloc(length);

because you don't need to zero the data (you replace is at once) and
sizeof char is 1 *by definition*.

A tiny point: I bet 9 out of 10 C programmers would write length =
strlen(word) and then malloc(length + 1).
if(words[total_words]){
strcpy(words[total_words], word);
word_counts[total_words] = 1;
total_words++;
}
else{
fprintf(stderr, "Fatal Error:failed to allocate memory.\n");
exit(1);
}
}
else{
/* word exists */
word_counts[index]++;
}
}

A small quibble: you *should* have posted a whole program (with
headers and main).

--
Ben.

Thanks for the advice, which has been the most helpful, and I've
finally got my modified code running. (see code that follows) I am
wondering how I can use realloc() to reallocate memory for an array of
pointer to character string? I tried to use this in the previous code:
char **character_array = malloc(sizeof(char*)*ARRAY_SIZE);

/* This is used in the increase_array()*/
void increase_array(){
char **new_character_array = realloc(character_array,
sizeof(char*)*(ARRAY_SIZE*2));
character_array = new_character_array;
}

I changed the original *character_array = *new_character_array to
character_array = new_character_array. But
I still got segmentation fault, so, how should the above code should
be written? Thanks again.

/* Entropy, a program to calculate the entropy of a file with the
formula:
----
H(S) = \ (P )*log ( 1/(P ))
/ i 2 i
----
i

i.e. H(S) = sum of P[i]*log(base=2, 1/P[i]) for i in 0 to
sizeof(S)

build with -lm
*/
#define INC_SIZE 3 /* number of elements the arrays should
increase.*/
#define MAX_CHAR_LINE 400 /* Maximum number of char in one line */

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

/* this can be made to a structure.*/
typedef struct word_tag{
char *word;
int count;
}word;

/* This time, I use a struct, instead of separate arrays. */
struct{
word *words;
int used;
int total;
}array;

static int total_word_count;

static double log_base_2(double i);
static double entropy(double probability, int total_occurance);
static void init_arrays();
static void increase_arrays();
static int word_index(const char *word);
static void add_word(const char *word);

/* calculate the logarithm of i to the base of 2.
* log2() maybe used in C99 implementation, but this conforms to C89
standard.*/
static double log_base_2(double i){
return log(i)/log(2);
}

/* calculate the entropy of the one word. */
static double entropy(double probability, int total_occurance){
return total_occurance * probability * log_base_2(1/probability);
}

/* Initialise the arrays words, word_counts, and the variables
array_size
* and total_words.
*/
static void init_arrays(){
int i;
array.words = malloc(sizeof(array)*INC_SIZE);
if(!array.words){
fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
exit(1);
}
for(i = 0; i < INC_SIZE; i++){
array.words[i].word = NULL;
array.words[i].count = 0;
}
array.total = INC_SIZE;
array.used = 0;
total_word_count = 0;
}

/* Increases the storage for the arrays words, word_counts
* and adjust related variables
*/
static void increase_arrays(){
int i;
int new_size = array.total + INC_SIZE;
/* an array of pointer to char* with size new_size*/
word *tmp_words = realloc(array.words, sizeof(word)*new_size);
if(!tmp_words){
fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
exit(1);
}
array.words = tmp_words;
for(i = array.total; i < new_size; i++)
array.words[i].word = NULL;
array.total = new_size;
}

/* return the index of word in the array words on success,
* -1 if word does not exist in the array.
*/
static int word_index(const char *word){
int index;
for(index = 0; index < array.used; index++){
if(!strcmp(array.words[index].word, word)) /* matches*/
return index;
}
/* word does not exist */
return -1;
}

/* add word to the array words if it does not exist,
* else words is not modified.
* adjust related the word_counts, and other related
* variables.
*/
static void add_word(const char *word){
int index;
int length;
index = word_index(word);
if(index == -1){
/* word does not exist*/
if(array.used >= array.total)
increase_arrays();
length = strlen(word);
(array.words[array.used]).word=malloc(length+1);
if(array.words[array.used].word){
strcpy(array.words[array.used].word, word);
array.words[array.used].count = 1;
array.used +=1;
}
else{
fprintf(stderr, "Fatal Error:failed to allocate memory.\n");
exit(1);
}
}
else{
/* word exists */
array.words[index].count++;
}
total_word_count++;
}

int main(int argc, char *argv[]){
FILE *infile;
int i = 1;
int trace = 1;
double total_entropy = 0.0f;
if(argc < 2){
argv[1] = "../entropy.c";
argc = 2;
/*
printf("Usage:\n\t%s filename\n", argv[0]);
exit(1);
*/
}

infile = fopen(argv[1], "r");
if(!infile){
fprintf(stderr, "Failed to open file: %s. %s\n", argv[0],
strerror(errno));
exit(1);
}
init_arrays();
while(!feof(infile)){
char buffer[MAX_CHAR_LINE];
char *c;
char *token;
memset(buffer, 0, MAX_CHAR_LINE);
fgets(buffer, MAX_CHAR_LINE, infile);
/* replace \n \t with space to work with strtok()*/
while((c = strchr(buffer, '\n')))
*c = ' ';
while((c = strchr(buffer, '\t')))
*c = ' ';
token = strtok(buffer, " ");
while(token){
add_word(token);
token = strtok(NULL, " ");
}
}
fclose(infile);
for(i = 0; i < array.used; i++){
double itsEntropy, probability;
probability = (double)array.words[i].count / total_word_count;
itsEntropy = entropy(probability, array.words[i].count);
if(trace)
printf("%50s %10f\n", array.words[i].word, itsEntropy);
total_entropy += itsEntropy;
}
printf("total entropy: %f\n", total_entropy);
return 0;
}

May 12 '07 #6
static char **words; /* an array to of words occured */

In your declaration, words is a pointer to a pointer to a char. I
would have written

static char* *words;

words is a pointer to an array of (char*)s. If you have say five char*
stored in words [0], words [1], words [2], words [3] and words [4],
then you reallocate words. For example (missing the error handling)

words = realloc (words, 6 * sizeof (char *));

If you want to change the third pointer from 10 to 20 characters, you
would do (missing error handling)

words [2] = realloc (words [2], 20);

May 12 '07 #7
"christian.bau" <ch***********@cbau.wanadoo.co.ukwrites:
>static char **words; /* an array to of words occured */

In your declaration, words is a pointer to a pointer to a char. I
would have written

static char* *words;

words is a pointer to an array of (char*)s.
[...]

I wouldn't; I'd write "static char **words;".

We're not going to resolve the style issue here, but I don't see that
putting a space between the '*' characters implies that words points
to an array of char* rather than to just a single char*.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
May 12 '07 #8

<we********@gmail.comha scritto nel messaggio
news:11**********************@p77g2000hsh.googlegr oups.com...
/* calculate the logarithm of i to the base of 2.
* log2() maybe used in C99 implementation, but this conforms to C89
standard.*/
static double log_base_2(double i){
return log(i)/log(2);
}
Maybe #define log_base_2(i) (log((i))/log(2)) would do that and
won't have function call overhead.
int main(int argc, char *argv[]){
FILE *infile;
int i = 1;
int trace = 1;
double total_entropy = 0.0f;
Why on earth do you use f to make the costant a float if you use it
to initialize a double?
May 12 '07 #9
On 12 May 2007 10:09:25 -0700, "we********@gmail.com"
<we********@gmail.comwrote:

snip 150 obsolete lines
>Thanks for the advice, which has been the most helpful, and I've
finally got my modified code running. (see code that follows) I am
If you are going to post a completely new program, there is no need to
quote 150 lines of obsolete and irrelevant code.

snip questions about old code
>/* Entropy, a program to calculate the entropy of a file with the
formula:
----
H(S) = \ (P )*log ( 1/(P ))
/ i 2 i
----
i
Even in a monospaced font, this didn't come out right.
i.e. H(S) = sum of P[i]*log(base=2, 1/P[i]) for i in 0 to
sizeof(S)

build with -lm
*/
#define INC_SIZE 3 /* number of elements the arrays should
increase.*/
#define MAX_CHAR_LINE 400 /* Maximum number of char in one line */

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

/* this can be made to a structure.*/
typedef struct word_tag{
char *word;
int count;
}word;

/* This time, I use a struct, instead of separate arrays. */
struct{
word *words;
int used;
int total;
It doesn't hurt to place these variables inside this struct but it
doesn't buy you anything either. In this program, the only effect is
to increase the typing load.
>}array;
Using words which have a common and intuitive meaning in a completely
different way only leads to confusion. array is not an array. It is
a single instance of an anonymous (having no tag) struct.
>
static int total_word_count;

static double log_base_2(double i);
static double entropy(double probability, int total_occurance);
static void init_arrays();
If the function takes no arguments, then specify it as such:
static void init_arrays(void);
>static void increase_arrays();
static int word_index(const char *word);
static void add_word(const char *word);

/* calculate the logarithm of i to the base of 2.
* log2() maybe used in C99 implementation, but this conforms to C89
standard.*/
static double log_base_2(double i){
return log(i)/log(2);
}

/* calculate the entropy of the one word. */
static double entropy(double probability, int total_occurance){
return total_occurance * probability * log_base_2(1/probability);
}

/* Initialise the arrays words, word_counts, and the variables
array_size
* and total_words.
*/
static void init_arrays(){
int i;
array.words = malloc(sizeof(array)*INC_SIZE);
This is the wrong size. You don't want sizeof(array); you may have
confused yourself into thinking array was the array. You want
sizeof(word). When you are allocating memory and assigning the
address to a pointer, the allocated memory should always have the same
type as what the pointer points to. array.words points to a word.
Therefore, however many objects you are allocating space for will each
have type word and will each occupy sizeof(word) bytes.

Make life easy on yourself. Use the following form for malloc
pointer_name = malloc(object_count * sizeof *pointer_name);
> if(!array.words){
fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
exit(1);
This is not portable. Use EXIT_FAILURE instead of 1 to maximize the
number of people here who can help diagnose your code.
> }
for(i = 0; i < INC_SIZE; i++){
array.words[i].word = NULL;
array.words[i].count = 0;
}
array.total = INC_SIZE;
array.used = 0;
total_word_count = 0;
}

/* Increases the storage for the arrays words, word_counts
* and adjust related variables
*/
static void increase_arrays(){
int i;
int new_size = array.total + INC_SIZE;
/* an array of pointer to char* with size new_size*/
You don't have, and don't want, and array of pointer to char*. You
have a single pointer to word which points to the first element of a
dynamically allocated array of struct. Did using the variable name
"array" confuse you again?
> word *tmp_words = realloc(array.words, sizeof(word)*new_size);
You have it correct here?
> if(!tmp_words){
fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
Use a different error message than the one you used for malloc. You
might also include the value of new_size so you get a clue about the
limits of your system.
> exit(1);
}
array.words = tmp_words;
You have now successfully expanded your array of struct.
> for(i = array.total; i < new_size; i++)
This will loop through each of the new elements of the array.
> array.words[i].word = NULL;
Neither one matters but in init_array you went to the trouble to
initialize both members of each struct in the array and here you only
initialize one.
> array.total = new_size;
}

/* return the index of word in the array words on success,
* -1 if word does not exist in the array.
*/
static int word_index(const char *word){
int index;
for(index = 0; index < array.used; index++){
if(!strcmp(array.words[index].word, word)) /* matches*/
return index;
}
/* word does not exist */
return -1;
}

/* add word to the array words if it does not exist,
* else words is not modified.
* adjust related the word_counts, and other related
* variables.
*/
static void add_word(const char *word){
int index;
int length;
index = word_index(word);
if(index == -1){
/* word does not exist*/
if(array.used >= array.total)
increase_arrays();
length = strlen(word);
(array.words[array.used]).word=malloc(length+1);
if(array.words[array.used].word){
strcpy(array.words[array.used].word, word);
array.words[array.used].count = 1;
array.used +=1;
}
else{
fprintf(stderr, "Fatal Error:failed to allocate memory.\n");
exit(1);
}
}
else{
/* word exists */
array.words[index].count++;
}
total_word_count++;
}

int main(int argc, char *argv[]){
FILE *infile;
int i = 1;
int trace = 1;
double total_entropy = 0.0f;
if(argc < 2){
argv[1] = "../entropy.c";
If argc is less than 2, argv[1] may not exist. In any event, you
never use the new values of argv[1] or argc (below) so why bother?
> argc = 2;
/*
printf("Usage:\n\t%s filename\n", argv[0]);
exit(1);
*/
}

infile = fopen(argv[1], "r");
if(!infile){
fprintf(stderr, "Failed to open file: %s. %s\n", argv[0],
argv[0] is not the name of the file. argv[1] is.
>strerror(errno));
fopen() is not required to set errno.
> exit(1);
}
init_arrays();
while(!feof(infile)){
This does not do what you want. It will cause you to attempt to
process the last line of the file twice. I'm pretty sure it is an
unintended side effect but the memset below insures that the second
attempt to process the last line recognizes end of string immediately.
> char buffer[MAX_CHAR_LINE];
char *c;
char *token;
memset(buffer, 0, MAX_CHAR_LINE);
Why? You immediately fill buffer with data from the file and fgets is
guaranteed to properly terminate the string.
> fgets(buffer, MAX_CHAR_LINE, infile);
You should always check the result of file I/O. This is where you
would find out that you are at end of file and there is no more data.
> /* replace \n \t with space to work with strtok()*/
while((c = strchr(buffer, '\n')))
*c = ' ';
while((c = strchr(buffer, '\t')))
*c = ' ';
token = strtok(buffer, " ");
while(token){
add_word(token);
token = strtok(NULL, " ");
You could eliminate the two while loops above by using " \n\t" as your
token string instead of " ".
> }
}
fclose(infile);
for(i = 0; i < array.used; i++){
double itsEntropy, probability;
probability = (double)array.words[i].count / total_word_count;
itsEntropy = entropy(probability, array.words[i].count);
if(trace)
printf("%50s %10f\n", array.words[i].word, itsEntropy);
Since your variables are doubles, you might as well use %lf and take
advantage of the extra precision.
> total_entropy += itsEntropy;
}
printf("total entropy: %f\n", total_entropy);
return 0;
}

Remove del for email
May 12 '07 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

4
6751
by: PaulR | last post by:
Hi, We have a Server running SLES 8 and 3GB memory, with 1 DB2 instance and 2 active Databases. General info... DB2level = "DB2 v8.1.0.72", "s040914", "MI00086", and FixPak "7" uname -a =...
11
3357
by: Roman Hartmann | last post by:
hello, I do have a question regarding structs. I have a struct (profil) which has a pointer to another struct (point). The struct profil stores the coordinates of points. The problem is that I...
11
2820
by: William Buch | last post by:
I have a strange problem. The code isn't written by me, but uses the qsort function in stdlib. ALWAYS, the fourth time through, the memory location of variable list (i.e. mem location = 41813698)...
15
1569
by: berthelot samuel | last post by:
Hi, I'm trying to develop an application for modeling 3D objects from Bezier patches, but I have a memory allocation problem. Here are my structures: typedef struct _vector3 { union { struct...
7
1840
by: Dan Nilsen | last post by:
Hi! I'm writing a small piece of software that basically runs on an embedded system with a Power-PC cpu. This runs on a stripped down version of Linux - Busybox. As I'm writing a piece of...
24
19032
by: Ken | last post by:
In C programming, I want to know in what situations we should use static memory allocation instead of dynamic memory allocation. My understanding is that static memory allocation like using array...
1
7944
by: Peterwkc | last post by:
Hello all expert, i have two program which make me desperate bu after i have noticed the forum, my future is become brightness back. By the way, my problem is like this i the first program was...
1
1708
by: krishna81m | last post by:
In the following code, I am trying to return a char, a char* (a type of non-const without using new, what do we call this type of pointer?) and char* created using new operator. What I do not know at...
5
505
by: cham | last post by:
Hi, I am working on c++ in a linux system ( Fedora core 4 ), kernel version - 2.6.11-1.1369_FC4 gcc version - 4.0.0 20050519 ( Red Hat 4.0.0-8 ) In my code i am creating a vector to store...
0
7126
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
7005
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
7168
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
7210
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
7381
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
5465
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...
1
4916
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
3087
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
0
293
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.