Memory Allocation Problem, please help

Hi,

I've written the code that follows. The function add_word() seems to
work fine *before* increase_arrays() is called, which uses realloc() to
allocate more memory for words. But *after* calling increase_arrays(), I
get a segmentation fault. I stepped through it in gdb and found that
after calling increase_arrays(), words[0]'s original value is modified,
and if I try to access it, I get <address 0x11 out of bound>. Is there
something wrong with the way I used realloc()? Please help me out; my
limited knowledge of C only carries me this far, and I don't think I can
figure this out by myself. (I've checked the C FAQ but still can't figure
it out. Now I see why people say that memory management is a pain.)
Thanks in advance.

#define INC_SIZE 3 /* number of elements the arrays should increase.*/

static int array_size; /* size of arrays words, word_counts */
static int total_words; /* total number of words found */
static char **words; /* an array to of words occured */
static int *word_counts; /* number of the corresponding word in arrays
                            words appeared */

/* Increases the storage for the arrays words, word_counts
 * and adjust related variables
 */
static void increase_arrays(){
    int i;
    int new_size = array_size + INC_SIZE;
    char **tmp = realloc(*words, sizeof(char*)*(new_size+1));
    int *tmp1 = realloc(word_counts, sizeof(int)*new_size);
    if(!(*tmp) || !tmp1){
        fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
        exit(1);
    }
    memset(tmp1 + array_size, 0, sizeof(int)*INC_SIZE);
    *words = *tmp;
    for(i = array_size; i < new_size; i++)
        words[i] = NULL;
    word_counts = tmp1;
    array_size = new_size;
}

/* return the index of word in the array words on success,
 * -1 if word does not exist in the array.
 */
static int word_index(const char *word){
    int index;
    for(index = 0; index < total_words; index++){
        if(!strcmp(words[index], word)) /* matches*/
            return index;
    }
    /* word does not exist */
    return -1;
}

/* add word to the array words if it does not exist,
 * else words is not modified.
 * adjust related the word_counts, and other related
 * variables.
 */
static void add_word(const char *word){
    int index;
    int length;
    index = word_index(word);
    if(index == -1){
        /* word does not exist*/
        if(total_words >= array_size)
            increase_arrays();
        length = strlen(word)+1;
        words[total_words]=calloc(length, sizeof(char*));
        if(words[total_words]){
            strcpy(words[total_words], word);
            word_counts[total_words] = 1;
            total_words++;
        }
        else{
            fprintf(stderr, "Fatal Error:failed to allocate memory.\n");
            exit(1);
        }
    }
    else{
        /* word exists */
        word_counts[index]++;
    }
}

May 12 '07 #1
we********@gmail.com said:
> Hi,
>
> I've written the code that follows, and I use the function add_word(),
> it seems to work fine
> *before* increase_arrays() is called that uses realloc() to allocate
> more memory to words. But *after* calling increase_arrays(), I
> received segmentation fault.

I looked at your code for about half a minute, and couldn't understand
it. Clearly you don't understand it either (otherwise it would be
working). I conclude that you're trying to do too much in one go.

Modularise.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.
May 12 '07 #2
On 12 May 2007 02:19:03 -0700, "we********@gmail.com"
<we********@gmail.com> wrote:
>Hi,

> static void increase_arrays(){
>     int i;
>     int new_size = array_size + INC_SIZE;
>     char **tmp = realloc(*words, sizeof(char*)*(new_size+1));

You want to realloc words, not *words. *words is the same as
words[0]. It is the pointer to the first word. You want to
reallocate the array of pointers.

>     int *tmp1 = realloc(word_counts, sizeof(int)*new_size);
>     if(!(*tmp) || !tmp1){

*tmp is the value of the first pointer that words used to point to,
not the value returned by realloc. If realloc actually failed, tmp
would be NULL and *tmp would invoke undefined behavior. You want
    if (!tmp || !tmp1)

>         fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
>         exit(1);

Use EXIT_FAILURE, not 1.

>     }
>     memset(tmp1 + array_size, 0, sizeof(int)*INC_SIZE);
>     *words = *tmp;

    words = tmp;

>     for(i = array_size; i < new_size; i++)
>         words[i] = NULL;
>     word_counts = tmp1;
>     array_size = new_size;
> }
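
Putting those corrections together, a minimal sketch of the resize step
might look like the following. grow_words() is a hypothetical helper, not
part of the posted program; it takes the old and new element counts as
parameters instead of reading the file-scope variables.

```c
#include <stdlib.h>

/* Hypothetical helper: grow an array of `char *` from old_count to
 * new_count slots. Returns the (possibly moved) array, or NULL on
 * failure, in which case the original array is left untouched. */
static char **grow_words(char **words, size_t old_count, size_t new_count)
{
    size_t i;
    /* Pass the array itself to realloc, not its first element. */
    char **tmp = realloc(words, new_count * sizeof *tmp);

    if (tmp == NULL)            /* test the returned pointer, not *tmp */
        return NULL;

    for (i = old_count; i < new_count; i++)
        tmp[i] = NULL;          /* new slots start out empty */

    return tmp;
}
```

A caller would keep the result in a temporary and assign it to words only
after checking that it is not NULL.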


Remove del for email
May 12 '07 #3
<we********@gmail.com> wrote in message
news:11**********************@y80g2000hsf.googlegroups.com...
I won't bother to correct such a mess; I'm just pointing out the
errors I happen to find.

> static void increase_arrays(){
>     int i;
>     int new_size = array_size + INC_SIZE;
>     char **tmp = realloc(*words, sizeof(char*)*(new_size+1));

What are you doing? tmp points to a char*, while *words points to
a char. I think you meant realloc(words, ...).

>     int *tmp1 = realloc(word_counts, sizeof(int)*new_size);
>     if(!(*tmp) || !tmp1){

*tmp is the first pointer stored in the block, not the result of
realloc. You want to check tmp itself.

>         fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
>         exit(1);

Use exit(EXIT_FAILURE), it is portable.

>     }
>     memset(tmp1 + array_size, 0, sizeof(int)*INC_SIZE);
>     *words = *tmp;
>     for(i = array_size; i < new_size; i++)
>         words[i] = NULL;
>     word_counts = tmp1;
>     array_size = new_size;
> }


May 12 '07 #4
"we********@gma il.com" <we********@gma il.comwrites:
Full marks for:
(1) Checking the FAQ
(2) Having a reasonable stab at finding the problem (you actually did
find it, you just did not see that you did!).
(3) Checking your allocation return values.
(4) Not casting the return from calloc/realloc.

I've added some comments unrelated to the original problem.

> #define INC_SIZE 3 /* number of elements the arrays should
> increase.*/
>
> static int array_size; /* size of arrays words, word_counts */
> static int total_words; /* total number of words found */
> static char **words; /* an array to of words occured */
> static int *word_counts; /* number of the corresponding word in arrays
> words appeared */
>
> /* Increases the storage for the arrays words, word_counts
>  * and adjust related variables
>  */

When you have two arrays that grow in lock-step like this it is often
easier to have one array whose elements are a struct:

    struct word_count {
        char *word;
        int count;
    };

    static struct word_count *word_counts;

> static void increase_arrays(){
>     int i;
>     int new_size = array_size + INC_SIZE;

It is often better to use a multiplicative growth strategy (you need
to remember to deal with the initial size being zero, though).
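
As a rough illustration of doubling growth with the zero-size case
handled, here is a sketch. The struct word_table wrapper, its field
names, and the starting capacity of 4 are assumptions made for the
example; only struct word_count comes from the suggestion above. Overflow
of the size computation is not checked here.

```c
#include <stdlib.h>

struct word_count {
    char *word;
    int count;
};

/* Hypothetical container; the field names are assumptions. */
struct word_table {
    struct word_count *items;
    size_t used;      /* slots in use */
    size_t capacity;  /* slots allocated */
};

/* Make room for at least one more entry.
 * Returns 1 on success, 0 on failure (the table is left unchanged). */
static int table_reserve_one(struct word_table *t)
{
    size_t new_cap;
    struct word_count *tmp;

    if (t->used < t->capacity)
        return 1;               /* still room, nothing to do */

    /* Doubling 0 would stay 0, so start from a small fixed size. */
    new_cap = t->capacity ? t->capacity * 2 : 4;
    tmp = realloc(t->items, new_cap * sizeof *tmp);
    if (tmp == NULL)
        return 0;

    t->items = tmp;
    t->capacity = new_cap;
    return 1;
}
```

With this scheme, adding n entries triggers only about log2(n) calls to
realloc instead of n/3.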

>     char **tmp = realloc(*words, sizeof(char*)*(new_size+1));

I am puzzled by the +1 here. Are you being cautious? It is much
better to be *sure* what size you need and write that.

>     int *tmp1 = realloc(word_counts, sizeof(int)*new_size);
>     if(!(*tmp) || !tmp1){
>         fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
>         exit(1);
>     }
>     memset(tmp1 + array_size, 0, sizeof(int)*INC_SIZE);
>     *words = *tmp;

Here is your primary problem. You wanted to say "words = tmp;".
*words is the same as words[0], which is why you saw it change
unexpectedly.

>     for(i = array_size; i < new_size; i++)
>         words[i] = NULL;
>     word_counts = tmp1;
>     array_size = new_size;
> }

> static void add_word(const char *word){
>     int index;
>     int length;
>     index = word_index(word);
>     if(index == -1){
>         /* word does not exist*/
>         if(total_words >= array_size)
>             increase_arrays();
>         length = strlen(word)+1;
>         words[total_words]=calloc(length, sizeof(char*));

You want to allocate length * sizeof(char) (not sizeof(char *)). But
since char * is at least as big as char you got away with it.

I'd write:

    words[total_words] = malloc(length);

because you don't need to zero the data (you replace it at once) and
sizeof(char) is 1 *by definition*.

A tiny point: I bet 9 out of 10 C programmers would write length =
strlen(word) and then malloc(length + 1).
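
For illustration, the strlen-then-malloc(length + 1) pattern can be
wrapped in a small helper. copy_word() is a made-up name; POSIX systems
offer strdup() for the same job, but it is not in C89 or C99.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated copy of s,
 * or NULL if the allocation fails. The caller frees the result. */
static char *copy_word(const char *s)
{
    size_t length = strlen(s);
    char *copy = malloc(length + 1);    /* +1 for the terminating '\0' */

    if (copy != NULL)
        memcpy(copy, s, length + 1);    /* copies the terminator too */
    return copy;
}
```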

>         if(words[total_words]){
>             strcpy(words[total_words], word);
>             word_counts[total_words] = 1;
>             total_words++;
>         }
>         else{
>             fprintf(stderr, "Fatal Error:failed to allocate memory.\n");
>             exit(1);
>         }
>     }
>     else{
>         /* word exists */
>         word_counts[index]++;
>     }
> }

A small quibble: you *should* have posted a whole program (with
headers and main).

--
Ben.
May 12 '07 #5
On May 12, 6:33 pm, Ben Bacarisse <ben.use...@bsb.me.uk> wrote:

Thanks for the advice, which has been most helpful. I've finally got my
modified code running (see the code that follows). I am still wondering
how to use realloc() to reallocate memory for an array of pointers to
character strings. I tried this in the previous code:

    char **character_array = malloc(sizeof(char*)*ARRAY_SIZE);

    /* This is used in increase_array() */
    void increase_array(){
        char **new_character_array = realloc(character_array,
                                             sizeof(char*)*(ARRAY_SIZE*2));
        character_array = new_character_array;
    }

I changed the original *character_array = *new_character_array to
character_array = new_character_array, but I still got a segmentation
fault. So how should the above code be written? Thanks again.

/* Entropy, a program to calculate the entropy of a file with the
   formula:
          ----
   H(S) = \    (P )*log ( 1/(P ))
          /      i     2      i
          ----
            i

   i.e. H(S) = sum of P[i]*log(base=2, 1/P[i]) for i in 0 to
   sizeof(S)

   build with -lm
*/
#define INC_SIZE 3 /* number of elements the arrays should increase.*/
#define MAX_CHAR_LINE 400 /* Maximum number of char in one line */

#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

/* this can be made to a structure.*/
typedef struct word_tag{
    char *word;
    int count;
}word;

/* This time, I use a struct, instead of separate arrays. */
struct{
    word *words;
    int used;
    int total;
}array;

static int total_word_count;

static double log_base_2(double i);
static double entropy(double probability, int total_occurance);
static void init_arrays();
static void increase_arrays();
static int word_index(const char *word);
static void add_word(const char *word);

/* calculate the logarithm of i to the base of 2.
 * log2() maybe used in C99 implementation, but this conforms to C89
 * standard.*/
static double log_base_2(double i){
    return log(i)/log(2);
}

/* calculate the entropy of the one word. */
static double entropy(double probability, int total_occurance){
    return total_occurance * probability * log_base_2(1/probability);
}

/* Initialise the arrays words, word_counts, and the variables
 * array_size and total_words.
 */
static void init_arrays(){
    int i;
    array.words = malloc(sizeof(array)*INC_SIZE);
    if(!array.words){
        fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
        exit(1);
    }
    for(i = 0; i < INC_SIZE; i++){
        array.words[i].word = NULL;
        array.words[i].count = 0;
    }
    array.total = INC_SIZE;
    array.used = 0;
    total_word_count = 0;
}

/* Increases the storage for the arrays words, word_counts
 * and adjust related variables
 */
static void increase_arrays(){
    int i;
    int new_size = array.total + INC_SIZE;
    /* an array of pointer to char* with size new_size*/
    word *tmp_words = realloc(array.words, sizeof(word)*new_size);
    if(!tmp_words){
        fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
        exit(1);
    }
    array.words = tmp_words;
    for(i = array.total; i < new_size; i++)
        array.words[i].word = NULL;
    array.total = new_size;
}

/* return the index of word in the array words on success,
 * -1 if word does not exist in the array.
 */
static int word_index(const char *word){
    int index;
    for(index = 0; index < array.used; index++){
        if(!strcmp(array.words[index].word, word)) /* matches*/
            return index;
    }
    /* word does not exist */
    return -1;
}

/* add word to the array words if it does not exist,
 * else words is not modified.
 * adjust related the word_counts, and other related
 * variables.
 */
static void add_word(const char *word){
    int index;
    int length;
    index = word_index(word);
    if(index == -1){
        /* word does not exist*/
        if(array.used >= array.total)
            increase_arrays();
        length = strlen(word);
        (array.words[array.used]).word=malloc(length+1);
        if(array.words[array.used].word){
            strcpy(array.words[array.used].word, word);
            array.words[array.used].count = 1;
            array.used +=1;
        }
        else{
            fprintf(stderr, "Fatal Error:failed to allocate memory.\n");
            exit(1);
        }
    }
    else{
        /* word exists */
        array.words[index].count++;
    }
    total_word_count++;
}

int main(int argc, char *argv[]){
    FILE *infile;
    int i = 1;
    int trace = 1;
    double total_entropy = 0.0f;
    if(argc < 2){
        argv[1] = "../entropy.c";
        argc = 2;
        /*
        printf("Usage:\n\t%s filename\n", argv[0]);
        exit(1);
        */
    }

    infile = fopen(argv[1], "r");
    if(!infile){
        fprintf(stderr, "Failed to open file: %s. %s\n", argv[0],
                strerror(errno));
        exit(1);
    }
    init_arrays();
    while(!feof(infile)){
        char buffer[MAX_CHAR_LINE];
        char *c;
        char *token;
        memset(buffer, 0, MAX_CHAR_LINE);
        fgets(buffer, MAX_CHAR_LINE, infile);
        /* replace \n \t with space to work with strtok()*/
        while((c = strchr(buffer, '\n')))
            *c = ' ';
        while((c = strchr(buffer, '\t')))
            *c = ' ';
        token = strtok(buffer, " ");
        while(token){
            add_word(token);
            token = strtok(NULL, " ");
        }
    }
    fclose(infile);
    for(i = 0; i < array.used; i++){
        double itsEntropy, probability;
        probability = (double)array.words[i].count / total_word_count;
        itsEntropy = entropy(probability, array.words[i].count);
        if(trace)
            printf("%50s %10f\n", array.words[i].word, itsEntropy);
        total_entropy += itsEntropy;
    }
    printf("total entropy: %f\n", total_entropy);
    return 0;
}

May 12 '07 #6
> static char **words; /* an array to of words occured */

In your declaration, words is a pointer to a pointer to a char. I
would have written

    static char* *words;

words is a pointer to an array of (char*)s. Say you have five char*
stored in words[0] through words[4] and need room for a sixth: then
you reallocate words. For example (missing the error handling)

    words = realloc (words, 6 * sizeof (char *));

If you want to grow the string that the third pointer points to from
10 to 20 characters, you would do (missing error handling)

    words [2] = realloc (words [2], 20);
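
With the error handling filled in, the second call could be sketched as
below. resize_buffer() is a hypothetical helper. The point is to keep
realloc's result in a temporary, because assigning it straight back to
words[2] would lose the only pointer to the old block if realloc
returned NULL.

```c
#include <stdlib.h>

/* Hypothetical helper: resize the buffer that *slot points to.
 * Returns 1 on success; on failure returns 0 and leaves *slot
 * pointing at the old, still valid block. */
static int resize_buffer(char **slot, size_t new_size)
{
    char *tmp = realloc(*slot, new_size);

    if (tmp == NULL)
        return 0;       /* the old block is still allocated */
    *slot = tmp;
    return 1;
}
```

It would be called as resize_buffer(&words[2], 20), and the same shape
works for the array of pointers itself with char ** in place of char *.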

May 12 '07 #7
"christian. bau" <ch***********@ cbau.wanadoo.co .ukwrites:
>static char **words; /* an array to of words occured */

> In your declaration, words is a pointer to a pointer to a char. I
> would have written
>
> static char* *words;
>
> words is a pointer to an array of (char*)s.
[...]

I wouldn't; I'd write "static char **words;".

We're not going to resolve the style issue here, but I don't see that
putting a space between the '*' characters implies that words points
to an array of char* rather than to just a single char*.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
May 12 '07 #8

<we********@gmail.com> wrote in message
news:11**********************@p77g2000hsh.googlegroups.com...
> /* calculate the logarithm of i to the base of 2.
> * log2() maybe used in C99 implementation, but this conforms to C89
> standard.*/
> static double log_base_2(double i){
>     return log(i)/log(2);
> }

Maybe #define log_base_2(i) (log((i))/log(2)) would do that and
won't have function call overhead.

> int main(int argc, char *argv[]){
>     FILE *infile;
>     int i = 1;
>     int trace = 1;
>     double total_entropy = 0.0f;

Why on earth do you use f to make the constant a float if you use it
to initialize a double?
May 12 '07 #9
On 12 May 2007 10:09:25 -0700, "we********@gmail.com"
<we********@gmail.com> wrote:

snip 150 obsolete lines

> Thanks for the advice, which has been the most helpful, and I've
> finally got my modified code running. (see code that follows) I am

If you are going to post a completely new program, there is no need to
quote 150 lines of obsolete and irrelevant code.

snip questions about old code

> /* Entropy, a program to calculate the entropy of a file with the
> formula:
> ----
> H(S) = \ (P )*log ( 1/(P ))
> / i 2 i
> ----
> i

Even in a monospaced font, this didn't come out right.

> i.e. H(S) = sum of P[i]*log(base=2, 1/P[i]) for i in 0 to
> sizeof(S)
>
> build with -lm
> */
> #define INC_SIZE 3 /* number of elements the arrays should
> increase.*/
> #define MAX_CHAR_LINE 400 /* Maximum number of char in one line */
>
> #include <stdio.h>
> #include <math.h>
> #include <stdlib.h>
> #include <string.h>
> #include <errno.h>
>
> /* this can be made to a structure.*/
> typedef struct word_tag{
>     char *word;
>     int count;
> }word;
>
> /* This time, I use a struct, instead of separate arrays. */
> struct{
>     word *words;
>     int used;
>     int total;

It doesn't hurt to place these variables inside this struct but it
doesn't buy you anything either. In this program, the only effect is
to increase the typing load.

> }array;

Using words which have a common and intuitive meaning in a completely
different way only leads to confusion. array is not an array. It is
a single instance of an anonymous (having no tag) struct.

>
> static int total_word_count;
>
> static double log_base_2(double i);
> static double entropy(double probability, int total_occurance);
> static void init_arrays();

If the function takes no arguments, then specify it as such:
    static void init_arrays(void);

> static void increase_arrays();
> static int word_index(const char *word);
> static void add_word(const char *word);
>
> /* calculate the logarithm of i to the base of 2.
> * log2() maybe used in C99 implementation, but this conforms to C89
> standard.*/
> static double log_base_2(double i){
>     return log(i)/log(2);
> }
>
> /* calculate the entropy of the one word. */
> static double entropy(double probability, int total_occurance){
>     return total_occurance * probability * log_base_2(1/probability);
> }
>
> /* Initialise the arrays words, word_counts, and the variables
> array_size
> * and total_words.
> */
> static void init_arrays(){
>     int i;
>     array.words = malloc(sizeof(array)*INC_SIZE);

This is the wrong size. You don't want sizeof(array); you may have
confused yourself into thinking array was the array. You want
sizeof(word). When you are allocating memory and assigning the
address to a pointer, the allocated memory should always have the same
type as what the pointer points to. array.words points to a word.
Therefore, however many objects you are allocating space for will each
have type word and will each occupy sizeof(word) bytes.

Make life easy on yourself. Use the following form for malloc:
    pointer_name = malloc(object_count * sizeof *pointer_name);
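
Applied to the word type from the posted program, that form might look
like the sketch below. alloc_words() is a hypothetical helper, not
something from the original code.

```c
#include <stdlib.h>

typedef struct word_tag {
    char *word;
    int count;
} word;

/* Hypothetical helper: allocate n word records and clear them.
 * Returns NULL if the allocation fails. */
static word *alloc_words(size_t n)
{
    size_t i;
    /* sizeof *w is sizeof(word) here, and it stays correct even if
     * the type of w is changed later. */
    word *w = malloc(n * sizeof *w);

    if (w == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        w[i].word = NULL;
        w[i].count = 0;
    }
    return w;
}
```

Because the size is derived from the pointer being assigned, there is no
second type name to get wrong.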
> if(!array.words ){
>         fprintf(stderr, "Fatal Error: failed to allocate memory.\n");
>         exit(1);

This is not portable. Use EXIT_FAILURE instead of 1 to maximize the
number of people here who can help diagnose your code.

>     }
>     for(i = 0; i < INC_SIZE; i++){
>         array.words[i].word = NULL;
>         array.words[i].count = 0;
>     }
>     array.total = INC_SIZE;
>     array.used = 0;
>     total_word_count = 0;
> }
>
> /* Increases the storage for the arrays words, word_counts
>  * and adjust related variables
>  */
> static void increase_arrays(){
>     int i;
>     int new_size = array.total + INC_SIZE;
>     /* an array of pointer to char* with size new_size*/

You don't have, and don't want, an array of pointer to char*. You
have a single pointer to word which points to the first element of a
dynamically allocated array of struct. Did using the variable name
"array" confuse you again?

>     word *tmp_words = realloc(array.words, sizeof(word)*new_size);

You have it correct here.

>     if(!tmp_words){
>         fprintf(stderr, "Fatal Error: failed to allocate memory.\n");

Use a different error message than the one you used for malloc. You
might also include the value of new_size so you get a clue about the
limits of your system.

>         exit(1);
>     }
>     array.words = tmp_words;

You have now successfully expanded your array of struct.

>     for(i = array.total; i < new_size; i++)

This will loop through each of the new elements of the array.

>         array.words[i].word = NULL;

Neither one matters, but in init_arrays you went to the trouble to
initialize both members of each struct in the array and here you only
initialize one.

>     array.total = new_size;
> }


> int main(int argc, char *argv[]){
>     FILE *infile;
>     int i = 1;
>     int trace = 1;
>     double total_entropy = 0.0f;
>     if(argc < 2){
>         argv[1] = "../entropy.c";

If argc is less than 2, argv[1] may not exist. In any event, you
never use the new value of argc (below), so why bother setting it?

>         argc = 2;
>         /*
>         printf("Usage:\n\t%s filename\n", argv[0]);
>         exit(1);
>         */
>     }
>
>     infile = fopen(argv[1], "r");
>     if(!infile){
>         fprintf(stderr, "Failed to open file: %s. %s\n", argv[0],

argv[0] is not the name of the file. argv[1] is.

>                 strerror(errno));

fopen() is not required to set errno.

>         exit(1);
>     }
>     init_arrays();
>     while(!feof(infile)){

This does not do what you want. It will cause you to attempt to
process the last line of the file twice. I'm pretty sure it is an
unintended side effect, but the memset below ensures that the second
attempt to process the last line recognizes end of string immediately.

>         char buffer[MAX_CHAR_LINE];
>         char *c;
>         char *token;
>         memset(buffer, 0, MAX_CHAR_LINE);

Why? You immediately fill buffer with data from the file and fgets is
guaranteed to properly terminate the string.

>         fgets(buffer, MAX_CHAR_LINE, infile);

You should always check the result of file I/O. This is where you
would find out that you are at end of file and there is no more data.
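
One common way to structure the loop is to let the return value of fgets
control it, which removes the need for feof entirely. A minimal sketch,
with count_chunks() as a made-up helper and a 400-byte buffer mirroring
MAX_CHAR_LINE:

```c
#include <stdio.h>

/* Hypothetical helper: count how many times fgets delivers data.
 * A line longer than the buffer is delivered in several pieces,
 * so this counts chunks rather than lines in that case. */
static long count_chunks(FILE *in)
{
    char buffer[400];
    long n = 0;

    /* fgets returns NULL at end of file or on error, so the loop
     * body never runs on a stale or empty buffer. */
    while (fgets(buffer, sizeof buffer, in) != NULL)
        n++;
    return n;
}
```

In the posted program the tokenizing code would go where n++ is, and the
memset would no longer be needed.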
> /* replace \n \t with space to work with strtok()*/
>         while((c = strchr(buffer, '\n')))
>             *c = ' ';
>         while((c = strchr(buffer, '\t')))
>             *c = ' ';
>         token = strtok(buffer, " ");
>         while(token){
>             add_word(token);
>             token = strtok(NULL, " ");

You could eliminate the two while loops above by using " \n\t" as your
delimiter string instead of " ".
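
A minimal sketch of that idea, with count_tokens() as a hypothetical
stand-in for the loop that calls add_word:

```c
#include <string.h>

/* Hypothetical helper: count the tokens in line, treating space,
 * tab and newline as separators. strtok writes '\0' bytes into its
 * argument, so line must be a modifiable array. */
static int count_tokens(char *line)
{
    int n = 0;
    char *token = strtok(line, " \t\n");

    while (token != NULL) {
        n++;                            /* add_word(token) would go here */
        token = strtok(NULL, " \t\n");
    }
    return n;
}
```

Runs of adjacent delimiters are skipped by strtok, so doubled spaces or a
trailing newline do not produce empty tokens.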
> }
}
fclose(infile);
for(i = 0; i < array.used; i++){
double itsEntropy, probability;
probability = (double)array.words[i].count / total_word_count;
itsEntropy = entropy(probability, array.words[i].count);
if(trace)
printf("%50s %10f\n", array.words[i].word, itsEntropy);
Since your variables are doubles, %f is already the right conversion for
printf (%lf means the same thing there). If you want more digits than the
default six, give a precision, e.g. %.12f.
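How many digits printf shows for a double is governed by the precision field of the conversion, not by the l modifier. A small sketch, where format_entropy() is a hypothetical helper introduced only to show the "%.*f" form:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes `value` into `out` with `decimals` digits
 * after the decimal point and returns the length snprintf() reports. */
static int format_entropy(char *out, size_t size, double value, int decimals)
{
    assert(out != NULL && decimals >= 0);
    return snprintf(out, size, "%.*f", decimals, value);
}
```

The same precision can be written directly in the trace line, for example "%50s %.12f\n".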
> total_entropy += itsEntropy;
}
printf("total entropy: %f\n", total_entropy);
return 0;
}

Remove del for email
May 12 '07 #10
