to free() or not to free() in lex/yacc

Hi,

I am working on a school project where we use lex/yacc to write a compiler
for a fictional (Java-like) language. I have handled all the details about
the yacc and lex files, but I still have a question regarding the dynamic
memory allocation for strings. When the lex file encounters a variable
name, I want it to pass this to yacc through yylval, and then
retrieve it to add to a syntax tree. For this purpose, I wrote the
following code in the lex file (third section), that will save the
variable in yylval:

void copy_name(char** dst, char* yy) {
    int len;

    //free(*dst);

    // allocate memory for the new string
    len = strlen(yy);
    *dst = (char*) malloc(len * sizeof(char) + 1);
    // copy the string
    strcpy(*dst, yy);
}

It's basically a wrapper around strcpy(), which also allocates some memory
through malloc().

In the lex file, I have something like this
{LETTER}({LETTER}|{DIGIT}|"_")* {copy_name(&(yylval.Name) , yytext);
return(NAME);}

Now my question is whether I should keep the call to free in copy_name or
not. I would think that I need to do that, otherwise new memory would be
allocated each time a new variable is found, and I'd be facing memory
leaks. Yet when I uncomment that line, I start to get segfaults, for
reasons that I can't understand.

Can anybody with more expertise with lex/yacc help me out with this
problem?

Thank you,
Berk Birand

--
Posted via a free Usenet account from http://www.teranews.com

Apr 16 '07 #1
Berk Birand wrote:
Hi,

I am working on a school project where we use lex/yacc to write a compiler
for a fictional (Java-like) language. I have handled all the details about
the yacc and lex files, but I still have a question regarding the dynamic
memory allocation for strings. When the lex file encounters a variable
name, I want it to pass this to yacc through yylval, and then
retrieve it to add to a syntax tree. For this purpose, I wrote the
following code in the lex file (third section), that will save the
variable in yylval:

void copy_name(char** dst, char* yy) {
int len;

//free(*dst);

// allocate memory for the new string
len = strlen(yy);
*dst = (char*) malloc(len * sizeof(char) + 1);
Why not *dst = malloc(len + 1); ?

sizeof(char) is 1 by definition.
>
It's basically a wrapper around strcpy(), which also allocates some memory
through malloc().

In the lex file, I have something like this
{LETTER}({LETTER}|{DIGIT}|"_")* {copy_name(&(yylval.Name) , yytext);
return(NAME);}

Now my question is whether I should keep the call to free in copy_name or
not. I would think that I need to do that, otherwise new memory would be
allocated each time a new variable is found, and I'd be facing memory
leaks. Yet when I uncomment that line, I start to get segfaults, for
reasons that I can't understand.
Where does yylval.Name come from? If it isn't allocated by malloc, all
bets are off. You either have to allocate all of the memory for your
strings with malloc, or devise another scheme for copying them.

--
Ian Collins.
Apr 16 '07 #2
On Mon, 16 Apr 2007 12:24:10 +1200, Ian Collins wrote:
Berk Birand wrote:
> len = strlen(yy);
*dst = (char*) malloc(len * sizeof(char) + 1);

Why not *dst = malloc(len + 1); ?

sizeof(char) is 1 by definition.
Yeah I guess I can do that. I figured it would be more explicit this way,
but if it's defined as 1 then it's just extra complication.

>It's basically a wrapper around strcpy(), which also allocates some
memory through malloc().

In the lex file, I have something like this
{LETTER}({LETTER}|{DIGIT}|"_")* {copy_name(&(yylval.Name) , yytext);
return(NAME);}

Now my question is whether I should keep the call to free in copy_name
or not. I would think that I need to do that, otherwise new memory
would be allocated each time a new variable is found, and I'd be facing
memory leaks. Yet when I uncomment that line, I start to get segfaults,
for reasons that I can't understand.
Where does yylval.Name come from? If it isn't allocated by malloc, all
bets are off. You either have to allocate all of the memory for your
strings with malloc, or devise another scheme for copying them.
That's the problem. yylval is an internal struct variable to yacc. In my
yacc file, I set its type as follows:

%union {
    int Value;
    char Operator[3];
    char* Name;
    NodeType *nPtr;
};

Now Name is a character pointer. Lex doesn't mess with the allocation of
it, but just declares yylval to be a struct. The point of
this copy_name function is to do that allocation. So I suppose we can
assume that the code that I wrote will be the one that will allocate it.
Would it then make sense to use free?

Thanks for your answer,
bb


Apr 16 '07 #3
Berk Birand wrote:
On Mon, 16 Apr 2007 12:24:10 +1200, Ian Collins wrote:
>>
Where does yylval.Name come from? If it isn't allocated by malloc, all
bets are off. You either have to allocate all of the memory for your
strings with malloc, or devise another scheme for copying them.


That's the problem. yylval is an internal struct variable to yacc. In my
yacc file, I set its type as follows:

%union {
    int Value;
    char Operator[3];
    char* Name;
    NodeType *nPtr;
};

Now Name is a character pointer. Lex doesn't mess with the allocation of
it, but just declares yylval to be a struct. The point of
this copy_name function is to do that allocation. So I suppose we can
assume that the code that I wrote will be the one that will allocate it.
Would it then make sense to use free?
Not unless something has been assigned to it. If adding the free causes
a crash, odds are the pointer is unassigned and contains a random value
that chokes free. So if the assignment is made once, by you and never
again, drop the free. If it gets reassigned (by you) then you have a
problem as there is no way of telling if the value is valid or not.
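
To make the point above concrete, here is a minimal sketch. It is not the original poster's code, and the helper name replace_name is hypothetical. It relies on two rules of standard C: free(NULL) does nothing, while passing free() any other pointer that did not come from malloc() is undefined behaviour. It also assumes that nobody else still holds the old pointer, which is often false in a yacc parser, because the value of yylval is copied onto the parser's value stack when a token is shifted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: replace the string held in *slot with a copy of text.
 * Precondition: *slot is either NULL or a pointer previously returned by
 * malloc() that nobody else is still using.
 * Returns 0 on success, -1 if allocation fails (*slot is left unchanged). */
int replace_name(char **slot, const char *text)
{
    char *copy;

    assert(slot != NULL && text != NULL);

    copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, text);

    free(*slot);   /* safe for NULL; undefined for an uninitialized pointer */
    *slot = copy;
    return 0;
}
```

A caller would start from char *name = NULL; and only ever hand that variable to replace_name. With yylval the precondition is harder to meet: Name shares its storage with Value, Operator and nPtr in the %union, so free(yylval.Name) is only meaningful when the previous token actually stored a name. After a number or an operator the same bytes hold something else. That may well be the source of the segfaults described in the thread, although the thread itself does not confirm it.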

--
Ian Collins.
Apr 16 '07 #4
Berk Birand wrote:
>
Hi,

I am working on a school project where we use lex/yacc to write a compiler
for a fictional (Java-like) language. I have handled all the details about
the yacc and lex files, but I still have a question regarding the dynamic
memory allocation for strings. When the lex file encounters a variable
name, I want it to pass this to yacc through yylval, and then
retrieve it to add to a syntax tree. For this purpose, I wrote the
following code in the lex file (third section), that will save the
variable in yylval:

void copy_name(char** dst, char* yy) {
int len;

//free(*dst);

// allocate memory for the new string
len = strlen(yy);
*dst = (char*) malloc(len * sizeof(char) + 1);
// copy the string
strcpy(*dst, yy);
}
sizeof (char) is 1 by definition, and can be omitted. Avoid //
comments in Usenet, they are foulable by line wraps. Check the
result of malloc. #include <stdlib.h>. A simpler version is:

char *copy_name(char *yy) {
    char *dst;

    if (dst = malloc(1 + strlen(yy))) strcpy(dst, yy);
    return dst;
}

which still needs the #include.
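
For completeness, the same idea with the headers spelled out and the malloc result checked by the caller might look like the sketch below. The wrapper copy_name_or_die is a hypothetical addition, based on the assumption that a small school compiler can simply stop when it runs out of memory.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return a malloc'd copy of yy, or NULL if malloc fails. */
char *copy_name(const char *yy)
{
    size_t size;
    char *dst;

    assert(yy != NULL);
    size = strlen(yy) + 1;          /* +1 for the terminating '\0' */
    dst = malloc(size);
    if (dst != NULL)
        memcpy(dst, yy, size);
    return dst;
}

/* Hypothetical wrapper: treat out-of-memory as a fatal error. */
char *copy_name_or_die(const char *yy)
{
    char *dst = copy_name(yy);

    if (dst == NULL) {
        fprintf(stderr, "out of memory copying name\n");
        exit(EXIT_FAILURE);
    }
    return dst;
}
```

In the lex action this would be used as yylval.Name = copy_name_or_die(yytext); and whoever ends up owning the pointer (for example the syntax tree node) becomes responsible for freeing it.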

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>
<http://www.aaxnet.com/editor/edit043.html>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discover of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews


Apr 16 '07 #5
"CBFalconer" <cb********@yahoo.com> wrote in message
news:46***************@yahoo.com...
A simpler version is:

char *copy_name(char *yy) {
char *dst;

if (dst = malloc(1 + strlen(yy))) strcpy(dst, yy);
return dst;
}
<OT>
An even simpler version, though less portable, is:

#define copy_name(x) strdup(x)

Of course, that assumes strdup() is available, but it's common enough you
can usually depend on it (and, as shown above, it's easy to implement if
not).
</OT>

S

--
Stephen Sprunk "Those people who think they know everything
CCIE #3723 are a great annoyance to those of us who do."
K5SSS --Isaac Asimov

Apr 16 '07 #6
In article <pa****************************@email.me>
Berk Birand <do**@email.me> wrote:
>I am working on a school project where we use lex/yacc to write a compiler
for a fictional (Java-like) language. I have handled all the details about
the yacc and lex files, but I still have a question regarding the dynamic
memory allocation for strings. ...
This is not the right group for lex and yacc discussion (comp.compilers
is generally the right place), but: dealing sensibly with symbols
and strings is probably the hardest part of working with what these
tools give you. There are a number of approaches, with different
tradeoffs. (I have used several myself, even going so far as to
use more than one in a single lex-and-yacc system.)

One thing to keep in mind is that yacc's LALR(1) parser will do at
most one token of "lookahead", so depending on many things, you
can sometimes get away with a known but fixed set of buffers. If
you do full-blown dynamic allocation it is hard to avoid leaks when
dealing with (e.g.) syntax errors.
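
One way to picture the "fixed set of buffers" idea is a small rotation of static arrays, sketched below. The names hold_name, NAME_SLOTS and SLOT_LEN are hypothetical, and the sizes are arbitrary assumptions. How many slots are actually enough depends on the grammar, and the parser actions must copy a name somewhere permanent before its slot comes round again.

```c
#include <assert.h>
#include <string.h>

#define NAME_SLOTS 4   /* hypothetical: how many names may be live at once */
#define SLOT_LEN 64    /* hypothetical: longest name kept, excluding '\0' */

static char name_slots[NAME_SLOTS][SLOT_LEN + 1];
static unsigned next_slot;

/* Copy text into the next buffer in rotation and return that buffer.
 * The result stays valid only until NAME_SLOTS further calls are made.
 * Names longer than SLOT_LEN are truncated. */
char *hold_name(const char *text)
{
    char *slot = name_slots[next_slot];
    size_t len;

    assert(text != NULL);
    len = strlen(text);
    if (len > SLOT_LEN)
        len = SLOT_LEN;
    memcpy(slot, text, len);
    slot[len] = '\0';

    next_slot = (next_slot + 1) % NAME_SLOTS;
    return slot;
}
```

Nothing is ever freed here, so a syntax error cannot leak memory; the price is the fixed limits on name length and on the number of names in flight.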

For more information, try comp.compilers. :-)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Apr 16 '07 #7
There are several solutions to this. Ordering from the easiest to the more
complex ones:

1) Use a garbage collector for C. Some compilers have it as standard in
their distributions (lcc-win32 for example). If not, google for
"Boehm's garbage collector". The advantage is that you only allocate
memory, never freeing it. Problems gone.
2) If that doesn't help, look again at your problem. Why do you want to
free the space allocated for the names? Maybe it is a much better
strategy to just keep allocating memory and free it all automatically
when your compiler exits. I have used this strategy in many parsers
or compilers that I use. The total amount of memory is small, and it
is used everywhere later in the program, so it is just a waste of
time to micro-manage each small piece of storage.
3) If that doesn't help, at the start of your program allocate a big
chunk of memory for names. When you are done with all names, free
all the memory for names in a single call to free. You specify a
buffer of say, 256K. You allocate names in this buffer. When you are
done with the parsing of names, you free the buffer, not caring about
the individual names.
If you run out of space, you can allocate those buffers in a linked
list.
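
Option 3 could be sketched roughly as follows. This is only an illustration under stated assumptions: the names pool_name, pool_free_all and struct chunk are hypothetical, the 256K chunk size is the figure mentioned above, and chunks are allocated on demand rather than at program start.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE (256 * 1024)   /* the 256K figure from the post */

/* One block of name storage; blocks form a singly linked list. */
struct chunk {
    struct chunk *next;
    size_t used;
    char data[CHUNK_SIZE];
};

static struct chunk *chunks;   /* most recently allocated chunk first */

/* Copy text into the pool and return the copy, or NULL on failure
 * (out of memory, or a name too long to fit in one chunk). */
char *pool_name(const char *text)
{
    size_t size;
    char *copy;

    assert(text != NULL);
    size = strlen(text) + 1;
    if (size > CHUNK_SIZE)
        return NULL;

    /* Start a new chunk when there is none yet or the current one is full. */
    if (chunks == NULL || CHUNK_SIZE - chunks->used < size) {
        struct chunk *c = malloc(sizeof *c);
        if (c == NULL)
            return NULL;
        c->next = chunks;
        c->used = 0;
        chunks = c;
    }

    copy = chunks->data + chunks->used;
    memcpy(copy, text, size);
    chunks->used += size;
    return copy;
}

/* Release every name at once; all pointers from pool_name become invalid. */
void pool_free_all(void)
{
    while (chunks != NULL) {
        struct chunk *next = chunks->next;
        free(chunks);
        chunks = next;
    }
}
```

The lexer would call pool_name(yytext) for each identifier, and the compiler would call pool_free_all() once, after the syntax tree is no longer needed.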

have fun

jacob
Apr 16 '07 #8
On Mon, 16 Apr 2007 01:10:25 -0500, Stephen Sprunk wrote:
"CBFalconer" <cb********@yahoo.com> wrote in message
news:46***************@yahoo.com...
>A simpler version is:

char *copy_name(char *yy) {
char *dst;

if (dst = malloc(1 + strlen(yy))) strcpy(dst, yy);
return dst;
}

<OT>
An even simpler version, though less portable, is:

#define copy_name(x) strdup(x)
I never actually thought of using strdup for this problem. Thanks for
pointing it out. It actually does exactly the same thing as my
copy_name(), except for the free() call. If I end up not using the free(),
I can directly call strdup.

Thanks for the tip.


Apr 16 '07 #9
On Mon, 16 Apr 2007 06:38:40 +0000, Chris Torek wrote:

>
For more information, try comp.compilers. :-)
I looked around for some groups that might be oriented towards lex and
yacc, but couldn't find any. Thanks for pointing me to the compilers one,
never thought of that. I will also look into the fixed buffer solution, as
it seems to be an easy way to handle this problem.

Thanks,
Berk


Apr 16 '07 #10
On Mon, 16 Apr 2007 11:08:56 +0200, jacob navia wrote:

There are several solutions to this. Ordering from the easiest to the more
complex ones:

1) Use a garbage collector for C.
I actually didn't know there were garbage collection libraries for C.
Thanks a lot for pointing this out. I will definitely keep this in mind
for future projects, as it seems like it can be a life saver. For this
project though, I think such a solution might be a little too big.

2) If that doesn't help, look again at your problem. Why do you want to
free the space allocated for the names? Maybe it is a much better
strategy to just keep allocating memory and free it all automatically
when your compiler exits. I have used this strategy in many parsers
or compilers that I use. The total amount of memory is small, and it
is used everywhere later in the program, so it is just a waste of
time to micro-manage each small piece of storage.
I suppose this is what will happen if I don't put in that call for free.
It will just allocate some more memory each time a new variable is found,
and keep a copy of the previous one in the memory. This is definitely a
very viable solution (after all, it is what I have now). Especially since
this is an academic project, the programs that we'll be compiling won't be
longer than 30-40 lines, with a dozen variable declarations at best. The
one reason I would want to eschew this approach is so that I can learn a
more rigorous answer. I'm sure this is not what's being done in gcc, and
since the point of the project is to learn, might as well do that.

3) If that doesn't help, at the start of your program allocate a big
chunk of memory for names. When you are done with all names, free
all the memory for names in a single call to free. You specify a
buffer of say, 256K. You allocate names in this buffer. When you are
done with the parsing of names, you free the buffer, not caring about
the individual names.
If you run out of space, you can allocate those buffers in a linked
list.
This last one also seems very possible. I figured I could declare the name
attribute of yylval to be a character array of size 256 (I don't think I
will have longer variables). Then I could simply use strcpy to copy the
string found by lex to that array, and later read from it.
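
A sketch of that fixed-size array idea, with a bounded copy instead of a bare strcpy, is shown below. The union sem_value is a stand-in for the type that yacc would generate from %union, and set_name is a hypothetical helper. Note one consequence of this design: the parser copies semantic values by value, so every entry on its value stack would grow to the size of the array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_SIZE 256   /* the size suggested in the post, including '\0' */

/* Stand-in for the semantic value type; in the real project this would be
 * generated from the %union declaration. */
union sem_value {
    int Value;
    char Operator[3];
    char Name[NAME_SIZE];
};

/* Copy text into v->Name, truncating if necessary.
 * Returns 1 if the whole name fit, 0 if it was cut short. */
int set_name(union sem_value *v, const char *text)
{
    int n;

    assert(v != NULL && text != NULL);
    n = snprintf(v->Name, sizeof v->Name, "%s", text);
    return n >= 0 && (size_t)n < sizeof v->Name;
}
```

Because the characters themselves are copied along with the value, there is no pointer to free and no lifetime to track, at the cost of a hard limit on name length.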

I am nevertheless going to ask this same question in comp.compilers to
see whether they have any other ideas. It seems like bullets 2) and 3) of
your answer will be what I implement.

Thanks for your answers,
Berk


Apr 16 '07 #11
Berk Birand wrote:
On Mon, 16 Apr 2007 01:10:25 -0500, Stephen Sprunk wrote:
>"CBFalconer" <cb********@yahoo.com> wrote in message
>>A simpler version is:

char *copy_name(char *yy) {
char *dst;

if (dst = malloc(1 + strlen(yy))) strcpy(dst, yy);
return dst;
}

An even simpler version, though less portable, is:

#define copy_name(x) strdup(x)

I never actually thought of using strdup for this problem. Thanks
for pointing it out. It actually does exactly the same thing as
my copy_name(), except for the free() call. If I end up not using
the free(), I can directly call strdup.
What free()? Neither the original, nor my replacement, used it.


Apr 17 '07 #12
CBFalconer <cb********@yahoo.com> wrote:
What free()? Neither the original, nor my replacement, used it.
It was commented out in OP's code.

--
C. Benson Manica | I *should* know what I'm talking about - if I
cbmanica(at)gmail.com | don't, I need to know. Flames welcome.
Apr 17 '07 #13
CBFalconer <cb********@yahoo.com> writes:
Berk Birand wrote:
>>
Hi,

I am working on a school project where we use lex/yacc to write a compiler
for a fictional (Java-like) language. I have handled all the details about
the yacc and lex files, but I still have a question regarding the dynamic
memory allocation for strings. When the lex file encounters a variable
name, I want it to pass this to yacc through yylval, and then
retrieve it to add to a syntax tree. For this purpose, I wrote the
following code in the lex file (third section), that will save the
variable in yylval:

void copy_name(char** dst, char* yy) {
int len;

//free(*dst);

// allocate memory for the new string
len = strlen(yy);
*dst = (char*) malloc(len * sizeof(char) + 1);
// copy the string
strcpy(*dst, yy);
}

sizeof (char) is 1 by definition, and can be omitted. Avoid //
comments in Usenet, they are foulable by line wraps. Check the
result of malloc. #include <stdlib.h>. A simpler version is:

char *copy_name(char *yy) {
char *dst;

if (dst = malloc(1 + strlen(yy))) strcpy(dst, yy);
This indentation is strongly frowned upon on any codebase I have worked
on where debuggers might be used to locate problems or walk programmers
through the code. It is too easy to miss the result of the
comparison. Much nicer to read and easier to debug is:

if (dst = malloc(1 + strlen(yy))) {
    strcpy(dst, yy);
}

Apr 18 '07 #14
