Smart Pointers: Is there something similar to smart pointers in C?

MotoK

Hi Experts,
I've just joined this group and want to know something:
Is there something similar to smart pointers in C, or something else to
prevent memory leaks in C programs?

Regards
MotoK

Sep 12 '06



Richard Heathfield

Rod Pemberton said:

"Richard Heathfield" <in*****@invalid.invalid> wrote in message
news:VZ******************************@bt.com...
Rod Pemberton said:

"Richard Heathfield" <in*****@invalid.invalid> wrote in message
news:B4******************************@bt.com...
No, your compiler system is not denigrated here. What is denigrated here is
your apparent inability to separate the idea of "C" from the idea of
"lcc-win32". The distinction is an important one.

No. That idea is completely incorrect. You'll never find a post from Doug
Gwyn (comp.std.c, ANSI C X3J11 standard, developer of the Army's BRL-UNIX
in ANSI C) where his explanation doesn't take the underlying assembly,
physical hardware such as the cpu and memory into account.

Yes, I know who Doug Gwyn is, and he is very much a respected, albeit
occasional, contributor to this newsgroup. I am very familiar with his
writing style. And I can say without a shadow of a doubt that you are
absolutely and utterly mistaken. Doug Gwyn has written a great many
articles that don't even mention the underlying assembly, physical hardware
such as the cpu, or memory (in the stuff-you-can-kick sense), let alone
"take them into account", so your claim that he never does so is quite
wrong.

You've misinterpreted this statement: "his explanation doesn't take the
underlying...into account" to mean that he _explicitly_ "mention(s) the
underlying assembly, physical hardware". That isn't even close to what I
said. He doesn't always explicitly mention them. But, his answers are
always worded to work properly with them.

That's because he answers questions about C with reference to the notional
abstract machine, rather than to specific implementations thereof. That is
precisely the point.

I've seen numerous laughable
answers from Plauger where he doesn't make sure that his answers comply
with assembly or hardware implementations.

If they comply with the abstract machine, then that is sufficient. It is the
C implementation's responsibility to ensure that correct programs are
translated in such a way as to work correctly on the target platform.

What this also tells me is:
either A) you have little, if any, assembly experience
or B) if you do, you've failed to fully comprehend what you read
(which I think I had numerous prior complaints with you, didn't I?)

Your inability to comprehend what I write does not imply my inability to
comprehend what you write.

Furthermore, I am quite sure Doug Gwyn would agree fully with me that the
distinction between "C" and "lcc-win32" is an important one. Why don't
you ask him?

Read my reply to Keith on this same issue... You do agree with Keith,
don't you?

Yes, and neither of us agrees with you on this occasion.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Sep 14 '06 #51

Ancient_Hacker

Spiro Trikaliotis wrote:

I think this will work. lcc-win32 uses Boehm's garbage collector, cf.
http://www.hpl.hp.com/personal/Hans_Boehm/gc/

Wow! I looked over the explanation, and it's very clever! It will
"kinda" work. Here's what it does:

The garbage collector scans the heap and stack and registers for things
that look like pointers. Anything that looks like a valid address
doesn't get collected. Heap blocks whose addresses don't seem to appear
anywhere in memory are candidates for collection. Will kinda work, perhaps a
usable amount of the time, except:

(1) If you're using more than 1/65536'th of the potential address
space, addresses will no longer be very unique-- i.e. things like
zero-terminated strings will start looking like valid addresses. Once
you start using more than 1/256'th of the address space, then even
string bodies and floats will start looking like addresses, which will
make the GC's job a lot harder (like, nearly impossible).

(2) If you pass a pointer to a system API, or pass a pointer in a
struct to a system API, the GC probably won't see the struct address,
or any addresses embedded in the struct. So things like call-back
addresses, semaphore addresses, indirect block references, async-I/O
control blocks, all those blocks are likely to be prematurely
collected, leading to major blammos. I know, "non-standard", you
deserve anything that happens. This applies to the run-time library
also, so the GC has to have hooks, either binary or source-code, into a
good deal of the RTL.

(3) There's a potential "buffer-bashing" security hole-- there are
internet worms out there that know about the four most common RTL heap
allocation schemes and they plop down plausible looking heap structures
in your web server input buffers. If the GC gets fooled by these,
almost anything can happen.
----

But still, I'm impressed with this GC implementation, a brave attempt
at doing the very difficult to impossible. Kudos to the
implementors. I think.
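
A minimal sketch of the scanning idea described above, not the actual internals of Boehm's collector. The `struct block` table and the function name `conservative_mark` are hypothetical, and addresses are modelled as plain integers so the example stays portable. Every word in a root area is treated as a possible pointer, and any block it appears to point into is marked as reachable. Blocks left unmarked would be the candidates for collection.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one heap block known to the collector. */
struct block {
    uintptr_t start;   /* address of the first byte of the block */
    size_t    size;    /* block length in bytes */
    int       marked;  /* set when some word appears to point into the block */
};

/* Treat every word in words[0..nwords) as a possible pointer and mark
 * any block it appears to point into. Returns the number of blocks
 * that were newly marked by this call. */
size_t conservative_mark(const uintptr_t *words, size_t nwords,
                         struct block *blocks, size_t nblocks)
{
    size_t newly_marked = 0;
    for (size_t i = 0; i < nwords; i++) {
        for (size_t b = 0; b < nblocks; b++) {
            uintptr_t lo = blocks[b].start;
            /* In range if lo <= word < lo + size; the subtraction form
             * avoids overflow in lo + size. */
            if (words[i] >= lo && words[i] - lo < blocks[b].size
                && !blocks[b].marked) {
                blocks[b].marked = 1;
                newly_marked++;
            }
        }
    }
    return newly_marked;
}
```

Note that the function cannot tell a genuine pointer from an integer that merely happens to fall inside a block. That is the source of the false retention discussed in point (1).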

Sep 14 '06 #52

Richard Bos

"Rod Pemberton" <do*********@bitfoad.cmm> wrote:

"Richard Heathfield" <in*****@invalid.invalid> wrote in message
No, your compiler system is not denigrated here. What is denigrated here is
your apparent inability to separate the idea of "C" from the idea of
"lcc-win32". The distinction is an important one.

No. That idea is completely incorrect. You'll never find a post from Doug
Gwyn (comp.std.c, ANSI C X3J11 standard, developer of the Army's BRL-UNIX
in ANSI C) where his explanation doesn't take the underlying assembly,
physical hardware such as the cpu and memory into account.

You mean messages such as <44***************@null.net>;
<cl****************@plethora.net>; <44***************@null.net>;
<44***************@null.net>; and <45***************@null.net> are
figments of my news server's imagination?

C is built upon assembly.

And assembly is built upon the movement of electrons in semi-conductors;
but that does not make Feynman diagrams on-topic in either an assembly
newsgroup, or in comp.lang.c.

Richard

Sep 14 '06 #53

William Hughes

Rod Pemberton wrote:

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
"Rod Pemberton" <do*********@bitfoad.cmm> writes:
"Richard Heathfield" <in*****@invalid.invalid> wrote in message
news:B4******************************@bt.com...
No, your compiler system is not denigrated here. What is denigrated
here is your apparent inability to separate the idea of "C" from
the idea of "lcc-win32". The distinction is an important one.

No. That idea is completely incorrect.

You're saying that the distinction between C and lcc-win32 is *not* an
important one? Fascinating.

Really? You disagree with me here and then turn around and (indirectly)
agree with what I just said:
C, as defined by the ISO C standard, works on an "abstract machine".

But, of course, your IQ is high enough and your experience is deep enough
that you understood that you can't separate one from the other, didn't you?

Unfortunately your IQ is not high enough to separate abstract from
particular. (Note that there is no assembler for the abstract machine;
it is defined in terms of results, not methods.)

-William Hughes

Sep 14 '06 #54

Keith Thompson

"Rod Pemberton" <do*********@bitfoad.cmm> writes:

"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
"Rod Pemberton" <do*********@bitfoad.cmm> writes:
"Richard Heathfield" <in*****@invalid.invalid> wrote in message
news:B4******************************@bt.com...
No, your compiler system is not denigrated here. What is denigrated
here is your apparent inability to separate the idea of "C" from
the idea of "lcc-win32". The distinction is an important one.

No. That idea is completely incorrect.

You're saying that the distinction between C and lcc-win32 is *not* an
important one? Fascinating.

Really? You disagree with me here and then turn around and (indirectly)
agree with what I just said:
C, as defined by the ISO C standard, works on an "abstract machine".

I have no idea how you interpret that to mean that I agree with you.
For the record, I do not agree with you.

See above for the specific statement of yours with which I disagreed.
Richard said that the distinction between "C" and "lcc-win32" is an
important one. You replied, "No. That idea is completely incorrect."
The only reasonable interpretation of that is that you don't believe
that the distinction between "C" and "lcc-win32" is an important one.

If that's not what you meant, please clarify.

But, of course, your IQ is high enough and your experience is deep
enough that you understood that you can't separate one from the
other, didn't you?

Nonsense. It's entirely possible to understand C without knowing
anything at all about lcc-win32. C existed long before lcc-win32 did;
I'm sure there were plenty of people who understood C at a time when
lcc-win32 didn't even exist.

Again, if you're making a claim other than one that there's no
important distinction between "C" and "lcc-win32", please say so.
(That's a surprising claim, since I thought your position was that a
knowledge of assembly language was critical to an understanding of C.)

In any case, you seem not to understand what an "abstraction" is. C
is defined on an abstract level, with little dependency on the
specifics of, for example, the underlying hardware. If I gathered a
group of people in a room with pencils and paper, and had them follow
the semantics of the C standard without using any computer hardware
(programs are submitted, and output is returned, as written text on
paper), the result, if done properly, would be a conforming C
implementation.

It's certainly true that the C abstract machine is *designed* to be
implementable on real-world hardware, and an understanding of one or
more assembly languages can certainly be helpful in understanding why
C is the way it is. But it's entirely possible to have a good
understanding of C based only on the abstract description in the
standard.

When I write C code, I don't think much about what happens within the
CPU when my program is executed. The CPU's job is to execute my code
and produce the required results. I don't need to know how it does
it.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Sep 14 '06 #55

Keith Thompson

"Ancient_Hacker" <gr**@comcast.net> writes:

Spiro Trikaliotis wrote:

I think this will work. lcc-win32 uses Boehm's garbage collector, cf.
http://www.hpl.hp.com/personal/Hans_Boehm/gc/

Wow! I looked over the explanation, and it's very clever! It will
"kinda" work. Here's what it does:

The garbage collector scans the heap and stack and registers for things
that look like pointers. Anything that looks like a valid address
doesn't get collected. Heap blocks whose addresses don't seem to appear
anywhere in memory are candidates for collection. Will kinda work, perhaps a
usable amount of the time, except:

(1) If you're using more than 1/65536'th of the potential address
space, addresses will no longer be very unique-- i.e. things like
zero-terminated strings will start looking like valid addresses. Once
you start using more than 1/256'th of the address space, then even
string bodies and floats will start looking like addresses, which will
make the GC's job a lot harder (like, nearly impossible).

Disclaimer: I haven't read the web page, but (I think) I have a
general idea of how the GC works.

I don't *think* what you describe is going to be much of a problem in
practice, though it's certainly a theoretical problem. In practice, I
suspect that most valid addresses tend to have different bit patterns
than most other valid data. There will be a *few* arbitrary bit
patterns that happen to look like pointers, but not many. This just
means that garbage collection won't be 100% efficient, but it
shouldn't be far from it. There may be applications where anything
short of 100% efficiency is unacceptable; such applications either
can't use GC, or need to use some more intrusive form of it.

(2) If you pass a pointer to a system API, or pass a pointer in a
struct to a system API, the GC probably won't see the struct address,
or any addresses embedded in the struct. So things like call-back
addresses, semaphore addresses, indirect block references, async-I/O
control blocks, all those blocks are likely to be prematurely
collected, leading to major blammos. I know, "non-standard", you
deserve anything that happens. This applies to the run-time library
also, so the GC has to have hooks, either binary or source-code, into a
good deal of the RTL.

That's an interesting point. Any time a pointer value is stashed away
where the GC can't see it, you have the potential of blocks being
collected prematurely. I've been thinking in terms of storing the
value in an external file, or breaking it down into bits or bytes
(e.g., by encrypting or compressing some chunk of data containing
pointers), but <OT>copying it into the kernel's memory space where GC
code, which runs in user mode, can't see it, is also likely to be an
issue</OT>. But I don't think it's likely that a program would pass a
pointer value to a system API *and forget it*. The application itself
would probably keep a copy of the pointer value in its own memory
space. And if the application is intended to work with GC, it must do
so.
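
A small sketch of why a stashed-away pointer defeats this kind of scan. The helper names (`hide_pointer`, `reveal_pointer`, `scan_finds`) and the mask value are made up for illustration. XOR-ing stands in for any transformation (encryption, compression, splitting into bytes) that changes the stored bit pattern. The scan compares raw words against the address, so it finds the plain pointer but not the encoded one, even though the program can still recover it.

```c
#include <stddef.h>
#include <stdint.h>

/* Arbitrary illustrative mask; any nonzero value changes the bit pattern. */
#define HIDE_MASK ((uintptr_t)0x5A5A5A5Au)

/* Encode a pointer so its stored bit pattern no longer equals the address. */
uintptr_t hide_pointer(void *p)
{
    return (uintptr_t)p ^ HIDE_MASK;
}

/* Recover the original pointer from its encoded form. */
void *reveal_pointer(uintptr_t hidden)
{
    return (void *)(hidden ^ HIDE_MASK);
}

/* Returns 1 if any word in words[0..n) equals the address p, which is
 * how a conservative scan would look for it; returns 0 otherwise. */
int scan_finds(const uintptr_t *words, size_t n, const void *p)
{
    for (size_t i = 0; i < n; i++)
        if (words[i] == (uintptr_t)p)
            return 1;
    return 0;
}
```

An application meant to run under such a collector would have to keep an unencoded copy of the pointer somewhere the scan can see it for as long as the block is needed.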

[snip]

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Sep 14 '06 #56

Tak-Shing Chan

On Thu, 14 Sep 2006, Keith Thompson wrote:

If I gathered a
group of people in a room with pencils and paper, and had them follow
the semantics of the C standard without using any computer hardware
(programs are submitted, and output is returned, as written text on
paper), the result, if done properly, would be a conforming C
implementation.

To do this properly would require a team of infallible human
beings. But there is only one Dan Pop.

Tak-Shing

Sep 14 '06 #57

Ancient_Hacker

Keith Thompson wrote:

Disclaimer: I haven't read the web page, but (I think) I have a
general idea of how the GC works.

I don't *think* what you describe is going to be much of a problem in
practice, though it's certainly a theoretical problem. In practice, I
suspect that most valid addresses tend to have different bit patterns
than most other valid data.

Yep, many systems start handing out addresses that have a few of the
high address bytes as zeroes, so addresses tend to be discernible if
you don't push the address range very far. And some addresses tend to
be 2/4/8/16 byte aligned when first handed out, so that thins the valid
address range somewhat, at least until the program starts indexing into
arrays.

But as soon as you ask for more than 16-bits of memory, the high byte
on a 32-bit architecture will be zero, followed or preceded by three
non-zero bytes (depending on address ordering), and that will be
mimicked by zero-terminated strings.

Worse yet, as soon as you ask for more than 24-bits of memory, the high
byte on a 32-bit architecture will be non-zero, making many string
bodies mimic addresses. Very bad.

I guess the moral is, when addresses start looking like data, switch to
64-bit compilers!
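
To make the string-versus-address concern concrete, here is a small sketch. The function names and the heap range used with it are assumptions for illustration only. It assembles four bytes into a 32-bit word the way a little-endian machine would read them, then checks whether that word lands inside a given address range.

```c
#include <stdint.h>

/* Combine four bytes into a 32-bit word the way a little-endian machine
 * would read them from memory (lowest address = least significant byte). */
uint32_t word_from_bytes_le(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}

/* Returns 1 if word falls inside the hypothetical heap [base, base + size). */
int looks_like_heap_address(uint32_t word, uint32_t base, uint32_t size)
{
    return word >= base && word - base < size;
}
```

For instance, the ASCII bytes 0x61 0x62 0x63 0x00 (the tail of a string ending in "abc") form the word 0x00636261, which a scan would treat as a pointer if the heap happened to cover that address.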

But I don't think it's likely that a program would pass a
pointer value to a system API *and forget it*.

Well, yes, usually true, but there are two or three thorny canonical problems:

(1) The app may keep the pointer, so the GC won't toss out the block,
but there are many OS's where you can pass in arbitrary arrays or
structs or even linked lists. For example a database app might pass an
array to the OS meaning "gather up these 3,200 pairs of random disk
blocks and put them in this other block of pointers to addresses in my
memory space". The GC would have (usually) no way to intercept that OS
call, and no intrinsic knowledge that the block passed has addresses in
it. Worse yet, the app need NOT keep a pointer to this request
structure, as in most cases the OS will asynchronously call back to the
user program, passing back the request block address, where the OS is
returning result codes. This is very common in Windows NT/XP. So
yipes, the app for a while may not have any trace of these addresses.

(2) There are OS calls to request the OS to allocate memory and return
the virtual address where it has given the app memory. The GC may not
have any way to hook this call and learn about those addresses.

So yes, this clever GC may be able to root around and figure out where
most blocks are, as long as addresses don't get too large, and apps
don't make any fancy OS calls. Whether this is a tenable situation
probably varies a lot from case to case.

I'd love to have a reliable GC for C. Last week I had what I thought
was a pretty clean C program, but when I used my malloc_watcher, at the
end it said "84,132 blocks using 68,321,144 bytes left dangling at
exit(0) time". I had forgotten to free() some large linked lists.
Sigh.
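
The malloc_watcher mentioned above is not shown in the thread. As a rough idea of what such a tool could do, here is a minimal sketch with hypothetical names (`counted_malloc`, `counted_free`, `counted_live_blocks`). It only counts live blocks, not bytes, and it assumes the program calls these wrappers instead of malloc() and free() directly.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping; these names are not from any real library. */
static size_t live_blocks = 0;

/* malloc wrapper that counts successful allocations. */
void *counted_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

/* free wrapper that counts releases; a null pointer is ignored, as with free. */
void counted_free(void *p)
{
    if (p != NULL) {
        live_blocks--;
        free(p);
    }
}

/* Number of blocks obtained from counted_malloc and not yet released. */
size_t counted_live_blocks(void)
{
    return live_blocks;
}
```

A program could register a function with atexit() that prints counted_live_blocks(), so that any nonzero count at exit points to forgotten free() calls.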

Sep 15 '06 #58

Frederick Gotham

Keith Thompson posted:

It's certainly true that the C abstract machine is *designed* to be
implementable on real-world hardware, and an understanding of one or
more assembly languages can certainly be helpful in understanding why
C is the way it is. But it's entirely possible to have a good
understanding of C based only on the abstract description in the
standard.

I myself haven't really got much of a clue about assembly language, call
stack and so forth... but I can write some decent code.

--

Frederick Gotham

Sep 15 '06 #59

Keith Thompson

"Ancient_Hacker" <gr**@comcast.net> writes:

Keith Thompson wrote:

Disclaimer: I haven't read the web page, but (I think) I have a
general idea of how the GC works.

I don't *think* what you describe is going to be much of a problem in
practice, though it's certainly a theoretical problem. In practice, I
suspect that most valid addresses tend to have different bit patterns
than most other valid data.

Yep, many systems start handing out addresses that have a few of the
high address bytes as zeroes, so addresses tend to be discernible if
you don't push the address range very far. And some addresses tend to
be 2/4/8/16 byte aligned when first handed out, so that thins the valid
address range somewhat, at least until the program starts indexing into
arrays.

But as soon as you ask for more than 16-bits of memory, the high byte
on a 32-bit architecture will be zero, followed or preceded by three
non-zero bytes (depending on address ordering), and that will be
mimicked by zero-terminated strings.

Worse yet, as soon as you ask for more than 24-bits of memory, the high
byte on a 32-bit architecture will be non-zero, making many string
bodies mimic addresses. Very bad.

You're making some assumptions about how addresses are allocated. In
my experience, they're not typically allocated starting at 0; the
address space for a given program tends to be sparse.

On one system, the following program:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    static int Static;
    int Auto;
    int *Allocated = malloc(sizeof *Allocated);

    printf("&Static = %p\n", (void*)&Static);
    printf("&Auto = %p\n", (void*)&Auto);
    printf("Allocated = %p\n", (void*)Allocated);
    return 0;
}

produces the following output:

&Static = 0x804962c
&Auto = 0xbffc2804
Allocated = 0x860c008

And my vague intuition tells me that there are still enough
differences between what addresses tend to "smell like" vs. other
kinds of data that accidental matches are unlikely. For example, all
of the above addresses contain bytes with the high-order bit set; in a
program dealing mainly with ASCII characters, these byte values are
unlikely to appear in strings.

The only way to be sure of this, one way or the other, is to measure
it. I'm guessing someone has already done so.
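
One simple way to start measuring this is sketched below. The function name `has_high_bit_byte` is made up for illustration, and the check assumes 32-bit words. It reports whether any byte of a word has its high-order bit set, which a byte of 7-bit ASCII text never does.

```c
#include <stdint.h>

/* Returns 1 if any of the four bytes of word has its high-order bit set;
 * returns 0 if every byte is in the 7-bit ASCII range 0x00..0x7F. */
int has_high_bit_byte(uint32_t word)
{
    return (word & 0x80808080u) != 0;
}
```

Applied to the three addresses printed above (0x0804962c, 0xbffc2804, 0x0860c008), the check is true for each, while a word built only from ASCII bytes such as 0x00636261 fails it. Counting how often the check succeeds over a program's actual data would give a first estimate of how separable the two populations are on a given system.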

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Sep 15 '06 #60
