473,903 Members | 3,336 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

sizeof([ALLOCATED MEMORY])

If I have malloc()'ed a pointer and want to read from it as if it were
an array, I need to know that I won't be reading past the last index.

If this is a pointer to a pointer, a common technique seems to be
setting a NULL pointer to the end of the list, and here we know that
the allocated memory has been exhausted. All good.

When this is a pointer to another type, say int, I could have a
variable that records how much memory is being allocated and use that
to track the size of the 'array'.
Alternatively, we could set the end of the 'array' to some kind of
error-code, such as 99 or MAX_INT.
I don't like either of these techniques.

So, what is a good way to stop a loop reading or writing past the
memory allocated to a pointer?
Or if possible, what is a good way of determining the size of memory
allocated to a pointer?

Cheers,
Matt

May 3 '06
74 4733
In article <pa************ *************** *@gmail.com>,
Kelsey Bjarnason <kb********@gma il.com> wrote:
[snips]

On Thu, 04 May 2006 14:18:31 +0000, Howard Hinnant wrote:
The data structure as described above has an efficiency concern:
Repeated growth in size can lead to quadratic expense due to continually
reallocating for a slightly larger size.


So don't do that. In simplest terms, don't add, multiply.


Did you read my entire post or just stop here?

-Howard
May 4 '06 #11
"Howard Hinnant" <ho************ @gmail.com> wrote in message
news:ho******** *************** ***********@syr cnyrdrs-02-ge0.nyroc.rr.co m...
No matter what the growth strategy for capacity, its existence is a
major motivation for finding out the amount of memory actually
allocated. create_array_sh ort(N) may request only N*sizeof(short)
bytes, but may receive slightly more than that (for alignment purposes
or whatever). If create_array_sh ort(N) had some way to find out how
much memory it actually received, then it makes perfect sense to set
capacity to size_received/sizeof(short) instead of to N. It makes for
higher performing code in the average case.


You're over-optimizing here. If malloc() returns more memory than you asked
for, when you expand the array, you'll get access to it. On average, you'll
perform the same number of realloc()s with or without this optimization
unless you're expanding by a very small amount each time, in which case a
large fraction of your realloc()s will be no-ops for the implementation.

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Aaron Sorkin
*** Posted via a free Usenet account from http://www.teranews.com ***
May 4 '06 #12
"CBFalconer " <cb********@yah oo.com> wrote in message
news:44******** *******@yahoo.c om...
Chris McDonald wrote:
In short, you cannot determine the allocated size, from the
allocated memory itself.


However, you can determine how much memory is needed by some means
or other, and perform a realloc to that size. This can lead to
snarling dogs playing tug-of-war.


My first thought was "how clever", but there are serious problems with this:

1. it assumes the pointer the callee received is to the start of the object
2. realloc() may relocate the object

The latter is a serious problem if you can't communicate that fact back to
the caller; it is cleaner for the interface to provide a way to indicate the
object's size to the callee in the first place so that this hack isn't
needed.

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Aaron Sorkin
*** Posted via a free Usenet account from http://www.teranews.com ***
May 4 '06 #13
In article <44************ ***********@fre e.teranews.com> ,
"Stephen Sprunk" <st*****@sprunk .org> wrote:
"Howard Hinnant" <ho************ @gmail.com> wrote in message
news:ho******** *************** ***********@syr cnyrdrs-02-ge0.nyroc.rr.co m...
No matter what the growth strategy for capacity, its existence is a
major motivation for finding out the amount of memory actually
allocated. create_array_sh ort(N) may request only N*sizeof(short)
bytes, but may receive slightly more than that (for alignment purposes
or whatever). If create_array_sh ort(N) had some way to find out how
much memory it actually received, then it makes perfect sense to set
capacity to size_received/sizeof(short) instead of to N. It makes for
higher performing code in the average case.


You're over-optimizing here. If malloc() returns more memory than you asked
for, when you expand the array, you'll get access to it. On average, you'll
perform the same number of realloc()s with or without this optimization
unless you're expanding by a very small amount each time, in which case a
large fraction of your realloc()s will be no-ops for the implementation.


Let's go through a specific example. I'm going to assume the struct I
showed before:

struct array_of_short
{
short* data;
size_t size;
size_t capacity;
};

Let's say that size == capacity == 100. Now for some reason my
algorithm decides it needs 2 more shorts. What do I do?

Today I call realloc similar to:

data = realloc(data, 3*capacity/2 * sizeof(short));

I.e. I up the capacity by 50% (or whatever growth factor you want to
use). Now under the hood lets say (and this is quite reasonable) that
the heap had already allocated 4 more bytes to data, and had another 64
bytes beyond that which it could have expanded that chunk in place. But
because my code is ignorant of the current heap structure in the
neighborhood of my pointer, the best thing I know to do is blindly ask
for a realloc expanding to 50 more shorts (assuming a 2 byte short for
this example). That in turn will trigger the heap to relocate my 200
byte array to another location.

If I had more access to the current heap structure, I could have known
that there was already room for two extra shorts allocated to me. I
could've avoided a call to realloc altogether.

Continuing along with this example: let's say instead of 2 more shorts,
my algorithm actually needed 10 more shorts. Here's my choices:

1. realloc for 10 more shorts.
2. realloc for a 50% bigger capacity.

Choice 1 risks the poor efficiency of an incremental growth strategy
(known performance problem).

Choice 2 completely wastes the 68 bytes that were sitting beyond my 200
byte buffer but had no way to find out about.

Wouldn't it be nice if I could:

A: Discover those extra 2 shorts that were allocated to me anyway.
B: Discover that I could expand in place by another 32 shorts without
the expense of relocating my array.

?

These seem like significant and obvious optimizations to me.

Indeed, I've coded them up and tested them. Given the right underlying
allocation interface one can achieve very decent performance increases
with very little effort. It won't make your code 10 times faster. But
it sure is an easy way to gain say 15% while minimizing heap
fragmentation and thrashing.

But it is only easy if you have the right allocation interface. Indeed
today the only way to achieve this type of optimization in C is to write
your own malloc/realloc/free functions (adding to the interface as
appropriate). Anyone who's written these functions (I have) knows that
this exercise can't be labeled easy.

-Howard
May 4 '06 #14


Howard Hinnant wrote On 05/04/06 17:26,:
[...]

If I had more access to the current heap structure, I could have known
that there was already room for two extra shorts allocated to me. I
could've avoided a call to realloc altogether.


IMHO, this isn't such a big deal. Or to put it another
way: if you've got an array that's (1) big enough to be a
problem and (2) expands and expands and expands and expands,
it's time to think about other data structures.

Are you familiar with any present-day file systems
that require each file to occupy contiguous disk sectors?

--
Er*********@sun .com

May 4 '06 #15
Howard Hinnant <ho************ @gmail.com> writes:
In article <44************ ***********@fre e.teranews.com> ,
"Stephen Sprunk" <st*****@sprunk .org> wrote:
"Howard Hinnant" <ho************ @gmail.com> wrote in message
news:ho******** *************** ***********@syr cnyrdrs-02-ge0.nyroc.rr.co m...
> No matter what the growth strategy for capacity, its existence is a
> major motivation for finding out the amount of memory actually
> allocated. create_array_sh ort(N) may request only N*sizeof(short)
> bytes, but may receive slightly more than that (for alignment purposes
> or whatever). If create_array_sh ort(N) had some way to find out how
> much memory it actually received, then it makes perfect sense to set
> capacity to size_received/sizeof(short) instead of to N. It makes for
> higher performing code in the average case.


You're over-optimizing here. If malloc() returns more memory than
you asked for, when you expand the array, you'll get access to it.
On average, you'll perform the same number of realloc()s with or
without this optimization unless you're expanding by a very small
amount each time, in which case a large fraction of your realloc()s
will be no-ops for the implementation.


Let's go through a specific example. I'm going to assume the struct I
showed before:

struct array_of_short
{
short* data;
size_t size;
size_t capacity;
};

Let's say that size == capacity == 100. Now for some reason my
algorithm decides it needs 2 more shorts. What do I do?

Today I call realloc similar to:

data = realloc(data, 3*capacity/2 * sizeof(short));

I.e. I up the capacity by 50% (or whatever growth factor you want to
use). Now under the hood lets say (and this is quite reasonable) that
the heap had already allocated 4 more bytes to data, and had another 64
bytes beyond that which it could have expanded that chunk in place. But
because my code is ignorant of the current heap structure in the
neighborhood of my pointer, the best thing I know to do is blindly ask
for a realloc expanding to 50 more shorts (assuming a 2 byte short for
this example). That in turn will trigger the heap to relocate my 200
byte array to another location.

If I had more access to the current heap structure, I could have known
that there was already room for two extra shorts allocated to me. I
could've avoided a call to realloc altogether.


No, you couldn't necessarily avoid the realloc altogether, unless the
extra memory is actually reserved. There might happen to be another
two shorts worth of storage available immediately after your allocated
space, but the system might later choose to use it for another
allocation.

More generally, if I call, say, malloc(300), the system might reserve
512 bytes for me, and might (internally) guarantee that it won't use
any of it for anything else until I call free() -- or it might
allocate 512 bytes, but remember that I only asked for 300, and later
carve out the last 128 bytes for another allocation. Or another call
to free() might result in my 300-byte allocation being at the
beginning of a contiguous chunk of 1024 bytes.

In other words, the number of bytes that are *really* reserved for a
given malloc() call isn't necessarily a meaningful question, and may
vary over time depending on things outside the program's control.

If I were going to suggest a new feature to address this (perhaps
straying off-topic a bit), I'd probably propose a new function similar
to realloc(), except that it's guaranteed not to relocate the object.
Let's call it xrealloc() (not a great name, but it will do for now).
If realloc() would have expanded or contracted the object in place,
xrealloc() is equivalent to realloc(). If realloc() would have
relocated the object, xrealloc() fails. (We wouldn't require
xrealloc() to succeed in *exactly* the same circumstances where
realloc() would expand or shrink the object in place, but it would
usually work out that way.)

If I've allocated space for 300 bytes, and I find that I need 302
bytes, I can call xrealloc(), asking for just 302 bytes. If this
succeeds, I'm done, and the call was probably very cheap. If it
fails, I can then do a more expensive realloc() call, requesting 400
or 500 bytes to leave room for future growth. And of course existing
code can ignore xrealloc() and continue to work as it always has.

This would allow an implementation to allocate more memory than you
ask for, while still permitting it to take back some of the extra
space if it needs it.

--
Keith Thompson (The_Other_Keit h) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
May 4 '06 #16
In article <ln************ @nuthaus.mib.or g>,
Keith Thompson <ks***@mib.or g> wrote:
If I had more access to the current heap structure, I could have known
that there was already room for two extra shorts allocated to me. I
could've avoided a call to realloc altogether.
No, you couldn't necessarily avoid the realloc altogether, unless the
extra memory is actually reserved. There might happen to be another
two shorts worth of storage available immediately after your allocated
space, but the system might later choose to use it for another
allocation.


There are two issues here (I'm guilty of introducing the second one in
my last post).

1. The allocator may give you extra memory for free but currently has
no way to tell you that.

2. The allocator may have extra memory located directly after what it
has allocated to you and that amount may vary with time.

In my previous example I attempted to address both of these.

I heartily agree that insight into #2 is the bigger performance win.
However, insight into #1 is practically cost free (there's a dollar on
the sidewalk, do you pick it up, or is it too much trouble to bend
over?).
If I were going to suggest a new feature to address this (perhaps
straying off-topic a bit), I'd probably propose a new function similar
to realloc(), except that it's guaranteed not to relocate the object.
Let's call it xrealloc() (not a great name, but it will do for now).
If realloc() would have expanded or contracted the object in place,
xrealloc() is equivalent to realloc(). If realloc() would have
relocated the object, xrealloc() fails. (We wouldn't require
xrealloc() to succeed in *exactly* the same circumstances where
realloc() would expand or shrink the object in place, but it would
usually work out that way.)

If I've allocated space for 300 bytes, and I find that I need 302
bytes, I can call xrealloc(), asking for just 302 bytes. If this
succeeds, I'm done, and the call was probably very cheap. If it
fails, I can then do a more expensive realloc() call, requesting 400
or 500 bytes to leave room for future growth. And of course existing
code can ignore xrealloc() and continue to work as it always has.

This would allow an implementation to allocate more memory than you
ask for, while still permitting it to take back some of the extra
space if it needs it.


I think that's an excellent idea. Perhaps there is some refinement
still to be done here. But the basic idea is fundamentally a win-win.

-Howard
May 4 '06 #17
In article <1146780106.641 149@news1nwk>,
Eric Sosman <Er*********@su n.com> wrote:
Howard Hinnant wrote On 05/04/06 17:26,:
[...]

If I had more access to the current heap structure, I could have known
that there was already room for two extra shorts allocated to me. I
could've avoided a call to realloc altogether.
IMHO, this isn't such a big deal. Or to put it another
way: if you've got an array that's (1) big enough to be a
problem and (2) expands and expands and expands and expands,
it's time to think about other data structures.


Having multiple data structures in your tool box is always a good thing.
Sometimes you need contiguous array, sometimes you need a linked list,
sometimes a hash table or tree, etc. That being said, the contiguous
array is the workhorse of data structures. It's the one I find myself
reaching for most often. Even when I have no idea how much it will grow.
Are you familiar with any present-day file systems
that require each file to occupy contiguous disk sectors?


Nope. But I am intimately familiar with an in-ram data structure which
stores a set of contiguous arrays (and makes it look contiguous). Every
once in a while it comes in very handy.

-Howard
May 4 '06 #18
Howard Hinnant <ho************ @gmail.com> writes:
In article <ln************ @nuthaus.mib.or g>,
Keith Thompson <ks***@mib.or g> wrote:

[snip xrealloc() idea]
This would allow an implementation to allocate more memory than you
ask for, while still permitting it to take back some of the extra
space if it needs it.


I think that's an excellent idea. Perhaps there is some refinement
still to be done here. But the basic idea is fundamentally a win-win.


Thanks. On the other hand, there's always the danger of creeping
featurism. The potential complexity of a memory allocation system is
nearly unbounded. For example, it might sometimes make sense to make
a request like:

I need a chunk of at least 300 bytes of data. I'll probably need
to expand it to 500 bytes later on, so reserve 500 if you can; if
not, I'll settle for 300. I *might* need to expand it to 4000
bytes eventually, so optimize for that case if you can. Tell me
how many bytes you were actually able to allocate. And I'm also
going to need to allocate a 10000 byte chunk; if you can't satisfy
that request immediately, let me exactly what my options are for
freeing other chunks of allocated memory to make that allocation
possible.

(Substitute larger sizes and/or more allocations to make this scenario
realistic on a modern non-embedded system.)

In general, you want to optimize for your program's actual usage
pattern, without necessarily knowing in advance what that's going to
be. Just defining the data structures to describe what you want and
what the system can give you would be a daunting task, let alone
actually implementing it. There has to be a tradeoff between having a
simple interface and adding more functionality.

One could argue that the C's current malloc/realloc/free interface is
the best tradeoff. A malloc/realloc/xrealloc/free interface would
certainly provide some options that don't currently exist, but it's
not clear that it would be worth the added complexity (or that it
wouldn't).

--
Keith Thompson (The_Other_Keit h) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
May 5 '06 #19
In article <ln************ @nuthaus.mib.or g>,
Keith Thompson <ks***@mib.or g> wrote:
Just defining the data structures to describe what you want and
what the system can give you would be a daunting task, let alone
actually implementing it. There has to be a tradeoff between having a
simple interface and adding more functionality.

One could argue that the C's current malloc/realloc/free interface is
the best tradeoff. A malloc/realloc/xrealloc/free interface would
certainly provide some options that don't currently exist, but it's
not clear that it would be worth the added complexity (or that it
wouldn't).


Excellent points all.

I propose that a good saddle point between complexity and simplification
is somewhere near:

1. How much memory did you actually give me?

2. You gave me a N-byte block (and thanks). I'm interested in
expanding it (in-place) to at a minimum of (N+Nmin) bytes, and perhaps
to a maximum of (N+Nmax) bytes. Can you help me do that?

3. You gave me a N-byte block (and thanks). I no longer need a large
part of this memory and would like to recycle it. Can you shrink this
block in-place such that it is no smaller than M bytes?

My proposal focuses more on the immediate, and not so much on the "what
I might change my mind to in the future". Imho, the client is better
informed to predict future needs and should pose its questions such that
it hides those details from the allocation system and just asks more
specific questions.

-Howard
May 5 '06 #20

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
2052
by: Andreas Suurkuusk | last post by:
Hi, I just noticed your post in the "C# memory problem: no end for our problem?" thread. In the post you implied that I do not how the garbage collector works and that I mislead people. Since the thread is over a month old, I decided to start a new one with my response. Please see my comments inline.
4
13021
by: Frank Esser | last post by:
I am using SQL 8 Personal edition with sp2 applied. I set the max server memory to 32MB and leave the min server memory at 0. When my application starts hitting the database hard the memory usage reported through task manager peaks between 41-42MB. I've stopped and restarted the MSSQLserver service and checked that the running values are what I set them to be. Does anybody have any ideas as to why the sqlservr.exe would be utilizing more...
0
1055
by: Bill Burwell | last post by:
Which memory properties, or what combinations of memory properties, provide useful information about a program's memory usage when that program has just started leaking memory? While I have a VB bias, it seems to me the answer to this question should be generic - that is language independent.
4
2598
by: Franklin Lee | last post by:
Hi All, I use new to allocate some memory,even I doesn't use delete to release them. When my Application exit, OS will release them. Am I right? If I'm right, how about Thread especally on Solaries OS? This means that I use new to allocate memory in one Thread and doesn't use delete to release them.
9
2363
by: Mike P | last post by:
I know everything about reference counting and making sure you don't have large objects lying around. I have also profiled my app with multiple tools. I know about the fact GC collects memory but not necessary give it back to the OS. It seems that .NET win app will only return memory to the OS when the OS is asking for it. But!!! When the OS is asking for it is usually too late, tons of swapping and slow performance.
22
3496
by: xixi | last post by:
hi, we are using db2 udb v8.1 for windows, i have changed the buffer pool size to accommadate better performance, say size 200000, if i have multiple connection to the same database from application server, will each connection take the memory 800M (200000 x 4k = 800 M), so the memory took will be 800M times number of connections, or the total memory get from bufferpool will be 800M?
14
20798
by: Alessandro Monopoli | last post by:
Hi all, I'm searching a PORTABLE way to get the available and total physical memory. Something like "getTotalMemory" and it returns the memory installed on my PC in bytes, and "getAvailableMemory" and it returns the available memory in bytes. Do you know is there's a C function, a c++ Object or anything else that compiles in Linux and Windows to get these data?
5
24921
by: kumarmdb2 | last post by:
Hi guys, For last few days we are getting out of private memory error. We have a development environment. We tried to figure out the problem but we believe that it might be related to the OS (I am new to Windows so not sure). We are currently bouncing the instance to overcome this error. This generally happen at the end of business day only (So maybe memory might be getting used up?). We have already increased the statement heap & ...
1
2058
by: Jean-Paul Calderone | last post by:
On Tue, 22 Apr 2008 14:54:37 -0700 (PDT), yzghan@gmail.com wrote: The test doesn't demonstrate any leaks. It does demonstrate that memory usage can remain at or near peak memory usage even after the objects for which that memory was allocated are no longer live in the process. This is only a leak if peak memory goes up again each time you create any new objects. Try repeated allocations of a large dictionary and observe how memory...
5
505
by: cham | last post by:
Hi, I am working on c++ in a linux system ( Fedora core 4 ), kernel version - 2.6.11-1.1369_FC4 gcc version - 4.0.0 20050519 ( Red Hat 4.0.0-8 ) In my code i am creating a vector to store pointers of type structure "SAMPLE_TABLE_STRUCT" ( size of this structure is 36 bytes ). I create an instance of structure "SAMPLE_TABLE_STRUCT" using operator "new" and push back into the vector,this is done inside a for loop for
0
9997
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10872
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
1
10981
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For most users, this new feature is actually very convenient. If you want to control the update process,...
0
10499
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
8047
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
7205
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5893
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
2
4307
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
3323
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.