Array Performance: fix-size vs dynamic

In following codes, both the places 'p' is a global variable. In 1st case it's an array and 2nd case it's a pointer to an array. Do we get any performance benefit, while accessing 'p[ i ]' in the fixed size case ?

Fixed size array:
============
int p[1000];
void main ( )
{
for(int i = 0; i < 1000; i++)
p[ i ] = i;
}

Dynamic array:
===========
int *p = new int[1000];
void main ( )
{
for(int i = 0; i < 1000; i++)
p[ i ] = i;
}

Jul 6 '10 #1

Subscribe Post Reply

✓ answered by Banfa

Take this program

Expand|Select|Wrap|Line Numbers

 
int ugarray[5];

int igarray[5] = {10,9,8,7,6};
 
int get1(int *array);
 
int main()

{

    int sarray[5];

    int* harray = new int[5];

    int ix;
 
    ugarray[1] = 4;

    igarray[1] = 4;

    sarray[1] = 4;

    harray[1] = 4;
 
    ix = ugarray[1];

    ix = igarray[1];

    ix = sarray[1];

    ix = harray[1];
 
    get1(ugarray);

    get1(igarray);

    get1(sarray);

    get1(harray);
 
    delete harray;
 
  return 0;

}
 
int get1(int *array)

{

    return array[1];

}

It creates all types of arrays and accesses them for both read and write.

Take a look at the assembler

Expand|Select|Wrap|Line Numbers

 
int main()

{

0042D660  push        ebp  

0042D661  mov         ebp,esp 

0042D663  sub         esp,10Ch 

0042D669  push        ebx  

0042D66A  push        esi  

0042D66B  push        edi  

0042D66C  lea         edi,[ebp-10Ch] 

0042D672  mov         ecx,43h 

0042D677  mov         eax,0CCCCCCCCh 

0042D67C  rep stos    dword ptr es:[edi] 

    int sarray[5];

    int* harray = new int[5];

0042D67E  push        14h  

0042D680  call        operator new (42C072h) 

0042D685  add         esp,4 

0042D688  mov         dword ptr [ebp-108h],eax 

0042D68E  mov         eax,dword ptr [ebp-108h] 

0042D694  mov         dword ptr [harray],eax 

    int ix;
 
    ugarray[1] = 4;

0042D697  mov         dword ptr [ugarray+4 (495504h)],4 

    igarray[1] = 4;

0042D6A1  mov         dword ptr [igarray+4 (494004h)],4 

    sarray[1] = 4;

0042D6AB  mov         dword ptr [ebp-14h],4 

    harray[1] = 4;

0042D6B2  mov         eax,dword ptr [harray] 

0042D6B5  mov         dword ptr [eax+4],4 
 
    ix = ugarray[1];

0042D6BC  mov         eax,dword ptr [ugarray+4 (495504h)] 

0042D6C1  mov         dword ptr [ix],eax 

    ix = igarray[1];

0042D6C4  mov         eax,dword ptr [igarray+4 (494004h)] 

0042D6C9  mov         dword ptr [ix],eax 

    ix = sarray[1];

0042D6CC  mov         eax,dword ptr [ebp-14h] 

0042D6CF  mov         dword ptr [ix],eax 

    ix = harray[1];

0042D6D2  mov         eax,dword ptr [harray] 

0042D6D5  mov         ecx,dword ptr [eax+4] 

0042D6D8  mov         dword ptr [ix],ecx 
 
    get1(ugarray);

0042D6DB  push        offset ugarray (495500h) 

0042D6E0  call        get1 (42C32Eh) 

0042D6E5  add         esp,4 

    get1(igarray);

0042D6E8  push        offset igarray (494000h) 

0042D6ED  call        get1 (42C32Eh) 

0042D6F2  add         esp,4 

    get1(sarray);

0042D6F5  lea         eax,[sarray] 

0042D6F8  push        eax  

0042D6F9  call        get1 (42C32Eh) 

0042D6FE  add         esp,4 

    get1(harray);

0042D701  mov         eax,dword ptr [harray] 

0042D704  push        eax  

0042D705  call        get1 (42C32Eh) 

0042D70A  add         esp,4 
 
    delete harray;

0042D70D  mov         eax,dword ptr [harray] 

0042D710  mov         dword ptr [ebp-0FCh],eax 

0042D716  mov         ecx,dword ptr [ebp-0FCh] 

0042D71C  push        ecx  

0042D71D  call        operator delete (42B5EBh) 

0042D722  add         esp,4 
 
  return 0;

0042D725  xor         eax,eax 

}

0042D727  push        edx  

0042D728  mov         ecx,ebp 

0042D72A  push        eax  

0042D72B  lea         edx,[ (42D74Ch)] 

0042D731  call        @ILT+1470(@_RTC_CheckStackVars@8) (42B5C3h) 

0042D736  pop         eax  

0042D737  pop         edx  

0042D738  pop         edi  

0042D739  pop         esi  

0042D73A  pop         ebx  

0042D73B  add         esp,10Ch 

0042D741  cmp         ebp,esp 

0042D743  call        @ILT+3420(__RTC_CheckEsp) (42BD61h) 

0042D748  mov         esp,ebp 

0042D74A  pop         ebp  

0042D74B  ret

And for get1

Expand|Select|Wrap|Line Numbers

 
int get1(int *array)

{

0042D860  push        ebp  

0042D861  mov         ebp,esp 

0042D863  sub         esp,0C0h 

0042D869  push        ebx  

0042D86A  push        esi  

0042D86B  push        edi  

0042D86C  lea         edi,[ebp-0C0h] 

0042D872  mov         ecx,30h 

0042D877  mov         eax,0CCCCCCCCh 

0042D87C  rep stos    dword ptr es:[edi] 

    return array[1];

0042D87E  mov         eax,dword ptr [array] 

0042D881  mov         eax,dword ptr [eax+4] 

}

0042D884  pop         edi  

0042D885  pop         esi  

0042D886  pop         ebx  

0042D887  mov         esp,ebp 

0042D889  pop         ebp  

0042D88A  ret

A number of things are obvious (but note this only holds for the platform I am using here, Windows XP Visual Studio 2008).

uninitialised global, initialised global and stack variables all use the same instructions for access. A heap array using these instructions too but has to start by loading the pointer to the array. That is natural the heap array is accessed via an intermediary variable.

This might suggest that heap access will be a lot slower, it appears that it take 2 instructions to read where everything else takes 1. However a good optimising compiler is going to notice the constant loading of the pointer in harray and will endeavour, and normally succeed to optimise away most of them just leaving the read/write instruction making access to the heap about the same as anywhere else.

Next look at get1. Everyone can pass their pointer to get1 and it executes exactly the same set of instructions. My point here is that you rarely operate directly on data objects, you most often are passing them to other functions via a pointer and once you have done that you really have completely levelled the playing field.

5396

code green

1,726

Expert 1GB

The first would allocate memoery on the stack,
The second would allocate memory on the heap.
Stack access should be faster.
Why not run a benchmark test

Expand|Select|Wrap|Line Numbers

 int main() 

{ 

    clock_t start = clock(); 

    int p[10000];

    for(int i = 0; i < 10000; i++)

        p[ i ] = i;
 
    clock_t duration = clock() - start; 

    cout << "stack time " << duration << "\n"; 
 
    start = clock(); 

    int *p = new int[10000];

    for(int i = 0; i < 10000; i++)

        p[ i ] = i;
 
    duration = clock() - start; 

    cout << "heap time " << duration << "\n"; 

}

Jul 6 '10 #2

Banfa

9,065

Expert Mod 8TB

Stack access should be faster.

Complete rubbish. Heap and stack access times are platform dependent, since platforms are not even required to have them.

However the majority of platforms that have a heap and a stack use the same type of memory to store them and therefore the access times will be the same.

It may take marginally longer to allocate data from the heap.

All as verified by running you code on my PC.

Jul 6 '10 #3

code green

1,726

Expert 1GB

Complete rubbish.
It may take marginally longer to allocate data from the heap.

Yes I didn't read the post properly and was thinking more about memory allocation instead os access

Jul 6 '10 #4

iammilind

@Banfa
Please note that here the comparison is not happening between stack and heap. But the data segment and the heap.
For this case, assume that we are not accounting the original heap allocation.
'p' in the 1st case is at data segment; thus its address would be known by compiler.
'p' in the 2nd case at heap segment is also known; but it's pointing to some location.
I was just wondering at assembly level, is it possible that the data segment 'p' might become a little faster ?
I am unsure about it.

Jul 9 '10 #5

Banfa

9,065

Expert Mod 8TB

@Banfa
Please note that here the comparison is not happening between stack and heap. But the data segment and the heap.

Not really relevant or if anything even less relevant. The "data segment" can mean a number of things but most commonly either

The area of memory containing heap segment and uninitialised program global/static variables (which will be initialised to 0) called bss and initialised program global/static variables called data.
Initialised program global/static variables.

Sometimes if using definition 1 the stack segment is included as part of the data segment. Regardless all of those segments generally (and for a PC are specifically) stored in the same type of memory. Once you have the pointer to that memory the method used to allocate that pointer, be it stack, heap or program data is irrelevance to the speed at which the memory is accessed.

Jul 9 '10 #6

Banfa

9,065

Expert Mod 8TB

Take this program

Expand|Select|Wrap|Line Numbers

 
int ugarray[5];

int igarray[5] = {10,9,8,7,6};
 
int get1(int *array);
 
int main()

{

    int sarray[5];

    int* harray = new int[5];

    int ix;
 
    ugarray[1] = 4;

    igarray[1] = 4;

    sarray[1] = 4;

    harray[1] = 4;
 
    ix = ugarray[1];

    ix = igarray[1];

    ix = sarray[1];

    ix = harray[1];
 
    get1(ugarray);

    get1(igarray);

    get1(sarray);

    get1(harray);
 
    delete harray;
 
  return 0;

}
 
int get1(int *array)

{

    return array[1];

}

It creates all types of arrays and accesses them for both read and write.

Take a look at the assembler

Expand|Select|Wrap|Line Numbers

 
int main()

{

0042D660  push        ebp  

0042D661  mov         ebp,esp 

0042D663  sub         esp,10Ch 

0042D669  push        ebx  

0042D66A  push        esi  

0042D66B  push        edi  

0042D66C  lea         edi,[ebp-10Ch] 

0042D672  mov         ecx,43h 

0042D677  mov         eax,0CCCCCCCCh 

0042D67C  rep stos    dword ptr es:[edi] 

    int sarray[5];

    int* harray = new int[5];

0042D67E  push        14h  

0042D680  call        operator new (42C072h) 

0042D685  add         esp,4 

0042D688  mov         dword ptr [ebp-108h],eax 

0042D68E  mov         eax,dword ptr [ebp-108h] 

0042D694  mov         dword ptr [harray],eax 

    int ix;
 
    ugarray[1] = 4;

0042D697  mov         dword ptr [ugarray+4 (495504h)],4 

    igarray[1] = 4;

0042D6A1  mov         dword ptr [igarray+4 (494004h)],4 

    sarray[1] = 4;

0042D6AB  mov         dword ptr [ebp-14h],4 

    harray[1] = 4;

0042D6B2  mov         eax,dword ptr [harray] 

0042D6B5  mov         dword ptr [eax+4],4 
 
    ix = ugarray[1];

0042D6BC  mov         eax,dword ptr [ugarray+4 (495504h)] 

0042D6C1  mov         dword ptr [ix],eax 

    ix = igarray[1];

0042D6C4  mov         eax,dword ptr [igarray+4 (494004h)] 

0042D6C9  mov         dword ptr [ix],eax 

    ix = sarray[1];

0042D6CC  mov         eax,dword ptr [ebp-14h] 

0042D6CF  mov         dword ptr [ix],eax 

    ix = harray[1];

0042D6D2  mov         eax,dword ptr [harray] 

0042D6D5  mov         ecx,dword ptr [eax+4] 

0042D6D8  mov         dword ptr [ix],ecx 
 
    get1(ugarray);

0042D6DB  push        offset ugarray (495500h) 

0042D6E0  call        get1 (42C32Eh) 

0042D6E5  add         esp,4 

    get1(igarray);

0042D6E8  push        offset igarray (494000h) 

0042D6ED  call        get1 (42C32Eh) 

0042D6F2  add         esp,4 

    get1(sarray);

0042D6F5  lea         eax,[sarray] 

0042D6F8  push        eax  

0042D6F9  call        get1 (42C32Eh) 

0042D6FE  add         esp,4 

    get1(harray);

0042D701  mov         eax,dword ptr [harray] 

0042D704  push        eax  

0042D705  call        get1 (42C32Eh) 

0042D70A  add         esp,4 
 
    delete harray;

0042D70D  mov         eax,dword ptr [harray] 

0042D710  mov         dword ptr [ebp-0FCh],eax 

0042D716  mov         ecx,dword ptr [ebp-0FCh] 

0042D71C  push        ecx  

0042D71D  call        operator delete (42B5EBh) 

0042D722  add         esp,4 
 
  return 0;

0042D725  xor         eax,eax 

}

0042D727  push        edx  

0042D728  mov         ecx,ebp 

0042D72A  push        eax  

0042D72B  lea         edx,[ (42D74Ch)] 

0042D731  call        @ILT+1470(@_RTC_CheckStackVars@8) (42B5C3h) 

0042D736  pop         eax  

0042D737  pop         edx  

0042D738  pop         edi  

0042D739  pop         esi  

0042D73A  pop         ebx  

0042D73B  add         esp,10Ch 

0042D741  cmp         ebp,esp 

0042D743  call        @ILT+3420(__RTC_CheckEsp) (42BD61h) 

0042D748  mov         esp,ebp 

0042D74A  pop         ebp  

0042D74B  ret

And for get1

Expand|Select|Wrap|Line Numbers

 
int get1(int *array)

{

0042D860  push        ebp  

0042D861  mov         ebp,esp 

0042D863  sub         esp,0C0h 

0042D869  push        ebx  

0042D86A  push        esi  

0042D86B  push        edi  

0042D86C  lea         edi,[ebp-0C0h] 

0042D872  mov         ecx,30h 

0042D877  mov         eax,0CCCCCCCCh 

0042D87C  rep stos    dword ptr es:[edi] 

    return array[1];

0042D87E  mov         eax,dword ptr [array] 

0042D881  mov         eax,dword ptr [eax+4] 

}

0042D884  pop         edi  

0042D885  pop         esi  

0042D886  pop         ebx  

0042D887  mov         esp,ebp 

0042D889  pop         ebp  

0042D88A  ret

Jul 9 '10 #7

iammilind

Hi Banfa,
This was the exact point i was trying to make.
In my particular requirement, I have the ability to use the data segment directly. If I use heap segment with global pointer, then as your assembly suggests, it will create one extra indirection.
Thus the heap access will be slower compare to the direct data segment access.

get1() example is naturally same for all segments, because we deal with a simple pointer. But I don't intend to do that.

Thanks.

Jul 9 '10 #8

Banfa

9,065

Expert Mod 8TB

My point is your trying to decided which is the "best" memory to use is mainly a waste of time unless you actually have 2 different banks of memory with different access times. I have worked with processors that had a small amount of on-board RAM you could use to hold some variables. In that situation there was a significant difference in access speed to the different memory banks.

Efficiency of the algorithm used is much more likely to have an effect than choosing the "right" memory allocation technique.

Accessing an array directly rather than through a pointer may be ever so marginally faster but you would be needing to do an extremely large number of accesses. By definition a heap array is always accessed through a pointer but as I said most of the time a stack or static data array is also accessed through a pointer too eliminating difference.

Jul 9 '10 #9

iammilind

No, you got me wrong. I am not deciding which is the better memory to use by this single example.

My requirement was something like following:
I need an array which will keep record of some info.
The array length is decided before the program executes, so naturally the choice will go for static array on data segment.
At the same time, the array length has to be configurable by user before the program starts, so naturally the choice will go for heap segment (otherwise we need to always go and edit the file and recompile it);

The code needs to be highly optimized. So I was wondering if heap doesn't have any performance difference compared to data segment, then I will straight away choose heap segment itself.
Hope I made my 'complex' requirement clear. :)

Jul 9 '10 #10

Array Performance: fix-size vs dynamic

✓ answered by Banfa

Similar topics