The fastest way to fill 2D dynamic array with zeros

romayankin

Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:
~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];
for(int i = 0; i < col - 1; i++) {
mx[i] = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the extremely
fast way to fill the 2D dynamic array with zeros. Does anyone know
something I that might help me?

Thanks,
-R.

Feb 25 '06 #1

Subscribe Post Reply

8296

Bob Hairgrove

On 25 Feb 2006 07:42:44 -0800, ro********@gmail.com wrote:

Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:
~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];
for(int i = 0; i < col - 1; i++) {
mx[i] = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the extremely
fast way to fill the 2D dynamic array with zeros. Does anyone know
something I that might help me?

Thanks,
-R.

It would be much faster to calculate rows*columns and allocate your 2D
array in one call to new[] as if it were a 1D array, rather than doing
one new[] per column. Then you could initialize the elements with a
for loop as if it were a 1D array. I don't know if using a loop is the
fastest way, but all those new[] calls are probably going to be much
more expensive.

If you have millions or billions of doubles, you might want to split
the memory up into buckets and allocate a few threads to initialize
them in parallel. But this is not going to be platform-independent, so
I would just use a loop.

Probably the best idea is to use a std::vector<double> and treat it
like an array ... that way you won't have to call delete[] on your
arrays.

Implementing 2D arrays is also treated by Scott Meyers, among others,
in his book "More Effective C++". Google for "c++" and "2D array" and
you'll get lots of hits.

--
Bob Hairgrove
No**********@Home.com

Feb 25 '06 #2

Gianni Mariani

ro********@gmail.com wrote:

Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:
Look at "matrix" "C++" on google.
~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];
for(int i = 0; i < col - 1; i++) {
mx[i] = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the extremely
fast way to fill the 2D dynamic array with zeros. Does anyone know
something I that might help me?

How big are these arrays going to be ?

Feb 25 '06 #3

Victor Bazarov

Gianni Mariani wrote:

ro********@gmail.com wrote:
Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:
Look at "matrix" "C++" on google.
~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];
I didn't see anybody mention this, but why are you allocating
one less than what seems to be the "number of columns"?
for(int i = 0; i < col - 1; i++) {
mx[i] = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the
extremely fast way to fill the 2D dynamic array with zeros. Does
anyone know something I that might help me?

Yes, if you do

mx[i] = new double[row]();

(notice the parentheses after the brackets), then the elements
of that array shall be value-initialised, and for 'double' it
means zero-initialised. Beware, not all compilers are capable
of handling that syntax (surprisingly).
How big are these arrays going to be ?

Why should it matter, Gianni?

V
--
Please remove capital As from my address when replying by mail

Feb 25 '06 #4

Axter

ro********@gmail.com wrote:

Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:
~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];
for(int i = 0; i < col - 1; i++) {
mx[i] = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the extremely
fast way to fill the 2D dynamic array with zeros. Does anyone know
something I that might help me?

Here's a wrapper function that will initialize a 2D array to zero, much
more efficient.
template < typename T >
T **Allocate2DArray( int nRows, int nCols)
{
T **ppi = new T*[nRows];
//parentheses after the brackets makes it initialize to zero
T *curPtr = new T [nRows * nCols]();
for( int i = 0; i < nRows; i++) {
*(ppi + i) = curPtr;
curPtr += nCols;
}
return ppi;
}

The above wrapper function only makes two calls to new, which means the
following wrapper function only needs to make two calls to delete:

template < typename T >
void Free2DArray(T** Array)
{
delete [] *Array;
delete [] Array;
}

Example usage:
double **d = Allocate2DArray<double>(x, y);
d[0][0] = 10.0;
d[1][1] = 20.0;
d[3][2] = 2345.09;
Free2DArray(d);

Feb 25 '06 #5

romayankin

First answering your questions:

2Bob Hairgrove
I didn't want to create 1D array or use std::vector, because working
with 1D array involves to much address arithmetic's, and std::vector
is by default slower then any array, besides I do not need any
functionality that is provided by std::vector

How big are these arrays going to be ? Not quite big the max value is around 1000 elements, in most cases
~100.
I didn't see anybody mention this, but why are you allocating one less than what seems to be the "number of columns"?

This is an atavism. That's how it works for me :)
OK, thanks everyone for the replies, but that was my fault that I
didn't emphasize that my performance problem does not involve
allocation. It is completely in setting all elements of a particular 2D
array to zero. I'm creating this array only once and then i might need
to zero it tens thousands of times.

Up until now the best option I could come up with is:
~~~~~~~~~~~~~~~~~~
for(int i = 0; i < col - 1; i++) {
memset(mx[i], 0, sizeof(double)*row);
}
~~~~~~~~~~~~~~~~~~

Feb 25 '06 #6

Gianni Mariani

Victor Bazarov wrote:

Gianni Mariani wrote:
ro********@gmail.com wrote:
Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:
Look at "matrix" "C++" on google.

~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];

I didn't see anybody mention this, but why are you allocating
one less than what seems to be the "number of columns"?

Good catch. I was thinking "matrix" class so I wasn't looking there.

for(int i = 0; i < col - 1; i++) {
mx[i] = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the
extremely fast way to fill the 2D dynamic array with zeros. Does
anyone know something I that might help me?

Yes, if you do

mx[i] = new double[row]();

(notice the parentheses after the brackets), then the elements
of that array shall be value-initialised, and for 'double' it
means zero-initialised. Beware, not all compilers are capable
of handling that syntax (surprisingly).

How big are these arrays going to be ?

Why should it matter, Gianni?

Big (100's of cache size) sized arrays would benefit by using cache
prefetch and depending on the architecture (multiple instruction issue
and speculative execution) there are a number of things you can do to
re-order the statements in ways the compiler probably would not be able
to optimize for. These kinds of optimizations depend heavily on
architecture.

For example, in SGI's IRIX 4.x days, the memcpy routine was not
optimized for the R400x CPU's and some simple loop unrolling and the use
of "unaligned" ld/st increased performance by significant multiples (the
R4K had a single way associative cache). For the R8K and R10K, with
multiple issue, speculative execution and cache prefetch instructions,
the main loop was very different to get the most out of the CPU. In the
R10K, loop unrolling was next to useless, in the R4K loop unrolling was
a big boost.

However, if we're talking about what fits into a secondary cache, all
this is moot since big hits in performance in modern CPU's are due to
cache misses (mostly secondary cache). This also means that the speed
for "clearing out" a block of memory may be limited by the memory
bandwidth. So, the aim is to come up with an algorithm that is able to
saturate the memory interfaces.

However, if we're talking about a 4x4 matrix, that would fit in a small
number of cache lines and by the time the memory allocator handed over
the memory, it would more than likely already be in primary cache so
there would be next to no benefit to apply any of the more advanced
techniques to deal with cache misses.

All this is of course highly dependant on *your* system.

Feb 25 '06 #7

Gianni Mariani

ro********@gmail.com wrote:

First answering your questions:

2Bob Hairgrove
I didn't want to create 1D array or use std::vector, because working
with 1D array involves to much address arithmetic's, and std::vector
is by default slower then any array, besides I do not need any
functionality that is provided by std::vector
*BZZT* Wrong.

std::vector is not slower than an array unless you have tested it. In
many cases, I have seen negligible different between std::vector and an
array when the optimizer gets done with it.

How big are these arrays going to be ?
Not quite big the max value is around 1000 elements, in most cases
~100.

I didn't see anybody mention this, but why are you allocating one less than what seems to be the "number of columns"?

This is an atavism. That's how it works for me :)
OK, thanks everyone for the replies, but that was my fault that I
didn't emphasize that my performance problem does not involve
allocation. It is completely in setting all elements of a particular 2D
array to zero. I'm creating this array only once and then i might need
to zero it tens thousands of times.

BTW - the address arithmetic you feel is causing you performance issues
is far more likely to be less of an issue than the memory fetch for the
address you're looking for.

Up until now the best option I could come up with is:
~~~~~~~~~~~~~~~~~~
for(int i = 0; i < col - 1; i++) {
memset(mx[i], 0, sizeof(double)*row);
}
~~~~~~~~~~~~~~~~~~

Feb 25 '06 #8

Axter

ro********@gmail.com wrote:

First answering your questions:

2Bob Hairgrove
I didn't want to create 1D array or use std::vector, because working
with 1D array involves to much address arithmetic's, and std::vector
is by default slower then any array, besides I do not need any
functionality that is provided by std::vector
How big are these arrays going to be ?

Not quite big the max value is around 1000 elements, in most cases
~100.
I didn't see anybody mention this, but why are you allocating one less than what seems to be the "number of columns"?

This is an atavism. That's how it works for me :)
OK, thanks everyone for the replies, but that was my fault that I
didn't emphasize that my performance problem does not involve
allocation. It is completely in setting all elements of a particular 2D
array to zero. I'm creating this array only once and then i might need
to zero it tens thousands of times.

Up until now the best option I could come up with is:
~~~~~~~~~~~~~~~~~~
for(int i = 0; i < col - 1; i++) {
memset(mx[i], 0, sizeof(double)*row);
}

If you use the method I posted, you wouldn't have to interatore through
each column of the array to set it.
With the method I posted, you can do the following:
memset(mx[0], 0, sizeof(double)*row*col);

You can just make one call to memset.
The method I posted allocates a single block for the 2D array, and then
devides it in sections to set the main array of pointers.
Since it's one continuous block, you can just use one call to memset.

Feb 25 '06 #9

Gavin Deane

ro********@gmail.com wrote:

First answering your questions:

2Bob Hairgrove
I didn't want to create 1D array or use std::vector, because working
with 1D array involves to much address arithmetic's, and std::vector
is by default slower then any array, besides I do not need any
functionality that is provided by std::vector

std::vector provides memory management, exception safety and a clean
and simple way of passing the data between functions and/or containing
the data within an object. You must be writing a very trivial C++
program if you don't need any of that functionality.

Gavin Deane

Feb 25 '06 #10

Amadeus W. M.

On Sat, 25 Feb 2006 07:42:44 -0800, romayankin wrote:

Hello All,

I'm writing cross-platform code so i'm bound to standards. Here is
the code I have:
~~~~~~~~~~~~~~~~~~
double **mx = new double*[col - 1];
for(int i = 0; i < col - 1; i++) {
mx[i] = new double[row];
sizeOfMx += sizeof(double) * row;
}
//I need to fill "mx" with zeros as fast as it is possible
~~~~~~~~~~~~~~~~~~~

So as I mentioned in comment, i have problem with finding the extremely
fast way to fill the 2D dynamic array with zeros. Does anyone know
something I that might help me?

Thanks,
-R.

If you have to zero out repeatedly the double ** array, then I think the
fastest way is to call memset on a contiguous array (along the lines
Axter suggested). It's not just how many times you allocate/deallocate
memory. Once it's allocated, you only call memset once each time you need
to zero out the array, if the array is contiguous. And memset might be
optimized for the different architectures.

It's not good to have MANY large, contiguous arrays, because they may not
fit in memory. For instance, an image is invariably stored contiguously,
but how many images can you keep in memory simultaneously? Certainly not
a whole movie. Otherwise it's ok. Contiguous arrays of size O(1000) don't
pose any threat.

Now if you're concerned about page hits/misses, caches and virtual memory,
then you may have to unroll the loops, or resort to other tricks that
people in sci.math.num-analysis will most certainly know better.

Feb 26 '06 #11

romayankin

2Axter regarding your code. The thing is that my 2D array is not
rectangular. And therefore your approach will cause some unnecessary
memory overheads. But thanks for the idea i might implement it as an
"speed optimized" option.

2Gianni

BTW - the address arithmetic you feel is causing you performance issues
is far more likely to be less of an issue than the memory fetch for the
address you're looking for.

I didn't mean it would affect the speed, my saying that it involves too
much address arithmetic I meant that the code gets less crystal and
besides of that, its easier to make a mistake when you always have to
multiply iterator by the length of the row. And you right its more
likely that retrieving address using square brackets is usually less
efficient then using incrementing pointers.

So ... as far as i'm having non-rectangular 2D array the best way to
fill it with zeros is:
~~~~~~~~~~~~~~~~~~~~~~~~~~
for(int i = 0; i < col - 1; ++i)
{
memset(mx[i], 0, sizeof(double)*rowLen[i]);
}
~~~~~~~~~~~~~~~~~~~~~~~~~~
or am I missing something again?

Feb 26 '06 #12

John Carson

<ro********@gmail.com> wrote in message
news:11**********************@i40g2000cwc.googlegr oups.com

2Axter regarding your code. The thing is that my 2D array is not
rectangular. And therefore your approach will cause some unnecessary
memory overheads. But thanks for the idea i might implement it as an
"speed optimized" option.

The code in your original post was for a rectangular array. However, if you
have an array of row lengths, then Axter's approach can easily be modified
to allocate a block equal in size to the sum of those row lengths. Pointers
can easily be initialized to point at the appropriate places.
--
John Carson

Feb 26 '06 #13

Daniel T.

In article <11**********************@p10g2000cwp.googlegroups .com>,
ro********@gmail.com wrote:

First answering your questions:

2Bob Hairgrove
I didn't want to create 1D array or use std::vector, because working
with 1D array involves to much address arithmetic's, and std::vector
is by default slower then any array, besides I do not need any
functionality that is provided by std::vector

You are quite wrong. I once had a rather good C programmer insist the
same thing so we wrote a test program. Once the compiler was through
optimizing the code, the vector was actually 5% *faster* than anything
he could come up with.

Worst case, use a one dimensional array encapsulated in a class that
does the address arithmetic for you.
--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.

Feb 26 '06 #14

romayankin

Okey thanks, I guess then my question has been answered. Thanks for
everyone who has participated in this discussion

Feb 26 '06 #15

Similar topics

array of class attributes

by: beliavsky | last post by:

Suppose I define a trivial class composed of x and y coordinates and create a Numeric array of instances of that class. class xy: def __init__(self,x=0.0,y=0.0): self.x = x self.y = y def...

Python

Matrix/ fill with zeros.

by: David | last post by:

Anyone, Does anyonee now how to use the following outline to fill a matrix with zeros and call on the first function in the following program? void fill_with_zeros(int *mat){...

C / C++

Array initalization doubt.

by: K.M. Jr. | last post by:

Hi all, Consider this line - char s = "\0"; Does this initialize all array elements to zero ? Thanks.

C / C++

Dynamic fill form fields depending on dropdown box

by: Alex | last post by:

Hello list This question has probably already been asked, but let me ask again I have a mysql database to which I connect with my php scripts. The database contains articles. Name, Unit_Price...

PHP

find length of a dynamic array

by: finerrecliner | last post by:

hey everyone i'm trying to make a function that will return the length and width of a dynamically allocated 2D array. here's what i came up with to find width: int findWidth(int** matrix)...

C / C++

byte[] to byte*... What is the fastest way?

by: ThunderMusic | last post by:

Hi, The subject says it all... I want to use a byte and use it as byte* so I can increment the pointer to iterate through it. What is the fastest way of doing so in C#? Thanks ThunderMusic

.NET Framework

Re: Fill array keys if it doesn't exists

by: Sjoerd | last post by:

On Mon, 08 Sep 2008 00:11:38 -0700, mouac01@yahoo.com wrote: Loop through the array and make an array of all keys. Then loop through the array again and add the missing keys.

PHP

Is this even possible!!! Dynamic Option List plus onclick fill

by: bytesFTW99 | last post by:

I have been struggling with this for some time can anyone help out? just trying to have 3 dropdown boxes that fill depending on what is selected, then in some cases click a button and have the second...

Javascript

Migrating Website to Cloud - Emmanuel Katto

by: emmanuelkatto | last post by:

Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel

General

Looking to do Android software development, any suggestions? Is flutter better?

by: nemocccc | last post by:

hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?

General

Is that possible of reading the .csv file in column wise and the column have different lengths ?

by: Sonnysonu | last post by:

This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

C / C++

How to build RAID in BIOS?

by: Hystou | last post by:

There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

Computer Hardware

What is ONU?

by: marktang | last post by:

ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...

General

Maximizing Business Potential: The Nexus of Website Design and Digital Marketing

by: jinu1996 | last post by:

In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

Online Marketing

Discussion: How does Zigbee compare with other wireless protocols in smart home applications?

by: tracyyun | last post by:

Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

General

AI Job Threat for Devs

by: agi2029 | last post by:

Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

Career Advice

Access Europe - Using VBA to create a class based on a table - Wed 1 May

by: isladogs | last post by:

The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

Microsoft Access / VBA