Multidimensional array performance

I'm working with displaying and manipulating very large image sets. The
program handles anything from 2D images to 4D RGB volumes in a
time-series. I've been using dynamically allocated arrays to accomplish
this, but having recently added the ability to load and display 4D
data, I've run into some significant performance issues.

For a dataset of size 256x176x176x1, it takes about 30 seconds to
allocate the array. Loading the data into the array from 176 files only
takes a few seconds. Accessing the array is fine; I'm able to loop
through the entire volume in a couple of seconds. But when it's time to
deallocate the memory, it takes a couple of minutes. This is on a P4
2.4GHz with 1GB RAM. I'm using the code below to allocate and
deallocate the memory.

I've read about alternate methods for multidimensional arrays,
including templates. I've also read about memory pools, but haven't
seen anything about how to allocate a multidimensional array inside
a pool. I also don't think a pool is ideal, because I'll only be
(de)allocating memory when the user loads or unloads a dataset.
I've also thought about declaring a single flat array and computing
each element's offset by multiplication
(e.g. array[(row*XSize + col)*ZSize + plane]).

What method would be best to quickly allocate and deallocate large
chunks of memory?

Thanks,
Greg
-------------- DECLARATION -----------------
unsigned short int ****imgDataINT16U;   /* unsigned 16-bit */

--------------- ALLOCATION -------------------
int i, j, k;
/* set up the image buffer */
imgData->imgDataINT16U = new unsigned short int***[YSize];
for (i = 0; i < YSize; i++)
    imgData->imgDataINT16U[i] = new unsigned short int**[XSize];
for (i = 0; i < YSize; i++) {
    for (j = 0; j < XSize; j++)
        imgData->imgDataINT16U[i][j] = new unsigned short int*[ZSize];
}
for (i = 0; i < YSize; i++) {
    for (j = 0; j < XSize; j++)
        for (k = 0; k < ZSize; k++)
            imgData->imgDataINT16U[i][j][k] = new unsigned short int[TSize];
}

--------------- DEALLOCATION -------------
for (i = 0; i < imgInfo->GetVolumeYSize(); i++) {
    for (j = 0; j < imgInfo->GetVolumeXSize(); j++)
        for (k = 0; k < imgInfo->GetVolumeZSize(); k++)
            delete[] imgDataINT16U[i][j][k];
    progDlg.Update(i, "Unloading Data");
}
for (i = 0; i < imgInfo->GetVolumeYSize(); i++) {
    for (j = 0; j < imgInfo->GetVolumeXSize(); j++)
        delete[] imgDataINT16U[i][j];
}
for (i = 0; i < imgInfo->GetVolumeYSize(); i++)
    delete[] imgDataINT16U[i];
delete[] imgDataINT16U;

------------ GENERAL ACCESS ------------
voxelValue = imgData->imgDataINT16U[col][row][slice][0];

Sep 30 '06 #1
<Gr************@gmail.com> wrote in message
news:11**********************@m73g2000cwd.googlegroups.com...
: I'm working with displaying and manipulating very large image sets.
: The program handles anything from 2D images to 4D RGB volumes in a
: time-series. I've been using dynamically allocated arrays to
: accomplish this, but having recently added the ability to load and
: display 4D data, I've run into some significant performance issues.
:
: For a dataset of size 256x176x176x1, it takes about 30 seconds to
: allocate the array. Loading the data into the array from 176 files
: only takes a few seconds.
....
: -------------- DECLARATION -----------------
: unsigned short int ****imgDataINT16U; /* unsigned 16-bit */
: --------------- ALLOCATION -------------------
: int i, j, k;
: /* setup the image buffer */
: imgData->imgDataINT16U = new unsigned short int***[YSize];
: for (i=0; i < YSize;i++)
: imgData->imgDataINT16U[i] = new unsigned short int**[XSize];
: for (i=0; i < YSize;i++) {
: for (j=0; j < XSize;j++)
: imgData->imgDataINT16U[i][j] = new unsigned short int*[ZSize];
: }
: for (i=0; i < YSize;i++) {
: for (j=0; j < XSize;j++)
: for (k=0; k < ZSize;k++)
: imgData->imgDataINT16U[i][j][k] = new unsigned short int[TSize];
: }
Think of it: if TSize=1, then every single unsigned short 'voxel'
is allocated in its own memory block! That is a space overhead of a
pointer plus several bookkeeping bytes (depending on your library
implementation, maybe 16!) for every 2 bytes of actual data. For your
256x176x176x1 set, that means roughly 7.9 million tiny allocations for
the voxels alone, plus about 45,000 smaller pointer-array allocations.
This of course makes the initial allocation (and deallocation) expensive.
But performance during data manipulation suffers as well, because the
voxels end up scattered across memory, which is very cache-unfriendly.
: What method would be best to quickly allocate and deallocate large
: chunks of memory?

A first step would be to modify your code to use something like:

class Buf4D_Int16U {
public:
    Buf4D_Int16U(int xSize, int ySize, int zSize, int tSize)
      : p_( new unsigned short[xSize*ySize*zSize*tSize] )
      , xSize_(xSize)
      , ySize_(ySize)
      , zSize_(zSize)
      , tSize_(tSize)
    { }

    ~Buf4D_Int16U() { delete[] p_; }

    // Element access (non-const and const)
    unsigned short& operator()(int x, int y, int z, int t)
      { return p_[((x*ySize_ + y)*zSize_ + z)*tSize_ + t]; }
    unsigned short operator()(int x, int y, int z, int t) const
      { return p_[((x*ySize_ + y)*zSize_ + z)*tSize_ + t]; }

private:
    unsigned short* p_;
    int xSize_, ySize_, zSize_, tSize_;
    Buf4D_Int16U(Buf4D_Int16U const&);    // disabled copy-ctor
    void operator=(Buf4D_Int16U const&);  // disabled copy-assign
};

: ------------ GENERAL ACCESS ------------
: voxelValue = imgData->imgDataINT16U[col][row][slice][0];

Becomes:
voxelValue = imgData(col,row,slice,0);

Pass a reference to the imgData object instead of the
unsigned short**** pointers when passing the buffer to functions.
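For illustration, usage would look something like this (a rough sketch;
the function names loadAndProcess and sumAllVoxels are just placeholders,
not from your code):

// A hypothetical processing routine: it takes a reference to the
// buffer object instead of an unsigned short**** pointer.
unsigned long sumAllVoxels(const Buf4D_Int16U& img,
                           int xSize, int ySize, int zSize, int tSize)
{
    unsigned long sum = 0;
    for (int x = 0; x < xSize; ++x)
        for (int y = 0; y < ySize; ++y)
            for (int z = 0; z < zSize; ++z)
                for (int t = 0; t < tSize; ++t)
                    sum += img(x, y, z, t);
    return sum;
}

void loadAndProcess(int XSize, int YSize, int ZSize, int TSize)
{
    // One allocation instead of millions of small ones:
    Buf4D_Int16U imgData(XSize, YSize, ZSize, TSize);

    // ... read the 176 files and store each voxel with:
    //     imgData(x, y, z, t) = value;

    unsigned long total = sumAllVoxels(imgData, XSize, YSize, ZSize, TSize);
    (void)total;
}   // imgData goes out of scope here: a single delete[] frees everything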

Note that you probably want to use an std::vector<short> instead
of the naked pointer p_ (but this may have a performance cost,
especially in non-optimized builds...).
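A minimal sketch of that vector-backed variant (not from the original
post, and with the class name changed to keep it distinct):

#include <cstddef>
#include <vector>

// Same interface as Buf4D_Int16U, but the storage is managed by
// std::vector: no destructor, copy-control, or delete[] needed.
class Buf4D_Int16U_Vec {
public:
    Buf4D_Int16U_Vec(int xSize, int ySize, int zSize, int tSize)
      : data_(static_cast<std::size_t>(xSize) * ySize * zSize * tSize)
      , ySize_(ySize), zSize_(zSize), tSize_(tSize)
    { }

    unsigned short& operator()(int x, int y, int z, int t)
      { return data_[((x*ySize_ + y)*zSize_ + z)*tSize_ + t]; }
    unsigned short operator()(int x, int y, int z, int t) const
      { return data_[((x*ySize_ + y)*zSize_ + z)*tSize_ + t]; }

private:
    std::vector<unsigned short> data_;
    int ySize_, zSize_, tSize_;
};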
Many further improvements are possible, but this first step
should be a no-brainer...
Cheers,
Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
Brainbench MVP for C++ <http://www.brainbench.com>

Sep 30 '06 #2
That looks like an excellent solution. I never thought about putting
the indexing math into a function.
How does the performance of the [n][m][...] notation compare to a
function call when accessing elements?

: Many further improvements are possible, but this first step
: should be a no-brainer...

What other ways are there to improve performance with large
multidimensional arrays?

-Greg

Oct 1 '06 #3
<Gr************@gmail.com> wrote in message
news:11**********************@c28g2000cwb.googlegroups.com...
: That looks like an excellent solution. I never thought about putting
: the indexing math into a function.
: How does the performance of using [n][m][...] notation compare to a
: function call when accessing elements?
You should check using a profiler on your specific platform.
Some reasoning, though:
- In debug/unoptimized builds, you will pay the overhead of a function
call on every access. This may or may not matter. It may actually be
useful to add a range check on all indices inside the accessor, at the
expense of some extra runtime overhead (e.g. assert(x < xSize_), which
also catches negative values if x and xSize_ are unsigned). In an
optimized release build the accessor should be inlined, so the call
itself should not matter.
- On a modern hardware architecture, the contiguous (and cache-friendly)
accesses into a single [xSize*ySize*zSize*tSize] block, plus the
multiplications needed to compute an index, should be cheaper than
dereferencing several levels of pointers. But take this as an
educated guess.
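For illustration, the range-checked accessor could look like this
(a sketch, not from the original post; it assumes the indices and the
size members are changed to unsigned):

#include <cassert>

// Goes inside Buf4D_Int16U, replacing the non-const accessor shown
// earlier. With unsigned indices, a single x < xSize_ comparison also
// rejects "negative" values, since they wrap to huge unsigned numbers.
unsigned short& operator()(unsigned x, unsigned y, unsigned z, unsigned t)
{
    assert(x < xSize_ && y < ySize_ && z < zSize_ && t < tSize_);
    return p_[((x*ySize_ + y)*zSize_ + z)*tSize_ + t];
}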

: Many further improvements are possible, but this first step
: should be a no-brainer...
:
: What other ways are there of improving large multidimensional arrays?

For very large data sets, it can be useful to tile the data in
memory (e.g. store small sub-cubes of the data, each in one contiguous
memory block). This can, for example, improve the performance of some
multi-planar reconstructions thanks to better memory locality and
cache behaviour, especially if some of the data does not fit in main
memory. But it complicates the implementation of most algorithms.
Since your program already ran despite the huge allocation overhead of
the old scheme, fitting in memory should not be an issue for you.
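Just to give an idea of what tiled addressing looks like, here is a
sketch (the 16x16x16 tile size is an arbitrary choice, and the
dimensions are assumed to be multiples of the tile edge to keep it
short):

#include <cstddef>

const int TILE = 16;

// Index of voxel (x,y,z) when the volume is stored as contiguous
// 16x16x16 tiles, tiles themselves laid out in x/y/z order.
inline std::size_t tiledIndex(int x, int y, int z, int ySize, int zSize)
{
    const int tilesY = ySize / TILE, tilesZ = zSize / TILE;
    const int tx = x / TILE, ty = y / TILE, tz = z / TILE;  // which tile
    const int ox = x % TILE, oy = y % TILE, oz = z % TILE;  // offset inside it
    const std::size_t tile   = (static_cast<std::size_t>(tx)*tilesY + ty)*tilesZ + tz;
    const std::size_t offset = (static_cast<std::size_t>(ox)*TILE + oy)*TILE + oz;
    return tile*(TILE*TILE*TILE) + offset;
}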

For many algorithms, you will save time by using relative offsets to
reach the neighbors of a voxel rather than recomputing each full index.
E.g.: tStep=1, zStep=tSize, yStep=zStep*zSize, xStep=yStep*ySize;
when you have a pointer pv to a voxel, you can directly address its
neighbors as pv[+zStep], pv[-zStep], etc.
Even better, by adding a 'margin' around your data set, you can
linearly scan and transform the whole data set from begin() to end()
without any full-index computation.
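For instance, averaging the six axis-aligned neighbours over the
interior of the volume might look like this (a sketch assuming the flat
layout and stride names above; it processes only the t=0 time point):

// Neighbours are reached by adding/subtracting a precomputed stride,
// not by recomputing the full 4D index for each of them.
void averageNeighbours(const unsigned short* src, unsigned short* dst,
                       int xSize, int ySize, int zSize, int tSize)
{
    const int zStep = tSize;           // tStep would be 1
    const int yStep = zStep * zSize;
    const int xStep = yStep * ySize;

    for (int x = 1; x < xSize - 1; ++x)
        for (int y = 1; y < ySize - 1; ++y)
            for (int z = 1; z < zSize - 1; ++z) {
                const int idx = x*xStep + y*yStep + z*zStep;   // t = 0
                const unsigned short* pv = src + idx;
                dst[idx] = static_cast<unsigned short>(
                    (pv[+xStep] + pv[-xStep] +
                     pv[+yStep] + pv[-yStep] +
                     pv[+zStep] + pv[-zStep]) / 6);
            }
}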

In a different vein, many algorithms today can be
reimplemented to leverage the graphics processor
of your display card:
http://www.google.com/search?q=GPU+processing

But remember the saying about premature optimization...
hth
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- email contact form
Brainbench MVP for C++ <http://www.brainbench.com>

Oct 1 '06 #4
Thanks! All your information will be helpful.
-Greg
Oct 1 '06 #5
