Tips on optimizing these functions

Andrea Taverna

Hello everyone,

I wrote a bunch of recursive functions to operate on multi-dimensional
matrices. The matrices are allocated dynamically in a non-contiguous way,
i.e. as an array of pointers pointing to arrays of data, or to other
pointers if the matrix has more than 2 dimensions.

The parameters passed to these functions are:
- current_dimension: index (from 0 to dimensions-1) of the matrix
dimension the function is currently working on; it is the value that
changes from one recursive call to the next
- dimensions: number of dimensions of the matrix
- elem_size: size in bytes of one matrix element
- dimensions_size: a vector containing the size of each dimension
For example, to work on a 10x20 matrix of integers, following the
order above, we would pass:
(0,2,sizeof(int),(unsigned int [2]){10,20})
and for a 10x20x15 one we would pass:
(0,3,sizeof(int),(unsigned int [3]){10,20,15})
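
To make the layout concrete, here is a minimal sketch of how such a
non-contiguous matrix might be allocated and freed with the same four
parameters. The names vec_alloc_sketch and vec_free_sketch are
hypothetical (the actual allocation functions are not shown in this post),
and the sketch assumes that every dimension size is nonzero and that each
pointer level is stored as an array of void*.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees a matrix built by vec_alloc_sketch below. */
void vec_free_sketch(unsigned current_dimension, unsigned dimensions,
                     const unsigned *dimensions_size, void *m)
{
    if (m == NULL)
        return;
    if (current_dimension < dimensions - 1) {
        /* This level holds pointers: free every sub-matrix first. */
        void **level = m;
        for (unsigned i = 0; i < dimensions_size[current_dimension]; i++)
            vec_free_sketch(current_dimension + 1, dimensions,
                            dimensions_size, level[i]);
    }
    free(m);
}

/* Hypothetical helper: allocates one level of the pointer tree and
   recurses; only the last dimension holds actual (zeroed) elements. */
void *vec_alloc_sketch(unsigned current_dimension, unsigned dimensions,
                       size_t elem_size, const unsigned *dimensions_size)
{
    assert(current_dimension < dimensions);
    unsigned n = dimensions_size[current_dimension];

    if (current_dimension == dimensions - 1)
        return calloc(n, elem_size);            /* a row of elements */

    void **level = malloc(n * sizeof *level);   /* a row of pointers */
    if (level == NULL)
        return NULL;
    for (unsigned i = 0; i < n; i++) {
        level[i] = vec_alloc_sketch(current_dimension + 1, dimensions,
                                    elem_size, dimensions_size);
        if (level[i] == NULL) {
            /* Undo the sub-matrices allocated so far, then give up. */
            while (i > 0)
                vec_free_sketch(current_dimension + 1, dimensions,
                                dimensions_size, level[--i]);
            free(level);
            return NULL;
        }
    }
    return level;
}
```

With this sketch, the 10x20x15 example would be allocated as
vec_alloc_sketch(0, 3, sizeof(int), (unsigned int [3]){10,20,15}).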

The functions are fast enough for allocation and freeing, because the
calls to malloc and free take up most of the execution time anyway.
They're somewhat slow at copying or initialising matrices. By
initialisation I mean assigning a scalar value to every element of the
matrix.

I've done some benchmarks with copying and initialisation. Compared to a
hand-written nested-loop solution specific to one matrix shape, the
functions take up to twice the time. However, turning on some
optimization flags, specifically '-O3' with gcc, reduces the gap between
the recursive and the specific solution to 20%.

So, have you got any advice about optimizing this code?
Other suggestions are welcome as well.

TIA

Andrea

Here follows the copying function. The initialising function is almost
identical.

NB: to better understand the code, imagine working with a two-dimensional
matrix (implemented as a pointer to pointer in the code). The recursive
step either casts the matrix to a vector, if the function has reached the
dimension that holds the elements, which ends the recursion, or casts
each row of the matrix to a two-dimensional matrix (again, pointer to
pointer), which continues the recursion.

//////////////////////////////////

typedef unsigned char byte;

// Copies one row of the matrix. The row is supposed to store the
// values of the elements, not pointers.
void _copy_row(void* dest, void* src, unsigned short elem_size,
               unsigned int n)
{
    unsigned short length;
    byte *d1, *d2;

    d1 = (byte*)dest;
    d2 = (byte*)src;

    // copy byte by byte
    while (n > 0)
    {
        for (length = 0; length < elem_size; length++)
        {
            (*d1) = (*d2);
            d1++;
            d2++;
        }
        n--;
    }
}

// This is the recursive function.
void _vec_copy(byte current_dimension, byte dimensions,
               unsigned short elem_size, unsigned int* dimensions_size,
               void** restrict dest, void** restrict src)
{
    unsigned int i; // row index

    if (current_dimension < dimensions)
    {
        if (current_dimension == dimensions - 1)
        {
            // last dimension: the row holds the elements themselves
            _copy_row((void*)dest, (void*)src, elem_size,
                      dimensions_size[current_dimension]);
        }
        else
        {
            // otherwise each entry points to a lower-dimensional matrix
            for (i = 0; i < dimensions_size[current_dimension]; i++)
                _vec_copy(current_dimension + 1, dimensions, elem_size,
                          dimensions_size, (void**)dest[i], (void**)src[i]);
        }
    }
}

Sep 27 '08 #1

Tim Prince

Andrea Taverna wrote:

> I've done some benchmarks with copying and initialisation. Compared to a
> specific-nested-loop solution, the functions take up to twice the time.
> However, turning on some optimization flags, specifically '-O3' with gcc,
> the gap between the recursive and the specific solution reduces to 20%.
>
> So, have you got any advice about optimizing this code?
> Other suggestions are welcome as well.
>
> typedef unsigned char byte;
>
> // this one copy one row of the matrix. The row is supposed to store the
> // value of elements, not pointers
> void _copy_row(void* dest, void* src, unsigned short elem_size,
>                unsigned int n)
> {
>     unsigned short length;
>
>     byte* d1,*d2;
>
>     d1 = (byte*)dest;
>     d2 = (byte*)src;
>
>     // copy byte to byte
>     while (n > 0)
>     {
>         for (length = 0; length < elem_size; length++)
>         {
>             (*d1) = (*d2);
>             d1++;
>             d2++;
>         };
>         n--;
>     };
> }

This is so dependent on the platform that we could justifiably argue you
should choose one, and go to a forum associated with that platform.
Do any of the compilers you use take advantage of restrict?
If elem_size frequently matches the size of a stdint type, you will need
to switch on elem_size and special-case those sizes, so as to remove the
inner loop for them.
Some compilers automatically substitute a run-time library copy function
which applies all the usual memcpy() optimizations (align the destination,
move groups of bytes per instruction).
If you called memcpy() directly, that would work well with certain
compilers, not so well with others (possibly depending on command line
options and which run-time library you choose). If you are somehow
prohibited from using restrict, calling memcpy() makes the same
no-overlap assertion.
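
A rough sketch of the switch-on-elem_size idea follows. The name
copy_row_switch and the choice of sizes 1, 2, 4 and 8 are assumptions made
for illustration, not code from the thread. Each special case uses a
fixed-size memcpy() per element, which compilers commonly reduce to a
single load and store without running into alignment or aliasing
problems; the original byte loop stays as the fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of _copy_row: the inner per-byte loop disappears
   for the common element sizes, and remains only in the default case. */
void copy_row_switch(void *restrict dest, const void *restrict src,
                     size_t elem_size, size_t n)
{
    unsigned char *restrict d = dest;
    const unsigned char *restrict s = src;
    size_t i;

    assert(elem_size > 0);

    switch (elem_size) {
    case 1:
        for (i = 0; i < n; i++)
            d[i] = s[i];
        break;
    case 2:
        for (i = 0; i < n; i++)
            memcpy(d + 2 * i, s + 2 * i, 2);
        break;
    case 4:
        for (i = 0; i < n; i++)
            memcpy(d + 4 * i, s + 4 * i, 4);
        break;
    case 8:
        for (i = 0; i < n; i++)
            memcpy(d + 8 * i, s + 8 * i, 8);
        break;
    default:
        /* Fallback: plain byte-by-byte copy of the whole row. */
        for (i = 0; i < n * elem_size; i++)
            d[i] = s[i];
        break;
    }
}
```

Whether this beats a single copy of the whole row depends on the compiler
and the platform, so it is worth benchmarking both before keeping the
extra cases.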

Sep 27 '08 #2

Barry Schwarz

On Sat, 27 Sep 2008 14:13:50 +0200 (CEST), Andrea Taverna
<a.****@libero.it> wrote:

[snip discussion of matrix philosophy]

>typedef unsigned char byte;
>
>// this one copy one row of the matrix. The row is supposed to store the
>// value of elements, not pointers
>void _copy_row(void* dest, void* src, unsigned short elem_size,
>               unsigned int n)
>{
>    unsigned short length;
>
>    byte* d1,*d2;
>
>    d1 = (byte*)dest;
>    d2 = (byte*)src;
>
>    // copy byte to byte
>    while (n > 0)
>    {
>        for (length = 0; length < elem_size; length++)
>        {
>            (*d1) = (*d2);
>            d1++;
>            d2++;
>        };
>        n--;
>    };
>}

Each element consists of elem_size contiguous bytes. Each row
consists of n contiguous elements. Therefore, each row must consist
of n*elem_size contiguous bytes.

The entire body of your function can be replaced with
memcpy(dest, src, (size_t)n*elem_size);

In fact, the entire function can be deleted and any call to the
function replaced with the above statement.

Either substitution will have the additional benefit of not invoking
undefined behavior if any of the elements are indeterminate.
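
As a sketch of what the recursive copy could look like once the row
helper is gone, assuming the same parameter order as the original: the
name vec_copy_memcpy and the widened parameter types (unsigned, size_t)
are choices made for this illustration, and each pointer level is assumed
to be stored as an array of void*.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rewrite of _vec_copy: the last dimension is copied with a
   single memcpy() call instead of a byte-by-byte helper. */
void vec_copy_memcpy(unsigned current_dimension, unsigned dimensions,
                     size_t elem_size, const unsigned *dimensions_size,
                     void *dest, const void *src)
{
    assert(current_dimension < dimensions);
    size_t n = dimensions_size[current_dimension];

    if (current_dimension == dimensions - 1) {
        /* n contiguous elements of elem_size bytes each */
        memcpy(dest, src, n * elem_size);
        return;
    }

    /* Not the last dimension: both arguments are arrays of pointers. */
    void *const *d = dest;
    void *const *s = src;
    for (size_t i = 0; i < n; i++)
        vec_copy_memcpy(current_dimension + 1, dimensions, elem_size,
                        dimensions_size, d[i], s[i]);
}
```

The pointer arrays themselves are never overwritten, only the element
rows they lead to, so the destination must already be allocated with the
same shape as the source.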

--
Remove del for email

Sep 27 '08 #3
