473,856 Members | 1,738 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Tips on optimizing these functions

Hello everyone,

I wrote a bunch of recursive functions to operate on multi-dimensional
matrices. The matrices are allocated dynamically in a non-contiguous way,
i.e. as an array of pointers pointing to arrays of data,or other pointers
if the matrix has more than 2 dimensions.

The parameters passed to these functions are:
- current_dimensi on: counts (from 0 to dimensions-1) the matrix
dimension on which the function is working, it's the variable passed on
the stack by the recursion
- dimensions: number of matrix's dimensions
- elem_size: size of the matrix's elements
- dimensions_size s: a vector containing the 'size' of each dimension
For example, to work on a 10x20 matrix of integers, following the
ordering of above, we would pass:
(0,2,sizeof(int ),(unsigned int [2]){10,20})
for a 10x20x15 one, we would pass
(0,3,sizeof(int ),(unsigned int [3]){10,20,15})

The functions work fast for allocation and freeing, 'cause calls to
malloc and free take up most of the execution time. They're somewhat slow
at copying or initialising matrices. For initialization I mean assign a
scalar value to the elements of the matrix.

I've done some benchmarks with copying and initialisation. Compared to a
specific-nested-loop solution, the functions take up to twice the time.
However, turning on some optimization flags, specifically '-O3' with gcc,
the gap between the recursive and the specific solution reduces to 20%.

So, have you got any advice about optimizing this code?
Other suggestions are welcomeas well.

TIA

Andrea

Here follows the copying function. The initialising function is almost
identical

NB: to better understand the code you should imagine to work with a bi-
dimensional matrix (implemented as a pointer to pointer in the code). The
recursive step casts either the matrix to a vector, if the function
reached the elements' dimension, ending recursion, or the rows of the
matrix to a bi-dimensional matrix (again, pointer to pointer), continuing
recursion.

//////////////////////////////////

typedef unsigned char byte;

// this one copy one row of the matrix. The row is supposed to store the
value of elements, not pointers
void _copy_row(void* dest, void* src, unsigned short elem_size, unsigned
int n)
{
unsigned short length;

byte* d1,*d2;

d1 = (byte*)dest;
d2 = (byte*)src;

// copy byte to byte
while (n 0)
{
for (length = 0; length < elem_size; length++)
{
(*d1) = (*d2);
d1++;
d2++;
};
n--;
};
}

// this is the recursive function
void _vec_copy(byte current_dimensi on, byte dimensions,unsi gned short
elem_size, unsigned int* dimensions_size , void** restrict dest, void**
restrict src)
{
int i; // row index

if (current_dimens ion < dimensions)
{
if (current_dimens ion == dimensions -1)
{
_copy_row((void *)dest, (void*)src, elem_size,dimen sions_size
[current_dimensi on]);
}
else
{
for (i = 0; i < dimensions_size[current_dimensi on]; i++)
_vec_copy(curre nt_dimension+1, dimensions,
elem_size,dimen sions_size, (void**)dest[i], (void**)src[i]);
};
};

Sep 27 '08 #1
2 1779
Andrea Taverna wrote:
I've done some benchmarks with copying and initialisation. Compared to a
specific-nested-loop solution, the functions take up to twice the time.
However, turning on some optimization flags, specifically '-O3' with gcc,
the gap between the recursive and the specific solution reduces to 20%.

So, have you got any advice about optimizing this code?
Other suggestions are welcomeas well.
typedef unsigned char byte;

// this one copy one row of the matrix. The row is supposed to store the
value of elements, not pointers
void _copy_row(void* dest, void* src, unsigned short elem_size, unsigned
int n)
{
unsigned short length;

byte* d1,*d2;

d1 = (byte*)dest;
d2 = (byte*)src;

// copy byte to byte
while (n 0)
{
for (length = 0; length < elem_size; length++)
{
(*d1) = (*d2);
d1++;
d2++;
};
n--;
};
}
This is so dependent on the platform that we could justifiably argue you
should choose one, and go to a forum associated with that platform.
Do any of the compilers you use take advantage of restrict?
If elem_size happens to match frequently the size of a stdint type, you
will need to switch case the code so as to remove the inner loop for those
cases.
Some compilers automatically substitute a run-time library copy function
which invokes all the usual memcpy() optimizations (align destination,
move groups of bytes per instruction).
If you wrote memcpy() in line, that would work well with certain
compilers, not so well with others (possibly depending on command line
options and which run time library you choose). If you are somehow
prohibited from using restrict, writing in memcpy() makes the same assertion.
Sep 27 '08 #2
On Sat, 27 Sep 2008 14:13:50 +0200 (CEST), Andrea Taverna
<a.****@libero. itwrote:

snip discussion of matrix philosophy
>typedef unsigned char byte;

// this one copy one row of the matrix. The row is supposed to store the
value of elements, not pointers
void _copy_row(void* dest, void* src, unsigned short elem_size, unsigned
int n)
{
unsigned short length;

byte* d1,*d2;

d1 = (byte*)dest;
d2 = (byte*)src;

// copy byte to byte
while (n 0)
{
for (length = 0; length < elem_size; length++)
{
(*d1) = (*d2);
d1++;
d2++;
};
n--;
};
}
Each element consists of elem_size contiguous bytes. Each row
consists of n contiguous elements. Therefore, each row must consist
of n*elem_size contiguous bytes.

The entire body of your function can be replaced with
memcpy(dest, src, (size_t)n*elem_ size);

In fact, the entire function can be deleted and any call to the
function replaced with the above statement.

Either substitution will have the additional benefit of not invoking
undefined behavior if any of the elements are indeterminate.

--
Remove del for email
Sep 27 '08 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

7
2326
by: Andreas Paasch | last post by:
I've finally gotten my nice little system working and it's gone live. Now, I spent time optimizing my code and adding a little smart functionality here and there, based on needs and simplicity. In this context, I was wondering about something. I have a growing include.inc file that holds all my functions in it. Thinking speed, I was thinking that it might be a bit faster to split that include.inc file up into the different functions...
0
3957
by: Mike Chirico | last post by:
Interesting Things to Know about MySQL Mike Chirico (mchirico@users.sourceforge.net) Copyright (GPU Free Documentation License) 2004 Last Updated: Mon Jun 7 10:37:28 EDT 2004 The latest version of this document can be found at: http://prdownloads.sourceforge.net/souptonuts/README_mysql.txt?download
4
2549
by: J. Campbell | last post by:
From reading this forum, it is my understanding that C++ doesn't require the compiler to keep code that does not manifest itself in any way to the user. For example, in the following: { for(int i = 0; i < 10; ++i){ std::cout << i << std::endl; for(int j = 0; j < 0x7fffffff; ++j){} } }
8
1955
by: Hagen | last post by:
Hi, I have a question that you probably shouldn´t worry about since the compiler cares for it, but anyways: When you run your compiler with optimization turned on (eg. g++ with -Ox flag) and your program gets significantly faster than without, did you write bad code/ have a bad design? Cause what happens in those optimization steps is, I think, mostly
7
2335
by: eyh5 | last post by:
Hi, I'm writing some C codes to run simulations. I'm wondering if there is a website that may contain useful information on how to make one's code run more efficiently and in a computational-time-saving manner. Specifically, what I'd like to know is if there're any useful tips about writing your codes more efficiently. One such useful tip is that we can use the "switch" statement instead of multiple "if-else" statements; another is that...
1
2056
by: code | last post by:
Hi Grp http://www.books-download.com/?Book=1493-PHP+Hacks+%3a+Tips+%26+Tools+For+Creating+Dynamic+Websites+(Hacks) Description Programmers love its flexibility and speed; designers love its accessibility and convenience. When it comes to creating web sites, the PHP scripting language is truly a red-hot property. In fact, PHP is currently used on more than 19 million web sites, surpassing
2
2551
by: Jack | last post by:
I have a chunk of code that loads a few dozen function pointers into global variables. I'm concerned with unused memory consumption. What if the client only needs to use one or two functions? Then there's quite a few function pointers consuming memory and going to waste. Here's little example: // mycode.cpp or mycode.c typedef int (*PFN) (); PFN g_pfn;
4
3502
by: Got2Go | last post by:
Hello Group, I have a table that has millions of records in it. About 100 records are added every 5 minutes (one per OIDID) (the sample provided below has data for 2 OIDIDs (99 and 100) And I have a webpage that executes 9 queries one after the other, and then displays the results on the webpage. When the database was empty, this process was very quick. But, as the DB grew, it became slower.
0
1268
by: Miguel Perez | last post by:
Please critique this tail call optimizing decorator I've written. I've tried to fix the pitfalls of other proposed decorators, and the result is this one that supports mutual recursion, does not use exceptions, stack inspection or any implementation-dependent hack, and is pretty short and fast - the fastest out of the ones I could find and try. In fact, in tail-recursive environments I tested the impact of using the decorator is difficult...
0
9755
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
11048
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10376
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
1
7928
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
7084
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5754
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
5954
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
4168
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
3194
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.