
# Matrix multiply code

 P: n/a Here is some code I wrote for matrix multiplication with arbitrary dimensions known at compile time. I am curious how practical it is. For instance, is it common to know the dimensions of matrices at compile time? Any help would be appreciated. Hopefully this code comes in useful for someone; let me know if you find it useful, or if you have suggestions on how to improve it.

```cpp
// Public Domain by Christopher Diggins, 2005
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <numeric>

template<typename Value_T, size_t Size_N, size_t Stride_N>
class Slice {
public:
    // constructor
    Slice(Value_T* x) : p(x) { }

    // typedef
    typedef Value_T value_type;

    // only a forward iterator is provided,
    // others are left as an exercise for the reader
    struct Slice_forward_iter {
        Slice_forward_iter(Value_T* x) : p(x) { }
        Value_T& operator*() { return *p; }
        const Value_T& operator*() const { return *p; }
        Value_T& operator++() { Value_T* tmp = p; p += Stride_N; return *tmp; }
        Value_T& operator++(int) { return *(p += Stride_N); }
        bool operator==(const Slice_forward_iter& x) { return p == x.p; }
        bool operator!=(const Slice_forward_iter& x) { return p != x.p; }
        Value_T* p;
    };
    typedef Slice_forward_iter iterator;
    typedef const Slice_forward_iter const_iterator;

    // public functions
    iterator begin() { return iterator(p); }
    iterator end() { return iterator(p + (Size_N * Stride_N)); }
    value_type& operator[](size_t n) { return *(p + (n * Stride_N)); }
    const value_type& operator[](size_t n) const { return *(p + (n * Stride_N)); }
    static size_t size() { return Size_N; }
    static size_t stride() { return Stride_N; }

private:
    // prevent default construction
    Slice() { }
    Value_T* p;
};

template<typename Value_T, unsigned int Rows_N, unsigned int Cols_N>
class Matrix {
public:
    typedef Slice<Value_T, Cols_N, 1> row_type;
    typedef Slice<Value_T, Rows_N, Cols_N> col_type;

    const row_type row(unsigned int n) const {
        assert(n < rows);
        return row_type(data + (n * Cols_N));
    }
    const col_type column(unsigned int n) const {
        assert(n < cols);
        return col_type(data + n);
    }
    row_type operator[](unsigned int n) { return row(n); }
    const row_type operator[](unsigned int n) const { return row(n); }

    const static unsigned int rows = Rows_N;
    const static unsigned int cols = Cols_N;
    typedef Value_T value_type;

private:
    mutable Value_T data[Rows_N * Cols_N];
};

template<class Matrix1, class Matrix2>
struct MatrixMultiplicationType {
    typedef Matrix<typename Matrix1::value_type, Matrix1::rows, Matrix2::cols> type;
};

template<class Slice1_T, class Slice2_T>
typename Slice1_T::value_type dot_product(Slice1_T x, Slice2_T y) {
    assert(x.size() == y.size());
    return std::inner_product(x.begin(), x.end(), y.begin(),
                              typename Slice1_T::value_type(0));
}

template<class Matrix1, class Matrix2>
typename MatrixMultiplicationType<Matrix1, Matrix2>::type
matrix_multiply(const Matrix1& m1, const Matrix2& m2) {
    typename MatrixMultiplicationType<Matrix1, Matrix2>::type result;
    assert(Matrix1::cols == Matrix2::rows);
    for (int i = 0; i < Matrix1::rows; ++i)
        for (int j = 0; j < Matrix2::cols; ++j)
            result[i][j] = dot_product(m1.row(i), m2.column(j));
    return result;
}

template<class Matrix_T>
void output_matrix(const Matrix_T& x) {
    for (int i = 0; i < x.rows; ++i) {
        for (int j = 0; j < x.cols; ++j) {
            std::cout << x[i][j] << ", ";
        }
        std::cout << std::endl;
    }
    std::cout << std::endl;
}

void test_matrix() {
    // matrix sizes and element indices reconstructed; the archive
    // dropped everything inside angle brackets
    Matrix<int, 1, 2> m1;
    m1[0][0] = 2; m1[0][1] = 3;
    Matrix<int, 2, 2> m2;
    m2[0][0] = 2; m2[0][1] = 0;
    m2[1][0] = 0; m2[1][1] = 2;
    output_matrix(m2);
    Matrix<int, 1, 2> m3 = matrix_multiply(m1, m2);
    output_matrix(m3);
}

int main() {
    test_matrix();
    system("pause");
    return 0;
}
```

-- Christopher Diggins Object Oriented Template Library (OOTL) http://www.ootl.org Jul 23 '05 #1
15 Replies

 P: n/a On 2005-05-15, christopher diggins wrote:
> Here is some code I wrote for matrix multiplication with arbitrary dimensions known at compile time. I am curious how practical it is. For instance, is it common to know the dimensions of matrices at compile time?

There are some matrices where this is known. It depends on what the matrix models. If the matrix models some transform, e.g. a 3D coordinate transform, or a linear transform between two known lists of variables, then the dimensions may well be known at compile time. But if the matrix is a data set of some sort (for example, a collection of samples or measurements), the dimensions probably will not be known at compile time. Cheers, -- Donovan Rebbechi http://pegasus.rutgers.edu/~elflord/ Jul 23 '05 #2
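The distinction can be sketched directly in C++: a transform type can carry its dimensions as template parameters, while a data-set type must carry them as run-time values. A minimal sketch (`StaticMatrix` and `DynamicMatrix` are illustrative names, not taken from the posted code):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Dimensions known at compile time: e.g. a 3D affine transform is always 4x4,
// so the sizes can live in the type and the storage on the stack.
template<typename T, std::size_t Rows, std::size_t Cols>
struct StaticMatrix {
    std::array<T, Rows * Cols> data;
    T& operator()(std::size_t r, std::size_t c) { return data[r * Cols + c]; }
};

// Dimensions known only at run time: e.g. a data set whose size is read
// from a file, so the sizes are ordinary members and the storage is heap-allocated.
template<typename T>
struct DynamicMatrix {
    std::size_t rows, cols;
    std::vector<T> data;
    DynamicMatrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c) {}
    T& operator()(std::size_t r, std::size_t c) { return data[r * cols + c]; }
};
```

The static flavor lets the compiler unroll and constant-fold the indexing; the dynamic flavor lets you change problem sizes without recompiling, which is exactly the trade-off described above.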

 P: n/a christopher diggins wrote:
> Here is some code I wrote for matrix multiplication with arbitrary dimensions known at compile time. I am curious how practical it is. For instance, is it common to know the dimensions of matrices at compile time? Any help would be appreciated. Hopefully this code comes in useful for someone; let me know if you find it useful, or if you have suggestions on how to improve it.
> // Public Domain by Christopher Diggins, 2005

Unless you claim the copyright, you have no right to give your code away or even to use it yourself. For more details, see: http://www.gnu.org/licenses/licenses.html You will probably want to use a copyleft: http://www.gnu.org/licenses/licenses...WhatIsCopyleft

[the rest of the original code quoted in full; snipped]

```
g++ -Wall -ansi -pedantic -o main main.cpp
main.cpp: In function `void output_matrix(const Matrix_T&) [with Matrix_T = Matrix<...>]':
main.cpp:112:   instantiated from here
main.cpp:94: warning: comparison between signed and unsigned integer expressions
main.cpp:95: warning: comparison between signed and unsigned integer expressions
main.cpp: In function `typename MatrixMultiplicationType<...>::type matrix_multiply(const Matrix1&, const Matrix2&) [with Matrix1 = Matrix<...>, Matrix2 = Matrix<...>]':
main.cpp:113:   instantiated from here
main.cpp:86: warning: comparison between signed and unsigned integer expressions
main.cpp:87: warning: comparison between signed and unsigned integer expressions
main.cpp: In function `void output_matrix(const Matrix_T&) [with Matrix_T = Matrix<...>]':
main.cpp:114:   instantiated from here
main.cpp:94: warning: comparison between signed and unsigned integer expressions
main.cpp:95: warning: comparison between signed and unsigned integer expressions
```

Evidently, your code needs lots of work. Also, the design is still very poor. See The Object-Oriented Numerics Page http://www.oonumerics.org/oon/ to get an idea about how the experts do this. Good luck. Jul 23 '05 #3

 P: n/a
> Evidently, your code needs lots of work. Also, the design is still very poor. See The Object-Oriented Numerics Page http://www.oonumerics.org/oon/ to get an idea about how the experts do this.

I was curious how an "expert" implementation compares, so I compared my code to boost::ublas. Preliminary tests show that my implementation runs over twice as fast, and I haven't done any profiling or optimizing whatsoever. This makes sense, because the compiler can optimize the indexing using constants rather than doing variable lookups. This still leaves my original question: how practical is it? And do people often know matrix dimensions at compile time? For those interested, here is my first attempt at benchmarking:

```cpp
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/progress.hpp>
#include <iostream>
// ... Matrix, Slice and matrix_multiply as in the previous post ...

using namespace boost::numeric::ublas;

// template parameters reconstructed: multiply an M x R matrix of T by an
// R x N matrix, I times over (the archive dropped the template argument lists)
template<typename T, unsigned M, unsigned R, unsigned N, unsigned I>
void compare_naive_and_ublas() {
    Matrix<T, M, N> result1;
    matrix<T> result2;
    {
        std::cout << "about to run Naive test " << std::endl;
        Matrix<T, M, R> m1;
        Matrix<T, R, N> m2;
        int n = 0;
        for (unsigned i = 0; i < M; ++i)
            for (unsigned j = 0; j < R; ++j)
                m1[i][j] = ++n;
        n = 0;
        for (unsigned i = 0; i < R; ++i)
            for (unsigned j = 0; j < N; ++j)
                m2[i][j] = ++n;
        boost::progress_timer t;
        for (unsigned i = 0; i < I; i++) {
            result1 = matrix_multiply(m1, m2);
        }
        std::cout << "naive time elapsed " << t.elapsed() << std::endl;
    }
    {
        std::cout << "about to run uBlas test " << std::endl;
        matrix<T> m1(M, R);
        matrix<T> m2(R, N);
        int n = 0;
        for (unsigned i = 0; i < M; ++i)
            for (unsigned j = 0; j < R; ++j)
                m1(i, j) = ++n;
        n = 0;
        for (unsigned i = 0; i < R; ++i)
            for (unsigned j = 0; j < N; ++j)
                m2(i, j) = ++n;
        boost::progress_timer t;
        for (unsigned i = 0; i < I; i++) {
            result2 = prod(m1, m2);
        }
        std::cout << "boost time elapsed " << t.elapsed() << std::endl;
    }
    for (unsigned i = 0; i < M; ++i)
        for (unsigned j = 0; j < N; ++j)
            assert(result1[i][j] == result2(i, j));
}

int main() {
    // arguments reconstructed from the 17x23 * 23x31, 1000-iteration
    // figures quoted later in the thread
    compare_naive_and_ublas<int, 17, 23, 31, 1000>();
    system("pause");
    return 0;
}
```

Jul 23 '05 #6

 P: n/a "christopher diggins" wrote in message news:bJ**********************@weber.videotron.net...
> Here is some code I wrote for matrix multiplication for arbitrary dimensionality known at compile-time. [...]

I should be saying rows and columns, not dimensionality. Christopher Diggins Jul 23 '05 #7

 P: n/a
> Take a look at the blitz++ Tiny Vector Matrix library http://www.oonumerics.org/blitz/

This only works for very small matrices; my method works for arbitrarily large matrices.

> You can also look at the Array classes in The C++ Scalar, Vector, Matrix and Tensor class Library http://www.netwood.net/~edwin/svmtl/

I don't see how this code tracks the rows and columns at compile time.

> I also have an old recursive template definition for Matrix classes that you can get from ftp.cs.ucla.edu/pub/Tensor.tar.Z

I can't open it.

> All of these things exist mainly to help answer the questions that you have asked. Strictly speaking, your questions are off-topic in this forum but probably dead on in the object oriented numerics mailing list.

Okay, let me rephrase it. I have presented a way to do very fast matrix multiplication in C++ by representing the rows and columns as template parameters. I use some simple template metaprogramming to compute the type of the result of the matrix multiplication. The core of the technique is:

```cpp
template<class Matrix1, class Matrix2>
struct MatrixMultiplicationType {
    typedef Matrix<typename Matrix1::value_type, Matrix1::rows, Matrix2::cols> type;
};

template<class Slice1_T, class Slice2_T>
typename Slice1_T::value_type dot_product(Slice1_T x, Slice2_T y) {
    assert(x.size() == y.size());
    return std::inner_product(x.begin(), x.end(), y.begin(),
                              typename Slice1_T::value_type(0));
}

template<class Matrix1, class Matrix2>
typename MatrixMultiplicationType<Matrix1, Matrix2>::type
matrix_multiply(const Matrix1& m1, const Matrix2& m2) {
    typename MatrixMultiplicationType<Matrix1, Matrix2>::type result;
    assert(Matrix1::cols == Matrix2::rows);
    for (int i = 0; i < Matrix1::rows; ++i)
        for (int j = 0; j < Matrix2::cols; ++j)
            result[i][j] = dot_product(m1.row(i), m2.column(j));
    return result;
}
```

My question is: am I duplicating prior work, and does anyone find this interesting?

> http://www.oonumerics.org/mailman/li....cgi/oon-list/ Try there. I don't think that you'll be disappointed.

Thanks. -- Christopher Diggins Object Oriented Template Library (OOTL) http://www.ootl.org Jul 23 '05 #8

 P: n/a christopher diggins wrote:
> Here is some code I wrote for matrix multiplication with arbitrary dimensions known at compile time. [...]

You might want to look into Strassen's algorithm for matrix multiplication, which is asymptotically faster than the conventional (row · column) approach. Jul 23 '05 #9
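For the 2x2 base case, Strassen's scheme trades one multiplication for extra additions: seven products instead of the usual eight. A self-contained sketch of those formulas (`Mat2` and `strassen2x2` are illustrative names, not part of the posted Matrix class):

```cpp
#include <cassert>

// A 2x2 integer matrix, stored as four named elements for clarity.
struct Mat2 { int a11, a12, a21, a22; };

// Strassen's seven-product scheme for C = A * B.
Mat2 strassen2x2(const Mat2& A, const Mat2& B) {
    int m1 = (A.a11 + A.a22) * (B.a11 + B.a22);
    int m2 = (A.a21 + A.a22) * B.a11;
    int m3 = A.a11 * (B.a12 - B.a22);
    int m4 = A.a22 * (B.a21 - B.a11);
    int m5 = (A.a11 + A.a12) * B.a22;
    int m6 = (A.a21 - A.a11) * (B.a11 + B.a12);
    int m7 = (A.a12 - A.a22) * (B.a21 + B.a22);
    Mat2 C;
    C.a11 = m1 + m4 - m5 + m7;   // recombine the seven products
    C.a12 = m3 + m5;
    C.a21 = m2 + m4;
    C.a22 = m1 - m2 + m3 + m6;
    return C;
}
```

The asymptotic gain comes from applying the same scheme recursively, with the matrix entries themselves being sub-blocks rather than scalars.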

 P: n/a "christopher diggins" wrote in message news:bJ**********************@weber.videotron.net...
> Here is some code I wrote for matrix multiplication with arbitrary dimensions known at compile time. I am curious how practical it is. [...]

BTW, if I may interfere: now mostly they do it in assembly language to boost the performance. I'd like to see some bench comparisons for normal C++ algorithms, because I'd say there's not much difference. UF Jul 23 '05 #10

 P: n/a "Ufit" wrote in message news:d6*********@news.onet.pl...
> BTW, if I may interfere

Of course! :-)

> now mostly they do it in assembly language to boost the performance. I'd like to see some bench comparisons for normal C++ algorithms, because I'd say there's not much difference. UF

Here are the results for 1000 multiplications of a 17x23 matrix by a 23x31 matrix of integers, my code compared to ublas, compiled on Visual C++ 7.1:

```
about to run Naive test
1.33 s
about to run uBlas test
3.38 s
```

(YMMV.) There are probably optimizations left to do on the code. However, the code is mostly pointer arithmetic, so there might not be that much improvement to be gained by doing it in assembly, though I don't know yet. Any optimization suggestions would be welcome. -- Christopher Diggins Object Oriented Template Library (OOTL) http://www.ootl.org Jul 23 '05 #11

 P: n/a "Mark P" wrote in message news:fu******************@newssvr14.news.prodigy.com...
> You might want to look into Strassen's algorithm for matrix multiplication, which is asymptotically faster than the conventional (row · column) approach.

That is great stuff, thanks for pointing that out. Even though it is only applicable to n x n matrices, it can still be leveraged by using partial specializations. What could be done is something like the following (not tested):

```cpp
template<class Matrix1, class Matrix2>
typename MatrixMultiplicationType<Matrix1, Matrix2>::type
matrix_multiply(const Matrix1& m1, const Matrix2& m2)
{
    return matrix_multiply_algorithm_chooser<
        Matrix1, Matrix2,
        (Matrix1::rows == Matrix1::cols) &&
        (Matrix1::cols == Matrix2::rows) &&
        (Matrix2::rows == Matrix2::cols)
    >(m1, m2);
}

template<class Matrix1, class Matrix2, bool Square_N>
typename MatrixMultiplicationType<Matrix1, Matrix2>::type
matrix_multiply_algorithm_chooser(const Matrix1& m1, const Matrix2& m2)
{
    // normal multiplication algorithm
}

template<class Matrix1, class Matrix2>
typename MatrixMultiplicationType<Matrix1, Matrix2>::type
matrix_multiply_algorithm_chooser<Matrix1, Matrix2, true>(const Matrix1& m1, const Matrix2& m2)
{
    // strassen multiplication algorithm
}
```

This brings to mind other possibilities for other special cases of matrices. -- Christopher Diggins Object Oriented Template Library (OOTL) http://www.ootl.org Jul 23 '05 #12
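One wrinkle with the untested sketch above: C++ does not allow partial specialization of function templates, so specializing the chooser on its bool parameter will not compile as written. The conventional workaround is to route the selector through a class template, which may be specialized. A minimal sketch of that dispatch (`MultiplyImpl` and `choose_algorithm` are illustrative names, not from the thread's code; the string tags only exist to make the selection observable):

```cpp
#include <cstring>

// General case: fall back to the conventional row-by-column algorithm.
template<bool Square_N>
struct MultiplyImpl {
    static const char* name() { return "conventional"; }
};

// Class templates, unlike function templates, may be specialized on the bool.
template<>
struct MultiplyImpl<true> {
    static const char* name() { return "strassen"; }
};

// Compile-time selection from the four dimension parameters.
template<unsigned R1, unsigned C1, unsigned R2, unsigned C2>
const char* choose_algorithm() {
    return MultiplyImpl<(R1 == C1) && (C1 == R2) && (R2 == C2)>::name();
}
```

A real implementation would give `MultiplyImpl` a static `run()` member holding each multiplication algorithm, with `matrix_multiply` forwarding to it.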

 P: n/a "christopher diggins" wrote in message news:Hh**********************@weber.videotron.net...
> My question is: am I duplicating prior work, and does anyone find this interesting?

IIRC, Dave Abrahams is currently working on Vector/Matrix Linear Algebra topics over at boost. Jeff Flinn Jul 23 '05 #13

 P: n/a "Mark P" wrote in message news:fu******************@newssvr14.news.prodigy.com...
> You might want to look into Strassen's algorithm for matrix multiplication, which is asymptotically faster than the conventional (row · column) approach.

Upon further research, it is important to point out to the interested reader that Strassen's is only useful for multiplication of square matrices for n > 654. For reference: http://mathworld.wolfram.com/StrassenFormulas.html Christopher Diggins Jul 23 '05 #14

 P: n/a christopher diggins wrote:
> "Mark P" wrote in message
>> You might want to look into Strassen's algorithm for matrix multiplication, which is asymptotically faster than the conventional (row · column) approach.
> Upon further research, it is important to point out to the interested reader that Strassen's is only useful for multiplication of square matrices for n > 654.

It's not quite so bad. That result of 654 is obtained by assuming that Strassen is applied recursively all the way down to matrices of size 1. This is actually a pretty horrible idea :) It's much better to apply Strassen down to a certain minimum size n0, and then use conventional multiplication for these small matrices. From my own calculations, the theoretically optimal choice is to apply Strassen down to a size of about 16. In practice, due to non-algorithmic overhead (memory access patterns, extra copying, etc.), this number may be a few times larger. I implemented this once in Java and found n0 around 45, but my implementation may not have been ideal (perhaps a certainty, given that I used Java :). The other point is that the algorithm can be extended to deal with non-square matrices without great difficulty. Assuming M < N, an MxN matrix can be viewed as floor(N/M) adjacent MxM square matrices and an adjacent (N%M)xM "non-square" matrix. You can use Strassen to multiply all combinations of the square matrices and then deal with the combinations involving non-square matrices separately (either recurse or use conventional multiply). Finally, you may want to consider asking about this on a group with algorithms in the name. Surely you'll find people with better practical knowledge of this than I can offer. -Mark Jul 23 '05 #15
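The effect of the cutoff n0 can be illustrated by counting scalar multiplications: each level of Strassen recursion costs seven half-size subproblems, while a leaf block of size n0 costs n0^3 conventional multiplies. A rough count for power-of-two sizes, ignoring additions and all constant factors (`strassen_mults` is an illustrative helper, not anyone's production code):

```cpp
// Scalar multiplications used by Strassen recursing down to blocks of
// size n0, then multiplying those blocks conventionally (n a power of two).
long long strassen_mults(long long n, long long n0) {
    if (n <= n0)
        return n * n * n;                   // conventional multiply on the leaf block
    return 7 * strassen_mults(n / 2, n0);   // seven half-size subproblems
}
```

For n = 64, recursing down to n0 = 16 needs 200704 scalar multiplies against 262144 for the conventional algorithm; the additions and copying that this count ignores are what push the practical cutoff higher, as described above.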

 P: n/a "christopher diggins" wrote in message news:XZ********************@weber.videotron.net...
> "Ufit" wrote in message news:d6*********@news.onet.pl...
>> Here is some code I wrote for matrix multiplication with arbitrary dimensions known at compile time. I am curious how practical it is. [...]

As a scientific programmer, my answer to the "usefulness" of efficient matrix multiply (and other linear algebra routines) for compile-time-known dimensions would probably be something like: very useful for small matrices (e.g. transforms in 3, 4, ... dimensions), less so for large matrices, since in my experience large matrix dimensions tend to relate to problem-size parameters and/or data set sizes. Either of the latter one wants to be able to experiment with, and a recompile per experiment would certainly be undesirable.

> BTW, if I may interfere: now mostly they do it in assembly language to boost the performance. I'd like to see some bench comparisons for normal C++ algorithms because I'd say there's not much difference. UF
> /.../ There are probably optimizations left to do on the code. However, the code is mostly pointer arithmetic, so there might not be that much improvement to be gained by doing it in assembly, though I don't know yet. Any optimization suggestions would be welcome.

For small matrices, I suspect something like the Blitz++ approach (basically unrolling all loops) is probably about as good as it gets. For large matrix multiply, everything ultimately boils down, as you say, to pointer chasing. In either case, further optimisation can probably only be achieved by exploitation of CPU features: pipelining, minimising cache misses, intelligent register allocation, that sort of thing. The established benchmark for (large) matrix multiply is probably the level 3 BLAS gemm() routines. The most efficient BLAS implementations tend to be the vendor-supplied ones (frequently written in assembler and/or Fortran) which achieve performance through very architecture-specific techniques. I've also found the ATLAS (Automatically Tuned Linear Algebra Software) implementation (http://math-atlas.sourceforge.net/) of the BLAS to produce very fast matrix multiply - many times faster than I've ever managed to code it myself in C or C++. If you're interested in further optimisation techniques, the ATLAS white paper http://www.netlib.org/lapack/lawns/lawn147.ps may be a good place to start. Regards, -- Lionel B Jul 23 '05 #16
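One of the cache-oriented techniques mentioned above can be sketched without any assembler: loop blocking, where the loops walk fixed-size tiles so that the pieces of A, B and C being reused stay cache-resident. A minimal row-major sketch (`blocked_multiply` is an illustrative name; real gemm kernels such as ATLAS's add register blocking, copying and architecture-specific tuning on top of this):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Blocked (tiled) n x n multiply: C += A * B, with C zero-initialized by the
// caller. Matrices are stored row-major in flat vectors; blk is the tile size.
void blocked_multiply(const std::vector<double>& A, const std::vector<double>& B,
                      std::vector<double>& C, std::size_t n, std::size_t blk) {
    for (std::size_t ii = 0; ii < n; ii += blk)
        for (std::size_t kk = 0; kk < n; kk += blk)
            for (std::size_t jj = 0; jj < n; jj += blk)
                // Multiply one blk x blk tile of A by one tile of B, accumulating
                // into a tile of C; each tile is touched many times while hot.
                for (std::size_t i = ii; i < ii + blk && i < n; ++i)
                    for (std::size_t k = kk; k < kk + blk && k < n; ++k) {
                        double aik = A[i * n + k];   // hoisted out of the j loop
                        for (std::size_t j = jj; j < jj + blk && j < n; ++j)
                            C[i * n + j] += aik * B[k * n + j];
                    }
}
```

The result is identical to the naive triple loop for any blk; the tile size only changes the memory access pattern, which is where the speed difference on large matrices comes from.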
