Dear members,
I am working on a 2 GHz Intel Pentium 4 with 512 MB of RAM.
I have a large matrix of 3000 * 15,000 elements of type double.
I have a calculation procedure that can be executed on any single
element of the matrix. By the nature of this procedure, it can be
run on every element independently. I created 30 threads. Each
thread runs the same procedure on an equal portion of the matrix.
For example, rows 0 to 99 are assigned to thread #1, which performs
the calculation on them. There is no data or function dependency.
The only shared data is the matrix itself.
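To make the scheme concrete, here is a minimal sketch of the row
partitioning in Python. It is not my actual code: calc is a hypothetical
stand-in for the real per-element procedure, row_blocks, worker and
run_threaded are names invented for this illustration, and the demo matrix
is kept tiny. It only shows how the rows are divided among the threads and
is not meant to reproduce the timing behaviour.

```python
import threading

N_ROWS, N_THREADS = 3000, 30  # sizes taken from the description above


def row_blocks(n_rows, n_threads):
    """Split range(n_rows) into n_threads contiguous, near-equal (start, stop) blocks."""
    base, extra = divmod(n_rows, n_threads)
    blocks, start = [], 0
    for i in range(n_threads):
        # The first `extra` blocks take one additional row when the split is uneven.
        stop = start + base + (1 if i < extra else 0)
        blocks.append((start, stop))
        start = stop
    return blocks


def calc(x):
    # Hypothetical placeholder for the real per-element procedure.
    return x * x + 1.0


def worker(matrix, start, stop):
    # Each thread updates only rows start..stop-1, so no locking is needed.
    for r in range(start, stop):
        row = matrix[r]
        for c in range(len(row)):
            row[c] = calc(row[c])


def run_threaded(matrix, n_threads):
    """Run `worker` on every row block in its own thread and wait for all of them."""
    threads = [
        threading.Thread(target=worker, args=(matrix, start, stop))
        for start, stop in row_blocks(len(matrix), n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    # Thread #1 gets rows 0..99 (half-open block (0, 100)).
    print(row_blocks(N_ROWS, N_THREADS)[0])
    # Small demo matrix; the real one is 3000 * 15,000.
    demo = [[float(r + c) for c in range(8)] for r in range(60)]
    run_threaded(demo, 6)
```

The blocks are half-open, so thread #1 covers rows 0 to 99, thread #2
rows 100 to 199, and so on up to row 2999.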
Before threading, I wrote the program sequentially; that is, all
of the elements were calculated one after another without any
parallelism.
After threading, I expected a significant increase in execution
speed, since I have 30 workers running concurrently. But the speed
increased only a little, far less than I expected.
What is the reason? How can I improve performance further?