I am writing a program in CUDA C. I have already solved the problem the classic (sequential) way, but I need to parallelise the code using CUDA. The problem is: print all vectors of length n in which each element can take a value in [0 ... K] and for which the sum of all elements is SUM.
I have written the program in CUDA C, and it should return the number of vectors that satisfy the condition. The problem is that I can't find any error in the code, I don't know how to debug on Ubuntu, and the output is always 0. I think the __global__ function doesn't execute. This is the code; I hope someone can help me:
The code of the program is:

    #include <stdio.h>
    #include <stdlib.h>
    #include <assert.h>
    #include <cuda.h>

    #define MIN(x, y) (((x) < (y)) ? (x) : (y))
    #define MYASSERT(condition) if(!(condition)) { return; }

    __device__ void distribute2 (int vec[], int n, int k, int sum)
    {
        int i;
        for (i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += blockDim.x * gridDim.x)
        {
            vec[i] = MIN(sum, k);
            sum = sum - vec[i];
        }
        MYASSERT (sum == 0);
    }

    __global__ void moveUp (int vec[], int n, int k, int *res)
    {
        int i;
        int collected = 0;
        for (i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += blockDim.x * gridDim.x)
        {
            if (collected == 0)
                collected = vec[i];
            else
            {
                if (vec[i] < k)
                {
                    vec[i] = vec[i] + 1;
                    distribute2 (vec, i, k, collected - 1);
                    __syncthreads();
                    res[0] = res[0] + 1;
                }
                else
                {
                    collected += k;
                }
            }
        }
        MYASSERT(collected != 0);
    }

    int main()
    {
        int n = 5;
        int vec[n];
        int k = 5;
        int sum = 10;

        int *res_h, *res_d;
        size_t size = 1 * sizeof(int);
        res_h = (int *)malloc(size);
        cudaMalloc((void **) &res_d, size);
        res_h[0] = 0;
        cudaMemcpy(res_d, res_h, size, cudaMemcpyHostToDevice);

        cudaDeviceProp devProp;
        cudaGetDeviceProperties(&devProp, 0);
        unsigned maxbytes = devProp.totalGlobalMem / 3;
        unsigned max_samples = maxbytes / sizeof(int);

        if (n > max_samples) n = max_samples;

        printf("Using %d samples to estimate pi\n", n);

        moveUp<<<256, 256>>>(vec, n, k, res_d);
        cudaMemcpy(res_h, res_d, size, cudaMemcpyDeviceToHost);
        printf("%d\n", res_h[0]);
        return 0;
    }
You can always put a printf in various places to display the situation when the program gets there. That would solve your worry about functions not executing.
I have never used NVIDIA's CUDA platform, so I won't be of much help. Hopefully someone with more experience with it will come along.
I can't use printf in a __global__ function when I am using CUDA.
Certainly there must be a CUDA debugger?
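The CUDA toolkit does ship a debugger for Linux, cuda-gdb, but checking the return code of every runtime call is usually the quicker first step. Two things stand out in the posted code: `vec` is a host array that is passed straight to the kernel without ever being copied to the device, and no return codes are checked, so a failed launch or a faulting kernel would leave `res_d` at its initial 0. The sketch below shows one possible way to surface such errors. It is not the poster's `moveUp` algorithm: it swaps in a plain brute-force enumeration of all (K+1)^n candidates so that the example stays self-contained. The names `CUDA_CHECK` and `countVectors` and the 32 x 256 launch configuration are assumptions made for illustration.

```cuda
#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the CUDA API): abort with a message
// if a CUDA runtime call does not return cudaSuccess.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",              \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Brute-force count (an illustrative replacement for moveUp):
// every candidate index is read as n digits in base (k + 1), and the
// candidate is counted when its digits add up to sum.
__global__ void countVectors(int n, int k, int sum,
                             unsigned long long total,
                             unsigned long long *count)
{
    unsigned long long stride = (unsigned long long)blockDim.x * gridDim.x;
    unsigned long long idx =
        (unsigned long long)blockIdx.x * blockDim.x + threadIdx.x;

    for (; idx < total; idx += stride) {
        unsigned long long code = idx;
        int s = 0;
        for (int j = 0; j < n; ++j) {
            s += (int)(code % (k + 1));   // value of element j
            code /= (k + 1);
        }
        if (s == sum)
            atomicAdd(count, 1ULL);       // avoids the race of res[0] = res[0] + 1
    }
}

int main(void)
{
    const int n = 5, k = 5, sum = 10;

    // Number of candidate vectors: (k + 1)^n.
    unsigned long long total = 1;
    for (int j = 0; j < n; ++j)
        total *= (unsigned long long)(k + 1);

    unsigned long long count_h = 0;
    unsigned long long *count_d = NULL;
    CUDA_CHECK(cudaMalloc((void **)&count_d, sizeof(count_h)));
    CUDA_CHECK(cudaMemcpy(count_d, &count_h, sizeof(count_h),
                          cudaMemcpyHostToDevice));

    countVectors<<<32, 256>>>(n, k, sum, total, count_d);
    CUDA_CHECK(cudaGetLastError());        // reports launch failures
    CUDA_CHECK(cudaDeviceSynchronize());   // reports errors raised while the kernel ran

    CUDA_CHECK(cudaMemcpy(&count_h, count_d, sizeof(count_h),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(count_d));

    printf("%llu\n", count_h);
    return 0;
}
```

For n = 5, K = 5, SUM = 10, inclusion-exclusion gives C(14,4) - 5*C(8,4) = 1001 - 350 = 651, so that is the count a correct run should report. The same `CUDA_CHECK` pattern can be wrapped around the `cudaMalloc`, `cudaMemcpy` and kernel launch in the original program to see whether the kernel really fails to execute.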