Question

Consider a C function negMat() that negates each element of a K x K matrix y[][] and stores each result in the matrix x[][]:

void negMat(float *x, float *y, int K) {
    int i, j;
    for (i = 0; i < K; i++) {
        for (j = 0; j < K; j++) {
            x[i * K + j] = -y[i * K + j];
        }
    }
}

negMat() runs on the CPU, and x[][] and y[][] are stored in row-major order.

Write a CUDA kernel negMatGPU() that negates each element of a K x K matrix yG[][] and stores each result in the matrix xG[][] (see the given prototype). xG[][] and yG[][] are in the GPU's global memory and are also stored in row-major order. In your kernel, assume that there are K threads in total and that each thread handles all the elements of a single row. Remember that the K threads may be spread across multiple blocks. Make sure your code is efficient; otherwise points will be deducted. (Check Problem 3 for useful CUDA information.)

__global__ void negMatGPU(float *xG, float *yG, int K) {

// your code follows here

Answer #1

__global__ void negMatGPU(float *xG, float *yG, int K) {
    int column = blockDim.x * blockIdx.x + threadIdx.x;  // column number
    int row = blockDim.y * blockIdx.y + threadIdx.y;     // row number
    if (row < K && column < K) {
        int thread_id = row * K + column;  // element offset in row-major storage
        xG[thread_id] = -yG[thread_id];
    }
}

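/* row and column are derived from the block and thread indices, and the grid may contain more threads than the matrix has elements, so the kernel checks that both are less than K before touching memory. thread_id then ranges from 0 to K*K-1. Note that this kernel assumes a 2D launch with one thread per element, not the K threads (one per row) that the question specifies. */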
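For comparison, here is a minimal sketch of the layout the question actually describes: K threads in total, launched on a 1D grid, with each thread negating one whole row. The kernel name and signature come from the question's prototype. Everything else is an assumption made for illustration: the host helper negMatOnGPU() is hypothetical, and the block size of 256 is just a common choice, not something the question gives.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// One thread per row: K threads in total, possibly spread over several blocks.
__global__ void negMatGPU(float *xG, float *yG, int K) {
    // The global thread index doubles as the row number.
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= K) return;  // surplus threads in the last block do nothing

    // Compute the row offset once per thread instead of once per element.
    float *xRow = xG + (size_t)row * K;
    const float *yRow = yG + (size_t)row * K;
    for (int j = 0; j < K; j++) {
        xRow[j] = -yRow[j];
    }
}

// Hypothetical host helper (not part of the question): copies y to the GPU,
// launches the kernel with an assumed block size of 256, and copies x back.
cudaError_t negMatOnGPU(float *xHost, const float *yHost, int K) {
    assert(K > 0);
    size_t bytes = (size_t)K * K * sizeof(float);
    float *xG = nullptr;
    float *yG = nullptr;

    cudaError_t err = cudaMalloc((void **)&xG, bytes);
    if (err != cudaSuccess) return err;
    err = cudaMalloc((void **)&yG, bytes);
    if (err != cudaSuccess) { cudaFree(xG); return err; }

    err = cudaMemcpy(yG, yHost, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        const int threadsPerBlock = 256;  // assumed; any valid block size works
        int blocks = (K + threadsPerBlock - 1) / threadsPerBlock;  // round up
        negMatGPU<<<blocks, threadsPerBlock>>>(xG, yG, K);
        err = cudaGetLastError();  // reports launch-configuration errors
    }
    if (err == cudaSuccess) {
        // This copy waits for the kernel to finish on the default stream.
        err = cudaMemcpy(xHost, xG, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(xG);
    cudaFree(yG);
    return err;
}
```

The row < K guard matters because the grid is rounded up to a whole number of blocks, so the last block may contain threads with no row to process. One plausible reading of "efficient" in the question is hoisting the row offset out of the inner loop, as done here. Be aware that with one thread per row, neighbouring threads access addresses K floats apart, so global-memory accesses are not coalesced; that cost comes from the thread layout the question mandates.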
