gpucoder.atomicCAS

Atomically compare and swap the value of a variable in global or shared memory

Since R2021b

Syntax

[A,oldA] = gpucoder.atomicCAS(A,B,C)

Description

[A,oldA] = gpucoder.atomicCAS(A,B,C) compares B to the value of A in global or shared memory and, if the values are equal, writes the value of C into A. The operation is atomic in the sense that the entire read-modify-write sequence is guaranteed to complete without interference from other threads. The order of the input and output arguments must match the syntax shown.
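The compare-and-swap semantics described above can be sketched on the host in plain C++. This is only an illustration, not GPU Coder output: the helper name casSketch is hypothetical, std::atomic stands in for GPU global or shared memory, and the sketch assumes that oldA corresponds to the value CUDA atomicCAS() returns, that is, the value stored before the operation.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper (not part of GPU Coder or CUDA): emulates
// compare-and-swap on the host with std::atomic.
// If a currently holds b, c is stored into a. The value that a held
// before the operation is returned in either case (assumed role of oldA).
inline std::uint32_t casSketch(std::atomic<std::uint32_t>& a,
                               std::uint32_t b,
                               std::uint32_t c)
{
    std::uint32_t expected = b;
    // On success, a becomes c and expected still equals b (the old value).
    // On failure, a is unchanged and expected is overwritten with the
    // value actually observed in a.
    a.compare_exchange_strong(expected, c);
    return expected;
}
```

For instance, with a initialized to 5, casSketch(a, 5, 9) returns 5 and leaves 9 in a, while a following casSketch(a, 5, 1) returns 9 and leaves a unchanged because the comparison fails.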


Examples


Compare and Swap Using CUDA atomicCAS

Perform a simple atomic compare-and-swap operation by using the gpucoder.atomicCAS function, and generate CUDA® code that calls the corresponding CUDA atomicCAS() APIs.

In one file, write an entry-point function myAtomicCAS that accepts the inputs a, b, and c.

function a = myAtomicCAS(a,b,c)

coder.gpu.kernelfun;
for i = 1:numel(a)
    [a(i),~] = gpucoder.atomicCAS(a(i), b, c);
end

end

To create types for the uint32 inputs for use in code generation, use the coder.newtype function. Here, A is a variable-size row vector with at most 30 elements, and B and C are scalars.

A = coder.newtype('uint32', [1 30], [0 1]);
B = coder.newtype('uint32', [1 1], [0 0]);
C = coder.newtype('uint32', [1 1], [0 0]);
inputArgs = {A,B,C};

To generate a CUDA library, use the codegen function.

cfg = coder.gpuConfig('lib');
cfg.GenerateReport = true;

codegen -config cfg -args inputArgs myAtomicCAS -d myAtomicCAS

The generated CUDA code contains the myAtomicCAS_kernel1 kernel with calls to the atomicCAS() CUDA APIs.

//
// File: myAtomicCAS.cu
//
...

static __global__ __launch_bounds__(1024, 1) void myAtomicCAS_kernel1(
    const uint32_T c, const uint32_T b, const int32_T i, uint32_T a_data[])
{
  uint64_T loopEnd;
  uint64_T threadId;

...

  for (uint64_T idx{threadId}; idx <= loopEnd; idx += threadStride) {
    int32_T b_i;
    b_i = static_cast<int32_T>(idx);
    atomicCAS(&a_data[b_i], b, c);
  }
}
...

void myAtomicCAS(uint32_T a_data[], int32_T a_size[2], uint32_T b, uint32_T c)
{
  dim3 block;
  dim3 grid;
...

  if (validLaunchParams) {
    cudaMemcpy(gpu_a_data, a_data, a_size[1] * sizeof(uint32_T),
               cudaMemcpyHostToDevice);
    myAtomicCAS_kernel1<<<grid, block>>>(c, b, i, gpu_a_data);
    cudaMemcpy(a_data, gpu_a_data, a_size[1] * sizeof(uint32_T),
               cudaMemcpyDeviceToHost);
...

}
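As a way to reason about what the example computes, the following is a sequential host-side reference model in plain C++. It is a sketch, not generated code: the function name casReference is hypothetical, and it only mirrors the end result of looping gpucoder.atomicCAS over a with scalar b and c (each element equal to b is replaced by c), without any of the atomicity that the GPU kernel provides.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical reference model (not generated by GPU Coder): sequential
// equivalent of applying compare-and-swap to every element of a with
// scalar b and c. Takes a by value and returns the updated copy.
inline std::vector<std::uint32_t> casReference(std::vector<std::uint32_t> a,
                                               std::uint32_t b,
                                               std::uint32_t c)
{
    for (auto& x : a) {
        if (x == b) {
            x = c;  // swap in c only where the comparison succeeds
        }
    }
    return a;
}
```

Comparing the output of the generated library against a model like this one is one possible way to check the result for a given input.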

Input Arguments


`A`, `B`, `C` — Operands
scalars | vectors | matrices | multidimensional arrays

Operands, specified as scalars, vectors, matrices, or multidimensional arrays. Inputs A, B, and C must satisfy the following requirements:

Have the same data type.
Have the same size or have sizes that are compatible. For example, A is an M-by-N matrix and B and C are scalars or 1-by-N row vectors.

Data Types: int32 | uint32 | uint64
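The size-compatibility rule above can be illustrated with a small host-side model in plain C++. This is a sketch under stated assumptions, not a description of the generated code: the function name casRowBroadcast is hypothetical, A is assumed to be stored in column-major order as in MATLAB, and 1-by-N row vectors b and c are assumed to expand along the rows of A, so that column j of A is compared with b[j] and conditionally replaced by c[j].

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model (not GPU Coder output): sequential compare-and-swap
// of an M-by-N column-major matrix a against 1-by-N row vectors b and c.
// Assumes a.size() == rows * b.size() and c.size() == b.size().
inline void casRowBroadcast(std::vector<std::uint32_t>& a,
                            std::size_t rows,
                            const std::vector<std::uint32_t>& b,
                            const std::vector<std::uint32_t>& c)
{
    const std::size_t cols = b.size();
    for (std::size_t col = 0; col < cols; ++col) {
        for (std::size_t row = 0; row < rows; ++row) {
            // Column-major indexing: element (row, col) of the matrix.
            std::uint32_t& x = a[row + col * rows];
            if (x == b[col]) {
                x = c[col];
            }
        }
    }
}
```

With a 2-by-3 matrix stored as {1, 2, 3, 4, 5, 6}, b = {1, 4, 9}, and c = {10, 40, 90}, this model yields {10, 2, 3, 40, 5, 6}: one match in each of the first two columns and none in the third.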

Version History

Introduced in R2021b
