workgroupArray
(Compute Shader Only) Declares a high-speed, shared memory array within a workgroup. It enables efficient data exchange among all threads in the same workgroup, forming the basis for high-performance parallel algorithms (e.g., reductions, convolutions).
Core Advantages
Provides access to fast, on-chip shared memory local to the workgroup. This inter-thread communication capability significantly reduces accesses to slower global memory and is key to implementing advanced GPGPU algorithms and performance optimizations.
Common Uses
Implementing parallel reduction algorithms (e.g., sum, max)
Serving as a pixel neighborhood cache in image processing (e.g., large kernel convolutions)
Caching data tiles in tiled matrix multiplication
How to adjust
Adjust by changing the `type` and `count` arguments at creation. Increasing `count` lets a single workgroup process more data, but is constrained by the hardware's shared-memory limit. Changing the `type` (e.g., from 'f32' to 'vec4') can improve efficiency through vectorized computation, but consumes proportionally more memory.
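As a minimal sketch of the two knobs described above (the element count 64 and the variable names are illustrative, not part of the API):

// A scalar tile: 64 'f32' values of shared memory per workgroup
const scalarTile = workgroupArray( 'f32', 64 );

// A vectorized variant: same element count, 4x the data per element,
// but also 4x the shared-memory footprint
const vectorTile = workgroupArray( 'vec4', 64 );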
Code Examples
// Declare the shared array (the size of 64 is illustrative)
const sharedData = workgroupArray( 'f32', 64 );

// Load data from global memory into the shared array
sharedData.element( localIndex ).assign( globalInput.element( globalIndex ) );

// Wait for all threads in the workgroup to finish loading
workgroupBarrier();

// Threads collaboratively process data in shared memory (e.g., perform one reduction step)
const myData = sharedData.element( localIndex );
const neighborData = sharedData.element( localIndex.add( stride ) );
myData.assign( myData.add( neighborData ) );

// Synchronize again before the next step reads the updated shared data
workgroupBarrier();
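To make the stride-halving logic of the reduction step concrete, here is a plain-JavaScript CPU analogue (a sketch, not the shader API): each "thread" i below stands in for one invocation, and the inner loop body corresponds to the single reduction step shown above, with a workgroupBarrier() implied between strides.

```javascript
// CPU sketch of a tree reduction over a power-of-two-sized array.
// At each step, thread i (for i < stride) adds the element `stride`
// positions away; the stride then halves until one value remains.
function reduceSum( data ) {
	const a = data.slice(); // stand-in for the workgroup's shared array

	for ( let stride = a.length / 2; stride >= 1; stride /= 2 ) {

		// In the shader, this inner loop runs in parallel (one iteration
		// per thread), separated from the next stride by workgroupBarrier().
		for ( let i = 0; i < stride; i ++ ) {

			a[ i ] += a[ i + stride ];

		}

	}

	return a[ 0 ]; // after the last step, "thread 0" holds the total

}
```

The same pattern generalizes to other reductions (max, min) by swapping the `+=` for the corresponding combine operation.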