subgroupInclusiveMul
Computes an inclusive prefix product (also known as an inclusive multiplicative scan) of e across the active invocations in the hardware subgroup. Each invocation receives the product of e over all active invocations up to and including itself.
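To make the semantics concrete, here is a minimal CPU-side sketch in plain JavaScript. It is not the GPU implementation; `inclusiveMulReference`, `values`, and `active` are hypothetical names used only for illustration, and leaving the result of inactive lanes as `undefined` is a modeling assumption.

```javascript
// CPU reference for an inclusive prefix product over one subgroup.
// `values` holds one input per lane; `active` marks which lanes participate.
// Both names are hypothetical and only illustrate the semantics.
function inclusiveMulReference(values, active) {
  const result = new Array(values.length).fill(undefined);
  let running = 1; // multiplicative identity
  for (let lane = 0; lane < values.length; lane++) {
    if (!active[lane]) continue; // inactive lanes neither contribute nor receive a result
    running *= values[lane];
    result[lane] = running; // includes the current lane's own value
  }
  return result;
}

const laneValues = [2, 3, 4, 5];
console.log(inclusiveMulReference(laneValues, [true, true, true, true]));  // [ 2, 6, 24, 120 ]
console.log(inclusiveMulReference(laneValues, [true, false, true, true])); // [ 2, undefined, 8, 40 ]
```

The second call shows how an inactive lane is skipped: lane 2 receives 2 * 4 = 8 rather than 24.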
Core Advantages
The scan is a single subgroup operation. It needs no shared-memory loop or barriers, which reduces latency and synchronization overhead.
Common Uses
Prefix scan of multiplicative quantities such as transmittance/attenuation, probabilities, or gains.
Building fast prefix results for multiplicative segment processing or hierarchical weights.
Convert a boolean mask to 0/1, then scan by product to test whether the current lane and all prior lanes are true.
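The boolean-mask use above can be sketched on the CPU as follows. This is an illustrative emulation in plain JavaScript, not shader code, and `allTrueSoFar` is a hypothetical helper name. Because the scan is inclusive, each lane's own flag is part of its result; testing strictly earlier lanes only would call for an exclusive scan instead.

```javascript
// Returns, per lane, 1 if this lane and every earlier lane passed the test, else 0.
// Emulates an inclusive prefix product over a mask converted to 0/1.
function allTrueSoFar(mask) {
  const out = [];
  let running = 1;
  for (const flag of mask) {
    running *= flag ? 1 : 0; // a single false zeroes every later result
    out.push(running);
  }
  return out;
}

console.log(allTrueSoFar([true, true, false, true])); // [ 1, 1, 0, 0 ]
```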
How to adjust
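The input e is the only adjustable parameter. Scale or normalize e to keep the running product in a stable range. To reduce the risk of overflow, the product can be computed in the log domain: exp( subgroupInclusiveAdd( log( e + 1e-8 ) ) ). This form requires e to be non-negative, since the logarithm of a negative value is undefined. The extent of the scan follows the active mask and the hardware subgroup size.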
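The log-domain form can be checked against the direct product with a small CPU-side sketch. The helper names `prefixProductDirect` and `prefixProductLogDomain` are hypothetical, and the loops stand in for the subgroup scans; the sketch assumes all inputs are non-negative.

```javascript
// Direct inclusive prefix product.
function prefixProductDirect(values) {
  const out = [];
  let running = 1;
  for (const v of values) {
    running *= v;
    out.push(running);
  }
  return out;
}

// Same scan in the log domain: an inclusive prefix sum of log(v + epsilon),
// followed by exp. Assumes v >= 0; epsilon guards against log(0).
function prefixProductLogDomain(values, epsilon = 1e-8) {
  const out = [];
  let logSum = 0;
  for (const v of values) {
    logSum += Math.log(v + epsilon);
    out.push(Math.exp(logSum));
  }
  return out;
}

const gains = [0.5, 0.25, 0.8];
console.log(prefixProductDirect(gains));
console.log(prefixProductLogDomain(gains)); // close to the direct result, offset slightly by epsilon
```

The two results agree up to the small bias introduced by epsilon. If the final product itself is outside the representable range, the closing exp will still overflow or underflow, so it may be preferable to keep later computations in log space where possible.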
Code Examples
// Inclusive multiplicative prefix scan within a subgroup
const x = bufferIn.element( globalId.x ).toFloat();
const prefix = subgroupInclusiveMul( x );
bufferOut.element( globalId.x ).assign( prefix );

// Note: globalId is the global invocation ID of the current compute thread