Parallel Acceleration of Reduce

CUDA

1. Implement the reduction using the GPU's support for thread divergence together with block-level synchronization.

2. Further in-depth optimizations: https://developer.download.nvidia.com/assets/cuda/files/reduction.pdf
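
The first point above can be illustrated with a minimal sketch of a sum reduction. It is not taken from the original note: the names `reduce_block` and `reduce_sum`, the block size of 256, and the choice to add the per-block partial sums on the host are all assumptions made for illustration. Each block loads its slice into shared memory, then halves the number of active threads at every step. The `if` inside the loop is a divergent branch, and `__syncthreads()` is the block-level synchronization that keeps each step's reads behind the previous step's writes.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each block reduces blockDim.x elements to one partial sum.
// Assumes blockDim.x is a power of two.
__global__ void reduce_block(const float* in, float* out, unsigned int n) {
    extern __shared__ float sdata[];
    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Load one element per thread; pad out-of-range threads with 0.
    sdata[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();

    // Tree reduction in shared memory with interleaved addressing.
    for (unsigned int s = 1; s < blockDim.x; s *= 2) {
        if (tid % (2 * s) == 0) {        // divergent branch: only some threads add
            sdata[tid] += sdata[tid + s];
        }
        __syncthreads();                 // block-wide barrier between steps
    }

    // Thread 0 holds the block's sum.
    if (tid == 0) out[blockIdx.x] = sdata[0];
}

// Hypothetical host wrapper: launches the kernel and adds the partial sums on the CPU.
// Error checking of the CUDA calls is omitted to keep the sketch short.
float reduce_sum(const std::vector<float>& h_in) {
    const unsigned int n = static_cast<unsigned int>(h_in.size());
    if (n == 0) return 0.0f;

    const unsigned int threads = 256;    // assumed block size (power of two)
    const unsigned int blocks = (n + threads - 1) / threads;

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, blocks * sizeof(float));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    reduce_block<<<blocks, threads, threads * sizeof(float)>>>(d_in, d_out, n);

    // This blocking copy on the default stream also waits for the kernel to finish.
    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    float total = 0.0f;
    for (float p : partial) total += p;
    return total;
}
```

Calling `reduce_sum` on a `std::vector<float>` returns the sum of its elements. The modulo test keeps the code easy to read but leaves most threads in a warp idle after the first steps, which is the kind of inefficiency the optimizations in the linked slides address.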

DSA/ASIC