GPU computing with CUDA has advanced considerably in recent years, particularly in high-performance computing and in asymmetric cryptographic applications. Most of the existing work is based on the coarse-grained method, in which each thread within a thread block carries out a complete task on its own. In this paper, we develop a fine-grained parallel approach for Montgomery multiplication that differs substantially from previous work: all the threads within a GPU thread block cooperate on a single complete task. Experiments show that the approach performs better when the number of tasks is small and performs roughly equally well in other cases. It also achieves a good speedup over a CPU-based implementation. The idea can also be adopted in many other acceleration applications.
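To make the operation concrete, the following is a minimal sequential sketch of word-level Montgomery multiplication. It uses the well-known CIOS (coarsely integrated operand scanning) layout, which is an assumption here: the abstract does not say which variant the paper uses. The 32-bit limb width `W` and the helper names `mont_params`, `to_limbs`, `from_limbs`, and `mont_mul` are hypothetical. The inner loops over `j` touch one limb per iteration; that per-limb work is the kind a fine-grained scheme could spread across the threads of a block, though carry propagation would then need coordination between threads.

```python
W = 32                    # limb width in bits (assumed for illustration)
MASK = (1 << W) - 1

def to_limbs(x, s):
    """Split integer x into s little-endian W-bit limbs."""
    return [(x >> (W * i)) & MASK for i in range(s)]

def from_limbs(limbs):
    """Recombine little-endian limbs into an integer."""
    return sum(l << (W * i) for i, l in enumerate(limbs))

def mont_params(n):
    """Return (s, n0_inv) for an odd modulus n, where R = 2**(W*s)
    and n0_inv = -n^{-1} mod 2**W. Requires Python 3.8+ for pow(n, -1, m)."""
    s = (n.bit_length() + W - 1) // W
    n0_inv = (-pow(n, -1, 1 << W)) & MASK
    return s, n0_inv

def mont_mul(a, b, n, n0_inv, s):
    """CIOS Montgomery product: a * b * R^{-1} mod n.
    a, b, n are limb lists of length s, with a, b < n and n odd."""
    t = [0] * (s + 2)
    for i in range(s):
        # Multiplication step: t += a * b[i], one limb per iteration.
        carry = 0
        for j in range(s):
            cur = t[j] + a[j] * b[i] + carry
            t[j] = cur & MASK
            carry = cur >> W
        cur = t[s] + carry
        t[s] = cur & MASK
        t[s + 1] = cur >> W
        # Reduction step: pick m so that t + m*n is divisible by 2**W,
        # then add m*n and shift t down by one limb.
        m = (t[0] * n0_inv) & MASK
        carry = (t[0] + m * n[0]) >> W
        for j in range(1, s):
            cur = t[j] + m * n[j] + carry
            t[j - 1] = cur & MASK
            carry = cur >> W
        cur = t[s] + carry
        t[s - 1] = cur & MASK
        t[s] = t[s + 1] + (cur >> W)
    # The intermediate result is below 2n, so one conditional subtraction suffices.
    result = from_limbs(t[:s + 1])
    modulus = from_limbs(n)
    return result - modulus if result >= modulus else result
```

For an odd modulus `n` and operands `a`, `b` below `n`, a caller would compute `s, n0_inv = mont_params(n)` once and then call `mont_mul(to_limbs(a, s), to_limbs(b, s), to_limbs(n, s), n0_inv, s)`. In a coarse-grained GPU mapping, each thread would run this whole routine for its own operand pair; the fine-grained idea described above instead has the threads of one block share a single call.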
- A GPU-Based Fine-Grained Parallel Montgomery Multiplication Algorithm
- Springer Berlin Heidelberg