Low-Cost High-Performance VLSI Architecture for Montgomery Modular Multiplication

Abstract:

This paper proposes a simple and efficient Montgomery multiplication algorithm such that a low-cost and high-performance Montgomery modular multiplier can be implemented accordingly. The proposed multiplier receives and outputs the data in binary representation and uses only a one-level carry-save adder (CSA) to avoid carry propagation at each addition operation. This CSA is also used to perform operand pre-computation and format conversion from the carry-save format to the binary representation, leading to a low hardware cost and a short critical path delay at the expense of extra clock cycles for completing one modular multiplication. To overcome this weakness, a configurable CSA (CCSA), which can be one full-adder or two serial half-adders, is proposed to reduce the extra clock cycles for operand pre-computation and format conversion by half. In addition, a mechanism is developed that can detect and skip the unnecessary carry-save addition operations in the one-level CCSA architecture while maintaining the short critical path delay. As a result, the extra clock cycles for operand pre-computation and format conversion can be hidden and high throughput can be obtained. Experimental results show that the proposed Montgomery modular multiplier can achieve higher performance and significant area–time product improvement when compared with previous designs.

In this project, the RTL is designed in VHDL, and the results, including power consumption and area reduction, are shown in Xilinx 14.2.
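To make the data flow described in the abstract concrete, the following is a minimal software sketch of radix-2 Montgomery multiplication with a one-level carry-save accumulator. It is an illustration under stated assumptions, not the proposed architecture: the function names csa and montgomery_multiply are hypothetical, the operand pre-computation (b + n) and the final format conversion (ss + sc) use ordinary integer addition rather than reusing the CSA, and neither the CCSA nor the skipping mechanism is modeled.

```python
def csa(x, y, z):
    """One-level carry-save adder: returns (s, c) with s + c == x + y + z."""
    s = x ^ y ^ z                                  # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1         # carries, moved one bit left
    return s, c


def montgomery_multiply(a, b, n):
    """Return a * b * 2^(-k) mod n, with k = n.bit_length() and n odd.

    Illustrative sketch only; names and structure are assumptions.
    """
    if n % 2 == 0 or not (0 <= a < n and 0 <= b < n):
        raise ValueError("need odd n and 0 <= a, b < n")
    k = n.bit_length()
    d = b + n                  # operand pre-computation (plain addition here)
    ss, sc = 0, 0              # accumulator kept in carry-save form
    for i in range(k):
        a_i = (a >> i) & 1
        # q makes (ss + sc + a_i*b + q*n) even, so the halving below is exact
        q = (ss ^ sc ^ (a_i & b)) & 1
        # select one of 0, n, b, b + n so a single CSA level suffices
        if a_i:
            x = d if q else b
        else:
            x = n if q else 0
        ss, sc = csa(ss, sc, x)
        ss >>= 1               # both words have a zero LSB at this point
        sc >>= 1
    s = ss + sc                # format conversion back to binary
    return s - n if s >= n else s
```

For a modulus n of k bits, the returned value r satisfies (r * 2^k) mod n = (a * b) mod n, which is the usual Montgomery result with R = 2^k. In hardware, the three plain additions noted above are the steps the abstract says cost extra clock cycles when performed on the same one-level CSA.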

Enhancement of the project:

Increase the bit width of the data operands, or use a different adder for the addition operation.