ABSTRACT: This paper presents an efficient design and implementation of ECC Encryption System using Encoded Multiplier. ECC algorithm is implemented based on ancient Indian Vedic Mathematics. The speed of the system mainly depends on multipliers and adders. To improve the speed of the system, the multiplier architecture is modified using a new encoded algorithm. Using this algorithm number of partial products in the multiplier architecture is reduced to half and thus it speeds up the operation. Effectively no multipliers are required and number of adders required is reduced drastically. The most significant aspect of this paper is the development of encoded architecture and embedding it in Point Multiplication circuitry of ECC algorithm. The coding is done in Verilog HDL and FPGA implementation using Xilinx Spartan 6 library.

Keywords: ECC, Encryption, Decryption, Vedic Mathematics, Encoder, Cryptography, Point Addition, Point Doubling, Scalar Multiplication.

INTRODUCTION

Cryptography is a technique for making the message secure. Sensitive information can be stored or transmitted across insecure network so that unauthorized persons cannot access it. Cryptography is implemented by means of various encryption algorithms. These encryption algorithms are classified into symmetric key algorithms (private key) and asymmetric key algorithms (public key). Private key cryptography uses one key shared by both sender and receiver. Public key cryptography uses two keys one for encryption and one for decryption.

In public key algorithms use of elliptic curves was proposed by Victor Miller and Neal Koblitz in 1980[2]. Its small key sizes make it a most powerful algorithm for cryptography while comparing to standard algorithms such as RSA. It is commonly used in security protocols such as IP data security, transport data security, email security, terminal connection security, conferencing service security, etc. The smaller key size reduces the power consumption and increases the speed of the cryptographic system. The major time consuming arithmetic operations in ECC are point addition and point doubling.

In encryption systems and most digital signal processing applications such as FFT and convolution multipliers play fundamental role in its computation. Implementing high speed systems with low power consumption and time delay mainly depends on multiplier execution time. Comparing to conventional multiplication Vedic method of multiplication requires very less number of operations resulting in a faster and high performance multiplier.

This paper describes a new multiplier architecture using encoder that requires half the number of operations than Vedic Multipliers. ECC algorithm using Vedic mathematics and encoder multiplier architecture makes the encryption system more efficient.
II. MULTIPLIER USING ENCODER

Multiplier is the core component for most applications such as digital signal processing, encryption and decryption algorithm in cryptography and in other logical computations. Performance of a system mainly depends on the speed of the multiplier. Array multipliers and booth multipliers are the most common multipliers used in digital hardware. To improve the speed and power consumption of multipliers, many studies have taken place and are going on.

Vedic method of multiplication using ancient Indian Mathematics is much simpler and easier to understand. Swami Bharati Krishna Tirthaji Maharaj reintroduced Vedic mathematics into the world. According to his research all mathematical operations are based on sixteen sutras and thirteen sub-sutras. It is used in all arithmetical operations such as multiplication, squaring, cubing, quadratic equations, etc. Vedic multipliers [6] designed using Urdhvatiryakbhyam sutra and Nikhilam sutras are some of the fastest multipliers. Vedic multiplier using Urdhvatiryakbhyam sutra and array multipliers have almost the same architecture. For eight bit multiplication the number of partial products required for both is eight. So large number of adder circuits are required to find the final product. The speed of a multiplier can be improved by either using fast addition algorithms or by reducing the number of partial products.

Multiplier using encoded algorithm [1] illustrates a new architecture for multiplication. The number of partial products generated by using this algorithm is half compared to Vedic and conventional methods. For eight bit multiplier the number of partial products will be four. The number of adder circuits required will be drastically reduced. This results in having multipliers with minimal adder circuits and faster adder algorithms.

ENCODING TECHNIQUE: In this technique the multiplier bits are grouped in a combination of 2-2 bits starting from LSB. The grouping of bits is shown in Fig 1[1]. Each group is given to the encoder circuit. The output from the encoder is based on the encoder table shown in Table 1[1].

![Fig. 1 Grouping bits using Encoder](image)

<table>
<thead>
<tr>
<th>( b_{F1} )</th>
<th>( b_i )</th>
<th>( A_i )</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>2</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>3</td>
</tr>
</tbody>
</table>

Table 1 Encoder output

ENCODING ALGORITHM:
1. If \( A_i \) is 0 then partial product \( P_i \) is 0.
2. If \( A_i \) is 1 then partial product \( P_i \) is the multiplicand.
3. If \( A_i \) is 2 then partial product \( P_i \) is obtained by shifting the multiplicand one bit left.
4. If \( A_i \) is 3 then partial product \( P_i \) is the sum of partial products of \( A_i \) for \( i = 1 \) and 2.
ENCODING STEPS:
1. Group the multiplier into 2 bits each starting from the LSB.
2. Find the value of $A_i$ from the encoder table.
3. Find the partial products based on the value of $A_i$.
4. Partial products are given to the adder circuit with a shift of one bit, two bit, four bit, six bit one by one.
5. The adder circuit gives the final product.

For example 10101001 x 11001001 works as follows

Group the multiplier bits starting from LSB

<table>
<thead>
<tr>
<th>Multiplier bits</th>
<th>11</th>
<th>00</th>
<th>10</th>
<th>01</th>
</tr>
</thead>
<tbody>
<tr>
<td>$A_i$</td>
<td>3</td>
<td>0</td>
<td>2</td>
<td>1</td>
</tr>
</tbody>
</table>

From the algorithm the partial products are

<table>
<thead>
<tr>
<th>$A_i$</th>
<th>Partial Products</th>
</tr>
</thead>
<tbody>
<tr>
<td>01</td>
<td>10101001</td>
</tr>
<tr>
<td>10</td>
<td>101010010</td>
</tr>
<tr>
<td>00</td>
<td>00000000</td>
</tr>
<tr>
<td>11</td>
<td>111111011</td>
</tr>
</tbody>
</table>

Shifting the partial products by 2, 4 and 6 and given to the adder circuit to get the final product. The way multiplication is performed is shown below. Here the partial products are written one below the other after the required shifting.

\[
\begin{array}{cccccccccc}
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\hline
1 & 1 & 1 & 1 & 1 & 1 & 0 & 1 & 1 \\
\hline
1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 1
\end{array}
\]

ENCODED MULTIPLIER ARCHITECTURE:

Fig. 2 shows the block diagram of the multiplier architecture. The input to the encoder circuit is the multiplicand and the multiplier. Partial products generated from the encoded circuit is given to shift register and then to the adder circuit. Adder circuits are optimized by using carry save adders. Fig. 3 shows the block diagram of the proposed adder circuit for eight bit multiplication.
COMPARISON WITH VEDIC MULTIPLICATION ALGORITHM: Vedic multiplication involves both multiplication and addition. In encoded multiplier architecture multiplier circuit is not used. Partial products are generated directly from the proposed algorithm. The comparison result is shown in Table 2. It shows a large reduction in hardware structure. So power consumption, cost and delay are less.

<table>
<thead>
<tr>
<th>Bit size</th>
<th>Vedic Multiplier</th>
<th>Encoded Multiplier</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>M</td>
<td>A</td>
</tr>
<tr>
<td>8</td>
<td>64</td>
<td>97</td>
</tr>
<tr>
<td>16</td>
<td>256</td>
<td>413</td>
</tr>
</tbody>
</table>

Table 2 Comparison of Encoded Multiplier using Vedic Multiplier

III. ELLIPTIC CURVE ARITHMETIC

An Elliptic Curve is defined as an equation having set of solutions along with the point at infinity. The elliptic curve equations $y^2 + xy = x^3 + ax + b$ in GF(2m) and $y^2 = x^3 + ax + b$ in GM(p) are called Weierstrass equations [4]. Variables and coefficients are chosen from a large finite field. These points form a group. The group operations for elliptic curve cryptography are point multiplication, point addition and point doubling. Point multiplication means multiplying a point $P(x,y,z)$ with a scalar value $k$, i.e, $Q = kP$. The security of ECC in cryptography is based on finding the value of $k$ if $P$ and $Q$ are given. This is called Elliptic Curve Discrete Logarithmic Problem. For large values of $k$ it is computationally infeasible. Point Multiplication is the main operation in ECC [3].

POINT DOUBLING IN PROJECTIVE COORDINATES: To double a point $P$, i.e. $Q = 2P$

Let $2(X_1, Y_1, Z_1) = (X_3, Y_3, Z_3)$ then

$Z_3 = X_1^2$ \ $Z_1^2$

$X_3 = X_1^4 + b \ Z_1^4$

$Y_3 = b \ Z_1^4 \ Z_3 + X_3 \ (a \ Z_3 + Y_1^2 + b \ Z_1^4)$

POINT ADDITION IN PROJECTIVE COORDINATES: To add two points in a curve, i.e $L = J + K$

Let $(X_1, Y_1, Z_1) + (X_2, Y_2, 1) = (X_3, Y_3, Z_3)$ then

$A = Y_2 \ Z_1^{-1} + Y_1$

$B = X_2 \ Z_1 + X_1$

$C = Z_1 \ B$

$Z_1^2 = C^2$

$D = B^2 \ (C + a \ Z_1^7)$

$E = A \ C$

$X_3 = A^3 + D + E$
DOUBLE AND ADD ALGORITHM FOR POINT MULTIPLICATION:
1. The base point on the curve P(x,y) is given as the input.
2. Scalar value k = (k_{m-1}, k_{m-2}, …k_0).
3. Another point on the curve Q = kP will be the output

Algorithm to compute Q = kP

\[ Q = P \]

\[
\text{for } i = m-2 \text{ down to } 0 \\
\text{do } Q = 2.Q \\
\text{if } k_i = 1 \\
\text{do } Q = Q + P \\
\text{end }
\]

From this algorithm we can see implementing point addition and point doubling for point multiplication requires multiplication operation and squaring.

IV. SQUARING USING VEDIC MATHEMATICS

For squaring, a dedicated architecture can improve its performance than using multiplier architecture. Using Duplex D property of binary numbers from the sutra Dwandwayoga of Vedic Mathematics algorithm for squaring is implemented [4].

1. Duplex of a number is twice that number, Duplex of a is \( a^2 \)
2. Duplex of two numbers is multiplying two with the product of that number, Duplex of \( ab \) is \( 2\times a \times b \)
3. Duplex of three numbers is multiplying the product of the outer most pair with two plus the square of the middle number, Duplex of \( abc \) is \( 2\times a \times c + b^2 \)

Example: \( 12431^2 = 154529761 \)

<table>
<thead>
<tr>
<th>No</th>
<th>Duplex</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1*1 = 1</td>
</tr>
<tr>
<td>12</td>
<td>2<em>1</em>2 = 4</td>
</tr>
<tr>
<td>124</td>
<td>2<em>1</em>4 + 2*2 = 12</td>
</tr>
<tr>
<td>1243</td>
<td>2<em>1</em>3 + 2<em>2</em>4 = 22</td>
</tr>
<tr>
<td>12431</td>
<td>2<em>1</em>1 + 2<em>2</em>3 + 4*4 = 30</td>
</tr>
<tr>
<td>2431</td>
<td>2<em>2</em>1 + 2<em>4</em>3 = 28</td>
</tr>
<tr>
<td>431</td>
<td>2<em>4</em>1 + 3*3 = 17</td>
</tr>
<tr>
<td>31</td>
<td>2<em>3</em>1 = 6</td>
</tr>
<tr>
<td>1</td>
<td>1*1 = 1</td>
</tr>
</tbody>
</table>

\( 1/4/12/22/30/28/17/6/1 \) can be written in the form shown below and finally find their sum

\[
\begin{array}{ccc|ccc|cc}
1 & 4 & 2 & 2 & 0 & 8 & 7 & 6 & 1 & + \\
0 & 1 & 2 & 3 & 2 & 1 & 0 \\
\hline
1 & 5 & 4 & 5 & 2 & 9 & 7 & 6 & 1
\end{array}
\]

V. IMPLEMENTATION RESULTS

Point Addition, Point doubling and Scalar Multiplication are done in Verilog. The code is synthesized using Xilinx 12.1 to verify the functionality. Simulation results for Point Doubling, Point Addition and Scalar Multiplication is shown in Fig 4, 5 and 6.
Comparative study of point addition and point doubling using different multipliers such as Vedic, Booth and Array are done. Fig.7 shows the delay comparison of point addition and point doubling using different multipliers. Fig.8 shows the comparison results of occupied slice LUTs.
VI. CONCLUSION

The aim of this project is the development of high performance ECC encryption system. The proposed multiplier architecture achieves a significant improvement in performance. The encoded multiplier having only shift registers and adder circuits thus reduces the complexity, cost, power consumption and delay. The Point Addition using various multipliers was taken for comparison and the implementation using encoder multiplier was found to be 2.06 times faster than Vedic multiplier, 2.54 times faster than Booth multiplier and 5.59 times faster than Array multiplier. While comparing the Point Doubling the implementation using encoder multiplier was found to be 1.25 times faster than Vedic multiplier, 1.84 times faster than Booth multiplier and 2.02 times faster than Array multiplier. The usage of slice LUTs is also found to be more efficient with encoder multiplier. In Point Addition encoder multiplier uses 219% lesser slice LUTs compared to Vedic multiplier, 556% lesser slice LUTs compared to Booth multiplier and 69% lesser slice LUTs compared to Array multiplier. In Point Doubling encoder multiplier uses 11% lesser slice LUTs compared to Vedic multiplier, 542% lesser slice LUTs compared to Booth multiplier and 550% lesser slice LUTs compared to Array multiplier.

REFERENCES


Copyright to IJAREEIE www.ijareeie.com 5537


