An Efficient Power Reduction in Multiplexer Based On Cordic Using Cadence-Digital IC Design

Uma.P, G.AnuVidhya

ME Student, Dept. of VLSI Design, Karpaga Vinayaga college of Engineering and Technology, Madurantakam, India
Assistant Professor, Dept. of ECE, Karpaga Vinayaga college of Engineering and Technology, Madurantakam, India

ABSTRACT: CORDIC is an iterative algorithm that performs a wide range of functions, including vector rotations and certain trigonometric, hyperbolic, linear and logarithmic functions. Both non-pipelined and 2-level pipelined CORDIC designs with 8 stages were implemented using two schemes. The first scheme is the original unrolled CORDIC and the second is the MUX based pipelined unrolled CORDIC. Compared to the first scheme, the second scheme is more reliable, since it uses multiplexers and registers. Adding multiplexers reduces the area relative to the first architecture, which uses only addition, subtraction and shifting operations in all 8 stages. Eight iterations are performed and the design is implemented in QUARTUS II software. The same design is implemented in the CADENCE tool, and the QUARTUS II and CADENCE results are compared. An efficient power reduction is obtained in the CADENCE (Digital) implementation.

KEYWORDS: CORDIC, rotation mode, multiplexer, pipelining, QUARTUS II, CADENCE (Digital) implementation.

I. INTRODUCTION

CORDIC is a class of hardware-efficient algorithms for computing trigonometric and other transcendental functions using only shifts and adds. The CORDIC set of algorithms for the computation of trigonometric functions was developed by Jack E. Volder in 1959 to help in building a real-time navigational system for the B-58 supersonic bomber. Later, in 1971, J. Walther extended the CORDIC scheme to other transcendental functions.

Calculators can only perform four operations inexpensively:
1. Addition and Subtraction
2. Storing in memory and Retrieving from memory
3. Digit shift (multiplication/division by the base)
4. Comparisons

The CORDIC algorithm is a unified computational scheme to perform
1. Computation of the trigonometric functions: sin, cos and arctan.
2. Computation of the hyperbolic trigonometric functions: sinh, cosh and arctanh.
3. Computation of the exponential function, the natural logarithm and the square root.
4. Multiplication and division.

Both non-pipelined and 2-level pipelined CORDIC designs with 8 stages were implemented using two schemes. The first scheme uses adders in all the stages; the second scheme uses multiplexers only in the second and third stages, with the other stages the same as in the first scheme.

The second scheme achieves less area than the original unrolled CORDIC (first scheme). It is implemented in QUARTUS II. The multiplexer approach has been proposed for the ASIC implementation of the unrolled CORDIC (Coordinate Rotation Digital Computer) processor.

The CORDIC algorithm is an iterative method of performing vector rotations by arbitrary angles using shifts and additions. In the rotation mode, CORDIC may be used for converting a vector in polar form to rectangular form. In the vector mode, it converts a vector in rectangular form to polar form. Both modes are derived from the general rotation transform.

\[
X_{\text{fin}} = X_{\text{in}} \cos \theta - Y_{\text{in}} \sin \theta \quad (1)
\]

\[
Y_{\text{fin}} = X_{\text{in}} \sin \theta + Y_{\text{in}} \cos \theta \quad (2)
\]

The rotation may be achieved by performing a series of successively smaller elementary rotations \(\theta_1, \theta_2, \theta_3, \ldots, \theta_N\). Rotation of the vector by one elementary angle \(\theta_i\) can be written as

\[
X_{i+1} = X_i \cos \theta_i - Y_i \sin \theta_i \quad (3)
\]

\[
Y_{i+1} = X_i \sin \theta_i + Y_i \cos \theta_i \quad (4)
\]

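Factoring out \(\cos \theta_i\) gives

\[
X_{i+1} = \cos \theta_i \left( X_i - Y_i \tan \theta_i \right) \quad (5)
\]

\[
Y_{i+1} = \cos \theta_i \left( Y_i + X_i \tan \theta_i \right) \quad (6)
\]

The computational complexity of (5), (6) can be reduced by restricting the elementary angles so that \(\tan \theta_i = \delta_i 2^{-i}\) with \(\delta_i = \pm 1\), and by deferring the \(\cos \theta_i\) factors to a single scaling at the end. The equations can then be rewritten as

\[
X_{i+1} = X_i - \delta_i Y_i 2^{-i} \quad (7)
\]

\[
Y_{i+1} = Y_i + \delta_i X_i 2^{-i} \quad (8)
\]

\[
(X_{\text{fin}}, Y_{\text{fin}}) = (X_N, Y_N) \prod_{i=0}^{N} \cos \theta_i \quad (9)
\]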

\[
Z_{i+1} = Z_i - \delta_i \tan^{-1} 2^{-i} \quad (10)
\]

\[
k = \prod_{i=0}^{N} \cos \theta_i \quad (11)
\]

\(\theta_i\) is considered to be positive when the required rotation is anticlockwise and negative otherwise. The direction of the rotation is given by \(\delta_i\).

\[
\delta_i = \text{sgn}(Z_i) \quad (11)
\]
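The derivation above can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: it applies the shift-add recurrences x ← x − δ·y·2^(−i), y ← y + δ·x·2^(−i), z ← z − δ·tan⁻¹(2^(−i)) with δ = sgn(z), then multiplies by the product of cos θ_i. The function name `cordic_rotate` and the default of 8 iterations (i = 0 to 7) are assumptions made for illustration, and floating point is used only to keep the sketch short.

```python
import math

def cordic_rotate(x, y, theta, n_iter=8):
    """Rotate (x, y) by theta radians using n_iter shift-add micro-rotations.

    Hypothetical helper for illustration; floating point stands in for
    the fixed-point datapath of a hardware implementation.
    """
    z = theta
    for i in range(n_iter):
        delta = 1.0 if z >= 0 else -1.0          # delta_i = sgn(Z_i)
        # Both right-hand sides use the old x and y (tuple assignment).
        x, y = x - delta * y * 2.0**-i, y + delta * x * 2.0**-i
        z -= delta * math.atan(2.0**-i)          # residual angle
    # Scale factor: product of cos(theta_i) over the iterations performed.
    k = math.prod(math.cos(math.atan(2.0**-i)) for i in range(n_iter))
    return k * x, k * y, z

if __name__ == "__main__":
    xr, yr, residual = cordic_rotate(1.0, 0.0, math.radians(30))
    print(xr, yr, residual)
```

With 8 iterations the residual angle is bounded by tan⁻¹(2⁻⁷), so the rotated vector agrees with equations (1) and (2) to roughly two decimal places for input angles inside the convergence range (about ±99°).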

II. RELATED WORK

In rotation mode, CORDIC can simultaneously compute the sine and cosine of the input angle. In this mode, the y component of the input vector is set to zero, the x component is set to the scale factor k defined in (15), so that the accumulated gain of 1/k is cancelled, and the angle accumulator is initialized with the desired rotation angle \(\theta\). For rotation mode, the CORDIC equations are given by

\[
X_{i+1} = X_i - Y_i \delta_i 2^{-i} \quad (12)
\]

\[
Y_{i+1} = Y_i + X_i \delta_i 2^{-i} \quad (13)
\]

\[
Z_{i+1} = Z_i - \delta_i \tan^{-1} 2^{-i} \quad (14)
\]

\[
k = \prod_{i=0}^{N} \cos \theta_i \quad (15)
\]

Fig 1. The Unrolled CORDIC

The architecture of the eight-stage unrolled CORDIC is shown in Fig 1. It consists only of adders, subtractors and shifters, and accuracy improves as the number of stages increases. In each rotation of the vector, an addition or a subtraction is applied to the angle value, depending on the most significant bit (sign bit) of the previous angle. Division by a power of two is performed by a right shift using shift registers.

The scheme for reducing the area of the CORDIC using multiplexers was proposed for ASIC implementation. Here it is adopted for the QUARTUS II based implementation. The area is reduced by removing some of the stages.
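The eight-stage datapath described above can be sketched in software with integer arithmetic only. This is an illustrative Python model under stated assumptions, not the authors' RTL: the 14-bit fraction width (`FRAC_BITS`), the function name `unrolled_cordic` and the arctangent table are hypothetical choices, and x is initialized to the product of cos θ_i so that the outputs come out normalized.

```python
import math

FRAC_BITS = 14                      # assumed fixed-point fraction width
N_STAGES = 8                        # eight unrolled stages, i = 0..7
ONE = 1 << FRAC_BITS

# Per-stage rotation angles tan^-1(2^-i), quantized to fixed point.
ATAN_TABLE = [round(math.atan(2.0**-i) * ONE) for i in range(N_STAGES)]
# Scale factor (product of cos(theta_i)), quantized to fixed point.
K = round(math.prod(math.cos(math.atan(2.0**-i)) for i in range(N_STAGES)) * ONE)

def unrolled_cordic(theta_rad):
    """Return (cos, sin) of theta_rad using only adds, subtracts and shifts."""
    x, y, z = K, 0, round(theta_rad * ONE)
    for i in range(N_STAGES):
        if z >= 0:                  # sign bit of the previous angle is clear
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:                       # sign bit set: rotate the other way
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return x / ONE, y / ONE         # convert back to real numbers for display

if __name__ == "__main__":
    print(unrolled_cordic(math.radians(30)))
    print(unrolled_cordic(math.radians(45)))
```

Python's `>>` on integers is an arithmetic shift, which matches the sign-preserving right shift a hardware shifter would apply to two's complement values.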

Here \(X_i\) denotes the initial x input of the first stage, whose initial y input is zero. The first stage outputs are
\[ Y_1 = X_i \quad (16) \]
\[ X_1 = X_i \quad (17) \]
If the first stage output is positive, then
\[ Y_2 = Y_1 - \frac{X_1}{2} = \frac{X_i}{2} \quad (18) \]
\[ X_2 = X_1 + \frac{Y_1}{2} = \frac{3X_i}{2} \quad (19) \]

The vector coordinates corresponding to a negative output are
\[ Y_2 = Y_1 + \frac{X_1}{2} = \frac{3X_i}{2} \quad (20) \]
\[ X_2 = X_1 - \frac{Y_1}{2} = \frac{X_i}{2} \quad (21) \]

For \( sgn_1 = 0, sgn_2 = 0 \)
\[ Y_3 = Y_2 + \frac{X_2}{4} = \frac{3X_i}{2} + \frac{X_i}{8} = \frac{13X_i}{8} \quad (22) \]
\[ X_3 = X_2 - \frac{Y_2}{4} = \frac{X_i}{2} - \frac{3X_i}{8} = \frac{X_i}{8} \quad (23) \]

For \( sgn_1 = 1, sgn_2 = 0 \)
\[ Y_3 = Y_2 + \frac{X_2}{4} = \frac{X_i}{2} + \frac{3X_i}{8} = \frac{7X_i}{8} \quad (24) \]
\[ X_3 = X_2 - \frac{Y_2}{4} = \frac{3X_i}{2} - \frac{X_i}{8} = \frac{11X_i}{8} \quad (25) \]

For \( sgn_1 = 0, sgn_2 = 1 \)
\[ Y_3 = Y_2 - \frac{X_2}{4} = \frac{3X_i}{2} - \frac{X_i}{8} = \frac{11X_i}{8} \quad (26) \]
\[ X_3 = X_2 + \frac{Y_2}{4} = \frac{X_i}{2} + \frac{3X_i}{8} = \frac{7X_i}{8} \quad (27) \]

For \( sgn_1 = 1, sgn_2 = 1 \)
\[ Y_3 = Y_2 - \frac{X_2}{4} = \frac{X_i}{2} - \frac{3X_i}{8} = \frac{X_i}{8} \quad (28) \]
\[ X_3 = X_2 + \frac{Y_2}{4} = \frac{3X_i}{2} + \frac{X_i}{8} = \frac{13X_i}{8} \quad (29) \]

<table>
<thead>
<tr>
<th>No. of eliminated stages</th>
<th>No. of Mux Required</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>2</td>
<td>2</td>
</tr>
<tr>
<td>3</td>
<td>6</td>
</tr>
<tr>
<td>4</td>
<td>14</td>
</tr>
<tr>
<td>5</td>
<td>30</td>
</tr>
</tbody>
</table>

Table 1. Multiplexers required for eliminating different stages.

The block diagram of the CORDIC with the adders up to the third stage replaced by Mux is shown. Because the adders are replaced by Mux, the area of the circuit is reduced up to the third stage. Replacing adders with Mux beyond the third stage, however, results in an exponential increase in the number of Mux, as shown in Table 1.
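The idea behind the Mux replacement can be sketched as follows: because the first-stage outputs are fixed multiples of the initial x input, every sign combination of the next stages yields a constant multiple that can be precomputed and selected by a multiplexer. The Python sketch below enumerates those constants with exact fractions, taking the first-stage outputs as X_1 = Y_1 = 1 (in units of the initial x input) and using the convention that a sign value of 0 adds to Y and subtracts from X. The function names are hypothetical, and the closed form 2^n − 2 in `mux_count` is only an observation that fits the five rows of Table 1, not a formula stated in the source.

```python
from fractions import Fraction
from itertools import product

def precomputed_outputs(n_stages):
    """Map each tuple (sgn_1, ..., sgn_{n-1}) to (X_n, Y_n) as multiples of the initial x input."""
    table = {}
    for signs in product((0, 1), repeat=n_stages - 1):
        x, y = Fraction(1), Fraction(1)      # first-stage outputs X_1 = Y_1
        for i, sgn in enumerate(signs, start=1):
            shift = Fraction(1, 2**i)        # right shift by i bits
            if sgn == 0:
                x, y = x - y * shift, y + x * shift
            else:
                x, y = x + y * shift, y - x * shift
        table[signs] = (x, y)
    return table

def mux_count(n_eliminated):
    """Closed form that reproduces Table 1 (an observation, not from the source)."""
    return 2**n_eliminated - 2

if __name__ == "__main__":
    for signs, (x, y) in precomputed_outputs(3).items():
        print(signs, "X_3 =", x, "Y_3 =", y)
    print([mux_count(n) for n in range(1, 6)])
```

The number of constants doubles with every additional eliminated stage, which is why replacing adders beyond the third stage stops paying off in area.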
III. PROPOSED ALGORITHM

The pipelined CORDIC uses registers between iteration stages, as shown in Fig 3. The advantage of the pipelined unrolled CORDIC over the unrolled CORDIC is its higher operating frequency, which makes it suitable for high speed applications. The number of registers depends on the number of pipeline stages, and the registers increase the area. The first output of an N-stage pipelined CORDIC core is obtained after N clock cycles; thereafter, an output is generated every clock cycle. Here, pipeline registers are placed after the fourth and seventh stages. Fig 4 shows the Mux based pipelined unrolled CORDIC architecture, in which pipeline registers are inserted at the output.
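The latency behaviour described above can be illustrated with a small cycle-by-cycle model. This is a simplified Python sketch under stated assumptions, not the authors' design: it places register levels only after the fourth and seventh stages, models each clock cycle as one loop iteration, and uses floating point for the stage arithmetic. The names `REGISTER_AFTER`, `stage` and `run_pipeline` are hypothetical.

```python
import math

N_STAGES = 8
REGISTER_AFTER = (4, 7)     # register levels after the fourth and seventh stages
K = math.prod(math.cos(math.atan(2.0**-i)) for i in range(N_STAGES))

def stage(i, state):
    """One combinational CORDIC stage acting on (x, y, z)."""
    x, y, z = state
    d = 1.0 if z >= 0 else -1.0
    return (x - d * y * 2.0**-i, y + d * x * 2.0**-i, z - d * math.atan(2.0**-i))

def run_pipeline(angles):
    """Return one entry per clock cycle: (cos, sin), or None while the pipeline fills."""
    bounds = (0,) + REGISTER_AFTER + (N_STAGES,)
    regs = [None] * len(REGISTER_AFTER)
    outputs = []
    # Extra idle cycles at the end flush the values still held in the registers.
    for angle in list(angles) + [None] * len(regs):
        seg_inputs = [None if angle is None else (K, 0.0, angle)] + regs
        seg_outputs = []
        for s, state in enumerate(seg_inputs):
            if state is not None:
                for i in range(bounds[s], bounds[s + 1]):
                    state = stage(i, state)
            seg_outputs.append(state)
        regs = seg_outputs[:-1]              # clock edge: registers capture
        last = seg_outputs[-1]
        outputs.append(None if last is None else (last[0], last[1]))
    return outputs

if __name__ == "__main__":
    for cycle, out in enumerate(run_pipeline([math.radians(30), math.radians(45)])):
        print(cycle, out)
```

In this model the first valid result appears two cycles after the first input, one cycle per register level, and a new result follows on every cycle after that.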

Fig 3. Pipelined CORDIC Using Registers

Fig 4. Pipelined MUX Based Unrolled CORDIC
IV. SIMULATION RESULTS

A. THE RESULTS IN QUARTUS II FOR THE ORIGINAL UNROLLED CORDIC

Phase_in1, rst_n and clk are the inputs assigned to the block diagram. Cos_out1, sin_out1 and eps1 are the outputs obtained. The sin and cos values are calculated for different angle values. For 30°, the corresponding hexadecimal value is 15, and the outputs obtained are 64 for sin_out1 and 255 for cos_out1. For 45°, the corresponding hexadecimal value is 20, and the outputs obtained are 51 for sin_out1 and 254 for cos_out1.

<table>
<thead>
<tr>
<th>ANGLE VALUE</th>
<th>CALCULATED VALUE</th>
<th>SIN_out1 VALUE</th>
<th>COS_out1 VALUE</th>
</tr>
</thead>
<tbody>
<tr>
<td>30°</td>
<td>15</td>
<td>64</td>
<td>255</td>
</tr>
<tr>
<td>45°</td>
<td>20</td>
<td>51</td>
<td>254</td>
</tr>
</tbody>
</table>

Table 2. Output Waveform Calculation
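The source does not state how an angle is converted to the hexadecimal value applied to Phase_in1. One encoding that reproduces both listed values is an 8-bit phase word in which a full turn of 360° maps to 256 counts. The Python sketch below shows that mapping; the encoding itself, the function name `angle_to_phase_code` and the `bits` parameter are assumptions made for illustration.

```python
def angle_to_phase_code(angle_deg, bits=8):
    """Quantize an angle in degrees to a phase word, assuming 360 degrees = 2**bits counts."""
    full_scale = 1 << bits
    return round(angle_deg / 360.0 * full_scale) % full_scale

if __name__ == "__main__":
    for angle in (30, 45):
        print(angle, format(angle_to_phase_code(angle), "x"))
```

Under this assumption 30° quantizes to hexadecimal 15 and 45° to hexadecimal 20, matching the calculated values in the table.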

B. POWER ANALYZER

The power analyzer report gives the total thermal power dissipation, the core dynamic thermal power dissipation, the core static thermal power dissipation and the input/output power dissipation. From this report the static, dynamic and input/output power are analysed.

C. RESULTS IN QUARTUS II FOR PIPELINED MUX BASED UNROLLED CORDIC

Phase_in1, rst_n and clk are the inputs assigned to the block diagram. Cos_out1, sin_out1 and eps1 are the outputs obtained. For $30^\circ$, the corresponding hexadecimal value is 15, and the outputs obtained are 247 for sin_out1 and 115 for cos_out1. For $45^\circ$, the corresponding hexadecimal value is 20, and the outputs obtained are 86 for sin_out1 and 82 for cos_out1.

<table>
<thead>
<tr>
<th>ANGLE VALUE</th>
<th>MEASURED VALUE</th>
<th>SIN_out1 VALUE</th>
<th>COS_out1 VALUE</th>
</tr>
</thead>
<tbody>
<tr>
<td>30°</td>
<td>15</td>
<td>247</td>
<td>115</td>
</tr>
<tr>
<td>45°</td>
<td>20</td>
<td>86</td>
<td>82</td>
</tr>
</tbody>
</table>

Table 3. Output Waveform Calculation

D. POWER ANALYZER

![Power Analyzer Report](image)

Fig 8. Power Analyzer Report

E. RESULTS IN CADENCE TOOL FOR ORIGINAL UNROLLED CORDIC

Phase_in1, rst_n and clk are the inputs assigned to the block diagram. Cos_out1, sin_out1 and eps1 are the outputs obtained. For $30^\circ$, the corresponding hexadecimal value is 15, and the outputs obtained are 64 for sin_out1 and 255 for cos_out1. For $45^\circ$, the corresponding hexadecimal value is 20, and the outputs obtained are 51 for sin_out1 and 254 for cos_out1.

<table>
<thead>
<tr>
<th>ANGLE VALUE</th>
<th>CALCULATED VALUE</th>
<th>SIN_out1 VALUE</th>
<th>COS_out1 VALUE</th>
</tr>
</thead>
<tbody>
<tr>
<td>30°</td>
<td>15</td>
<td>64</td>
<td>255</td>
</tr>
<tr>
<td>45°</td>
<td>20</td>
<td>51</td>
<td>254</td>
</tr>
</tbody>
</table>

Table 4. Output Waveform Calculation

![Report for Area](image)

Fig 10. Report for Area

![Report for Power](image)

Fig 11. Report for Power

From Fig 10, the total area consumed by the original unrolled CORDIC can be identified; it is about 3927. From Fig 11, the leakage power, dynamic power and total power of the original unrolled CORDIC can be identified: the leakage power is 13136.856 nW, the dynamic power is 65127.083 nW and the total power is 78263.939 nW.

F. RESULTS IN CADENCE TOOL FOR MUX BASED PIPELINED UNROLLED CORDIC

<table>
<thead>
<tr>
<th>ANGLE VALUE</th>
<th>CALCULATED VALUE</th>
</tr>
</thead>
<tbody>
<tr>
<td>30°</td>
<td>15</td>
</tr>
<tr>
<td>45°</td>
<td>20</td>
</tr>
</tbody>
</table>

From Fig 14, the total area consumed by the Mux based pipelined unrolled CORDIC can be identified; it is about 3959. From Fig 15, the leakage power, dynamic power and total power of the Mux based pipelined unrolled CORDIC can be identified: the leakage power is 10548.948 nW, the dynamic power is 59433.044 nW and the total power is 69981.992 nW.

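<table>
<thead>
<tr>
<th>TOOLS USED</th>
<th>POWER IN ORIGINAL UNROLLED CORDIC</th>
<th>POWER IN MUX BASED PIPELINED UNROLLED CORDIC</th>
</tr>
</thead>
<tbody>
<tr>
<td>QUARTUS II SOFTWARE</td>
<td>111.79 mW</td>
<td>115.55 mW</td>
</tr>
<tr>
<td>CADENCE (DIGITAL) IMPLEMENTATION</td>
<td>78263.939 nW</td>
<td>69981.992 nW</td>
</tr>
</tbody>
</table>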

Table 6. Comparison table for QUARTUS II and CADENCE TOOL
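The relative change between the two schemes follows directly from the figures in Table 6. The short Python sketch below works out that arithmetic; the helper name `percent_change` is hypothetical, and the only inputs are the four power values reported above.

```python
def percent_change(reference, new):
    """Relative change of `new` with respect to `reference`, in percent."""
    return (new - reference) / reference * 100.0

# Power figures taken from Table 6 (original scheme, Mux based pipelined scheme).
cadence_change = percent_change(78263.939, 69981.992)   # both in nW
quartus_change = percent_change(111.79, 115.55)         # both in mW

if __name__ == "__main__":
    print(f"CADENCE:    {cadence_change:+.2f} %")
    print(f"QUARTUS II: {quartus_change:+.2f} %")
```

On these figures the Mux based pipelined scheme consumes about 10.6 % less total power than the original scheme in the CADENCE implementation, and about 3.4 % more in the QUARTUS II implementation.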

V. CONCLUSION AND FUTURE WORK

The CORDIC algorithm is used to compute trigonometric, hyperbolic, linear and logarithmic functions. Two CORDIC schemes were discussed: the first is the original unrolled CORDIC and the second is the MUX based pipelined unrolled CORDIC. Compared to the first scheme, the second scheme is more reliable, since it uses multiplexers and registers. Adding multiplexers reduces the area relative to the first architecture, which uses only addition, subtraction and shifting operations in all 8 stages. Eight iterations are performed and the design is implemented in QUARTUS II software. The same design is implemented in the CADENCE (Digital) tool, and the power obtained with QUARTUS II and with the CADENCE tool is compared. An efficient power reduction was obtained in the CADENCE tool. For future work, the number of iterations and the bit size can be increased.

REFERENCES