
ISSN ONLINE(2319-8753)PRINT(2347-6710)

Power Efficient Enhancement Technique for Flip-Flop Design

S. Sathyapriya1
PG student, Department of ECE, Vivekananda Institute of engineering and Technology for women, Tiruchengode, India1

Visit for more related articles at International Journal of Innovative Research in Science, Engineering and Technology

Abstract

A low-power pulse-triggered flip-flop (FF) is designed in this paper. In the pulse generation control logic, the conventional AND function is removed and a simple two-transistor AND gate design is used instead, which reduces circuit complexity and facilitates a faster discharge operation. A pulse enhancement technique is applied to speed up the discharge along the critical path only when needed. As a result, the transistor sizes in the delay inverter and pulse generator circuit can be reduced to save power. Post-layout simulation results based on UMC CMOS 90-nm technology show that the proposed design has the best power-delay-product performance among the seven FF designs under comparison. Its maximum power saving against the existing designs is 38.4%. Compared with the conventional transmission-gate-based FF design, the average leakage power consumption is also reduced by a factor of 3.52.

Keywords

flip-flop, low power, pulse-triggered

I. INTRODUCTION

The pulse-triggered flip-flop (P-FF) is an alternative to the conventional master-slave flip-flop in high-speed applications [2]-[5]. Besides its speed advantage, its circuit is simpler, which also reduces power consumption. A P-FF consists of a pulse generator that produces the strobe signal and a latch that stores the data. Because the triggering pulses are narrow, the latch acts like an edge-triggered flip-flop. A P-FF uses only one latch, unlike a conventional master-slave flip-flop, and it is less sensitive to clock jitter.

Depending on the method of pulse generation, P-FF designs fall into two types: implicit and explicit [6]. In an implicit-type P-FF, the pulse generator is built into the logic of the latch design and no external pulse signal is generated. In an explicit-type P-FF, the pulse generator and the latch are designed separately. Implicit pulse generation is generally considered more power efficient than explicit pulse generation, because an implicit-type design only controls the discharging path, whereas an explicit-type design must physically generate a pulse train.

This paper presents a low-power implicit-type P-FF design featuring a conditional pulse enhancement scheme. Three additional transistors are employed to support this feature. Although the total transistor count increases, the transistors of the pulse generation logic can be made smaller, and the overall layout area is slightly reduced. This gives rise to competitive power and power-delay-product performance against other P-FF designs. The proposed implicit-type design with the pulse control scheme is described in Section II.

II. PROPOSED IMPLICIT-TYPE P-FF DESIGN WITH PULSE CONTROL SCHEME

A. Conventional implicit-type P-FF designs

Some conventional implicit-type P-FF designs are reviewed first. A P-FF design named ip-DCO is shown in Fig. 1. It contains an AND-logic-based pulse generator and a semi-dynamic latch. Inverters I5 and I6 latch the data, and inverters I7 and I8 hold the internal node X. Two problems exist in this design. First, nMOS transistors N2 and N3 are turned on at every rising clock edge, so if the data remains high, node X is discharged on every rising edge of the clock, which leads to large switching power. Second, the large capacitive load at node X degrades both speed and power performance.
image
Fig. 2 shows an improved P-FF design named MHLLF, which employs a static latch structure presented in [11]. Node X is no longer precharged periodically by the clock signal. A weak pull-up transistor P1, controlled by the FF output signal Q, maintains node X at a high level when Q is zero. This design eliminates the unnecessary discharging at node X. However, it encounters a longer data-to-Q (D-to-Q) delay during 0-to-1 transitions because node X is not pre-discharged. Larger transistors N3 and N4 are required to enhance the discharging capability. Another drawback is that node X becomes floating when output Q and the input data both equal “1”, and extra DC power is consumed if node X drifts from an intact “1”.
image
Fig. 3 shows a refined low-power P-FF design named SCCER, which uses a conditional discharging technique [9], [12]. In this design, the keeper logic (the back-to-back inverters I7 and I8 in Fig. 1) is replaced by a weak pull-up transistor P1 in conjunction with an inverter I2 to reduce the load capacitance at node X [12]. The disadvantage of this design is that an extra nMOS transistor N3 must be employed to eliminate unwanted switching at node X. Since N3 is controlled by Q_fdbk, no discharge occurs if the input data remains “1”. Another drawback is that powerful pull-down circuitry is needed to ensure that node X is discharged properly.
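The difference between unconditional and conditional discharging can be illustrated with a small behavioral model. The sketch below is only a logic-level approximation, not a circuit simulation: it assumes one data bit is sampled per rising clock edge, that the output Q starts at 0, and that Q always captures D. The function name `count_node_x_discharges` is hypothetical and does not come from the paper.

```python
def count_node_x_discharges(data_bits, conditional):
    """Count discharges of internal node X over a stream of data bits.

    Assumption: one bit is sampled per rising clock edge and Q starts at 0.
    conditional=False mimics an ip-DCO-style latch, where X discharges on
    every rising edge on which D is 1.
    conditional=True mimics an SCCER-style latch, where the discharge path
    is also gated by the fed-back output, so X discharges only when D is 1
    and Q is still 0.
    """
    q = 0
    discharges = 0
    for d in data_bits:
        if d == 1 and (not conditional or q == 0):
            discharges += 1
        q = d  # the flip-flop captures D on every rising edge
    return discharges


if __name__ == "__main__":
    all_ones = [1] * 8
    print(count_node_x_discharges(all_ones, conditional=False))  # 8
    print(count_node_x_discharges(all_ones, conditional=True))   # 1
```

For a constant-high input, the unconditional model discharges node X on all eight edges, while the conditional model discharges it only once, which is the redundant switching that the conditional discharging technique is meant to remove.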
image
B. Proposed P-FF design

The proposed design adopts two measures to overcome the problems of the existing designs. The first reduces the number of nMOS transistors stacked in the discharging path. The second provides a supporting mechanism that conditionally enhances the pull-down strength when the input data is “1”. In the proposed design (Fig. 4), the upper part of the latch is similar to that of the SCCER design [12]. Unlike the ip-DCO and SCCER designs, transistor N2 is removed from the discharging path. Instead, transistor N2, in conjunction with an additional transistor N3, forms a pass-transistor-logic (PTL) based AND gate [13], [14] that controls the discharge of transistor N1. Since the two inputs to the AND logic are mostly complementary, the output node Z is kept at “0” most of the time. When both input signals equal “0”, node Z floats temporarily, which is basically harmless. At the rising edge of the clock, transistors N2 and N3 are both turned on and together pass a weak logic high to node Z, which then turns on N1 for a time span defined by the delay inverter I1. The switching power at node Z is reduced because of the diminished voltage swing. Unlike the MHLLF design [11], where the discharge control signal is driven by a single transistor, the parallel conduction of two nMOS transistors (N2 and N3) speeds up pulse generation. With this measure, the number of stacked transistors along the discharging path is reduced, and the sizes of transistors N1-N5 can also be reduced.

In the proposed design, the longest discharging path is formed when the input data is “1” while the Q-bar output is “1”. Transistor P3 is added to enhance the discharging along this path. P3 is normally turned off because node X is pulled high most of the time. It steps in when node X is discharged to |Vtp| below VDD, providing an additional boost to node Z. The generated pulse is then taller, which enhances the pull-down strength of transistor N1. After the rising edge of the clock, the delay inverter I1 drives node Z back to “0” through transistor N3 to shut down the discharging path. Transistor P3 is turned off when the voltage level of node X rises. With the intervention of P3, the width of the generated discharging pulse is also stretched out, which creates a pulse with sufficient width for correct data capture. The conditional pulse enhancement takes effect only when the FF output Q changes from 0 to 1. This leads to better power performance than schemes that use an indiscriminate pulse width enhancement approach. Another advantage of the conditional pulse enhancement scheme is the reduction in leakage power due to the smaller transistors in the critical discharging path and in the delay inverter.

Flip-flops are among the most essential elements in the design of sequential circuits. We compared the performance, power dissipation, and transistor count of each flip-flop.
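The pulse generation described above can be sketched at the logic level as the AND of the clock and a delayed, inverted copy of the clock. This is a minimal sketch under stated assumptions: the clock is a list of 0/1 samples, the delay of inverter I1 is an integer number of samples, and the clock is taken to be low before the trace starts. Voltage effects (the weak logic high and the boost from P3) are not modelled, and the names `node_z_pulse` and `pulse_is_enhanced` are hypothetical.

```python
def node_z_pulse(clk, delay):
    """Return node Z samples for a sampled clock waveform.

    Z is modelled as CLK AND (delayed, inverted CLK), so it is 1 only
    during the `delay` samples that follow a rising clock edge.
    """
    z = []
    for t, c in enumerate(clk):
        # Delayed clock; assume the clock was low before the trace starts.
        delayed = clk[t - delay] if t >= delay else 0
        z.append(c & (1 - delayed))
    return z


def pulse_is_enhanced(d, q):
    """Enhancement applies only when a 1 is captured while Q is still 0."""
    return d == 1 and q == 0


if __name__ == "__main__":
    clk = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1]
    print(node_z_pulse(clk, delay=2))
    # [0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0]
```

In this model the pulse width equals the inverter delay, and the two inputs of the AND are complementary everywhere except in the short window after a rising edge, which is why node Z stays at 0 most of the time.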
image

III. SIMULATION RESULTS

To minimize the rise and fall time delays of the signals, the input signals are generated through buffers. Fig. 5 shows the simulation setup model. To account for the loading effect of the FF on the previous stage and on the clock tree, the power consumption of the clock and data buffers is also included. The FF output is loaded with a 20-fF capacitor, and an extra capacitance of 3 fF is placed after the clock buffer. Fig. 5 also illustrates the advantages of the proposed work. The simulated waveforms of the proposed P-FF design against the MHLLF design are shown in Fig. 6.

In the proposed design, pulses at node Z are generated at every rising edge of the clock. Because transistor P3 provides an extra voltage boost, the pulses generated to capture input data “1” are enhanced in both height and width compared with the pulses generated to capture data “0” (0.84 V versus 0.65 V in height and 141 ps versus 84 ps in width). In the MHLLF design, there is no such differentiation in pulse generation. In addition, no signal degradation occurs at the internal node X of the proposed design, whereas the internal node X of the MHLLF design is degraded when Q equals “0” and the data equals “1”. As a result, node Q deviates slightly from an intact “0”, which causes DC power consumption at the output. From Fig. 6, the height of the pulses at node Z in the MHLLF design is around 0.68 V. Furthermore, node Z floats when the clock equals “0”, and its value drifts gradually.

The power consumption behavior of these FF designs is examined with five test patterns, each exhibiting a different data switching probability. All five are deterministic patterns, with data transition probabilities of 0% (all 0s or all 1s), 25%, 50%, and 100%. The power consumption results are summarized in Table I. Owing to its shorter discharging path and the conditional pulse enhancement scheme, the power consumption of the proposed design is the lowest for all test patterns.

Taking the test pattern with 50% data transition probability as an example, the power saving of the proposed design ranges from 38.4% against the ip-DCO design to 5.6% against the TGFF design. When operating at a low switching voltage, the power saving of the proposed design is even more pronounced. Because of the redundant switching at its internal node, the ip-DCO design has the largest power consumption when the data switching activity is 0% (all 1s). One of the primary requirements of a flip-flop for high-speed digital design, besides short latency, is a simple and robust clocking scheme. The flip-flop design should therefore have a simple and efficient clock circuit.
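The five deterministic test patterns can be reproduced with a short script. The sketch below is one possible construction, not the authors' test bench: it assumes each pattern toggles the data once every fixed number of clock cycles and that the pattern repeats cyclically, so the transition probability is the fraction of cycles on which the data changes. The helper names `make_pattern` and `transition_probability` are hypothetical.

```python
def make_pattern(length, toggle_every=None, start=0):
    """Build a deterministic bit pattern.

    toggle_every=None keeps the data constant (all 0s or all 1s);
    otherwise the value flips once every `toggle_every` clock cycles.
    """
    bits = []
    value = start
    for i in range(length):
        if toggle_every is not None and i > 0 and i % toggle_every == 0:
            value ^= 1
        bits.append(value)
    return bits


def transition_probability(bits):
    """Fraction of cycles on which the data changes (pattern repeated cyclically)."""
    n = len(bits)
    changes = sum(1 for i in range(n) if bits[i] != bits[(i + 1) % n])
    return changes / n


if __name__ == "__main__":
    patterns = {
        "0% (all 0)": make_pattern(16),
        "0% (all 1)": make_pattern(16, start=1),
        "25%": make_pattern(16, toggle_every=4),
        "50%": make_pattern(16, toggle_every=2),
        "100%": make_pattern(16, toggle_every=1),
    }
    for name, bits in patterns.items():
        print(name, transition_probability(bits))
```

With a pattern length of 16, the five patterns give transition probabilities of 0.0, 0.0, 0.25, 0.5, and 1.0, matching the switching activities listed in the text.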
image
Fig. 6 shows the simulation waveforms of the proposed design; the Tanner tool is used for simulation. The power-delay product based on the D-to-Q delay (PDPDQ) of the proposed design is the smallest among all designs when the setup time is greater than -60 ps. Its minimum PDPDQ value occurs at a setup time of -53.9 ps, where the corresponding D-to-Q delay is 116.9 ps. The CCFF design ranks second, with an optimal setup time of -67 ps. The conventional TGFF design always has a positive setup time and reaches its smallest PDPDQ value at a setup time of 47 ps. The MHLLF design has the worst PDPDQ performance because of the drawback of its latch structure.

Fig. 7 shows the best PDPDQ performance of each design under different data switching activities. The SCCER and CCFF designs take second place. Fig. 7 shows the PDPDQ performance of these designs at different process corners under 50% data switching activity. The performance of the proposed design is well maintained. The PDPDQ performance of the MHLLF design is the worst, especially at the SS process corner, because of its large D-to-Q delay and the poor driving capability of its pulse generation circuit.

Table I summarizes the features of the P-FF designs, including transistor count, layout area, setup time, hold time, minimum D-to-Q delay, optimal PDP, and clock tree power. The MHLLF design has the largest layout area because of its oversized pulse generation circuit. The setup time is defined as the point on the curve where the D-to-Q delay is minimum, and the hold time is measured at the point where the slope of the curve equals -1. The proposed design has the shortest minimum D-to-Q delay. Its hold time is longer than that of the other designs because transistor P3, used for the pulse enhancement, requires a prolonged availability of the input. Its optimal PDP value is also significantly better than those of the other designs. The power drawn from the clock tree is calculated as well, to evaluate the impact of FF loading on clock jitter.

Although the proposed FF design requires the clock signal to be connected to the drain of transistor N2, the current drawn is not significant. Because of the complementary switching behavior of N2 and N3, there is no signal path from the clock input to either VDD or GND, and the clock signal is responsible for charging and discharging node Z. Simulation results show that the clock tree power of the proposed design is close to that of the two leading designs (MHLLF and CCFF) and better than that of ip-DCO, SCCER, TGFF, and SAFF, in which the clock signal is connected to transistor gates only.

The setup times of these designs vary from -67 to +47 ps. Although the optimal setup time of the proposed design is -53.9 ps, its PDP value is the lowest among all designs for any setup time greater than -60 ps. The hold time of the proposed design appears larger because of its negative setup time; this number decreases as the setup time moves toward a positive value.

For different clock and data input combinations, the proposed design gives the minimum leakage power consumption, which is mainly attributed to the reduced transistor sizes along the discharging path. The SAFF design experiences the worst leakage power consumption when the clock equals “0” because its two precharge pMOS transistors are always turned on. Compared with the conventional TGFF design, the proposed design reduces the average leakage power by a factor of 3.52. Even though significant fluctuations in pulse width and height are observed, the unique conditional pulse-enhancement scheme works well in all cases.
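The search for the optimal setup time can be sketched as a minimum over a setup-time sweep, taking the power-delay product as power multiplied by D-to-Q delay. The sweep values below are made up purely for illustration and are not the paper's simulation data; the function name `optimal_pdp` and the tuple layout are likewise assumptions.

```python
def optimal_pdp(sweep):
    """Return (setup_time_ps, pdp) for the sweep point with the smallest PDP.

    Each sweep entry is (setup_time_ps, d_to_q_delay_ps, power_uW).
    PDP is power times D-to-Q delay; uW * ps gives units of 1e-18 J.
    """
    best = min(sweep, key=lambda point: point[1] * point[2])
    return best[0], best[1] * best[2]


if __name__ == "__main__":
    # Hypothetical sweep values, NOT taken from the paper.
    sweep = [
        (-80, 200, 20),
        (-60, 130, 20),
        (-40, 120, 20),
        (0, 125, 21),
        (40, 160, 21),
    ]
    print(optimal_pdp(sweep))  # (-40, 2400)
```

The D-to-Q delay grows sharply as the setup time becomes too negative, so the product has a minimum at a moderately negative setup time in this illustrative sweep, which is the same shape of trade-off the text describes for the proposed design.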
image
image
image

IV. CONCLUSION

In this paper, we introduced a low-power pulse-triggered FF design that employs two new design measures. The first reduces the number of transistors stacked along the discharging path by introducing a PTL-based AND logic. The second conditionally enhances the height and width of the discharging pulse so that the transistors in the pulse generation circuit can be kept at minimum size. Simulation results show that the proposed design outperforms the existing designs in performance indexes such as power, D-to-Q delay, and PDP. The design does, however, have a longer hold time.

V. FUTURE WORK

In the future, a dual-edge-triggered flip-flop can be designed by building a clock generation circuit that triggers on both the rising and falling edges of the clock pulse. This approach overcomes the limitation of single-edge triggering and also reduces power consumption.

References

[1] H. Kawaguchi and T. Sakurai, “A reduced clock-swing flip-flop (RCSFF) for 63% power reduction,” IEEE J. Solid-State Circuits, vol. 33, no. 5, pp. 807–811, May 1998.

[2] A. G. M. Strollo, D. De Caro, E. Napoli, and N. Petra, “A novel high speed sense-amplifier-based flip-flop,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 11, pp. 1266–1274, Nov. 2005.

[3] H. Partovi, R. Burd, U. Salim, F. Weber, L. DiGregorio, and D. Draper, “Flow-through latch and edge-triggered flip-flop hybrid elements,” in IEEE Tech. Dig. ISSCC, 1996, pp. 138–139.

[4] F. Klass, C. Amir, A. Das, K. Aingaran, C. Truong, R. Wang, A. Mehta, R. Heald, and G. Yee, “A new family of semi-dynamic and dynamic flip flops with embedded logic for high-performance processors,” IEEE J. Solid-State Circuits, vol. 34, no. 5, pp. 712–716, May 1999.

[5] S. D. Naffziger, G. Colon-Bonet, T. Fischer, R. Riedlinger, T. J. Sullivan, and T. Grutkowski, “The implementation of the Itanium 2 microprocessor,” IEEE J. Solid-State Circuits, vol. 37, no. 11, pp. 1448–1460, Nov. 2002.

[6] J. Tschanz, S. Narendra, Z. Chen, S. Borkar, M. Sachdev, and V. De, “Comparative delay and energy of single edge-triggered and dual edge triggered pulsed flip-flops for high-performance microprocessors,” in Proc. ISLPED, 2001, pp. 207–212.

[7] B. Kong, S. Kim, and Y. Jun, “Conditional-capture flip-flop for statistical power reduction,” IEEE J. Solid-State Circuits, vol. 36, no. 8, pp. 1263–1271, Aug. 2001.

[8] N. Nedovic, M. Aleksic, and V. G. Oklobdzija, “Conditional precharge techniques for power-efficient dual-edge clocking,” in Proc. Int. Symp. Low-Power Electron. Design, Monterey, CA, Aug. 12–14, 2002, pp. 56–59.

[9] P. Zhao, T. Darwish, and M. Bayoumi, “High-performance and low power conditional discharge flip-flop,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 12, no. 5, pp. 477–484, May 2004.

[10] C. K. Teh, M. Hamada, T. Fujita, H. Hara, N. Ikumi, and Y. Oowaki, “Conditional data mapping flip-flops for low-power and high-performance systems,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14, pp. 1379–1383, Dec. 2006.

[11] S. H. Rasouli, A. Khademzadeh, A. Afzali-Kusha, and M. Nourani, “Lowpower single- and double-edge-triggered flip-flops for high speed applications,” Proc. Inst. Electr. Eng.—Circuits Devices Syst., vol. 152, no. 2, pp. 118–122, Apr. 2005.

[12] H. Mahmoodi, V. Tirumalashetty, M. Cooke, and K. Roy, “Ultra low power clocking scheme using energy recovery and clock gating,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 17, pp. 33–44, Jan. 2009.

[13] P. Zhao, J. McNeely, W. Kaung, N. Wang, and Z. Wang, “Design of sequential elements for low power clocking system,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., to be published.

[14] Y.-H. Shu, S. Tenqchen, M.-C. Sun, and W.-S. Feng, “XNOR-based double-edge-triggered flip-flop for two-phase pipelines,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 53, no. 2, pp. 138–142, Feb. 2006.

[15] V. G. Oklobdzija, “Clocking and clocked storage elements in a multigigahertz environment,” IBM J. Res. Devel., vol. 47, pp. 567–584, Sep. 2003.