Keywords

Clock power reduction, merging, multibit flipflop, replacement, wirelength. 
INTRODUCTION

Due to the popularity of portable electronic products, Low power system has attracted more attention in recent years. Reducing the power consumption not only can enhance battery life but also can avoid the overheating problem, which would increase the difficulty of packaging or cooling. Moreover, in modern VLSI designs, power consumed by clocking has taken a major part of the whole design.. Given a design that the locations of the cells have been determined, the power consumed by clocking can be reduced further by replacing several flipflops with multibit flipflops. During clock tree synthesis, less number of flipflops means less number of clock sinks. Thus, the resulting clock network would have smaller power consumption and uses less routing resource. Besides, once more smaller flipflops are replaced by larger multibit flipflops; device variations in the corresponding circuit can be effectively reduced. Fig. 1 shows the clock diagrams of 1 and 2bit flipflops. If we replace the two 1bit flipflops as shown in Fig. 1(a) by the 2bit flipflop as shown in Fig. 1(b), the total power consumption can be reduced because the two 1bit flipflops can share the same clock buffer. However, the locations of some flipflops would be changed after this replacement, and thus the wirelengths of nets connecting pins to a flipflop are also changed. To avoid violating the timing constraints, we restrict that the wirelengths of nets connecting pins to a flipflop cannot be longer than specified values after this process. Besides, to guarantee that a new flipflop can be placed within the desired region, we also need to be considering the area capacity of the region. Given a design that the locations of the cells have been determined, the power consumed by clocking can be reduced further by replacing several flipflops with multibit flipflops. During clock tree synthesis, less number of flipflops means less number of clock sinks. Thus, the resulting clock network would have smaller power consumption and uses less routing resource. 
RELATED WORK

They use the graphbased approach. In a graph, each node represents a flipflop. If two flipflops can be replaced by a new flipflop without violating timing and capacity constraints, they build an edge between the corresponding nodes. After the graph is built, the problem of replacement of flipflops can be solved by finding an mclique in the graph. The flipflops corresponding to the nodes in an mclique can be replaced by an mbit flip flop. They use the branchandbound and backtracking algorithm to find all mcliques in a graph. Because one node (flipflop) may belong to several mcliques (mbit flipflop), they use greedy heuristic algorithm to find the maximum independent set of cliques, which every node only belongs to one clique, while finding mcliques groups. However, if some nodes correspond to kbit flipflops that k 1, the bit width summation of flipflops correspond ing to nodes in an mclique, j , may not equal m. If the type of a j bit flipflop is not supported by the library, it may be timewasting in finding impossible combinations of flip flops. 
OUR CONTRIBUTIONS

The difficulty of this problem has been illustrated in the above descriptions. To deal with this problem, the direct way is to repeatedly search a set of flipflops that can be replaced by a new multibit flipflop until none can be done. However, as the number of flipflops in a chip increases dramatically, the complexity would increase exponentially, which makes the method impractical. To handle this problem more efficiently and get better results, we have used the following approaches. 
1) To facilitate the identification of mergeable flipflops, we transform the coordinate system of cells. In this way, the memory used to record the feasible placement region can also be reduced. 
2) To avoid wasting time in finding impossible combinations of flipflops, we first build a combination table before actually merging two flipflops. For example, if a library only provides three kinds of flipflops, which are 1, 2, and 3bit, we first separate the flipflops into three groups. Therefore, the combination of 1 and 3bit flipflops is not considered since the library does not provide the type of 4bit flipflop. 
3) We partition a chip into several sub regions and perform replacement in each sub region to reduce the complexity. However, this method may degrade the solution quality. To resolve the problem, we also use a hierarchical way to enhance the result. 
PROPOSED METHOD

DESIGN FLOW

Our design flow can be roughly divided into three stages .In the beginning, we have to identify a legal placement region for each flipflop fi .First, the feasible placement region of a flipflop associated with different pins are found based on the timing constraints defined on the pins. Then, the legal placement region of the flipflop fi can be obtained by the overlapped area of these regions. However, because these regions are in the diamond shape, it is not easy to identify the overlapped area .Therefore, the overlapped area can be identified more easily if we can transform the coordinate system of cells to get rectangular regions. In the second stage, we would like to build a combination table, which defines all possible combinations of flipflops in order to get a new multibit flipflop provided by the library. 
The flipflops can be merged with the help of the table. After the legal placement regions of flipflops are found and the combination table is built, we can use them to merge flipflops. To speed up our program, we will divide a chip into several bins and merge flipflops in a local bin. However, the flipflops in different bins may be mergeable. Thus, we have to combine several bins into a larger bin and repeat this step until no flipflop can be merged anymore. 
BUILD A COMBINATION TABLE

we may add pseudo types, which denote those flipflops that are not provided by the library, For example, assume that a library only supports two kinds of flipflops whose bit widths are 1and 4, respectively. In order to use a binary tree to denote a combination whose bit width is 4, there must exist flipflops whose bit widths are 2 and 3. 
Let bmax and bmin denote the maximum and minimum bit width of flipflops in L. In InsertPseudoType, it inserts all flipflops whose bit widths are larger than bmin and smaller than bmax into L if they are not provided by L originally. After this procedure, all combinations in L are sorted according to their bit widths in the ascending order. At present, all combinations are represented by binary trees with 0level. Thus, we would assign NULL to its right and left child. Finally, for every two kinds of combinations in T, we try to combine them to create a new combination. If the new combination is the flipflop of a feasible type (this can be checked by the function Type Verify), we would add it to the table T. In the function Type Verify, we first add the bit widths of the two combinations together and store the result in bsum. Then, we will add a new combination n to T with bit width bsum if L has such kind of a flipflop. After these procedures, there may exist some duplicated or unused combinations in T. Thus, we have to delete them from the table and the two functions Duplicate Combination Delete and Unused Combination Delete are called for the purpose In DuplicateCombinationDelete, it checks whether the duplicated combinations exist or not. If the duplicated combinations exist, only the one with the smallest height of its corresponding binary tree is left and the others are deleted. In Unused Combination Delete, it checks the combinations whose corresponding type is pseudo in L. If the combination is not included into any other combinations, it will be deleted. 
MERGE FLIPFLOPS

In merging of flip flops to reduce the complexity, we first divide the whole placement region into several subregions, and use the combination table to replace flipflops in each subregion. Then, several subregions are combined into a larger subregion and the flipflops are replaced again so that those flipflops in the neighboring subregions can be replaced further. Finally, those flipflops with pseudo types are deleted in the last stage because they are not provided by the library. 
Region Partition 
To speed up our problem, we divide the whole chip into several subregions. By suitable partition, the computation complexity of merging flipflops can be reduced significantly we divide the region into several subregion, each subregion contains sixbins.bin is a small unit of subregion. 
SIMULATION RESULTS

Flip flop Merging output 

Proposed Power 

Existing Power 

CONCLUSION

This paper has proposed an algorithm for flipflop replacement for power reduction in digital integrated circuit design. The procedure of flipflop replacements is depending on the combination table, which records the relationships among the flip flop types. The concept of pseudo type is, introduced to help to enumerate all possible combinations in the combination table. By the guidelines of replacements from the combination table, the impossible combinations of flipflops will not be considered that decreases execution time. Besides power reduction, the objective of minimizing the total wirelength also is considered to the cost function. The experimental results show that our algorithm can achieve a balance between power reduction and wirelength reduction. 
Figures at a glance






Figure 1 
Figure 2 
Figure 3 
Figure 4 
Figure 5 


References

 P. Gronowski, W. J. Bowhill, R. P. Preston, M. K. Gowan, and R. L.Allmon, “Highperformance microprocessor design,” IEEE J. SolidStateCircuits, vol. 33, , May 1998.
 D. Duarte, V. Narayanan, and M. J. Irwin, “Impact of technology scaling in the clock power,” in Proc. IEEE VLSI Comput. Soc. Annu. Symp.,Pittsburgh, PA, Apr. 2002, pp. 52–57.
 H. Kawagachi and T. Sakurai, “A reduced clockswing flipflop (RCSFF) for 63% clock power reduction,” in VLSI Circuits Dig. Tech. PapersSymp.,Jun. 1997, pp. 97–98.
 Y. Cheon, P.H. Ho, A. B. Kahng, S. Reda, and Q. Wang, “Poweraware placement,” in Proc. Design Autom. Conf., Jun. 2005, pp. 795–800.
 Y.T. Chang, C.C. Hsu, P.H. Lin, Y.W. Tsai, and S.F. Chen,“Postplacement power optimization with multibit flipflops,” in Proc.IEEE/ACM Comput.Aided Design Int. Conf., San Jose, CA, Nov. 201pp. 218–223.
