CONSISTENCY MANAGEMENT STRATEGIES IN DATA REPLICATION USING ANT COLONY OPTIMIZATION METHOD OVER MOBILE PEER TO PEER NETWORK

S.J.K. Jagadeesh Kumar; R.Saraswathi

CONSISTENCY MANAGEMENT STRATEGIES IN DATA REPLICATION USING ANT COLONY OPTIMIZATION METHOD OVER MOBILE PEER TO PEER NETWORK

S.J.K. Jagadeesh Kumar^*1 and R.Saraswathi²

Professor, Dept. of Computer Science and Engineering, SKCT, Coimbatore
PG Student, Sri Krishna College of Technology, Coimbatore, Tamil Nadu, India

Related article at Pubmed, Scholar Google

Visit for more related articles at Journal of Global Research in Computer Sciences

Abstract

A mobile peer-to-peer computer network is the one in which each computer in the network can act as a client or server for the other computers in the network. The communication process among the nodes in the mobile peer to peer network requires more no of messages. Due to this large number of messages passing, propose an interconnection structure called distributed Spanning Tree (DST) and it improves the efficiency of the mobile peer to peer network. The proposed method improves the data availability and consistency across the entire network and also reduces the data latency and the required number of message passes for any specific application in the network. Further to enhance the effectiveness of the proposed system, the DST network is optimized with the Ant Colony Optimization method. It gives the optimal solution of the DST method and increased availability, enhanced consistency and scalability of the network. This paper also addresses the consistency maintenance in the replica management. This proposed method to maintain the effective local consistency management in the mobile peer to peer network using Ant Colony Optimization method.

INTRODUCTION

MOBILE PEER TO PEER NETWORK

In a mobile P2P network, the "peers" are computer systems which are connected to each other via the Internet. Files can be shared directly between systems on the network without the need of a central server. In other words, the P2P network is called a distributed structure if the participants share a part of their own resources. These shared resources are necessary to provide the service offered by the network. The participants of such a network are both resource providers and resource consumers. The P2P network has the following characteristics:

• All nodes are both clients and servers

• Provide and consume data

• Any node can initiate a connection

• No centralized data source

• Nodes contribute content, storage, memory, CPU

• Network is dynamic: nodes enter and leave the network “frequently”

• Nodes collaborate directly with each other

• Nodes have widely varying capabilities

Collaborative applications, Distributed computation and Ad-hoc networks. The various Challenges of the P2P network are Decentralization, Scalability, Performance, Anonymity, Fairness, Dynamism, Security, Transparency, Fault Resilience and Robustness.

INTRODUCTION TO DATA REPLICATION

Data replication is generally used to improve the data availability, reduce data access delay and network traffic. In order to maximize the degree of availability, most of the replication methods are available. But on the other hand, a large number of replicas consume massive amounts of memory and increase the unit cost of the individual nodes. Replica Management is an important methodology to reduce bandwidth utilization, to get better data accessibility and to preserve data consistency in Wireless Ad hoc Networks (WANET).Global Replica Management (GRM) is the method to preserve the replica consistent throughout the network. Many solutions are proposed and they differ in terms of their location, permanency, scope and applicability [4,5]. Based on the scope and applicability, the replications methods can be classified as local and global. The local replication schemes are limited and they are applicable for intra-group management, whereas the global replication schemes are applicable for inter-group management. The Local Replication Management (LRM) systems are associated with a single group, where the number of nodes and peers are fixed and limited. But Global Replica Management (GRM) is very much required for all those applications which support multi-peer collaborative works, where the numbers of nodes are not fixed and defined [6]. But GRM is not suitable for multi collaborative applications because of huge number of message passes are needed for replica management. To reduce the message passes required to accomplish GRM, we introduce a interconnection structure called Distributed Spanning Tree (DST).

DISTRIBUTED SPANNING TREE

The distributed spanning tree (DST) is an overlay structure designed to be scalable. It supports the growth from a small number of nodes to a large one. The DST is a tree without bottlenecks which automatically balances the load between its nodes. The DST breaks the common assumption that a tree is build of leaves and intermediate nodes. In a DST every nodes are equal. The nodes are put together into small cliques. Then, the cliques are put together into small cliques of higher level recursively. The cliques are represented in each node by a routing table. The memory space complexity of the routing tables is O(log(n)) for a n nodes DST.

The section II describes the related work and section III describes the proposed system and section IV describes the conclusion of the work.

RELATED WORK

In [6], Takahiro Hara proposed new consistency maintenance based on local conditions such as location and time need to be investigated. It attempts to classify different consistency levels according to requirements from applications and provides protocols to realize them. In [8], Ren Xun-yil et al proposed a consistency technique based on a replica clustering coefficient to classify replica nodes into multi-levels. Replica consistency has been maintained in which the updating of the data item is performed at first-level replica nodes initially and then it is propagated to the next level of nodes in sequence. Though efficiency is proved in terms of response time and the number of message passes required. In [9], Chun-Pin et al propose a Dynamic Maintenance Service to maintain the data in gird environment. The Bandwidth Hierarchy based Replication algorithm was proposed to maintain the replica dynamically in grid environment. In [9],Chao-Tung et al proposed a One-way Replica Consistency Service (ORCS) for grid environment to resolve the consistency maintenance issues and also balancing the tradeoff between the improving data Access performance and replica consistency.

In [11] Sang-Min Park proposed a novel dynamic replication strategy; called BHR (Bandwidth Hierarchy based Replication). It tries to maximize locality of file to reduce data access time. However, grid sites may be able to hold only small portion of overall amount of data since very large quantity of data is produced in data grid and the storage space in a site is limited. Therefore, effect from this locality is limited to a certain degree. BHR strategy takes benefit from other form of locality, called network-level locality. In [12], Haiying Shen propose an Integrated file Replication and consistency Maintenance mechanism (IRM) that integrates the two techniques in a systematic and harmonized manner. It achieves high efficiency in file replication and consistency maintenance at a significantly low cost. Instead of passively accepting replicas and updates, each node determines file replication and update polling by dynamically adapting to time-varying file query and update rates, which avoids unnecessary file replications and updates. It dramatically reduces overhead and yields significant improvements on the efficiency of both file replication and consistency maintenance approaches. In [15], Xin Sun et al proposed a bidirectional linked list based replica location service to provide a global replica view for supporting the replica management to realize a replica selection strategy and optimal replication strategy on tree-based hierarchical unstructured overlay network.

In [16], Jun Zheng et al proposed a dynamic minimum access cost based replication strategy called MAC replication strategy. It takes into account the access frequency, the status of the network connection and average response time. It calculates an appropriate site to replicate for better shortening the response time of the data source. In [17], Wanlei et al propose the Hybrid Replica Control protocol that attempts to maximize the data availability and communication overhead. In [18], Feras et al propose a Constrained Fast Spread (CFS) method to alleviate the main problems encountered in the current replication techniques and mainly concentrating on the feasibility of replicating the requested replica on each node among the network. In [19] Baskaran et al proposed a GRM in a tree structured P2P network to preserve the replica consistency through the network and reduce the traffic in the network. In [20], Sylvain Dahan et al proposed a Distributed Spanning Tree structure and it is designed to support scalable searches and traversal algorithms. The DST based searches generates less messages to send the query and avoids tree bottleneck. In [21], Sylvain Dahan et al proposed a distributed Spanning tree Structure for large scale environment. This method achieves load balancing and Fault Tolerance in the network. In [22],Xin sun et al proposed the bidirectional linked list based replica location service (BLL-RLS) on tree-based hierarchical unstructured overlay networks, including the deployment of replica location service and the design of the bidirectional linked list based replica catalog. Based on the bidirectional linked list based replica catalog, replica location and selection algorithm is also proposed.

Drawbacks:

The Existing methods suffers from, huge number of messages sent or a higher volume of computations.

Space complexity is very high.

Communication overhead is high.

When increasing the number of nodes in the network, the Consistency maintenance yields poor efficiency.

PROPOSED SYSTEM

In a mobile peer to peer network, each node has to exchange information and services directly with each other without any dedicated intermittent. So it develops bottlenecks in the network due to the huge volume of messages being exchanged. This could be avoided to optimize the number of messages across the network. In this paper, a distributed spanning tree approach is proposed. The proposed system consists of the following steps:

Formation of Mobile Peer to Peer Network.

Formation of Distributed Spanning Tree.

Optimization of DST with Ant Colony Optimization.

DST for Global Replica Management.

FORMATION OF MOBILE PEER TO PEER NETWORK

There are various steps for creating the P2P network. In fig 1.a shows the sample mobile peer to peer network and fig 1.b shows the simulation on mobile peer to peer network.

Define the network options such as communication channel, propagation models, queue types and the network interface.

Creating the instance of the simulator and set up the trace file.

Create a topology object that keeps the movement of the mobile nodes within the topological boundary and also set the coordinate values of the boundary.

Configure the nodes and create the number of mobile nodes in the network. Establish the communication between the nodes.

Define the initial position of the node when it displayed in the NAM simulator.

NS-2 uses NAM (network animator) to provide visualization. NAM also allows users to design and debug the network protocols.

FORMATION OF DISTRIBUTED SPANNING TREE

The mobile P2P network is converted into the set of spanning trees called the Distributed Spanning Tree (DST) and the corresponding graph based algorithms are developed to optimize the number of messages across the network. The DST is an overlay structure designed to be scalable, which supports the growth of the nodes from fewer nodes to higher volumes [16,17].It allows the instantaneous creation of spanning trees rooted by any node and maintains the load balancing between the nodes [16]. This instantaneous creation of spanning trees improves the overall scalability of the intended network [18]. So, DST structures help to automatically balance and optimize the load among the nodes.

The P2P network is converted into DST and each tree should have its root node, named as the Head Node (HN) and the possible Leaf Nodes (LNs). Every HN will hold the complete details regarding its LNs and vice versa. These HNs are to be generated dynamically and should hold the replica, which is to be accessed by their corresponding LNs and indeed by other HNs also. Fig 2a shows the simulation of distributed spanning tree. The DST algorithm consists of three procedures.

1.Initialize() : This procedure create the set of Head Nodes (HNs) in a peer network based on criteria such as user approval, traffic in a particular region, etc. This procedure creates a list on each HN to hold its LNs details. This procedure assign unique id for every HN and then it calls the procedure probe ().

2. Probe() :This procedure creates probe message and flood this message to all the nodes connected to it. On receiving a probe messages, every node executes receive () procedure.

•If the probe message is received by an HN, then it will be discarded.

•If the message is received by the LN, which is not under any HN, then the LN stores the head variable as the HeadID. Then the procedure reply() and the forward() will be called.

•If the reply message is received by the LN, it will be forwarded to the HN.

•If the reply message is received by the HN, then it reads the „HeadIdÃÂ¢Ãâ¬ÃÅ¸ from rmsg. If the „HeadidÃÂ¢Ãâ¬ÃÅ¸ equals the id of the current node then it concludes that the respective head node is reached.

3.Reply() : The reply() procedure called by the corresponding LN to reply to its HN.

Definition 1: Let Ta be the graph of the peer network with HNs and LNs. Then Ga can be defined as,

where,

i. „DSTiÃÂ¢Ãâ¬ÃÅ¸ is the Distributed Spanning Tree and „iÃÂ¢Ãâ¬ÃÅ¸ is the total number of DSTs formed in the network.

ii. „HNiÃÂ¢Ãâ¬ÃÅ¸ is the Head Node (HN) and „iÃÂ¢Ãâ¬ÃÅ¸ is the total number of HNs in the peer network equal to the number of DSTs.

iii. „„LNÃÂ¢Ãâ¬ÃÅ¸ refers to the Leaf Node(s). In „LNiz ÃÂ¢Ãâ¬ÃÅ¸, refers to the corresponding HNi and 0 < z ≤ ij − 1, where „ij − 1ÃÂ¢Ãâ¬ÃÅ¸ is the total number of LNs in the corresponding DST.

Definition 2: The number of messages required to pass the nodes in the MP2P network with the DST structure can be evaluated as the following equation,

n(msgpass)=((P/LN) × P × R) + ((P/LN) × T) (2)

AN ANT COLONY OPTIMIZATION OF DST

The Ant Colony Optimization technique is a probabilistic optimization technique which could find the optimal path in a graph, which is based on behaviours of ant seeking a path between their colonies and source of food. By applying the ACO over the formulated DST, we can obtain the optimal path in terms of reduced number of message passes among the nodes in the network. The Ant colony optimization Algorithm for DST optimization is presented in the fig 3.This algorithm consists of four procedure; optimization (G), propagate (),construction() and daemon action().The Procedure for optimization (G) consists of two operations:

1.Finding the optimal path between every HNs. Let HNi is a HN among { HN1,HN2,.....HNn}, where n is the number of HNs in the P2P network. HNi use probe message „pÃÂ¢Ãâ¬ÃÅ¸ to find the optimal path between HNi and other HNs.

2.The propagate (G,x,p) ,which propagates messages through different paths is called which takes graph G, HNi as „xÃÂ¢Ãâ¬ÃÅ¸ and probe message „pÃÂ¢Ãâ¬ÃÅ¸ as parameters. Probe message „pÃÂ¢Ãâ¬ÃÅ¸ is flooded through the possible path which increase the number of feasible path discovered between the HNi and other HN.

3.The construction (G,Ãâ ÃÂ®,x,z) which calculates the edge value through the destination HN, is called by HNi which takes the graph G, start HN „xÃÂ¢Ãâ¬ÃÅ¸ start HN „xÃÂ¢Ãâ¬ÃÅ¸ specific end HN „ZÃÂ¢Ãâ¬ÃÅ¸ and „Ãâ ÃÂ®ÃÂ¢Ãâ¬ÃÅ¸ as the parameter. The „Ãâ ÃÂ®ÃÂ¢Ãâ¬ÃÅ¸ is the measure of cumulative edge value between „xÃÂ¢Ãâ¬ÃÅ¸ and „zÃÂ¢Ãâ¬ÃÅ¸.ÃÂ¢Ãâ¬ÃÅ¸ „Ãâ ÃÂ®iÃÂ¢Ãâ¬ÃÅ¸ value is used to decide the optimal path between the nodes. The value of „ValÃÂ¢Ãâ¬ÃÅ¸ can be given as

other operation and may wait for some period (threshold value has to be set) and repeat the same sequence of action.

Case 2: If the value of LL or GL is set, then the LL is set and requests a reply message, which may contains the copy the replica that is to be sent to the requested LN.

If the value of Local Read Lock (LL) message received by the HN, then the message will be propagated and also send the wait message to the respective LN so that the unnecessary local read lock request messages can be avoided. This situation is termed as On-Demand Lock Forward (ODLF).By using ODLF, it is possible to reduce the unnecessary message passes among the nodes in the DSTs. So it reduces the message overload in the network.

If there is no acknowledgement messages from the sequence of read request for receiving the LN and if LN couldnÃÂ¢Ãâ¬ÃÅ¸t find its HN, then the respective LN becomes the HN. The LN may initiate the procedure to reconstruct the corresponding DST to form the cluster with the LNs of the unfound HN. This scenario is called as Re-Clustering (RC).

ALGORITHM FOR WRITE OPERATION

The LN send the write Lock message to its HN for performing write operation (update) on the data item. On receiving the write lock message, the HN will perform any of the following operations:

1. When a HN receives write lock message from any of its LNs, then the HN will check its GL value and the following sequence of actions may be performed:

If the GL value of the corresponding HN is set, then it will send a wait message to the LN to initiate the write request again.

If the LL value of the corresponding HN is set, then it will wait all the completion of local read operation. (iii) If GL value of the corresponding HN is not set, then it will set its LL and GL values and also updates the Write Operation Priority (WOP). Each WOP has its own PN and broadcasts the write lock message to all other HNs in the network.

2. When a HN receives write lock message from another HN, then dependant on the value of its GL the following sequence of actions can be performed.

If the GL value of the corresponding HN is set, check whether the write operation priority value of GL locked HN is WOPi greater than Priority Number GL requested HN.is PNj .

If the condition is TRUE, then the write lock message received HN will reply a wait message to the corresponding write lock requester HN and if the condition is FALSE, then Step-2b will be performed.

If the LL of the corresponding HN is set, then it will wait till the completion of the local read operation and then it proceed with Step-2c.

If GL is not set, then it set its GL value and of course updates its WOP value as the PN of the write lock requested HN (PNj) and sends write acknowledge message to the corresponding requester HN.

When the LN canÃÂ¢Ãâ¬ÃÅ¸t find its HN, then it aborts the write request process and proceeds with RC operation. If a HN finds that any of the HNs are not responding, then it induces any of the LNs under the faulty HN (LN can be identified by its previous communication through the corresponding HN) to proceed with RC operation.

CONCLUSION

By employing the DST structures in the P2P network, the consistency and replication efficiency can be achieved with the few messages compared to the traditional method. The scalability of the P2P network can be improved with the application of DST structures. The proposed model increases the data availability, reduces the bandwidth conception and number of messages in the network and also improves the fault tolerant capacity of the overall system. Further to enhance the effectiveness of the proposed system, the DST network is optimized with the Ant Colony Optimization method. It gives the optimal solution of the DST method and increased availability, enhanced consistency and scalability. The proposed consistency method also provides the effective local consistency management using the Ant Colony Optimization method. We have to plan to achieve the cluster based replica allocation for mobile peer to peer networks and also achieve the effective service cache management in the network.

References

Xin Sun, Jun Zheng, Qiongxin Liu, Yushu Liu, Dynamic data replication based on access cost in distributed systems, in: 2009 Fourth International Conference on Computer Sciences and Convergence Information Technology, IEEE 2009.
H. Stockinger, A. Samar, B. Allcock, I. Foster, K. Holtman, B. Tierney, File and object replication in data grids, J. Cluster Comput. 5 (3) (2002) 305–314.
A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, S. Tuecke, The data grid: towards an architecture for the distributed management and analysis of large scientific datasets, J. Netw. Comput.Appl. 23 (2001) 187– 200. [4] A.A. Helal, A.A. Heddaya, B.B. Bhargava, Replication Techniques in Distributed Systems, Kluwer Academic, ISBN: 0-7923-9800-9, 1996.
B. Ciciani, D.M. Dias, P.S. Yu, Analysis of replication in distributed database systems, IEEE Trans. Knowl Data Eng. 2 (2) (1990) 247–261.
Takahiro Hara, Sanjay Kumar Madria, Consistency management strategies for data replication in mobile ad hoc networks, IEEE Trans. Mob. Comput. 8 (7) (2009) 950–967.
S.Y. Hwang, K.K.S. Lee, Y.H. Chin, Data replication in a distributed system: a performance study, in: Proc. Conf. Database and Expert Systems Applications, 1996, pp. 708–717.
RenXun-yi, Wang Ru-chuan, Kong Qiang, Efficient model for replica consistency maintenance in data grids, in: International Symposium on Computer Science and its Applications, IEEE, 2008.
Chao-Tung Yang, Chun-Pin Fu, Ching-Hsien Hsu, File replication, maintenance, and consistency management services in data grids, J. Supercomput. 53 (3)(2009) 411–439.
R.M. Rahman, K. Barker, R. Alhajj, Replica placement design with static optimality and dynamic maintainability, in: Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid, CCGRIDÃÂ¢Ãâ¬ÃÅ¸06, pp. 434–437.
S.M. Park, J.H. Kim, Y.B. Ko, S. YoonW-, Dynamic data grid replication strategy based on internet hierarchy, in: The IInd Intl Workshop on Grid and Cooperative Computing, GCC2003, pp. 838–846.
HaiyingShen “IRM: Integrated File Replication and Consistency Maintenance in P2P Systems” IEEE transactions on Parallel and Distributed systems,Vol.21 pp.100-113. Top of FoBottom of Form
J. Luo, J.P. Hubaux, P. Eugster, PAN: providing reliable storage in mobile ad hoc networks with probabilistic quorum systems, in: Proc. ACM MobiHoc, 2003, pp. 1–12.
A. Datta, M. Hauswirth, K. Aberer, Updates in highly unreliable, replicated peer-to-peer systems, in:Proceedings OF IEEE ICDCSÃÂ¢Ãâ¬ÃÅ¸03, Providence, RI, USA, May 2003.
Xin Sun, KanLi,Yushu Liu,” An Efficient Replica Location Method in Hierarchical P2P Networks”,IEEE ICIS Intl Conf. Pp.769-774.
Xin Sun, Jun Zheng, Qiongxin Liu, Yushu Liu, “Dynamic data replication based on access cost in distributed systems”, in: 2009 Fourth International Conference on Computer Sciences and Convergence Information Technology, IEEE 2009.
WanleiZhou,Holmes,” The design and simulation of a hybrid replication control protocol” procin.IEEE Intl sym. on PAAN, pp.210-215.1999.
FerasHanandeh and Mutaz,” CFS: A New Dynamic Replication strategy for Data grids”,Intl,Arab JIT Vol.9 2012.
Baskaran .R and Paul.P.V,”Algorithm and direction for analysis of global Replica Management in P2P network”,IEEEIntl.Conf. in ICRTIT 2012.
Sylvain Dahan, Laurent Philippe, Jean-Marc Nicod, “The distributed spanning tree structure”, IEEE Trans. Parallel Distrib. Syst. (2009) 1738–1751.
Sylvain Dahan, Distributed spanning tree algorithms for large scale traversals,in: 11th International Conference on Parallel and Distributed Systems,ICPADSÃÂ¢Ãâ¬ÃÅ¸05, IEEE, 2005.
Xin Sun and Kan Li “ An Efficient Replica Location Method in Hierarchical P2P Networks” IEEE /ACIS Intl Conf.pp.769-774.2009.