
TPA Based Batch Auditing Mechanism for Commercial Clouds

Mr. R. Sathyaraj 1, Dr. V. K. Manavala Sundaram, M.E., Ph.D. 2
  1. Department of CSE, Velalar College of Engineering & Technology, Erode, Tamilnadu, India
  2. Associate Professor (Sr. Gr)/CSE, Velalar College of Engineering & Technology, Erode, Tamilnadu, India

Abstract

Cloud computing enables highly scalable services that are consumed over the Internet and provisioned on user request. In the cloud environment, users' data are usually processed remotely on unknown machines that the users neither own nor operate, so users' control over their data is reduced when it is shared across remote machines. Entities are allowed to join and leave the cloud in a flexible manner. Public and private auditing schemes are used to monitor data access activities, and data access management can be handled by the cloud service providers (CSP). Cloud services are offered by providers such as Amazon, Google, Microsoft, Yahoo and Salesforce. Remote data storages are used to share data and services in the cloud environment: the data provider uploads the shared data into the data centers, and public auditing methods are used to verify the integrity of data held in these remote storages. A third-party auditor (TPA) checks the integrity of the outsourced data, and a privacy-preserving public auditing mechanism verifies data integrity without exposing the data. The TPA supports auditing for multiple users simultaneously through a batch auditing mechanism. Homomorphic linear authenticator and random masking techniques are used to keep the data hidden from the TPA. In this work, the privacy-preserving public auditing scheme is enhanced to perform data verification in a multi-user environment: the batch verification scheme is adapted to multi-user data sharing, data dynamism is integrated with the public auditability scheme, and the system is improved to support public-auditing-based data sharing in a commercial cloud environment.

INTRODUCTION

A cloud comprises processing, network, and storage elements, and cloud architecture consists of three abstract layers. Infrastructure is the lowest layer and is a means of delivering basic storage and compute capabilities as standardized services over the network. Servers, storage systems, switches, routers, and other systems handle specific types of workloads, from batch processing to server or storage augmentation during peak loads. The middle platform layer provides higher abstractions and services to develop, test, deploy, host, and maintain applications in the same integrated development environment. The application layer is the highest layer and features a complete application offered as a service.
In 1961, John McCarthy envisioned that “computation may someday be organized as a public utility.” We can view the cloud computing paradigm as a big step toward this dream. To realize it fully, however, we must address several significant problems and unexploited opportunities concerning the deployment, efficient operation, and use of cloud computing infrastructures.
The shift of computer processing, storage, and software delivery away from desktops and local servers, across the Internet, and into next-generation data centers creates both limitations and new opportunities for data management. Data is replicated across large geographic distances, where its availability and durability are paramount for cloud service providers. It is also stored at untrusted hosts, which creates enormous risks for data privacy. Computing power in clouds must be elastic to face changing conditions; for instance, providers can allocate additional computational resources on the fly to handle increased demand. They should deploy novel data management approaches, such as analytical data management tasks, multitenant databases for SaaS, or hybrid designs combining database management systems (DBMSs) and MapReduce-like systems, so as to address data limitations and harness cloud computing platforms' capabilities.
In cloud computing, a data center holds information that end-users would more traditionally have stored on their computers. This raises concerns regarding user privacy protection because users must outsource their data. Additionally, the move to centralized services could affect the privacy and security of users’ interactions. Security threats might happen in resource provisioning and during distributed application execution. Also, new threats are likely to emerge. For instance, hackers can use the virtualized infrastructure as a launching pad for new attacks. Cloud services should preserve data integrity and user privacy. At the same time, they should enhance interoperability across multiple cloud service providers. In this context, we must investigate new data-protection mechanisms to secure data privacy, resource security, and content copyrights.

RELATED WORK

Ateniese et al. are the first to consider public auditability in their "provable data possession" (PDP) model for ensuring possession of data files on untrusted storages. They utilize RSA-based homomorphic linear authenticators for auditing outsourced data and suggest randomly sampling a few blocks of the file. However, among their two proposed schemes, the one with public auditability exposes the linear combination of sampled blocks to the external auditor. When used directly, their protocol is not provably privacy preserving and thus may leak user data to the external auditor. Juels et al. describe a "proof of retrievability" (PoR) model, where spot-checking and error-correcting codes are used to ensure both "possession" and "retrievability" of data files on remote archive service systems. However, the number of audit challenges a user can perform is fixed a priori, and public auditability is not supported in their main scheme. Although they describe a straightforward Merkle-tree construction for public PoRs, this approach only works with encrypted data. Later, Bowers et al. [1] propose an improved framework for PoR protocols that generalizes Juels' work. Dodis et al. [6] also study different variants of PoR with private auditability. Shacham and Waters design an improved PoR scheme with proofs of security in a formally defined security model. Similar to the prior construction, they use publicly verifiable homomorphic linear authenticators, here built from provably secure BLS signatures; based on the elegant BLS construction, a compact and publicly verifiable scheme is obtained. Again, their approach is not privacy preserving for the same reason. Shah et al. [10] propose introducing a TPA to keep online storage honest by first encrypting the data and then sending a number of precomputed symmetric-keyed hashes over the encrypted data to the auditor. The auditor verifies the integrity of the data file and the server's possession of a previously committed decryption key. This scheme only works for encrypted files, requires the auditor to maintain state, and suffers from bounded usage, which potentially brings an online burden to users when the keyed hashes are used up.
Dynamic data have also attracted attention in the recent literature on efficiently providing integrity guarantees for remotely stored data. Ateniese et al. are the first to propose a partially dynamic version of the prior PDP scheme, using only symmetric key cryptography but with a bounded number of audits. In [9], Wang et al. consider similar support for partially dynamic data storage in a distributed scenario with the additional feature of data error localization. In a subsequent work, Wang et al. [8] propose to combine the BLS-based HLA with a Merkle hash tree (MHT) to support full data dynamics. Concurrently, Erway et al. develop a skip-list-based scheme that also enables provable data possession with full dynamics support. However, the verification in both protocols requires the linear combination of sampled blocks as an input, as in the earlier designs, and thus does not support privacy-preserving auditing.
In other related work, Sebe et al. thoroughly study a set of requirements that ought to be satisfied for a remote data possession checking protocol to be of practical use. Their proposed protocol supports an unlimited number of file integrity verifications and allows a preset tradeoff between the protocol running time and the local storage burden at the user. Schwarz and Miller present the first study of checking the integrity of remotely stored data across multiple distributed servers. Their approach is based on erasure-correcting codes and efficient algebraic signatures, which have an aggregation property similar to that of the homomorphic authenticator utilized in our approach. Curtmola et al. aim to ensure data possession of multiple replicas across a distributed storage system. They extend the PDP scheme to cover multiple replicas without encoding each replica separately, providing guarantees that multiple copies of the data are actually maintained. In [7], Bowers et al. utilize a two-layer erasure-correcting code structure on the remotely archived data and extend their PoR model [1] to a distributed scenario with high data-availability assurance. While all the above schemes provide methods for efficient auditing and provable assurance of the correctness of remotely stored data, almost none of them meet all the requirements for privacy-preserving public auditing of storage. Moreover, none of these schemes consider batch auditing, while our scheme can greatly reduce the computation cost on the TPA when coping with a large number of audit delegations.

CLOUD DATA STORAGE SERVICES

We consider a cloud data storage service involving three different entities: the cloud user, who has a large amount of data files to be stored in the cloud; the cloud server (CS), which is managed by the cloud service provider to provide data storage service and has significant storage space and computation resources; and the third-party auditor, who has expertise and capabilities that cloud users do not have and is trusted to assess the reliability of the cloud storage service on behalf of the user upon request. Users rely on the CS for cloud data storage and maintenance. They may also dynamically interact with the CS to access and update their stored data for various application purposes. As users no longer possess their data locally, it is of critical importance for them to ensure that their data are being correctly stored and maintained. To save computation resources as well as the online burden potentially brought by periodic storage correctness verification, cloud users may resort to a TPA to ensure the storage integrity of their outsourced data, while hoping to keep their data private from the TPA. We assume the data integrity threats toward users' data can come from both internal and external attacks at the CS. These may include software bugs, hardware failures, bugs in the network path, economically motivated hackers, and malicious or accidental management errors. Besides, the CS can be self-interested; for its own benefit, such as maintaining reputation, the CS might even decide to hide data corruption incidents from users. Using a third-party auditing service provides a cost-effective method for users to gain trust in the cloud. We assume the TPA, who is in the business of auditing, is reliable and independent. However, it may harm the user if the TPA could learn the outsourced data after the audit.
Note that in our model, beyond users' reluctance to leak data to the TPA, we also assume that cloud servers have no incentives to reveal their hosted data to external parties. On the one hand, there are regulations, e.g., HIPAA, requiring the CS to maintain users' data privacy. On the other hand, as users' data belong to their business assets [10], there also exist financial incentives for the CS to protect the data from any external parties. Therefore, we assume that neither the CS nor the TPA has motivation to collude with the other during the auditing process; in other words, neither entity will deviate from the prescribed protocol execution in the following presentation. To authorize the CS to respond to audits delegated to the TPA, the user can issue a certificate on the TPA's public key, and all audits from the TPA are authenticated against such a certificate. These authentication handshakes are omitted in the following presentation.

PRIVACY-PRESERVING PUBLIC AUDITING SCHEME FOR CLOUDS

Cloud computing has been envisioned as the next generation information technology (IT) architecture for enterprises, due to its long list of unprecedented advantages in the IT history: on-demand self-service, ubiquitous network access, location independent resource pooling, rapid resource elasticity, usage-based pricing and transference of risk [2]. As a disruptive technology with profound implications, cloud computing is transforming the very nature of how businesses use information technology. One fundamental aspect of this paradigm shifting is that data are being centralized or outsourced to the cloud. From users’ perspective, including both individuals and IT enterprises, storing data remotely to the cloud in a flexible on-demand manner brings appealing benefits: relief of the burden for storage management, universal data access with location independence, and avoidance of capital expenditure on hardware, software, and personnel maintenances, etc., [3].
While cloud computing makes these advantages more appealing than ever, it also brings new and challenging security threats toward users' outsourced data. Since cloud service providers (CSP) are separate administrative entities, data outsourcing actually relinquishes the user's ultimate control over the fate of their data. As a result, the correctness of the data in the cloud is put at risk for the following reasons. First, although the infrastructures under the cloud are much more powerful and reliable than personal computing devices, they still face a broad range of both internal and external threats to data integrity [4]; examples of outages and security breaches of noteworthy cloud services appear from time to time. Second, there exist various motivations for a CSP to behave unfaithfully toward the cloud users regarding their outsourced data status. For example, a CSP might reclaim storage for monetary reasons by discarding data that have rarely or never been accessed, or even hide data loss incidents to maintain a reputation [10]. In short, although outsourcing data to the cloud is economically attractive for long-term large-scale storage, it does not immediately offer any guarantee on data integrity and availability. This problem, if not properly addressed, may impede the success of cloud architecture.
As users no longer physically possess the storage of their data, traditional cryptographic primitives for data security protection cannot be directly adopted. In particular, simply downloading all the data for integrity verification is not a practical solution due to the expense of I/O and of transmission across the network. Besides, it is often insufficient to detect data corruption only when accessing the data, as this gives users no correctness assurance for unaccessed data and might be too late to recover from data loss or damage. Considering the large size of the outsourced data and the user's constrained resource capability, the task of auditing data correctness in a cloud environment can be formidable and expensive for cloud users [12], [8]. Moreover, the overhead of using cloud storage should be minimized as much as possible, such that a user does not need to perform too many operations to use the data (in addition to retrieving it). In particular, users may not want to go through the complexity of verifying the data integrity. Besides, more than one user may access the same cloud storage, say in an enterprise setting. For easier management, it is desirable that the cloud only entertain verification requests from a single designated party.
To fully ensure data integrity and save the cloud users' computation resources as well as online burden, it is of critical importance to enable a public auditing service for cloud data storage, so that users may resort to an independent third-party auditor (TPA) to audit the outsourced data when needed. The TPA, who has expertise and capabilities that users do not, can periodically check the integrity of all the data stored in the cloud on behalf of the users, which provides a much easier and more affordable way for users to ensure their storage correctness in the cloud. Moreover, in addition to helping users evaluate the risk of their subscribed cloud data services, the audit result from the TPA would also be beneficial for the cloud service providers to improve their cloud-based service platform, and can even serve independent arbitration purposes [10]. In a word, enabling public auditing services will play an important role for this nascent cloud economy to become fully established, where users will need ways to assess risk and gain trust in the cloud.
Recently, the notion of public auditability has been proposed in the context of ensuring remotely stored data integrity under different system and security models. Public auditability allows an external party, in addition to the user himself, to verify the correctness of remotely stored data. However, most of these schemes [8] do not consider the privacy protection of users' data against external auditors; indeed, they may potentially reveal user data to auditors. This severe drawback greatly affects the security of these protocols in cloud computing. From the perspective of protecting data privacy, the users, who own the data and rely on the TPA just for the storage security of their data, do not want this auditing process to introduce new vulnerabilities of unauthorized information leakage toward their data [11]. Moreover, there are legal regulations, such as the US Health Insurance Portability and Accountability Act (HIPAA), further demanding that outsourced data not be leaked to external parties [10]. Simply encrypting the data before outsourcing could be one way to mitigate this privacy concern of data auditing, but it could also be an overkill when employed in the case of unencrypted/public cloud data, due to the unnecessary processing burden for cloud users. Besides, encryption does not completely solve the problem of protecting data privacy against third-party auditing but just reduces it to the complex key management domain. Unauthorized data leakage still remains possible due to the potential exposure of decryption keys.
Therefore, how to enable a privacy-preserving third-party auditing protocol, independent of data encryption, is the problem we tackle in this paper. Our work is among the first few to support privacy-preserving public auditing in cloud computing, with a focus on data storage. Besides, with the prevalence of cloud computing, a foreseeable increase of auditing tasks from different users may be delegated to the TPA. As the individual auditing of these growing tasks can be tedious and cumbersome, a natural demand is how to enable the TPA to efficiently perform multiple auditing tasks in a batch manner, i.e., simultaneously. To address these problems, our work utilizes the technique of a public-key-based homomorphic linear authenticator (HLA), which enables the TPA to perform the auditing without demanding a local copy of the data and thus drastically reduces the communication and computation overhead compared to straightforward data auditing approaches. By integrating the HLA with random masking, our protocol guarantees that the TPA cannot learn any knowledge about the data content stored in the cloud server (CS) during the efficient auditing process (a toy sketch of this aggregation-and-masking idea is given after the contribution list below). The aggregation and algebraic properties of the authenticator further benefit our design for batch auditing. Specifically, our contribution can be summarized in the following three aspects:
1. We motivate the public auditing system of data storage security in cloud computing and provide a privacy-preserving auditing protocol. Our scheme enables an external auditor to audit a user's cloud data without learning the data content.
2. To the best of our knowledge, our scheme is the first to support scalable and efficient privacy-preserving public storage auditing in cloud. Specifically, our scheme achieves batch auditing where multiple delegated auditing tasks from different users can be performed simultaneously by the TPA in a privacy-preserving manner.
3. We prove the security and justify the performance of our proposed schemes through concrete experiments and comparisons with the state of the art.
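To make the aggregation-and-masking idea concrete, the toy sketch below (in Python) uses a single generator of Z_p* to stand in for the BLS-based authenticator of the full protocol: the server returns an aggregated, randomly masked proof, and the TPA verifies it without ever seeing the underlying block combination. The group parameters, function names, and block values are illustrative assumptions, not part of the published scheme, and the construction is not secure; it only exposes the structure of the check.

```python
import secrets

# Toy parameters: a real deployment would use a cryptographically sized
# group and a proper (e.g., BLS-based) authenticator.
P = 2**127 - 1          # Mersenne prime used as the group modulus (assumption)
G = 5                   # element of Z_P* standing in for the generator g

def tag_block(m: int) -> int:
    """Owner side: authenticator for one data block (toy: sigma_i = g^{m_i})."""
    return pow(G, m, P)

def server_response(blocks, tags, challenge, mask_r):
    """Cloud server side: aggregate the challenged blocks and their tags.
    challenge is a list of (index, coefficient) pairs chosen by the TPA."""
    mu = sum(coeff * blocks[i] for i, coeff in challenge)
    sigma = 1
    for i, coeff in challenge:
        sigma = (sigma * pow(tags[i], coeff, P)) % P
    # Random masking: the TPA only ever sees mu + r and g^r, never mu itself.
    return mu + mask_r, sigma, pow(G, mask_r, P)

def tpa_verify(masked_mu, sigma, mask_commitment) -> bool:
    """TPA side: check g^{mu + r} == sigma * g^r without learning mu."""
    return pow(G, masked_mu, P) == (sigma * mask_commitment) % P

if __name__ == "__main__":
    blocks = [17, 42, 99, 7]                   # outsourced data blocks (as integers)
    tags = [tag_block(m) for m in blocks]      # stored alongside the data in the cloud
    challenge = [(0, 3), (2, 5)]               # TPA samples blocks 0 and 2
    r = secrets.randbelow(P)                   # server-chosen random mask
    masked_mu, sigma, commit = server_response(blocks, tags, challenge, r)
    print("audit passes:", tpa_verify(masked_mu, sigma, commit))
```

In the actual protocol the same structure is instantiated with BLS signatures over bilinear groups, which is what makes the proofs compact and publicly verifiable.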

ISSUES ON PUBLIC AUDITING SCHEMES

Remote data storages are used to share data and services in the cloud environment. The data provider uploads the shared data into the data centers, and public auditing methods are used to verify the data integrity in these remote storages. A third-party auditor (TPA) checks the integrity of the outsourced data, and a privacy-preserving public auditing mechanism verifies data integrity with privacy. The TPA supports auditing for multiple users simultaneously through a batch auditing mechanism. Homomorphic linear authenticator and random masking techniques are used to protect the data from the TPA. The following drawbacks are identified in the existing system.
1. Data dynamism is not tuned for the batch auditing scheme.
2. Commercial cloud operations are not supported by the system.
3. Data dynamism is not adapted for the privacy-preserved auditing mechanism.
4. Privacy is provided only for the single-user verification process.

TPA BASED AUDITING SCHEME FOR COMMERCIAL CLOUDS

The privacy-preserving public auditing scheme is enhanced to perform data verification in a multi-user environment. The batch verification scheme is adapted to the multi-user data sharing environment, data dynamism is integrated with the public data auditability scheme, and the system is improved to support public-auditing-based data sharing in a commercial cloud environment.
The cloud data sharing scheme is designed to manage data sharing based on an economic model. The batch auditing mechanism is adapted for the data verification process, and dynamic data updates are managed within the auditing process. The system is divided into five major modules: data center, third-party auditor, client, data dynamism handler and batch auditing. The cloud data center manages the shared data values. Auditing operations are initiated by the third-party auditor. The client application is designed to manage data upload and download operations. Data update operations are managed under the data dynamism module. Batch auditing is designed for the multi-user data verification process.

6.1. Data Center

The data center application is designed to allocate storage space for the data providers. The data center maintains data files for multiple providers, allocating storage areas of different sizes to each provider. Data files are delivered to the clients.
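A minimal sketch of this module is given below, assuming an in-memory store keyed by provider; the class and method names (DataCenter, allocate, upload, deliver_blocks) and the 4 KB block size are illustrative choices, not part of a published interface.

```python
class DataCenter:
    """In-memory stand-in for the data center: per-provider quotas and block storage."""

    def __init__(self):
        self.quota = {}      # provider id -> remaining allocated bytes
        self.files = {}      # (provider id, file name) -> list of blocks

    def allocate(self, provider: str, size_bytes: int) -> None:
        """Reserve a differently sized storage area for each data provider."""
        self.quota[provider] = size_bytes

    def upload(self, provider: str, name: str, data: bytes, block_size: int = 4096) -> None:
        """Store a provider's file as fixed-size blocks, honouring its quota."""
        if len(data) > self.quota.get(provider, 0):
            raise ValueError("upload exceeds the provider's allocated space")
        self.files[(provider, name)] = [data[i:i + block_size]
                                        for i in range(0, len(data), block_size)]
        self.quota[provider] -= len(data)

    def deliver_blocks(self, provider: str, name: str):
        """Deliver a stored file to a requesting client, block by block."""
        yield from self.files[(provider, name)]


# Example usage
dc = DataCenter()
dc.allocate("provider-A", 1 << 20)                        # 1 MB storage area
dc.upload("provider-A", "report.txt", b"shared data " * 100)
received = b"".join(dc.deliver_blocks("provider-A", "report.txt"))
```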

6.2. Third Party Auditor

The third-party auditor (TPA) maintains the signatures for shared data files and performs public data verification on behalf of the data providers. Data integrity verification is performed using the Secure Hash Algorithm (SHA). Homomorphic linear authenticator and random masking techniques are used for the privacy preservation process.
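The following sketch illustrates the SHA-based check described above, under the assumption that the TPA keeps one SHA-256 digest per data block as the recorded "signature"; the homomorphic linear authenticator and random masking layer of the full protocol is omitted here for brevity.

```python
import hashlib

class ThirdPartyAuditor:
    """Keeps one SHA-256 digest per block as the recorded block 'signature'."""

    def __init__(self):
        self.signatures = {}   # (owner, file name, block index) -> hex digest

    def register(self, owner: str, name: str, blocks) -> None:
        """Record a digest for every block when the provider uploads a file."""
        for i, block in enumerate(blocks):
            self.signatures[(owner, name, i)] = hashlib.sha256(block).hexdigest()

    def audit_block(self, owner: str, name: str, index: int, block_from_server: bytes) -> bool:
        """Public verification: recompute the digest and compare with the stored one."""
        expected = self.signatures.get((owner, name, index))
        return expected is not None and expected == hashlib.sha256(block_from_server).hexdigest()
```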

6.3. Client

The client application is designed to access the shared data values. The cloud user initiates the download process, the data access information is updated at the data center, and the data center transfers the data as blocks.
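A brief sketch of the client-side flow is shown below; it assumes the DataCenter object sketched in Section 6.1, and the access log shown here is an illustrative stand-in for the data access information recorded at the data center.

```python
def client_download(data_center, access_log: list, user: str, provider: str, name: str) -> bytes:
    """User-initiated download: record the access, then reassemble the delivered blocks."""
    access_log.append((user, provider, name))      # data access information kept at the data center
    return b"".join(data_center.deliver_blocks(provider, name))


# Example usage (dc is the DataCenter instance from the earlier sketch)
# access_log = []
# data = client_download(dc, access_log, "client-1", "provider-A", "report.txt")
```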

6.4. Data Dynamism Handler

Shared data values are managed as blocks. Block update and delete operations are handled with a signature update process, and block insertion operations are also supported; in each case the corresponding block signatures are refreshed as part of the data dynamism process.
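The sketch below illustrates how block-level update, insert and delete operations can be paired with signature refreshes, assuming plain SHA-256 digests as in the TPA sketch above; a deployed scheme would instead refresh HLA tags (and, for example, a Merkle hash tree root).

```python
import hashlib

class DataDynamismHandler:
    """Pairs every block with a signature and refreshes it on each dynamic operation."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.signatures = [hashlib.sha256(b).hexdigest() for b in self.blocks]

    def update(self, index: int, new_block: bytes) -> None:
        """Modify a block in place and refresh its signature."""
        self.blocks[index] = new_block
        self.signatures[index] = hashlib.sha256(new_block).hexdigest()

    def insert(self, index: int, new_block: bytes) -> None:
        """Insert a block and a matching signature at the given position."""
        self.blocks.insert(index, new_block)
        self.signatures.insert(index, hashlib.sha256(new_block).hexdigest())

    def delete(self, index: int) -> None:
        """Remove a block together with its signature."""
        del self.blocks[index]
        del self.signatures[index]
```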

6.5. Batch Auditing

Data integrity verification is carried out under the auditing process. Batch auditing is applied to perform simultaneous data verification and is tuned for the multi-user environment. Data dynamism is integrated with the batch auditing process.
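The following sketch shows the batch verification step in the toy HLA setting introduced earlier: the responses of several users are checked with one aggregated exponentiation equation, and per-user checks are repeated only if the aggregate check fails. The parameters and function names are illustrative assumptions, not the published construction.

```python
# Same toy group parameters as in the earlier HLA sketch (assumptions, not secure).
P = 2**127 - 1
G = 5

def tpa_verify_single(masked_mu: int, sigma: int, commit: int) -> bool:
    """Per-user check: g^{mu + r} == sigma * g^r."""
    return pow(G, masked_mu, P) == (sigma * commit) % P

def tpa_batch_verify(responses) -> bool:
    """One aggregated check over all users' responses:
    g^{sum of masked mu values} == product of (sigma_k * g^{r_k})."""
    total_mu = sum(masked_mu for masked_mu, _, _ in responses)
    rhs = 1
    for _, sigma, commit in responses:
        rhs = (rhs * sigma * commit) % P
    return pow(G, total_mu, P) == rhs

def audit_batch(responses):
    """Verify many delegations with a single equation; only if the aggregate
    check fails does the TPA fall back to per-user checks to locate the fault."""
    if tpa_batch_verify(responses):
        return [True] * len(responses)
    return [tpa_verify_single(*resp) for resp in responses]
```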

CONCLUSION

Cloud data providers maintain their shared data in remote storages. A third-party auditor (TPA) based mechanism is used to verify data integrity in cloud storages, and a batch auditing mechanism is provided for the multi-user environment. Commercial cloud services are supported with privacy-preserved data verification schemes. The system supports the multi-user data verification process and reduces client resource consumption. Data dynamism is supported in the multi-user environment, and simultaneous data verification is performed by the batch verification mechanism.

References

  1. K.D. Bowers, A. Juels, and A. Oprea, “Proofs of Retrievability: Theory and Implementation,” Proc. ACM Workshop Cloud Computing Security (CCSW ’09), pp. 43-54, 2009.
  2. P. Mell and T. Grance, “Draft NIST Working Definition of Cloud Computing,” http://csrc.nist.gov/groups/SNS/cloudcomputing/index.html, June 2009.
  3. M. Armbrust, A. Fox, R. Griffith, A.D. Joseph, R.H. Katz, A. Konwinski, G. Lee, D.A. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, “Above the Clouds: A Berkeley View of Cloud Computing,” Technical Report UCB-EECS-2009-28, Univ. of California, Berkeley, Feb. 2009.
  4. Cloud Security Alliance, “Top Threats to Cloud Computing,” http://www.cloudsecurityalliance.org, 2010.
  5. C. Wang, S.S.M. Chow, Q. Wang, K. Ren, and W. Lou, “Privacy-Preserving Public Auditing for Secure Cloud Storage,” IEEE Trans. Computers, vol. 62, no. 2, Feb. 2013.
  6. Y. Dodis, S.P. Vadhan, and D. Wichs, “Proofs of Retrievability via Hardness Amplification,” Proc. Theory of Cryptography Conf. Theory of Cryptography (TCC), pp. 109-127, 2009.
  7. K.D. Bowers, A. Juels, and A. Oprea, “HAIL: A High-Availability and Integrity Layer for Cloud Storage,” Proc. ACM Conf. Computer and Comm. Security (CCS ’09), pp. 187-198, 2009.
  8. Q. Wang, C. Wang, and J. Li, “Enabling Public Auditability and Data Dynamics for Storage Security in Cloud Computing,” IEEE Trans. Parallel and Distributed Systems, vol. 22, no. 5, pp. 847-859, May 2011.
  9. C. Wang, Q. Wang, K. Ren, and W. Lou, “Towards Secure and Dependable Storage Services in Cloud Computing,” IEEE Trans. Services Computing, vol. 5, no. 2, pp. 220-232, Apr.-June 2012.
  10. C. Wang, K. Ren, W. Lou, and J. Li, “Towards Publicly Auditable Secure Cloud Data Storage Services,” IEEE Network Magazine, vol. 24, no. 4, pp. 19-24, July/Aug. 2010.