An Improved C-PCA Technique to Detect Outliers Using Online Oversampling Approach

L.Dhivya; C.Timotta

Dept. of Computer Science & Engineering, PPG Institute of Technology, Coimbatore, Tamil Nadu, India.

Visit for more related articles at International Journal of Innovative Research in Science, Engineering and Technology

Abstract

Outlier detection is the process of identifying unusual behavior. It is widely used in data mining, for example to identify changes in customer behavior, fraud, and manufacturing flaws. In recent years, many researchers have proposed approaches to obtain optimal results in detecting anomalies. PCA-based detection, however, is challenging because of its computational cost. To overcome this computational complexity, online oversampling PCA is used. The algorithm enables quick online updating of the principal directions, which keeps computation efficient and satisfies the demands of online detection. Oversampling amplifies the impact of an outlier, which leads to more accurate detection. Experimental results show that the method is efficient in computation time and requires less memory. A clustering technique is also added for optimization.
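The oversampling idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring rule or the update equations, so everything here is an assumption made for illustration: the function names, the `ratio` parameter, the closed-form re-weighting of the mean and second moment (used in place of physically duplicating rows), and the score defined as one minus the absolute cosine between the dominant principal directions before and after oversampling. The clustering step is not shown.

```python
import numpy as np

def dominant_direction(cov):
    # eigh returns eigenvalues in ascending order, so the last column
    # is the eigenvector of the largest eigenvalue.
    _, vecs = np.linalg.eigh(cov)
    return vecs[:, -1]

def oversampling_scores(X, ratio=0.1):
    """Score each row by how far duplicating it rotates the first principal direction.

    Hypothetical helper: `ratio` is the assumed fraction of the data set size
    by which the target row is duplicated.
    """
    n = X.shape[0]
    mean = X.mean(axis=0)
    second_moment = X.T @ X / n  # estimate of E[x x^T]
    u = dominant_direction(second_moment - np.outer(mean, mean))

    scores = np.empty(n)
    for i, x in enumerate(X):
        # Adding ratio * n copies of x is equivalent to re-weighting the moments,
        # so the full data set never has to be rebuilt or stored twice.
        new_mean = (mean + ratio * x) / (1.0 + ratio)
        new_second = (second_moment + ratio * np.outer(x, x)) / (1.0 + ratio)
        u_i = dominant_direction(new_second - np.outer(new_mean, new_mean))
        # Eigenvector sign is arbitrary, so compare directions via |cosine|.
        scores[i] = 1.0 - abs(u @ u_i)
    return scores

# Synthetic data: points near the line y = x, plus one planted outlier (last row).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
inliers = np.outer(t, [1.0, 1.0]) + 0.05 * rng.normal(size=(200, 2))
X = np.vstack([inliers, [5.0, -5.0]])
scores = oversampling_scores(X)
print(int(np.argmax(scores)))  # index of the most suspicious row
```

In this sketch, duplicating an ordinary point barely moves the principal direction, while duplicating a point far from the main trend rotates it noticeably, which is the effect the abstract attributes to oversampling.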
