ISSN ONLINE(2320-9801) PRINT (2320-9798)

Research Article Open Access

An Optimized WebDocument Clustering Using Recurrent Set IGA & Confusion Matrix For Fact Retrieval

Abstract

The first phase applies a Genetic Algorithm (GA) to the global clustering process, solving the optimization problem in both clustering and feature selection. The second phase follows the concept of a confusion matrix for derivative works and includes an improved GA for the final classification. The third phase presents an optimization technique that evaluates cluster optimality for proficient document clustering based on the optimized conceptual feature words. The final phase introduces a joint approach to clustering web pages, which first finds the recurrent sets and then clusters the documents. These recurrent sets are generated using a recurrent pattern expansion technique. Applying the Fuzzy K-Means algorithm to the optimized web document clustering with recurrent sets then yields clusters whose documents are closely related and share related features. Experimental results show that our approach is more efficient than the two joint approaches above and is more robust. Performance evaluation shows benefits in terms of cluster optimality, true negative rate, and information retrieval on real datasets and on the UCI repository bag-of-words dataset.

C. Josephine Christy, Dr. B. Nagarajan
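
The abstract reports true negative rate as one of its evaluation measures and uses a confusion matrix in the second phase. As a minimal sketch of how that measure is commonly read off a multi-class confusion matrix, the Python below computes a one-vs-rest true negative rate per class. The function name `true_negative_rates`, the row/column convention, and the example counts are assumptions made for illustration; they are not taken from the paper and do not represent its results.

```python
def true_negative_rates(cm):
    """Per-class true negative rate (specificity), one-vs-rest.

    Assumed convention: cm[i][j] is the number of documents whose true
    class is i and whose assigned cluster/class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    rates = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                         # class k, assigned elsewhere
        fp = sum(cm[i][k] for i in range(n)) - tp    # other classes, assigned to k
        tn = total - tp - fn - fp                    # everything not involving k
        negatives = tn + fp
        rates.append(tn / negatives if negatives else 0.0)
    return rates


# Hypothetical counts for three document classes (illustration only).
example_cm = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
rates = true_negative_rates(example_cm)
```

For the hypothetical matrix above, each class has 20 documents that belong to other classes, so the rate for a class is the fraction of those 20 that were not assigned to it.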