ISSN: 2229-371X

All submissions of the EM system will be redirected to Online Manuscript Submission System. Authors are requested to submit articles directly to Online Manuscript Submission System of respective journal.

Research Article Open Access

AN APPROACH TO BUILD A WEB CRAWLER USING CLUSTERING BASED K-MEANS ALGORITHM

Abstract

Central to any data-mining project is having sufficient amounts of data that can be processed to provide meaningful and statistically relevant information. But getting the unstructured data is only the initial stage and that data must be transformed into a structured format which is suitable for further processing. In this paper we have proposed architecture for the web-crawling and arrange their unstructured data using cluster based algorithm. . The clustering process is based on the k-means algorithm. This paper is completely based on the focused crawler mechanism that only scans the pages by using general crawling policies.

Nilesh Jain, Priyanka Mangal, Dr. Ashok Bhansali

To read the full article Download Full Article