Discovering Relations among Documents Using Novel Text Retrieval Technique

ISSN Online: 2320-9801, Print: 2320-9798

Special Issue Article Open Access

Abstract

Text categorization is a key text mining technique for assigning documents to categories in a supervised manner. In this paper we study the automatic categorization of news items. The categorization algorithm transforms each document into a vector of weights corresponding to an automatically extracted set of keywords. This process is applied to a large set of news items, forming a multi-dimensional space populated by news items of known categories. An unknown news item is likewise transformed into a vector of keyword weights and then categorized in this space using the k-means method. Finally, the documents are compared on their weighted keywords to find which are most similar.
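
The pipeline described above can be illustrated with a minimal sketch. The abstract does not name the weighting scheme, the similarity measure, or how k-means is applied to labeled items, so the code below makes three assumptions: keyword weights are smoothed TF-IDF values, similarity is cosine similarity, and an unknown item is assigned to the category whose centroid is nearest (the assignment step of k-means, with centroids computed from the items of known category). All function names and the sample news items are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase and keep alphabetic runs only (a simplifying assumption).
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    return cleaned.split()


def fit_idf(token_lists):
    # Smoothed inverse document frequency for every keyword in the known items.
    n = len(token_lists)
    df = Counter(t for toks in token_lists for t in set(toks))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}


def to_vector(tokens, idf):
    # Sparse {keyword: weight} vector; keywords unseen during fitting are ignored.
    tf = Counter(t for t in tokens if t in idf)
    total = sum(tf.values())
    return {t: (c / total) * idf[t] for t, c in tf.items()}


def cosine(a, b):
    # Cosine similarity between two sparse weight vectors.
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroids_by_label(vectors, labels):
    # Mean weight vector of each known category.
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter(labels)
    for vec, label in zip(vectors, labels):
        for t, w in vec.items():
            sums[label][t] += w
    return {
        label: {t: w / counts[label] for t, w in terms.items()}
        for label, terms in sums.items()
    }


def categorize(vec, centroids):
    # Assign the item to the category with the most similar centroid.
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical news items of known category.
known_items = [
    "The team won the football match",
    "The striker scored in the football final",
    "The bank raised interest rates",
    "Markets fell as the bank cut rates",
]
known_labels = ["sport", "sport", "finance", "finance"]

token_lists = [tokenize(item) for item in known_items]
idf = fit_idf(token_lists)
known_vectors = [to_vector(toks, idf) for toks in token_lists]
centroids = centroids_by_label(known_vectors, known_labels)

# An unknown item is mapped into the same keyword space and categorized.
unknown_vector = to_vector(tokenize("Football fans cheered the match"), idf)
predicted = categorize(unknown_vector, centroids)
```

Comparing two documents then reduces to calling `cosine` on their weight vectors; a higher value indicates more shared, highly weighted keywords.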

Manjiri Gajanan Ghadi, Carmen Lysandra Pereira, Manimozhi R.
