ISSN: Online (2320-9801), Print (2320-9798)

Special Issue Article | Open Access

Text Classification Using Symbolic Data Analysis

Abstract

In the real world, an operational text classification system usually works in an environment where the number of human-annotated training documents is small even though there are thousands of classes. In such an environment, text classifiers are probably more appropriate for practical systems than other, more complex learning models. Text classifiers are typically applied to free-flowing, unstructured text documents, and classification relies on a statistical feature-weighting method that involves pre-processing: texts are reduced by eliminating digits, punctuation, hyphens, stop words, and high- and low-frequency words, and by applying stemming. This strategy cannot be applied to unstructured texts describing advertisements, since these texts give their descriptions in terms of attribute values. Because existing text classifiers are not useful for classifying such texts, the concept of symbolic data analysis is introduced. Symbolic Data Analysis (SDA) is a new domain in the area of knowledge discovery and data management, related to multivariate analysis, pattern recognition, databases, and artificial intelligence. A method based on SDA, which uses a symbolic database and querying processes, is proposed for classifying unstructured text documents. The proposed technique appears to be an efficient way to classify texts in unstructured text documents and is therefore introduced to obtain better results when dealing with such documents.
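The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the paper's method, which the abstract does not specify. It assumes that an advertisement's attribute values can be pulled out with simple patterns and stored as a symbolic description (here a single value for bedrooms and an interval for rent). The sample advertisements, the stop-word list, the attribute names, and the functions `preprocess`, `to_symbolic`, and `query` are all hypothetical; stemming and frequency filtering are omitted for brevity.

```python
import re

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"a", "an", "the", "for", "in", "of", "with", "and", "is"}

# Hypothetical advertisement texts.
ADS = {
    "ad1": "2 BHK flat for rent in Mysore, Rs. 8000-10000",
    "ad2": "3 BHK house for rent, Rs. 15000-18000",
    "ad3": "2 BHK flat, Rs. 12000-14000",
}

def preprocess(text):
    """Conventional pre-processing: drop digits, punctuation, and stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def to_symbolic(text):
    """Build a symbolic description from attribute values (assumed patterns)."""
    desc = {}
    m = re.search(r"(\d+)\s*bhk", text, re.IGNORECASE)
    if m:
        desc["bedrooms"] = int(m.group(1))
    m = re.search(r"rs\.?\s*(\d+)\s*-\s*(\d+)", text, re.IGNORECASE)
    if m:
        # Interval-valued attribute: (lower bound, upper bound).
        desc["rent"] = (int(m.group(1)), int(m.group(2)))
    return desc

def overlaps(a, b):
    """True if closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

def query(database, bedrooms, rent_range):
    """Return ids whose description matches the bedroom count and rent range."""
    return [
        doc_id
        for doc_id, desc in database.items()
        if desc.get("bedrooms") == bedrooms
        and "rent" in desc
        and overlaps(desc["rent"], rent_range)
    ]

if __name__ == "__main__":
    # Digits are discarded, so the attribute values are lost.
    print(preprocess(ADS["ad1"]))
    # The symbolic description keeps them.
    database = {doc_id: to_symbolic(text) for doc_id, text in ADS.items()}
    print(database["ad1"])
    print(query(database, bedrooms=2, rent_range=(9000, 11000)))
```

In this sketch the conventional pipeline reduces the first advertisement to word tokens and loses both the bedroom count and the rent range, whereas the symbolic description retains them and can be queried directly.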

Sangeetha N
