Kullback-Leibler Divergence Measurement for Clustering Based On Probability Distribution Similarity
Clustering based on distribution measurement is an essential task in data mining. Previous methods that extend traditional partitioning-based clustering methods such as k-means and density-based clustering methods such as DBSCAN rely on geometric distances between objects; the probability distributions of the objects have not been considered when measuring similarity. In this paper, objects are systematically modeled in discrete domains, and the Kullback-Leibler divergence is used to measure the similarity between the probability distributions over discrete values. This measure is integrated into partitioning-based and density-based clustering methods to cluster the objects. Finally, the execution time and the number of noise points detected are measured and compared for the partitioning-based and the density-based clustering algorithms. With KL divergence, partitioning-based and density-based clustering reduced the execution time to 68 sec and detected 22 noise points. Distribution-based clustering is therefore more efficient than distance-based clustering.
Priyadharshini.J, Akila Devi.S, Askerunisa.A
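As an illustration of the similarity measure named in the abstract, the following is a minimal sketch of the Kullback-Leibler divergence between two discrete probability distributions, D(p || q) = sum_i p_i log(p_i / q_i), together with an assignment step of the kind a partitioning-based method might use. The function names `kl_divergence` and `nearest_center`, the `eps` floor for zero probabilities, and the sample distributions are assumptions made for this sketch; they are not taken from the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return D(p || q) for two discrete distributions given as sequences.

    Assumption for this sketch: zero entries of q are floored at eps so the
    logarithm stays finite; the paper may handle zeros differently.
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined over the same discrete domain")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # terms with p_i = 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


def nearest_center(obj, centers):
    """Return the index of the center with the smallest KL divergence from obj.

    Hypothetical helper: shows how the divergence could replace a geometric
    distance in the assignment step of a k-means-style method.
    """
    return min(range(len(centers)), key=lambda j: kl_divergence(obj, centers[j]))


if __name__ == "__main__":
    # Illustrative distributions over a three-value discrete domain.
    centers = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
    obj = [0.6, 0.3, 0.1]
    print(nearest_center(obj, centers))
```

Note that the KL divergence is not symmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ; a clustering method built on it has to fix the direction in which objects and cluster representatives are compared.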