ISSN ONLINE(2320-9801) PRINT (2320-9798)


Review and Analysis of Multimedia Data Mining Tasks and Models

Manjunath R1, S. Balaji2
  1. Dept. of Computer Science & Engineering, City Engineering College, Bangalore-560061, India
  2. Centre for Emerging Technologies, Jain Global Campus, Jain University, Ramanagara Dist-562112, India

Visit for more related articles at International Journal of Innovative Research in Computer and Communication Engineering

Abstract

Over the past few decades, rapid changes in information technology have drastically changed the functions and activities of multimedia. Data mining has become more popular for extracting knowledge from multimedia data sets such as audio, video, speech, text, web, image and combinations of these data types. Such data sets are increasingly available and are semi-structured or unstructured. Manually extracting the hidden, useful knowledge embedded within multimedia collections is a great challenge without new techniques and powerful tools. This challenge drives the need to develop data mining tools and techniques that can be used for the above-mentioned types of data sets. This paper presents a review and analysis of the state of the art in multimedia data mining and knowledge discovery, advanced technologies, and data mining approaches that are useful for decision-making applications and for researchers.

Keywords

Data Mining, Multimedia Data, Classification, Association, Clustering

INTRODUCTION

Nowadays, multimedia data is ubiquitous and is needed in various applications; archiving it requires extremely large storage. Multimedia data mining is an interdisciplinary research field in which generic data mining theory and techniques are applied to multimedia data sets to facilitate multimedia-specific knowledge discovery tasks. Multimedia is a combination of more than one medium, such as text, image, video, audio, numeric, sound files, animation, graphical and categorical data [1]. Multimedia is classified into two categories: (i) static media such as text, graphics, and images and (ii) dynamic media such as animation, music, audio, speech, and video [12]. Fig. 1 illustrates the various aspects of multimedia data mining. Multimedia data mining refers to the analysis of large amounts of multimedia information in order to find patterns or statistical relationships. Multimedia data becomes complex as the sequence progresses, and the concept being mined may change as well [17]. Understanding and representing the changes in the mining process is necessary to mine multimedia data [2]. Data in multimedia databases are semi-structured or unstructured [6]. Structured data is handled by traditional data mining techniques, while advanced technologies are extended for semi-structured heterogeneous data.

LITERATURE REVIEW

Ja-Hwung Su et al. [13] proposed an innovative method to achieve high-quality content-based video retrieval by discovering the temporal patterns in the video contents. On the basis of the discovered temporal patterns, an efficient indexing technique and an effective sequence matching technique were integrated to reduce the computation cost and to raise the retrieval accuracy, respectively. Their experimental results indicated that the approach was promising in enhancing content-based video retrieval in terms of efficiency and effectiveness. Sanjeevkumar R, Su et al. [17] show that data mining techniques are useful while converting the multimedia files in libraries. A digital library retrieves, collects, stores and preserves digital data. For this purpose, there is a need to convert different formats of information such as text, images, video and audio.
Jaesik Choi et al. [3] proposed a video matching method called Spatio-Temporal Pyramid Matching (STPM). Considering features of objects in 2D space and time, STPM recursively divides a video clip into a 3D spatio-temporal pyramidal space and compares the features at different resolutions. In order to improve the retrieval performance, they consider both static and dynamic features of objects. They also provide a sufficient condition under which the matching can gain additional benefit from temporal information. Their experimental results showed that STPM performed better than the other video matching methods tested. Kale D.V., Su et al. [4] proposed a framework for surveillance videos of stationary places. They implemented an algorithm to group the incoming video stream into meaningful pieces called segments. Further, they extracted a feature of each segment (i.e. motion), which is used to characterize the segment. The motion of a segment is extracted using a two-dimensional matrix constructed from the accumulated pixel differences among all the frames in the segment. The video segments are then clustered using the K-means algorithm, and abnormal segments are finally identified.
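The motion feature described for [4] can be illustrated with a minimal sketch. It assumes grayscale frames given as nested lists of pixel intensities; the function names `motion_matrix` and `motion_score` are hypothetical, and the scalar summary (mean accumulated difference per pixel) is an assumption made for illustration, not the authors' published procedure.

```python
def motion_matrix(frames):
    """Accumulate absolute pixel differences between consecutive frames.

    `frames` is a list of equally sized 2D lists of grayscale intensities
    (an assumed input format).
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    acc = [[0] * cols for _ in range(rows)]
    for prev, curr in zip(frames, frames[1:]):
        for r in range(rows):
            for c in range(cols):
                acc[r][c] += abs(curr[r][c] - prev[r][c])
    return acc


def motion_score(frames):
    """Summarise a segment by its mean accumulated difference per pixel."""
    acc = motion_matrix(frames)
    total = sum(sum(row) for row in acc)
    return total / (len(acc) * len(acc[0]))


static_segment = [[[1, 1], [1, 1]]] * 3
moving_segment = [
    [[0, 0], [0, 0]],
    [[10, 0], [0, 0]],
    [[0, 0], [0, 10]],
]
print(motion_score(static_segment), motion_score(moving_segment))
```

A segment whose score lies far from those of the other segments would be a candidate for closer inspection once the scores are clustered.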
Chary et al. [7] evaluated image retrieval methods. Retrieval of images within a large image collection based on color projections is introduced, and different mathematical approaches are applied to the retrieval. Images are subgrouped using threshold values, and R, G, B color combinations are considered for retrieval. They report efficient results compared to existing methods. Vamsidhar Enireddy et al. [8] report that digital medical images are stored in large databases for easy accessibility and that a Content Based Image Retrieval (CBIR) method is used to retrieve diagnostic cases similar to the query medical image. The Haar wavelet is used for lossless image compression. Edge and texture features are extracted from the compressed medical images using the Sobel edge detector and Gabor transforms, respectively. The classification accuracy of retrieval is evaluated using Naïve Bayes and Support Vector Machine classifiers. On presentation of a query image, CBIR uses algorithms to extract relevant features from it, and retrieves images from the database based on features such as color, texture, edge and shape, which are extracted automatically by CBIR systems.
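As a rough illustration of color-based retrieval, the sketch below ranks database images by the similarity of their quantised R, G, B histograms to that of a query image. Histogram intersection is used here as an assumed similarity measure; it is not necessarily the measure used in [7] or [8], and the names `color_histogram`, `intersection` and `retrieve` are hypothetical.

```python
from collections import Counter


def color_histogram(pixels, bins=4):
    """Normalised histogram over quantised (R, G, B) triples."""
    step = 256 // bins
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    n = len(pixels)
    return {cell: count / n for cell, count in counts.items()}


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(v, h2.get(cell, 0.0)) for cell, v in h1.items())


def retrieve(query_pixels, database, top_k=1):
    """Return the names of the top_k images most similar to the query."""
    q = color_histogram(query_pixels)
    scored = [
        (intersection(q, color_histogram(pixels)), name)
        for name, pixels in database.items()
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


database = {
    "red_image": [(250, 0, 0)] * 4,
    "blue_image": [(0, 0, 250)] * 4,
}
query = [(255, 10, 5)] * 3 + [(0, 0, 255)]
print(retrieve(query, database))
```

Texture, edge and shape features would be added as further components of the feature vector in the same way.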

UNSTRUCTURED VERSUS STRUCTURED DATA

Various architectures are being examined to design and develop a multimedia data mining system. Data in multimedia databases are semi-structured or unstructured. Unstructured data is simply a bit stream. Examples include pixel-level representation for images, video, and audio, and character-level representation for text. The architecture to convert unstructured data to structured data for mining is illustrated in Fig. 2: data or metadata are extracted from the unstructured database, the extracted data are stored in a structured database, and data mining tools are applied to the structured database [26]. A difference between multimedia mining and structured data mining is the sequence or time element. Multimedia often captures an entity changing over time. Video and audio are clearly ordered, and even text has little meaning without sequence. Time series mining analyses the change in one or more values over time. Multimedia is more complex: as the sequence progresses, the concept being represented may change as well. This is obvious with video, where a camera may pan or objects in the scene may move. Understanding and representing changes in the mining process is necessary to mine multimedia data [21].
Multimedia is harder to fit into typical data mining models. Images and video of different entities may have some similarity (for example, each represents a view of a building), but without clear structure such as "these are pictures of the front of buildings" it is difficult to relate multimedia mining to traditional data mining. Multimedia generally gives a lot of data on each entity, but not the same data for each entity.
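The extract, store and mine architecture described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: raw media are represented as byte strings, the extractor `extract_metadata` and its fields (`size`, `mean_byte`) are hypothetical stand-ins for real feature extraction, and an in-memory SQLite table plays the role of the structured database.

```python
import sqlite3


def extract_metadata(name, raw_bytes):
    """Hypothetical extractor: derive simple structured fields from a bit stream."""
    size = len(raw_bytes)
    mean_byte = sum(raw_bytes) / size if size else 0.0
    return name, size, mean_byte


def build_structured_db(raw_items):
    """Store the extracted metadata in a structured (relational) table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE media (name TEXT, size INTEGER, mean_byte REAL)")
    for name, raw in raw_items.items():
        conn.execute("INSERT INTO media VALUES (?, ?, ?)", extract_metadata(name, raw))
    conn.commit()
    return conn


raw_items = {"clip_a": bytes([0, 0, 0, 0]), "clip_b": bytes([200, 100])}
conn = build_structured_db(raw_items)
# Conventional query or mining tools can now operate on the structured table.
bright = conn.execute("SELECT name FROM media WHERE mean_byte > 50").fetchall()
print(bright)
```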

MULTIMEDIA DATABASE MANAGEMENT

Recently, multimedia has been a major focus for many researchers around the world, and many technologies have been proposed for representing, storing, indexing, and retrieving multimedia data. Most of the studies are confined to the data filtering step of the knowledge discovery in databases (KDD) process. In [28], Czyzewski demonstrated how KDD methods can be used to analyze audio data and remove noise from old recordings. Thus, in multimedia documents, knowledge discovery deals with non-structured information. In general, the multimedia files from a database must first be pre-processed to improve their quality, followed by feature extraction. With the help of the generated features, information models can be devised using data mining techniques to discover significant patterns, as shown in Fig. 2. Multimedia data mining refers to pattern discovery, rule extraction and knowledge acquisition from multimedia databases, as discussed in [16]. Chien et al. [27] use knowledge-based AI techniques to assist image processing in a large image database generated from the Galileo mission. A multimedia data mining system prototype, MultiMediaMiner, includes the construction of a multimedia data cube, which facilitates multidimensional analysis of multimedia data, primarily based on visual content, and the mining of multiple kinds of knowledge, including summarization, comparison, classification, association, and clustering [15].

Text Mining

Text mining is a burgeoning new field that attempts to glean meaningful information from natural language text. It may be loosely characterized as the process of analyzing text to extract information that is useful for particular purposes. Compared with the kind of data stored in databases, text is unstructured, amorphous, and difficult to deal with algorithmically. Nevertheless, in modern culture, text is the most common vehicle for the formal exchange of information. The field of text mining usually deals with text whose function is the communication of factual information or opinions, and the motivation for trying to extract information from such text automatically is compelling even if success is only partial [20].

Image Mining

Image mining systems that can automatically extract semantically meaningful information (knowledge) from image data are increasingly in demand. The fundamental challenge in image mining is to determine how the low-level pixel representation contained in a raw image or image sequence can be processed to identify high-level spatial objects and relationships [24].

Video Mining

Video contains several kinds of multimedia data such as text, image, metadata, visual and audio. It is widely used in many major applications such as security and surveillance, entertainment, medicine, education programs and sports. The objective of video data mining is to discover and describe interesting patterns in the huge amount of video data; it is one of the core problem areas of the data mining research community. Compared to the mining of other types of data, video data mining is still in its infancy [10].

Audio Mining

Audio mining is a technique by which the content of an audio signal can be automatically analyzed and searched. It is most commonly used in the field of automatic speech recognition, where the analysis tries to identify any speech within the audio.

ISSUES IN MULTIMEDIA DATA MINING

Before multimedia data mining develops into a conventional, mature and trusted discipline, many pending issues have to be addressed. These issues pertain to the multimedia data mining approaches applied and their limitations. Major issues in multimedia data mining include content-based retrieval and similarity search, generalization and multidimensional analysis, classification and prediction analysis, and mining associations in multimedia data [25]. Multimedia data mining needs content-based retrieval and similarity search integrated with mining methods. Content-based retrieval in multimedia is a challenging problem, since multimedia data needs detailed interpretation from pixel values [19].

PROCESS OF MULTIMEDIA DATA MINING APPLICATIONS

The model of applying multimedia mining to different multimedia types is presented in Fig. 4 [11]. Data collection is the starting point of a learning system, as the quality of the raw data determines the overall achievable performance. The goal of data pre-processing is then to discover important features from the raw data. Data pre-processing includes data cleaning, normalization, transformation, feature selection, etc. Learning can be straightforward if informative features can be identified at the pre-processing stage.
The detailed procedure depends highly on the nature of the raw data and the problem domain. The product of data pre-processing is the training set. Given a training set, a learning model has to be chosen to learn from it, and the multimedia mining model is refined iteratively. Multimedia mining is more complex than conventional data mining because: a) the volume of data is huge, b) multimedia data is variable and heterogeneous (e.g. diversity of sensors, time or conditions of acquisition, etc.) and c) the meaning of multimedia content is subjective. Applications and systems of multimedia data mining based on the process discussed are surveyed in the following sub-section.
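Two of the pre-processing steps named above, data cleaning and normalization, can be sketched as follows. The sketch assumes numeric feature rows in which a missing value is marked as `None`; dropping incomplete rows and min-max scaling are illustrative choices, and the function names are hypothetical.

```python
def clean(rows):
    """Data cleaning: drop rows that contain a missing (None) value."""
    return [row for row in rows if all(v is not None for v in row)]


def min_max_normalize(rows):
    """Normalization: rescale every feature column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


raw = [[1, 10], [None, 5], [3, 10], [2, 10]]
training_set = min_max_normalize(clean(raw))
print(training_set)
```

A constant column carries no information for learning, so it is mapped to 0.0 here rather than causing a division by zero.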

MULTIMEDIA DATA MINING TASKS

The main tasks involved in multimedia data mining are [16]:
Multimedia Data Cube: The multimedia data cube is an interesting model for multidimensional analysis of multimedia data, but it is difficult to implement a data cube efficiently given the large number of dimensions. This curse of dimensionality is especially serious in the case of multimedia data cubes. We may like to model color, orientation, texture, keywords, and so on. Many of the dimensions in multimedia data cubes are set-oriented instead of single-valued; e.g., one image may correspond to a set of keywords. It may contain a set of objects, each associated with a set of colors. If we use each keyword or each detailed color as a dimension in the design of the data cube, we will create a huge number of dimensions. On the other hand, not doing so may lead to the modelling of an image at a rather rough, limited and imprecise scale.
Feature extraction: Multimedia features are extracted from media sequences or collections, converting them into numerical or symbolic form. Good features should capture perceptual saliency and distinguish content semantics, while being computationally and representationally economical.
Data Pre-processing: This stage integrates data from different sources and makes choices about representing or coding certain data fields. Its output serves as input to the pattern discovery stage. Other representation choices are needed because certain fields may contain data at levels of detail that are not suitable for the pattern discovery stage.
Discovering Patterns: The pattern discovery stage is the heart of the entire data mining process. The hidden patterns and trends in the data are actually uncovered in this stage. Approaches to pattern discovery include association, classification, clustering, regression, time-series analysis and visualization.
Interpretations: This stage of the data mining process evaluates the quality of the discovery and its value, in order to determine whether the previous stage should be revisited.
Reporting and using discovered knowledge: This final stage reports the discovered knowledge and puts it to use to generate new actions, products, services or marketing strategies, as the case may be.
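The set-oriented dimensions noted under the multimedia data cube task can be illustrated with a small sketch. It assumes each image is described by a set of keywords and a set of colors (hypothetical field names) and counts images per (keyword, color) cell; because one image falls into several cells, the cell counts sum to more than the number of images.

```python
from collections import Counter
from itertools import product


def cube_counts(images):
    """Count images per (keyword, color) cell of a two-dimensional cube."""
    cells = Counter()
    for image in images:
        for keyword, color in product(image["keywords"], image["colors"]):
            cells[(keyword, color)] += 1
    return cells


images = [
    {"keywords": {"sky", "sea"}, "colors": {"blue"}},
    {"keywords": {"sky"}, "colors": {"blue", "white"}},
]
cells = cube_counts(images)
print(len(images), "images occupy", len(cells), "cells")
```

With realistic vocabularies of thousands of keywords and colors, the number of possible cells grows with the product of the vocabulary sizes, which is the dimensionality problem described above.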

MODELS FOR MULTIMEDIA DATA MINING

The models used to mine multimedia data are central to the mining process. Four multimedia mining models are commonly used: classification, association rules, clustering and statistical modelling.
Classification Rule: Classification produces a function that maps a data item into one of several predefined classes, by taking a training data set as input and building a model of the class attribute based on the rest of the attributes. Decision tree classification has an intuitive nature that matches the user's conceptual model without loss of accuracy. An example of work in this area is the use of a Hidden Markov Model for classifying multimedia data [29].
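To make the idea of building a model of the class attribute from the remaining attributes concrete, the sketch below trains a one-level decision tree (a decision stump). This is a deliberately simplified stand-in for a full decision tree learner, not a method taken from the cited work; the function names and the brightness example are hypothetical.

```python
from collections import Counter


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(samples, labels):
    """Choose the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in range(len(samples[0])):
        for t in sorted({s[f] for s in samples}):
            left = [lab for s, lab in zip(samples, labels) if s[f] <= t]
            right = [lab for s, lab in zip(samples, labels) if s[f] > t]
            if not left or not right:
                continue  # a split must separate the data into two groups
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        raise ValueError("no valid split: all samples are identical")
    return best[1:]


def predict(stump, sample):
    """Map a data item to one of the predefined classes."""
    f, t, l_lab, r_lab = stump
    return l_lab if sample[f] <= t else r_lab


# Each sample: (mean brightness, an uninformative second feature).
samples = [(0.1, 5), (0.2, 1), (0.8, 4), (0.9, 2)]
labels = ["dark", "dark", "bright", "bright"]
stump = train_stump(samples, labels)
print(stump, predict(stump, (0.15, 3)), predict(stump, (0.7, 3)))
```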
Association Rule: An association rule is an expression of the form A -> B, where A is a set of items and B is a single item. Association rule methods are an initial data exploration approach that is often applied to extremely large data sets. A recent work in this area is due to Lei Wang et al. [23], who introduced a clustering method based on unsupervised neural nets and self-organizing maps [24].
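A rule A -> B is usually judged by its support and confidence, two standard measures that the text above does not spell out. The sketch below computes both for transactions represented as sets of items; the image-annotation items are made up for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    if s_a == 0:
        return 0.0
    return support(transactions, set(antecedent) | {consequent}) / s_a


# Hypothetical annotations: each transaction is the set of labels of one image.
transactions = [
    {"sky", "blue"},
    {"sky", "blue", "sea"},
    {"sky", "grey"},
    {"grass", "green"},
]
print(support(transactions, {"sky"}), confidence(transactions, {"sky"}, "blue"))
```

Here the rule {sky} -> blue holds in two of the three transactions that contain "sky".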
Clustering: Clustering is the task of assigning a set of objects into groups so that the objects in the same cluster are more similar to each other than to those in other clusters. Clustering is the main task of explorative data mining, and a common technique for data analysis used in many fields including information retrieval. Cluster analysis groups objects based on their similarity. The measure of similarity can be computed for various types of data [6]. Clustering algorithms can be categorized into partitioning methods, hierarchical methods, density-based methods, grid-based methods, model-based methods, the k-means algorithm, and graph-based models. Representing the data by fewer clusters necessarily loses certain fine details, but achieves simplification: the data is modelled by its clusters. Recent work in this area includes a clustering method based on unsupervised neural nets and self-organizing maps [23].
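The k-means algorithm mentioned above can be sketched in a few lines. This is a minimal version under simplifying assumptions: points are numeric tuples, similarity is squared Euclidean distance, the first k points serve as initial centroids (real implementations usually initialise randomly or with k-means++), and a fixed number of iterations replaces a convergence test.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))


def kmeans(points, k, iterations=20):
    """Alternate between assigning points and recomputing centroids."""
    centroids = [list(p) for p in points[:k]]  # assumed deterministic initialisation
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```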
Statistical Modelling: Statisticians were the first to use the term “data mining.” Originally, “data mining” or “data dredging” was a derogatory term referring to attempts to extract information that was not supported by the data. Now, statisticians view data mining as the construction of a statistical model, that is, an underlying distribution from which the visible data is drawn. Suppose our data is a set of numbers. This data is much simpler than data that would be mined in practice, but it will serve as an example. A statistician might decide that the data comes from a Gaussian distribution and use a formula to compute the most likely parameters of this Gaussian distribution. The mean and standard deviation of this Gaussian distribution completely characterize the distribution and would become the model of the data. Statistical mining models are used to determine the statistical validity of test parameters and can be utilized to test hypotheses, undertake correlation studies, and transform and prepare data for further analysis. Pattern matching is used to find hidden characteristics within data, and the methods used to find patterns in the data include association rules [9].
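The Gaussian example can be made concrete. For a Gaussian, the most likely (maximum-likelihood) parameters are the sample mean and the population standard deviation, i.e. the one that divides by n rather than n - 1. The sketch below computes them with the standard library and checks that nearby parameter values give a lower log-likelihood; the data values are made up for illustration.

```python
import math
from statistics import fmean, pstdev


def fit_gaussian(data):
    """Maximum-likelihood mean and standard deviation of a Gaussian model."""
    return fmean(data), pstdev(data)


def log_likelihood(data, mu, sigma):
    """Log-likelihood of the data under a Gaussian with parameters mu, sigma."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - squared_error / (2 * sigma ** 2)


data = [2, 4, 4, 4, 5, 5, 7, 9]
mu, sigma = fit_gaussian(data)
print(mu, sigma, log_likelihood(data, mu, sigma))
```

The pair (mu, sigma) is then the entire model of the data in the sense described above.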

CONCLUSIONS

In this paper, we addressed data mining for multimedia data such as text, image, video and audio. In particular, we have reviewed and analysed the multimedia data mining process with different tasks. This paper also described the well-known models for multimedia mining.
 

Figures at a glance

Figure 1 Figure 2 Figure 3 Figure 4
 

References