A Decision Support System for Predicting
Student Performance

Lalit Dole; Jayant Rajurkar

Assistant Professor, Dept. of CSE, G.H.Raisoni College of Engineering, Nagpur (M.S), India
Dept. of CSE, G.H.Raisoni College of Engineering, Nagpur (M.S), India

Visit for more related articles at International Journal of Innovative Research in Computer and Communication Engineering

Abstract

In recent years data mining has been successfully applied in the business world, where it is used to identify and extract new and potentially valuable knowledge from data. Evaluating students' academic success is becoming increasingly challenging, and predicting educational outcomes is a practical alternative in a heterogeneous environment. Performance prediction models can be built by applying data mining techniques to enrolment data. In this paper we present a Naive Bayes algorithm (NB) approach to predicting graduating cumulative Grade Point Average, based on applicant data collected from surveys conducted among first year students during the summer semester of academic year 2010-2011 at the Faculty of Economics, University of Tuzla, and on data taken during enrolment. The Naive Bayes algorithm is used to discover the most suitable way to predict students' success.

Keywords

Data Mining, Classification, Prediction, Naive Bayes algorithm (NB), Student Evaluation

INTRODUCTION

Many leading higher education and technical education institutions aim to contribute to improving the quality of higher education, and their success in creating human capital is the subject of continuous analysis [1]. Therefore, predicting students' success is essential for these institutions, because the quality of the teaching process lies in its ability to meet students' needs. To this end, important data and information are gathered on a regular basis and reviewed by the appropriate authorities, and standards are set in order to maintain quality. All participants in the educational process could benefit from applying data mining to the data from the higher education system, as described in Figure 1. Data mining is the computational processing of data from different perspectives, with the goal of extracting implicit and interesting patterns, trends and information from the data. It can greatly help every participant in the educational process to improve their understanding of the teaching process, and it centres on discovering, detecting and explaining educational phenomena [1].

Most researchers suggest using academic performance, measured by student outcome, as a good basis for assessing applicants' qualifications [3, 4]. A performance prediction model can be built by applying data mining to available admission and graduation grade point average data. Fortunately, AIT has a large database of information on past and current applicants [2]. Decision support systems have been built to help advisors guide students in choosing suitable courses and appropriate study plans [5, 6]. Previous work on student performance prediction used logistic regression to examine the impact of various factors on student performance [5]. Bekele and Menzel [7] used Bayesian networks to predict the mathematics performance of high school students. Their model placed students into three categories: below satisfactory, satisfactory, and above satisfactory. The work reported in the present paper differs from theirs in the highly international nature of the applicant pool and in its more fine-grained prediction [2].

In this paper we present an approach using Bayesian networks to predict graduating cumulative Grade Point Average, based on applicant data collected from surveys conducted among first year students during the summer semester of academic year 2010-2011 at the Faculty of Economics, University of Tuzla, and on data taken during enrolment. A Bayesian prediction model can provide valuable information to departmental faculty members when making decisions. They may be more comfortable with the predictive results if the system can show them the past student most similar to the applicant being considered. In this paper different data mining techniques suitable for classification have been compared: the Bayesian classifier, neural networks and decision trees. Neural networks have shown success in many areas in solving problems of prediction, function approximation, classification and pattern recognition. Their accuracy was compared with that of decision trees and of the Bayesian classifier. The results indicate that the Naïve Bayes classifier outperforms the decision tree and neural network methods in prediction. They also indicate that a good classifier model has to be both accurate and comprehensible for professors.

DATA DESCRIPTION

The data for the model were collected through a questionnaire survey conducted among first year students during the summer semester of academic year 2010-2011 at the Faculty of Economics in Tuzla. After eliminating incomplete data, the sample comprised 257 students who were present at the practice classes at the time of the research. A model of students' success was created in which success, the output variable, is measured by success in the course "Business Informatics" [1].

Twelve variables are used as input to the model; their names and coding are shown in Table 1. The distribution of the students' final grades in the course "Business Informatics" is shown in Figure 2. It is evident that the prediction error rate will be much higher in the first case, due to the different distribution of grades across classes; hence the second case is preferred in this study.

DATA MINING APPROACH

Data mining is a computational method of processing data which is successfully applied in many areas that aim to obtain useful knowledge from data [9]. The goal of the analysis is to categorize data by class, which yields new information about the classes to which the data belong. To this end, algorithms are divided into two basic groups:

• Unsupervised algorithms and

• Supervised algorithms.

Mining is "unsupervised" or "undirected" when the output conditions are not explicitly represented in the data set. The task of an unsupervised algorithm is to discover inherent patterns in the data automatically, without prior information about which class the data could belong to, and it does not involve any supervision [11].

Supervised algorithms are those which build models from data whose class is known in advance and then, on the basis of the constructed model, predict the class to which unknown data will belong. Methods of data classification represent a process of learning a function that maps the data into one of several predefined classes. Every classification algorithm based on inductive learning is given an input data set that consists of vectors of attribute values and their corresponding classes. The goal of a classification technique is to build a model which makes it possible to classify future data points automatically, based on a set of specific characteristics [1]. Such systems take a collection of cases as input, each belonging to one of a small number of classes and described by its values for a fixed set of attributes. As output they produce a classifier that can accurately predict the class to which a new case belongs. The most common classification methods are decision trees, induction rules or classification rules, probabilistic or Bayesian networks, neural networks and hybrid procedures.
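The input and output of a supervised learner described above can be illustrated with a very small sketch. It uses a one-rule baseline (predict the class from a single attribute), chosen here only because it is short; it is not the classifier evaluated in this paper. The function names, the attribute values and the four training cases are hypothetical.

```python
from collections import Counter, defaultdict


def train_one_rule(rows, labels):
    """Learn a rule 'attribute value -> majority class' for the single
    attribute that makes the fewest errors on the training cases."""
    best = None
    for i in range(len(rows[0])):
        by_value = defaultdict(Counter)
        for row, label in zip(rows, labels):
            by_value[row[i]][label] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(rule[row[i]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, i, rule)
    # Fallback class for attribute values never seen during training.
    default = Counter(labels).most_common(1)[0][0]
    _, attribute, rule = best
    return attribute, rule, default


def classify(model, row):
    """Predict the class of a new case with the learned rule."""
    attribute, rule, default = model
    return rule.get(row[attribute], default)


# Hypothetical cases: (GPA band, sex) -> course outcome.
rows = [("high", "m"), ("high", "f"), ("low", "m"), ("low", "f")]
labels = ["pass", "pass", "fail", "fail"]

model = train_one_rule(rows, labels)
print(classify(model, ("low", "f")))  # prints "fail"
```

The training step takes cases with known classes and returns a model; the classification step applies that model to a case whose class is unknown, which is the pattern every method listed above follows.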

NAIVE BAYES ALGORITHM

A Bayesian network [8] is a graphical representation of a probability distribution. It is a directed acyclic graph in which nodes represent random variables and links represent probabilistic influences between the variables. Probabilistic dependence and independence are expressed by the presence or lack of paths between nodes in the graph[2]. The fact that probabilistic dependence is encoded in the network topology in this way permits probability distributions over large numbers of random variables to be compactly represented and permits calculations to be performed efficiently. Due to the inherent uncertainty of the performance prediction problem, we chose to use Bayesian networks for the modeling task. Using a probabilistic model has the advantage that it can later become a component of a higher level optimization model.
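The compact representation mentioned above follows from the standard factorization of a Bayesian network. As a brief illustration (the symbols $X_i$ and $\mathrm{Pa}(X_i)$, the parents of node $X_i$ in the graph, are notation introduced here and not taken from the paper), the joint distribution over $n$ variables can be written as:

```latex
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```

Each factor involves only a node and its parents, so the number of parameters depends on the size of these local families rather than on the full joint table.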

The Naive Bayes algorithm (NB) is a simple classification method based on probability theory, i.e. Bayes' theorem [10]. It is called naïve because it simplifies problems by relying on two important assumptions: it assumes that the predictive attributes are conditionally independent given the class, and it supposes that there are no hidden attributes that could affect the prediction process. This classifier represents a promising approach to the probabilistic discovery of knowledge, and it provides a very efficient algorithm for data classification.
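Under the independence assumption, the score of a class is its prior probability multiplied by the conditional probability of each attribute value given that class, and the class with the highest score is predicted. The following is a minimal sketch of this idea for categorical attributes, not the WEKA implementation used in the experiments. The function names, the add-one (Laplace) smoothing and the six training cases are assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute_index, class_label)][value] -> count
    value_counts = defaultdict(Counter)
    # Distinct values seen per attribute (needed for add-one smoothing).
    domains = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            domains[i].add(value)
    return class_counts, value_counts, domains


def predict_naive_bayes(model, row):
    """Return the class with the highest log-posterior score for one case."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_class in class_counts.items():
        # log P(class) + sum over attributes of log P(value | class)
        score = math.log(n_class / total)
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            # Add-one smoothing (an assumption) avoids zero probabilities.
            score += math.log((count + 1) / (n_class + len(domains[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical cases: (GPA band, weekly study hours band) -> outcome.
rows = [("high", "many"), ("high", "few"), ("low", "few"),
        ("low", "few"), ("high", "many"), ("low", "many")]
labels = ["pass", "pass", "fail", "fail", "pass", "fail"]

model = train_naive_bayes(rows, labels)
print(predict_naive_bayes(model, ("high", "few")))  # prints "pass"
print(predict_naive_bayes(model, ("low", "few")))   # prints "fail"
```

Working in log space keeps the product of many small probabilities numerically stable; training requires only one pass over the data, which is consistent with the efficiency noted above.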

EXPERIMENTAL RESULTS

We performed the experiments with the WEKA software package, which was developed at the University of Waikato in New Zealand. The package is implemented in the Java programming language and today stands out as probably the most competent and comprehensive package of machine learning algorithms in the academic and nonprofit world (Machine Learning Group at University of Waikato, 2011).

To gain better insight into the importance of the input variables, it is customary to analyze their impact on the prediction of students' success. The impact of each input variable of the model on the output variable has therefore been analyzed using four tests for the assessment of input variables: the Chi-square test, the OneR test, the Info Gain test and the Gain Ratio test. The results of each test were monitored using the following metrics: Attribute (name of the attribute), Merit (measure of goodness), Merit dev (deviation of the measure of goodness), Rank (average position occupied by the attribute) and Rank dev (deviation of the attribute's position). The results obtained are shown in Table 4.
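Two of the four tests, Info Gain and Gain Ratio, are based on entropy and can be sketched in a few lines. The code below follows the textbook definitions rather than WEKA's attribute evaluators; the function names are hypothetical and the four cases are invented for illustration, reusing the attribute codes PO and S only as labels.

```python
from collections import Counter
import math


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def info_gain(attribute, labels):
    """Reduction in class entropy after splitting the cases on the attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(attribute, labels):
    """Information gain normalised by the entropy of the attribute itself."""
    split_info = entropy(attribute)
    return info_gain(attribute, labels) / split_info if split_info > 0 else 0.0


# Invented data: one informative attribute and one uninformative attribute.
labels = ["pass", "pass", "fail", "fail"]
attributes = {
    "PO": ["high", "high", "low", "low"],
    "S": ["m", "f", "m", "f"],
}

# Rank attributes by information gain, highest first.
ranking = sorted(attributes, key=lambda a: info_gain(attributes[a], labels),
                 reverse=True)
print(ranking)  # prints ['PO', 'S']
```

In this toy data PO separates the two classes completely (gain of 1 bit), while S leaves the class distribution unchanged (gain of 0), which is the kind of contrast an attribute ranking is meant to expose.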

In this aggregate table the "Merit" columns are not applicable, because the algorithms use mutually incompatible metrics. The aim of this analysis is to determine the importance of each attribute individually. Table 4 shows that the attribute PO (GPA) has the greatest impact on the output, and that it performed best in all four tests. It is followed by the attributes URK (entrance exam), MAT (study material) and VRI (average weekly hours devoted to studying). The attributes with the smallest impact on the output were BCD (number of household members), UAS (distance of residence from the faculty) and S (sex).

We carried out experiments to evaluate the performance and usefulness of the NB classification algorithm for predicting students' success. The results of the experiments are summarized in Tables 5, 6, 7 and 8. The performance of the NB models is evaluated on three criteria: prediction accuracy, learning time and error rate, which are illustrated in Figures 4, 5 and 6.

The results show that Naïve Bayes gives the better prediction: among the classifiers used in the experiment, the accuracy rate of the NB algorithm is the highest. The Naïve Bayes and decision tree classifiers also take less time to build a model for the given dataset.

The performance of the learning techniques is highly dependent on the nature of the training data. Confusion matrices are very useful for evaluating classifiers: the columns represent the predictions and the rows represent the actual class, so good results correspond to large numbers down the main diagonal and small, ideally zero, off-diagonal elements. To evaluate the robustness of a classifier, the usual methodology is to perform cross validation, which in general has proved to be statistically good enough for evaluating classifier performance.
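The evaluation procedure described above can be sketched as follows. This is an illustrative outline, not the WEKA evaluation used in the experiments: the function names, the interleaved assignment of cases to folds and the majority-class stand-in classifier are assumptions, and the six cases are invented.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of all cases that lie on the main diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def cross_validate(rows, labels, k, train, predict):
    """Hold out each of k folds once; collect actual and predicted classes."""
    actual, predicted = [], []
    for fold in range(k):
        # Assumption: case i belongs to fold i mod k.
        test_idx = set(range(fold, len(rows), k))
        train_rows = [r for i, r in enumerate(rows) if i not in test_idx]
        train_labels = [l for i, l in enumerate(labels) if i not in test_idx]
        model = train(train_rows, train_labels)
        for i in sorted(test_idx):
            actual.append(labels[i])
            predicted.append(predict(model, rows[i]))
    return actual, predicted


# Stand-in classifier: always predicts the most frequent training class.
def train_majority(rows, labels):
    return Counter(labels).most_common(1)[0][0]


def predict_majority(model, row):
    return model


# Invented data: six cases with a single dummy attribute.
rows = [(i,) for i in range(6)]
labels = ["pass", "pass", "pass", "pass", "pass", "fail"]

actual, predicted = cross_validate(rows, labels, 3,
                                   train_majority, predict_majority)
matrix = confusion_matrix(actual, predicted, ["pass", "fail"])
print(matrix)  # prints [[5, 0], [1, 0]]
```

The single failing student appears off the diagonal in the second row, which shows why a confusion matrix is more informative than accuracy alone: the stand-in classifier reaches 5/6 accuracy without ever identifying a failing case.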

In educational problems it is also very important for the classification model obtained to be user friendly, so that teachers can make decisions to improve student learning. Nonetheless, some models are more interpretable than others [13]. Decision trees are considered easily understood models because a reasoning process can be given for each conclusion. Knowledge models under this paradigm can be directly transformed into a set of IF-THEN rules, which are one of the most popular forms of knowledge representation due to their simplicity and comprehensibility, and which a professor can easily understand and interpret (Figure 2) [1].
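The transformation of a decision tree into IF-THEN rules amounts to listing every path from the root to a leaf. The sketch below assumes a tree stored as nested dictionaries; this representation, the function name and the example tree are hypothetical and do not reproduce the model in Figure 2.

```python
def tree_to_rules(node, conditions=()):
    """Walk a decision tree and emit one IF-THEN rule per leaf."""
    if not isinstance(node, dict):
        # Leaf: the conditions collected on the path imply this class.
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN class = {node}"]
    rules = []
    for value, child in node["branches"].items():
        test = f"{node['attribute']} = {value}"
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules


# Hypothetical tree: split on GPA band (PO), then on study hours (VRI).
tree = {
    "attribute": "PO",
    "branches": {
        "high": "pass",
        "low": {
            "attribute": "VRI",
            "branches": {"many": "pass", "few": "fail"},
        },
    },
}

for rule in tree_to_rules(tree):
    print(rule)
# IF PO = high THEN class = pass
# IF PO = low AND VRI = many THEN class = pass
# IF PO = low AND VRI = few THEN class = fail
```

Each rule can be read and checked on its own, which is the property that makes this form of knowledge representation accessible to non-expert users.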

The model (Figure 2) is easy to understand. It can give faculty interesting information about a student and provides guidance to the teacher in choosing a suitable track, by analyzing the experiences of students with similar academic achievements.

CONCLUSION AND FUTURE WORK

In this paper, we have presented a supervised data mining algorithm, the Naive Bayes (NB) algorithm, applied to the preoperative assessment data to predict success in a course (either passed or failed). The performance of the learning methods was evaluated based on predictive accuracy, ease of learning and user-friendly characteristics.

The results indicate that the Naïve Bayes classifier outperforms the decision tree in prediction, and that a good classifier model has to be both accurate and comprehensible for professors. This study was based on traditional classroom environments, since the data mining techniques were applied after the data was collected. It can be concluded that this methodology can be used to help students and teachers improve students' performance and reduce the failure rate by taking appropriate steps at the right time to improve the quality of learning. It remains important to address how to make prediction models user friendly for professors and other non-expert users, and how to integrate the university's data collection system with the data mining tool.

References

[1] Edin Osmanbegović, Mirza Suljić, "Data Mining Approach for Predicting Student Performance", Economic Review: Journal of Economics and Business, Vol. X, Issue 1, May 2012.

[2] Nguyen Thi Ngoc Hien and Peter Haddawy, "A Decision Support System for Evaluating International Student Applications", 37th ASEE/IEEE Frontiers in Education Conference, 2007.

[3] Häkkinen I., "Do University entrance exams predict academic achievement?", Working Paper Series, Department of Economics, Uppsala University, 2004.

[4] Golding P., Donaldson O., "Predicting academic performance", Proc. 36th ASEE/IEEE Frontiers in Education Conference, 2006, pp. 21-26.

[5] Chowdhury A. A., "Predicting success of a beginning computer course using logistic regression", ACM Conference on Computer Science, 1987, p. 449.

[6] Dekhtyar A., Goldsmith J., "The Bayesian advisor project".

[7] Bekele R., Menzel W., "A Bayesian approach to predict performance of a student (BAPPS): A Case with Ethiopian Students", Proc. IASTED International Conference on Artificial Intelligence and Applications, 2005.

[8] Jensen F., "Bayesian Networks and Decision Graphs", Springer-Verlag, 2002.

[9] Klosgen W. & Zytkow, "Handbook of data mining and knowledge discovery", Oxford University Press, New York, 2002.

[10] Witten I. H. & Frank E., "Data Mining: Practical Machine Learning Tools and Techniques, Second edition", Morgan Kaufmann, San Francisco, 2000.

[11] Cios K. J., Pedrycz W., Swiniarski R. W. & Kurgan L. A., "Data Mining: A Knowledge Discovery Approach", Springer, New York, 2007.

[12] Kumar S. A. & Vijayalakshmi M. N., "Efficiency of Decision Trees in Predicting Student's Academic Performance", First International Conference on Computer Science, Engineering and Applications, CS and IT 02, Dubai, pp. 335-343, 2011.

[13] Romero C. & Ventura S., "Educational Data Mining: a Survey from 1995 to 2005", Expert Systems with Applications, Elsevier, pp. 135-146, 2007.