Data pre-processing: a case study in predicting student’s retention in MOOC
Abstract
Data pre-processing is a crucial phase prior to any analytic task, yet it is rarely discussed,
especially for e-learning data, which is multilevel in nature. Reliable data
pre-processing is important for producing a quality dataset. Therefore, this study investigates the
problems that arise in data pre-processing, in this case when identifying the significant factors
needed to implement a prediction task. A MOOC dataset is selected for the data pre-processing task. The
process of generating the summary of the dataset is explained, and the ultimate aim is to produce a
dataset with features that are ready for data mining tasks. The study also proposes a process
model and suggestions that can be applied to support more comprehensible tools for
end users in the educational domain. As a result, data pre-processing becomes more
efficient for predicting student retention in MOOCs.
Keywords: data pre-processing; e-learning; massive open online course; student’s retention;
prediction.