Comparative analysis of text of speech using text data mining technique
Abstract
This paper investigates word frequency and term association (correlation) in two inauguration speeches, and uses stopwords to help infer whether the two speeches were given by the same person on separate occasions. A text mining (TM) approach is applied to extract information from the speeches. The objectives of the study are to determine term frequency and term association, and to infer whether the most frequent words relate to economic growth and national development or are auxiliary words with zero information value. The analysis based on the term-by-document matrix indicates that word frequencies in the two inauguration speeches are consistent, and the pictorial analysis supports this, although the most frequent words appeared 175 and 255 times in the two speeches, respectively. Using a correlation lower bound of 50%, the analysis revealed that the words "corruption, health, education, agriculture, security, defence, transportation, electricity, power, water, terrorism, oil, police, food" are more highly correlated and associated in the first inauguration speech than in the second. Term comparison indicated that the two speeches were given by the same person, owing to the similarity of the terms and their frequencies.
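The quantities named in the abstract (term frequency, the term-by-document matrix, and term association above a 50% correlation lower bound) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the four sample sentences, the tiny stopword list, the helper names (`term_document_matrix`, `find_associations`), and the choice to treat each sentence as one document are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "and", "of", "to", "a", "in", "we", "our", "will", "is"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_document_matrix(docs, stopwords=STOPWORDS):
    """Return (terms, matrix), where matrix[i][j] counts term i in document j."""
    counts = [Counter(t for t in tokenize(d) if t not in stopwords) for d in docs]
    terms = sorted(set().union(*counts))
    return terms, [[c[t] for c in counts] for t in terms]


def pearson(x, y):
    """Pearson correlation of two equal-length count vectors (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def find_associations(terms, matrix, term, lower_bound=0.5):
    """Terms whose correlation with `term` across documents is >= lower_bound."""
    row = matrix[terms.index(term)]
    result = {}
    for other, other_row in zip(terms, matrix):
        if other == term:
            continue
        r = pearson(row, other_row)
        if r >= lower_bound:
            result[other] = round(r, 2)
    return result


# Hypothetical sample text; each sentence is treated as one document.
docs = [
    "We will fight corruption and improve security.",
    "Corruption weakens security and the police.",
    "Agriculture and food production will grow.",
    "Education and health are priorities.",
]

terms, matrix = term_document_matrix(docs)
# Overall term frequency is the row sum of the term-by-document matrix.
freq = {t: sum(row) for t, row in zip(terms, matrix)}
assoc = find_associations(terms, matrix, "corruption", lower_bound=0.5)

print(sorted(freq.items(), key=lambda kv: -kv[1])[:3])
print(assoc)
```

Applied to real speeches, the same steps would be run once per speech, and the resulting frequency rankings and association lists compared between the two. How the speeches are split into documents (sentences, paragraphs, or lines) affects the correlations, so that choice should be fixed before comparing results.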