Assessing Selected CNN Models for Efficient Feature Extraction in SSD for Text Detection in Advertisement Images
Abstract
Digital advertising promotes goods and services through digital media and technology. Digital advertisement images carry important information about the products and services being advertised and seek to persuade potential customers to take specific actions toward contacting the advertiser. Manually extracting information from these images is tedious and error-prone. The literature on text detection in images, billboards, and signposts using the Single Shot Detector (SSD) is vast, but the performance of SSD for text detection in advertisement images has not been explored. There is therefore a need to evaluate SSD-based models on advertisement images. This study evaluated three selected Convolutional Neural Network (CNN) models (ResNet-50, MobileNetV2, and ResNet-101) as feature extractors for SSD in text detection on advertisement images. A total of 400 digital advertisement images were manually collected and annotated for the study. The comparison of the selected CNN models within the SSD architecture showed that ResNet-50 performed well in detecting small text, with a mean Average Precision (mAP) of 0.736, AP(small) of 0.692, and AR(small) of 0.781.
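The AP(small) and AR(small) figures reported above depend on how detections are matched to annotations and on what counts as a "small" text region. The abstract does not state the evaluation protocol, so the following is only a minimal sketch that assumes the widely used COCO convention (object area below 32 x 32 pixels is "small", below 96 x 96 pixels is "medium") and boxes given as (x_min, y_min, x_max, y_max). The function names are hypothetical and are not taken from the study.

```python
def box_area(box):
    """Area of an axis-aligned box (x_min, y_min, x_max, y_max); 0 if degenerate."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def iou(box_a, box_b):
    """Intersection over Union of two boxes, the overlap score used to
    decide whether a predicted text box matches an annotated one."""
    # Corners of the intersection rectangle (may be empty).
    inter = (
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    )
    inter_area = box_area(inter)
    union_area = box_area(box_a) + box_area(box_b) - inter_area
    return inter_area / union_area if union_area > 0 else 0.0


def size_bucket(box, small_max=32 ** 2, medium_max=96 ** 2):
    """Assign a box to a size bucket using COCO-style area thresholds
    (an assumption; the study's thresholds are not given in the abstract)."""
    area = box_area(box)
    if area < small_max:
        return "small"
    if area < medium_max:
        return "medium"
    return "large"
```

Under these assumptions, a 20 x 20 pixel text box falls in the "small" bucket, and a prediction shifted by half its width against a 10 x 10 annotation has an IoU of one third, which would not count as a match at the common 0.5 threshold.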