Statistical Analysis of Sexually Transmitted Infections among Patients Treated in a Nigerian Hospital
Abstract
Sexually transmitted infections (STIs) are prevalent among young people and affect individuals' lives through infertility and morbidity. Several factors have been linked to STIs, but proper diagnosis is often hindered by misleading information given by infected individuals. Secondary data on 400 patients treated for STIs between January 2020 and August 2021 were extracted from the records of The Federal Polytechnic Ilaro Medical Centre. Patient STI status was the response variable, while age, gender, vaginal diseases, vaginal itching, foul smell, dysuria, penile itching, and vaginitis were the explanatory variables. A Logistic Regression Model (LRM) was used to estimate the odds associated with each factor's contribution to the prevalence of STIs among the treated patients, and the Random Forest Algorithm (RFA) was considered as an alternative method. The area under the curve (AUC), accuracy, recall, precision, and F-score were used as evaluation metrics for the two techniques. The Chi-square test was also used to test whether each symptom was independent of patient status. The results showed that female patients had a higher chance of being infected than male patients, with a 90.3% chance of infection. The logistic regression model revealed that the odds of being STI positive were higher for patients with penile itching than for any other factor considered in this research. The logistic regression and random forest models achieved, respectively, an area under the curve of 0.947 and 0.907, accuracy of 0.950 and 0.933, recall of 0.939 and 0.849, precision of 0.886 and 0.849, and F-score of 0.912 and 0.875.
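The five evaluation metrics named above can be illustrated with a minimal sketch in plain Python. It is not the study's code: the function names `classification_metrics` and `auc`, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made for illustration, not the hospital data.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, recall, precision and F-score for binary labels (1 = STI positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F-score here is the harmonic mean of precision and recall (F1).
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f_score": f_score}


def auc(y_true, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case gets the
    higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): true status and a model's predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print(metrics)
```

Computing the same five numbers for each fitted model on the same held-out patients is one way to obtain a side-by-side comparison of the kind reported here.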
Sex, vaginal itching, and foul smell were found to be significantly associated with patient status, while age, dysuria, penile itching, and vaginitis were not significant at either the 5% or the 10% level. On all five evaluation metrics, the logistic regression model outperformed the random forest model; it is therefore judged the better of the two models and can be relied upon for the analysis.
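The association tests reported above can be sketched as a Pearson chi-square test of independence on a 2x2 table of symptom against STI status. This is an illustrative sketch only: the function name `chi_square_2x2` and the counts in `table` are hypothetical, and it assumes no continuity correction, which the abstract does not specify.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    table = [[a, b], [c, d]], with rows = symptom present / absent and
    columns = STI positive / negative. All row and column totals must be
    non-zero. No continuity correction is applied (an assumption).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of symptom and status.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, NOT taken from the study.
table = [[30, 10],   # symptom present: 30 positive, 10 negative
         [20, 40]]   # symptom absent:  20 positive, 40 negative
statistic, p_value = chi_square_2x2(table)
print(f"chi-square = {statistic:.3f}, p-value = {p_value:.5f}")
print("significant at 5%" if p_value < 0.05 else "not significant at 5%")
```

A p-value below the chosen level (5% or 10%) leads to rejecting independence between the symptom and patient status; a larger p-value means no association was detected at that level.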