Reproducibility of Gleason Grading of Prostatic Adenocarcinoma
Abstract
Background: The Gleason grading system for carcinomas of the prostate is important in determining treatment and outcome for patients. However, its use among pathologists needs to be audited to ensure reproducibility and thereby avoid the undesirable consequences of inappropriate treatment. Materials and Methods: Ten slides made from needle biopsies of varying primary patterns and scores were administered to 11 general pathologists. Their ratings were compared against consensus expert ratings of the lesions, and the degrees of inter- and intra-rater agreement were measured using kappa statistics. Results: Inter-rater agreement for primary pattern recognition showed kappa values ranging from 0.07 to 0.47, with 45.5% of raters showing fair agreement with the consensus rating. The overall kappa for primary pattern was 0.25 (fair agreement). Overall, patterns were underrated in 49.1% of ratings and overrated in 3.6%, with Gleason pattern 4 being the most frequently underrated. Kappa coefficients for intra-rater consistency ranged from 0.29 to 0.78 (fair to substantial), with consistency highest for Gleason pattern 3. Inter-rater agreement for Gleason scores showed kappa values ranging from -0.12 to 0.54 (poor to moderate), with the majority of raters (54.5%) falling in the slight agreement range. The overall kappa was 0.35 (fair reproducibility). Gleason score 7 was undergraded in 63.6% of ratings, score group 8 to 10 in 45.5%, and score group 5 to 6 in 38.6%. Conclusion: The study shows fair inter- and intra-rater consistency in Gleason pattern recognition and scoring, with undergrading being the major factor identified. This highlights the need for regular review of how grading systems are applied to ensure consistency among raters.
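The agreement measures reported above can be illustrated with a minimal sketch. It assumes unweighted Cohen's kappa between a single rater and the consensus rating, and the commonly used Landis and Koch bands for the verbal labels (poor, slight, fair, moderate, substantial). The abstract does not state which kappa variant or software was used, so both are assumptions, and the ratings in the example are hypothetical, not study data.

```python
from collections import Counter


def cohen_kappa(rater, consensus):
    """Unweighted Cohen's kappa between one rater and the consensus rating."""
    if len(rater) != len(consensus) or len(rater) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater)
    # Observed agreement: share of slides where the two ratings match.
    observed = sum(r == c for r, c in zip(rater, consensus)) / n
    # Chance agreement, from each side's marginal category frequencies.
    rater_counts = Counter(rater)
    consensus_counts = Counter(consensus)
    expected = sum(
        rater_counts[k] * consensus_counts[k] for k in rater_counts
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both sides used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def agreement_label(kappa):
    """Verbal label for kappa (Landis and Koch bands; an assumption here)."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


def underrating_share(rater, consensus):
    """Share of ratings that fall below the consensus pattern or score."""
    return sum(r < c for r, c in zip(rater, consensus)) / len(rater)


# Hypothetical primary patterns for ten slides (not data from the study).
consensus = [3, 3, 3, 4, 4, 4, 4, 5, 5, 3]
one_rater = [3, 3, 3, 3, 3, 4, 4, 4, 5, 3]

kappa = cohen_kappa(one_rater, consensus)
print(f"kappa = {kappa:.2f} ({agreement_label(kappa)})")
print(f"underrated = {underrating_share(one_rater, consensus):.1%}")
```

For this hypothetical rater, observed agreement is 0.70 and chance agreement is 0.38, giving a kappa of about 0.52 (moderate), with 30% of ratings below consensus. Under the same bands, the overall values reported in the abstract (0.25 and 0.35) both fall in the fair range.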