An Innovative Automatic Indexing Method For Arabic Text
Abstract
The study of automatic indexing and text retrieval methods for natural-language text has a long history. Automatic indexing involves extracting words from a document in order to categorize it by subject matter and to improve information retrieval. Despite extensive research on other languages, automated Arabic text categorization has received limited investigation. In this research, the researchers introduce an innovative method that enhances the accuracy of automatic indexing of Arabic texts by incorporating a thesaurus. Their approach extracts new relevant words by referencing a thesaurus that contains words, their synonyms, and the correlations between them; the thesaurus is constructed using a natural language toolkit and a WordNet library. Synonyms that frequently appear together are grouped in a JavaScript Object Notation (JSON) dictionary. The research results demonstrate a significant improvement in accuracy and efficiency compared to prior studies.
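To illustrate how a thesaurus lookup of this kind can work, the following is a minimal sketch and not the researchers' implementation. It assumes a small hand-written JSON dictionary in which each head word maps to a list of synonyms; the entries, the whitespace tokenizer, and the names `THESAURUS_JSON`, `build_lookup`, and `index_terms` are all hypothetical. A real system would need Arabic-specific normalization and stemming, and a thesaurus derived from WordNet rather than typed by hand.

```python
import json
from collections import Counter

# Hypothetical thesaurus: each head word maps to its synonyms.
# The entries are illustrative only and are not taken from the study.
THESAURUS_JSON = """
{
  "كتاب": ["مؤلف", "سفر"],
  "معلم": ["مدرس", "استاذ"]
}
"""


def build_lookup(thesaurus):
    """Map every head word and every synonym to its head word."""
    lookup = {}
    for head, synonyms in thesaurus.items():
        lookup[head] = head
        for synonym in synonyms:
            lookup[synonym] = head
    return lookup


def index_terms(text, thesaurus, top_n=2):
    """Return the top_n index terms with their thesaurus synonyms.

    Tokens are split on whitespace (an assumption made for brevity).
    Each token is replaced by its head word when the thesaurus knows it,
    so that synonyms are counted together as one index term.
    """
    lookup = build_lookup(thesaurus)
    counts = Counter(lookup.get(token, token) for token in text.split())
    top = [term for term, _ in counts.most_common(top_n)]
    # Attach the synonyms as additional relevant words for each index term.
    return {term: thesaurus.get(term, []) for term in top}


if __name__ == "__main__":
    thesaurus = json.loads(THESAURUS_JSON)
    sample = "مدرس معلم كتاب استاذ سفر"
    for term, related in index_terms(sample, thesaurus).items():
        print(term, related)
```

Counting synonyms under a shared head word is one simple way to let a thesaurus influence which terms are chosen as index terms; the study's own grouping and weighting scheme may differ.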