Resource description framework triples entity formations using statistical language model
Abstract
A method is needed for converting unstructured sentences from a source corpus into a specific
knowledge representation such as the Resource Description Framework (RDF). This paper
introduces a method for forming RDF entities from a paragraph of text using a statistical
language model based on N-grams. RDF entity formation is applied to natural language queries
for information retrieval over Islamic knowledge. The knowledge base consists of 300 concepts
from the English translation of the Holy Quran, connected by 350 relationships. We evaluate
our approach on a collection of 82 queries from the Islamic Research Foundation website and
compare its performance against the method previously used in FREyA. The results show that
the proposed method improves the accuracy of natural language formulation analysis, as tested
on the search strategy, by 17.07%. Recall and precision increase by 7% and 3%, respectively.
Keywords: semantic web; N-gram; ontology; statistical model
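
The abstract names an N-gram statistical language model but does not give its details, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a bigram model with add-one smoothing and uses it to rank candidate subject-predicate-object orderings by how plausible their word sequence is. The toy sentences, the function names (train_bigram, log_prob, best_triple) and the candidate triples are all hypothetical and are not taken from the paper's knowledge base.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(phrase, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a phrase."""
    vocab_size = len(unigrams)
    tokens = ["<s>"] + phrase.lower().split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        # Laplace smoothing keeps unseen bigrams from getting zero probability.
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


def best_triple(candidates, unigrams, bigrams):
    """Return the candidate (subject, predicate, object) with the highest score."""
    return max(candidates, key=lambda t: log_prob(" ".join(t), unigrams, bigrams))


# Invented toy corpus (assumption: stands in for the source text).
corpus = [
    "fasting is prescribed in ramadan",
    "prayer is performed daily",
    "charity is given to the poor",
]
unigrams, bigrams = train_bigram(corpus)

# Two hypothetical orderings of the same words as an RDF-style triple.
candidates = [
    ("fasting", "prescribed in", "ramadan"),
    ("ramadan", "prescribed in", "fasting"),
]
chosen = best_triple(candidates, unigrams, bigrams)
print(chosen)
```

In this sketch the ordering whose word sequence shares more bigrams with the corpus scores higher, which is the intuition behind using an N-gram model to decide which terms fill the subject and object slots. A real system would need a much larger corpus and a mapping from the chosen terms to ontology concepts.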