Using Arabic Skeleton Morphology and Maximum Entropy for Arabic Document Classification
O. G. El-Barbary *
Faculty of Science, Tanta University, Egypt and Faculty of Science and Arts in Sajir, Shaqra University, KSA.
*Author to whom correspondence should be addressed.
Abstract
The morphology of Arabic plays an important role in computational natural language processing systems. The rich morphology of the language and the complexity of its word formation make morphological approaches to Arabic very challenging. In this paper, we present a new method for Arabic document classification using maximum entropy and the morphological derivation of Arabic words. The method uses maximum entropy and the derivational morphology of Arabic words for text classification, estimating the conditional distribution of the class variable given the document. Using these derivatives, we can find related words in a document that contains words and their derivatives. The proposed approach is designed for both vowelized and unvowelized Arabic documents.
Keywords: Document classification, Arabic information retrieval, Arabic morphology, maximum entropy.