Hidden Markov Model Approach for Offline Yorùbá Handwritten Word Recognition
Jumoke F. Ajao *
Department of Computer Science, Kwara State University, Malete, Nigeria.
Stephen O. Olabiyisi
Department of Computer Science & Engineering, Ladoke Akintola University of Technology, Ogbomosho, Nigeria.
Elijah O. Omidiora
Department of Computer Science & Engineering, Ladoke Akintola University of Technology, Ogbomosho, Nigeria.
Oladayo O. Okediran
Department of Computer Science & Engineering, Ladoke Akintola University of Technology, Ogbomosho, Nigeria.
*Author to whom correspondence should be addressed.
Abstract
This paper presents a recognition system for Yorùbá handwritten words using the Hidden Markov Model (HMM). The work is divided into four stages: data acquisition, preprocessing, feature extraction and classification. Data were collected from adult indigenous writers, and the scanned images were preprocessed by greyscale conversion, binarization, noise removal and normalization. A new set of features for handwritten Yorùbá words was extracted from each normalized word image using the discrete cosine transform (DCT), with zigzag scanning applied to capture the character shape, the underdot and the diacritic sign from the spatial frequencies of the word image. A ten-state left-to-right HMM was used to model the Yorùbá words. The initial probabilities of the HMM were randomly generated based on the model created for the Yorùbá alphabet. One HMM was constructed for each class of image features. The Baum-Welch re-estimation algorithm was applied to train each class HMM on the DCT feature vectors of the handwritten word images. The Viterbi algorithm was used to classify each handwritten word, giving the corresponding state sequence that best describes the model. In our experiments, the highest test accuracy was 92% and the highest recognition rate was 95.6%, which indicates that the recognition system performs accurately.
Keywords: Handwriting recognition, Yorùbá corpus, Yorùbá language, database schema, database interface
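
To make the feature extraction step in the abstract concrete, the following is a minimal sketch of DCT-based features read out in zigzag order. It is an illustration under stated assumptions, not the authors' implementation: the function names (dct_matrix, dct2, zigzag_indices, dct_zigzag_features) and the parameter n_coeffs are hypothetical, the DCT is taken over the whole normalized image, and the number of retained coefficients is a free choice that the abstract does not specify.

```python
import numpy as np


def dct_matrix(n):
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return m


def dct2(image):
    """Separable 2-D DCT-II: transform the rows, then the columns."""
    rows, cols = image.shape
    return dct_matrix(rows) @ image @ dct_matrix(cols).T


def zigzag_indices(rows, cols):
    """List (row, col) pairs in JPEG-style zigzag order.

    Cells are grouped by anti-diagonal (row + col); the traversal
    direction alternates from one diagonal to the next.
    """
    return sorted(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def dct_zigzag_features(image, n_coeffs):
    """Keep the first n_coeffs DCT coefficients in zigzag order.

    Assumption: low-frequency coefficients, which come first in the
    zigzag scan, carry most of the shape information.
    """
    coeffs = dct2(np.asarray(image, dtype=float))
    order = zigzag_indices(*coeffs.shape)[:n_coeffs]
    return np.array([coeffs[r, c] for r, c in order])


if __name__ == "__main__":
    # Hypothetical 32 x 64 binarized word image filled with random pixels.
    rng = np.random.default_rng(0)
    word_image = (rng.random((32, 64)) > 0.5).astype(float)
    features = dct_zigzag_features(word_image, n_coeffs=20)
    print(features.shape)
```

Because the zigzag scan visits low spatial frequencies first, truncating it gives a compact vector. A sequence of such vectors (for example, one per image frame) could then serve as the observations for a per-class HMM. How the images are framed is left open here because the abstract does not describe it.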