Hardware and Software Co-Design of Arabic Alphabets Recognition Platform for Blind and Visually Impaired Persons
Brahim Sabir1, *, Yassine Khazri1, Mohamed Moussetad1, Bouzekri Touri2
Identifiers and Pagination:
Year: 2017
First Page: 193
Last Page: 200
Publisher Id: TOEEJ-11-193
Article History:
Received Date: 31/05/2017
Revision Received Date: 07/06/2017
Acceptance Date: 25/10/2017
Electronic publication date: 16/11/2017
Collection year: 2017
open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: https://creativecommons.org/licenses/by/4.0/legalcode. This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Optical character recognition (OCR) is a technique that converts scanned or printed text images into editable text. Many OCR solutions have been proposed and used for Latin and Chinese alphabets.
However, little has been published on OCR for handwritten Arabic alphabets, and especially on OCR intended for blind and visually impaired persons.
This paper is an attempt towards the development of an OCR for Arabic alphabets dedicated to blind and visually impaired persons.
The proposed optical Arabic alphabet recognition algorithm includes binarization of the input image, segmentation, feature extraction, and classification based on neural networks to match the read Arabic alphabets against trained patterns.
The proposed algorithm has been developed using MATLAB, and the solution was designed to be implemented on a hardware platform and can be customized for mobile phones.
The presented method has the benefit that its recognition accuracy is comparable to that of other OCR algorithms.
For these reasons, optical character recognition (OCR) of Arabic alphabets is more complicated than the recognition of other languages (Latin, Chinese, …).
The characteristics of Arabic do not allow direct implementation of many algorithms used for other languages such as English or Chinese. This is mainly because Arabic characters are always connected, even when typewritten.
Automatic recognition has large commercial importance, with applications in cheque reading, data collection from scanned documents, and e-book production. Among the methods used for OCR, artificial neural networks (ANNs) offer parallel processing of data, in contrast to conventional sequential computing systems.
ANNs have been used in a wide variety of applications such as character recognition. Most of these applications are software-based.
Software implementations of optical character recognition (OCR) systems have been reported extensively in the research literature. Off-line printed and handwritten text recognition has been done using ANNs and other technologies such as hidden Markov models (HMMs).
A hardware-based ANN can take different architectural forms such as perceptron, feed-forward multilayer perceptron, radial basis function, etc.
The scope of this paper is optical character recognition of Arabic alphabets, dedicated to blind and visually impaired persons.
In this paper, after reviewing the state of the art in Section 2 and listing the material and software used in Section 3, the recognition process is explained in Section 4.
The hardware design of the solution is then described in Section 5, and Section 6 is dedicated to the experimental results of the proposed solution.
Finally, conclusions are drawn in the last section, and directions for future work are presented.
|Fig. (1). Arabic Alphabets.|
2. RELATED WORKS
Many approaches are used to recognize Arabic alphabets, such as the grapheme-based approach, global features obtained with a spatial–temporal transform (wavelets), directional/angular distributions [6, 7], word/phrase measurements [6, 8], statistical features such as texture measurements [6, 7], grapheme distributions, gray-level statistics [9, 10], cross-correlation distributions [11, 12], dynamic time warping, artificial neural networks (ANNs), or kernel methods such as support vector machines (SVMs).
Regarding recognition rates, a rate of 97% was obtained with the dynamic time warping (DTW) algorithm; a multi-layer perceptron (MLP) based classifier yields an average recognition rate of 94.93%; the grapheme-based approach reaches a rate of 96%; and the recurrent neural network of Abandah et al. has a word recognition rate of 94.76%.
Hidden Markov models (HMMs) reach a rate above 90%.
An ANN achieved 93% correct character recognition.
3. MATERIAL AND SOFTWARE USED
MATLAB 7.10.0 (R2010a), 64-bit version, is used to implement the proposed Arabic OCR algorithm on a Dell machine with an Intel Core CPU M640 @ 2.8 GHz and 4 GB of RAM, running a 64-bit MS Windows 7 operating system.
The Cyclone II FPGA board is used to implement the VHDL code generated from the MATLAB files.
The ModelSim-Intel FPGA-2016 software was used to simulate the hardware design of our OCR.
4. PROPOSED METHOD
Fig. (2) shows the main steps to build an optical character recognition (OCR) system that offers blind and visually impaired persons the capacity to scan printed text and have it spoken back in synthetic speech.
|Fig. (2). Main steps of proposed method.|
In the optical character recognition step, we followed: training procedure, image acquisition, preprocessing, segmentation, feature extraction, and classification.
The speech synthesizer step is outside the scope of the present paper.
The following sections explain these steps and the techniques used in the experiments of this paper.
Preprocessing steps include binarization, filtering, and smoothing.
The input image is preprocessed in order to clean noise: it is converted to gray scale, then to a binary image, and salt-and-pepper noise is removed.
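The salt-and-pepper cleaning step can be illustrated with a 3x3 median filter, a common choice for this kind of noise. The text does not name the filter it uses, so the filter type, the window size, and the function name `median_filter_3x3` are assumptions made for this sketch.

```python
from statistics import median_low

def median_filter_3x3(image):
    """Return a filtered copy of a grayscale image (list of rows of ints).

    Each pixel is replaced by the (low) median of its 3x3 neighbourhood.
    Border pixels use only the neighbours that actually exist.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            out[y][x] = median_low(window)
    return out
```

An isolated dark or bright pixel is outvoted by its neighbours, while larger strokes of the character are mostly preserved.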
A color, gray-level, or binary image of an Arabic alphabet is binarized as follows.
For each point (x, y), where X is a random variable in [0, 254] used as the threshold:
If I(x, y) < X: level of the selected point = 1.
If I(x, y) > X: level of the selected point = 0.
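The thresholding rule above can be sketched as follows. The gray-scale conversion weights (the usual BT.601 luma weights) and the default threshold of 128 are assumptions; the text only states that X lies in [0, 254]. The case I(x, y) = X is not covered by the rule and is mapped to 0 here.

```python
def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to rows of gray levels in 0..255.

    The luma weights are an assumed, conventional choice.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in rgb_image
    ]

def binarize(gray_image, threshold=128):
    """Apply the rule from the text: I(x, y) < X gives 1, otherwise 0."""
    return [
        [1 if pixel < threshold else 0 for pixel in row]
        for row in gray_image
    ]
```

With this convention, dark ink on a light background ends up as 1 and the background as 0.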
4.2. Training Procedure
In order to build our artificial neural network, we used the backpropagation algorithm, which takes the segmented alphabets (matrix 35 * 35) as input and produces a column vector of 28 alphabets as output.
The training begins with random weights, and the goal is to adjust them so that the error will be minimal.
We denote by A the activation function, x the input, and w the weight:
Sigmoidal output (O) function:
The goal of the training process is to obtain a desired output when certain inputs (matrix of an Arabic alphabet) are given.
Since the error is the difference between the actual and the desired output, the error depends on the weights, and we need to adjust the weights in order to minimize the error E (the difference between the output function and the desired output):
The error of the network will be the sum of the errors of all the neurons in the output layer:
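Under the usual backpropagation conventions, the quantities named above can be written as follows. The indices i (inputs) and j (neurons) and the symbol d_j for the desired output are assumed notation, and the squared-error form is the conventional choice rather than one stated in the text.

```latex
\begin{align}
A_j(\mathbf{x}, \mathbf{w}) &= \sum_{i} x_i \, w_{ji} \\
O_j &= \frac{1}{1 + e^{-A_j}} \\
E_j &= \left(O_j - d_j\right)^2 \\
E &= \sum_{j} \left(O_j - d_j\right)^2
\end{align}
```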
This error gives rise to the confusion rates reported in the tables of the experimental results section.
The input to the training process is a set of Arabic alphabets’ images which ultimately produce the weight matrix of the recognizer’s neural network.
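As a minimal sketch of this training procedure, the following code adjusts random initial weights by gradient descent on the squared error of sigmoid outputs. It assumes a single layer with no hidden units and no bias (so backpropagation reduces to the delta rule); the learning rate, epoch count, and all function names are illustrative assumptions, not values taken from the paper.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(weights, x):
    """Output O_j = sigmoid(sum_i x_i * w_ji) for every output neuron j."""
    return [sigmoid(sum(xi * wi for xi, wi in zip(x, row))) for row in weights]

def network_error(weights, samples):
    """Sum of squared differences between actual and desired outputs."""
    return sum(
        (o - d) ** 2
        for x, target in samples
        for o, d in zip(forward(weights, x), target)
    )

def train(samples, n_inputs, n_outputs, rate=0.5, epochs=500, seed=0):
    """samples: list of (input_vector, desired_output_vector) pairs.

    Returns the weight matrix: n_outputs rows of n_inputs weights.
    """
    rng = random.Random(seed)
    # Training starts from random weights, as described in the text.
    weights = [
        [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
        for _ in range(n_outputs)
    ]
    for _ in range(epochs):
        for x, target in samples:
            outputs = forward(weights, x)
            for j, (o, d) in enumerate(zip(outputs, target)):
                # Derivative of (o - d)^2 with respect to the weighted sum.
                delta = 2.0 * (o - d) * o * (1.0 - o)
                for i, xi in enumerate(x):
                    weights[j][i] -= rate * delta * xi
    return weights
```

For the recognizer described here, each input vector would be a flattened 35 * 35 matrix (1225 values) and each desired output a vector of 28 values with a single 1 marking the alphabet.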
4.3. Image Acquisition
The alphabets are scanned or captured during the image acquisition step. The saved image file format is GIF, BMP, or JPEG.
The segmentation algorithm extracts each part as a separate character. Fig. (3) shows the segmentation of the alphabet "Sin".
The non-cursive characteristic of the Arabic alphabets facilitates the preprocessing step.
The segmentation step is performed in order to construct the input matrix (35 * 35).
|Fig. (3). Segmentation step of the alphabet sin (س).|
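One way to build the 35 * 35 input matrix from a binarized image is to crop the character to its bounding box and rescale it. The sketch below assumes the image already contains a single isolated character and uses nearest-neighbour sampling; both the resampling method and the function names are assumptions, since the text does not describe how the matrix is produced.

```python
def bounding_box(binary):
    """Return (top, bottom, left, right), inclusive, of the pixels equal to 1."""
    rows = [y for y, row in enumerate(binary) if any(row)]
    cols = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    if not rows:
        raise ValueError("image contains no ink pixels")
    return rows[0], rows[-1], cols[0], cols[-1]

def normalize_character(binary, size=35):
    """Crop to the bounding box and rescale to size x size.

    Uses nearest-neighbour sampling, so the output stays binary.
    """
    top, bottom, left, right = bounding_box(binary)
    height, width = bottom - top + 1, right - left + 1
    return [
        [
            binary[top + (y * height) // size][left + (x * width) // size]
            for x in range(size)
        ]
        for y in range(size)
    ]
```

Flattening the returned grid row by row gives the 1225-element vector that a neural network classifier would take as input.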
Classification assigns an unknown feature to a predefined class (character shape). A number of classification methods are used for Arabic text recognition, for example: template matching, statistical techniques, syntactic techniques, neural networks, and hidden Markov models.
In this paper, the neural network classifier is used to classify the input Arabic alphabets.
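The final decision of such a classifier can be sketched as picking the output neuron with the highest activation. The ordering of the 28 outputs (standard alphabetical order here) and the name `classify` are assumptions for illustration.

```python
# The 28 Arabic letters in standard alphabetical order (assumed output order).
ARABIC_LETTERS = list("ابتثجحخدذرزسشصضطظعغفقكلمنهوي")

def classify(outputs, labels=ARABIC_LETTERS):
    """Return (label, score) for the output neuron with the highest activation."""
    if len(outputs) != len(labels):
        raise ValueError("expected one output per label")
    best = max(range(len(outputs)), key=outputs.__getitem__)
    return labels[best], outputs[best]
```

A low winning score, or two outputs with nearly equal scores, would indicate the kind of confusion between similar alphabets that shows up in the error rates.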
5. HARDWARE DESIGN
We opted for a hardware implementation in order to improve the speed of the ANN compared with a software implementation.
An ANN’s structural details include the number of inputs and outputs, the number of layers, and the activation functions (AFs).
We implement the individual VHDL modules for a single neuron: adder, multiplier, shifter (divider), and AF (ramp, and piece-wise linear approximation of a sigmoid).
|Adder Size (Bits)||Adder Count|
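How the adder, multiplier, shifter, and AF modules combine inside one neuron can be sketched in integer arithmetic. This is an illustrative software model, not the authors' VHDL: the shift amount, the ramp limit, and the function names are assumptions.

```python
def ramp(value, limit=94):
    """Ramp AF: linear between -limit and +limit, saturated outside."""
    return max(-limit, min(limit, value))

def neuron(inputs, weights, shift=6, activation=ramp):
    """Integer model of one neuron: multiply, accumulate, shift, activate."""
    acc = 0
    for x, w in zip(inputs, weights):
        acc += x * w           # multiplier output fed into the adder
    scaled = acc >> shift      # shifter: divide by 2**shift, rounding down
    return activation(scaled)
```

For example, `neuron([1, 1, 0], [64, 64, 10])` accumulates 128, shifts it down to 2, and passes it through the ramp unchanged.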
5.1. Activation Function Implementation
All the neurons in the entire ANN employ the same AF (activation function), but the output layer may use a different AF type.
There are several ways of approximating the sigmoid function; we opted for the combinational approximation.
The AF output ranges from -94 to 94.
In order to reduce the cost, we replace the division operation with simple right shifts: a 2-bit shift for division by 4, and a 6-bit shift for division by 64.
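A piece-wise linear, sigmoid-shaped AF that uses only these two shifts might look like the sketch below. Only the shift amounts and the output range of -94 to 94 come from the text; the breakpoint (`knee`) and the exact segment layout are hypothetical and chosen purely for illustration.

```python
def pwl_sigmoid(x, knee=256, limit=94):
    """Odd, saturating piece-wise linear curve built from right shifts only.

    |x| <  knee: slope 1/4  (right shift by 2)
    |x| >= knee: slope 1/64 (right shift by 6), clipped at +/- limit
    """
    magnitude = -x if x < 0 else x
    if magnitude < knee:
        y = magnitude >> 2
    else:
        y = (knee >> 2) + ((magnitude - knee) >> 6)
    y = min(y, limit)
    return -y if x < 0 else y
```

The steep middle segment and the shallow outer segments mimic the S shape of a sigmoid without any multiplier or divider, which is the cost saving the shifts are meant to provide.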
5.2. Threshold Function Implementation
The threshold function in our neurons outputs ‘0’ if the input is less than or equal to zero; else the output is ‘1’.
The hardware implementation of this function simply consists of a most-significant-bit comparator.
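The sign-bit test can be modelled in software as below, assuming the neuron value is a two's-complement word. The 16-bit width is an assumption, and the explicit zero check is added here so that an input of exactly zero yields 0, as the definition above requires.

```python
def threshold(word, width=16):
    """Threshold AF on a two's-complement value stored in `width` bits.

    Returns 1 for a strictly positive value, 0 for zero or a negative value.
    """
    word &= (1 << width) - 1                 # keep only the low `width` bits
    sign_bit = (word >> (width - 1)) & 1     # MSB is 1 for negative values
    return 1 if sign_bit == 0 and word != 0 else 0
```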
5.3. Model Simulation and Analysis
The proposed VHDL models were simulated using ModelSim-Intel FPGA-2016.
We opted for input samples made up of 35x35 grids, which is the matrix generated from the segmentation of the Arabic alphabets.
6. EXPERIMENTAL RESULTS
Table 2 summarizes the recognition rates of the proposed algorithm on the software and hardware platforms.
Tests were done using images of size 2500*2500 pixels.
The recognition rate of the software-based solution is higher than that of the hardware implementation; however, the hardware platform reduces the processing time by 86%.
The higher error rate on the hardware platform could be mainly attributed to the following factors:
- The approximation of the sigmoid AF.
- The grid size: using a larger grid, however, would lead to a higher hardware implementation cost.
|Method||Recognition Rate, Same Font Type (%)||Error Rate (%)||Time|
|Proposed Algorithm on Hardware||85.7%||14.3%||0.3 seconds|
In this paper, we present an OCR for Arabic alphabets. The OCR system has been quite successful in recognizing input images. The experimental results show that, for test images (matrix of 35*35) of the same font type, the accuracy rate is 96.4% with the software implementation.
Searches and comparisons are time consuming; however, the accuracy of the proposed method is high compared to other OCRs.
The proposed algorithm on the hardware platform is compared with the software-only version of the proposed algorithm.
As future work, a multiple classifier will be implemented for the alphabets with high rates of confusion. Additional training samples could improve the learning ability of the ANN, hence improving the prediction accuracy. This may not affect the ANN hardware configuration.
In this paper the hardware implementation of the ANN had lower accuracy than its software counterpart.
The hardware’s character recognition accuracy is still high, at 85.7%.
One reason for the reduced accuracy is the approximation of the non-linear sigmoid activation function.
Further improvements are possible without significantly altering the hardware, for example by using a larger dataset for training, and by implementing the proposed design with synthesis tools (such as Synopsys Design Compiler) in order to save on adder hardware (inside a neuron).
The proposed solution can be customized to be implemented in mobile phones.
CONSENT FOR PUBLICATION
CONFLICT OF INTEREST
The author (editor) declares no conflict of interest, financial or otherwise.
I would like to thank Mrs. Maida Bermudez for her valuable contribution.