Optical Character Recognition for Cursive Handwriting

A method that uses a sequence of segmentation and recognition algorithms is proposed for the offline cursive handwriting recognition problem. First, some global parameters, such as slant angle, baselines, and stroke width and height, are estimated. Second, a segmentation method finds character segmentation paths by combining gray-scale and binary information. Third, a Hidden Markov Model (HMM) is employed for shape recognition to label and rank the character candidates. For this purpose, a string of codes is extracted from each segment to represent the character candidates. The estimation of feature space parameters is embedded in the HMM training stage together with the estimation of the HMM model parameters. Finally, the lexicon information and HMM ranks are combined in a graph optimization problem for word-level recognition. This method corrects most of the errors produced by the segmentation and HMM ranking stages by maximizing an information measure in an efficient graph search algorithm. The experiments indicate higher recognition rates compared to the available methods reported in the literature.
The most difficult problem in the field of Optical Character Recognition (OCR) is the recognition of unconstrained cursive handwriting. The present tools for modeling the almost infinitely many variations of human handwriting are not yet sufficient. The similarities of distinct character shapes and the overlaps and interconnections of neighboring characters further complicate the problem. Additionally, when observed in isolation, characters are often ambiguous and require context information to reduce the classification error. Thus, current research aims at developing constrained systems for limited-domain applications such as postal address reading, check sorting, tax reading, and office automation for text entry. A well-defined lexicon plus a well-constrained syntax helps provide a feasible solution to the problem.
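To make the role of the lexicon concrete, the following is a minimal sketch of lexicon-driven word-level recognition over a graph of candidate segments. It is not the information measure or the search algorithm of the proposed method: it simply sums log-scores along a path, and the names `score_word` and `recognize`, the edge layout, and the scores themselves are all hypothetical. Each edge joins two segmentation boundaries and carries the scores a character recognizer (for example, an HMM) might assign to that segment.

```python
import math

def score_word(word, edges, n_boundaries):
    """Best summed log-score for reading `word` along a path from
    boundary 0 to the last boundary, using one edge per letter."""
    neg_inf = -math.inf
    # best[k][b]: best score after matching the first k letters and
    # arriving at segmentation boundary b.
    best = [[neg_inf] * n_boundaries for _ in range(len(word) + 1)]
    best[0][0] = 0.0
    for k, ch in enumerate(word):
        for (start, end), scores in edges.items():
            if best[k][start] == neg_inf or ch not in scores:
                continue
            candidate = best[k][start] + scores[ch]
            if candidate > best[k + 1][end]:
                best[k + 1][end] = candidate
    return best[len(word)][n_boundaries - 1]

def recognize(lexicon, edges, n_boundaries):
    """Return the lexicon word with the best path score, or None."""
    if not lexicon:
        return None
    best_score, best_word = max(
        (score_word(w, edges, n_boundaries), w) for w in lexicon
    )
    return best_word if best_score > -math.inf else None

# Hypothetical candidate segments for an image that could read
# "clear" or "dear": the first two strokes are either "c" + "l"
# (two segments) or a single "d" (one merged segment).
edges = {
    (0, 1): {"c": -0.3},
    (1, 2): {"l": -0.4},
    (0, 2): {"d": -0.5},
    (2, 3): {"e": -0.2},
    (3, 4): {"a": -0.3, "e": -1.5},
    (4, 5): {"r": -0.2},
}
lexicon = ["clear", "dear", "deer"]
print(recognize(lexicon, edges, n_boundaries=6))
```

Because only lexicon words are scored, an ambiguous segmentation such as "cl" versus "d" is resolved at the word level rather than per character. A plain sum of log-scores tends to favor paths with fewer segments, which is one reason a practical system would use a normalized or information-based measure instead.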
Handwritten Word Recognition techniques use either holistic or analytic strategies for the training and recognition stages. Holistic strategies employ top-down approaches for recognizing the whole word, thus eliminating the segmentation problem. In this strategy, global features, extracted from the entire word image, are used to recognize words from a limited-size lexicon. As the size of the lexicon gets larger, the complexity of the algorithms increases linearly due to the need for a larger search space and a more complex pattern representation. Additionally, the recognition rates decrease rapidly due to the decrease in between-class variances in the feature space.
