
Laplace

Because this filter enhances the features of an image, it also amplifies noise.

Sobel

Sobel filtering convolves the image with two 3 by 3 matrices in order to approximate the gradient of the image. Although this approximation can be inaccurate, it proves effective for our needs.

Sobel
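As a rough sketch of how a Sobel gradient might be computed with OpenCV (the kernel size and file name here are illustrative assumptions, not the module's actual code):

    import cv2

    # Load the scanned page as a grayscale image (file name is an assumption).
    gray = cv2.imread('scan.png', cv2.IMREAD_GRAYSCALE)

    # Convolve with the two 3x3 Sobel kernels to approximate the
    # horizontal (dx) and vertical (dy) derivatives of the image.
    grad_x = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    grad_y = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)

    # Combine the two derivatives into a single gradient magnitude.
    magnitude = cv2.magnitude(grad_x, grad_y)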

After testing with multiple files, we concluded that the Gaussian filter was best suited for the current data sets. There is a convenient OpenCV function that does this, cv2.GaussianBlur( .. ), with the appropriate parameters: the width and height of the matrix that will be used to filter the image, and the standard deviation of the Gaussian in the x and y directions (the greater the standard deviation, the less variance among the pixels after filtering, i.e. greater blur).

Blurring code
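A minimal sketch of the blurring step, assuming a 5 by 5 kernel and letting OpenCV derive the standard deviation from the kernel size (the parameters used in the module's own code may differ):

    import cv2

    gray = cv2.imread('scan.png', cv2.IMREAD_GRAYSCALE)

    # Smooth with a 5x5 Gaussian kernel; sigmaX = 0 tells OpenCV to
    # compute the standard deviation from the kernel size. Larger
    # sigma values produce a stronger blur.
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)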

Above is our blurred image.

Feature detection

The next step is to determine where the edges in the image exist. After filtering, there needs to be some way to take each character and describe its features as measurable values. In OpenCV, we used the adaptiveThreshold function to take our image and decide whether each intensity value in the image should become a 0 or a 1. Effectively, our processed image is now a matrix of binary values. We use the provided helper function for this.

Adaptive Threshold code

In deciding whether a pixel meets the threshold, there are two methods: an adaptive mean filter or a Gaussian. We found the Gaussian worked better. The threshold type should be cv2.THRESH_BINARY_INV, which inverts the values: pixels deemed white become black and vice versa. This is because OpenCV's edge-finding functions expect white characters on black backgrounds.
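A sketch of that call, assuming an 11-pixel neighborhood and a constant of 2 (typical illustrative values, not necessarily those used in the provided helper):

    import cv2

    # Binarize the blurred image: each pixel is compared against a
    # Gaussian-weighted mean of its 11x11 neighborhood, minus 2.
    # THRESH_BINARY_INV inverts the result so characters come out
    # white on a black background.
    thresh = cv2.adaptiveThreshold(blurred, 255,
                                   cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY_INV,
                                   11, 2)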

Above shows the adaptive threshold of the image.

From here, we can use OpenCV's findContours method to find the edges. This gives us the coordinates, width, and height of a rectangle around each character. Given specific properties, some rectangles may be removed, resized, or combined to accommodate special cases.

Finding the contours
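One way this step might look (the return values of findContours vary between OpenCV versions, so treat this as an illustrative sketch rather than the module's code):

    import cv2

    # OpenCV 2.x and 4.x return (contours, hierarchy); 3.x prepends
    # the modified image, so unpack accordingly for your version.
    # RETR_LIST returns every contour, including inner ones.
    contours, hierarchy = cv2.findContours(thresh.copy(),
                                           cv2.RETR_LIST,
                                           cv2.CHAIN_APPROX_SIMPLE)

    # Wrap each contour in its bounding rectangle: (x, y, width, height).
    rectangles = [cv2.boundingRect(c) for c in contours]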

Afterwards, we call the following two helper functions, findContoursAreas() and removeOverlaps(). findContoursAreas removes contours in the list contourlist that do not meet a specified minimum height. We use this to ignore small contours, like the tittles on 'i's and 'j's.

removeOverlaps goes through the rectangles list and returns a list of contours (rectangles) that do not overlap, keeping the largest rectangle when overlap does occur. This is necessary because findContoursAreas returns all contours in an image, even those that do not define an actual character on their own. For example, the contour that makes up part of a letter, like the "o" in a "p", would be included in addition to the contour we actually want, the one that encloses the whole "p". We then sort the list so we can read the remaining letter outlines left to right, top to bottom. We do this with trainingHelper.xsort( countour_list ), which uses a simple algorithm to sort the rectangles into rows and then into columns.

Above shows the code discovering the regions of interest.
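The helper functions themselves live in the accompanying files; a hypothetical reconstruction of their logic (the minimum height and internal details here are assumptions) might look like this:

    def findContoursAreas(contour_list, min_height=10):
        # Drop bounding rectangles shorter than min_height pixels,
        # e.g. the tittles on 'i' and 'j'.
        return [(x, y, w, h) for (x, y, w, h) in contour_list if h >= min_height]

    def overlaps(a, b):
        # True if two (x, y, w, h) rectangles intersect.
        ax, ay, aw, ah = a
        bx, by, bw, bh = b
        return not (ax + aw <= bx or bx + bw <= ax or
                    ay + ah <= by or by + bh <= ay)

    def removeOverlaps(rectangles):
        # Keep only the largest rectangle in any overlapping group, so
        # inner contours (the "o" inside a "p") do not survive alongside
        # the rectangle that encloses the whole character.
        kept = []
        for rect in sorted(rectangles, key=lambda r: r[2] * r[3], reverse=True):
            if not any(overlaps(rect, k) for k in kept):
                kept.append(rect)
        return kept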

Taking each rectangle, we create a vector to hold the features of each character image. First, we use a resize method to convert each rectangle into an n by n matrix of values; each value is determined by dividing the image into n by n cells and taking the average intensity of each cell. From here, we turn the n x n matrix into a row vector. We can then append other features to this vector to help determine the character, for example the average intensity of the top half and of the bottom half of the image.

ROI Code block
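A sketch of how each region of interest could be turned into a feature vector, assuming n = 10 (the module's actual cell size and feature choices are not specified here):

    import cv2
    import numpy as np

    N = 10  # assumed cell grid size

    def featureVector(thresh, rect):
        # Crop the character's bounding rectangle from the thresholded image.
        x, y, w, h = rect
        roi = thresh[y:y + h, x:x + w]

        # cv2.resize with INTER_AREA averages the pixels falling into
        # each of the N x N output cells.
        cells = cv2.resize(roi, (N, N), interpolation=cv2.INTER_AREA)

        # Flatten the N x N grid into a single row vector of floats.
        vec = cells.reshape(1, N * N).astype(np.float32)

        # Append extra features, e.g. the mean intensity of the top and
        # bottom halves (assumes h >= 2 after the minimum-height filter).
        top = float(roi[: h // 2].mean())
        bottom = float(roi[h // 2:].mean())
        return np.hstack([vec, np.float32([[top, bottom]])])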

Classification

To classify an image, there needs to be a reference database to compare the characters against. Our provided code builds this through the training method, which uses machine learning. Using multiple sets of pictures, we take an image, run it through all of the steps above, and tell the program what each character should be. When finished, we have training data, which holds the feature vectors mapped to specific characters.

Training data
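A sketch of how the training data might be accumulated and handed to OpenCV's k-nearest-neighbor implementation (this uses the OpenCV 3+ cv2.ml API; older 2.x builds exposed cv2.KNearest() instead, and the variable names here are illustrative):

    import cv2
    import numpy as np

    # One feature vector per labelled training character, stacked into a
    # (num_chars, num_features) float32 matrix, plus the numeric label of
    # each character (e.g. ord('A')) as a column of responses.
    samples = np.vstack(training_vectors).astype(np.float32)
    responses = np.array(training_labels, dtype=np.float32).reshape(-1, 1)

    knn = cv2.ml.KNearest_create()
    knn.train(samples, cv2.ml.ROW_SAMPLE, responses)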

Returning to the current image, we take its feature vector and use a k-nearest neighbor algorithm to determine which training vector is most similar to it. From there, we see what that vector is mapped to and return what should be the actual letter or number.

OCR
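And a sketch of classifying a new character against that model, assuming k = 3:

    # 'vec' is the feature vector of the character being classified,
    # shaped (1, num_features) and of type np.float32.
    ret, results, neighbours, dist = knn.findNearest(vec, 3)

    # The result is the numeric label of the closest training vectors;
    # map it back to a character.
    predicted_char = chr(int(results[0][0]))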

And there you have it! Thank you for reading through this introduction to OCR. Feel free to look through the files, as they contain more resources that explain this topic.

Source:  OpenStax, Introduction to optical character recognition. OpenStax CNX. Dec 17, 2014 Download for free at http://legacy.cnx.org/content/col11728/1.1
