Algorithmically, our red cup replacement program breaks down into three main stages: identifying the cup, finding a suitable replacement image, and merging the found image into the original. Each stage presents its own technical challenges and solutions.
Our test identification algorithm is based on simple template matching. The template image of the desired object is slid across the original image (a cross-correlation rather than a true convolution, since the template is not flipped), and the correlation between the two is computed at every offset. The correlation is then normalized with respect to the local intensity of the original image, giving a value between -1 and 1. This process is encapsulated in the MATLAB function normxcorr2, which takes two grayscale image matrices and returns a single correlation matrix whose width and height are one less than the sum of the widths and heights of the input matrices.
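The normalization step can be illustrated with a minimal pure-Python sketch. It is not the MATLAB implementation: the function name `ncc_valid` is hypothetical, images are plain lists of rows, and only offsets where the template lies entirely inside the image are evaluated, whereas normxcorr2 also returns the partially overlapping border positions.

```python
import math

def ncc_valid(image, template):
    """Normalized cross-correlation at every offset where the template
    fits entirely inside the image (grayscale, lists of rows)."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    n = th * tw

    # Subtract the template mean once; its norm is the same for every offset.
    t_mean = sum(map(sum, template)) / n
    t_dev = [[v - t_mean for v in row] for row in template]
    t_norm = math.sqrt(sum(v * v for row in t_dev for v in row))

    out = []
    for r in range(ih - th + 1):
        out_row = []
        for c in range(iw - tw + 1):
            patch = [image[r + i][c:c + tw] for i in range(th)]
            p_mean = sum(map(sum, patch)) / n
            num = 0.0   # correlation of the mean-removed patch and template
            p_sq = 0.0  # energy of the mean-removed patch
            for i in range(th):
                for j in range(tw):
                    d = patch[i][j] - p_mean
                    num += d * t_dev[i][j]
                    p_sq += d * d
            denom = math.sqrt(p_sq) * t_norm
            # Flat patches have no defined correlation; report 0 for them.
            out_row.append(num / denom if denom > 0 else 0.0)
        out.append(out_row)
    return out
```

Because both the patch and the template have their means removed and are divided by their norms, an exact copy of the template scores 1 and an intensity-inverted copy scores -1, regardless of overall brightness.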
The program sets a threshold value (around 0.7, chosen by experimentation) to decide whether the template has matched a cup in the original image. The correlation is run on each color channel separately, and each result is compared with the threshold. The program then ANDs the resulting thresholded correlation matrices together, so a match is only found if it matches in red, green, and blue. This prevents a red region (100% red, 0% green, 0% blue) from matching a white region (100% red, 100% green, 100% blue). At this stage, all points that exceed the threshold are considered candidate matches. In order to find the actual location of each cup, the algorithm finds the overall maximum correlation, records a cup at that location, and then masks out the area of the found cup. This neutralizes the other over-threshold points nearby that correspond to the same cup, preventing overlapping cup hits. The algorithm then finds the next greatest maximum and repeats until all points over the threshold have been accounted for.
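The thresholding and peak-picking loop described above might be sketched as follows. This is an illustration under assumptions, not the original code: `find_cups` is a hypothetical name, the three correlation maps are assumed to have the same size, and the score used to rank candidates (the mean of the three channel correlations) is our choice, since the text does not say which channel's value is maximized.

```python
def find_cups(corr_r, corr_g, corr_b, cup_h, cup_w, threshold=0.7):
    """Return (row, col) hits, strongest first, with no two hits
    closer together than one cup size."""
    rows, cols = len(corr_r), len(corr_r[0])

    # AND of the three thresholded maps: a point is a candidate only if
    # every channel exceeds the threshold. None marks a non-candidate.
    score = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = (corr_r[r][c], corr_g[r][c], corr_b[r][c])
            if all(v > threshold for v in vals):
                score[r][c] = sum(vals) / 3.0  # assumed ranking score

    hits = []
    while True:
        # Find the strongest remaining candidate.
        best = None
        for r in range(rows):
            for c in range(cols):
                s = score[r][c]
                if s is not None and (best is None or s > best[0]):
                    best = (s, r, c)
        if best is None:
            break  # every over-threshold point has been accounted for
        _, br, bc = best
        hits.append((br, bc))

        # Mask every position whose cup-sized window would overlap this hit.
        for r in range(max(0, br - cup_h + 1), min(rows, br + cup_h)):
            for c in range(max(0, bc - cup_w + 1), min(cols, bc + cup_w)):
                score[r][c] = None
    return hits
```

A point that correlates strongly in red but weakly in green and blue never becomes a candidate, which is the red-versus-white case the AND step is meant to reject.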
Unfortunately, this approach only works for one size of cup in the source image (the size of the template). To detect all cup sizes, the scale of the template relative to the source image must change, and the correlation must be run at each scale. Our algorithm scales down the original image using imresize and leaves the template small, which saves runtime by shrinking the correlation rather than enlarging it. After each small change in size, the correlation function runs and saves matched regions to an accumulation array. The function also keeps track of the masks of previous match regions so that smaller cups aren’t found erroneously inside larger cups. The match regions are recorded at the scale of the original image, so the algorithm keeps track of the scale factor at each step and sizes the recorded region accordingly.
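The scale bookkeeping can be sketched in Python as below. All names are hypothetical; `shrink_nearest` is a crude nearest-neighbour stand-in for imresize, `match_fn` stands for whatever single-scale matcher is used (it is assumed to return (row, col) hits in the scaled image), and the order in which scales are visited is left to the caller.

```python
def shrink_nearest(image, scale):
    """Nearest-neighbour downscale (stand-in for a real resize routine)."""
    h, w = len(image), len(image[0])
    nh, nw = max(1, int(h * scale)), max(1, int(w * scale))
    return [[image[min(h - 1, int(r / scale))][min(w - 1, int(c / scale))]
             for c in range(nw)] for r in range(nh)]

def to_original_region(row, col, tmpl_h, tmpl_w, scale):
    """Map a hit in the scaled image to (top, left, height, width)
    in original-image coordinates."""
    return (round(row / scale), round(col / scale),
            round(tmpl_h / scale), round(tmpl_w / scale))

def overlaps(a, b):
    """True if two (top, left, height, width) rectangles intersect."""
    at, al, ah, aw = a
    bt, bl, bh, bw = b
    return at < bt + bh and bt < at + ah and al < bl + bw and bl < al + aw

def multiscale_search(image, tmpl_h, tmpl_w, match_fn, scales):
    """Run match_fn at each scale and accumulate non-overlapping regions."""
    regions = []  # accumulation array, in original-image coordinates
    for scale in scales:
        small = shrink_nearest(image, scale)
        if len(small) < tmpl_h or len(small[0]) < tmpl_w:
            continue  # template no longer fits at this scale
        for row, col in match_fn(small):
            region = to_original_region(row, col, tmpl_h, tmpl_w, scale)
            # Skip hits that fall on an area already claimed by a cup.
            if not any(overlaps(region, prev) for prev in regions):
                regions.append(region)
    return regions
```

Dividing the hit position and the template size by the scale factor is what lets a fixed-size template report cups of different sizes in the original image.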
The search algorithm builds on the idea of template matching and expands it to a wider scope. Ideally, the program would match exactly the regions around each cup and ignore the cup itself. Since our correlation function cannot exclude the middle area, we had to use a different approach. The replacement algorithm generates blocks around the found cup, each with a width proportional to the size of the cup to be replaced. Each individual block is then correlated against every image in the image bank, in the same way as in the cup identification step. The main difference is that the search algorithm must consider all blocks simultaneously: a match is only a match if it works all the way around the suspect region. To achieve this, the correlation matrix for each block is shifted by the displacement of that block from the origin of the replacement region, and the shifted matrices are then merged. This generates a single correlation matrix that takes all blocks into account. The algorithm then finds the region with the highest correlation across all the images and passes that region to the merge algorithm.
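The shift-and-merge step might look like the following sketch. It assumes each block's correlation map is indexed by the block's top-left position in the candidate image, that block offsets from the region origin are non-negative, and that merging means summing; the function names are hypothetical.

```python
def merge_block_correlations(block_maps, offsets):
    """Sum per-block correlation maps after shifting each one by its
    block's (row, col) offset from the origin of the replacement region.

    merged[r][c] is the total score for placing the region origin at (r, c).
    """
    # Only origins for which every block stays inside its map are kept.
    rows = min(len(m) - dr for m, (dr, dc) in zip(block_maps, offsets))
    cols = min(len(m[0]) - dc for m, (dr, dc) in zip(block_maps, offsets))
    if rows <= 0 or cols <= 0:
        return []  # the region does not fit in this candidate image

    merged = [[0.0] * cols for _ in range(rows)]
    for m, (dr, dc) in zip(block_maps, offsets):
        for r in range(rows):
            for c in range(cols):
                # A block offset by (dr, dc) sits at (r + dr, c + dc)
                # when the region origin is at (r, c).
                merged[r][c] += m[r + dr][c + dc]
    return merged

def best_origin(merged):
    """Return (row, col, score) of the highest merged correlation."""
    score, r, c = max((v, r, c)
                      for r, row in enumerate(merged)
                      for c, v in enumerate(row))
    return r, c, score
```

With this arrangement, a single block that matches very well somewhere cannot win on its own: the merged score is high only where all blocks agree on the same region origin.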
We built our test image bank from a relatively small number of images and simply used MATLAB’s imread function to load each one serially. The program runs the block-based correlation described above on each image, keeping track of the highest correlation value and its associated region.
Because the search algorithm works on blocks, one simple improvement we made was to give the blocks different weights based on their importance to the continuity of the image. The human eye notices lines and edges more than muted textures, so we gave more weight to blocks that contained more edges by multiplying their correlation matrices by a factor before the final correlation sum. This modification helped ensure that arms stayed continuous and helped with the hand problem (the frequent presence of hands over the cup).
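One way to realize such a weighting is sketched below. The edge measure (mean absolute difference between neighbouring pixels) and the weight formula are assumptions made for illustration, since the text does not say how edge content was measured; all names are hypothetical, and the correlation maps are assumed to be already shifted to a common origin and equal in size.

```python
def edge_energy(block):
    """Mean absolute difference between horizontal and vertical
    neighbours; a simple stand-in for a real edge detector."""
    h, w = len(block), len(block[0])
    total, count = 0.0, 0
    for r in range(h):
        for c in range(w):
            if c + 1 < w:
                total += abs(block[r][c + 1] - block[r][c])
                count += 1
            if r + 1 < h:
                total += abs(block[r + 1][c] - block[r][c])
                count += 1
    return total / count if count else 0.0

def block_weights(blocks, boost=1.0):
    """Weight 1.0 for a flat block, up to 1.0 + boost for the edgiest one."""
    energies = [edge_energy(b) for b in blocks]
    top = max(energies)
    if top == 0:
        return [1.0] * len(blocks)  # no edges anywhere: weight equally
    return [1.0 + boost * e / top for e in energies]

def weighted_sum(maps, weights):
    """Multiply each aligned correlation map by its weight, then sum."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(w * m[r][c] for m, w in zip(maps, weights))
             for c in range(cols)] for r in range(rows)]
```

The `boost` parameter controls how strongly edge-rich blocks dominate the final sum; a value of zero reduces this to the unweighted merge.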
After finding a region to suitably replace the excised region from the original image, the new image is blended with the original. We used a conditional blend to completely replace the red cup, and then gradually blended the surrounding buffer regions together with the original image. Our blend algorithm used a linear intensity blend (scaled sum of the two images), but could be quickly improved with bicubic blur (taking blur information from above and below as well) and a more consistent merger (angled corners).
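A conditional linear blend of this kind can be sketched as follows for a single grayscale channel. The names are hypothetical, the replacement is assumed to be already aligned with and the same size as the original, and the ramp shape (linear in the distance from the cup rectangle) is one reasonable reading of "linear intensity blend" rather than the exact original formula.

```python
def blend_alpha(r, c, cup, buffer):
    """Weight of the replacement image at pixel (r, c).

    cup is (top, left, height, width). Inside the cup the replacement
    is used outright; across the buffer the weight ramps linearly to 0.
    """
    top, left, h, w = cup
    # Distance (in pixels) outside the cup rectangle along each axis.
    dr = max(top - r, 0, r - (top + h - 1))
    dc = max(left - c, 0, c - (left + w - 1))
    d = max(dr, dc)
    if d == 0:
        return 1.0  # conditional part: fully replace the cup
    if d > buffer:
        return 0.0  # beyond the buffer: keep the original untouched
    return 1.0 - d / (buffer + 1)

def blend(original, replacement, cup, buffer):
    """Scaled sum of the two images, pixel by pixel."""
    out = []
    for r, row in enumerate(original):
        out_row = []
        for c, value in enumerate(row):
            a = blend_alpha(r, c, cup, buffer)
            out_row.append(a * replacement[r][c] + (1 - a) * value)
        out.append(out_row)
    return out
```

Because the distance is measured per axis and combined with a maximum, the ramp forms square rings around the cup, which matches the square-cornered merger the text mentions as a candidate for improvement.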