, motion analysis, and many other tasks of computer vision.
In 1981, Bruce D. Lucas and Takeo Kanade proposed a
new technique that used image intensity
gradient information to search for the best match between a
template T and another image I. The proposed algorithm has been
widely used in the field of computer vision for the last 20 years,
and has had many modifications and extensions. One such
modification is an
algorithm proposed by Simon Baker, Frank
Dellaert, and Iain Matthews. Their algorithm is much more
computationally efficient than the original Lucas-Kanade
algorithm.
In the
Lucas-Kanade 20 Years On: A Unifying Framework
series of articles, Baker and Matthews try to classify image
alignment algorithms and divide them into two categories: forwards
and inverse, depending on the role of the template T. Also, they
classify algorithms as additive or compositional, depending on
whether the warp is updated by adding parameter increments or by
composing transformation matrices. Following this classification,
the Lucas-Kanade algorithm should be called a forwards additive
algorithm, and the Baker-Dellaert-Matthews algorithm should be
called an inverse compositional algorithm.
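To make the distinction concrete, the sketch below contrasts the two update rules. It is illustrative only, not code from the demo project: it assumes a three-parameter vector p=(wz, tx, ty) and a 3x3 warp matrix (as used later in this article), and identity matrices stand in for the real warps.

#include <cxcore.h>

int main(void)
{
    float p[3]       = { 0.10f, 5.0f, -3.0f };  /* current parameters (wz, tx, ty)   */
    float delta_p[3] = { 0.01f, 0.5f,  0.2f };  /* increment found at this iteration */
    float Wd[9], dWd[9], dWinvd[9], Wnewd[9];
    CvMat W     = cvMat(3, 3, CV_32FC1, Wd);     /* stand-in for W(p)       */
    CvMat dW    = cvMat(3, 3, CV_32FC1, dWd);    /* stand-in for W(delta_p) */
    CvMat dWinv = cvMat(3, 3, CV_32FC1, dWinvd);
    CvMat Wnew  = cvMat(3, 3, CV_32FC1, Wnewd);
    int i;

    /* Additive update (forwards additive, Lucas-Kanade):
       simply add the increment to the parameter vector. */
    for (i = 0; i < 3; i++)
        p[i] += delta_p[i];

    /* Compositional update (inverse compositional form):
       compose the current warp with the inverse of the incremental warp. */
    cvSetIdentity(&W,  cvRealScalar(1));
    cvSetIdentity(&dW, cvRealScalar(1));
    cvInvert(&dW, &dWinv, CV_LU);   /* invert the incremental warp   */
    cvMatMul(&W, &dWinv, &Wnew);    /* W(p) <- W(p) * W(delta_p)^-1  */
    return 0;
}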
In this article, we will implement these two algorithms in the C
language using the Intel OpenCV open-source computer vision library as
the base for our implementation. We will also compare these two
algorithms and see which one is faster.
Why is this article needed? First of all, after writing this
article, I have a better understanding of image alignment
algorithms myself (at least I hope so). There are a lot of articles
that provide tons of information on image alignment. Most of them
are aimed at advanced readers, not at beginners. Among those
articles, I've found several that, from my point of view,
describe difficult things more simply. But what I really missed was
code examples to experiment with. So, the other two purposes of
this article are to give a relatively simple explanation of how image
alignment works and to provide sample source code.
Who will be interested in reading this? I think, if you remember
some mathematics from your university program, are familiar with C
or C++, are interested in learning how to track an object in video,
and have a little patience, you can read this.
Compiling Example Code
It is very easy to run the sample – just download
demo_binary.zip and run the executable file.
But to experiment with the code, you will need to compile it
yourself. First of all, you need Visual Studio 2005 to be
installed. Visual C++ Express Edition also fits our needs.
The next step is to download and install the OpenCV library from
here. Download
OpenCV_1.0.exe and run it on your computer.
Finally, you need to configure your Visual Studio as
follows:
- In the Visual Studio main window, choose the Tools
-> Options... menu item, then click the
Projects and Solutions -> VC++ Directories tree
item.
- In the 'Show directories for..' combo box, select 'Library
Files'. Add the following line to the list below:
C:\Program Files\OpenCV\lib
- In the 'Show directories for..' combo box, select 'Include
Files'. Add the following lines to the list below:
C:\Program Files\OpenCV\cv\include
C:\Program Files\OpenCV\cxcore\include
C:\Program Files\OpenCV\otherlibs\highgui
That's all! Now you can open
align_demo.sln, compile, and
run it.
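If you want to verify that these directories are picked up correctly before building the demo, you can compile a tiny test program like the one below. It is not part of the sample project: the image file name is a placeholder, and the #pragma comment lines are only needed if your test project does not already list cv.lib, cxcore.lib, and highgui.lib among its linker inputs.

#include <stdio.h>
#include <cv.h>
#include <highgui.h>

#pragma comment(lib, "cv.lib")       /* OpenCV 1.0 import libraries */
#pragma comment(lib, "cxcore.lib")
#pragma comment(lib, "highgui.lib")

int main(void)
{
    /* "some_image.jpg" is a placeholder; use any image you have at hand. */
    IplImage* img = cvLoadImage("some_image.jpg", 0 /* load as grayscale */);
    if (!img)
    {
        printf("Could not load the test image.\n");
        return -1;
    }

    cvNamedWindow("test", CV_WINDOW_AUTOSIZE);
    cvShowImage("test", img);
    cvWaitKey(0);                /* wait for a key press */

    cvReleaseImage(&img);
    cvDestroyWindow("test");
    return 0;
}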
A few words about the structure of the sample project:
auxfunc.h and auxfunc.cpp contain auxiliary
image warping functions. forwadditive.h and
forwadditive.cpp contain an implementation of the
Lucas-Kanade algorithm. invcomp.h and
invcomp.cpp contain an implementation of the inverse
compositional algorithm. And main.cpp contains the
main() function.
What does it do?
In a few words, this program tries to align the template
(the small butterfly image) with the big image (the butterfly on the
flower). It is not limited to using a butterfly as the template.
For example, you can use a face image as a template and determine
where the face moves on the next frame of the video sequence.
During the process of alignment, we perform warping operations on
the template. We assume that there are three allowed operations:
in-plane rotation of the template, and its 2D translation in X and
Y directions. These operations are called the allowed set of
warps.
We define the allowed set of warps by defining the warp matrix
W(p). The matrix W depends on the vector of parameters
p=(wz, tx, ty), where tx and ty are the translation components and
wz is the rotation component.
It is even possible to use much more complex sets of warps, for
example 3D translation and rotation, scaling, and so on. But here we
use this relatively simple set of warps to keep the explanation clear.
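As a concrete illustration, here is one possible way to build such a warp matrix W(p) and apply it to a template point. This is only a sketch: the demo builds its warps in auxfunc.cpp, the helper name build_warp_matrix below is made up for this example, and for small rotations the cos/sin entries are often linearized to 1 and ±wz.

#include <math.h>
#include <cxcore.h>

/* Fill the 3x3 matrix W with a Euclidean warp built from p = (wz, tx, ty). */
static void build_warp_matrix(float wz, float tx, float ty, CvMat* W)
{
    cvmSet(W, 0, 0, cos(wz)); cvmSet(W, 0, 1, -sin(wz)); cvmSet(W, 0, 2, tx);
    cvmSet(W, 1, 0, sin(wz)); cvmSet(W, 1, 1,  cos(wz)); cvmSet(W, 1, 2, ty);
    cvmSet(W, 2, 0, 0);       cvmSet(W, 2, 1,  0);       cvmSet(W, 2, 2, 1);
}

int main(void)
{
    float Wdata[9], ptdata[3] = { 10.0f, 20.0f, 1.0f }, outdata[3];
    CvMat W   = cvMat(3, 3, CV_32FC1, Wdata);
    CvMat pt  = cvMat(3, 1, CV_32FC1, ptdata);
    CvMat out = cvMat(3, 1, CV_32FC1, outdata);

    /* Rotate the template by 0.1 radians and shift it by (5, -3) pixels. */
    build_warp_matrix(0.1f, 5.0f, -3.0f, &W);

    /* A template point (x, y) maps into image I as W * (x, y, 1)^T. */
    cvMatMul(&W, &pt, &out);
    return 0;
}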
Now, about the logic of the application.
- After you compile and run the demo, you will see a console
window where diagnostic information is displayed. Enter the
components of the vector p=(wz,
tx, ty).
- After that, you can see two windows containing the template T
and the incoming image I. You can also see a white rectangle on the
image I. It is an initial approximation of the butterfly's
position. We just assume that the butterfly is somewhere near that
area.
- Press any key to start the Lucas-Kanade image alignment
algorithm. This algorithm tries to find where the butterfly is
located by iteratively minimizing the difference between the template
and the image I. In the console window, you will see how it works.
You can see the current iteration number, parameter increments, and
the mean error value.
- After the error minimization process converges (a sketch of a
typical stopping rule is shown after this list), the summary is
written to the console. The summary includes the algorithm name,
the calculation time, the resulting mean error, and the final
parameter approximation. You can also see the white rectangle that
displays how the template is aligned on the image I.
- Now, press any key to run the same process, but using the
inverse compositional image alignment algorithm. You can see the
current iteration number and the mean error value.
- The summary is displayed for the inverse compositional
algorithm. Press any key to exit the program.
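The demo decides when to stop in its own way; as a rough sketch (with made-up threshold values), the stopping rule of such an iterative minimization typically looks like this: iterate until the parameter increment becomes negligible or an iteration cap is reached.

#include <math.h>
#include <stdio.h>

int main(void)
{
    const int   MAX_ITER = 100;     /* iteration cap (assumed value)         */
    const float EPS      = 1e-3f;   /* convergence threshold (assumed value) */
    float delta_p[3] = { 1.0f, 1.0f, 1.0f };  /* stand-in for computed increments */
    int i, iter;

    for (iter = 0; iter < MAX_ITER; iter++)
    {
        float norm = 0;

        /* ... the real algorithm would warp the image, compute the error,
           and solve for the parameter increments delta_p here ... */
        for (i = 0; i < 3; i++)
            delta_p[i] *= 0.5f;               /* fake shrinking increments */

        for (i = 0; i < 3; i++)
            norm += delta_p[i] * delta_p[i];
        norm = (float)sqrt(norm);

        printf("iteration %d: |delta_p| = %f\n", iter, norm);
        if (norm < EPS)
            break;                            /* converged */
    }
    return 0;
}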
Let's see the contents of
main.cpp:
// Our plan:
//
// 1. Ask user to enter image warp parameter vector p=(wz, tx, ty)
// 2. Define our template rectangle to be bounding rectangle of the butterfly
// 3. Warp image I with warp matrix W(p)
// 4. Show template T and image I, wait for a key press
// 5. Estimate warp parameters using Lucas-Kanade method, wait for a key press
// 6. Estimate warp parameters using Baker-Dellaert-Matthews, wait for a key press
//
int main(int argc, char* argv[])
{
    // Ask user to enter image warp parameters vector.
    // p = (wz, tx, ty)
    float WZ=0, TX=0, TY=0;
    pr