Pose Tracking

Motivation

3-D pose tracking is the task of following the 3-D position and orientation of a known object through a video sequence. Pose tracking is needed in a multitude of applications, such as self-localisation and object grasping in robotics, motion capture, and human-computer interaction.

[Figure: input image (left), object model (middle), initial pose (right)]

The images above show the input data required by the tracking approach developed in our group: input images from one or several calibrated cameras (left), a 3-D model of the object to be tracked (middle), and an initial, possibly inaccurate, pose estimate for the first image in 3-D space (right). Here, the estimated 3-D pose is illustrated by projecting the object back into the image.
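To make this back-projection step concrete, here is a minimal sketch (not the group's actual code) of projecting a 3-D model point into the image with a calibrated pinhole camera. The intrinsic matrix K and the pose (R, t) below are purely hypothetical example values.

```python
import numpy as np

def project_point(K, R, t, X):
    """Project a 3-D model point X into the image under pose (R, t).

    K : 3x3 intrinsic camera matrix (from calibration)
    R : 3x3 rotation, t : 3-vector translation (the pose hypothesis)
    Returns pixel coordinates (u, v).
    """
    Xc = R @ X + t            # transform into camera coordinates
    x = K @ Xc                # apply the pinhole projection
    return x[:2] / x[2]       # dehomogenise

# Hypothetical calibration and pose, purely for illustration
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                  # identity rotation
t = np.array([0.0, 0.0, 5.0])  # object 5 units in front of the camera

u, v = project_point(K, R, t, np.array([0.0, 0.0, 0.0]))
# the object centre projects to the principal point (320, 240)
```

Rendering every model point this way yields the projected object region shown overlaid on the image above.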

While there are tracking approaches that make use of markers, our method is markerless. Further note that no other prior knowledge, e.g. about the object's colour, is necessary, and that the object model can easily be replaced.

Our Contributions

The 3-D pose tracking work in our group is performed in collaboration with Bodo Rosenhahn and Thomas Brox.

Region-based Pose Tracking

As a first step, we extended the approach in [1], which performed 3-D pose tracking with the help of segmentation, to a segmentation-free pose tracking approach. The new, region-based tracking approach [2] is not only faster than the approach in [1], but can also achieve better results. While this initial contribution only worked for rigid objects, we later extended it to kinematic chains [3].
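In rough terms, region-based pose tracking of this kind adjusts the pose parameters so that the projected model region best separates object and background appearance. A generic energy of this type (notation assumed here; see [2] for the exact formulation) reads

```latex
E(\xi) = -\int_{\Omega} \Big( P(\xi, x)\,\log p_{\mathrm{in}}\!\big(I(x)\big)
       + \big(1 - P(\xi, x)\big)\,\log p_{\mathrm{out}}\!\big(I(x)\big) \Big)\, dx
```

where P(ξ, x) indicates whether pixel x lies inside the projection of the object under pose ξ, and p_in and p_out are the appearance models of object and background, respectively.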

Tracking of Multiple Objects

Since the appearance of the tracked object is not given and might change during the sequence, it must be estimated while tracking. This estimation is disturbed by partial occlusions. Although the initial algorithm [1] can handle partial occlusions to some extent, pose tracking is bound to fail once the occluded part of the object becomes too large.
One way to alleviate this problem is to track not only a single object but also the occluding objects, as done in our next contribution [4].
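The underlying idea can be sketched with a simple depth test: when every object in the scene is tracked, each pixel can be assigned to the object whose projected surface is closest to the camera, so occluded parts are excluded from the appearance estimate. The toy function below illustrates this depth reasoning; it is an assumed simplification, not the algorithm from [4].

```python
def visible_object(depth_maps, u, v):
    """Return the id of the object visible at pixel (u, v).

    depth_maps maps object id -> dict of pixel -> rendered depth
    (pixels where the object does not project are absent). The
    nearest surface wins; occluded objects are ignored here.
    """
    best_id, best_depth = None, float("inf")
    for obj_id, depth in depth_maps.items():
        d = depth.get((u, v))
        if d is not None and d < best_depth:
            best_id, best_depth = obj_id, d
    return best_id

# Toy example: object "cup" is in front of object "box" at pixel (10, 10)
depths = {
    "cup": {(10, 10): 2.0},
    "box": {(10, 10): 3.5, (20, 20): 3.5},
}
print(visible_object(depths, 10, 10))  # "cup" occludes "box" here
print(visible_object(depths, 20, 20))  # only "box" projects here
```

Pixels claimed by an occluder are thereby kept out of the occluded object's appearance model, which stabilises the estimation.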

Including Scene Knowledge

For some scenes, prior knowledge about the object to be tracked or about the scene is available and can be used to improve tracking results. If there is a known ground plane, for example, the tracked object obviously cannot pass through it. In [5], it was demonstrated that using this knowledge yields improved results. Similarly, other constraints can be incorporated: examples include a cyclist whose feet must stay on the pedals, and a snowboarder whose feet have to stay on the snowboard, as demonstrated in [6].
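A ground-plane constraint of this kind can be sketched as a simple projection step: if a pose estimate places the object below the plane, the smallest correction along the plane normal moves it back. This is a hypothetical illustration of the concept, not the constraint formulation from [5].

```python
import numpy as np

def enforce_ground_plane(t, n, d):
    """Project translation t so the object stays on or above the plane n.x = d.

    n must be a unit normal pointing away from the ground.
    If t already satisfies n.t >= d it is returned unchanged;
    otherwise it is moved onto the plane along the normal.
    """
    violation = d - n @ t
    if violation > 0:
        t = t + violation * n   # smallest correction back onto the plane
    return t

n = np.array([0.0, 1.0, 0.0])    # ground plane y = 0
t = np.array([1.0, -0.3, 2.0])   # pose estimate below the ground
print(enforce_ground_plane(t, n, 0.0))  # y-component clamped to 0
```

In an actual tracker such a constraint would be integrated into the pose estimation itself rather than applied as a post-hoc projection, but the geometric idea is the same.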

Dealing with Self-Occlusions

When tracking kinematic chains, occlusions can also arise because some parts of the chain occlude others. The purely silhouette-based approach explained so far cannot handle such self-occlusions appropriately. Thus, in [7] we extended the method to use multiple internal regions that cope with self-occlusions.
The journal article [9] combines the tracking of multiple objects from [4] with the tracking approach using internal regions.

Improved Handling of the Background Region

The approach described in [7] divides the object into multiple internal regions, so that different object regions can be distinguished. In [8], we extended this approach such that the background is also divided into several regions. This allows a better approximation of the background region, which can significantly improve the tracking results.
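As a rough sketch of the idea (with assumed details; the localised mixture models in [8] are more sophisticated): instead of one global background colour model, the background can be split into a grid of local regions, each with its own colour statistics, so that a pixel is compared against the appearance of its own neighbourhood.

```python
import numpy as np

def local_background_means(image, mask, cell=16):
    """Mean colour of the background inside each grid cell.

    image : H x W x 3 float array
    mask  : H x W bool array, True where the pixel belongs to the object
    cell  : side length of the square grid cells
    Returns a dict (cell_row, cell_col) -> mean RGB of background pixels.
    """
    h, w, _ = image.shape
    means = {}
    for r in range(0, h, cell):
        for c in range(0, w, cell):
            block = image[r:r + cell, c:c + cell]
            bg = block[~mask[r:r + cell, c:c + cell]]
            if bg.size:
                means[(r // cell, c // cell)] = bg.mean(axis=0)
    return means

# Toy image: left half dark background, right half bright background,
# and no object pixels, so each cell's mean reflects its local colour.
img = np.zeros((32, 32, 3))
img[:, 16:] = 1.0
obj = np.zeros((32, 32), dtype=bool)
means = local_background_means(img, obj, cell=16)
# means[(0, 0)] is dark (0, 0, 0); means[(0, 1)] is bright (1, 1, 1)
```

A single global mean would blur the two background colours together; the local models keep them apart, which is what makes the background approximation tighter.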

Publications

  1. B. Rosenhahn, T. Brox, J. Weickert:
    Three-dimensional shape knowledge for joint image segmentation and pose tracking.
    International Journal of Computer Vision, Vol. 73, No. 3, 243-262, July 2007.
    Revised version of Technical Report No. 163, Centre for Imaging Technology and Robotics, University of Auckland, New Zealand, 2005.

  2. C. Schmaltz, B. Rosenhahn, T. Brox, D. Cremers, J. Weickert, L. Wietzke, G. Sommer:
    Region-based pose tracking.
    In J. Martí, J. M. Benedí, A. M. Mendonça, J. Serrat (Eds.): Pattern Recognition and Image Analysis. Part II. Lecture Notes in Computer Science, Vol. 4478, 56-63, Springer, Berlin, 2007.

  3. C. Schmaltz, B. Rosenhahn, T. Brox, D. Cremers, J. Weickert, L. Wietzke, G. Sommer:
    Region-based pose tracking.
    Technical Report No. 196, Department of Mathematics, Saarland University, Saarbrücken, Germany, 2007.

  4. C. Schmaltz, B. Rosenhahn, T. Brox, J. Weickert, D. Cremers, L. Wietzke, G. Sommer:
    Occlusion modeling by tracking multiple objects.
    In F. A. Hamprecht, C. Schnörr, B. Jähne (Eds.): Pattern Recognition. Lecture Notes in Computer Science, Vol. 4713, 173-183, Springer, Berlin, 2007.

  5. B. Rosenhahn, C. Schmaltz, T. Brox, J. Weickert, H.-P. Seidel:
    Staying well grounded in markerless motion capture.
    In G. Rigoll (Ed.): Pattern Recognition. Lecture Notes in Computer Science, Vol. 5096, 385-395, Springer, Berlin, 2008.

  6. B. Rosenhahn, C. Schmaltz, T. Brox, J. Weickert, D. Cremers, H.-P. Seidel:
    Markerless motion capture of man-machine interaction.
    Proc. 2008 IEEE Conference on Computer Vision and Pattern Recognition (Anchorage, AK, June 2008). IEEE Computer Society Press, 2008.

  7. C. Schmaltz, B. Rosenhahn, T. Brox, J. Weickert, L. Wietzke, G. Sommer:
    Dealing with self-occlusion in region based motion capture by means of internal regions.
    In F. J. Perales, R. B. Fisher (Eds.): Articulated Motion and Deformable Objects. Lecture Notes in Computer Science, Vol. 5098, 102-111, Springer, Heidelberg, 2008.

  8. C. Schmaltz, B. Rosenhahn, T. Brox, J. Weickert:
    Localised mixture models in region-based tracking.
    In J. Denzler, G. Notni, H. Süße (Eds.): Pattern Recognition. Lecture Notes in Computer Science, Vol. 5748, 21-30, Springer, Berlin, 2009.

  9. C. Schmaltz, B. Rosenhahn, T. Brox, J. Weickert:
    Region based pose tracking with occlusions using 3D models.
    Machine Vision and Applications, Vol. 23, No. 3, 557-577, May 2012.
    Revised version of Technical Report No. 277, Department of Mathematics, Saarland University, Saarbrücken, Germany, October 2010.

Demo

For a demonstration of tracking results, click one of the following images:


MIA Group
©2001-2023