Thursday, October 25, 2012

Stereo Vision: An Introduction


I started working on stereoscopic vision last year. It is an exciting topic and the field is still in its infancy, so there is a lot of scope for good-quality research in this area. In this post I'm going to give you an introduction to what stereo vision means.
Stereo vision is the technique of recovering 3D information about a scene from two different 2D views of it. Our eyes work on the very same principle: each eye captures a 2D image of the scene, and the brain combines the two images to build a 3D model of it.
If you are not familiar with the concept, you might be wondering why we need two images. The problem is that when we take an image of a scene, we lose the scene's depth information. Here's why:
Suppose we take an image of the point P using a camera whose optical centre is O.

The image of the point forms at p. Now suppose there is another point Q, lying on the same line from O through P.

We can see that its image will be projected at the same point as that of P. There is no way to tell the two points apart using a single image. In fact, all the points lying on the line OPQ are projected to the same point. So a single image tells us nothing about the distance of a point from the camera.
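To make this concrete, here is a minimal sketch of an ideal pinhole camera. It assumes the camera sits at the origin O looking along the Z axis, with the image plane at distance f; the function name `project` and the sample coordinates are made up for illustration.

```python
def project(point, f=1.0):
    """Project a 3D point (X, Y, Z), given in camera coordinates,
    onto the image plane of an ideal pinhole camera at the origin."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)

# A sample point P, plus two more points on the same line from O through P
# (scaling all three coordinates moves a point along that line).
P = (1.0, 0.5, 2.0)
ray_points = [tuple(s * c for c in P) for s in (1.0, 2.0, 4.0)]

images = [project(pt) for pt in ray_points]
for pt, img in zip(ray_points, images):
    print(pt, "->", img)
```

Every point on the line lands on the same image coordinates, so the image alone cannot say which of them the camera was looking at.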
Now suppose we have another camera which is at a slightly shifted position.


You can see that the points P and Q are projected to different points in this image: their images form at p' and q' in the second camera. In fact, no two distinct points can be projected to the same point in both images, unless they lie on the line joining the two cameras.
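Here is a small sketch of the same idea in code. It assumes two ideal pinhole cameras that both look along the Z axis, with the second one shifted sideways by a distance b; the function `project`, the parameter `cam_x` and all the numbers are illustrative choices, not taken from the figures.

```python
def project(point, cam_x=0.0, f=1.0):
    """Project a 3D point onto the image plane of a pinhole camera
    located at (cam_x, 0, 0) and looking along the Z axis."""
    X, Y, Z = point
    return (f * (X - cam_x) / Z, f * Y / Z)

P = (1.0, 0.5, 2.0)
Q = (2.0, 1.0, 4.0)   # on the same line from the first camera through P
b = 0.2               # sideways shift of the second camera (assumed value)

p, q = project(P), project(Q)                          # first camera
p_dash, q_dash = project(P, cam_x=b), project(Q, cam_x=b)  # second camera

print("first camera: ", p, q)
print("second camera:", p_dash, q_dash)
```

In the first camera the two projections coincide, while in the shifted camera they come out at different horizontal positions, which is exactly the extra information a second view gives us.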
Now suppose we want to find the 3D coordinates of the point P. We take two images of P using two cameras as shown below. We know the focal length f of the cameras, the baseline b (the distance between the two cameras), and the horizontal positions x_R and x_T of the image of P in the two cameras. So we can calculate the distance Z by applying similarity of triangles to P O_R O_T and P p p', where O_R and O_T are the optical centres of the two cameras.
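As a rough sketch of where the similar triangles lead: if we assume the two cameras are parallel, have the same focal length, and are separated only horizontally, the triangles give b / Z = (b - (x_R - x_T)) / (Z - f), which rearranges to Z = b * f / (x_R - x_T). The difference x_R - x_T is usually called the disparity. The function names and the numbers below are my own illustrative choices, and the sign convention assumes the second camera is shifted in the positive x direction.

```python
def depth_from_disparity(x_r, x_t, f, b):
    """Depth Z of a point from its horizontal image positions x_r and x_t
    in two parallel cameras with focal length f and baseline b.
    x_r, x_t and f must be in the same units (e.g. pixels);
    Z comes out in the units of b."""
    disparity = x_r - x_t
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return b * f / disparity

def triangulate(x_r, y_r, x_t, f, b):
    """Full 3D coordinates of the point in the first camera's frame."""
    Z = depth_from_disparity(x_r, x_t, f, b)
    X = x_r * Z / f
    Y = y_r * Z / f
    return (X, Y, Z)

# Assumed example values: f = 700 pixels, b = 0.1 metres.
f, b = 700.0, 0.1
x_r, y_r, x_t = 120.0, 40.0, 85.0   # disparity of 35 pixels

X, Y, Z = triangulate(x_r, y_r, x_t, f, b)
print("X, Y, Z =", X, Y, Z)
```

With these numbers the point comes out about 2 metres from the cameras. Notice that Z is inversely proportional to the disparity: nearby points shift a lot between the two images, distant points shift very little.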
