Linear depth value interpolation

From CGAFaq

Why can the depth value be interpolated linearly in screen space?

Depth functions

Let the center of projection be located at the origin, with the axes X, Y and Z pointing right, up and inwards, respectively (I'll assume a left-handed coordinate system). Let the near plane be at distance n and far plane at distance f from the origin.

Consider tracing a camera ray from the center of projection through the view plane and suppose it intersects multiple objects. For opaque objects, the visible one is the one with the nearest intersection point. Thus, for hidden surface removal, it sounds natural to use the distance from the center of projection to the intersection point (this is the 2-norm):

\[ f(\mathbf{x}) = \|\mathbf{x}\|_2 = \sqrt{x_1^2 + x_2^2 + x_3^2} \]

The problem with this metric is that it is inefficient to compute. Its value can't be interpolated easily in screen space, and computing it requires all three coordinates of the 3D point in view space. The square root is also costly in time.

For a better alternative, note that for hidden surface removal the precise values of f(x) are not important. What is important is the order that the function f gives to points. We will now make an initial definition of a depth function. A depth function g is a function that takes a vector as input and outputs a real number, and that orders vectors the same way f does: if f(x) < f(y), then g(x) < g(y). The domain of g need not include all vectors.

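In particular, restricted to the points of a single camera ray, any p-norm is a depth function, because every norm grows in proportion to the distance along a ray through the origin. But it is easy to see that all p-norms for $p > 1$ have the same issues as the 2-norm. The 1-norm (Manhattan norm) avoids the powers and the root and thus sounds like the best selection from the family of p-norms.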
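As a small illustration of the ordering argument, the following sketch sorts a few points on one camera ray by their 1-norm and by their 2-norm and compares the results. The ray direction, the parameter values and the helper names (`p_norm`, `order_by`) are assumptions made up for this example.

```python
def p_norm(v, p):
    """Return the p-norm of a 3D vector v, for p >= 1."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

# Hypothetical camera ray through the origin, and unsorted ray parameters.
direction = (0.3, -0.2, 1.0)
ts = [2.5, 0.7, 4.0, 1.2]
points = [tuple(t * c for c in direction) for t in ts]

def order_by(p):
    """Indices of the points, sorted by increasing p-norm."""
    return sorted(range(len(points)), key=lambda i: p_norm(points[i], p))

order_2 = order_by(2)  # ordering given by the Euclidean distance f
order_1 = order_by(1)  # ordering given by the Manhattan norm

print(order_1 == order_2)
```

Both orderings agree here because the points lie on the same ray through the origin. For arbitrary pairs of points not on a common ray, the two norms may order them differently.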

But we can do better, because a norm is actually too restrictive a choice of depth function. Let's be more specific about our requirements. We want a depth function that:

  • produces the right ordering of points on a line segment inside the view volume
  • has constant partial derivatives with respect to x and y in screen space, so that it can be linearly interpolated
  • can be computed from view coordinates along with the projective transform

Deriving a convenient depth function

Let a camera ray R be parametrized by:

\[ R(t) = t \, (x, y, n)^T, \quad t \ge 0 \]

where

x is the x screen coordinate
y is the y screen coordinate
n is the near plane distance

Let the plane X of a polygon be given by:

\[ X = \{\, Q : N \cdot (Q - P) = 0 \,\} \]

where

N is the normal of the plane
P is a point on the plane

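By intersecting the ray R with the plane X we get:

\[ t = \frac{N \cdot P}{N_x x + N_y y + N_z n}, \qquad S(x, y) = \frac{N \cdot P}{N_x x + N_y y + N_z n} \, (x, y, n)^T \]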

In perspective projection, z is the coordinate we divide by. With that hint, let's look at just the z coordinate of the intersection point:

\[ S_z(x, y) = \frac{n \, (N \cdot P)}{N_x x + N_y y + N_z n} \]
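The intersection step can be sketched in a few lines of Python. This is only an illustration: it assumes the ray has the form t * (x, y, n) and the plane consists of the points Q with N . (Q - P) = 0, and the function names and sample values are made up for the example.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))

def intersect_ray_plane(x, y, n, N, P):
    """Intersect the camera ray t * (x, y, n) with the plane N . (Q - P) = 0.

    Returns the intersection point, or None if the ray is parallel to the plane.
    """
    d = (x, y, n)
    denom = dot(N, d)
    if denom == 0.0:
        return None
    t = dot(N, P) / denom
    return tuple(t * c for c in d)

def s_z(x, y, n, N, P):
    """Closed form for the z coordinate: n (N . P) / (N_x x + N_y y + N_z n)."""
    return n * dot(N, P) / (N[0] * x + N[1] * y + N[2] * n)

# Hypothetical tilted plane x + z = 4, seen through screen point (0.5, 0.0).
N = (1.0, 0.0, 1.0)
P = (0.0, 0.0, 4.0)
S = intersect_ray_plane(0.5, 0.0, 1.0, N, P)
print(S, s_z(0.5, 0.0, 1.0, N, P))
```

The z component of the returned point and the closed-form value agree up to rounding, which is a quick check on the algebra.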

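This shows that $S_z(x, y)$ is a constant divided by an expression that is linear in x and y. So by inverting both sides, we get something on the right side that is linear with respect to x and y:

\[ \frac{1}{S_z(x, y)} = \frac{N_x x + N_y y + N_z n}{n \, (N \cdot P)} \]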

Now the right side has constant partial derivatives with respect to x and y. Could it be that the left side is already an adequate depth function? It is not: if $S_z$ increased, $1/S_z$ would decrease, so the ordering would be reversed. However, by negating both sides we get a left-hand side that behaves correctly:

\[ -\frac{1}{S_z(x, y)} = -\frac{N_x x + N_y y + N_z n}{n \, (N \cdot P)} \]
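A quick numerical check makes the point concrete. The sketch below evaluates the z coordinate of the ray-plane intersection at two screen points and at their midpoint, then compares linear interpolation of the z coordinate itself with linear interpolation of its negated reciprocal. The plane, the screen points and the function name `s_z` are hypothetical values chosen for the example.

```python
def s_z(x, y, n, N, P):
    """z coordinate where the ray t * (x, y, n) meets the plane N . (Q - P) = 0."""
    n_dot_p = N[0] * P[0] + N[1] * P[1] + N[2] * P[2]
    return n * n_dot_p / (N[0] * x + N[1] * y + N[2] * n)

n = 1.0
N = (1.0, 0.5, 1.0)   # hypothetical plane normal
P = (0.0, 0.0, 4.0)   # hypothetical point on the plane

a = (-0.5, 0.2)       # two screen-space points and their midpoint
b = (0.7, -0.4)
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)

za, zb, zm = (s_z(px, py, n, N, P) for px, py in (a, b, mid))

# Interpolating -1/S_z linearly reproduces the true value at the midpoint.
inv_interp = 0.5 * (-1.0 / za) + 0.5 * (-1.0 / zb)
# Interpolating S_z itself linearly does not.
z_interp = 0.5 * za + 0.5 * zb

print(inv_interp, -1.0 / zm)
print(z_interp, zm)
```

With these values the interpolated negated reciprocal matches the exact one up to rounding, while the interpolated z differs from the exact z by a large margin.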

The left-hand side seems to be what we are looking for. Indeed, we could use this as a depth function, but further analysis reveals a better metric. If we multiply the metric by a positive A and add a B to it, we still have a depth function. Thus there is a family of depth functions m that have all of our desired properties:

\[ m(x, y) = -\frac{A}{S_z(x, y)} + B \]

where

\[ A > 0, \quad B \in \mathbb{R} \]

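We will use these two additional degrees of freedom to choose a metric for which:

\[ m = 0 \text{, when } S_z = n \]
\[ m = 1 \text{, when } S_z = f \]

This produces a pair of equations:

\[ -\frac{A}{n} + B = 0, \qquad -\frac{A}{f} + B = 1 \]

which yield:

\[ A = \frac{nf}{f - n}, \qquad B = \frac{f}{f - n} \]

Our final metric z is:

\[ z(x, y) = \frac{f}{f - n} - \frac{nf}{(f - n) \, S_z(x, y)} = \frac{f}{f - n} \left( 1 - \frac{n}{S_z(x, y)} \right) \]

which has all of our desired properties as well as an additional property that:

  • all points inside the view volume are mapped to a depth value in the range [0, 1]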
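
The choice of A and B can be checked numerically. The sketch below assumes the normalization in which the near plane maps to 0 and the far plane maps to 1, so that A = nf / (f - n) and B = f / (f - n). That normalization, the function names and the sample distances are assumptions of this example.

```python
def depth_coefficients(n, f):
    """Return (A, B) so that m = -A / S_z + B gives m = 0 at S_z = n and m = 1 at S_z = f.

    Assumes the near plane maps to 0 and the far plane maps to 1.
    """
    if not 0.0 < n < f:
        raise ValueError("expected 0 < n < f")
    A = n * f / (f - n)
    B = f / (f - n)
    return A, B

def depth(s_z, n, f):
    """Depth value for a view-space z coordinate s_z between n and f."""
    A, B = depth_coefficients(n, f)
    return -A / s_z + B

n, f = 1.0, 100.0                       # hypothetical near and far distances
samples = [1.0, 2.0, 10.0, 50.0, 100.0]  # increasing view-space z values
depths = [depth(s, n, f) for s in samples]
print(depths)
```

The printed values start at 0 for the near plane, end at 1 for the far plane (up to rounding), and increase with the view-space z coordinate, so the ordering of points along a ray is preserved.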