3.3 Projection and Viewing

In the previous section, we looked at the modeling transformation, which transforms from object coordinates to world coordinates. However, for 3D computer graphics, you need to know about several other coordinate systems and the transforms between them. We discuss them in this section.

We start with an overview of the various coordinate systems. Some of this is review, and some of it is new.

3.3.1 Many Coordinate Systems

The coordinates that you actually use for drawing an object are called object coordinates. The object coordinate system is chosen to be convenient for the object that is being drawn. A modeling transformation can then be applied to set the size, orientation, and position of the object in the overall scene (or, in the case of hierarchical modeling, in the object coordinate system of a larger, more complex object). The modeling transformation is the first that is applied to the vertices of an object.

The coordinates in which you build the complete scene are called world coordinates. These are the coordinates for the overall scene, the imaginary 3D world that you are creating. The modeling transformation maps from object coordinates to world coordinates.

In the real world, what you see depends on where you are standing and the direction in which you are looking. That is, you can't make a picture of the scene until you know the position of the "viewer" and where the viewer is looking—and, if you think about it, how the viewer's head is tilted. For the purposes of OpenGL, we imagine that the viewer is attached to their own individual coordinate system, which is known as eye coordinates. In this coordinate system, the viewer is at the origin, (0,0,0), looking in the direction of the negative z-axis; the positive direction of the y-axis is pointing straight up; and the x-axis is pointing to the right. This is a viewer-centric coordinate system. In other words, eye coordinates are (almost) the coordinates that you actually want to use for drawing on the screen. The transform from world coordinates to eye coordinates is called the viewing transformation.

If this is confusing, think of it this way: We are free to use any coordinate system that we want on the world. Eye coordinates are the natural coordinate system for making a picture of the world as seen by a viewer. If we used a different coordinate system (world coordinates) when building the world, then we have to transform those coordinates to eye coordinates to find out what the viewer actually sees. That transformation is the viewing transform.

Note, by the way, that OpenGL doesn't keep track of separate modeling and viewing transforms. They are combined into a single transform, which is known as the modelview transformation. In fact, even though world coordinates might seem to be the most important and natural coordinate system, OpenGL doesn't have any representation for them and doesn't use them internally. For OpenGL, only object and eye coordinates have meaning. OpenGL goes directly from object coordinates to eye coordinates by applying the modelview transformation.

We are not done. The viewer can't see the entire 3D world, only the part that fits into the viewport, which is the rectangular region of the screen or other display device where the image will be drawn. We say that the scene is "clipped" by the edges of the viewport. Furthermore, in OpenGL, the viewer can see only a limited range of z-values in the eye coordinate system. Points with larger or smaller z-values are clipped away and are not rendered into the image. (This is not, of course, the way that viewing works in the real world, but it's required by the use of the depth test in OpenGL. See Subsection 3.1.4.) The volume of space that is actually rendered into the image is called the view volume. Things inside the view volume make it into the image; things that are not in the view volume are clipped and cannot be seen. For purposes of drawing, OpenGL applies a coordinate transform that maps the view volume onto a cube. The cube is centered at the origin and extends from -1 to 1 in the x-direction, in the y-direction, and in the z-direction. The coordinate system on this cube is referred to as clip coordinates. The transformation from eye coordinates to clip coordinates is called the projection transformation. At this point, we haven't quite projected the 3D scene onto a 2D surface, but we can now do so simply by discarding the z-coordinate. (The z-coordinate, however, is still needed to provide the depth information that is needed for the depth test.)

Note that clip coordinates are the coordinates that will be used if you apply no transformation at all, that is, if both the modelview and the projection transforms are the identity. It is a left-handed coordinate system, with the positive direction of the z-axis pointing into the screen.

We still aren't done. In the end, when things are actually drawn, there are device coordinates, the 2D coordinate system in which the actual drawing takes place on a physical display device such as the computer screen. Ordinarily, in device coordinates, the pixel is the unit of measure. The drawing region is a rectangle of pixels. This is the rectangle that is called the viewport. The viewport transformation takes x and y from the clip coordinates and scales them to fit the viewport.

Let's go through the sequence of transformations one more time. Think of a primitive, such as a line or triangle, that is part of the scene and that might appear in the image that we want to make of the scene. The primitive goes through the following sequence of operations:

  1. The points that define the primitive are specified in object coordinates, using methods such as glVertex3f.
  2. The points are first subjected to the modelview transformation, which is a combination of the modeling transform that places the primitive into the world and the viewing transform that maps the primitive into eye coordinates.
  3. The projection transformation is then applied to map the view volume that is visible to the viewer onto the clip coordinate cube. If the transformed primitive lies outside that cube, it will not be part of the image, and the processing stops. If part of the primitive lies inside and part outside, the part that lies outside is clipped away and discarded, and only the part that remains is processed further.
  4. Finally, the viewport transform is applied to produce the device coordinates that will actually be used to draw the primitive on the display device. After that, it's just a matter of deciding how to color the individual pixels that are part of the primitive.

We need to consider these transforms in more detail and see how to use them in OpenGL 1.1.

3.3.2 The Viewport Transformation

The simplest of the transforms is the viewport transform. It transforms x and y clip coordinates to the coordinates that are used on the display device. To specify the viewport transform, it is only necessary to specify the rectangle on the device where the scene will be rendered. This is done using the glViewport function.

OpenGL must be provided with a drawing surface by the environment in which it is running, such as JOGL for Java or the GLUT library for C. That drawing surface is a rectangular grid of pixels, with a horizontal size and a vertical size. OpenGL uses a coordinate system on the drawing surface that puts (0,0) at the lower left, with y increasing from bottom to top and x increasing from left to right. When the drawing surface is first given to OpenGL, the viewport is set to be the entire drawing surface. However, it is possible for OpenGL to draw to a different rectangle by calling

glViewport( x, y, width, height );

where (x,y) is the lower left corner of the viewport, in the drawing surface coordinate system, and width and height are the size of the viewport. Clip coordinates from -1 to 1 will then be mapped to the specified viewport. Note that this means in particular that drawing is limited to the viewport. It is not an error for the viewport to extend outside of the drawing surface, though it would be unusual to set up that situation deliberately.
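In effect, for a viewport set with glViewport(x,y,width,height), the viewport transformation maps clip coordinates to device coordinates as in this sketch (the variable names xClip, yClip, xDevice, and yDevice are just for illustration):

/* Map clip coordinates (xClip,yClip), each in the range -1 to 1,
   to device coordinates in a width-by-height viewport whose
   lower left corner is at (x,y). */
xDevice = x + (xClip + 1) * width / 2.0;
yDevice = y + (yClip + 1) * height / 2.0;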

When the size of the drawing surface changes, such as when the user resizes a window that contains the drawing surface, OpenGL does not automatically change the viewport to match the new size. However, the environment in which OpenGL is running might do that for you. (See Section 3.6 for information about how this is handled by JOGL and GLUT.)
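With GLUT in C, for example, the size can be tracked with a reshape callback. Here is a minimal sketch; GLUT passes the new size of the drawing surface, in pixels, to the callback:

void reshape( int width, int height ) {
    glViewport( 0, 0, width, height );  // use the entire drawing surface
}

The callback would be registered during initialization by calling glutReshapeFunc(reshape). (When no reshape callback is registered at all, GLUT's default behavior is to call glViewport in exactly this way.)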

glViewport is often used to draw several different scenes, or several views of the same scene, on the same drawing surface. Suppose, for example, that we want to draw two scenes, side-by-side, and that the drawing surface is 600-by-400 pixels. An outline for how to do that is very simple:

glViewport(0,0,300,400);  // Draw to left half of the drawing surface.
    .
    .   // Draw the first scene.
    .

glViewport(300,0,300,400);  // Draw to right half of the drawing surface.
    .
    .   // Draw the second scene.
    .

The first glViewport command establishes a 300-by-400 pixel viewport with its lower left corner at (0,0). That is, the lower left corner of the viewport is at the lower left corner of the drawing surface. This viewport fills the left half of the drawing surface. Similarly, the second viewport, with its lower left corner at (300,0), fills the right half of the drawing surface.

3.3.3 The Projection Transformation

We turn next to the projection transformation. Like any transform, the projection is represented in OpenGL as a matrix. OpenGL keeps track of the projection matrix separately from the matrix that represents the modelview transformation. The same transform functions, such as glRotatef, can be applied to both matrices, so OpenGL needs some way to know which matrix those functions apply to. This is determined by an OpenGL state property called the matrix mode. The value of the matrix mode is a constant such as GL_PROJECTION or GL_MODELVIEW. When a function such as glRotatef is called, it modifies a matrix; which matrix is modified depends on the current value of the matrix mode. The value is set by calling the function glMatrixMode. The initial value is GL_MODELVIEW. This means that if you want to work on the projection matrix, you must first call

glMatrixMode(GL_PROJECTION);

If you want to go back to working on the modelview matrix, you must call

glMatrixMode(GL_MODELVIEW);

In my programs, I generally set the matrix mode to GL_PROJECTION, set up the projection transformation, and then immediately set the matrix mode back to GL_MODELVIEW. This means that anywhere else in the program, I can be sure that the matrix mode is GL_MODELVIEW.


To help you to understand projection, remember that a 3D image can show only a part of the infinite 3D world. The view volume is the part of the world that is visible in the image. The view volume is determined by a combination of the viewing transformation and the projection transformation. The viewing transform determines where the viewer is located and what direction the viewer is facing, but it doesn't say how much of the world the viewer can see. The projection transform does that: It specifies the shape and extent of the region that is in view. Think of the viewer as a camera, with a big invisible box attached to the front of the camera that encloses the part of the world that that camera has in view. The inside of the box is the view volume. As the camera moves around in the world, the box moves with it, and the view volume changes. But the shape and size of the box don't change. The shape and size of the box correspond to the projection transform. The position and orientation of the camera correspond to the viewing transform.

This is all just another way of saying that, mathematically, the OpenGL projection transformation transforms eye coordinates to clip coordinates, mapping the view volume onto the 2-by-2-by-2 clipping cube that contains everything that will be visible in the image. To specify a projection just means specifying the size and shape of the view volume, relative to the viewer.

There are two general types of projection, perspective projection and orthographic projection. Perspective projection is more physically realistic. That is, it shows what you would see if the OpenGL display rectangle on your computer screen were a window into an actual 3D world (one that could extend in front of the screen as well as behind it). It shows a view that you could get by taking a picture of a 3D world with an ordinary camera. In a perspective view, the apparent size of an object depends on how far it is away from the viewer. Only things that are in front of the viewer can be seen. In fact, ignoring clipping in the z-direction for the moment, the part of the world that is in view is an infinite pyramid, with the viewer at the apex of the pyramid, and with the sides of the pyramid passing through the sides of the viewport rectangle.

However, OpenGL can't actually show everything in this pyramid, because of its use of the depth test to solve the hidden surface problem. Since the depth buffer can only store a finite range of depth values, it can't represent the entire range of depth values for the infinite pyramid that is theoretically in view. Only objects in a certain range of distances from the viewer can be part of the image. That range of distances is specified by two values, near and far. For a perspective transformation, both of these values must be positive numbers, and far must be greater than near. Anything that is closer to the viewer than the near distance or farther away than the far distance is discarded and does not appear in the rendered image. The volume of space that is represented in the image is thus a "truncated pyramid." This pyramid is the view volume for a perspective projection:

[Illustration: the truncated pyramid, bounded by the near and far clipping planes, that forms the view volume for a perspective projection.]

The view volume is bounded by six planes—the four sides plus the top and bottom of the truncated pyramid. These planes are called clipping planes because anything that lies on the wrong side of each plane is clipped away. The projection transformation maps the six sides of the truncated pyramid in eye coordinates to the six sides of the clipping cube in clip coordinates.

In OpenGL, setting up the projection transformation is equivalent to defining the view volume. For a perspective transformation, you have to set up a view volume that is a truncated pyramid. A rather obscure term for this shape is a frustum. A perspective transformation can be set up with the glFrustum command:

glFrustum( xmin, xmax, ymin, ymax, near, far );

The last two parameters specify the near and far distances from the viewer, as already discussed. The viewer is assumed to be at the origin, (0,0,0), facing in the direction of the negative z-axis. (This is the eye coordinate system.) So, the near clipping plane is at z = −near, and the far clipping plane is at z = −far. (Notice the minus signs!) The first four parameters specify the sides of the pyramid: xmin, xmax, ymin, and ymax specify the horizontal and vertical limits of the view volume at the near clipping plane. For example, the coordinates of the upper-left corner of the small end of the pyramid are (xmin, ymax, -near). The x and y limits at the far clipping plane are larger, usually much larger, than the limits specified in the glFrustum command.
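By similar triangles, the cross-section of the pyramid grows in proportion to the distance from the viewer, so the limits at the far clipping plane are the near-plane limits multiplied by far/near. A quick numeric check, with made-up values:

glFrustum( -2, 2, -1.5, 1.5, 5, 20 );
    // Near rectangle: -2 <= x <= 2 and -1.5 <= y <= 1.5, at z = -5.
    // Since far/near = 20/5 = 4, the far rectangle is four times
    // as large: -8 <= x <= 8 and -6 <= y <= 6, at z = -20.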

Note that the x and y limits in glFrustum are usually symmetrical about zero. That is, xmin is usually equal to the negative of xmax, and ymin is usually equal to the negative of ymax. However, this is not required. It is possible to have asymmetrical view volumes where the z-axis does not point directly down the center of the view.

Since the matrix mode must be set to GL_PROJECTION to work on the projection transformation, glFrustum is often used in a code segment of the form

glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glFrustum( xmin, xmax, ymin, ymax, near, far );
glMatrixMode(GL_MODELVIEW);

The call to glLoadIdentity ensures that the starting point is the identity transform. This is important since glFrustum modifies the existing projection matrix rather than replacing it, and although it is theoretically possible, you don't even want to try to think about what would happen if you combined several projection transformations into one.


Compared to perspective projections, orthographic projections are easier to understand: In an orthographic projection, the 3D world is projected onto a 2D image by discarding the z-coordinate of the eye-coordinate system. This type of projection is unrealistic in that it is not what a viewer would see. For example, the apparent size of an object does not depend on its distance from the viewer. Objects in back of the viewer as well as in front of the viewer can be visible in the image. Orthographic projections are still useful, however, especially in interactive modeling programs where it is useful to see true sizes and angles, undistorted by perspective.

In fact, it's not really clear what it means to say that there is a viewer in the case of orthographic projection. Nevertheless, for orthographic projection in OpenGL, there is considered to be a viewer. The viewer is located at the eye-coordinate origin, facing in the direction of the negative z-axis. Theoretically, a rectangular corridor extending infinitely in both directions, in front of the viewer and in back, would be in view. However, as with perspective projection, only a finite segment of this infinite corridor can actually be shown in an OpenGL image. This finite view volume is a parallelepiped—a rectangular solid—that is cut out of the infinite corridor by a near clipping plane and a far clipping plane. The value of far must be greater than near, but for an orthographic projection, the value of near is allowed to be negative, putting the "near" clipping plane behind the viewer, as shown in the lower section of this illustration:

[Illustration: the rectangular view volume for an orthographic projection; in the lower part of the figure, a negative value of near puts the near clipping plane behind the viewer.]

Note that a negative value for near puts the near clipping plane on the positive z-axis, which is behind the viewer.

An orthographic projection can be set up in OpenGL using the glOrtho method, which has the following form:

glOrtho( xmin, xmax, ymin, ymax, near, far );

The first four parameters specify the x- and y-coordinates of the left, right, bottom, and top of the view volume. Note that the last two parameters are near and far, not zmin and zmax. In fact, the minimum z-value for the view volume is −far and the maximum z-value is −near. However, it is often the case that near = −far, and if that is true then the minimum and maximum z-values turn out to be near and far after all!
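For example, with hypothetical values where near is positive, so that the whole box lies in front of the viewer:

glOrtho( -10, 10, -10, 10, 2, 12 );
    // The view volume contains eye-coordinate z-values between
    // z = -far = -12 and z = -near = -2, a box that lies
    // entirely in front of the viewer.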

As with glFrustum, glOrtho should be called when the matrix mode is GL_PROJECTION. As an example, suppose that we want the view volume to be the box centered at the origin containing x, y, and z values in the range from -10 to 10. This can be accomplished with

glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho( -10, 10, -10, 10, -10, 10 );
glMatrixMode(GL_MODELVIEW);

Now, as it turns out, the effect of glOrtho in this simple case is exactly the same as the effect of glScalef(0.1, 0.1, -0.1), since the projection just scales the box down by a factor of 10. But it's usually better to think of projection as a different sort of thing from scaling. (The minus sign on the z scaling factor is there because projection reverses the direction of the z-axis, transforming the conventionally right-handed eye coordinate system into OpenGL's left-handed default coordinate system.)
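To see why, write out the mapping that glOrtho implements (as given in the OpenGL specification) and substitute the values from the example; the translation terms vanish because the limits are symmetric about zero:

x' = 2x/(xmax-xmin) - (xmax+xmin)/(xmax-xmin)  =  2x/20 - 0  =  0.1*x
y' = 2y/(ymax-ymin) - (ymax+ymin)/(ymax-ymin)  =  2y/20 - 0  =  0.1*y
z' = -2z/(far-near) - (far+near)/(far-near)    = -2z/20 - 0  = -0.1*z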


The glFrustum method is not particularly easy to use. There is a library known as GLU that contains some utility functions for use with OpenGL. The GLU library includes the method gluPerspective as an easier way to set up a perspective projection. The command

gluPerspective( fieldOfViewAngle, aspect, near, far );

can be used instead of glFrustum. The fieldOfViewAngle is the vertical angle, measured in degrees, between the upper side of the view volume pyramid and the lower side. Typical values are in the range 30 to 60 degrees. The aspect parameter is the aspect ratio of the view, that is, the width of a cross-section of the pyramid divided by its height. The value of aspect should generally be set to the aspect ratio of the viewport. The near and far parameters in gluPerspective have the same meaning as for glFrustum.
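In practice, setting up the projection with gluPerspective often looks something like the following sketch, where width and height are assumed to be variables holding the current size of the viewport, in pixels:

glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluPerspective( 45, (double)width/height, 1, 100 );  // 45-degree field of view
glMatrixMode(GL_MODELVIEW);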

3.3.4 The Modelview Transformation

"Modeling" and "viewing" might seem like very different things, conceptually, but OpenGL combines them into a single transformation. This is because there is no way to distinguish between them in principle; the difference is purely conceptual. That is, a given transformation can be considered to be either a modeling transformation or a viewing transformation, depending on how you think about it. (One significant difference, conceptually, is that the same viewing transformation usually applies to every object in the 3D scene, while each object can have its own modeling transformation. But this is not a difference in principle.) We have already seen something similar in 2D graphics (Subsection 2.3.1), but let's think about how it works in 3D.

For example, suppose that there is a model of a house at the origin, facing towards the direction of the positive z-axis. Suppose the viewer is on the positive z-axis, looking back towards the origin. The viewer is looking directly at the front of the house. Now, you might apply a modeling transformation to the house, to rotate it by 90 degrees about the y-axis. After this transformation, the house is facing in the positive direction of the x-axis, and the viewer is looking directly at the left side of the house. On the other hand, you might rotate the viewer by minus 90 degrees about the y-axis. This would put the viewer on the negative x-axis, which would give it a view of the left side of the house. The net result after either transformation is that the viewer ends up with exactly the same view of the house. Either transformation can be implemented in OpenGL with the command

glRotatef(90,0,1,0);

That is, this command represents either a modeling transformation that rotates an object by 90 degrees or a viewing transformation that rotates the viewer by -90 degrees about the y-axis. Note that the effect on the viewer is the inverse of the effect on the object. Modeling and viewing transforms are always related in this way. For example, if you are looking at an object, you can move yourself 5 feet to the left (viewing transform), or you can move the object 5 feet to the right (modeling transform). In either case, you end up with the same view of the object. Both transformations would be represented in OpenGL as

glTranslatef(5,0,0);

This even works for scaling: If the viewer shrinks, it will look to the viewer exactly the same as if the world is expanding, and vice-versa.


Although modeling and viewing transformations are the same in principle, they remain very different conceptually, and they are typically applied at different points in the code. In general when drawing a scene, you will do the following: (1) Load the identity matrix, for a well-defined starting point; (2) apply the viewing transformation; and (3) draw the objects in the scene, each with its own modeling transformation. Remember that OpenGL keeps track of several transformations, and that this must all be done while the modelview transform is current; if you are not sure of that then before step (1), you should call glMatrixMode(GL_MODELVIEW). During step (3), you will probably use glPushMatrix() and glPopMatrix() to limit each modeling transform to a particular object.

After loading the identity matrix, the viewer is in the default position, at the origin, looking down the negative z-axis, with the positive y-axis pointing upwards in the view. Suppose, for example, that we would like to move the viewer from its default location at the origin back along the positive z-axis to the point (0,0,20). This operation has exactly the same effect as moving the world, and the objects that it contains, 20 units in the negative direction along the z-axis. Whichever operation is performed, the viewer ends up in exactly the same position relative to the objects. Both operations are implemented by the same OpenGL command, glTranslatef(0,0,-20). For another example, suppose that we use two commands

glRotatef(90,0,1,0);
glTranslatef(10,0,0);

to establish the viewing transformation. As a modeling transform, these commands would first translate an object 10 units in the positive x-direction, then rotate the object 90 degrees about the y-axis. This would move an object originally at (0,0,0) to (0,0,-10), placing the object 10 units directly in front of the viewer. (Remember that modeling transformations are applied to objects in the order opposite to their order in the code.) What do these commands do as a viewing transformation? The effect on the view is the inverse of the effect on objects. The inverse of "translate 10 then rotate 90" is "rotate -90 then translate -10." That is, to do the inverse, you have to undo the rotation before you undo the translation. The effect as a viewing transformation is first to rotate the view by -90 degrees about the y-axis (which would leave the viewer at the origin, but now looking along the positive x-axis), then to translate the viewer by -10 along the x-axis (backing up the viewer to the point (-10,0,0)). An object at the point (0,0,0) would thus be 10 units directly in front of the viewer. (You should think about how the two interpretations affect the view of a house that starts out at (0,0,0). The transformation affects which side of the house the viewer is looking at, as well as how far away from the house the viewer is located.)

Note, by the way, that the order in which viewing transformations are applied is the same as the order in which they occur in the code.

Here is a demo that illustrates the equivalence between modeling and viewing. The translucent gray box in the lower images represents the view volume that is used to create the image that is shown in the upper left. In this case, the projection is a perspective projection, and the view volume is a frustum. Read the help text in the demo for more information.

It can be difficult to set up a view by combining rotations, scalings, and translations, so OpenGL provides an easier way to set up a typical view. The command is not part of OpenGL itself but is part of the GLU library.

The GLU library provides the following convenient method for setting up a viewing transformation:

gluLookAt( eyeX,eyeY,eyeZ, refX,refY,refZ, upX,upY,upZ );

This method places the viewer at the point (eyeX,eyeY,eyeZ), looking towards the point (refX,refY,refZ). The viewer is oriented so that the vector (upX,upY,upZ) points upwards in the viewer's view. For example, to position the viewer on the negative x-axis, 10 units from the origin, looking back at the origin, with the positive direction of the y-axis pointing up as usual, use

gluLookAt( -10,0,0,  0,0,0,  0,1,0 );
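Incidentally, this particular call describes the same view as the glRotatef/glTranslatef pair from the earlier example; both place the viewer at (-10,0,0), looking at the origin along the positive x-axis, with the y-axis pointing up:

// The same viewing transform as gluLookAt( -10,0,0,  0,0,0,  0,1,0 ):
glRotatef(90,0,1,0);     // rotates the view by -90 degrees about the y-axis
glTranslatef(10,0,0);    // then backs the viewer up to the point (-10,0,0)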

With all this, we can give an outline for a typical display routine for drawing an image of a 3D scene with OpenGL 1.1:

// possibly set clear color here, if not set elsewhere

glClear( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );

// possibly set up the projection here, if not done elsewhere

glMatrixMode( GL_MODELVIEW );

glLoadIdentity();

gluLookAt( eyeX,eyeY,eyeZ, refX,refY,refZ, upX,upY,upZ );  // Viewing transform

glPushMatrix();
.
.   // apply modeling transform and draw an object
.
glPopMatrix();

glPushMatrix();
.
.   // apply another modeling transform and draw another object
.
glPopMatrix();

.
.
.
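As a concrete sketch of this outline, here is what a display routine might look like for a scene containing two objects. The functions drawGround() and drawTree() are hypothetical stand-ins for whatever geometry the program actually defines:

glClear( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );

glMatrixMode( GL_MODELVIEW );
glLoadIdentity();
gluLookAt( 0,4,10,  0,0,0,  0,1,0 );  // view the scene from the point (0,4,10)

glPushMatrix();
glScalef( 3, 0.1, 3 );       // modeling transform for the first object
drawGround();
glPopMatrix();

glPushMatrix();
glTranslatef( 1, 0, 0 );     // modeling transform for the second object
drawTree();
glPopMatrix();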

3.3.5 A Camera Abstraction

Projection and viewing are often discussed using the analogy of a camera. A real camera is used to take a picture of a 3D world. For 3D graphics, it is useful to imagine using a virtual camera to do the same thing. Setting up the viewing transformation is like positioning and pointing the camera. The projection transformation determines the properties of the camera: What is its field of view, and what sort of lens does it use? (Of course, the analogy breaks down for OpenGL in at least one respect, since a real camera doesn't do clipping in its z-direction.)

I have written a camera utility to implement this idea. The camera is meant to take over the job of setting the projection and view. Instead of doing that by hand, you set properties of the camera. The API is available for both C and Java. The two versions are somewhat different because the Java version is object-oriented. I will discuss the C implementation first. (See Section 3.6 for basic information about programming OpenGL in C and Java. For an example of using a camera in a program, see the polyhedron viewer example in the next section. Note also that there is a version of the camera for use with my JavaScript simulator for OpenGL; it is part of the simulator library glsim/glsim.js and has an API almost identical to the Java API.)

In C, the camera is defined by the sample .c file, glut/camera.c and a corresponding header file, glut/camera.h. Full documentation for the API can be found in the header file. To use the camera, you should #include "camera.h" at the start of your program, and when you compile the program, you should include camera.c in the list of files that you want to compile. The camera depends on the GLU library and on C's standard math library, so you have to make sure that those libraries are available when it is compiled. To use the camera, you should call

cameraApply();

to set up the projection and viewing transform before drawing the scene. Calling this function replaces the usual code for setting up the projection and viewing transformations. It leaves the matrix mode set to GL_MODELVIEW.

The remaining functions in the API are used to configure the camera. This would usually be done as part of initialization, but it is possible to change the configuration at any time. However, remember that the settings are not used until you call cameraApply(). Available functions include:

cameraLookAt( eyeX,eyeY,eyeZ, refX,refY,refZ, upX,upY,upZ );
    // Determines the viewing transform, just like gluLookAt
    // Default is cameraLookAt( 0,0,30, 0,0,0, 0,1,0 );

cameraSetLimits( xmin, xmax, ymin, ymax, zmin, zmax );
    // Sets the limits on the view volume, where zmin and zmax are
    // given with respect to the view reference point, NOT the eye,
    // and the xy limits are measured at the distance of the
    // view reference point, NOT the near distance.
    // Default is cameraSetLimits( -5,5, -5,5, -10,10 );

cameraSetScale( limit );
    // a convenience method, which is the same as calling
    // cameraSetLimits( -limit,limit, -limit,limit, -2*limit, 2*limit );

cameraSetOrthographic( ortho );
    // Switch between orthographic and perspective projection.
    // The parameter should be 0 for perspective, 1 for
    // orthographic.  The default is perspective.

cameraSetPreserveAspect( preserve );
    // Determine whether the aspect ratio of the viewport should
    // be respected.  The parameter should be 0 to ignore and
    // 1 to respect the viewport aspect ratio.  The default
    // is to preserve the aspect ratio.

In many cases, the default settings are sufficient. Note in particular how cameraLookAt and cameraSetLimits work together to set up the view and projection. The parameters to cameraLookAt represent three points in world coordinates. The view reference point, (refX,refY,refZ), should be somewhere in the middle of the scene that you want to render. The parameters to cameraSetLimits define a box about that view reference point that should contain everything that you want to appear in the image.
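As a usage sketch, a program might configure the camera once, during initialization, and then apply it at the start of its display routine. The function names init and display, and the configuration values, are arbitrary:

void init() {
    cameraLookAt( 5,5,10, 0,0,0, 0,1,0 );  // viewer at (5,5,10), looking at the origin
    cameraSetLimits( -3,3, -3,3, -3,3 );   // a box about the view reference point
}

void display() {
    glClear( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );
    cameraApply();  // sets projection and view; matrix mode is left as GL_MODELVIEW
    // ... apply modeling transforms and draw the scene ...
}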


For use with JOGL in Java, the camera API is implemented as a class named Camera, defined in the file jogl/Camera.java. The camera is meant for use with a GLPanel or GLCanvas that is being used as an OpenGL drawing surface. To use a camera, create an object of type Camera as an instance variable:

camera = new Camera();

Before drawing the scene, call

camera.apply(gl2);

where gl2 is the OpenGL drawing context of type GL2. (Note the presence of the parameter gl2, which was not necessary in C; it is required because the OpenGL drawing context in JOGL is implemented as an object.) As in the C version, this sets the viewing and projection transformations and can replace any other code that you would use for that purpose. The functions for configuring the camera are the same in Java as in C, except that they become methods in the camera object, and true/false parameters are boolean instead of int:

camera.lookAt( eyeX,eyeY,eyeZ, refX,refY,refZ, upX,upY,upZ );
camera.setLimits( xmin,xmax, ymin,ymax, zmin,zmax );
camera.setScale( limit );
camera.setOrthographic( ortho );    // ortho is of type boolean
camera.setPreserveAspect( preserve ); // preserve is of type boolean

The camera comes with a simulated "trackball." The trackball allows the user to rotate the view by clicking and dragging the mouse on the display. To use it with GLUT in C, you just need to install a mouse function and a mouse motion function by calling

glutMouseFunc( trackballMouseFunction );
glutMotionFunc( trackballMotionFunction );

The functions trackballMouseFunction and trackballMotionFunction are defined as part of the camera API and are declared and documented in camera.h. The trackball works by modifying the viewing transformation associated with the camera, and it only works if cameraApply() is called at the beginning of the display function to set the viewing and projection transformations. To install a trackball for use with a Camera object in JOGL, call

camera.installTrackball(drawSurface);

where drawSurface is the component on which the camera is used.