第 3 节：硬件与软件¶

Hardware and Software

我们将以OpenGL作为3D图形编程的主要基础。最初的开放GL(OpenGL)版本于1992年由一家名为Silicon Graphics的公司发布，该公司以其图形工作站而闻名——这是设计用于密集图形应用的强大且昂贵的计算机。（今天，您的智能手机具有更多的图形计算能力。）OpenGL受到大多数现代计算设备的图形硬件支持，包括台式计算机、笔记本电脑和许多移动设备。作为网页GL(WebGL)的形式，它是Web上大多数3D图形的使用方式。本节将为您介绍一些关于OpenGL历史和支持它的图形硬件的背景知识。

在最初的台式计算机中，屏幕内容是由中央处理器(CPU)直接管理的。例如，要在屏幕上绘制一条线段，CPU将运行一个循环来设置沿线的每个像素的颜色。不用说，图形可能会占用CPU大量的时间。与我们今天期望的相比，图形性能非常慢。那么，有什么变化呢？当然，计算机总体上更快了，但最大的变化是在现代计算机中，图形处理是由一种称为图形处理器(GPU)的专用组件完成的。 GPU包括用于执行图形计算的处理器；实际上，它可以包含大量的这种处理器，这些处理器并行工作以大大加速图形操作。它还包括专用内存，用于存储诸如图像和坐标列表之类的东西。 GPU处理器对存储在GPU内存中的数据具有非常快的访问速度——比它们访问存储在计算机主内存中的数据要快得多。

要绘制一条线或执行其他图形操作，CPU只需将命令以及任何必要的数据发送到GPU，GPU负责实际执行这些命令。 CPU将大部分图形工作交给了GPU，后者被优化为非常快地执行这项工作。 GPU理解的命令集组成了GPU的接口(API)。 OpenGL是图形API的一个例子，大多数GPU支持OpenGL，意味着它们可以理解OpenGL命令，或者至少OpenGL命令可以被有效地转换为GPU可以理解的命令。

OpenGL并不是唯一的图形API。实际上，它正在被更现代的替代方案所取代，包括伏尔甘(Vulkan)，这是由负责OpenGL的同一组织开发的开放API。还有一些由苹果和微软使用的专有API：metal(Metal)和direct3D(Direct3D)。至于Web，一个名为网页GPU(WebGPU)的新API已经开发了一段时间，并且已经在一些Web浏览器中实现了。这些较新的API是复杂且底层的。它们更多地设计用于速度和效率，而不是易用性。本教材不涵盖Metal、Direct3D和Vulkan，但在第9章中介绍了WebGPU。在大部分的内容中，我们将使用OpenGL，因为它提供了一个更容易入门的3D图形介绍，以及WebGL，因为它仍然是Web浏览器中3D图形的主要API。

We will be using OpenGL as the primary basis for 3D graphics programming. The original version of OpenGL was released in 1992 by a company named Silicon Graphics, which was known for its graphics workstations—powerful, expensive computers designed for intensive graphical applications. (Today, you have more graphics computing power on your smart phone.) OpenGL is supported by the graphics hardware in most modern computing devices, including desktop computers, laptops, and many mobile devices. In the form of WebGL, it is the used for most 3D graphics on the Web. This section will give you a bit of background about the history of OpenGL and about the graphics hardware that supports it.

In the first desktop computers, the contents of the screen were managed directly by the CPU. For example, to draw a line segment on the screen, the CPU would run a loop to set the color of each pixel that lies along the line. Needless to say, graphics could take up a lot of the CPU's time. And graphics performance was very slow, compared to what we expect today. So what has changed? Computers are much faster in general, of course, but the big change is that in modern computers, graphics processing is done by a specialized component called a GPU, or Graphics Processing Unit. A GPU includes processors for doing graphics computations; in fact, it can include a large number of such processors that work in parallel to greatly speed up graphical operations. It also includes its own dedicated memory for storing things like images and lists of coordinates. GPU processors have very fast access to data that is stored in GPU memory—much faster than their access to data stored in the computer's main memory.

To draw a line or perform some other graphical operation, the CPU simply has to send commands, along with any necessary data, to the GPU, which is responsible for actually carrying out those commands. The CPU offloads most of the graphical work to the GPU, which is optimized to carry out that work very quickly. The set of commands that the GPU understands make up the API of the GPU. OpenGL is an example of a graphics API, and most GPUs support OpenGL in the sense that they can understand OpenGL commands, or at least that OpenGL commands can efficiently be translated into commands that the GPU can understand.

OpenGL is not the only graphics API. In fact, it is in the process of being replaced by more modern alternatives, including Vulkan an open API from the same group that is responsible for OpenGL. There are also proprietary APIs used by Apple and Microsoft: Metal and Direct3D. As for the Web, a new API called WebGPU has been under development for some time and is already implemented in some Web browsers. These newer APIs are complex and low-level. They are designed more for speed and efficiency rather than ease-of-use. Metal, Direct3D, and Vulkan are not covered in this textbook, but WebGPU is introduced in Chapter 9. For most of the book, we will use OpenGL, because it provides an easier introduction to 3D graphics, and WebGL, because it is still the major API for 3D graphics in Web browsers.

中文英文

我曾经说过OpenGL是一个API，但实际上它是一系列API，经过多次扩展和修订。在2023年，当前（也许是最终）版本是4.6，它首次发布于2017年。这与1992年的1.0版本有很大的不同。此外，还有一个专门的版本叫做OpenGL ES，用于诸如手机和平板电脑之类的“嵌入式系统”。还有WebGL，用于Web浏览器，基本上是OpenGL ES的一个移植版本。了解OpenGL的变化是如何发生以及原因将会很有用。

首先，您应该知道OpenGL被设计为一个“客户端/服务器”系统。服务器负责控制计算机的显示并执行图形计算，执行客户端发出的命令。通常，服务器是一个GPU，包括其图形处理器和内存。服务器执行OpenGL命令。客户端是同一台计算机中的CPU，以及它正在运行的应用程序。OpenGL命令来自于在CPU上运行的程序。但是，实际上可以通过网络远程运行OpenGL程序。也就是说，您可以在远程计算机（OpenGL客户端）上执行应用程序，而图形计算和显示是在您实际使用的计算机上完成的（OpenGL服务器）。

关键思想是客户端和服务器是分开的组件，并且在这些组件之间有一个通信通道。OpenGL命令及其所需的数据通过该通道从客户端（CPU）传输到服务器（GPU）。通道的容量可能是图形性能的限制因素。想象一下将图像绘制到屏幕上。如果GPU可以在微秒内绘制图像，但是将图像数据从CPU发送到GPU需要毫秒级的时间，那么GPU的快速速度就无关紧要了——绘制图像所需的大部分时间是通信时间。

因此，OpenGL发展的驱动因素之一是希望限制CPU和GPU之间需要的通信量。一种方法是将信息存储在GPU的内存中。如果某些数据将被多次使用，则可以一次将其传输到GPU并存储在那里的内存中，从而立即使GPU可以访问。另一种方法是尝试减少必须传输到GPU以绘制给定图像的OpenGL命令的数量。

OpenGL绘制诸如三角形之类的基元。指定一个基元意味着为每个顶点指定坐标和属性。在最初的OpenGL 1.0中，使用单独的命令来指定每个顶点的坐标，并且每当属性的值发生变化时都需要一个命令。要绘制一个三角形将需要三个或更多个命令。由成千上万个三角形组成的复杂对象的绘制将需要许多成千上万个命令。即使在OpenGL 1.1中，也可以使用单个命令而不是数千个来绘制这样的对象。对象的所有数据将加载到数组中，然后可以一次性将其发送到GPU。不幸的是，如果要多次绘制对象，则每次绘制对象时都必须重新传输数据。这在OpenGL 1.5中通过顶点缓冲对象(VBO(Vertex Buffer Objects))得到了修复。 VBO 是GPU中的一块内存块，可以存储一组顶点的坐标或属性值。这样，就可以在不必每次使用时都从CPU重新传输数据的情况下重用数据。

同样，在OpenGL 1.1中引入了纹理对象(texture objects)，以便在GPU上存储多个图像以供纹理使用。这意味着将多次重复使用的纹理图像加载到GPU中一次，以便GPU可以轻松地在图像之间切换而无需重新加载它们。

I have said that OpenGL is an API, but in fact it is a series of APIs that have been subject to repeated extension and revision. In 2023, the current (and perhaps final) version is 4.6, which was first released in 2017. It is very different from the 1.0 version from 1992. Furthermore, there is a specialized version called OpenGL ES for "embedded systems" such as mobile phones and tablets. And there is also WebGL, for use in Web browsers, which is basically a port of OpenGL ES. It will be useful to know something about how and why OpenGL has changed.

First of all, you should know that OpenGL was designed as a "client/server" system. The server, which is responsible for controlling the computer's display and performing graphics computations, carries out commands issued by the client. Typically, the server is a GPU, including its graphics processors and memory. The server executes OpenGL commands. The client is the CPU in the same computer, along with the application program that it is running. OpenGL commands come from the program that is running on the CPU. However, it is actually possible to run OpenGL programs remotely over a network. That is, you can execute an application program on a remote computer (the OpenGL client), while the graphics computations and display are done on the computer that you are actually using (the OpenGL server).

The key idea is that the client and the server are separate components, and there is a communication channel between those components. OpenGL commands and the data that they need are communicated from the client (the CPU) to the server (the GPU) over that channel. The capacity of the channel can be a limiting factor in graphics performance. Think of drawing an image onto the screen. If the GPU can draw the image in microseconds, but it takes milliseconds to send the data for the image from the CPU to the GPU, then the great speed of the GPU is irrelevant—most of the time that it takes to draw the image is communication time.

For this reason, one of the driving factors in the evolution of OpenGL has been the desire to limit the amount of communication that is needed between the CPU and the GPU. One approach is to store information in the GPU's memory. If some data is going to be used several times, it can be transmitted to the GPU once and stored in memory there, where it will be immediately accessible to the GPU. Another approach is to try to decrease the number of OpenGL commands that must be transmitted to the GPU to draw a given image.

OpenGL draws primitives such as triangles. Specifying a primitive means specifying coordinates and attributes for each of its vertices. In the original OpenGL 1.0, a separate command was used to specify the coordinates of each vertex, and a command was needed each time the value of an attribute changed. To draw a single triangle would require three or more commands. Drawing a complex object made up of thousands of triangles would take many thousands of commands. Even in OpenGL 1.1, it became possible to draw such an object with a single command instead of thousands. All the data for the object would be loaded into arrays, which could then be sent in a single step to the GPU. Unfortunately, if the object was going to be drawn more than once, then the data would have to be retransmitted each time the object was drawn. This was fixed in OpenGL 1.5 with Vertex Buffer Objects. A VBO is a block of memory in the GPU that can store the coordinates or attribute values for a set of vertices. This makes it possible to reuse the data without having to retransmit it from the CPU to the GPU every time it is used.

Similarly, OpenGL 1.1 introduced texture objects to make it possible to store several images on the GPU for use as textures. This means that texture images that are going to be reused several times can be loaded once into the GPU, so that the GPU can easily switch between images without having to reload them.

中文英文

随着新的功能被添加到OpenGL中，API的规模也在增长。但增长速度仍然被用于进行图形处理的新的、更复杂的技术所超越。其中一些新技术被添加到了OpenGL中，但问题在于，无论你添加了多少功能，总会有对新功能的需求，以及对所有新功能使事情变得过于复杂的抱怨！OpenGL是一个庞大的机器，不断地增加新组件，但仍然不能让每个人都满意。真正的解决方案是使机器可编程化(programmable)。随着OpenGL 2.0的出现，编写程序以作为GPU图形计算的一部分执行成为可能。这些程序在GPU上以GPU速度运行。想要使用新图形技术的程序员可以编写一个程序来实现该功能，然后将其交给GPU执行。OpenGL API不必更改。API唯一需要支持的是将程序发送到GPU以进行执行的能力。

这些程序被称为着色器(shaders)（尽管这个术语实际上并不描述它们大多数做什么）。首先被引入的着色器是顶点着色器(vertex shaders)和片段着色器(fragment shaders)。当绘制一个基元时，必须在每个基元的顶点上进行一些工作，例如对顶点坐标应用几何变换或使用属性和全局光照环境来计算该顶点的颜色。顶点着色器是一个可以接管执行此类“每个顶点”计算的程序。同样，对于基元内的每个像素，必须执行一些工作。片段着色器可以接管执行这种“每个像素”的计算。（片段着色器也称为像素着色器(pixel shaders)。）

可编程图形硬件的概念非常成功——成功到在OpenGL 3.0中，常规的每个顶点和每个片段的处理被弃用（意味着不鼓励使用）。并且在OpenGL 3.1中，它已从OpenGL标准中删除，尽管仍作为可选扩展存在。实际上，在桌面版本的OpenGL中仍支持所有原始功能，并且可能会在未来继续提供。然而，在嵌入式系统方面，使用OpenGL ES 2.0及更高版本时，着色器的使用是强制性的，并且已完全删除了OpenGL 1.1 API的大部分内容。用于Web浏览器的OpenGL版本WebGL是基于OpenGL ES的，它也需要使用着色器来完成任何事情。尽管如此，我们将从版本1.1开始学习OpenGL。该版本的大多数概念和许多细节仍然相关，并且为初学者提供了更容易的入门点。

OpenGL着色器是用OpenGL着色语言(GLSL)（(OpenGL Shading Language)）编写的。与 OpenGL 本身一样，GLSL也经历了几个版本。我们将在课程后期花一些时间学习GLSL ES，这是与 WebGL 和 OpenGL ES 一起使用的版本。GLSL使用类似于C编程语言的语法。

As new capabilities were added to OpenGL, the API grew in size. But the growth was still outpaced by the invention of new, more sophisticated techniques for doing graphics. Some of these new techniques were added to OpenGL, but the problem is that no matter how many features you add, there will always be demands for new features—as well as complaints that all the new features are making things too complicated! OpenGL was a giant machine, with new pieces always being tacked onto it, but still not pleasing everyone. The real solution was to make the machine programmable. With OpenGL 2.0, it became possible to write programs to be executed as part of the graphical computation in the GPU. The programs are run on the GPU at GPU speed. A programmer who wants to use a new graphics technique can write a program to implement the feature and just hand it to the GPU. The OpenGL API doesn't have to be changed. The only thing that the API has to support is the ability to send programs to the GPU for execution.

The programs are called shaders (although the term doesn't really describe what most of them actually do). The first shaders to be introduced were vertex shaders and fragment shaders. When a primitive is drawn, some work has to be done at each vertex of the primitive, such as applying a geometric transform to the vertex coordinates or using the attributes and global lighting environment to compute the color of that vertex. A vertex shader is a program that can take over the job of doing such "per-vertex" computations. Similarly, some work has to be done for each pixel inside the primitive. A fragment shader can take over the job of performing such "per-pixel" computations. (Fragment shaders are also called pixel shaders.)

The idea of programmable graphics hardware was very successful—so successful that in OpenGL 3.0, the usual per-vertex and per-fragment processing was deprecated (meaning that its use was discouraged). And in OpenGL 3.1, it was removed from the OpenGL standard, although it is still present as an optional extension. In practice, all the original features of OpenGL are still supported in desktop versions of OpenGL and will probably continue to be available in the future. On the embedded system side, however, with OpenGL ES 2.0 and later, the use of shaders is mandatory, and a large part of the OpenGL 1.1 API has been completely removed. WebGL, the version of OpenGL for use in web browsers, is based on OpenGL ES, and it also requires shaders to get anything at all done. Nevertheless, we will begin our study of OpenGL with version 1.1. Most of the concepts and many of the details from that version are still relevant, and it offers an easier entry point for someone new to 3D graphics programming.

OpenGL shaders are written in GLSL (OpenGL Shading Language). Like OpenGL itself, GLSL has gone through several versions. We will spend some time later in the course studying GLSL ES, the version used with WebGL and OpenGL ES. GLSL uses a syntax similar to the C programming language.

中文英文

作为对GPU硬件的最后一点说明，我应该指出，对于不同顶点进行的计算基本上是相互独立的，因此大概可以并行进行计算。对于不同片段的计算也是如此。事实上，GPU可以拥有数百个甚至数千个可以并行操作的处理器。诚然，单个处理器的性能远不及CPU强大，但典型的每个顶点和每个片段的计算并不是非常复杂的。在图形计算中可能存在的大量处理器和大量并行性，使得即使在相当廉价的GPU上也能实现令人印象深刻的图形性能。

As a final remark on GPU hardware, I should note that the computations that are done for different vertices are pretty much independent, and so can potentially be done in parallel. The same is true of the computations for different fragments. In fact, GPUs can have hundreds or thousands of processors that can operate in parallel. Admittedly, the individual processors are much less powerful than a CPU, but then typical per-vertex and per-fragment computations are not very complicated. The large number of processors, and the large amount of parallelism that is possible in graphics computations, makes for impressive graphics performance even on fairly inexpensive GPUs.