第1节: 像素、坐标和颜色¶
Pixels, Coordinates, and Colors
为了创建一个二维图像,图像中的每个点都被分配了一种颜色。二维空间中的一个点可以由一对数字坐标来确定。颜色也可以用数字来指定。然而,将数字分配给点或颜色有一定的任意性。因此,我们需要花一些时间来研究坐标系(coordinate systems),将数字与点相关联,并且颜色模型(color model),将数字与颜色相关联。
To create a two-dimensional image, each point in the image is assigned a color. A point in 2D can be identified by a pair of numerical coordinates. Colors can also be specified numerically. However, the assignment of numbers to points or colors is somewhat arbitrary. So we need to spend some time studying coordinate systems, which associate numbers to points, and color models, which associate numbers to colors.
2.1.1 像素坐标¶
Pixel Coordinates
数字图像由像素的行和列组成。在这样的图像中,一个像素可以通过指定包含它的列和行来确定。就坐标而言,一个像素可以由给定的列号和行号组成的整数对来标识。例如,坐标为(3,5)的像素位于第3列和第5行。通常情况下,列从左到右编号,从零开始。大多数图形系统,包括本章将要讨论的系统,将行从上到下编号,从零开始。但是,一些系统,包括OpenGL,将行从底部到顶部编号。
请特别注意,由一对坐标(x,y)确定的像素取决于坐标系统的选择。在了解所使用的坐标系统之前,您总是需要知道您所讨论的是哪个点。
行号和列号标识一个像素,而不是一个点。一个像素包含许多点;从数学上讲,它包含无限多个点。 计算机图形的目标实际上并不是为像素着色,而是创建和操作图像。在某种理想意义上,图像应该通过为每个点指定一个颜色来定义,而不仅仅是为每个像素指定一个颜色。像素是一种近似。如果我们想象有一个真正的、理想的图像要显示,那么通过给像素着色显示的任何图像都是一种近似。这有很多含义。
例如,假设我们想要画一条线段。数学上的线条是没有厚度的,因此是看不见的。所以我们真正想要画的是一条有一定宽度的线段。假设线条应该是一像素宽。问题是,除非线是水平或垂直的,否则我们无法通过给像素着色来真正绘制线。对角几何线条只会部分地覆盖一些像素。不可能将像素的一部分涂成黑色,另一部分涂成白色。当您尝试仅使用黑色和白色像素绘制线条时,结果是出现了阶梯效应。这种效应是所谓“混叠(aliasing)”的一个例子。混叠也可以在屏幕上绘制的字符轮廓和两个不同颜色区域之间的对角线或曲线边界中看到。(“混叠”一词可能来自于理想图像自然是用实数坐标描述的。当您尝试使用像素表示图像时,许多实数坐标将映射到相同的整数像素坐标;它们可以被视为同一个像素的不同名称或“别名”。)
抗混叠(Antialiasing) 是一种旨在减轻混叠效应的技术术语。其思想是,当一个像素只被一个形状的一部分覆盖时,像素的颜色应该是形状颜色和背景颜色的混合。当在白色背景上绘制一条黑色线时,被部分覆盖的像素的颜色将是灰色,其灰度取决于线段覆盖像素的比例。(实际上,为每个像素精确计算这个区域是太困难的,因此采用了一些近似方法。)例如,下图显示了一个几何线段,左侧是该线段,右侧是由像素着色得到的两个近似图像。为了让您能够看到单个像素,这些线段被放大了许多。中间的线段没有使用抗混叠技术绘制,而右侧的线段使用了抗混叠技术:
请注意,抗混叠并不能提供完美的图像,但它可以减少混叠产生的“阶梯(jaggies)”效应(至少在正常缩放时)。
将实数坐标映射到像素时还涉及其他问题。例如,在像素中哪一点应与(3,5)等整数值的坐标对应?像素的中心?像素的一个角落?通常情况下,我们认为这些数字是指像素的左上角。另一种思考方式是说整数坐标是指像素之间的线,而不是指像素本身。但是这仍然不能确定绘制几何形状时确切影响了哪些像素。例如,下图显示了使用HTML画布图形绘制的两条线,放大了许多。这些线被指定为以一像素线宽绘制的黑色:
顶部线段从点(100,100)到点(120,100)绘制。在画布图形中,整数坐标对应于像素之间的线,但是当绘制一像素宽的线时,该线延伸了一个像素的一半。因此,对于顶部线段,绘制的线位于一个像素的一半以及另一个像素的一半。使用抗混叠的图形系统将两行像素都渲染成了灰色。底部线段从点(100.5,100.5)到(120.5,100.5)绘制。在这种情况下,线段正好位于一个像素的一行中,这个像素被涂成了黑色。底部线段末端的灰色像素与该线段仅延伸到像素一半有关。其他图形系统可能以不同的方式渲染相同的线段。
下面的交互式演示允许您对像素和抗混叠进行实验。(请注意,在本书的任何交互式演示中,您都可以点击左上角的问号图标获取有关如何使用它的更多信息。)
所有这些都被现今像素不再是过去的像素所复杂化。今天的像素更小了!显示设备的分辨率可以用每英寸的像素数来衡量,这个数量被称为PPI(每英寸像素)或有时称为DPI(每英寸点)。早期的屏幕的分辨率大约在72 PPI左右。在这种分辨率下,个别像素是清晰可见的。有一段时间,大多数显示器的像素密度约为100像素/英寸,但是今天的高分辨率显示器可以有200、300甚至400像素/英寸。在最高分辨率下,单个像素已经无法分辨。
像素有着各种各样的尺寸,这是一个问题,因为我们使用基于像素的坐标系统。一个图像是根据每英寸100个像素的假设创建的,将在400 PPI的显示器上看起来很小。一条一像素宽的线在100 PPI的显示器上看起来不错,但在400 PPI的显示器上,一像素宽的线可能太细了。
实际上,在许多图形系统中,“像素(pixel)”并不真正指的是物理像素的尺寸。相反,它只是另一个度量单位,由系统设置为适当的尺寸。(在桌面系统上,一个像素通常大约是一英寸的百分之一。在智能手机上,观看距离更近,这个值可能更接近于1/160英寸。此外,当用户对网页进行放大时,像素作为一个度量单位的含义可能会发生变化。)
像素引起了一些尚未完全解决的问题。幸运的是,对于我们在本书中大多数使用的矢量图形来说,它们不再是一个问题。对于矢量图形来说,像素仅在栅格化期间成为一个问题,即将矢量图像转换为用于显示的像素。矢量图像本身可以使用任何方便的坐标系统创建。它代表了一个理想化的、与分辨率无关的图像。栅格化图像是该理想图像的近似,但如何进行近似可以交给显示硬件处理。
A digital image is made up of rows and columns of pixels. A pixel in such an image can be specified by saying which column and which row contains it. In terms of coordinates, a pixel can be identified by a pair of integers giving the column number and the row number. For example, the pixel with coordinates (3,5) would lie in column number 3 and row number 5. Conventionally, columns are numbered from left to right, starting with zero. Most graphics systems, including the ones we will study in this chapter, number rows from top to bottom, starting from zero. Some, including OpenGL, number the rows from bottom to top instead.
Note in particular that the pixel that is identified by a pair of coordinates (x,y) depends on the choice of coordinate system. You always need to know what coordinate system is in use before you know what point you are talking about.
Row and column numbers identify a pixel, not a point. A pixel contains many points; mathematically, it contains an infinite number of points. The goal of computer graphics is not really to color pixels—it is to create and manipulate images. In some ideal sense, an image should be defined by specifying a color for each point, not just for each pixel. Pixels are an approximation. If we imagine that there is a true, ideal image that we want to display, then any image that we display by coloring pixels is an approximation. This has many implications.
Suppose, for example, that we want to draw a line segment. A mathematical line has no thickness and would be invisible. So we really want to draw a thick line segment, with some specified width. Let's say that the line should be one pixel wide. The problem is that, unless the line is horizontal or vertical, we can't actually draw the line by coloring pixels. A diagonal geometric line will cover some pixels only partially. It is not possible to make part of a pixel black and part of it white. When you try to draw a line with black and white pixels only, the result is a jagged staircase effect. This effect is an example of something called "aliasing." Aliasing can also be seen in the outlines of characters drawn on the screen and in diagonal or curved boundaries between any two regions of different color. (The term aliasing likely comes from the fact that ideal images are naturally described in real-number coordinates. When you try to represent the image using pixels, many real-number coordinates will map to the same integer pixel coordinates; they can all be considered as different names or "aliases" for the same pixel.)
Antialiasing is a term for techniques that are designed to mitigate the effects of aliasing. The idea is that when a pixel is only partially covered by a shape, the color of the pixel should be a mixture of the color of the shape and the color of the background. When drawing a black line on a white background, the color of a partially covered pixel would be gray, with the shade of gray depending on the fraction of the pixel that is covered by the line. (In practice, calculating this area exactly for each pixel would be too difficult, so some approximate method is used.) Here, for example, is a geometric line, shown on the left, along with two approximations of that line made by coloring pixels. The lines are greatly magnified so that you can see the individual pixels. The line on the right is drawn using antialiasing, while the one in the middle is not:
Note that antialiasing does not give a perfect image, but it can reduce the "jaggies" that are caused by aliasing (at least when it is viewed on a normal scale).
There are other issues involved in mapping real-number coordinates to pixels. For example, which point in a pixel should correspond to integer-valued coordinates such as (3,5)? The center of the pixel? One of the corners of the pixel? In general, we think of the numbers as referring to the top-left corner of the pixel. Another way of thinking about this is to say that integer coordinates refer to the lines between pixels, rather than to the pixels themselves. But that still doesn't determine exactly which pixels are affected when a geometric shape is drawn. For example, here are two lines drawn using HTML canvas graphics, shown greatly magnified. The lines were specified to be colored black with a one-pixel line width:
The top line was drawn from the point (100,100) to the point (120,100). In canvas graphics, integer coordinates correspond to the lines between pixels, but when a one-pixel line is drawn, it extends one-half pixel on either side of the infinitely thin geometric line. So for the top line, the line as it is drawn lies half in one row of pixels and half in another row. The graphics system, which uses antialiasing, rendered the line by coloring both rows of pixels gray. The bottom line was drawn from the point (100.5,100.5) to (120.5,100.5). In this case, the line lies exactly along one line of pixels, which gets colored black. The gray pixels at the ends of the bottom line have to do with the fact that the line only extends halfway into the pixels at its endpoints. Other graphics systems might render the same lines differently.
The following interactive demo lets you experiment with pixels and antialiasing. (Note that in any of the interactive demos that accompany this book, you can click the question mark icon in the upper left for more information about how to use it.)
All this is complicated further by the fact that pixels aren't what they used to be. Pixels today are smaller! The resolution of a display device can be measured in terms of the number of pixels per inch on the display, a quantity referred to as PPI (pixels per inch) or sometimes DPI (dots per inch). Early screens tended to have resolutions of somewhere close to 72 PPI. At that resolution, and at a typical viewing distance, individual pixels are clearly visible. For a while, it seemed like most displays had about 100 pixels per inch, but high resolution displays today can have 200, 300 or even 400 pixels per inch. At the highest resolutions, individual pixels can no longer be distinguished.
The fact that pixels come in such a range of sizes is a problem if we use coordinate systems based on pixels. An image created assuming that there are 100 pixels per inch will look tiny on a 400 PPI display. A one-pixel-wide line looks good at 100 PPI, but at 400 PPI, a one-pixel-wide line is probably too thin.
In fact, in many graphics systems, "pixel" doesn't really refer to the size of a physical pixel. Instead, it is just another unit of measure, which is set by the system to be something appropriate. (On a desktop system, a pixel is usually about one one-hundredth of an inch. On a smart phone, which is usually viewed from a closer distance, the value might be closer to 1/160 inch. Furthermore, the meaning of a pixel as a unit of measure can change when, for example, the user applies a magnification to a web page.)
Pixels cause problems that have not been completely solved. Fortunately, they are less of a problem for vector graphics, which is mostly what we will use in this book. For vector graphics, pixels only become an issue during rasterization, the step in which a vector image is converted into pixels for display. The vector image itself can be created using any convenient coordinate system. It represents an idealized, resolution-independent image. A rasterized image is an approximation of that ideal image, but how to do the approximation can be left to the display hardware.
2.1.2 实数坐标系¶
Real-number Coordinate Systems
在进行二维图形绘制时,您会得到一个矩形,在其中您想要绘制一些图形原语(graphics primitives)。使用某个坐标系统在矩形上指定原语。应该能够选择一个适合应用程序的坐标系统。例如,如果矩形表示一个15英尺乘12英尺的房间的平面图,则您可能希望使用一个单位为一英尺的坐标系统,坐标范围从水平方向的0到15,垂直方向的0到12。在这种情况下,单位是英尺而不是像素,而且一个英尺可以对应于图像中的许多像素。像素的坐标通常是实数而不是整数。实际上,最好忘记像素,只考虑图像中的点。一个点将由一对实数给出的坐标表示。
为了在矩形上指定坐标系统,您只需要指定矩形左边缘和右边缘的水平坐标,以及顶部和底部的垂直坐标。让我们将这些值称为left、right、top和bottom。通常情况下,它们被认为是xmin、xmax、ymin和ymax,但是没有理由认为例如top小于bottom。我们可能希望一个坐标系统中垂直坐标从下到上递增,而不是从上到下。在这种情况下,顶部将对应于最大的y值,而不是最小值。
为了让程序员能够指定他们想要使用的坐标系统,最好有一个子程序,例如
setCoordinateSystem(left,right,bottom,top)
然后,图形系统将负责自动将指定坐标系统的坐标转换为像素坐标。可能没有这样的子程序,所以了解如何手动进行转换是有用的。让我们考虑一般情况。给定第一个坐标系统中一个点的坐标,我们想要在第二个坐标系统中找到相同点的坐标。(请记住,坐标系统只是一种给点分配数字的方法。重要的是点!)假设第一个坐标系统的水平和垂直限制为oldLeft、oldRight、oldTop和oldBottom,第二个坐标系统的限制为newLeft、newRight、newTop和newBottom。假设一个点在第一个坐标系统中的坐标为(oldX,oldY)。我们想要找到在第二个坐标系统中该点的坐标(newX,newY)
newX 和 newY 的公式如下:
newX = newLeft + ((oldX - oldLeft) / (oldRight - oldLeft)) * (newRight - newLeft))
newY = newTop + ((oldY - oldTop) / (oldBottom - oldTop)) * (newBottom - newTop)
这里的逻辑是,oldX 位于从oldLeft到oldRight的距离的某个比例处。该比例由以下公式给出:
((oldX - oldLeft) / (oldRight - oldLeft))
对于newX的公式只是说newX应该位于从newLeft到newRight的距离的相同比例处。您也可以通过测试来检查这些公式,看看当oldX等于oldLeft或oldRight,以及当oldY等于oldBottom或oldTop时,它们是否起作用。
例如,假设我们想要将某个具有左、右、顶部和底部限制的实数坐标系转换为像素坐标,该像素坐标在左边为0、右边为800、顶部为0、底部为600。在这种情况下,newLeft和newTop为零,公式简化为:
newX = ((oldX - left) / (right - left)) * 800
newY = ((oldY - top) / (bottom - top)) * 600
当然,这将以实数形式给出newX和newY,如果我们需要像素的整数坐标,则必须将它们四舍五入或截断。反向转换——从像素坐标到实数坐标——也是有用的。例如,如果图像显示在计算机屏幕上,并且您希望对图像上的鼠标点击做出反应,您可能会以整数像素坐标形式获得鼠标坐标,但您可能希望将这些像素坐标转换为您选择的坐标系。
实际上,通常情况下,您不必自己执行转换,因为大多数图形API提供了某种更高级的方式来指定转换。我们将在第2.3节中更多地讨论这个问题。
When doing 2D graphics, you are given a rectangle in which you want to draw some graphics primitives. Primitives are specified using some coordinate system on the rectangle. It should be possible to select a coordinate system that is appropriate for the application. For example, if the rectangle represents a floor plan for a 15 foot by 12 foot room, then you might want to use a coordinate system in which the unit of measure is one foot and the coordinates range from 0 to 15 in the horizontal direction and 0 to 12 in the vertical direction. The unit of measure in this case is feet rather than pixels, and one foot can correspond to many pixels in the image. The coordinates for a pixel will, in general, be real numbers rather than integers. In fact, it's better to forget about pixels and just think about points in the image. A point will have a pair of coordinates given by real numbers.
To specify the coordinate system on a rectangle, you just have to specify the horizontal coordinates for the left and right edges of the rectangle and the vertical coordinates for the top and bottom. Let's call these values left, right, top, and bottom. Often, they are thought of as xmin, xmax, ymin, and ymax, but there is no reason to assume that, for example, top is less than bottom. We might want a coordinate system in which the vertical coordinate increases from bottom to top instead of from top to bottom. In that case, top will correspond to the maximum y-value instead of the minimum value.
To allow programmers to specify the coordinate system that they would like to use, it would be good to have a subroutine such as
setCoordinateSystem(left,right,bottom,top)
The graphics system would then be responsible for automatically transforming the coordinates from the specified coordinate system into pixel coordinates. Such a subroutine might not be available, so it's useful to see how the transformation is done by hand. Let's consider the general case. Given coordinates for a point in one coordinate system, we want to find the coordinates for the same point in a second coordinate system. (Remember that a coordinate system is just a way of assigning numbers to points. It's the points that are real!) Suppose that the horizontal and vertical limits are oldLeft, oldRight, oldTop, and oldBottom for the first coordinate system, and are newLeft, newRight, newTop, and newBottom for the second. Suppose that a point has coordinates (oldX,oldY) in the first coordinate system. We want to find the coordinates (newX,newY) of the point in the second coordinate system
Formulas for newX and newY are then given by
newX = newLeft +
((oldX - oldLeft) / (oldRight - oldLeft)) * (newRight - newLeft))
newY = newTop +
((oldY - oldTop) / (oldBottom - oldTop)) * (newBottom - newTop)
The logic here is that oldX is located at a certain fraction of the distance from oldLeft to oldRight. That fraction is given by
((oldX - oldLeft) / (oldRight - oldLeft))
The formula for newX just says that newX should lie at the same fraction of the distance from newLeft to newRight. You can also check the formulas by testing that they work when oldX is equal to oldLeft or to oldRight, and when oldY is equal to oldBottom or to oldTop.
As an example, suppose that we want to transform some real-number coordinate system with limits left, right, top, and bottom into pixel coordinates that range from 0 at left to 800 at the right and from 0 at the top 600 at the bottom. In that case, newLeft and newTop are zero, and the formulas become simply
newX = ((oldX - left) / (right - left)) * 800
newY = ((oldY - top) / (bottom - top)) * 600
Of course, this gives newX and newY as real numbers, and they will have to be rounded or truncated to integer values if we need integer coordinates for pixels. The reverse transformation—going from pixel coordinates to real number coordinates—is also useful. For example, if the image is displayed on a computer screen, and you want to react to mouse clicks on the image, you will probably get the mouse coordinates in terms of integer pixel coordinates, but you will want to transform those pixel coordinates into your own chosen coordinate system.
In practice, though, you won't usually have to do the transformations yourself, since most graphics APIs provide some higher level way to specify transforms. We will talk more about this in Section 2.3.
2.1.3 纵横比¶
Aspect Ratio
矩形的宽高比(aspect ratio)是其宽度与高度的比值。例如,宽高比为2:1意味着矩形的宽度是其高度的两倍,而宽高比为4:3意味着宽度是高度的4/3倍。尽管宽高比通常以宽度:高度(width:height)的形式写成,但我将使用该术语来指代分数宽度/高度(width/height)。一个正方形的宽高比等于1。一个高度为600且宽高比为5/4的矩形的宽度等于 600*(5/4),即750。
坐标系统也有一个宽高比。如果坐标系统的水平和垂直限制如上所述为left、right、bottom和top,则宽高比是绝对值
(right - left) / (top - bottom)
如果在具有相同宽高比的矩形上使用坐标系统,则在该矩形中查看时,水平方向上的一个单位将具有与垂直方向上的单位相同的视觉长度。如果宽高比不匹配,则会存在一些畸变。例如,由方程x2 + y2 = 9定义的形状应该是一个圆,但只有在(x,y)坐标系的宽高比与绘图区域的宽高比相匹配时才成立。
这并不总是一件坏事,使用不同的长度单位在垂直和水平方向上。然而,假设您希望使用具有限制left、right、bottom和top的坐标,并且确实希望保持宽高比。在这种情况下,根据显示矩形的形状,您可能需要调整left和right或bottom和top的值,以使宽高比匹配:
我们将在本章后面更深入地研究几何变换,到那时,我们将看到一些用于设置坐标系统的程序代码。
The aspect ratio of a rectangle is the ratio of its width to its height. For example an aspect ratio of 2:1 means that a rectangle is twice as wide as it is tall, and an aspect ratio of 4:3 means that the width is 4/3 times the height. Although aspect ratios are often written in the form width:height, I will use the term to refer to the fraction width/height. A square has aspect ratio equal to 1. A rectangle with aspect ratio 5/4 and height 600 has a width equal to 600*(5/4), or 750.
A coordinate system also has an aspect ratio. If the horizontal and vertical limits for the coordinate system are left, right, bottom, and top, as above, then the aspect ratio is the absolute value of
(right - left) / (top - bottom)
If the coordinate system is used on a rectangle with the same aspect ratio, then when viewed in that rectangle, one unit in the horizontal direction will have the same apparent length as a unit in the vertical direction. If the aspect ratios don't match, then there will be some distortion. For example, the shape defined by the equation x2 +y2 = 9 should be a circle, but that will only be true if the aspect ratio of the (x,y) coordinate system matches the aspect ratio of the drawing area.
It is not always a bad thing to use different units of length in the vertical and horizontal directions. However, suppose that you want to use coordinates with limits left, right, bottom, and top, and that you do want to preserve the aspect ratio. In that case, depending on the shape of the display rectangle, you might have to adjust the values either of left and right or of bottom and top to make the aspect ratios match:
We will look more deeply into geometric transforms later in the chapter, and at that time, we'll see some program code for setting up coordinate systems.
2.1.4 颜色模型¶
Color Models
我们正在谈论计算机图形学最基础的基础知识之一。其中之一是坐标系。另一个是颜色。事实上,颜色是一个令人惊讶的复杂话题。我们将看一些与计算机图形应用程序最相关的部分。
计算机屏幕上的颜色是通过红、绿和蓝光的组合产生的。通过改变每种类型光的强度来产生不同的颜色。颜色可以通过三个数字来指定,分别表示颜色中红、绿和蓝的强度。强度可以用范围从零(最小强度)到一(最大强度)的数字来指定。这种指定颜色的方法称为RGB颜色模型(RGB color model),其中RGB代表红/绿/蓝(Red/Green/Blue)。例如,在RGB颜色模型中,数值三元组(1,0.5,0.5)表示将红色设置为全强度,而绿色和蓝色设置为半强度的颜色。颜色的红、绿和蓝值在RGB颜色模型中称为该颜色的颜色分量(color components)。
光由具有各种波长的波构成。纯色是指所有光具有相同波长的光,但一般来说,一个颜色可以包含许多波长 - 从数学上讲,是无限多个波长。那么,我们如何通过仅组合红、绿和蓝光来表示所有颜色呢?实际上,我们不能完全做到这一点。
你可能听说过,三种基本或“主要”颜色的组合足以表示所有颜色,因为人眼有三种颜色传感器,可以检测红、绿和蓝光。然而,这只是一个近似值。眼睛确实包含三种颜色传感器。这些传感器称为“锥形细胞(cone cells)”。然而,锥形细胞不仅对红、绿和蓝光做出反应。每种类型的锥形细胞对广泛范围内的光波长以不同程度作出反应。一组给定的波长混合物将使每种类型的细胞以一定程度被刺激,而刺激的强度决定了我们看到的颜色。将每种类型的锥细胞刺激到同样程度的不同波长混合物将被感知为相同的颜色。因此,事实上,可以通过三个数字指定感知到的颜色,这三个数字分别表示三种类型的锥细胞的刺激强度。但是,无论如何选择这三种颜色,都不可能通过组合来产生所有可能的刺激模式。这只是关于我们眼睛实际工作方式的事实;这可能会有所不同。三种基本颜色可以产生相当大比例的可感知颜色集合,但是在计算机屏幕上可能看不到的颜色有很多。(这整个讨论仅适用于实际拥有三种类型锥细胞的人。色盲,即某人缺少一种或多种类型的锥细胞,是令人惊讶地普遍的。)
诸如计算机屏幕之类的设备可以产生的颜色范围称为该设备的色域。不同的计算机屏幕可以具有不同的色域,并且相同的 RGB 值在不同的屏幕上可能会产生略有不同的颜色。彩色打印机的色域明显不同,而且可能比屏幕的色域要小,这就解释了为什么打印出来的图像可能看起来与屏幕上的图像并不完全相同。(顺便说一句,打印机制造颜色的方式与屏幕不同。屏幕是通过组合光来生成颜色,而打印机则是通过组合墨水或染料。由于这种差异,为打印机设计的颜色通常使用不同的基本颜色集合。一种常见的打印机颜色模型是 CMYK,使用青色、品红色、黄色和黑色。)
无论如何,计算机图形最常见的颜色模型是 RGB。RGB 颜色通常使用每个颜色分量 8 位表示,总共需要 24 位来表示一个颜色。这种表示有时被称为 "24 位颜色(24-bit color)"。8 位数字可以表示 2^8,或 256,个不同的值,我们可以将其视为从 0 到 255 的正整数。然后,颜色被指定为在该范围内的整数三元组 (r,g,b)。
这种表示方法很有效,因为 256 种红色、绿色和蓝色的色调几乎是人眼可以区分的。在图像通过颜色分量进行计算的应用程序中,通常使用每个颜色分量额外的位数,以避免由于计算中的舍入误差而产生的视觉效果。这种应用程序可能会为每个颜色分量使用 16 位整数甚至 32 位浮点值。另一方面,有时会使用更少的位数。例如,一种常见的颜色方案使用 5 位用于红色和蓝色分量,以及 6 位用于绿色分量,总共为颜色使用 16 位。(绿色获得额外的位,因为眼睛对绿光的敏感性比对红色或蓝色的敏感性更高。)这种 "16 位颜色" 相对于 24 位颜色可以节省内存,并且在内存更昂贵时更为常见。
除了 RGB 外,还有许多其他颜色模型。RGB 有时被批评为不直观。例如,对大多数人来说,黄色是由红色和绿色的组合而成并不明显。密切相关的颜色模型 HSV 和 HSL 描述与 RGB 相同的颜色集,但试图以更直观的方式进行描述。(HSV 有时被称为 HSB,其中的 "B" 代表 "亮度"。HSV 和 HSB 是完全相同的模型。)
这些模型中的 "H" 代表 "色相(hue)",是基本的光谱颜色。随着 H 的增加,颜色从红色(red)变为黄色(yellow)、绿色(green)、青色(cyan)、蓝色(blue)、品红(magenta),然后回到红色(red)。通常将 H 的值取为从 0 到 360,因为颜色可以被看作是围绕一个圆圈排列,红色在 0 和 360 度处。
HSV 和 HSL 中的 "S" 代表 "饱和度(saturation)",取值范围为 0 到 1。饱和度为 0 会产生灰色的色调(色调取决于 V 或 L 的值)。饱和度为 1 给出 "纯色",减小饱和度就像在颜色中添加更多灰色一样。"V" 代表 "值(value)","L" 代表 "亮度(lightness)"。它们确定颜色的明亮或暗。主要的区别在于,在 HSV 模型中,纯光谱颜色出现在 V=1 时,而在 HSL 中,它们出现在 L=0.5 时。
让我们来看看 HSV 颜色模型中的一些颜色。下面的示例显示了具有全范围 H 值的颜色,其中 S 和 V 分别等于 1 和 0.5。请注意,对于 S=V=1,你会得到明亮、纯净的颜色。S=0.5 会给出较苍白、饱和度较低的颜色。V=0.5 会产生较暗的颜色。
可能通过观察一些实际颜色及其表示方式来更容易理解颜色模型。以下是一个交互式演示,让您可以使用RGB和HSV颜色模型来实现这一点:
通常,颜色模型会添加第四个分量。第四个分量称为 阿尔法(alpha),使用它的颜色模型通常被称为 RGBA 和 HSLA 等名称。Alpha 并不是一种颜色,它通常用来表示透明度(transparency)。具有最大 alpha 值的颜色是完全不透明的;也就是说,它完全不透明。具有 alpha 等于零的颜色是完全透明的,因此是不可见的。中间值给出半透明或部分透明的颜色。透明度决定了在另一种颜色(前景色)之上绘制另一种颜色(背景色)时会发生什么情况。如果前景色完全不透明,则简单地替换背景色。如果前景色部分透明,则与背景色混合。假设 alpha 分量的范围是从 0 到 1,则可以计算得到的颜色为
new_color = (alpha)*(foreground_color) + (1 - alpha)*(background_color)
这个计算是分别对红色、蓝色和绿色的颜色分量进行的。这被称为 阿尔法混合(alpha blending)。效果就像透过有色玻璃观察背景一样;玻璃的颜色会给背景色添加一种色调。这种混合并不是 alpha 分量的唯一可能用法,但它是最常见的。
使用每个分量 8 位的 RGBA 颜色模型总共使用 32 位来表示一个颜色。这是一个方便的数字,因为整数值通常使用 32 位值表示。32 位整数值可以被解释为 32 位 RGBA 颜色。如何在 32 位整数内排列颜色分量在某种程度上是任意的。最常见的布局是将 alpha 分量存储在高 8 位中,然后是红色、绿色和蓝色。(这可能应该称为 ARGB 颜色。)但是,也有其他布局在使用中。
We are talking about the most basic foundations of computer graphics. One of those is coordinate systems. The other is color. Color is actually a surprisingly complex topic. We will look at some parts of the topic that are most relevant to computer graphics applications.
The colors on a computer screen are produced as combinations of red, green, and blue light. Different colors are produced by varying the intensity of each type of light. A color can be specified by three numbers giving the intensity of red, green, and blue in the color. Intensity can be specified as a number in the range zero, for minimum intensity, to one, for maximum intensity. This method of specifying color is called the RGB color model, where RGB stands for Red/Green/Blue. For example, in the RGB color model, the number triple (1, 0.5, 0.5) represents the color obtained by setting red to full intensity, while green and blue are set to half intensity. The red, green, and blue values for a color are called the color components of that color in the RGB color model.
Light is made up of waves with a variety of wavelengths. A pure color is one for which all the light has the same wavelength, but in general, a color can contain many wavelengths—mathematically, an infinite number. How then can we represent all colors by combining just red, green, and blue light? In fact, we can't quite do that.
You might have heard that combinations of the three basic, or "primary," colors are sufficient to represent all colors, because the human eye has three kinds of color sensors that detect red, green, and blue light. However, that is only an approximation. The eye does contain three kinds of color sensors. The sensors are called "cone cells." However, cone cells do not respond exclusively to red, green, and blue light. Each kind of cone cell responds, to a varying degree, to wavelengths of light in a wide range. A given mix of wavelengths will stimulate each type of cell to a certain degree, and the intensity of stimulation determines the color that we see. A different mixture of wavelengths that stimulates each type of cone cell to the same extent will be perceived as the same color. So a perceived color can, in fact, be specified by three numbers giving the intensity of stimulation of the three types of cone cell. However, it is not possible to produce all possible patterns of stimulation by combining just three basic colors, no matter how those colors are chosen. This is just a fact about the way our eyes actually work; it might have been different. Three basic colors can produce a reasonably large fraction of the set of perceivable colors, but there are colors that you can see in the world that you will never see on your computer screen. (This whole discussion only applies to people who actually have three kinds of cone cell. Color blindness, where someone is missing one or more kinds of cone cell, is surprisingly common.)
The range of colors that can be produced by a device such as a computer screen is called the color gamut of that device. Different computer screens can have different color gamuts, and the same RGB values can produce somewhat different colors on different screens. The color gamut of a color printer is noticeably different—and probably smaller—than the color gamut of a screen, which explains why a printed image probably doesn't look exactly the same as it did on the screen. (Printers, by the way, make colors differently from the way a screen does it. Whereas a screen combines light to make a color, a printer combines inks or dyes. Because of this difference, colors meant for printers are often expressed using a different set of basic colors. A common color model for printer colors is CMYK, using the colors cyan, magenta, yellow, and black.)
In any case, the most common color model for computer graphics is RGB. RGB colors are most often represented using 8 bits per color component, a total of 24 bits to represent a color. This representation is sometimes called "24-bit color." An 8-bit number can represent 28, or 256, different values, which we can take to be the positive integers from 0 to 255. A color is then specified as a triple of integers (r,g,b) in that range.
This representation works well because 256 shades of red, green, and blue are about as many as the eye can distinguish. In applications where images are processed by computing with color components, it is common to use additional bits per color component to avoid visual effects that might occur due to rounding errors in the computations. Such applications might use a 16-bit integer or even a 32-bit floating point value for each color component. On the other hand, sometimes fewer bits are used. For example, one common color scheme uses 5 bits for the red and blue components and 6 bits for the green component, for a total of 16 bits for a color. (Green gets an extra bit because the eye is more sensitive to green light than to red or blue.) This "16-bit color" saves memory compared to 24-bit color and was more common when memory was more expensive.
There are many other color models besides RGB. RGB is sometimes criticized as being unintuitive. For example, it's not obvious to most people that yellow is made of a combination of red and green. The closely related color models HSV and HSL describe the same set of colors as RGB, but attempt to do it in a more intuitive way. (HSV is sometimes called HSB, with the "B" standing for "brightness." HSV and HSB are exactly the same model.)
The "H" in these models stands for "hue," a basic spectral color. As H increases, the color changes from red to yellow to green to cyan to blue to magenta, and then back to red. The value of H is often taken to range from 0 to 360, since the colors can be thought of as arranged around a circle with red at both 0 and 360 degrees.
The "S" in HSV and HSL stands for "saturation," and is taken to range from 0 to 1. A saturation of 0 gives a shade of gray (the shade depending on the value of V or L). A saturation of 1 gives a "pure color," and decreasing the saturation is like adding more gray to the color. "V" stands for "value," and "L" stands for "lightness." They determine how bright or dark the color is. The main difference is that in the HSV model, the pure spectral colors occur for V=1, while in HSL, they occur for L=0.5.
Let's look at some colors in the HSV color model. The illustration below shows colors with a full range of H-values, for S and V equal to 1 and to 0.5. Note that for S=V=1, you get bright, pure colors. S=0.5 gives paler, less saturated colors. V=0.5 gives darker colors.
It's probably easier to understand color models by looking at some actual colors and how they are represented. Here is an interactive demo that let's you do that for the RGB and HSV color models:
Often, a fourth component is added to color models. The fourth component is called alpha, and color models that use it are referred to by names such as RGBA and HSLA. Alpha is not a color as such. It is usually used to represent transparency. A color with maximal alpha value is fully opaque; that is, it is not at all transparent. A color with alpha equal to zero is completely transparent and therefore invisible. Intermediate values give translucent, or partly transparent, colors. Transparency determines what happens when you draw with one color (the foreground color) on top of another color (the background color). If the foreground color is fully opaque, it simply replaces the background color. If the foreground color is partly transparent, then it is blended with the background color. Assuming that the alpha component ranges from 0 to 1, the color that you get can be computed as
new_color = (alpha)*(foreground_color) + (1 - alpha)*(background_color)
This computation is done separately for the red, blue, and green color components. This is called alpha blending. The effect is like viewing the background through colored glass; the color of the glass adds a tint to the background color. This type of blending is not the only possible use of the alpha component, but it is the most common.
An RGBA color model with 8 bits per component uses a total of 32 bits to represent a color. This is a convenient number because integer values are often represented using 32-bit values. A 32-bit integer value can be interpreted as a 32-bit RGBA color. How the color components are arranged within a 32-bit integer is somewhat arbitrary. The most common layout is to store the alpha component in the eight high-order bits, followed by red, green, and blue. (This should probably be called ARGB color.) However, other layouts are also in use.