
9.1 WebGPU Basics

WebGPU is a new API for computer graphics on the Web. Where WebGL was based on OpenGL, WebGPU has been completely designed from scratch. It is similar to more modern computer graphics APIs such as Vulkan, Metal, and Direct3D. WebGPU is a very low-level API, which makes the programmer do more work but also offers more power and efficiency. On the other hand, you might find that WebGPU is a cleaner, more logical API than WebGL, which is filled with strange remnants of old OpenGL features.

We begin the chapter with an overview of WebGPU. For now, we will stick to basic 2D graphics, with no transformations or lighting. Although I will make some references to WebGL, I will try to make the discussion accessible even for someone who has not already studied WebGL or OpenGL; however, if you are not familiar with those older APIs, you might need to refer to earlier sections of this book for background information.

Our WebGPU examples will be programmed in JavaScript. A short introduction to JavaScript can be found in Section 3 of Appendix A. WebGPU makes extensive use of typed arrays such as Float32Array and of the notations for creating objects (using {...}) and arrays (using [...]). And it uses async functions and promises, advanced JavaScript features that are discussed in Section 4 of that appendix.
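
If you have not used these features before, here is a minimal sketch of what they look like (the values and names here are made up for illustration and are not taken from the sample program):

const coords = new Float32Array( [ -0.8, -0.6,  0.8, -0.6,  0.0, 0.7 ] );  // a typed array

const options = {             // an object, created with {...}
    format: "bgra8unorm",
    sizes: [ 300, 600, 900 ]  // an array, created with [...]
};

async function setup() {      // an async function can use await on a promise
    let adapter = await navigator.gpu.requestAdapter();
    console.log( adapter ? "Got a WebGPU adapter." : "No WebGPU adapter available." );
}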

The environment for a WebGPU application has two parts that I will call the JavaScript side and the GPU side. The JavaScript side is executed on the CPU (the Central Processing Unit of the computer), while WebGPU computational and rendering operations are executed on the GPU (Graphical Processing Unit). The CPU and GPU each have their own dedicated memory, but they also have some shared memory that can be used for sharing data and sending messages. Communication between the JavaScript side and the GPU side of the application is relatively slow and inefficient. A lot of the design of WebGPU, which can seem cumbersome and a little strange, can be explained by the need to manage that communication as efficiently as possible. Now, WebGPU can in fact be implemented in many ways on many different systems. It can even be emulated entirely in software with no physical GPU involved. But the design has to be efficient for all cases, and the case that you should keep in mind when trying to understand the design is one with separate CPU and GPU that have access to some shared memory.

In this section we will mostly be looking at one sample program: basic_webgpu_1.html, which simply draws a colored triangle. The source code for this example is extensively commented, and you are encouraged to read it. You can run it to test whether your browser supports WebGPU. Here is a demo version (with source code that does not include all the comments):

9.1.1 Adapter, Device, and Canvas

Any WebGPU application must begin by obtaining a WebGPU "device," which represents the programmer's interface to almost all WebGPU features. To produce visible graphics images on a web page, WebGPU renders to an HTML canvas element on the page. For that, the application will need a WebGPU context for the canvas. (WebGPU can do other things besides render to a canvas, but we will stick to that for now). The code for obtaining the device and context can be the same in any application:

async function initWebGPU() {

    if (!navigator.gpu) {
        throw Error("WebGPU not supported in this browser.");
    }
    let adapter = await navigator.gpu.requestAdapter();
    if (!adapter) {
        throw Error("WebGPU is supported, but couldn't get WebGPU adapter.");
    }

    device = await adapter.requestDevice();

    let canvas = document.getElementById("webgpuCanvas");
    context = canvas.getContext("webgpu");
    context.configure({
        device: device,
        format: navigator.gpu.getPreferredCanvasFormat(),
        alphaMode: "premultiplied" // (the alternative is "opaque")
    });
    .
    .
    .
}
Here, device and context are global variables, navigator is a predefined variable representing the web browser, and the other variables, adapter and canvas, are probably not needed outside the initialization function. (If a reference to the canvas is needed, it is available as context.canvas.) The functions navigator.gpu.requestAdapter() and adapter.requestDevice() return promises. The function is declared as async because it uses await to wait for the results from those promises. (Async functions are used in the same way as other functions, except that sometimes you have to take into account that other parts of the program can in theory run while await is waiting for a result.)

The only thing you might want to change in this initialization is the alphaMode for the context. The value "premultiplied" allows the alpha value of a pixel in the canvas to determine the degree of transparency of that pixel when the canvas is drawn on the web page. The alternative value, "opaque", means that the alpha value of a pixel is ignored, and the pixel is opaque.

This initialization code does some error checking and can throw an error if a problem is encountered. Presumably, the program would catch that error elsewhere and report it to the user. However, as a WebGPU developer, you should be aware that WebGPU does extensive validity checks on programs and reports all errors and warnings to the web browser console. So, it is a good idea to keep the console open when testing your work.
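
For example, the initialization function might be called in a pattern something like the following (a sketch only; the enclosing function, the element with id "message", and the wording of the report are assumptions, not code from the sample program):

async function init() {
    try {
        await initWebGPU();  // the initialization code shown above
    }
    catch (e) {
        // Report the problem to the user instead of trying to draw anything.
        document.getElementById("message").textContent =
                 "Sorry, could not initialize WebGPU: " + e.message;
        return;
    }
    // ... continue with pipeline creation and drawing ...
}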

9.1.2 Shader Module

Like WebGL and OpenGL, WebGPU draws primitives (points, lines, and triangles) that are defined by vertices. The rendering process involves some computation for each vertex of a primitive, and some computation for each pixel (or "fragment") that is part of the primitive. A WebGPU programmer must define functions to specify those computations. Those functions are shaders. To render an image, a WebGPU program must provide a vertex shader main function and a fragment shader main function. In the documentation, those functions are referred to as the vertex shader entry point and the fragment shader entry point. Shader functions and supporting code for WebGPU are written in WGSL, the WebGPU Shader Language. Shader source code is given as an ordinary JavaScript string. The device.createShaderModule() method, in the WebGPU device object, is used to compile the source code, check it for syntax errors, and package it into a shader module that can then be used in a rendering pipeline:

shader = device.createShaderModule({
    code: shaderSource
});

The parameter here is an object that in this example has just one property, named code; shaderSource is the string that contains the shader source code; and the return value, shader, represents the compiled source code, which will be used later, when configuring the render pipeline. Syntax errors in the source code will not throw an exception. However, compilation errors and warnings will be reported in the web console. You should always check the console for WebGPU messages during development.
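
If you want to inspect those messages from your own code, and not just in the console, a shader module has a getCompilationInfo() method that returns a promise for the list of messages. Here is a sketch of how it might be used, inside an async function:

let info = await shader.getCompilationInfo();
for (let message of info.messages) {
    // message.type is "error", "warning", or "info"
    console.log( message.type + " at line " + message.lineNum + ": " + message.message );
}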


We will look at WGSL in some detail in Section 9.3. WGSL is similar in many ways to GLSL, the shading language for WebGL, but its variable and function declarations are very different. I will give just a short discussion here, to help you understand the relationship between the JavaScript part and the WGSL part of a WebGPU application. Here is the short shader source code from our first WebGPU example. It is defined (on the JavaScript side) as a template string, which can extend over multiple lines:

const shaderSource = `

@group(0) @binding(0) var<uniform> color : vec3f;

@vertex
fn vertexMain( @location(0) coords : vec2f ) -> @builtin(position) vec4f {
    return vec4f( coords, 0, 1 );
}

@fragment
fn fragmentMain() -> @location(0) vec4f {
    return vec4f( color, 1 ); 
}
`;

The syntax for a function definition in WGSL is

fn function_name ( parameter_list ) -> return_type { . . . }

The types used in this example—vec2f, vec3f, and vec4f—represent vectors of two, three, and four 32-bit floating point numbers. Variable declarations can have several forms. The one example in this code has the form

var<uniform> variable_name : type ;

This declares a global variable in the "uniform address space," which will be discussed below. A variable in the uniform address space gets its value from the JavaScript side.

The words beginning with "@" are annotations or modifiers. For example, @vertex means that the following function can be used as a vertex shader entry point, and @fragment means that the following function can be used as a fragment shader entry point. The @builtin(position) annotation says that the return value from vertexMain() gives the coordinates of the vertex in the standard WebGPU coordinate system. And @location(0), @group(0), and @binding(0) in this example are used to specify connections between data in the shader and data on the JavaScript side, as will be discussed below.

The vertex and fragment shader functions that are used here are very simple. The vertex shader simply takes the (x,y) coordinates from its parameter, which comes from the JavaScript side, and adds z- and w-coordinates to get the final homogeneous coordinates for the vertex. The expression vec4f(coords,0,1) for the return value constructs a vec4f (a vector of four floats) from the four floating-point values in its parameter list. The fragment shader, which outputs an RGBA color for the pixel that it is processing, simply uses the three RGB components from the uniform color variable, which comes from the JavaScript side, and adds a 1 for the alpha component of the color.

9.1.3 Render Pipeline

WebGPU 中,图像是作为一系列处理阶段的输出而产生的,这些阶段构成了一个“渲染管线”。顶点着色器和片段着色器是管线中的可编程阶段,但还有其他固定功能的阶段内置于 WebGPU 中。管线的输入来自 GPU 中的数据结构。如果数据源自应用程序的 JavaScript 端,则必须先将其复制到 GPU,然后才能在管线中使用。以下是通用渲染管线结构的示意图:

WebGPU 渲染管线中的数据流

该图示显示了两种类型的管线输入:顶点缓冲区绑定组。回想一下,当绘制一个图元时,顶点着色器会针对图元中的每个顶点被调用一次。顶点着色器的每次调用都可以为顶点着色器入口点函数中的参数获取不同的值。这些值来自顶点缓冲区。缓冲区必须为每个顶点的参数加载值。管线的一个固定功能阶段,显示为顶点缓冲区和顶点着色器之间的点,会针对每个顶点调用一次顶点着色器,从缓冲区中提取该顶点的适当参数值集。(顶点缓冲区还保存实例化绘制的数据,将在下一节中介绍)。

顶点着色器输出一些值,这些值必须包括顶点的坐标,但也可以包括其他值,如颜色、纹理坐标和顶点的法向量。管线中位于顶点着色器和片段着色器之间的中间阶段以各种方式处理这些值。例如,顶点的坐标用于确定哪些像素位于图元中。通过插值顶点坐标来计算像素的坐标。颜色和纹理坐标等值通常也会被插值,以获得每个像素的不同值。所有这些值都可以作为输入提供给片段着色器,片段着色器将针对图元中的每个像素调用一次,并为其参数提供适当的值。

顶点缓冲区之所以特殊,是因为它们用于提供顶点着色器参数的方式。其他类型的输入存储在称为绑定组的数据结构中。绑定组中的值作为全局变量提供给顶点和片段着色器。

片段着色器可以输出多个值。这些值的目标位于管线外部,被称为管线的“颜色附件”。在最常见的情况下,只有一个输出,表示要分配给像素的颜色,相关的附件是正在渲染的图像(或者更确切地说,是保存该图像颜色数据的内存块)。多个输出可以用于高级应用,如延迟着色(见7.5.4小节)。

WebGPU 程序负责创建管线并提供它们的许多配置细节。(幸运的是,许多细节可以通过剪切和粘贴的久经考验的方法来处理。)让我们看看我们第一个示例程序中的相对简单的例子。目标是在以下代码摘录的最后创建一个渲染管线。在此之前,程序创建了一些对象来指定管线配置:

let vertexBufferLayout = [ // 顶点缓冲区规范的数组。
    { 
        attributes: [ { shaderLocation:0, offset:0, format: "float32x2" } ],
        arrayStride: 8, 
        stepMode: "vertex"
    }
];

let uniformBindGroupLayout = device.createBindGroupLayout({
    entries: [ // 资源规范的数组。
    {
        binding: 0,
        visibility: GPUShaderStage.FRAGMENT,
        buffer: {
            type: "uniform"
        }
    }
]
});

let pipelineDescriptor = {
    vertex: { // 顶点着色器的配置。
        module: shader, 
        entryPoint: "vertexMain", 
        buffers: vertexBufferLayout 
    },
    fragment: { // 片段着色器的配置。
        module: shader, 
        entryPoint: "fragmentMain", 
        targets: [{
            format: navigator.gpu.getPreferredCanvasFormat()
        }]
    },
    primitive: {
        topology: "triangle-list"
    },
    layout: device.createPipelineLayout({
    bindGroupLayouts: [uniformBindGroupLayout]
    })
};

pipeline = device.createRenderPipeline(pipelineDescriptor);

(你可以在程序的源代码中阅读带有更多注释的相同代码。)

这里有很多内容!管线描述符的 vertex 和 fragment 属性描述了管线中使用的着色器。module 属性是包含着色器函数的编译着色器模块。entryPoint 属性提供了在着色器源代码中使用的着色器入口点函数的名称。buffers 和 targets 属性与顶点着色器函数的输入和片段着色器函数的输出有关。

顶点缓冲区和绑定组的“布局”指定了管线所需的输入。它们只指定了输入的结构。它们基本上创建了连接点,实际的输入源可以稍后插入。这允许一个管线通过提供不同的输入来绘制不同的内容。

注意在整个规范中使用数组。例如,管线可以配置为使用多个顶点缓冲区作为输入。顶点缓冲区布局是一个数组,其中数组的每个元素指定一个输入缓冲区。数组元素的索引很重要,因为它标识了相应缓冲区的连接点。索引将在稍后使用,用于连接实际的缓冲区。

同样,管线可以从多个绑定组接收输入。在这种情况下,绑定组的索引来自 pipelineDescriptor 中的 bindGroupLayouts 属性,当将实际的绑定组连接到管线时将需要该索引。索引也用于着色器程序中。例如,如果你回顾上面的着色器源代码,你会看到 uniform 变量声明带有 @group(0)。这意味着该变量的值将在 bindGroupLayouts 数组的索引 0 处的绑定组中找到。

此外,每个绑定组可以包含一系列资源,这些资源由该绑定组的绑定组布局的 entries 属性指定。一个条目可以为着色器中的全局变量提供值。在这种情况下,令人困惑的是,不是 entries 数组中条目的索引重要;相反,条目有一个 binding 属性来标识它。在示例程序中,uniform 变量声明上的双注释 @group(0) @binding(0) 表示该变量的值特别来自索引 0 处的绑定组中 binding 数字为 0 的条目。

管线还有输出,这些输出来自片段着色器入口点函数,管线需要这些输出目标的连接点。pipelineDescriptor 中的 targets 属性是一个数组,每个连接点都有一个条目。当着色器源代码使用 fn fragmentMain() -> @location(0) vec4f 定义片段着色器时,输出上的 @location(0) 注释表示该输出将发送到颜色附件编号 0,对应于 targets 数组中索引 0 处的元素。该元素中 format 属性的值指定输出将以适合画布颜色的格式。(系统将自动将着色器输出转换为画布格式,着色器输出使用每个颜色分量的 32 位浮点数,而画布格式使用每个分量的 8 位无符号整数。)

这解释了 pipelineDescriptor 的 primitive 属性:它指定了管线可以绘制的几何图元类型。拓扑指定了图元类型,在本例中是“triangle-list”。也就是说,当执行管线时,每组三个顶点将定义一个三角形。WebGPU 只有五种图元类型:“point-list”,“line-list”,“line-strip”,“triangle-list”和“triangle-strip”,对应于 WebGLOpenGL 中的 POINTS,LINES,LINE_STRIP,TRIANGLES 和 TRIANGLE_STRIP。下图显示了相同的六个顶点在每种拓扑中如何被解释(除了三角形的轮廓和线段的端点不会是实际输出的一部分):

五种 WebGPU 图元拓扑的图片。

(见3.1.1小节,了解有关如何渲染图元的更多讨论。)

每次绘制图像时,您不必创建一个新的管线。一个管线可以使用任意次数。它可以通过连接不同的输入源来绘制不同的内容。绘制单个图像可能需要几个管线,每个管线都可能执行多次。通常,程序在初始化期间创建管线并将它们存储在全局变量中。

In WebGPU, an image is produced as the output of a series of processing stages that make up a "render pipeline." The vertex shader and fragment shader are programmable stages in the pipeline, but there are other fixed function stages that are built into WebGPU. Input to the pipeline comes from data structures in the GPU. If the data originates on the JavaScript side of the application, it must be copied to the GPU before it can be used in the pipeline. Here is an illustration of the general structure of a render pipeline:

Data flow in a WebGPU render pipeline

This diagram shows two types of input to the pipeline, vertex buffers and bind groups. Recall that when a primitive is drawn, the vertex shader is called once for each vertex in the primitive. Each invocation of the vertex shader can get different values for the parameters in the vertex shader entry point function. Those values come from vertex buffers. The buffers must be loaded with values for the parameters for every vertex. A fixed function stage of the pipeline, shown as the dots between the vertex buffers and the vertex shader, calls the vertex shader once for each vertex, pulling the appropriate set of parameter values for that vertex from the buffers. (Vertex buffers also hold data for instanced drawing, which will be covered in the next section).

The vertex shader outputs some values, which must include the coordinates of the vertex but can also include other values such as color, texture coordinates, and normal vector for the vertex. Intermediate stages of the pipeline between the vertex shader and the fragment shader process the values in various ways. For example, the coordinates of the vertices are used to determine which pixels lie in the primitive. Coordinates for the pixels are computed by interpolating the vertex coordinates. Values like color and texture coordinates are also generally interpolated to get different values for each pixel. All these values are available as inputs to the fragment shader, which will be called once for each pixel in the primitive with appropriate values for its parameters.

Vertex buffers are special because of the way that they are used to supply vertex shader parameters. Other kinds of input are stored in the data structures called bind groups. Values from bind groups are made available to vertex and fragment shaders as global variables in the shader programs.

The fragment shader can output several values. The destinations for those values lie outside the pipeline and are referred to as the "color attachments" for the pipeline. In the most common case, there is just one output that represents the color to be assigned to the pixel, and the associated color attachment is the image that is being rendered (or, rather, the block of memory that holds the color data for that image). Multiple outputs can be used for advanced applications such as deferred shading (see Subsection 7.5.4).

A WebGPU program is responsible for creating pipelines and providing many details of their configuration. (Fortunately, a lot of the detail can be handled by the tried-and-true method of cut-and-paste.) Let's look at the relatively simple example from our first sample program. The goal is to create a render pipeline as the final step in the following code excerpt. Before that, the program creates some objects to specify the pipeline configuration:

let vertexBufferLayout = [ // An array of vertex buffer specifications.
    { 
        attributes: [ { shaderLocation:0, offset:0, format: "float32x2" } ],
        arrayStride: 8, 
        stepMode: "vertex"
    }
];

let uniformBindGroupLayout = device.createBindGroupLayout({
    entries: [ // An array of resource specifications.
        {
            binding: 0,
            visibility: GPUShaderStage.FRAGMENT,
            buffer: {
                type: "uniform"
            }
        }
    ]
});

let pipelineDescriptor = {
    vertex: { // Configuration for the vertex shader.
        module: shader, 
        entryPoint: "vertexMain", 
        buffers: vertexBufferLayout 
    },
    fragment: { // Configuration for the fragment shader.
        module: shader, 
        entryPoint: "fragmentMain", 
        targets: [{
            format: navigator.gpu.getPreferredCanvasFormat()
        }]
    },
    primitive: {
        topology: "triangle-list"
    },
    layout: device.createPipelineLayout({
        bindGroupLayouts: [uniformBindGroupLayout]
    })
};

pipeline = device.createRenderPipeline(pipelineDescriptor);

(You can read the same code with more comments in the source code for the program.)

There is a lot going on here! The vertex and fragment properties of the pipeline descriptor describe the shaders that are used in the pipeline. The module property is the compiled shader module that contains the shader function. The entryPoint property gives the name used for the shader entry point function in the shader source code. The buffers and targets properties are concerned with inputs for the vertex shader function and outputs from the fragment shader function.

The vertex buffer and bind group "layouts" specify what inputs will be required for the pipeline. They specify only the structure of the inputs. They basically create attachment points where actual input sources can be plugged in later. This allows one pipeline to draw different things by providing it with different inputs.

Note the use of arrays throughout the specification. For example, a pipeline can be configured to use multiple vertex buffers for input. The vertex buffer layout is an array, in which each element of the array specifies one input buffer. The index of an element in the array is important, since it identifies the attachment point for the corresponding buffer. The index will be used later, when attaching an actual buffer.

Similarly, a pipeline can take inputs from multiple bind groups. In this case, the index for a bind group comes from the bindGroupLayouts property in the pipelineDescriptor, and that index will be required when attaching an actual bind group to the pipeline. The index is also used in the shader program. For example, if you look back at the shader source code above, you'll see that the uniform variable declaration is annotated with @group(0). This means that the value for that variable will be found in the bind group at index 0 in the bindGroupLayouts array.

Furthermore, each bind group can hold a list of resources, which are specified by the entries property of the bind group layout for that bind group. An entry can provide the value for a global variable in the shader. In this case, confusingly, it is not the index of the entry in the entries array that is important; instead, the entry has a binding property to identify it. In the sample program, the double annotation @group(0) @binding(0) on the uniform variable declaration says that the value for the variable comes specifically from the entry with binding number 0 in the bind group at index 0.

The pipeline also has outputs, which come from the fragment shader entry point function, and the pipeline needs attachment points for the destinations of those outputs. The targets property in the pipelineDescriptor is an array with one entry for each attachment point. When the shader source code defines the fragment shader with fn fragmentMain() -> @location(0) vec4f, the annotation @location(0) on the output says that that output will be sent to color attachment number 0, corresponding to the element at index 0 in the targets array. The value for the format property in that element specifies that the output will be in the appropriate format for colors in a canvas. (The system will automatically translate the shader output, which uses a 32-bit float for each color component, into the canvas format, which uses an 8-bit unsigned integer for each component.)

That leaves the primitive property of the pipelineDescriptor to be explained: It specifies the kind of geometric primitive that the pipeline can draw. The topology specifies the primitive type, which in this example is "triangle-list." That is, when the pipeline is executed, each group of three vertices will define a triangle. WebGPU has only five primitive types: "point-list", "line-list", "line-strip", "triangle-list", and "triangle-strip", corresponding to POINTS, LINES, LINE_STRIP, TRIANGLES, and TRIANGLE_STRIP in WebGL or OpenGL. This illustration shows how the same six vertices would be interpreted in each topology (except that outlines of triangles and endpoints of line segments would not be part of the actual output):

Pictures of the five WebGPU primitive topologies.

(See Subsection 3.1.1 for more discussion of how primitives are rendered.)
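
For example, given six vertices v0 through v5, "line-list" would produce the three segments (v0,v1), (v2,v3), and (v4,v5), while "line-strip" would produce the five connected segments (v0,v1), (v1,v2), (v2,v3), (v3,v4), and (v4,v5). Similarly, "triangle-list" would produce the two triangles (v0,v1,v2) and (v3,v4,v5), while "triangle-strip" would produce the four triangles (v0,v1,v2), (v1,v2,v3), (v2,v3,v4), and (v3,v4,v5).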

You don't have to create a pipeline every time you draw an image. A pipeline can be used any number of times. It can be used to draw different things by attaching different input sources. Drawing a single image might require several pipelines, each of which might be executed several times. It is common for programs to create pipelines during initialization and store them in global variables.

9.1.4 Buffers

Inputs to a pipeline come from vertex buffers and from general purpose buffers and other resources in bind groups. (The other possible resources relate to textures, which we will not encounter until Section 9.5). You need to know how to create a buffer, fill it with data, and attach it to a pipeline.

The function device.createBuffer() is used for creating buffers. It takes a parameter that specifies the size of the buffer in bytes and how the buffer will be used. For example, the sample program creates a vertex buffer with

vertexBuffer = device.createBuffer({
    size: vertexCoords.byteLength,
    usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST
});

The purpose of a vertex buffer is to hold inputs for a vertex shader on the GPU side of the program. The data will come from a typed array, such as a Float32Array, or from a related JavaScript data type such as ArrayBuffer. In this case, vertexCoords is a Float32Array that holds the xy-coordinates of the vertices of a triangle, and vertexCoords.byteLength gives the number of bytes in that array. (Alternatively, the size could be specified as 4*vertexCoords.length or as the constant 24.)

The usage property in this example says that the buffer is a vertex buffer and that it can be used as a destination for copying data. The value for the usage can be given as a usage constant such as GPUBufferUsage.VERTEX or by the bitwise OR of several such constants.

The program also uses a buffer to hold the value for the uniform color variable in the shader. The color value consists of three four-byte floats, and the buffer can be created with

uniformBuffer = device.createBuffer({
    size: 3*4,
    usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST
});

Only vertex buffers are attached directly to pipelines. Other buffers must be part of a bind group that is attached to the pipeline. The sample program creates a bind group to hold uniformBuffer:

uniformBindGroup = device.createBindGroup({
    layout: uniformBindGroupLayout,
    entries: [ 
        {
            binding: 0, // Corresponds to binding 0 in the layout.
            resource: { buffer: uniformBuffer, offset: 0, size: 3*4 }
        }
    ]
});

Recall that uniformBindGroupLayout was created to specify the structure of the bind group. The bind group layout has entries that specify resources; a corresponding bind group has entries that provide the actual resources. The resource in this case is a buffer. The offset and size properties of the resource make it possible to use just a segment of a buffer; offset is the starting byte number of the segment, and size is the number of bytes in the segment.

To be useful, a buffer must be loaded with data. The buffer exists on the GPU side of the program. For data that originates on the JavaScript side, the function device.queue.writeBuffer() is the easiest way to copy the data into a GPU buffer. For example, the function call

device.queue.writeBuffer(vertexBuffer, 0, vertexCoords);

copies the entire contents of the vertexCoords array into vertexBuffer, starting at byte number 0 in the buffer. It is possible to copy a subarray of a typed array to any position in the buffer. The general form is

device.queue.writeBuffer(buffer,startByte,array,startIndex,count)

where count gives the number of elements of array to be copied into buffer. (This is when the data source is a typed array; for other data sources, the starting position in the source and the size of the data to be copied are measured in bytes.)
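
For example, a call such as the following (with made-up numbers) would copy four elements of vertexCoords, starting at index 2 of the array, into vertexBuffer, starting at byte number 8 of the buffer:

device.queue.writeBuffer( vertexBuffer, 8, vertexCoords, 2, 4 );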

In the sample program, the buffers and bind group are created just once, during initialization. And vertexBuffer and uniformBuffer are global variables—vertexBuffer because it must be attached to the pipeline each time the pipeline is used to draw a triangle, and uniformBuffer so that the data stored in it can be changed. A new value is written to uniformBuffer every time the color of the triangle is to be changed. Similarly, uniformBindGroup is a global variable because it must be attached to the pipeline each time a triangle is drawn.
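
For example, the color could be changed to red with a call along these lines (a sketch; not necessarily the exact code used in the program):

device.queue.writeBuffer( uniformBuffer, 0, new Float32Array( [1, 0, 0] ) );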


It is interesting to think about why the writeBuffer() function is a method in the object device.queue. The queue in question is a queue of operations to be performed on the GPU. When writeBuffer() returns, it is not necessarily true that the data has been written to the buffer. However, the operation that does the copying has been added to the queue. What you are guaranteed is that the data will be copied to the buffer before it is needed by operations that come later in the queue. That can include drawing operations that use the buffer. It is also possible that the queue already contains operations that depend on the previous value in the buffer, so the new data can't be copied into the buffer until those operations have completed.

When device.queue.writeBuffer() is called, it immediately copies the data into an intermediate "staging" buffer that exists in memory that is shared by the JavaScript and GPU sides. This means that you are free to reuse the array immediately; you don't have to wait for the data to be copied to its final destination. Instead of calling writeBuffer(), it's possible to do the work yourself—create a staging buffer, copy the data into the staging buffer, enqueue a command to copy the data from the staging buffer to the destination buffer—but writeBuffer() makes the process much easier.
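
For the record, the manual version might look something like the following sketch (writeBuffer() is almost always the better choice; the variable names here are just for illustration):

// Create a staging buffer that the JavaScript side can write into directly.
let stagingBuffer = device.createBuffer({
    size: vertexCoords.byteLength,
    usage: GPUBufferUsage.MAP_WRITE | GPUBufferUsage.COPY_SRC,
    mappedAtCreation: true  // the buffer starts out mapped for writing
});

// Copy the data into the staging buffer, then unmap it.
new Float32Array( stagingBuffer.getMappedRange() ).set( vertexCoords );
stagingBuffer.unmap();

// Enqueue a command that copies from the staging buffer into the vertex buffer.
let encoder = device.createCommandEncoder();
encoder.copyBufferToBuffer( stagingBuffer, 0, vertexBuffer, 0, vertexCoords.byteLength );
device.queue.submit( [ encoder.finish() ] );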

9.1.5 Drawing

With the pipeline set up and the input buffers ready, it's time to actually draw the triangle! The drawing commands are specified on the JavaScript side but executed on the GPU side. A "command encoder" is used on the JavaScript side to create a list of commands in a form that can be added to the queue of commands for processing on the GPU. The command encoder is created by the WebGPU device:

let commandEncoder = device.createCommandEncoder();

For drawing, we need to encode a "render pass," and for that, we need a render pass descriptor:

let renderPassDescriptor = {
    colorAttachments: [{
        clearValue: { r: 0.5, g: 0.5, b: 0.5, a: 1 },  // gray background
        loadOp: "clear", // Alternative is "load".
        storeOp: "store",  // Alternative is "discard".
        view: context.getCurrentTexture().createView()  // Draw to the canvas.
    }]
};

The colorAttachments property of the renderPassDescriptor corresponds to the output targets of the pipeline. Each element of the colorAttachments array specifies the destination for the corresponding element in the array of output targets. In this case, we want to draw to the canvas on the web page. The value for the loadOp property is "clear" if the canvas is to be filled with the clear color before drawing; it is "load" if you want to draw over the previous contents of the canvas. The clearValue gives the RGBA components of the clear color as floating point values in the range 0.0 to 1.0. The storeOp will almost always be "store". The view property specifies where the image will be drawn. In this case, the ultimate destination is the canvas, but the actual destination is a texture that will be copied to the canvas when the content of the web page is refreshed. The function context.getCurrentTexture() has to be called each time the canvas is redrawn, so we can't simply make a render pass descriptor and use it unchanged for every render.

The drawing commands themselves are encoded by a render pass encoder, which is obtained from the command encoder. The pass encoder in our example assembles the resources required for the drawing (pipeline, vertex buffer, and bind group), and it issues the command that actually does the drawing. A call to passEncoder.end() terminates the render pass:

let passEncoder = commandEncoder.beginRenderPass(renderPassDescriptor);
passEncoder.setPipeline(pipeline);            // Specify pipeline.
passEncoder.setVertexBuffer(0,vertexBuffer);  // Attach vertex buffer.
passEncoder.setBindGroup(0,uniformBindGroup); // Attach bind group.
passEncoder.draw(3);                          // Generate vertices.
passEncoder.end();

The draw command in this case, passEncoder.draw(3), will simply generate three vertices when it is executed. Since the pipeline uses the "triangle-list" topology, those vertices form a triangle. The vertex shader function, which was specified as part of the pipeline, will be called three times, with inputs that are pulled from the vertex buffer. The outputs from the three invocations of the vertex shader specify the positions of the three vertices of a triangle. The fragment shader function is then called for each pixel in the triangle. The fragment shader gets the color for the pixel from the uniform buffer that is part of the bind group. All the set up that was done earlier in the program will finally be used to produce an image! This is a simple example. More generally, a render pass can involve other options, multiple draw commands, and other commands.

You should note that all of this has not actually done any drawing! It has just encoded the commands that are needed to do the drawing, and has added them to the command encoder. The final step is to get the list of encoded commands from the command encoder and submit them to the GPU for execution:

let commandBuffer = commandEncoder.finish();
device.queue.submit( [ commandBuffer ] );

The parameter to device.queue.submit() is an array of command buffers, although in this case there is only one. (The command encoder cannot be reused; if you want to submit multiple command buffers, you will need to create a new command encoder for each one.)

Note that commands are submitted to the device queue. The submit() function returns immediately after enqueueing the commands. They will be executed in a separate process on the GPU side of the application.
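
Putting the pieces together, the code for rendering a frame might be packaged into a function something like this (a sketch, not necessarily exactly how the sample program organizes its code):

function draw() {
    let commandEncoder = device.createCommandEncoder();
    let renderPassDescriptor = {
        colorAttachments: [{
            clearValue: { r: 0.5, g: 0.5, b: 0.5, a: 1 },
            loadOp: "clear",
            storeOp: "store",
            view: context.getCurrentTexture().createView()  // a fresh view, every time
        }]
    };
    let passEncoder = commandEncoder.beginRenderPass(renderPassDescriptor);
    passEncoder.setPipeline(pipeline);
    passEncoder.setVertexBuffer(0, vertexBuffer);
    passEncoder.setBindGroup(0, uniformBindGroup);
    passEncoder.draw(3);
    passEncoder.end();
    device.queue.submit( [ commandEncoder.finish() ] );
}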

9.1.6 Multiple Vertex Inputs

Before ending this section, we look at two variations on our basic example: basic_webgpu_2.html and basic_webgpu_3.html. Instead of drawing a solid colored triangle, these programs draw a triangle in which each vertex has a different color. The colors for the interior pixels are interpolated from the vertex colors. This is the standard "RGB triangle" example.


Since each vertex has a different color, the color is a vertex attribute that has to be passed as a parameter to the vertex shader entry point. In the new examples, that function has two parameters, the 2D vertex coordinates and the vertex RGB color. Interpolated versions of these two values are used by the fragment shader, so the vertex shader also needs two outputs. Since a function can have only one return value, the two outputs have to be combined into a single data structure. In WGSL, as in GLSL, that data structure is a struct (see Subsection 6.3.2). Here is the shader source code that is used in both of the new examples:

struct VertexOutput {  // type for return value of vertex shader
    @builtin(position) position: vec4f,
    @location(0) color : vec3f  
}

@vertex
fn vertexMain(
        @location(0) coords : vec2f, 
        @location(1) color : vec3f  
    ) -> VertexOutput {  
    var output: VertexOutput;  
    output.position = vec4f( coords, 0, 1 );
    output.color = color; 
    return output;
}

@fragment
fn fragmentMain(@location(0) fragColor : vec3f) -> @location(0) vec4f {
    return vec4f(fragColor, 1);
}

The fragColor parameter to the fragment shader function is the interpolated version of the color output from the vertex shader, even though the name is not the same. In fact, the names don't matter at all; the association between the two values is specified by the @location(0) modifier on both the vertex shader output, color, and the fragment shader parameter, fragColor. Note that the meaning of @location(0) here is very different from the @location(0) annotation on the vertex shader parameter, coords. (Recall that a @location annotation on a vertex shader parameter corresponds to a shaderLocation in the vertex buffer layout on the JavaScript side, and it specifies where the values for that parameter come from.)

I will note again that even though the position output from the vertex shader is not used explicitly in the fragment shader function in this example, it is used implicitly. A vertex shader function is always required to have a @builtin(position) output.


The JavaScript side of the application must now provide two inputs for the vertex shader function. In the first variation, the two inputs are provided in two separate vertex buffers, and the new vertex buffer layout reflects this, with two array elements corresponding to the two vertex buffers:

let vertexBufferLayout = [
    { // First vertex buffer, for coords (two 32-bit floats per vertex).
        attributes: [ { shaderLocation:0, offset:0, format: "float32x2" } ],
        arrayStride: 8,  // 8 bytes between values in the buffer
        stepMode: "vertex" 
    },
    { // Second vertex buffer, for colors (three 32-bit floats per vertex).
        attributes: [ { shaderLocation:1, offset:0, format: "float32x3" } ],
        arrayStride: 12,  // 12 bytes between values in the buffer
        stepMode: "vertex" 
    }
];
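
When drawing with this version of the pipeline, each of the two buffers has to be attached to its own slot in the render pass encoder. Assuming the coordinate and color data have been copied into GPU buffers named coordsBuffer and colorBuffer (hypothetical names), the attachment would look like:

passEncoder.setVertexBuffer( 0, coordsBuffer );  // buffer for the coords parameter (slot 0)
passEncoder.setVertexBuffer( 1, colorBuffer );   // buffer for the color parameter (slot 1)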

The second variation does something more interesting: It uses just one vertex buffer that contains the values for both parameters. The values for the colors are interleaved with the values for the coordinates. Here is what the data looks like on the JavaScript side:

const vertexData = new Float32Array([
   /* coords */      /* color */
    -0.8, -0.6,       1, 0, 0,      // data for first vertex
     0.8, -0.6,       0, 1, 0,      // data for second vertex
     0.0,  0.7,       0, 0, 1       // data for third vertex
]);

This array will be copied into the single vertex buffer. The vertex buffer layout reflects the layout of the data in the buffer:

let vertexBufferLayout = [
    {   // One vertex buffer, containing values for two attributes.
        attributes: [
            { shaderLocation:0, offset:0, format: "float32x2" },
            { shaderLocation:1, offset:8, format: "float32x3" }
        ],
        arrayStride: 20,
        stepMode: "vertex" 
    }
];

Note that the data for each vertex takes up 20 bytes (five 4-byte floats). This becomes the arrayStride in the layout, which gives the distance, in bytes, from the values for one vertex to the values for the next vertex. The offset property for an attribute tells where to find the value for that attribute within the block of data for a given vertex: The offset for coords is 0 because it is found at the start of the data; the offset for color is 8 because it is found 8 bytes from the start of the data.
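
The interleaved buffer itself can be created and filled in the same way as before. A sketch, using the vertexData array shown above:

vertexBuffer = device.createBuffer({
    size: vertexData.byteLength,  // 3 vertices times 20 bytes, or 60 bytes
    usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST
});
device.queue.writeBuffer( vertexBuffer, 0, vertexData );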

There are other differences between our first example and the two new variations. I encourage you to look at the source code for the two new programs and read the comments. Only the new features of each program are commented.

9.1.7 Auto Bind Group Layout

One final note. A bind group layout contains information about each binding in the group: what kind of resource the binding refers to and which shader stage it is used in. In general, that information can be deduced from the shader program. The full shader program is assembled when the pipeline is created, and the pipeline can automatically construct the bind group layouts that it uses. You can ask the pipeline to create the bind group layouts by setting the layout property of the pipeline descriptor to "auto":

pipelineDescriptor = {
    .
    .
    .
layout: "auto"
};
pipeline = device.createRenderPipeline( pipelineDescriptor );

You can then use the function pipeline.getBindGroupLayout(N), where N is the bind group number, to get the layout from the pipeline. The layout is needed to create the actual bind group:

bndGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [ 
        .
        .
        .
I will use auto bind group layout in most of my examples from now on, but I will occasionally specify the layout myself, to show what it looks like for various kinds of resources.