7.8 内容流和资源¶
Content Streams and Resources
7.8.1 概述¶
General
内容流是描述页面外观和其他图形元素的主要方式。内容流依赖于相关资源字典中包含的信息;这两个对象结合起来形成一个自包含的实体。本小节描述了这些对象。
Content streams are the primary means for describing the appearance of pages and other graphical elements. A content stream depends on information contained in an associated resource dictionary; in combination, these two objects form a self-contained entity. This sub-clause describes these objects.
7.8.2 内容流¶
Content Streams
PDF中的内容流是一个PDF流对象,其数据由一系列指令组成,这些指令描述了要在页面上绘制的图形元素。这些指令应以PDF对象的形式表示,使用与PDF文档其余部分相同的对象语法。然而,与作为整体的文档是一个静态的、随机访问的数据结构不同,内容流中的对象应被解释并按顺序执行。
文档的每一页都由一个或多个内容流表示。内容流也用于将一系列指令打包为自包含的图形元素,例如表单(见8.10,“表单X对象”)、模式(见8.7,“模式”)、某些字体(见9.6.5,“类型3字体”)和注释外观(见12.5.5,“外观流”)。
内容流在用任何指定的过滤器解码后,应根据7.2中描述的PDF语法规则进行解释,“词法约定”。它由PDF对象组成,表示操作数和操作符。操作数是直接对象,可以是除流之外的任何基本PDF数据类型。某些特定操作符允许字典作为操作数。间接对象和对象引用则根本不允许。
操作符是一个PDF关键字,指定要执行的某些操作,例如在页面上绘制图形形状。操作符关键字应通过缺少初始斜杠字符(2Fh)(/)与名称对象区分开来。操作符仅在内容流内部才有意义。
NOTE 1
这种后缀表示法,其中操作符由其操作数前置,表面上与PostScript语言相同。然而,PDF没有像PostScript那样的操作数栈的概念。
在PDF中,操作符所需的所有操作数应立即出现在该操作符之前。操作符不返回结果,当操作符完成执行时,操作数也不应有剩余。
NOTE 2
大多数操作符都与在页面上绘制图形元素或指定影响后续绘制操作的参数有关。各个操作符在专门讨论其功能的子条款中进行了描述:
第8条款,“图形”描述了绘制一般图形的操作符,例如填充区域、描边和采样图像,以及指定设备无关图形参数的操作符,如颜色。
第9条款,“文本”描述了使用字体中定义的字符字形绘制文本的操作符。
第10条款,“渲染”描述了指定设备相关渲染参数的操作符。
第14条款,“文档交换”描述了标记内容操作符,这些操作符将更高级别的逻辑信息与内容流中的对象关联起来。这些操作符不影响内容的渲染外观;它们指定了对使用PDF进行文档交换的应用程序有用的信息。
通常,当一个符合规范的阅读器在内容流中遇到一个它不认识的操作符时,会发生错误。一对兼容性操作符BX和EX(PDF 1.1)将修改这种行为(见表32)。这些操作符应成对出现,并且可以嵌套。它们标记了一个兼容性部分,在这部分内容流中,不认识的操作符将被忽略,而不会出错。这种机制使符合规范的编写器能够在不牺牲与旧应用程序的兼容性的情况下,使用在PDF的后续版本中定义的操作符。它应该仅在忽略这些较新的操作符是适当的情况下使用。BX和EX操作符本身不是任何图形对象(见8.2,“图形对象”)或图形状态(8.4,“图形状态”)的一部分。
Operands | Operator | Description |
---|---|---|
-- | BX | (PDF 1.1) 开始一个兼容性部分。在遇到平衡的EX操作符之前,不认识的操作符(连同它们的操作数)将被忽略,而不会出错。 |
-- | EX | (PDF 1.1) 结束由平衡的BX操作符开始的兼容性部分。忽略从先前的匹配BX开始的任何不认识的操作数和操作符。 |
A content stream is a PDF stream object whose data consists of a sequence of instructions describing th graphical elements to be painted on a page. The instructions shall be represented in the form of PDF objects using the same object syntax as in the rest of the PDF document. However, whereas the document as a whol is a static, random-access data structure, the objects in the content stream shall be interpreted and acted upo sequentially.
Each page of a document shall be represented by one or more content streams. Content streams shall also be used to package sequences of instructions as self-contained graphical elements, such as forms (see 8.10, "Form XObjects"), patterns (8.7, "Patterns"), certain fonts (9.6.5, "Type 3 Fonts"), and annotation appearances (12.5.5, "Appearance Streams").
A content stream, after decoding with any specified filters, shall be interpreted according to the PDF syntax rules described in 7.2, "Lexical Conventions." It consists of PDF objects denoting operands and operators. The operands needed by an operator shall precede it in the stream. See EXAMPLE 4 in 7.4, "Filters," for an example of a content stream.
An operand is a direct object belonging to any of the basic PDF data types except a stream. Dictionaries shall be permitted as operands only by certain specific operators. Indirect objects and object references shall not be permitted at all.
An operator is a PDF keyword specifying some action that shall be performed, such as painting a graphical shape on the page. An operator keyword shall be distinguished from a name object by the absence of an initial SOLIDUS character (2Fh) (/ ). Operators shall be meaningful only inside a content stream.
NOTE 1
This postfix notation, in which an operator is preceded by its operands, is superficially the same as in the PostScript language. However, PDF has no concept of an operand stack as PostScript has.
In PDF, all of the operands needed by an operator shall immediately precede that operator. Operators do not return results, and operands shall not be left over when an operator finishes execution.
NOTE 2
Most operators have to do with painting graphical elements on the page or with specifying parameters that affect subsequent painting operations. The individual operators are described in the clauses devoted to their functions:
Clause 8, "Graphics" describes operators that paint general graphics, such as filled areas, strokes, and sampled images, and that specify device-independent graphical parameters, such as colour.
Clause 9, "Text" describes operators that paint text using character glyphs defined in fonts.
Clause 10, "Rendering" describes operators that specify device-dependent rendering parameters.
Clause 14, "Document Interchange" describes the marked-content operators that associate higher-level logical information with objects in the content stream. These operators do not affect the rendered appearance of the content; they specify information useful to applications that use PDF for document interchange.
Ordinarily, when a conforming reader encounters an operator in a content stream that it does not recognize, an error shall occur. A pair of compatibility operators, BX and EX (PDF 1.1), shall modify this behaviour (see Table 32). These operators shall occur in pairs and may be nested. They bracket a compatibility section, a portion of a content stream within which unrecognized operators shall be ignored without error. This mechanism enables a conforming writer to use operators defined in later versions of PDF without sacrificing compatibility with older applications. It should be used only in cases where ignoring such newer operators is the appropriate thing to do. The BX and EX operators are not themselves part of any graphics object (see 8.2, "Graphics Objects") or of the graphics state (8.4, "Graphics State").
Operands | Operator | Description |
---|---|---|
-- | BX | (PDF 1.1) Begin a compatibility section. Unrecognized operators (along with their operands) shall be ignored without error until the balancing EX operator is encountered. |
-- | EX | (PDF 1.1) End a compatibility section begun by a balancing BX operator. Ignore any unrecognized operands and operators from previous matching BX onward. |
7.8.3 资源字典¶
Resource Dictionaries
如上所述,内容流中的操作符操作数只能是直接对象;不允许间接对象和对象引用。在某些情况下,操作符需引用在内容流外定义的 PDF 对象,如字体字典或包含图像数据的流。这应通过定义这些对象为命名资源并在内容流中按名称引用来实现。
命名资源仅在内容流的上下文中有意义。资源名称的范围应局限于特定内容流,并且与外部已知的对象标识符(如字体)无关。从一个内容流外的对象引用另一个内容流外的对象应通过间接对象引用而不是命名资源来进行。
内容流的命名资源应由资源字典(resource dictionary)定义,该字典枚举内容流中操作符所需的命名资源及其引用名称。
示例 1
如果内容流中的文本操作符需要某个字体,则内容流的资源字典可以将名称 F42 与相应的字体字典关联。文本操作符可以使用该名称来引用字体。
资源字典应通过以下方式之一与内容流关联:
- 对于作为页面的 Contents 条目的值(或作为该条目值的数组元素)的内容流,资源字典应由页面字典的 Resources 指定或由页面对象的某个祖先节点继承,如7.7.3.4 “页面属性的继承”所述。
- 对于其他内容流,符合标准的编写器应在流的字典中包含一个 Resources 条目,指定包含该内容流使用的所有资源的资源字典。这适用于定义表单 XObjects、图案(patterns)、Type 3 字体和注释(annotation)的内容流。
- 遵循早期版本 PDF 编写的文件可能在所有页面使用的表单 XObjects 和 Type 3 字体中省略了 Resources 条目。所有从这些表单和字体引用的资源应从使用它们的页面的资源字典继承。这种构造已过时,不应被符合标准的编写器使用。
在给定内容流的上下文中,术语当前资源字典指以上述方式之一与内容流关联的资源字典。
资源字典中的每个键应为资源类型的名称,如表33所示。相应的值如下:
- 对于资源类型 ProcSet,其值应为过程集名称的数组。
- 对于所有其他资源类型,其值应为子字典。子字典中的每个键应为特定资源的名称,相应的值应为与该名称关联的 PDF 对象。
操作数 | 操作符 | 描述 |
---|---|---|
ExtGState | dictionary | (可选)将资源名称映射到图形状态参数字典的字典(见 8.4.5,"图形状态参数字典")。 |
ColorSpace | dictionary | (可选)将每个资源名称映射到设备相关的颜色空间名称或描述颜色空间的数组的字典(见 8.6,"颜色空间")。 |
Pattern | dictionary | (可选)将资源名称映射到图案对象的字典(见 8.7,"图案")。 |
Shading | dictionary | (可选 PDF 1.3)将资源名称映射到着色字典的字典(见 8.7.4.3,"着色字典")。 |
XObject | dictionary | (可选)将资源名称映射到外部对象的字典(见 8.8,"外部对象")。 |
Font | dictionary | (可选)将资源名称映射到字体字典的字典(见第 9 条,"文本")。 |
ProcSet | array | (可选)预定义过程集名称的数组(见 14.2,"过程集")。 |
Properties | dictionary | (可选;PDF 1.2)将资源名称映射到标记内容的属性列表字典的字典(见 14.6.2,"属性列表")。 |
EXAMPLE 2
下面是一个包含过程集、字体和外部对象的资源字典示例。过程集通过一个数组指定,如14.2“过程集”中所述。字体通过一个子字典指定,将名称 F5、F6、F7 和 F8 分别与对象 6、8、10 和 12 关联起来。同样,XObject 子字典将名称 Im1 和 Im2 分别与对象 13 和 15 关联起来。
<</ProcSet [ /PDF /ImageB ]
/Font << /F5 6 0 R
/F6 8 0 R
/F7 10 0 R
/F8 12 0 R
>>
/XObject << /Im1 13 0 R
/Im2 15 0 R
>>
>>
chat-gpt 举例:
<<
/ProcSet [/PDF /Text /ImageB /ImageC /ImageI]
/Font <<
/F5 6 0 R
/F6 8 0 R
/F7 10 0 R
/F8 12 0 R
>>
/XObject <<
/Im1 13 0 R
/Im2 15 0 R
>>
>>
在这个示例中:
- /ProcSet 条目包含一个过程集数组,指定了文档使用的过程集。
- /Font 条目包含一个子字典,将字体名称(F5、F6、F7、F8)映射到相应的对象引用(6 0 R、8 0 R、10 0 R、12 0 R)。
- /XObject 条目包含一个子字典,将外部对象名称(Im1、Im2)映射到相应的对象引用(13 0 R、15 0 R)。
这样定义的资源字典允许内容流中的操作符引用这些资源名称来访问相应的对象。
As stated above, the operands supplied to operators in a content stream shall only be direct objects; indirect objects and object references shall not be permitted. In some cases, an operator shall refer to a PDF object that is defined outside the content stream, such as a font dictionary or a stream containing image data. This shall be accomplished by defining such objects as named resources and referring to them by name from within the content stream.
Named resources shall be meaningful only in the context of a content stream. The scope of a resource name shall be local to a particular content stream and shall be unrelated to externally known identifiers for objects such as fonts. References from one object outside of content streams to another outside of content streams shall be made by means of indirect object references rather than named resources.
A content stream’s named resources shall be defined by a resource dictionary, which shall enumerate the named resources needed by the operators in the content stream and the names by which they can be referred to.
EXAMPLE 1
If a text operator appearing within the content stream needs a certain font, the content stream’s resource dictionary can associate the name F42 with the corresponding font dictionary. The text operator can use this name to refer to the font.
A resource dictionary shall be associated with a content stream in one of the following ways:
- For a content stream that is the value of a page’s Contents entry (or is an element of an array that is the value of that entry), the resource dictionary shall be designated by the page dictionary’s Resources or is inherited, as described under 7.7.3.4, "Inheritance of Page Attributes," from some ancestor node of the page object.
- For other content streams, a conforming writer shall include a Resources entry in the stream's dictionary specifying the resource dictionary which contains all the resources used by that content stream. This shall apply to content streams that define form XObjects, patterns, Type 3 fonts, and annotation.
- PDF files written obeying earlier versions of PDF may have omitted the Resources entry in all form XObjects and Type 3 fonts used on a page. All resources that are referenced from those forms and fonts shall be inherited from the resource dictionary of the page on which they are used. This construct is obsolete and should not be used by conforming writers.
In the context of a given content stream, the term current resource dictionary refers to the resource dictionary associated with the stream in one of the ways described above.
Each key in a resource dictionary shall be the name of a resource type, as shown in Table 33. The corresponding values shall be as follows:
- For resource type ProcSet, the value shall be an array of procedure set names
- For all other resource types, the value shall be a subdictionary. Each key in the subdictionary shall be the name of a specific resource, and the corresponding value shall be a PDF object associated with the name.
Operands | Operator | Description |
---|---|---|
ExtGState | dictionary | (Optional) A dictionary that maps resource names to graphics state parameter dictionaries (see 8.4.5, "Graphics State Parameter Dictionaries"). |
ColorSpace | dictionary | (Optional) A dictionary that maps each resource name to either the name of a device-dependent colour space or an array describing a colour space (see 8.6, "Colour Spaces"). |
Pattern | dictionary | (Optional) A dictionary that maps resource names to pattern objects (see 8.7, "Patterns"). |
Shading | dictionary | (Optional PDF 1.3) A dictionary that maps resource names to shading dictionaries (see 8.7.4.3, "Shading Dictionaries"). |
XObject | dictionary | (Optional) A dictionary that maps resource names to external objects (see 8.8, "External Objects"). |
Font | dictionary | (Optional) A dictionary that maps resource names to font dictionaries (see clause 9, "Text"). |
ProcSet | array | (Optional) An array of predefined procedure set names (see 14.2, "Procedure Sets"). |
Properties | dictionary | (Optional; PDF 1.2) A dictionary that maps resource names to property list dictionaries for marked content (see 14.6.2, "Property Lists"). |
EXAMPLE 2
The following shows a resource dictionary containing procedure sets, fonts, and external objects. The procedure sets are specified by an array, as described in 14.2, "Procedure Sets". The fonts are specified with a subdictionary associating the names F5, F6, F7, and F8 with objects 6, 8, 10, and 12, respectively. Likewise, the XObject subdictionary associates the names Im1 and Im2 with objects 13 and 15, respectively.
<</ProcSet [ /PDF /ImageB ]
/Font << /F5 6 0 R
/F6 8 0 R
/F7 10 0 R
/F8 12 0 R
>>
/XObject << /Im1 13 0 R
/Im2 15 0 R
>>
>>