Generic PDF Objects

Implementation of generic PDF objects (dictionary, number, string, ...).

class pypdf.generic.BooleanObject(value: Any)

Bases: PdfObject

clone(pdf_dest: PdfWriterProtocol, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → BooleanObject: Clone object into pdf_dest.

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

static read_from_stream(stream: IO[Any]) → BooleanObject

class pypdf.generic.FloatObject(value: Any = '0.0', context: Any | None = None)

Bases: float, PdfObject

clone(pdf_dest: Any, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → FloatObject: Clone object into pdf_dest.

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

myrepr() → str

as_numeric() → float

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

class pypdf.generic.NumberObject(value: Any)

Bases: int, PdfObject

NumberPattern = re.compile(b'[^+-.0-9]')

clone(pdf_dest: Any, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → NumberObject: Clone object into pdf_dest.

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

as_numeric() → int

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

static read_from_stream(stream: IO[Any]) → NumberObject | FloatObject

class pypdf.generic.NameObject

Bases: str, PdfObject

delimiter_pattern = re.compile(b'\\s+|[\\(\\)<>\\[\\]{}/%]')

surfix = b'/'

renumber_table: ClassVar[Dict[str, bytes]] = {'\x00': b'#00', '\x01': b'#01', '\x02': b'#02', '\x03': b'#03', '\x04': b'#04', '\x05': b'#05', '\x06': b'#06', '\x07': b'#07', '\x08': b'#08', '\t': b'#09', '\n': b'#0A', '\x0b': b'#0B', '\x0c': b'#0C', '\r': b'#0D', '\x0e': b'#0E', '\x0f': b'#0F', '\x10': b'#10', '\x11': b'#11', '\x12': b'#12', '\x13': b'#13', '\x14': b'#14', '\x15': b'#15', '\x16': b'#16', '\x17': b'#17', '\x18': b'#18', '\x19': b'#19', '\x1a': b'#1A', '\x1b': b'#1B', '\x1c': b'#1C', '\x1d': b'#1D', '\x1e': b'#1E', '\x1f': b'#1F', ' ': b'#20', '#': b'#23', '%': b'#25', '(': b'#28', ')': b'#29', '/': b'#2F'}

clone(pdf_dest: Any, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → NameObject: Clone object into pdf_dest.

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

renumber() → bytes

static unnumber(sin: bytes) → bytes

CHARSETS = ('utf-8', 'gbk', 'latin1')

static read_from_stream(stream: IO[Any], pdf: Any) → NameObject
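The renumber_table above maps characters that cannot appear literally in a PDF name to their `#XX` hexadecimal form. The following is a minimal, standard-library-only sketch of that kind of escaping and its inverse. It is not pypdf's implementation: `ESCAPES`, `escape_name`, and `unescape_name` are hypothetical names, and hex-escaping characters above `~` byte by byte is an assumption based on the PDF specification's recommendation for name objects.

```python
# Hypothetical stand-in for a table like renumber_table: control characters,
# space, and the delimiter characters '#', '%', '(', ')', '/'.
ESCAPES = {chr(i): f"#{i:02X}".encode("ascii") for i in range(0x21)}
ESCAPES.update({c: f"#{ord(c):02X}".encode("ascii") for c in "#%()/"})

HEX_DIGITS = b"0123456789abcdefABCDEF"


def escape_name(name: str) -> bytes:
    """Escape a name such as '/My Name' into bytes using #XX sequences."""
    out = name[:1].encode("utf-8")  # the leading '/' marker is kept as-is
    for ch in name[1:]:
        if ch > "~":
            # Assumption: non-ASCII characters are written as escaped UTF-8 bytes.
            out += b"".join(f"#{b:02X}".encode("ascii") for b in ch.encode("utf-8"))
        else:
            out += ESCAPES.get(ch, ch.encode("ascii"))
    return out


def unescape_name(raw: bytes) -> bytes:
    """Replace every well-formed #XX sequence with the byte it encodes."""
    out = bytearray()
    i = 0
    while i < len(raw):
        chunk = raw[i + 1:i + 3]
        if raw[i:i + 1] == b"#" and len(chunk) == 2 and all(c in HEX_DIGITS for c in chunk):
            out.append(int(chunk, 16))
            i += 3
        else:
            out.append(raw[i])  # ordinary byte, or a '#' not followed by two hex digits
            i += 1
    return bytes(out)
```

In this sketch a malformed escape (a `#` without two hex digits after it) is passed through unchanged rather than rejected; a stricter reader might raise instead.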

class pypdf.generic.IndirectObject(idnum: int, generation: int, pdf: Any)

Bases: PdfObject

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

replicate(pdf_dest: PdfWriterProtocol) → PdfObject

clone(pdf_dest: PdfWriterProtocol, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → IndirectObject: Clone object into pdf_dest.

property indirect_reference: IndirectObject

get_object() → PdfObject | None

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

static read_from_stream(stream: IO[Any], pdf: Any) → IndirectObject

class pypdf.generic.NullObject(*args, **kwargs)

Bases: PdfObject

clone(pdf_dest: PdfWriterProtocol, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → NullObject: Clone object into pdf_dest.

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

static read_from_stream(stream: IO[Any]) → NullObject

class pypdf.generic.PdfObject(*args, **kwargs)

Bases: PdfObjectProtocol

hash_func(*, usedforsecurity=True): Returns a sha1 hash object; optionally initialized with a string

indirect_reference: IndirectObject | None

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

hash_value_data() → bytes

hash_value() → bytes

replicate(pdf_dest: PdfWriterProtocol) → PdfObject

Clone object into pdf_dest (PdfWriterProtocol, which is an interface for PdfWriter) without ensuring links. This is used in clone_document_from_root with incremental = True.

Parameters: pdf_dest -- Target to clone to.
Returns: The cloned PdfObject

clone(pdf_dest: PdfWriterProtocol, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → PdfObject

Clone object into pdf_dest (PdfWriterProtocol, which is an interface for PdfWriter).

By default, this method will call _reference_clone (see _reference).

Parameters:

pdf_dest -- Target to clone to.
force_duplicate -- By default, if the object has already been cloned and referenced, the existing copy is returned; when True, a new copy is created. (Default value = False)
ignore_fields -- List/tuple of field names (for dictionaries) that will be ignored during cloning (applies to children duplication as well). If a field is to be ignored for a limited number of levels only, precede it with an integer: for example, [1, "/B", "/TOTO"] means that "/B" is ignored at the first level only, while "/TOTO" is ignored at all levels.

Returns:

The cloned PdfObject
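The ignore_fields convention described above (an integer limits the number of levels for the name that follows it) can be illustrated with plain dictionaries. This is a sketch of one reading of that rule, not pypdf's cloning code: `parse_ignore_fields` and `copy_without` are hypothetical helpers, and only nested dicts are traversed.

```python
from typing import Any, List, Optional, Sequence, Tuple, Union

Rule = Tuple[str, Optional[int]]  # (field name, remaining levels or None for "all levels")


def parse_ignore_fields(fields: Sequence[Union[str, int]]) -> List[Rule]:
    """Turn [1, "/B", "/TOTO"] into [("/B", 1), ("/TOTO", None)]."""
    rules: List[Rule] = []
    levels: Optional[int] = None
    for item in fields:
        if isinstance(item, int):
            levels = item  # applies to the next field name only
        else:
            rules.append((item, levels))
            levels = None
    return rules


def copy_without(obj: Any, rules: List[Rule]) -> Any:
    """Copy nested dicts, skipping the keys that the rules ignore at this level."""
    if not isinstance(obj, dict):
        return obj
    ignored = {name for name, _ in rules}
    # Rules handed to children: unlimited rules stay, limited rules lose one level.
    child_rules = [
        (name, None if levels is None else levels - 1)
        for name, levels in rules
        if levels is None or levels > 1
    ]
    return {k: copy_without(v, child_rules) for k, v in obj.items() if k not in ignored}
```

With `[1, "/B", "/TOTO"]`, a top-level "/B" is dropped but a "/B" one level down survives, while "/TOTO" is dropped wherever it occurs.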

get_object() → PdfObject | None: Resolve indirect references.

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

class pypdf.generic.TextStringObject(value: Any)

Bases: str, PdfObject

A string object that has been decoded into a real unicode string.

If read from a PDF document, this string appeared to match the PDFDocEncoding, or contained a UTF-16BE BOM mark to cause UTF-16 decoding to occur.

autodetect_pdfdocencoding: bool

autodetect_utf16: bool

utf16_bom: bytes

clone(pdf_dest: Any, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → TextStringObject: Clone object into pdf_dest.

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

property original_bytes: bytes: It is occasionally possible that a text string object gets created where a byte string object was expected due to the autodetection mechanism -- if that occurs, this "original_bytes" property can be used to back-calculate what the original encoded bytes were.

get_original_bytes() → bytes

get_encoded_bytes() → bytes

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None
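The autodetection described for TextStringObject (a UTF-16BE byte order mark triggers UTF-16 decoding, otherwise a single-byte encoding is tried) can be sketched with the standard library alone. This is an illustration, not pypdf's decoder: `decode_pdf_string` is a hypothetical helper, and `latin-1` is used as a rough stand-in for PDFDocEncoding, which Python does not ship and which only agrees with it for printable ASCII and part of the upper range.

```python
import codecs


def decode_pdf_string(raw: bytes) -> str:
    """Decode string bytes, honouring a leading UTF-16BE byte order mark."""
    bom = codecs.BOM_UTF16_BE  # b"\xfe\xff"
    if raw.startswith(bom):
        # Strip the mark and decode the remainder as big-endian UTF-16.
        return raw[len(bom):].decode("utf-16-be")
    # Assumption: latin-1 stands in for PDFDocEncoding in this sketch.
    return raw.decode("latin-1")
```

Because the fallback never fails on arbitrary bytes, binary data can be mistaken for text in such a scheme, which is the situation the original_bytes property is meant to recover from.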

class pypdf.generic.ByteStringObject(*args, **kwargs)

Bases: bytes, PdfObject

Represents a string object where the text encoding could not be determined.

This occurs quite often, as the PDF spec doesn't provide an alternate way to represent strings -- for example, the encryption data stored in files (like /O) is clearly not text, but is still stored in a "String" object.

clone(pdf_dest: Any, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → ByteStringObject: Clone object into pdf_dest.

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

property original_bytes: bytes: For compatibility with TextStringObject.original_bytes.

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

class pypdf.generic.AnnotationBuilder

Bases: object

The AnnotationBuilder is deprecated.

Instead, use the annotation classes in pypdf.annotations.

See "Adding PDF Annotations" for its usage combined with PdfWriter.

static text(rect: RectangleObject | Tuple[float, float, float, float], text: str, open: bool = False, flags: int = 0) → None

static free_text(text: str, rect: RectangleObject | Tuple[float, float, float, float], font: str = 'Helvetica', bold: bool = False, italic: bool = False, font_size: str = '14pt', font_color: str = '000000', border_color: str | None = '000000', background_color: str | None = 'ffffff') → None

static popup(*, rect: RectangleObject | Tuple[float, float, float, float], flags: int = 0, parent: DictionaryObject | None = None, open: bool = False) → None

static line(p1: Tuple[float, float], p2: Tuple[float, float], rect: RectangleObject | Tuple[float, float, float, float], text: str = '', title_bar: str | None = None) → None

static polyline(vertices: List[Tuple[float, float]]) → None

static rectangle(rect: RectangleObject | Tuple[float, float, float, float], interiour_color: str | None = None) → None

static highlight(*, rect: RectangleObject | Tuple[float, float, float, float], quad_points: ArrayObject, highlight_color: str = 'ff0000', printing: bool = False) → None

static ellipse(rect: RectangleObject | Tuple[float, float, float, float], interiour_color: str | None = None) → None

static polygon(vertices: List[Tuple[float, float]]) → None

DEFAULT_FIT = <pypdf.generic._fit.Fit object>

static link(rect: RectangleObject | Tuple[float, float, float, float], border: ArrayObject | None = None, url: str | None = None, target_page_index: int | None = None, fit: Fit = <pypdf.generic._fit.Fit object>) → None

class pypdf.generic.ArrayObject(iterable=(), /)

Bases: List[Any], PdfObject

replicate(pdf_dest: PdfWriterProtocol) → ArrayObject

clone(pdf_dest: PdfWriterProtocol, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → ArrayObject: Clone object into pdf_dest.

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

items() → Iterable[Any]: Emulate DictionaryObject.items for a list (index, object).

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

static read_from_stream(stream: IO[Any], pdf: PdfReaderProtocol | None, forced_encoding: None | str | List[str] | Dict[int, str] = None) → ArrayObject
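Giving a list an items() method that yields (index, object) pairs lets one traversal routine handle arrays and dictionaries alike. The sketch below shows that idea with plain Python types; `IndexedList` and `walk` are hypothetical names chosen for illustration and are not part of pypdf.

```python
from typing import Any, Iterable, Iterator, Tuple


class IndexedList(list):
    """A list whose items() mimics dict.items() by yielding (index, object)."""

    def items(self) -> Iterable[Tuple[int, Any]]:
        return enumerate(self)


def walk(container: Any, path: Tuple[Any, ...] = ()) -> Iterator[Tuple[Tuple[Any, ...], Any]]:
    """Yield (path, leaf) pairs; dicts and IndexedLists are traversed the same way."""
    for key, value in container.items():
        if isinstance(value, (dict, IndexedList)):
            yield from walk(value, path + (key,))
        else:
            yield path + (key,), value
```

The traversal never needs to ask whether it is looking at a mapping or a sequence; the key is simply a name in one case and an index in the other.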

class pypdf.generic.DictionaryObject

Bases: Dict[Any, Any], PdfObject

replicate(pdf_dest: PdfWriterProtocol) → DictionaryObject

clone(pdf_dest: PdfWriterProtocol, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → DictionaryObject: Clone object into pdf_dest.

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

raw_get(key: Any) → Any

get_inherited(key: str, default: Any = None) → Any

Return the value of a key, looking it up in the parent if the object itself does not define it. If the key is not found at all, return default.

Parameters:

key -- string identifying the field to return
default -- default value to return

Returns:

Current key or inherited one, otherwise default value.
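The inherited lookup described above can be sketched with plain dictionaries. This is an illustration only, not pypdf's code: it assumes that the parent is stored under a "/Parent" key (as in a PDF page tree) and that parents are direct dict references rather than indirect objects; the free function `get_inherited` here is a hypothetical stand-in for the method.

```python
from typing import Any, Dict, Optional


def get_inherited(node: Optional[Dict[str, Any]], key: str, default: Any = None) -> Any:
    """Return node[key], or the nearest ancestor's value, or default."""
    while node is not None:
        if key in node:
            return node[key]
        # Assumption: the parent dictionary is reachable through "/Parent".
        node = node.get("/Parent")
    return default


# A page that defines /Rotate itself but inherits /MediaBox from its parent.
pages = {"/MediaBox": [0, 0, 612, 792]}
page = {"/Parent": pages, "/Rotate": 90}
```

Here `get_inherited(page, "/MediaBox")` falls through to the parent, while a key defined nowhere in the chain yields the default.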

setdefault(key: Any, value: Any | None = None) → Any

property xmp_metadata: XmpInformationProtocol | None

Retrieve XMP (Extensible Metadata Platform) data relevant to this object, if available.

See Table 347, "Additional entries in a metadata stream dictionary".

Returns: An XmpInformation instance that can be used to access XMP metadata from the document, or None if no metadata was found on the document root.

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

static read_from_stream(stream: IO[Any], pdf: PdfReaderProtocol | None, forced_encoding: None | str | List[str] | Dict[int, str] = None) → DictionaryObject

class pypdf.generic.TreeObject(dct: DictionaryObject | None = None)

Bases: DictionaryObject

has_children() → bool

children() → Iterable[Any]

add_child(child: Any, pdf: PdfWriterProtocol) → None

inc_parent_counter_default(parent: None | IndirectObject | TreeObject, n: int) → None

inc_parent_counter_outline(parent: None | IndirectObject | TreeObject, n: int) → None

insert_child(child: Any, before: Any, pdf: PdfWriterProtocol, inc_parent_counter: Callable[[...], Any] | None = None) → IndirectObject

remove_child(child: Any) → None

remove_from_tree() → None: Remove the object from the tree it is in.

empty_tree() → None

class pypdf.generic.StreamObject

Bases: DictionaryObject

replicate(pdf_dest: PdfWriterProtocol) → StreamObject

hash_bin() → int

Used to detect modified objects.

Returns: Hash considering type and value.

get_data() → bytes

set_data(data: bytes) → None

hash_value_data() → bytes

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

static initializeFromDictionary(data: Dict[str, Any]) → None

static initialize_from_dictionary(data: Dict[str, Any]) → EncodedStreamObject | DecodedStreamObject

flate_encode(level: int = -1) → EncodedStreamObject

decode_as_image() → Any

Try to decode the stream object as an image.

Returns: A PIL image if proper decoding has been found.
Raises: Exception -- Any exception raised during decoding, whether due to an invalid object or to a decoding error, is reported to the caller. It is recommended to catch exceptions so that they do not stop your program.

class pypdf.generic.DecodedStreamObject

Bases: StreamObject

class pypdf.generic.EncodedStreamObject

Bases: StreamObject

get_data() → bytes

set_data(data: bytes) → None

class pypdf.generic.ContentStream(stream: Any, pdf: Any, forced_encoding: None | str | List[str] | Dict[int, str] = None)

Bases: DecodedStreamObject

In order to be fast, this data structure can contain either:

raw data in ._data
parsed stream operations in ._operations.

At any time, a ContentStream object has either both of those fields defined, or one field defined and the other set to None.

These fields are "rebuilt" lazily, when accessed:

when .get_data() is called, if ._data is None, it is rebuilt from ._operations.
when .operations is accessed, if ._operations is None, it is rebuilt from ._data.

Conversely, these fields can be invalidated:

when .set_data() is called, ._operations is set to None.
when .operations is set, ._data is set to None.
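The rebuild and invalidation rules listed above amount to a two-sided lazy cache. The following standard-library sketch mirrors them on a toy class. It is not pypdf's ContentStream: `LazyStream` is a hypothetical name, and the "parser" assumes one operator per line with whitespace-separated operands, which is far simpler than real content stream syntax.

```python
from typing import List, Optional, Tuple

Operation = Tuple[List[bytes], bytes]  # (operands, operator)


class LazyStream:
    """Keeps raw bytes and parsed operations, rebuilding each from the other on demand."""

    def __init__(self, data: bytes = b"") -> None:
        self._data: Optional[bytes] = data
        self._operations: Optional[List[Operation]] = None

    def get_data(self) -> bytes:
        if self._data is None:
            # Rebuild the raw form from the parsed operations.
            self._data = b"\n".join(
                b" ".join([*operands, operator]) for operands, operator in self._operations
            )
        return self._data

    def set_data(self, data: bytes) -> None:
        self._data = data
        self._operations = None  # parsed form is now stale

    @property
    def operations(self) -> List[Operation]:
        if self._operations is None:
            # Toy parser: last token on each line is the operator.
            lines = (line.split() for line in self._data.splitlines())
            self._operations = [(tokens[:-1], tokens[-1]) for tokens in lines if tokens]
        return self._operations

    @operations.setter
    def operations(self, ops: List[Operation]) -> None:
        self._operations = ops
        self._data = None  # raw form is now stale
```

The benefit of this arrangement is that a stream which is only copied never pays for parsing, and one which is only edited through its operations is serialized once, when the data is finally requested.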

replicate(pdf_dest: PdfWriterProtocol) → ContentStream

clone(pdf_dest: Any, force_duplicate: bool = False, ignore_fields: Sequence[str | int] | None = ()) → ContentStream

Clone object into pdf_dest.

Parameters:

pdf_dest
force_duplicate
ignore_fields

Returns:

The cloned ContentStream

get_data() → bytes

set_data(data: bytes) → None

property operations: List[Tuple[Any, bytes]]

isolate_graphics_state() → None

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

class pypdf.generic.ViewerPreferences(value: Any = None)

Bases: DictionaryObject

property PRINT_SCALING: NameObject

class pypdf.generic.OutlineItem(title: str, page: NumberObject | IndirectObject | NullObject | DictionaryObject, fit: Fit)

Bases: Destination

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

class pypdf.generic.OutlineFontFlag(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: IntFlag

A class used as an enumerable flag for formatting an outline font.

italic = 1

bold = 2
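Because the flag is an IntFlag, its members combine with bitwise operators and still behave as integers. The sketch below uses a local mirror of the two members so that it runs without pypdf installed; `FontFlag` is a hypothetical stand-in with the same values as listed above.

```python
from enum import IntFlag


class FontFlag(IntFlag):
    """Local mirror of the two outline font flags, for illustration only."""

    italic = 1
    bold = 2


# Combine both styles into a single value with bitwise OR.
bold_italic = FontFlag.italic | FontFlag.bold

# Membership tests check whether a given bit is set.
has_bold = FontFlag.bold in bold_italic
has_italic_only = bold_italic == FontFlag.italic
```

The combined value compares equal to the integer 3, which is the form in which such a flag would be stored as a number.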

pypdf.generic.read_object(stream: IO[Any], pdf: PdfReaderProtocol | None, forced_encoding: None | str | List[str] | Dict[int, str] = None) → PdfObject | int | str | ContentStream

pypdf.generic.create_string_object(string: str | bytes, forced_encoding: None | str | List[str] | Dict[int, str] = None) → TextStringObject | ByteStringObject

Create a ByteStringObject or a TextStringObject from a string to represent the string.

Parameters:

string -- The data being used
forced_encoding -- Typically None, or an encoding string

Returns:

A ByteStringObject

Raises:

TypeError -- If string is not of type str or bytes.

pypdf.generic.encode_pdfdocencoding(unicode_string: str) → bytes

pypdf.generic.decode_pdfdocencoding(byte_array: bytes) → str

pypdf.generic.hex_to_rgb(value: str) → Tuple[float, float, float]

pypdf.generic.is_null_or_none(x: Any) → TypeGuard[None | NullObject | IndirectObject]

Returns: True if x is None or NullObject.

pypdf.generic.read_hex_string_from_stream(stream: IO[Any], forced_encoding: None | str | List[str] | Dict[int, str] = None) → TextStringObject | ByteStringObject

pypdf.generic.read_string_from_stream(stream: IO[Any], forced_encoding: None | str | List[str] | Dict[int, str] = None) → TextStringObject | ByteStringObject

class pypdf._protocols.PdfObjectProtocol(*args, **kwargs)

Bases: Protocol

indirect_reference: Any

clone(pdf_dest: Any, force_duplicate: bool = False, ignore_fields: Tuple[str, ...] | List[str] | None = ()) → Any

get_object() → PdfObjectProtocol | None

hash_value() → bytes

write_to_stream(stream: IO[Any], encryption_key: None | str | bytes = None) → None

class pypdf._protocols.XmpInformationProtocol(*args, **kwargs)

Bases: PdfObjectProtocol

class pypdf._protocols.PdfCommonDocProtocol(*args, **kwargs)

Bases: Protocol

property pdf_header: str

property pages: List[Any]

property root_object: PdfObjectProtocol

get_object(indirect_reference: Any) → PdfObjectProtocol | None

property strict: bool

class pypdf._protocols.PdfReaderProtocol(*args, **kwargs)

Bases: PdfCommonDocProtocol, Protocol

abstract property xref: Dict[int, Dict[int, Any]]

abstract property trailer: Dict[str, Any]

class pypdf._protocols.PdfWriterProtocol(*args, **kwargs)

Bases: PdfCommonDocProtocol, Protocol

incremental: bool

abstract write(stream: Path | str | IO[Any]) → Tuple[bool, IO[Any]]