Document objects
The main Document and related objects.
Document constructor
Document objects
- class docx.document.Document
WordprocessingML (WML) document.
Not intended to be constructed directly. Use docx.Document() to open or create a document.
- add_heading(text: str = '', level: int = 1)
Return a heading paragraph newly added to the end of the document.
The heading paragraph will contain text and have its paragraph style determined by level. If level is 0, the style is set to Title. If level is 1 (or omitted), Heading 1 is used. Otherwise the style is set to Heading {level}. Raises ValueError if level is outside the range 0-9.
- add_paragraph(text: str = '', style: str | ParagraphStyle | None = None) -> Paragraph
Return a paragraph newly added to the end of the document.
The paragraph is populated with text and given the paragraph style style.
text can contain tab (\t) characters, which are converted to the appropriate XML form for a tab. text can also include newline (\n) or carriage return (\r) characters, each of which is converted to a line break.
- add_picture(image_path_or_stream: str | IO[bytes], width: int | Length | None = None, height: int | Length | None = None)
Return a new picture shape added in its own paragraph at the end of the document.
The picture contains the image at image_path_or_stream, scaled based on width and height. If neither width nor height is specified, the picture appears at its native size. If only one is specified, it is used to compute a scaling factor that is then applied to the unspecified dimension, preserving the aspect ratio of the image. The native size of the picture is calculated using the dots-per-inch (dpi) value specified in the image file, defaulting to 72 dpi if no value is specified, as is often the case.
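The scaling rule above can be illustrated with plain arithmetic. This is a sketch of the described behavior, not python-docx's own implementation: scaled_size is a hypothetical helper, and sizes are expressed in English Metric Units (914400 per inch), the integer unit that Length values are based on.

```python
EMU_PER_INCH = 914400  # English Metric Units per inch


def scaled_size(px_width, px_height, dpi=72, width=None, height=None):
    """Hypothetical helper: return (width, height) in EMU for a picture.

    px_width and px_height are the image's pixel dimensions; dpi falls
    back to 72 when the image file does not specify a value.
    """
    native_w = round(px_width * EMU_PER_INCH / dpi)
    native_h = round(px_height * EMU_PER_INCH / dpi)
    if width is None and height is None:
        return native_w, native_h          # native size
    if height is None:
        height = round(native_h * width / native_w)   # keep aspect ratio
    elif width is None:
        width = round(native_w * height / native_h)   # keep aspect ratio
    return width, height


native = scaled_size(300, 150)
two_inches_wide = scaled_size(300, 150, width=2 * EMU_PER_INCH)
```

With python-docx itself, the equivalent call would pass a Length such as Inches(2) as width and leave height unset.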
- add_section(start_type: WD_SECTION_START = WD_SECTION_START.NEW_PAGE)
Return a Section object newly added at the end of the document.
The optional start_type argument must be a member of the WD_SECTION_START enumeration, and defaults to WD_SECTION_START.NEW_PAGE if not provided.
- add_table(rows: int, cols: int, style: str | _TableStyle | None = None)
Add a table having row and column counts of rows and cols respectively.
style may be a table style object or a table style name. If style is None, the table inherits the default table style of the document.
- property core_properties
A CoreProperties object providing Dublin Core properties of the document.
- property inline_shapes
The InlineShapes collection for this document.
An inline shape is a graphical object, such as a picture, contained in a run of text and behaving like a character glyph, being flowed like other text in a paragraph.
- iter_inner_content() -> Iterator[Paragraph | Table]
Generate each Paragraph or Table in this document in document order.
- property paragraphs: List[Paragraph]
The Paragraph instances in the document, in document order.
Note that paragraphs within revision marks such as <w:ins> or <w:del> do not appear in this list.
- property part: DocumentPart
The DocumentPart object of this document.
- save(path_or_stream: str | IO[bytes])
Save this document to path_or_stream.
path_or_stream can be either a path to a filesystem location (a string) or a file-like object.
- property tables: List[Table]
All Table instances in the document, in document order.
Note that only tables appearing at the top level of the document appear in this list; a table nested inside a table cell does not appear. A table within revision marks such as <w:ins> or <w:del> will also not appear in the list.
CoreProperties objects
Each Document object provides access to its CoreProperties object via its core_properties attribute. A CoreProperties object provides read/write access to the so-called core properties for the document. The core properties are author, category, comments, content_status, created, identifier, keywords, language, last_modified_by, last_printed, modified, revision, subject, title, and version.
Each property is one of three types: str, datetime.datetime, or int. String properties are limited in length to 255 characters and return an empty string ('') if not set. Date properties are assigned and returned as datetime.datetime objects without timezone, i.e. in UTC. Any timezone conversions are the responsibility of the client. Date properties return None if not set.
python-docx does not automatically set any of the document core properties other than to add a core properties part to a document that doesn't have one (very uncommon). If python-docx adds a core properties part, it contains default values for the title, last_modified_by, revision, and modified properties. Client code should update properties like revision and last_modified_by if that behavior is desired.
- class docx.opc.coreprops.CoreProperties
- author
string -- An entity primarily responsible for making the content of the resource.
- category
string -- A categorization of the content of this package. Example values might include: Resume, Letter, Financial Forecast, Proposal, or Technical Presentation.
- comments
string -- An account of the content of the resource.
- content_status
string -- completion status of the document, e.g. 'draft'
- created
datetime -- time of initial creation of the document
- identifier
string -- An unambiguous reference to the resource within a given context, e.g. ISBN.
- keywords
string -- descriptive words or short phrases likely to be used as search terms for this document
- language
string -- language the document is written in
- last_modified_by
string -- name or other identifier (such as email address) of person who last modified the document
- last_printed
datetime -- time the document was last printed
- modified
datetime -- time the document was last modified
- revision
int -- number of this revision, incremented by Word each time the document is saved. Note, however, that python-docx does not automatically increment the revision number when it saves a document.
- subject
string -- The topic of the content of the resource.
- title
string -- The name given to the resource.
- version
string -- free-form version string