Migration Guide: 1.x to 2.x
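PyPDF2<2.0.0 differs considerably from PyPDF2>=2.0.0.
Fortunately, most of the changes are simple renames. This guide helps you migrate from PyPDF2 1.x (or even the original pyPdf) to PyPDF2>=2.0.0.
You can run your code against the newer version and display the deprecation warnings with the following command: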
python -W all your_code.py
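If changing the command line is inconvenient, the same warnings can be collected from inside Python. The following is a minimal sketch using only the standard `warnings` module; `old_name` and `new_name` are hypothetical stand-ins for a deprecated call and its replacement, not PyPDF2 functions.

```python
import warnings


def new_name():
    """Hypothetical replacement function."""
    return "result"


def old_name():
    """Hypothetical stand-in for a deprecated 1.x-style function."""
    warnings.warn(
        "old_name is deprecated; use new_name instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_name()


def collect_deprecations(func):
    """Call func and return the DeprecationWarning messages it triggered."""
    with warnings.catch_warnings(record=True) as caught:
        # Show every warning inside this block, similar in effect to `-W all`.
        warnings.simplefilter("always")
        func()
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, DeprecationWarning)
    ]


messages = collect_deprecations(old_name)
print(messages)
```

Wrapping a test run or a single entry point this way gives a list of messages that can be logged or turned into a migration checklist.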
Imports and Modules
- PyPDF2.utils has been removed.
- PyPDF2.pdf has been removed. Import what you need directly from PyPDF2 or from PyPDF2.generic.
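Naming Adjustments
Classes
The base classes were renamed because they can operate on BytesIO streams as well as on files. In addition, the default value of the strict parameter changed from strict=True to strict=False.
- PdfFileReader ➔ PdfReader
- PdfFileWriter ➔ PdfWriter
- PdfFileMerger ➔ PdfMerger
PdfFileReader and PdfFileMerger no longer accept the overwriteWarnings parameter. The new behavior corresponds to overwriteWarnings=False.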
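Because the three class renames are purely textual, they can be applied to source code mechanically. The sketch below is one possible approach, assuming a plain regular-expression rewrite is acceptable for your code base; `CLASS_RENAMES` and `rename_classes` are illustrative names, not part of PyPDF2.

```python
import re

# Old class name -> new class name, taken from the list above.
CLASS_RENAMES = {
    "PdfFileReader": "PdfReader",
    "PdfFileWriter": "PdfWriter",
    "PdfFileMerger": "PdfMerger",
}

# Match whole identifiers only, so "MyPdfFileReaderX" is left untouched.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, CLASS_RENAMES)) + r")\b"
)


def rename_classes(source: str) -> str:
    """Return source with the 1.x class names replaced by the 2.x names."""
    return _PATTERN.sub(lambda m: CLASS_RENAMES[m.group(1)], source)


old = "from PyPDF2 import PdfFileReader\nreader = PdfFileReader(stream)\n"
new = rename_classes(old)
print(new)
```

A rewrite like this does not account for the changed strict default or the removed overwriteWarnings parameter, so calls that relied on either still need a manual review.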
Function, Method, and Attribute Names
In PyPDF2.xmp.XmpInformation:
- rdfRoot ➔ rdf_root
- xmp_createDate ➔ xmp_create_date
- xmp_creatorTool ➔ xmp_creator_tool
- xmp_metadataDate ➔ xmp_metadata_date
- xmp_modifyDate ➔ xmp_modify_date
- xmpMetadata ➔ xmp_metadata
- xmpmm_documentId ➔ xmpmm_document_id
- xmpmm_instanceId ➔ xmpmm_instance_id
In PyPDF2.generic:
- readObject ➔ read_object
- convertToInt ➔ convert_to_int
- DocumentInformation.getText ➔ DocumentInformation._get_text: this method should normally not be used; please let me know if you need it.
- readHexStringFromStream ➔ read_hex_string_from_stream
- initializeFromDictionary ➔ initialize_from_dictionary
- createStringObject ➔ create_string_object
- TreeObject.hasChildren ➔ TreeObject.has_children
- TreeObject.emptyTree ➔ TreeObject.empty_tree
In many places:
- getObject ➔ get_object
- writeToStream ➔ write_to_stream
- readFromStream ➔ read_from_stream
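Most of the renames listed above follow the usual camelCase to snake_case pattern. As a rough illustration, the helper below (a hypothetical `camel_to_snake`, not something PyPDF2 provides) reproduces that pattern with a single regular expression.

```python
import re


def camel_to_snake(name: str) -> str:
    """Insert "_" before each capital that follows a lowercase letter or digit."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()


for old_name in ("getObject", "writeToStream", "xmp_createDate"):
    print(old_name, "->", camel_to_snake(old_name))
```

Treat the result as a first guess only. Some renames are not mechanical: DocumentInformation.getText, for example, also gained a leading underscore.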
PdfReader class
- reader.getPage(pageNumber) ➔ reader.pages[page_number]
- reader.getNumPages() / reader.numPages ➔ len(reader.pages)
- getDocumentInfo ➔ metadata
- flattenedPages attribute ➔ flattened_pages
- resolvedObjects attribute ➔ resolved_objects
- xrefIndex attribute ➔ xref_index
- getNamedDestinations / namedDestinations attribute ➔ named_destinations
- getPageLayout / pageLayout ➔ page_layout attribute
- getPageMode / pageMode ➔ page_mode attribute
- getIsEncrypted / isEncrypted ➔ is_encrypted attribute
- getOutlines ➔ get_outlines
- readObjectHeader ➔ read_object_header
- cacheGetIndirectObject ➔ cache_get_indirect_object
- cacheIndirectObject ➔ cache_indirect_object
- getDestinationPageNumber ➔ get_destination_page_number
- readNextEndLine ➔ read_next_end_line
- _zeroXref ➔ _zero_xref
- _authenticateUserPassword ➔ _authenticate_user_password
- _pageId2Num attribute ➔ _page_id2num
- _buildDestination ➔ _build_destination
- _buildOutline ➔ _build_outline
- _getPageNumberByIndirect(indirectRef) ➔ _get_page_number_by_indirect(indirect_ref)
- _getObjectFromStream ➔ _get_object_from_stream
- _decryptObject ➔ _decrypt_object
- _flatten(..., indirectRef) ➔ _flatten(..., indirect_ref)
- _buildField ➔ _build_field
- _checkKids ➔ _check_kids
- _writeField ➔ _write_field
- _write_field(..., fieldAttributes) ➔ _write_field(..., field_attributes)
- _read_xref_subsections(..., getEntry, ...) ➔ _read_xref_subsections(..., get_entry, ...)
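Several of the reader renames turn getter methods into plain attributes and sequence access. The sketch below shows what calling code might look like afterwards. It is written against any object that exposes `pages`, `metadata`, and `is_encrypted`, so it runs here with a stand-in object; in real code `reader` would be expected to be a 2.x PdfReader. The `summarize` function and `fake_reader` are illustrative names only.

```python
from types import SimpleNamespace


def summarize(reader):
    """Collect basic facts from a reader using the 2.x spellings."""
    pages = reader.pages
    return {
        # 1.x: reader.getNumPages() or reader.numPages
        "page_count": len(pages),
        # 1.x: reader.getPage(0)
        "first_page": pages[0] if len(pages) > 0 else None,
        # 1.x: reader.getDocumentInfo()
        "metadata": reader.metadata,
        # 1.x: reader.getIsEncrypted() or reader.isEncrypted
        "is_encrypted": reader.is_encrypted,
    }


# Stand-in object so that the sketch runs without a PDF file.
fake_reader = SimpleNamespace(
    pages=["page-0", "page-1"],
    metadata={"/Title": "Demo"},
    is_encrypted=False,
)
summary = summarize(fake_reader)
print(summary)
```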
PdfWriter class
- writer.getPage(pageNumber) ➔ writer.pages[page_number]
- writer.getNumPages() ➔ len(writer.pages)
- addMetadata ➔ add_metadata
- addPage ➔ add_page
- addBlankPage ➔ add_blank_page
- addAttachment(fname, fdata) ➔ add_attachment(filename, data)
- insertPage ➔ insert_page
- insertBlankPage ➔ insert_blank_page
- appendPagesFromReader ➔ append_pages_from_reader
- updatePageFormFieldValues ➔ update_page_form_field_values
- cloneReaderDocumentRoot ➔ clone_reader_document_root
- cloneDocumentFromReader ➔ clone_document_from_reader
- getReference ➔ get_reference
- getOutlineRoot ➔ get_outline_root
- getNamedDestRoot ➔ get_named_dest_root
- addBookmarkDestination ➔ add_bookmark_destination
- addBookmarkDict ➔ add_bookmark_dict
- addBookmark ➔ add_bookmark
- addNamedDestinationObject ➔ add_named_destination_object
- addNamedDestination ➔ add_named_destination
- removeLinks ➔ remove_links
- removeImages(ignoreByteStringObject) ➔ remove_images(ignore_byte_string_object)
- removeText(ignoreByteStringObject) ➔ remove_text(ignore_byte_string_object)
- addURI ➔ add_uri
- addLink ➔ add_link
- getPage(pageNumber) ➔ get_page(page_number)
- getPageLayout / setPageLayout / pageLayout ➔ page_layout attribute
- getPageMode / setPageMode / pageMode ➔ page_mode attribute
- _addObject ➔ _add_object
- _addPage ➔ _add_page
- _sweepIndirectReferences ➔ _sweep_indirect_references
PdfMerger class
- __init__ parameter: strict=True ➔ strict=False (the PdfFileMerger still has the old default)
- addMetadata ➔ add_metadata
- addNamedDestination ➔ add_named_destination
- setPageLayout ➔ set_page_layout
- setPageMode ➔ set_page_mode
Page class
- artBox / bleedBox / cropBox / mediaBox / trimBox ➔ artbox / bleedbox / cropbox / mediabox / trimbox
    - getWidth, getHeight ➔ width / height
    - getLowerLeft_x / getUpperLeft_x ➔ left
    - getUpperRight_x / getLowerRight_x ➔ right
    - getLowerLeft_y / getLowerRight_y ➔ bottom
    - getUpperRight_y / getUpperLeft_y ➔ top
    - getLowerLeft / setLowerLeft ➔ lower_left property
    - upperRight ➔ upper_right
- mergePage ➔ merge_page
- rotateClockwise / rotateCounterClockwise ➔ rotate_clockwise
- _mergeResources ➔ _merge_resources
- _contentStreamRename ➔ _content_stream_rename
- _pushPopGS ➔ _push_pop_gs
- _addTransformationMatrix ➔ _add_transformation_matrix
- _mergePage ➔ _merge_page
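Since both rotation methods map onto rotate_clockwise, code that rotated counterclockwise needs its angle converted. A counterclockwise rotation by an angle is the same as a clockwise rotation by the negated angle, which the hypothetical helper below normalizes into the range 0 to 359. It assumes angles are multiples of 90 degrees, as page rotation in PDF requires.

```python
def to_clockwise(angle_counter_clockwise: int) -> int:
    """Convert a counterclockwise angle to the equivalent clockwise angle."""
    if angle_counter_clockwise % 90 != 0:
        raise ValueError("rotation angle must be a multiple of 90 degrees")
    # Negate the angle and wrap it into 0..359.
    return (-angle_counter_clockwise) % 360


# 1.x: page.rotateCounterClockwise(90)
# 2.x: page.rotate_clockwise(to_clockwise(90))   # `page` is a placeholder
for ccw in (0, 90, 180, 270):
    print(ccw, "counterclockwise =", to_clockwise(ccw), "clockwise")
```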
XmpInformation class
- getElement(..., aboutUri, ...) ➔ get_element(..., about_uri, ...)
- getNodesInNamespace(..., aboutUri, ...) ➔ get_nodes_in_namespace(..., about_uri, ...)
- _getText ➔ _get_text
utils.py
- matrixMultiply ➔ matrix_multiply
- RC4_encrypt is moved to the security module
Parameter Names
- PdfWriter.get_page: pageNumber ➔ page_number
- PyPDF2.filters (all classes): decodeParms ➔ decode_parms
- PyPDF2.filters (all classes): decodeStreamData ➔ decode_stream_data
- pagenum ➔ page_number
- PdfMerger.merge: position ➔ page_number
- PdfWriter.add_outline_item_destination: dest ➔ page_destination
- PdfWriter.add_named_destination_object: dest ➔ page_destination
- PdfWriter.encrypt: user_pwd ➔ user_password
- PdfWriter.encrypt: owner_pwd ➔ owner_password
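Parameter renames only affect calls that pass arguments by keyword. Where such keyword arguments are built up as dictionaries (for example from configuration), a small translation step can bridge the old and new spellings. The sketch below uses the two PdfWriter.encrypt renames from the list above; `modernize_kwargs` and `ENCRYPT_KWARG_RENAMES` are illustrative names, not PyPDF2 API.

```python
# Old keyword -> new keyword for PdfWriter.encrypt, taken from the list above.
ENCRYPT_KWARG_RENAMES = {
    "user_pwd": "user_password",
    "owner_pwd": "owner_password",
}


def modernize_kwargs(kwargs: dict, renames: dict) -> dict:
    """Return a copy of kwargs with old parameter names replaced by new ones."""
    updated = {}
    for key, value in kwargs.items():
        new_key = renames.get(key, key)
        if new_key in updated:
            # Both the old and the new spelling were supplied.
            raise TypeError(f"duplicate value for parameter {new_key!r}")
        updated[new_key] = value
    return updated


old_kwargs = {"user_pwd": "secret", "owner_pwd": "admin"}
new_kwargs = modernize_kwargs(old_kwargs, ENCRYPT_KWARG_RENAMES)
print(new_kwargs)
```

The resulting dictionary could then be passed along as `writer.encrypt(**new_kwargs)`, where `writer` stands for an existing PdfWriter instance.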
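Deprecations
A few classes and functions were deprecated without a replacement in PyPDF2; where a built-in Python equivalent exists, it is noted:
- PyPDF2.utils.ConvertFunctionsToVirtualList
- PyPDF2.utils.formatWarning
- PyPDF2.isInt(obj): use isinstance(obj, int) instead
- PyPDF2.u_(s): use s directly
- PyPDF2.chr_(c): use chr(c) instead
- PyPDF2.barray(b): use bytearray(b) instead
- PyPDF2.isBytes(b): use isinstance(b, type(bytes())) instead
- PyPDF2.xrange_fn: use range instead
- PyPDF2.string_type: use str instead
- PyPDF2.isString(s): use isinstance(s, str) instead
- PyPDF2._basestring: use str instead
- b_(...) has been removed. You should usually be able to use a bytes object directly; otherwise you can copy the removed helper into your own code.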
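The replacements for these compatibility helpers are all Python built-ins, so no PyPDF2 import is needed for them. A short sketch with made-up sample values:

```python
value = 7
raw = b"%PDF-1.7"
text = "title"

results = {
    # was PyPDF2.isInt(value)
    "is_int": isinstance(value, int),
    # was PyPDF2.isBytes(raw); type(bytes()) is simply bytes
    "is_bytes": isinstance(raw, bytes),
    # was PyPDF2.isString(text)
    "is_string": isinstance(text, str),
    # was PyPDF2.chr_(65)
    "chr": chr(65),
    # was PyPDF2.barray(b"ab")
    "barray": bytearray(b"ab"),
    # was PyPDF2.xrange_fn(3)
    "range": list(range(3)),
}
print(results)
```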