Feature | PyMuPDF | pikepdf | PyPDF2 | pdfrw |
---|---|---|---|---|
Supports Multiple Document Formats | PDF, XPS, EPUB, MOBI, FB2, CBZ, SVG, Image | |||
Implementation | C and Python | C++ and Python | Python | Python |
Render Document Pages | All document types | No rendering | No rendering | No rendering |
Extract Text | All document types | PDF only | ||
Extract Vector Graphics | All document types | |||
Draw Vector Graphics (PDF) | ||||
Based on Existing, Mature Library | MuPDF | QPDF | ||
Automatic Repair of Damaged PDFs | ||||
Encrypted PDFs | Limited | |||
Linearized PDFs | ||||
Incremental Updates | ||||
Integrates with Jupyter and IPython Notebooks | ||||
Joining / Merging PDF with other Document Types | All document types | PDF only | PDF only | PDF only |
OCR API for Seamless Integration with Tesseract | All document types | |||
Integrated Checkpoint / Restart Feature (PDF) | ||||
PDF Optional Content | ||||
PDF Embedded Files | Limited | |||
PDF Redactions | ||||
PDF Annotations | Full | Limited | ||
PDF Form Fields | Create, read, update | Limited, no creation | ||
PDF Page Labels | ||||
Support Font Sub-Setting | ||||