The World's Fastest PDF Processing Library
A high-performance PDF library for Python, built for fast data extraction, conversion, and file processing. It is lightweight, efficient, and available on PyPI for easy installation.

Key Capabilities
From low-level PDF manipulation to LLM-ready extraction. Choose what your workflow needs.
Extract All Document Text
Extract Key-Value Pairs from a Page
Extract Text from within a Rectangle
Extract Text in Natural Reading Order
Extract Table Content from Documents
Mark Extracted Text
Extract Text with Color
Extract Images: Non-PDF Documents
Extract Images: PDF Documents
Extract Vector Graphics
Get All Annotations from a Document
Extract Drawings
Enterprise Scaling Made Simple
PyMuPDF is built on open collaboration and always will be. Our code is freely available on GitHub under the AGPL license, and we welcome contributions from developers worldwide. For projects that require different terms, we also offer commercial licensing through Artifex.

Need More?
Building RAG pipelines or LLM applications?
Upgrade to PyMuPDF4LLM for AI-optimized text extraction with automatic structure preservation. 10x faster with no GPU required.
Try PyMuPDF Pro
Everything in PyMuPDF and PyMuPDF4LLM, plus Office (Word, Excel, PowerPoint) and HWP support.
Your Next Document Pipeline Starts Here
Install PyMuPDF, extract your first document, and see why thousands of developers trust us for production document processing.
