PyMuPDF4LLM now ships with Layout: GNN-based document intelligence that runs on CPU, no GPU required.

TRY THE DEMO

The World's Fastest
PDF Processing Library

High performance PDF library for Python, built for fast data extraction, conversion and file processing. Lightweight, efficient, and available on PyPI for easy installation.

CONTACT US INSTALL PYMUPDF

Key Capabilities

From low-level PDF manipulation to LLM-ready extraction.
Choose what your workflow needs.

Extract all Document Text

Extract Key-Value Pairs from a Page

Extract Text from within a Rectangle

Extract Text in Natural Reading Order

Extract Table Content from Documents

Mark Extracted Text

Extract Text with Color

Extract Images: Non-PDF Documents

Extract Images: PDF Documents

Extract Vector Graphics

Get All Annotations from a Document

Extract Drawings

Enterprise Scaling
Made Simple

PyMuPDF is built on open collaboration and always will be. Our code is freely available on GitHub under the AGPL license, welcoming contributions from developers worldwide. For projects requiring different terms, we also offer commercial licensing through Artifex.

Need More?

Building RAG pipelines or LLM applications?

Upgrade to PyMuPDF4LLM for AI-optimized text extraction with automatic structure preservation. 10x faster with no GPU required.

SEE PYMUPDF4LLM

Try PyMuPDF Pro

Everything in PyMuPDF and PyMuPDF4LLM, plus Office (Word, Excel, PowerPoint) and HWP support.

TRY PYMUPDF PRO

Your Next Document Pipeline
Starts Here

Install PyMuPDF, extract your first document, and see why thousands of developers trust us for production document processing.

CONTACT US GET STARTED

© 2026 Artifex Software Inc. All rights reserved.
