The ultimate library for managing PDF documents
MuPDF is the fast & powerful solution for managing PDF and other document formats.
The complete PDF solution for every need
The MuPDF source code
The core library - built with C (and a lot of love 💖).
Quick Start Guide
Jump into our documentation to find out how to download and get started.
The fastest PDF document parsing and data extraction software available in Python
Table Extraction
PyMuPDF offers a straightforward and efficient method for extracting tables using Python.
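As a rough illustration of the table extraction described above, the sketch below collects every table PyMuPDF detects in a file. It assumes a recent PyMuPDF release where the package imports as `pymupdf` (older releases use `import fitz`) and where `Page.find_tables()` is available. The helper names `extract_tables` and `rows_to_records`, and the idea of treating the first row as a header, are illustrative assumptions and not part of the library.

```python
from typing import Dict, List, Optional

Row = List[Optional[str]]


def rows_to_records(rows: List[Row]) -> List[Dict[str, Optional[str]]]:
    """Turn a table given as rows of cells into dicts keyed by the first row.

    Assumption for this sketch: the first row is the header. Empty header
    cells get a positional name such as "col0".
    """
    if not rows:
        return []
    header = [
        str(cell) if cell is not None else f"col{i}"
        for i, cell in enumerate(rows[0])
    ]
    return [dict(zip(header, row)) for row in rows[1:]]


def extract_tables(pdf_path: str) -> List[List[Row]]:
    """Return every table PyMuPDF detects in the file, one list of rows per table."""
    # Imported here so rows_to_records can be used without PyMuPDF installed.
    import pymupdf

    tables: List[List[Row]] = []
    with pymupdf.open(pdf_path) as doc:
        for page in doc:
            # find_tables() returns a finder object; .tables lists the detected tables.
            for table in page.find_tables().tables:
                tables.append(table.extract())  # rows of cell strings (or None)
    return tables
```

A call such as `rows_to_records(extract_tables("report.pdf")[0])` would then give one dictionary per data row of the first detected table, where `report.pdf` is a placeholder file name.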
Efficient
Highly efficient in parsing PDFs and extracting text, images, and metadata for data analysis.
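To make the text, image, and metadata extraction above more concrete, here is a minimal sketch that summarizes an open document. It assumes the package imports as `pymupdf` and relies on `Page.get_text()`, `Page.get_images()`, and `Document.metadata`. The function names `summarize_document` and `summarize_pdf` and the shape of the returned dictionary are hypothetical, chosen only for this example.

```python
from typing import Any, Dict


def summarize_document(doc: Any) -> Dict[str, Any]:
    """Collect metadata plus per-page text length and image count.

    `doc` is expected to behave like an open PyMuPDF document: iterable
    over pages, with a `metadata` mapping.
    """
    pages = []
    for page in doc:
        text = page.get_text()               # plain text of the page
        images = page.get_images(full=True)  # one entry per image used on the page
        pages.append({"chars": len(text), "images": len(images)})
    return {"metadata": dict(doc.metadata or {}), "pages": pages}


def summarize_pdf(pdf_path: str) -> Dict[str, Any]:
    """Open a file with PyMuPDF and summarize it."""
    # Imported here so summarize_document can be exercised without PyMuPDF installed.
    import pymupdf

    with pymupdf.open(pdf_path) as doc:
        return summarize_document(doc)
```

The per-page counts are a convenient starting point for data analysis, for example to find pages that contain images but little or no extractable text.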
Performance boosted
With C performance at its core, PyMuPDF lets you get serious with your applications.
How to install
PyMuPDF should be installed using pip with: pip install PyMuPDF
RAG Integration
PyMuPDF integrates seamlessly with LangChain, LlamaParse, and more! Prepare your data for Retrieval-Augmented Generation solutions and give your LLM data that your users can trust.
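Preparing data for Retrieval-Augmented Generation usually means splitting page text into chunks that keep a pointer back to their source page. The sketch below shows one simple way to do that with PyMuPDF alone. It does not use the LangChain integration, and the names `load_pages` and `chunk_pages`, the character-based splitting, and the default sizes are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def chunk_pages(
    pages: Iterable[Tuple[int, str]],
    max_chars: int = 1000,
    overlap: int = 100,
) -> List[Dict[str, object]]:
    """Split (page_number, text) pairs into overlapping character chunks.

    Each chunk records its page number and start offset so that an answer
    can be traced back to its source.
    """
    if max_chars <= 0 or not 0 <= overlap < max_chars:
        raise ValueError("need max_chars > 0 and 0 <= overlap < max_chars")
    step = max_chars - overlap
    chunks: List[Dict[str, object]] = []
    for page_number, text in pages:
        text = text.strip()
        for start in range(0, len(text), step):
            chunks.append(
                {"page": page_number, "start": start, "text": text[start:start + max_chars]}
            )
            if start + max_chars >= len(text):
                break  # this chunk already reaches the end of the page text
    return chunks


def load_pages(pdf_path: str) -> List[Tuple[int, str]]:
    """Read 1-based page numbers and plain text from a file with PyMuPDF."""
    # Imported here so chunk_pages can be used without PyMuPDF installed.
    import pymupdf

    with pymupdf.open(pdf_path) as doc:
        return [(page.number + 1, page.get_text()) for page in doc]
```

The output of `chunk_pages(load_pages("manual.pdf"))`, where `manual.pdf` is a placeholder, could then be passed to whichever embedding and vector store tooling the application uses. Splitting by characters keeps the example short; splitting on sentences or headings would likely give better retrieval quality.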
PyMuPDF and elevait
elevait deals with the extraction of structured information from unstructured sources for the construction industry.
Enhance PyMuPDF Capability with Office Document Support
PyMuPDF Pro supports DOC/DOCX, PPT/PPTX, XLS/XLSX and HWP/HWPX.