The Fast, Lightweight PDF Library for Python
A high-performance PDF library for Python, built for fast data extraction, conversion, and file processing. Lightweight, efficient, and available on PyPI for easy installation.
pip install pymupdf
Why Developers Choose PyMuPDF
Table Extraction
PyMuPDF detects tables on a PDF page and extracts their cell contents in a few lines of code.
Efficient
Parses PDFs quickly and extracts text, images, and metadata for data analysis.
Performance Boosted
PyMuPDF is built on a C core, giving your applications native-speed PDF processing.
PyMuPDF Pro for Office Documents
PyMuPDF Pro supports a wide range of Office file formats, including DOC/DOCX, PPT/PPTX, XLS/XLSX, as well as HWP and HWPX, the widely used formats for Korean word processing.
PyMuPDF4LLM for RAG Integrations
PyMuPDF integrates seamlessly with LangChain, Llamaparse and more! Prepare your data for RAG solutions and give your LLM the data that your users can trust.
Get Started with RAG
Explore how PyMuPDF powers Retrieval-Augmented Generation (RAG) workflows at pymupdf.git/RAG. Perfect for building intelligent apps that need fast and reliable PDF parsing.
Install from PyPI
The easiest way to use PyMuPDF is via PyPI. Just install it with pip and you’re ready to go:
pip install pymupdf
PyMuPDF Resources
Quick Start Guide
Jump into our docs and get started.
Find us on GitHub
Check out our ongoing development.
Join the PyMuPDF Community
Get help, share ideas, and connect with developers building with PyMuPDF.