edgeparse

High-performance PDF-to-structured-data extraction for Python — powered by a Rust engine via PyO3.

Install

pip install edgeparse

Pre-built wheels are available for macOS, Linux (x86_64, arm64), and Windows (x64). No system dependencies or compilation required.

Quick start

import edgeparse

# Convert a PDF to Markdown
result = edgeparse.convert("document.pdf")
print(result.markdown)

# Convert with options
result = edgeparse.convert(
    "document.pdf",
    format="markdown",      # "markdown" | "json" | "html"
    extract_images=False,
    page_range=None,        # None = all pages, or [0, 5] for pages 1–6
)
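
The `page_range` comment above suggests a zero-based, inclusive `[start, end]` pair (`[0, 5]` covers pages 1–6). Assuming that reading is correct, a small helper can translate the page numbers a reader sees in a PDF viewer into that form. The names `to_page_range` and `convert_pages` are hypothetical and not part of edgeparse; this is a sketch, not the library's own API.

```python
def to_page_range(first_page, last_page):
    """Map 1-based, inclusive page numbers to a zero-based [start, end] pair.

    Assumes edgeparse treats page_range as zero-based and inclusive,
    as implied by "[0, 5] for pages 1-6" in the quick start.
    """
    if first_page < 1 or last_page < first_page:
        raise ValueError("expected 1 <= first_page <= last_page")
    return [first_page - 1, last_page - 1]


def convert_pages(path, first_page, last_page):
    """Hypothetical wrapper: convert only the given 1-based page span."""
    import edgeparse  # imported lazily so to_page_range works without the package

    return edgeparse.convert(
        path,
        format="markdown",
        page_range=to_page_range(first_page, last_page),
    )
```

With this in place, `convert_pages("document.pdf", 1, 6)` would pass `page_range=[0, 5]` to `edgeparse.convert`.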

CLI

edgeparse document.pdf                     # → Markdown on stdout
edgeparse document.pdf --format json       # → JSON
edgeparse /path/to/dir/ --output-dir out/  # batch convert
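
For cases where batch conversion needs to run from Python rather than the shell, the loop below is one possible equivalent. It uses only `edgeparse.convert` and `result.markdown` from the quick start. The function name `batch_convert`, the non-recursive `*.pdf` match, and the `<stem>.md` output naming are assumptions for illustration and may not match what the CLI's `--output-dir` mode actually does.

```python
from pathlib import Path


def batch_convert(src_dir, out_dir, convert=None):
    """Convert every *.pdf directly inside src_dir; write one .md file per input.

    `convert` defaults to edgeparse.convert and can be swapped for a stub
    when testing. Returns the list of written paths in sorted order.
    """
    if convert is None:
        import edgeparse  # imported lazily so the helper can be exercised without it

        convert = edgeparse.convert

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        result = convert(str(pdf))
        # Output naming (<stem>.md) is an assumption, not documented CLI behavior.
        target = out / (pdf.stem + ".md")
        target.write_text(result.markdown, encoding="utf-8")
        written.append(target)
    return written
```

Calling `batch_convert("/path/to/dir/", "out/")` would then write one Markdown file per PDF into `out/`.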

Performance

edgeparse consistently leads open benchmarks for PDF-to-Markdown extraction quality across 200-document test suites.
