pdf-edit-mcp
Bringing format-preserving PDF editing to AI agents — 38 tools, three guided workflows, one long-running Python bridge.
Install
claude mcp add pdf-edit-mcp -- npx -y @aryanbv/pdf-edit-mcp
npx -y @aryanbv/pdf-edit-mcp
The Challenge
Once pdf-edit-engine existed as a Python library, the next question was how to make it usable by AI agents. A naïve MCP server would spawn a fresh Python process per tool call — tens of milliseconds of interpreter startup on every request, which compounds over batch operations. And without structured workflows, agents would have no idea when to inspect a PDF, when to call analyze_subset before editing, or how to combine 38 tools into a coherent edit. Simply exposing the library's functions one-for-one would be a usability failure even if it technically worked.
The Approach
pdf-edit-mcp is a TypeScript MCP server that spawns bridge.py once at startup and keeps it alive for the entire session, communicating over stdio via JSON-RPC 2.0. Zod schemas validate every input at the TypeScript boundary before it ever hits Python, so malformed agent requests never reach the engine. The 38 tools are organised into seven categories (reading, text edits, block ops, section ops, annotations, document manipulation, metadata/security) and backed by three built-in MCP prompts that teach agents the canonical workflow: quick-pdf-edit for typos and dates, section-swap for structural rewrites (including the subtle requirement that batch_replace_block must include all sibling sections for uniform spacing), and comprehensive-pdf-edit for multi-step edits. Every tool result surfaces pdf-edit-engine's FidelityReport so agents can verify quality before calling it done.
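The request/response handling on the Python side can be sketched as follows. This is a minimal sketch, not the actual bridge.py: it assumes newline-delimited JSON framing, and the `echo` method and `HANDLERS` table are hypothetical stand-ins for the engine's functions. The error codes are the standard ones defined by JSON-RPC 2.0.

```python
import json

# Hypothetical handler table; the real bridge would map method names
# to pdf-edit-engine functions instead.
def _echo(params):
    return {"echo": params}

HANDLERS = {"echo": _echo}


def _error(request_id, code, message):
    """Build a JSON-RPC 2.0 error response line."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id,
         "error": {"code": code, "message": message}}
    )


def handle_line(line: str) -> str:
    """Turn one JSON-RPC 2.0 request line into one response line."""
    try:
        request = json.loads(line)
    except json.JSONDecodeError:
        return _error(None, -32700, "Parse error")
    if not isinstance(request, dict):
        return _error(None, -32600, "Invalid Request")

    request_id = request.get("id")
    handler = HANDLERS.get(request.get("method"))
    if handler is None:
        return _error(request_id, -32601, "Method not found")

    try:
        result = handler(request.get("params", {}))
    except Exception as exc:  # report failures instead of killing the process
        return _error(request_id, -32603, f"Internal error: {exc}")
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})
```

Returning an error object rather than raising keeps the process alive after a bad request, which matters when one bridge serves the whole session.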
The Impact
The server is published to npm as @aryanbv/pdf-edit-mcp and installs with one command in Claude Desktop, Claude Code, Cursor, Windsurf, and VS Code. The long-running Python bridge removes interpreter startup from the hot path, so a 500-edit batch call runs in essentially the same time as calling the engine directly. The three built-in prompts are workflow scaffolding that agents can reference by name: they teach the inspect → analyze → execute → verify loop rather than leaving agents to discover it.
Tech Stack
Model Context Protocol SDK
Official @modelcontextprotocol/sdk for tool and prompt registration over stdio transport — the canonical implementation every major MCP client expects.
Python subprocess bridge
Spawns bridge.py once at startup and keeps it alive for the whole session — eliminates per-request Python startup and makes batch operations fast.
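The spawn-once pattern can be illustrated with a short Python sketch. The real parent process is the TypeScript server, so everything here is an assumption made for illustration: the `Bridge` class, the inline `CHILD_SOURCE` stand-in for bridge.py, the `ping` method, and the newline-delimited framing. The child replies with its own PID so that repeated calls can be seen to reach the same process.

```python
import json
import subprocess
import sys

# Stand-in for bridge.py: answers every request with its own process ID.
CHILD_SOURCE = r"""
import json, os, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"jsonrpc": "2.0", "id": request["id"], "result": {"pid": os.getpid()}}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


class Bridge:
    """Hypothetical client that starts one child and reuses it for every call."""

    def __init__(self):
        self._next_id = 0
        self._proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered pipes
        )

    def call(self, method, params=None):
        """Send one request line and block until the matching reply line arrives."""
        self._next_id += 1
        request = {"jsonrpc": "2.0", "id": self._next_id,
                   "method": method, "params": params or {}}
        self._proc.stdin.write(json.dumps(request) + "\n")
        self._proc.stdin.flush()
        return json.loads(self._proc.stdout.readline())

    def close(self):
        """Closing stdin ends the child's read loop, so it exits on its own."""
        self._proc.stdin.close()
        self._proc.wait()
        self._proc.stdout.close()


bridge = Bridge()
first = bridge.call("ping")
second = bridge.call("ping")
bridge.close()
```

Both replies carry the same PID, so the interpreter startup cost is paid once in `Bridge()` rather than once per call.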
Zod
Runtime input validation at the TypeScript boundary. Agents generate unpredictable arguments; Zod catches them before they reach Python, and the schemas double as the MCP tool parameter specs.
JSON-RPC 2.0 over stdio
Standard IPC format between the TypeScript server and the Python bridge. stdout is the IPC channel; all logging routes to stderr so it never contaminates the protocol stream.
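Keeping logs off the protocol stream can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the bridge's actual code: the logger name `bridge`, the helper names `configure_logging` and `send`, and the one-message-per-line framing are all hypothetical. The demo uses in-memory streams in place of stderr and stdout so the separation is easy to inspect.

```python
import io
import json
import logging
import sys


def configure_logging(stream=None):
    """Route log records to stderr (or a supplied stream), never to stdout."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger = logging.getLogger("bridge")  # hypothetical logger name
    logger.handlers[:] = [handler]        # replace any previously attached handlers
    logger.setLevel(logging.INFO)
    logger.propagate = False              # keep records away from the root logger
    return logger


def send(message, out=None):
    """Write exactly one JSON message per line to the protocol stream."""
    out = out if out is not None else sys.stdout
    out.write(json.dumps(message) + "\n")
    out.flush()  # flush so the reader on the other end is not left waiting


# Demo with in-memory streams standing in for stderr and stdout.
log_stream, proto_stream = io.StringIO(), io.StringIO()
log = configure_logging(log_stream)
log.info("loaded document")
send({"jsonrpc": "2.0", "id": 1, "result": "ok"}, proto_stream)
```

After the demo runs, `proto_stream` holds a single parseable JSON line and the log text appears only in `log_stream`. A stray `print()` to stdout would break that property, which is why all diagnostics go through the logger.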