
How to Extract Document Data with Claude Using Lido MCP

April 20, 2026

To extract document data with Claude, install the Lido MCP server with one command: claude mcp add lido -- npx -y @lido-app/mcp-server. Once connected, you can ask Claude to pull fields from PDFs, images, and scanned documents by describing what you need in plain English. Lido's template-free extraction engine reads the document, returns structured data, and can export it however you want. The setup takes about five minutes and works with Claude Code, Claude Desktop, Cursor, and any other MCP-compatible client.

What the Lido MCP server does

MCP (Model Context Protocol) is an open standard that lets AI assistants connect to external tools. The Lido MCP server connects Claude to Lido's document extraction API, so instead of bouncing between your AI assistant and a separate document processing tool, you just tell Claude what to extract.

The server gives Claude four tools:

Tool | What it does
authenticate | Opens a local browser window where you paste your Lido API key. One-time setup that persists across sessions.
extract_file_data | Processes a document (PDF, image, scan) and returns structured data in the columns you specify.
extraction_tips | Gives Claude techniques for refining extractions when results need adjustment.
extractor_usage | Shows your remaining monthly page quota.

In practice, this means Claude can read an invoice, pull out line items, totals, and vendor details, then write everything to a CSV or push it into a spreadsheet, all in one conversation. You never leave your editor.

How to install the Lido MCP server

You need Node.js 18 or later. The server runs locally via npx, so there's nothing to clone or build.
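If you're not sure which Node.js version you have, a quick check before installing can save a confusing npx error. Here's a minimal Python sketch that parses the output of node --version and compares it to the Node.js 18 minimum stated above. The helper names are ours, not part of Lido or Node.js.

```python
import re
import subprocess
from typing import Optional

MIN_NODE_MAJOR = 18  # minimum stated in this article


def node_major(version_output: str) -> Optional[int]:
    """Parse the major version from `node --version` output such as 'v18.17.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    return int(match.group(1)) if match else None


def node_is_new_enough(version_output: str) -> bool:
    """True when the reported major version meets the minimum."""
    major = node_major(version_output)
    return major is not None and major >= MIN_NODE_MAJOR


def installed_node_version() -> Optional[str]:
    """Return the raw `node --version` output, or None if node is not on PATH."""
    try:
        result = subprocess.run(
            ["node", "--version"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()
```

Call installed_node_version() and pass the result to node_is_new_enough(). A None result means Node.js isn't installed or isn't on your PATH.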

Option 1: Claude Code (terminal)

If you use Claude Code, run one command:

claude mcp add lido -- npx -y @lido-app/mcp-server

That registers the Lido server as an MCP connection. Claude Code launches it automatically when you start a session.

Option 2: Claude Desktop

Open your config file at ~/Library/Application Support/Claude/claude_desktop_config.json on Mac or %APPDATA%\Claude\claude_desktop_config.json on Windows. Add Lido to the mcpServers section:

{
  "mcpServers": {
    "lido": {
      "command": "npx",
      "args": ["-y", "@lido-app/mcp-server"]
    }
  }
}

Restart Claude Desktop and you'll see the Lido tools in the tools menu.
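If your config file already lists other MCP servers, add the lido entry alongside them rather than replacing the whole mcpServers section. The Python sketch below shows one way to do that merge. The function names are illustrative, and it assumes the file holds a single JSON object (or is empty).

```python
import json
from copy import deepcopy

# The entry from the config snippet above.
LIDO_ENTRY = {"command": "npx", "args": ["-y", "@lido-app/mcp-server"]}


def add_lido_server(config: dict) -> dict:
    """Return a copy of the config with the Lido entry added under mcpServers."""
    merged = deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["lido"] = deepcopy(LIDO_ENTRY)
    return merged


def update_config_text(text: str) -> str:
    """Parse config JSON (empty text counts as {}), add Lido, and re-serialize."""
    config = json.loads(text) if text.strip() else {}
    return json.dumps(add_lido_server(config), indent=2)
```

Read the config file, pass its contents to update_config_text(), and write the result back. Existing servers and any other top-level settings are left as they were.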

Option 3: Any MCP-compatible client

The server works with anything that supports MCP. Launch it with:

npx -y @lido-app/mcp-server

Cursor, Windsurf, Cline, and other MCP-compatible editors can connect using their own config methods. See our full roundup of MCP servers for document processing for other options.

How to authenticate with Lido

The first time you use the Lido tools, Claude calls authenticate, which opens a browser window at http://127.0.0.1. Paste your Lido API key there. If you don't have one yet, sign up at lido.app and grab it from account settings.

Credentials get saved to .lido-mcp/credentials.json in your project directory and are automatically gitignored. One-time thing. No environment variables, no manual config editing.

How to extract data from documents with Claude

Once you're set up, extraction is conversational. Drop a PDF or image into your project directory and tell Claude what you need.

Example: invoice extraction

You type: "Extract the vendor name, invoice number, invoice date, line items, and total from ./invoice.pdf"

Claude calls extract_file_data, sends the file to Lido, and returns a structured table with those fields. It works regardless of layout. A three-column invoice from a German distributor and a simple one-pager from a local vendor get processed the same way. This is the same template-free extraction that Lido uses in its web app.

Example: batch processing

You type: "There are 50 invoices in the ./invoices/ folder. Extract vendor name, invoice number, date, and total from each one and write the results to results.csv"

Claude processes each file, collects the results, and writes the CSV. For 50 standard invoices this takes a few minutes. That's hours of manual data entry gone.
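Claude will usually write the CSV for you, but it helps to know what that step looks like if you want to review or reuse it. Here's a minimal Python sketch that assumes each extraction comes back as a dictionary keyed by column name. The column names and row shape are our assumptions for illustration, not Lido's documented output format.

```python
import csv
import io
from typing import Iterable, Mapping, Sequence

# Hypothetical column names matching the fields requested in the prompt.
FIELDS = ("vendor_name", "invoice_number", "date", "total")


def rows_to_csv(rows: Iterable[Mapping[str, str]], fields: Sequence[str] = FIELDS) -> str:
    """Serialize extracted rows to CSV text, one line per document."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fields), lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Missing fields become empty cells; keys outside `fields` are dropped.
        writer.writerow({name: row.get(name, "") for name in fields})
    return buffer.getvalue()
```

Leaving missing fields as empty cells keeps one row per document, so gaps are easy to spot when you review results.csv.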

Example: non-invoice documents

This isn't limited to invoices. It also works for receipts, purchase orders, bank statements, tax forms like K-1s, insurance documents, shipping documents, and contracts. Describe the fields and you get structured data back.

"Extract the patient name, provider, service date, billed amount, and allowed amount from this EOB (explanation of benefits)" works the same way.

How to refine document extractions

Most extractions work on the first try. When they don't, the extraction_tips tool gives Claude refinement techniques so you can fix things without leaving the conversation.

A few common ones: you can add field-level instructions like "The invoice number on this vendor's documents is in the top right corner labeled 'Ref No.'" to help with ambiguous layouts. You can set column formatting ("dates as YYYY-MM-DD," "strip currency symbols") to match what your downstream system expects. And you can add computed columns ("multiply quantity by unit price") to validate totals or create derived fields.
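To make those instructions concrete, here's a Python sketch of the results they ask for: dates normalized to YYYY-MM-DD, currency symbols stripped, and a quantity-times-unit-price check against the line total. It's a local illustration of the expected output, not how Lido implements formatting. The accepted date formats and the dot-as-decimal-separator rule are our assumptions.

```python
from datetime import datetime
from decimal import Decimal

# Assumed input styles; extend as needed for your documents.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def to_iso_date(text: str) -> str:
    """Normalize a date string to YYYY-MM-DD using the first format that parses."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def strip_currency(text: str) -> Decimal:
    """Drop currency symbols and thousands separators (assumes '.' is the decimal point)."""
    cleaned = "".join(ch for ch in text if ch.isdigit() or ch in ".-")
    return Decimal(cleaned)


def line_total_matches(quantity: str, unit_price: str, line_total: str) -> bool:
    """Computed-column check: quantity * unit price should equal the line total."""
    return Decimal(quantity) * strip_currency(unit_price) == strip_currency(line_total)
```

Decimal avoids the rounding surprises that floats introduce when you compare money amounts for equality.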

You look at the results, tell Claude what's off, it adjusts. No template editor, no regex, no retraining.

Tracking your API usage

Ask Claude "How many pages do I have left this month?" and it checks your remaining quota via the extractor_usage tool. Useful when you're running a big batch job and want to know if you'll hit your limit.
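Since the quota is counted in pages, a little arithmetic tells you how far a batch will get before you start it. Here's a small Python sketch, assuming you already know each document's page count and that every page counts once against the quota.

```python
from typing import Sequence


def documents_that_fit(remaining_pages: int, page_counts: Sequence[int]) -> int:
    """Return how many documents, processed in order, fit within the remaining quota."""
    used = 0
    for done, pages in enumerate(page_counts):
        if used + pages > remaining_pages:
            return done  # the next document would exceed the quota
        used += pages
    return len(page_counts)
```

If the result is smaller than the number of files in the folder, split the batch or wait for your quota to reset.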

Document extraction workflows with MCP

The interesting part of MCP isn't the extraction itself. It's that Claude can chain extraction with other tasks in a single conversation. These multi-step workflows are what make AI agents for document processing practical. For a step-by-step implementation, see our guide to building a document processing agent with Claude Code.

Invoice data to spreadsheet

"Extract vendor, invoice number, date, and total from every PDF in ./ap-inbox/ and append to my Google Sheet." Claude extracts, formats, and writes the data. You review the spreadsheet and approve for payment. For more on this workflow, see our guide to automating invoice extraction from email.

Purchase order vs. invoice comparison

"Compare the line items on this purchase order against this invoice and flag discrepancies." Claude extracts from both documents and cross-references, catching quantity mismatches, price differences, and missing items. This is one of the use cases where document automation saves the most time.

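The cross-referencing step in a purchase order comparison can be sketched in a few lines of Python. This version assumes both documents have already been extracted into line items that share a SKU or item code. The LineItem shape is hypothetical, and real documents may need fuzzier matching on descriptions.

```python
from decimal import Decimal
from typing import Dict, List, NamedTuple


class LineItem(NamedTuple):
    sku: str  # hypothetical key shared by the PO and the invoice
    quantity: int
    unit_price: Decimal


def compare_line_items(po: List[LineItem], invoice: List[LineItem]) -> List[str]:
    """List quantity mismatches, price differences, and missing items by SKU."""
    po_by_sku: Dict[str, LineItem] = {item.sku: item for item in po}
    inv_by_sku: Dict[str, LineItem] = {item.sku: item for item in invoice}
    issues: List[str] = []
    for sku in sorted(po_by_sku.keys() | inv_by_sku.keys()):
        ordered, billed = po_by_sku.get(sku), inv_by_sku.get(sku)
        if billed is None:
            issues.append(f"{sku}: on PO but missing from invoice")
        elif ordered is None:
            issues.append(f"{sku}: on invoice but not on PO")
        else:
            if ordered.quantity != billed.quantity:
                issues.append(f"{sku}: quantity {ordered.quantity} ordered vs {billed.quantity} billed")
            if ordered.unit_price != billed.unit_price:
                issues.append(f"{sku}: unit price {ordered.unit_price} ordered vs {billed.unit_price} billed")
    return issues
```

An empty list means the two documents agree on every item, quantity, and unit price.
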
Developer data pipeline

"Read the bank statement PDF, extract all transactions, and write a JSON file with date, description, amount, and running balance." You get clean, structured JSON ready to ingest into your app or database. For more API-focused options, see our roundup of document extraction APIs for developers.

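Before you ingest extracted transactions, it's worth a sanity check that the running balance column adds up. Here's a Python sketch, assuming each transaction is a dictionary with amount and running_balance stored as strings, and that deposits are positive and withdrawals negative. The key names and sign convention are our assumptions, not a documented Lido format.

```python
from decimal import Decimal
from typing import List, Mapping


def check_running_balance(transactions: List[Mapping[str, str]], opening_balance: str) -> List[int]:
    """Return indexes of rows whose running balance != previous balance + amount."""
    bad_rows: List[int] = []
    balance = Decimal(opening_balance)
    for index, row in enumerate(transactions):
        expected = balance + Decimal(row["amount"])
        if Decimal(row["running_balance"]) != expected:
            bad_rows.append(index)
        # Re-sync to the statement's own figure so a misread amount flags only its row.
        balance = Decimal(row["running_balance"])
    return bad_rows
```

Because the check re-syncs after every row, a misread balance flags that row and the one after it, which points you to the right spot in the PDF.
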
Compliance document check

"Extract the certificate holder, policy number, coverage amounts, and expiration date from these 30 COI PDFs. Flag any that expire before June 2026." Claude extracts the data and surfaces the compliance issues in one pass.

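The flagging step in a compliance check is a simple date comparison once the data is extracted. This Python sketch assumes expiration dates have been normalized to YYYY-MM-DD and reads "before June 2026" as before June 1, 2026. The dictionary keys are hypothetical.

```python
from datetime import date
from typing import Iterable, List, Mapping

CUTOFF = date(2026, 6, 1)  # "before June 2026" read as before 2026-06-01


def expiring_before(certificates: Iterable[Mapping[str, str]], cutoff: date = CUTOFF) -> List[str]:
    """Return policy numbers whose expiration date falls before the cutoff."""
    flagged: List[str] = []
    for cert in certificates:
        if date.fromisoformat(cert["expiration_date"]) < cutoff:
            flagged.append(cert["policy_number"])
    return flagged
```

A certificate that expires exactly on June 1, 2026 is not flagged under this reading. Adjust the cutoff if your policy treats that date differently.
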
When to use the MCP server vs. the Lido web app

Lido's web app at lido.app is the better fit for teams who want a visual interface, email ingestion, and built-in approval workflows. The MCP server is for developers and power users who live in the terminal or work in AI-native editors and want extraction woven into what they're already doing.

The main reasons to use MCP over the web app: you stay in Claude without context switching, you can chain extraction with file operations, code generation, and database writes in one conversation, and you can build repeatable pipelines that run from the terminal.

Both use the same extraction engine. Same accuracy, same document support, same template-free approach. The difference is where you interact with it.

Getting started with Lido MCP

The setup takes about five minutes:

  1. Install Node.js 18+ if you don't have it.
  2. Run claude mcp add lido -- npx -y @lido-app/mcp-server (for Claude Code) or add the config JSON (for Claude Desktop).
  3. Start a Claude session and ask it to extract data from a document.
  4. Authenticate when prompted by pasting your Lido API key.
  5. Describe the fields you want. Get structured data back.

The Lido MCP server is open source under the MIT license, and the source code is on GitHub. Extraction uses your Lido account quota. If you don't have an account, sign up free at lido.app.

Frequently asked questions

What is the Lido MCP server?

It's an MCP integration that connects Claude (and other MCP-compatible AI assistants) to Lido's document extraction API. You describe the fields you need from a PDF, image, or scanned document, and Claude extracts them using Lido's engine. No templates, no model training, no separate upload interface.

Do I need to be a developer to use the Lido MCP server?

You need to be comfortable running a command in your terminal, but you don't need to write code. Installation is one command. Authentication happens in your browser. After that, extraction is conversational: tell Claude what you want from a document, and it returns the results.

What document types can I extract data from with Lido MCP?

Anything with structured or semi-structured data: invoices, receipts, purchase orders, bank statements, tax forms (W-2s, K-1s), insurance documents, bills of lading, contracts, certificates of insurance, and medical claims. The extraction is template-free, so it handles any layout without per-document-type setup.

How accurate is extraction through the MCP server compared to the Lido web app?

Identical. The MCP server calls the same extraction API. Same engine, same accuracy. The only difference is the interface: MCP runs inside Claude, while the web app has a visual UI with email ingestion and team collaboration features.

Does the Lido MCP server work with AI tools other than Claude?

Yes. It implements the open MCP standard, so any MCP-compatible client can use it, including Claude Code, Claude Desktop, Cursor, Windsurf, and Cline. The install command varies by client, but the server is the same.

Is the Lido MCP server free?

The MCP server is free and open source (MIT license). Document extraction uses your Lido account's page quota. There's a free tier to get started. You can check your remaining pages anytime by asking Claude "How many pages do I have left?"
