How to extract data from remittance advices (any format)

March 22, 2026

If you work in accounts receivable or cash application, you already know this: you can receive ten remittance advices in a single day and no two of them look alike. One customer sends a six-page PDF with two columns per page and thirty invoice lines. Another sends a single-page table with four columns. A third sends an Excel file where the column labeled "Reference Number" is what everyone else calls "Invoice Number." A fourth buries three debit memos on the last page after the actual payment detail.

Each format is its own puzzle, and the puzzle changes every time a customer updates their ERP or switches their accounts payable software. For a company with hundreds of customers, manually keying this data into your own system for cash application is a real time sink, and the error rate climbs fast when someone is retyping check amounts from a dense two-column FedEx statement at 4pm on a Friday.

To extract data from remittance advices in any format, use an AI-powered extraction tool rather than template-based OCR. Set up a single output template with your required columns (invoice number, payment amount, discount, check number, customer name) and the AI maps each customer's field names to your columns automatically, regardless of layout. The extracted data downloads as CSV or Excel formatted for direct import into your ERP's cash application module.

This post covers how AI-powered extraction handles remittance advices in any format, what fields you should be pulling, and how to set up a workflow that outputs data in exactly the format your ERP needs for import.

Why template-based OCR doesn't work for remittance advices

Traditional OCR tools work by matching field positions. You define a template: "Invoice Number is at coordinates X,Y on the page." That works fine if you're only processing one document type from one source, like your own purchase orders, which always look the same.

Remittance advices are the opposite situation. Every customer's AP system generates a different layout. If you have 300 customers, you'd theoretically need 300 templates. In practice, most companies don't build them. They just keep manually keying the data.

Some AR teams try to reduce the problem by only building templates for their top 20 customers by volume. That helps at the margin, but you're still manually processing everything outside that group, and even your template customers will occasionally send something that breaks the template: a new page added, a column reordered, a field renamed.

To understand more about why coordinate-based OCR has these limitations, see our post on what OCR data extraction actually does.

What fields you need from a remittance advice

Before setting up any extraction workflow, get clear on what your ERP's cash application import format requires. The fields are usually a subset of:

Invoice number -- the reference that ties the payment back to an open invoice in your system. This is the most important field, and it's the one with the most naming variation across customers. You'll see "Invoice #", "Invoice Number", "Reference Number", "PO Number", "Document Number", and more, all meaning the same thing.

Invoice date -- when the invoice was originally issued. Not always present, but useful for reconciliation when you have multiple invoices with similar amounts.

Due date -- less common on remittance advices, but some customers include it.

Payment amount -- the amount being paid against that specific invoice line. Sometimes labeled "Net Amount", "Amount Paid", or "Payment".

Discount amount -- if the customer is taking an early payment discount, this appears as a separate column. Some customers deduct it silently from the payment amount instead; you need to know which is happening.

Check number -- the ACH trace number or check number for the payment. Often appears in a header rather than on each line.

Check amount -- total payment amount. Also usually in the header.

Customer or vendor name -- who sent the payment. You need this for matching against your customer master.

That's the standard set. Your ERP import template likely has specific column names and formatting requirements for each of these.
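One way to pin down that target format before configuring anything is to write it out as a small schema. The sketch below is a hypothetical Python representation, not part of Lido or any ERP. The field names mirror the list above, and the fields that customers often omit default to None.

```python
from dataclasses import dataclass, fields
from decimal import Decimal
from typing import Optional


@dataclass
class RemittanceLine:
    """One invoice line from a remittance advice (hypothetical schema)."""
    invoice_number: str
    payment_amount: Decimal
    check_number: str
    check_amount: Decimal
    customer_name: str
    # Fields that many customers leave off default to None.
    invoice_date: Optional[str] = None
    due_date: Optional[str] = None
    discount_amount: Optional[Decimal] = None


def column_names() -> list[str]:
    """Return the column order for the import file."""
    return [f.name for f in fields(RemittanceLine)]


line = RemittanceLine(
    invoice_number="INV-1001",
    payment_amount=Decimal("980.00"),
    check_number="445566",
    check_amount=Decimal("980.00"),
    customer_name="Acme Corp",
    discount_amount=Decimal("20.00"),
)
```

Writing the schema down first makes it easier to spot which fields your ERP treats as required and which can legitimately be blank.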

How AI extraction handles format variation

With an AI-powered tool like Lido, you set up one extraction template that defines your desired output columns. Those column names can match exactly what your ERP expects for import, for example: Invoice_Number, Invoice_Date, Payment_Amount, Discount_Amount, Check_Number, Check_Amount, Customer_Name.

When you run a document through the tool, the AI reads the actual field names on that specific remittance advice and maps each one to the matching output column. If one customer uses "Reference Number" on their RA, the tool maps it to your Invoice_Number column. If another uses "Net Amount," that maps to Payment_Amount. You don't create a new template for each customer; the single template works across all of them.
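The end result of that mapping can be illustrated with a plain lookup table. This is only a sketch of the outcome, not of how Lido or any AI model works internally: an AI tool infers these correspondences from context, whereas the hypothetical ALIASES dictionary below would have to be maintained by hand.

```python
# Hypothetical alias table: source header (lowercased) -> output column.
ALIASES = {
    "invoice #": "Invoice_Number",
    "invoice number": "Invoice_Number",
    "reference number": "Invoice_Number",
    "document number": "Invoice_Number",
    "net amount": "Payment_Amount",
    "amount paid": "Payment_Amount",
    "payment": "Payment_Amount",
}


def map_row(source_row: dict[str, str]) -> dict[str, str]:
    """Rename a customer's headers to the output column names.

    Headers with no known alias are dropped.
    """
    mapped = {}
    for header, value in source_row.items():
        column = ALIASES.get(header.strip().lower())
        if column is not None:
            mapped[column] = value
    return mapped


customer_a = {"Reference Number": "INV-1001", "Net Amount": "980.00"}
customer_b = {"Invoice #": "INV-2002", "Amount Paid": "150.25"}
rows = [map_row(customer_a), map_row(customer_b)]
```

Both customers end up with the same two output columns even though neither used your column names. The weakness of a hand-built table is obvious: every new label a customer invents needs a new entry, which is the maintenance burden the single-template approach is meant to avoid.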

This is meaningfully different from how document parsing tools have traditionally worked, where the parser needs explicit instructions about field positions and names for each document variation.

Multi-row extraction: getting every invoice line, not just the first

A remittance advice isn't usually paying one invoice. A single check from a large customer might cover 40 or 50 invoices. The RA lists each one as a separate row in a table.

When you configure Lido for remittance advice extraction, you specify that the data repeats in rows, and the tool extracts every row from the document into separate rows in your output file. A six-page FedEx statement with two columns of invoice lines per page produces one output row per invoice line, properly structured.

This sounds obvious, but it's a meaningful distinction from extraction tools that are designed for single-value fields (like pulling a total from an invoice). Multi-row extraction requires the tool to correctly identify where the table starts, where it ends, and how to handle page breaks mid-table.

Extracting a multi-page, two-column remittance advice (FedEx example)

One manufacturing company using Lido for cash application had a specific challenge with FedEx remittances. FedEx sends a statement format that runs multiple pages, with two columns of invoice detail per page. You effectively have two parallel tables side by side, each with its own rows.

This is a document that would break most template-based tools. Two-column table layouts require the parser to understand that the left column and right column are both part of the same data set, not separate sections of the document.

The AR specialist at this company set up their Lido template, uploaded the FedEx PDF, and had the extracted data in Excel on the first attempt. They did it overnight, without any help from support. The output had one row per invoice line, with all the fields mapped correctly to their ERP's import format. They hadn't done any special configuration for the FedEx format specifically. The same template they use for single-page remittances handled it.
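To make the two-column problem concrete, here is a minimal sketch of merging side-by-side tables into one list of invoice lines. It assumes each page has already been parsed into a left and a right list of rows (the Page structure is hypothetical), and it assumes the reading order is the left column top to bottom, then the right column. A real statement may order its lines differently.

```python
from typing import TypedDict


class Page(TypedDict):
    left: list[dict[str, str]]
    right: list[dict[str, str]]


def flatten_two_column(pages: list[Page]) -> list[dict[str, str]]:
    """Return one row per invoice line across all pages."""
    lines = []
    for page in pages:
        # Both columns belong to the same data set, so they are
        # appended to a single list rather than kept as separate tables.
        lines.extend(page["left"])
        lines.extend(page["right"])
    return lines


pages: list[Page] = [
    {"left": [{"Invoice_Number": "A1"}, {"Invoice_Number": "A2"}],
     "right": [{"Invoice_Number": "A3"}]},
    {"left": [{"Invoice_Number": "A4"}], "right": []},
]
invoice_lines = flatten_two_column(pages)
```

The merge itself is trivial. The hard part, and the part that breaks coordinate-based templates, is recognizing that the page contains two parallel tables in the first place.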

Handling fields that aren't on the document

Not every remittance advice includes every field you want. A customer might not include invoice dates. Another might not break out the discount amount separately. Some won't have your internal vendor number at all. They'll just have their own company name.

When a field in your output template isn't present on the source document, Lido outputs null or blank for that field rather than attempting to fill it in with something incorrect. That matters for ERP imports, where a single misformatted field can cause an entire batch to be rejected.

For fields that are absent but derivable, like your internal vendor ID, there's a separate mechanism: context documents.

Using context documents to add data that isn't on the RA

Here's a common situation: your extraction output needs a vendor ID or customer ID for ERP import, but that number doesn't appear anywhere on the document the customer sends you. Their RA just has their company name.

Lido lets you upload a context document, up to 50 pages, alongside your extraction template. For cash application, that's typically your vendor master or customer master list: a file that maps customer names to your internal IDs.

When the tool extracts a customer name from the RA, it cross-references that name against your context document and pulls the corresponding vendor ID into your output. If the RA says "Federal Express Corporation" but your customer master has it as "FedEx," the AI does fuzzy matching against the context document to make the connection, not a rigid exact-match lookup.
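For comparison, here is roughly what a hand-rolled name lookup looks like using Python's standard difflib module. This is a sketch under stated assumptions, not Lido's matching logic: the CUSTOMER_MASTER entries, the IDs, and the 0.8 similarity cutoff are all made up for illustration.

```python
import difflib
from typing import Optional

# Hypothetical customer master: name as stored in the ERP -> internal ID.
CUSTOMER_MASTER = {
    "FedEx": "00001234",
    "Acme Industries, Inc.": "00005678",
    "Apex Industrial Supply": "00009012",
}


def lookup_customer_id(name_on_ra: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the ID of the closest master name, or None if nothing is close."""
    by_lower = {name.lower(): cid for name, cid in CUSTOMER_MASTER.items()}
    matches = difflib.get_close_matches(
        name_on_ra.lower(), list(by_lower), n=1, cutoff=cutoff
    )
    return by_lower[matches[0]] if matches else None


# Differs from the master only in punctuation: a match is found.
close_variant = lookup_customer_id("Acme Industries Inc")
# Shares few characters with "FedEx": no match is found.
different_wording = lookup_customer_id("Federal Express Corporation")
```

Character-based similarity catches punctuation and suffix differences, but it returns nothing for "Federal Express Corporation" against "FedEx" because the two strings share so few characters. Closing that gap by hand means maintaining an alias list per customer, which is the kind of match the context-document approach is meant to handle for you.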

This removes one of the most tedious steps in manual cash application: looking up the customer ID for every remittance you process.

How to exclude debit memos and cover pages from extraction

Many remittance advices have pages you want to ignore. Common examples: a cover page with no data, a page of payment terms and conditions, or a section of debit memos that need to be handled separately from standard payment lines.

In Lido, you can set page exclusion rules. For example: skip any page where the header text contains "Terms and Conditions." Or skip the last page of documents from a specific customer that always contains debit memo detail you're handling through a different workflow.

Skipped pages don't count against your credit usage. If a document is ten pages and you're excluding two, you're charged for eight.
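The two example rules above can be expressed as a small filter. This is a hypothetical sketch of the logic only, assuming each page's text is already available as a string; it is not Lido's rule syntax.

```python
def should_skip(page_text: str, page_index: int, page_count: int,
                skip_last_page: bool = False) -> bool:
    """Apply two hypothetical exclusion rules to one page."""
    stripped = page_text.strip()
    # Treat the first non-empty line as the page header.
    header = stripped.splitlines()[0] if stripped else ""
    if "terms and conditions" in header.lower():
        return True
    if skip_last_page and page_index == page_count - 1:
        return True
    return False


def pages_to_extract(pages: list[str], skip_last_page: bool = False) -> list[int]:
    """Return the zero-based indexes of pages that should be extracted."""
    return [
        i for i, text in enumerate(pages)
        if not should_skip(text, i, len(pages), skip_last_page)
    ]


document = [
    "Remittance Advice\nINV-1001  980.00",
    "Terms and Conditions\nPayment is due within 30 days.",
    "Remittance Advice\nINV-1002  150.25",
    "Debit Memos\nDM-77  -45.00",
]
kept = pages_to_extract(document, skip_last_page=True)
```

Here the terms page and the trailing debit memo page are dropped, so only two of the four pages go through extraction.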

Formatting output to match your ERP exactly

ERP import files are picky. SAP Business One needs dates in YYYY-MM-DD format. NetSuite requires Internal IDs rather than invoice numbers for payment application. QuickBooks Desktop has its own column mapping per transaction type. Your Oracle instance might need amounts without currency symbols and with exactly two decimal places. Vendor numbers in your system might be zero-padded to eight digits, so vendor 1234 needs to appear as 00001234 in the import file.

Lido accepts extra instructions alongside your template. You can specify: "Output all dates in MM/DD/YYYY format." "Output all amounts as numbers with two decimal places, no currency symbol." "Add leading zeros to vendor numbers so they are eight characters long." These instructions apply across all documents processed with that template, so you set them once and they're consistent across every RA you run.
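If you ever need to apply the same rules yourself, for instance in a post-processing script, they come down to a few lines of Python. The function names here are hypothetical, and the source date format is an assumption (US-style MM/DD/YYYY) that you would adjust per customer.

```python
from datetime import datetime
from decimal import Decimal, ROUND_HALF_UP


def format_date(raw: str, source_format: str = "%m/%d/%Y") -> str:
    """Convert a date string to YYYY-MM-DD."""
    return datetime.strptime(raw, source_format).strftime("%Y-%m-%d")


def format_amount(raw: str) -> str:
    """Strip the dollar sign and thousands separators; keep two decimals."""
    cleaned = raw.replace("$", "").replace(",", "").strip()
    amount = Decimal(cleaned).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return str(amount)


def format_vendor_id(raw: str, width: int = 8) -> str:
    """Left-pad a vendor number with zeros to a fixed width."""
    return raw.strip().zfill(width)
```

For example, format_amount("$1,234.5") returns "1234.50" and format_vendor_id("1234") returns "00001234". Decimal is used instead of float so that amounts are never subject to binary rounding surprises.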

The output downloads as CSV or Excel. If your ERP takes a delimited text file, CSV is the natural starting point; for a tab-delimited import, you may need to convert the delimiter first. If you're doing any intermediate manipulation in Excel before import, the Excel download keeps number formats intact, which matters when amounts that look like "$1,234.50" in a PDF need to come out as 1234.50 in a cell, not as text.
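If the import expects tab-delimited text rather than commas, the conversion is small. Here is a minimal sketch using Python's standard csv module, which treats every value as a string and so leaves leading zeros and decimal places untouched. The column names in the sample are illustrative.

```python
import csv
import io


def csv_to_tab_delimited(csv_text: str) -> str:
    """Re-emit CSV text with tabs as the delimiter, keeping values as text."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


sample = "Vendor_ID,Payment_Amount\n00001234,1234.50\n"
converted = csv_to_tab_delimited(sample)
```

Doing the conversion in a script rather than by opening the file in Excel avoids the usual problem of the spreadsheet reinterpreting 00001234 as the number 1234.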

Email automation for high-volume AR teams

If you're processing remittances daily, logging into a tool and uploading files manually adds friction. Lido supports automated email processing: you get a dedicated inbox address, and any remittance advice forwarded to that address gets processed automatically against your template.

In practice, this means your team can forward RAs directly from their email client as they arrive, without switching tools. Or you can set up a rule in your email system to auto-forward from a specific customer to the Lido inbox. Either way, the extracted data is waiting in your Lido account when you're ready to pull it into your ERP.

Manufacturing companies with high transaction volumes and multiple remittance formats coming in daily have used this approach to handle the bulk of their cash application data capture automatically, with manual review reserved for exceptions like documents where key fields came back null.
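That exception step can be as simple as splitting the extracted rows into two piles. The sketch below is an assumption about how you might do it in your own script; the KEY_FIELDS tuple is hypothetical and should list whatever your ERP import treats as mandatory.

```python
from typing import Optional

# Assumed set of fields that must be present before a row can be imported.
KEY_FIELDS = ("Invoice_Number", "Payment_Amount", "Check_Number")

Row = dict[str, Optional[str]]


def split_for_review(rows: list[Row]) -> tuple[list[Row], list[Row]]:
    """Separate rows that are ready to import from rows needing manual review."""
    ready, review = [], []
    for row in rows:
        # A field counts as missing if it is absent, None, or an empty string.
        missing = [f for f in KEY_FIELDS if row.get(f) in (None, "")]
        (review if missing else ready).append(row)
    return ready, review


extracted = [
    {"Invoice_Number": "INV-1001", "Payment_Amount": "980.00", "Check_Number": "445566"},
    {"Invoice_Number": None, "Payment_Amount": "150.25", "Check_Number": "445566"},
    {"Invoice_Number": "INV-1003", "Payment_Amount": "", "Check_Number": "445566"},
]
ready, review = split_for_review(extracted)
```

Only the first row goes straight to import. The other two land in the review pile, so a person looks at them before they can cause a batch rejection.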

How to set up remittance advice extraction in Lido

The setup process for a remittance advice template in Lido is straightforward. You define your output columns (match them to your ERP import format), add any formatting instructions, upload your context document if you need vendor ID cross-referencing, and set page exclusion rules if applicable.

Then you run a few test documents through it. Pick a sample from your most complex customers: two-column formats, documents with debit memos, documents with unusual field names. Check the output against what you'd expect. Adjust your instructions if anything is mapping incorrectly.

Once the template is working correctly on your test sample, it should handle new formats from new customers without additional configuration in most cases. When a customer you've never processed before sends a remittance advice, you run it through the same template. The AI reads their field names and maps them to your columns. If something unusual comes up, you can add a one-line instruction to your template to handle it going forward.

AR teams processing 150 or more remittances monthly report reducing manual entry time by 70-80% after switching to AI extraction. For teams that currently spend hours per week on manual remittance data entry, the single-template approach pays off immediately compared to either manual entry or maintaining a library of customer-specific templates. The more format variation you deal with across your customer base, the larger the gap.

Frequently asked questions

How do I extract data from a remittance advice PDF?

Upload the PDF to an AI-powered extraction tool like Lido, which reads the document contextually and pulls structured data into columns you define. Unlike template-based OCR, AI extraction handles any PDF layout without per-vendor configuration. The output downloads as CSV or Excel with one row per invoice line, formatted for direct ERP import.

Can AI extract data from remittance advices in different formats?

Yes. AI-powered extraction reads documents contextually, understanding that "Reference Number" on one document means the same thing as "Invoice Number" on another. A single extraction template handles remittance advices from any customer regardless of layout, column names, or formatting differences.

What is multi-row extraction for remittance advices?

Multi-row extraction means the tool pulls every invoice line from a remittance advice into separate rows in your output file, not just the first line or the header totals. A single remittance covering 40 invoices produces 40 rows of extracted data, each with the correct invoice number, amount, and discount.

How do I handle remittance advices with extra pages like debit memos?

Use page exclusion rules to skip pages that contain irrelevant content like debit memos, terms and conditions, or cover pages. You can exclude by page number or by rule, such as "skip any page that does not contain a dollar amount." Excluded pages are not charged as credits.

How do I get the vendor ID if it is not on the remittance advice?

Upload your vendor master or customer master list as a context document. The extraction tool cross-references the customer name from the remittance advice against your master list and returns the matching vendor ID in your output, even if the name on the document differs slightly from your records.

What is the best output format for remittance advice data?

The best output format is whatever your ERP cash application import requires. Configure your extraction template column headers to match the ERP import template exactly, and use formatting instructions to enforce date formats, number formats, and leading zeros on ID fields so the CSV can be imported directly without manual reformatting.

How do I extract remittance advice data into Excel?

AI extraction tools output remittance advice data directly as Excel or CSV files. Each invoice line becomes a separate row with columns matching your defined fields. The Excel export preserves number formatting, which matters when amounts need specific decimal precision or when vendor IDs have leading zeros that CSV files can lose when opened in Excel.

What is a remittance advice?

A remittance advice is a document sent by a customer alongside a payment that details which invoices the payment covers. It typically lists invoice numbers, amounts, any discounts taken, and the total check or ACH payment amount. AR teams use remittance advices to apply incoming payments to the correct open invoices in their ERP system.
