Why PDF-to-Excel Converters Fail on Trade Documents

February 23, 2026

Every customs brokerage and freight forwarder has tried it. You get an 80-page combined packing list and commercial invoice PDF from a European supplier, you run it through a PDF-to-Excel converter, and the output is unusable. Either everything lands in one cell, with the entire document crammed into A1, or every line gets its own cell, completely out of order. The fields you actually need (country of origin, net weight, batch numbers, part numbers) are scattered, mismatched, or missing entirely. You end up spending just as long fixing the converter’s output as you would have spent keying the data by hand.

This isn’t a bug in the converter. It’s a fundamental mismatch between what PDF-to-Excel tools are designed to do and what trade document processing actually requires. The problem isn’t converting PDF to text. The problem is getting structured, matched data out of complex multi-document PDFs—and that’s a completely different task.

Lido is an AI-powered document extraction platform built for the complexity that PDF-to-Excel converters can’t handle. Upload a combined packing list and invoice PDF from any supplier, and Lido identifies the different document types, extracts fields like batch numbers, net weights, and country of origin, matches corresponding line items across documents, and normalizes inconsistent formatting—all without templates or per-supplier configuration. Customs brokers processing thousands of entries per month use Lido to turn hours of manual data entry into minutes of automated extraction.

What generic PDF-to-Excel converters actually do (and don’t do)

A standard PDF-to-Excel converter does one thing: it reads the text and tables in a PDF and tries to reproduce the layout in a spreadsheet. Some do this well for simple, single-format documents—a one-page invoice with a clean table, a bank statement with consistent columns. The converter detects the table grid, maps text into cells, and gives you a reasonable facsimile of the original document in Excel.

But that’s where the capability ends. PDF-to-Excel converters don’t understand what the data means. They don’t know that “Germany” on the commercial invoice and “DE” on the packing list refer to the same country of origin. They can’t match a batch number on a packing list to the corresponding line item on an invoice. They won’t flag that three line items are missing net weight or that country of origin has been omitted from an entire section. They don’t distinguish between a packing list and a commercial invoice when both appear in the same PDF file.

For a simple domestic invoice, this limitation doesn’t matter much. For international trade documents, it’s a dealbreaker.

Five specific ways trade documents break PDF-to-Excel tools

  1. Combined documents in a single PDF. In international trade, packing lists and commercial invoices routinely arrive as a single combined PDF: not as separate files, but as one continuous document that can run from 80 to 2,000 pages per shipment packet. A PDF-to-Excel converter treats this entire file as one document. It has no concept that pages 1–40 are a packing list and pages 41–80 are a commercial invoice with a different layout and different fields. The output is a jumbled spreadsheet that mixes packing list data with invoice data, with no way to tell which is which without manually reviewing every row.
  2. European and international invoice formats. European invoices are structured differently from North American ones, and the differences aren’t cosmetic. Field labels are in different languages. Number formatting uses commas for decimals and periods for thousands separators, the opposite of US conventions. Date formats vary. Tax calculations follow different rules. When these invoices go through a generic PDF-to-Excel converter, the result is predictable: you “either get everything in one field, or every line has its own box and it’s not in order.” The converter can’t parse what it doesn’t understand.
  3. Country code inconsistency across document types. A commercial invoice lists country of origin as the full name (“Germany,” “France,” “Japan”). The packing list for the same shipment uses ISO 2-letter codes (“DE,” “FR,” “JP”). When you’re preparing a customs entry, you need these to match. A PDF-to-Excel converter extracts both formats exactly as they appear and leaves you to manually reconcile “Germany” with “DE” across hundreds or thousands of line items. It doesn’t normalize, it doesn’t standardize, and it doesn’t flag the inconsistency.
  4. Missing fields that no one catches. Trade documents frequently have gaps. A line item might be missing net weight. An entire section might omit country of origin. A batch number might appear on the invoice but not on the packing list. When you’re manually processing entries, an experienced broker catches these gaps because they know what should be there. A PDF-to-Excel converter doesn’t know what’s supposed to exist. It only extracts what’s present. Missing data passes through silently, and you don’t discover the gap until you’re midway through filing the customs entry and a required field is blank.
  5. Layout variety across customers and even divisions. Every customer sends documents in a different format. But it’s worse than that: different divisions within the same customer often use different templates. The automotive parts division sends invoices in one layout. The electronics division uses another. The industrial supplies division uses a third. With roughly 50 fields needed per entry, spread across both packing lists and invoices, a PDF-to-Excel converter would need to be reconfigured for every single layout variation. In practice, no one does this. They just accept the bad output and fix it manually.
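
To make the number-format and country-code problems concrete, here is a minimal Python sketch of the cleanup a converter leaves to you. It is an illustration under stated assumptions, not a description of how any particular product works: the function names and the three-entry lookup table are hypothetical, and the number parser assumes you already know the value uses European conventions (a bare “1.234” is ambiguous on its own).

```python
from decimal import Decimal
from typing import Optional

# Hypothetical lookup table for illustration only; a real one would
# cover every ISO 3166-1 alpha-2 code and common spelling variants.
COUNTRY_TO_ISO2 = {"germany": "DE", "france": "FR", "japan": "JP"}


def parse_european_number(text: str) -> Decimal:
    """Parse a number that uses '.' for thousands and ',' for decimals.

    Assumes the caller already knows the value is in European format.
    """
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return Decimal(cleaned)


def normalize_country(value: str) -> Optional[str]:
    """Map a country name or 2-letter code to its ISO 2-letter code.

    Returns None for anything not in the lookup table, so unknown
    values can be flagged for review instead of passing through.
    """
    key = value.strip().lower()
    if key.upper() in COUNTRY_TO_ISO2.values():
        return key.upper()
    return COUNTRY_TO_ISO2.get(key)
```

With this sketch, parse_european_number("1.234,56") gives Decimal("1234.56"), and both “Germany” from the invoice and “DE” from the packing list normalize to the same “DE”. Returning None for unrecognized values is a deliberate choice: an unknown country should surface as a gap, not slip through as free text.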

The real problem isn’t text extraction—it’s structured data matching

When customs brokers describe what they actually need, it becomes clear that text extraction is the easiest part of the job. The hard part is everything that comes after.

  1. Matching data across document types. A shipment’s packing list and commercial invoice contain overlapping but different information about the same goods. The packing list has physical details—weights, dimensions, box counts, batch numbers. The invoice has financial details—unit prices, totals, payment terms. To prepare a customs entry, you need to combine data from both documents, matching each packing list line item to its corresponding invoice line item using shared identifiers like batch numbers, part numbers, or reference numbers. This is a data matching problem, not a text extraction problem. PDF-to-Excel converters don’t even attempt it.
  2. Normalizing inconsistent data. The same information appears in different formats across different documents and different suppliers. Country names versus country codes. Different date formats. Different number formatting conventions. Abbreviated versus full product descriptions. Getting usable data out of trade documents means normalizing all of this into a consistent format—something a converter that simply reproduces the PDF layout cannot do.
  3. Validating completeness. A customs entry requires approximately 50 fields. If any required field is missing from the source documents, someone needs to know before the entry is filed, not after. This means the extraction system needs to understand what fields are required, check whether each one is present, and flag gaps proactively. A PDF-to-Excel converter has no concept of required fields. It extracts what’s there and stays silent about what isn’t.
  4. Handling volume. A single shipment packet can run to 2,000 pages. A busy brokerage processes dozens of these per week. At this scale, the difference between a tool that extracts text and a tool that produces structured, matched, validated data is the difference between a minor time savings and a transformational one. As one customs broker put it: “If I have to spend an hour as opposed to six, we’re way ahead of the game.”
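
The matching and completeness steps above can be sketched in a few lines of Python once the rows have been extracted. Treat this as a rough illustration, not anyone’s production logic: the field names, the four-field REQUIRED_FIELDS tuple (a stand-in for the roughly 50 fields an entry needs), and both function names are hypothetical, and the sketch assumes each batch number appears at most once per document.

```python
# Hypothetical subset of required fields, for illustration only.
REQUIRED_FIELDS = ("batch_number", "net_weight", "country_of_origin", "unit_price")


def match_line_items(packing_rows, invoice_rows, key="batch_number"):
    """Join packing list rows to invoice rows on a shared identifier.

    Returns (matched, unmatched_packing, unmatched_invoice).
    Assumes the identifier is unique within each document.
    """
    invoice_by_key = {row[key]: row for row in invoice_rows if row.get(key)}
    matched, unmatched_packing, seen = [], [], set()
    for row in packing_rows:
        k = row.get(key)
        if k in invoice_by_key:
            merged = dict(invoice_by_key[k])
            # Only non-empty packing list values overwrite invoice values.
            merged.update({f: v for f, v in row.items() if v not in (None, "")})
            matched.append(merged)
            seen.add(k)
        else:
            unmatched_packing.append(row)
    unmatched_invoice = [r for r in invoice_rows if r.get(key) not in seen]
    return matched, unmatched_packing, unmatched_invoice


def missing_fields(record, required=REQUIRED_FIELDS):
    """List the required fields that are absent or empty in a record."""
    return [f for f in required if record.get(f) in (None, "")]
```

Running missing_fields over every matched record before filing is what turns a silent gap into a visible one. The two unmatched lists matter just as much: a batch number that appears on only one document is exactly the kind of discrepancy an experienced broker would catch by hand.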

What to use instead of PDF-to-Excel converters for trade documents

The tool category that solves this problem is AI-powered document extraction—not PDF-to-Excel conversion. The difference is fundamental. PDF-to-Excel converters reproduce layouts. AI-powered extraction understands documents.

Lido is an AI-powered document processing platform built for exactly this kind of complexity. Upload a combined packing list and invoice PDF—whether it’s 80 pages or 2,000—and Lido identifies the different document types within the file, extracts the relevant fields from each, matches corresponding line items across packing lists and invoices using batch numbers and reference numbers, normalizes country codes and formatting inconsistencies, and flags missing required fields before you start the customs entry. No templates. No configuration per supplier. No manual cleanup.

This is what separates purpose-built document extraction from generic conversion tools. The extraction system understands what a packing list is, what a commercial invoice is, and what data customs entry requires. It doesn’t just convert text—it produces the structured, matched dataset you actually need.

The work itself isn’t difficult or complicated. It’s just tedious. That’s the assessment from brokers who’ve been doing this manually for years. Matching batch numbers, normalizing country codes, flagging missing weights—none of it requires deep expertise. It just takes hours. And those hours multiply with every shipment. Automated invoice processing and document extraction take away the tedious part, letting brokers focus on the work that actually requires their expertise—tariff classification, compliance review, and customer communication.

For customs brokers and freight forwarders already struggling with trade document complexity, the path forward isn’t a better PDF-to-Excel converter. It’s a fundamentally different approach to document processing. Learn how customs brokers are using OCR to process import invoices and packing lists—and see why the results look nothing like what a PDF converter produces.

Frequently asked questions

Why do PDF-to-Excel converters fail on international trade documents?

PDF-to-Excel converters fail on trade documents because they only reproduce the visual layout of a PDF in spreadsheet form. They can’t handle the complexities specific to trade documents: combined packing list and invoice PDFs that run hundreds or thousands of pages, inconsistent country code formats across document types, missing required fields that need to be flagged, and different layouts from every supplier and division. Trade document processing requires data matching, normalization, and validation—none of which a layout converter provides.

Can a PDF converter handle combined packing list and invoice PDFs?

No. Standard PDF-to-Excel converters treat a combined PDF as a single document and produce a jumbled spreadsheet that mixes packing list data with invoice data. They have no concept that different pages contain different document types with different layouts and fields. AI-powered document extraction tools like Lido can identify the different document types within a single PDF, extract the relevant fields from each, and match corresponding line items across packing lists and invoices automatically.
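
To show what “identifying document types within a single PDF” involves at its very simplest, here is a deliberately naive Python sketch. It assumes the text of each page has already been extracted into a list of strings, and it labels pages using a small, hypothetical keyword table. This is a keyword heuristic for illustration only, not how Lido or any AI extraction system works; real packets vary far too much by supplier and language for keywords alone.

```python
from itertools import groupby

# Hypothetical keyword hints, for illustration only.
DOC_TYPE_KEYWORDS = {
    "packing_list": ("packing list", "packliste"),
    "commercial_invoice": ("commercial invoice", "handelsrechnung"),
}


def classify_page(text, previous=None):
    """Label a page by the first keyword group that matches its text.

    Pages with no keyword (continuation pages) inherit the previous label.
    """
    lowered = text.lower()
    for doc_type, keywords in DOC_TYPE_KEYWORDS.items():
        if any(k in lowered for k in keywords):
            return doc_type
    return previous


def split_by_document_type(page_texts):
    """Return [(doc_type, first_page, last_page)] using 1-based page numbers."""
    labels, current = [], None
    for text in page_texts:
        current = classify_page(text, current)
        labels.append(current)
    segments, page = [], 1
    for doc_type, group in groupby(labels):
        count = len(list(group))
        segments.append((doc_type, page, page + count - 1))
        page += count
    return segments
```

Even this toy version shows why a layout converter falls short: the page ranges have to be known before any field can be attributed to the right document. It also breaks easily. An invoice page that merely mentions the packing list would be mislabeled, which is the kind of ambiguity that calls for a system that understands the document instead of scanning for words.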

How do you match packing list items to invoice line items automatically?

AI-powered extraction matches packing list items to invoice line items using shared identifiers like batch numbers, part numbers, and reference numbers. The system extracts these identifiers from both document types, then links corresponding records automatically. This is something PDF-to-Excel converters cannot do because they extract text without understanding the relationships between data points across different sections or pages of a document.

What’s the difference between PDF-to-Excel conversion and AI document extraction?

PDF-to-Excel conversion reproduces the visual layout of a PDF in spreadsheet form—it maps text and tables into cells. AI document extraction understands what the data means: it identifies document types, extracts specific fields, normalizes inconsistent formats (like country names versus country codes), matches related data across document sections, and flags missing required information. For simple single-page documents, conversion may be sufficient. For complex trade documents with multiple document types, hundreds of pages, and 50+ required fields per entry, only AI extraction produces usable output.

How long does it take to process a large trade document packet with AI extraction?

Processing time depends on the document size, but the extraction itself typically takes minutes rather than hours. Customs brokers report going from six hours of manual data entry per shipment to about one hour of review and validation, a reduction that compounds across dozens of shipments per week. Documents ranging from 80 to 2,000 pages per packet can be processed without splitting the file or configuring templates for each supplier’s format.
