Most companies know their document processing is inefficient. Few know exactly how much it costs. And almost nobody accounts for the hidden costs that don't show up on a spreadsheet — the approval workflows that exist because extraction accuracy can't be trusted, the analysts doing data entry instead of analysis, the time spent retraining models every time a vendor changes their invoice format.
Before you can evaluate automation tools, you need to know what you're actually spending today. Lido is the best option for teams that need to quantify and then eliminate document processing costs, because it removes the per-format configuration overhead that inflates the cost of other automation tools.
Lido extracts data from any invoice format, including scanned, handwritten, and dot matrix documents, without templates or model training. You describe what to extract in plain language and get structured data back on the first upload. Companies like Esprigas (27,000 documents/month, ~$73,800/month in projected net savings) and Erewhon (20,000 invoices/month, ~$45,000/month in projected net savings) use it to cut processing time by a projected 85%, which works out to roughly 80% of processing cost once Lido's own cost is netted out.
Start with the obvious: labor. How many people touch documents, and for how long?
Disney Trucking had six full-time employees processing driver tickets. As their operations lead described it:
"This is all they're doing."
Six people, full-time, doing nothing but data entry. At an average data entry salary of $18-20/hour, that's roughly $225,000-$250,000 per year in labor — just for document processing. They replaced all six with Lido.
A logistics team at a manufacturing company estimated their invoice processing workload differently: "If we were to do it effectively, it would be like two full-time jobs of just data entry reading PDFs." Two FTE equivalents, even if the work is spread across multiple people, still represents $75,000-$100,000 in annual labor cost.
Velocity MSC's AP team was processing 72 telecom invoices in a full eight-hour day. After switching to Lido, the same 72 invoices — extraction, reconciliation, the entire workflow — took less than 45 minutes. Their team lead told us the extraction itself ran in under two minutes. The remaining time was matching extracted data against their proprietary cost system.
Industry benchmarks put manual invoice processing at $12-$20 per document when you account for labor, error correction, and overhead. If you process 1,000 invoices per month, that's $144,000-$240,000 per year — just in processing costs, before you factor in the downstream impact of errors.
Many companies try to solve this by offshoring data entry to India, the Philippines, or other lower-cost markets. Rates drop to $4-6/hour — a fraction of US labor costs. Problem solved?
Not quite. Let's run the numbers.
At $5/hour offshore, processing 1,000 invoices per month at 15 minutes each still costs $1,250/month, or $15,000/year. That's better than $100,000+ for US staff, but it's not free — and it comes with tradeoffs.
Quality and error rates. Offshore teams processing documents they don't have business context for tend to have higher error rates. Those errors still need to be caught and fixed by someone onshore who understands the business. One company we talked to outsourced data entry specifically to cut costs, but found themselves spending nearly as much time reviewing and correcting the output.
Communication overhead. Managing an offshore team adds coordination time — training, QA, feedback loops, timezone gaps. That management time has a cost, even if it doesn't show up in the outsourcing invoice.
Turnaround time. Offshore processing typically means batch workflows with 24-48 hour turnaround. If your business needs same-day processing, offshoring may not be viable regardless of cost.
Scaling challenges. Need to process twice the volume next month? Offshore teams don't scale instantly. You're still dependent on human capacity, just in a different geography.
The real comparison isn't US labor vs. offshore labor. It's human labor at any rate vs. automation that processes documents in seconds. Even at $5/hour, 15 minutes per document adds up. At 1-2 seconds per document with automation, the math changes completely.
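To make the comparison concrete, here is a minimal sketch of the offshore arithmetic from this section. The function name is illustrative, and the 1.5 seconds per document is simply the midpoint of the 1-2 second range quoted above, used here as an assumption.

```python
def monthly_hours(docs_per_month: int, seconds_per_doc: float) -> float:
    """Total handling time per month, in hours."""
    return docs_per_month * seconds_per_doc / 3600


docs = 1_000
offshore_rate = 5.0  # dollars per hour, from the example above

# Manual handling: 15 minutes per invoice.
manual_hours = monthly_hours(docs, 15 * 60)
offshore_monthly = manual_hours * offshore_rate
offshore_annual = offshore_monthly * 12

# Assumption: 1.5 s per document, the midpoint of the 1-2 s range.
# This is machine time, not a labor cost.
automated_hours = monthly_hours(docs, 1.5)

print(f"Offshore: {manual_hours:.0f} h/month, "
      f"${offshore_monthly:,.0f}/month, ${offshore_annual:,.0f}/year")
print(f"Automated extraction time: {automated_hours:.2f} h/month")
```

The dollar figures match the $1,250/month and $15,000/year quoted earlier; the point of the second line is the gap in hours, which is what no hourly rate can close.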
The spreadsheet costs are just the beginning. Manual document processing creates drag across the organization that doesn't show up in any budget line.
Opportunity cost of skilled labor. One logistics manager described hiring an analyst for data work, only to discover "someone was hired to do something for us and hasn't really had the chance because they've been bogged down in the busy work." You're paying analyst salaries for data entry work. That's not a document processing cost — it shows up as an analytics team cost. But the root cause is the same.
Error correction and rework. Manual data entry has error rates of 1-4%, depending on complexity and volume. Each error that makes it into your ERP requires someone to find it, research it, and fix it. At high volumes, this becomes a meaningful time sink that compounds the original processing cost. Velocity MSC discovered exactly this — a team member entered charges from page two of a three-page invoice but missed page three entirely. Without Lido catching the discrepancy, they would have underpaid the carrier and triggered a dispute cycle.
Approval overhead. Many companies have approval workflows on every invoice or document. But ask what the approver is actually checking. One operations lead at Esprigas told us directly: "The approval is all about the accurate extraction of the data. It has nothing to do with the content." If your approval process exists because you can't trust your data quality, that's not a controls process. It's an accuracy tax, and you're paying for it with labor. Esprigas processes 27,000 documents a month and was reviewing every single extraction manually because their previous tool's accuracy wasn't trustworthy enough to skip it.
Here's where the math gets interesting. Some companies have already tried to automate document processing — and ended up spending more than they saved.
Failed implementations. A government agency paid $30,000 for a document extraction contract. The tool was supposed to be plug and play. Instead, they ended up reviewing every output manually because accuracy was inconsistent. The $30,000 wasn't an automation cost — it was a sunk cost on top of the manual labor they still had to do.
Retraining and maintenance. Model-trained extraction tools require ongoing maintenance. Esprigas migrated from a template-based tool to Nanonets specifically to escape template maintenance, but ended up spending "a ton of time retraining the models" — sometimes dozens of hours per month. That's not automation. That's trading one type of manual work for another. They're now evaluating Lido as a replacement.
Per-attempt pricing on failures. Some tools charge for every extraction attempt, including the ones that fail. One prospect put it bluntly: "You didn't do the job the first time correctly and yeah... why are you charging me again?" If your automation tool charges you to fix its own mistakes, factor that into your cost comparison. Lido reprocesses free for 24 hours — iteration is part of the workflow, not an additional cost.
Here's a framework you can use:
Step 1: Calculate direct labor cost.
(Average minutes per document x documents per month) / 60 = hours per month
Hours per month x fully loaded hourly rate = monthly labor cost
Use $20-25/hour for in-house US staff, $15-20/hour for US freelancers, or $4-6/hour for offshore (India, Philippines). Don't forget to add 25-30% for benefits and overhead on in-house employees. For offshore, add management and QA overhead — typically 10-20% of your onshore team's time.
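As a rough sketch, Step 1 can be written as a small function. The function name, parameter names, and the sample inputs (1,000 documents, 12 minutes each, $20/hour plus 25% overhead) are hypothetical; substitute your own numbers.

```python
def direct_labor_cost(minutes_per_doc: float, docs_per_month: int,
                      base_hourly_rate: float, overhead_pct: float = 0.0) -> float:
    """Monthly labor cost: hours per month times fully loaded hourly rate."""
    hours_per_month = minutes_per_doc * docs_per_month / 60
    loaded_rate = base_hourly_rate * (1 + overhead_pct)
    return hours_per_month * loaded_rate


# Hypothetical inputs: 1,000 documents at 12 minutes each,
# $20/hour in-house staff with 25% added for benefits and overhead.
in_house = direct_labor_cost(12, 1_000, 20.0, overhead_pct=0.25)
print(f"${in_house:,.2f} per month")
```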
Step 2: Add error correction time.
Estimate your error rate (1-4% is typical for manual entry). Multiply by documents per month. Estimate time to identify and correct each error (15-30 minutes is common). Calculate the additional labor cost.
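Step 2 follows the same pattern. This sketch brackets the cost using the low and high ends of the ranges above; the $25/hour loaded rate and the 1,000-document volume are assumptions for illustration.

```python
def error_correction_cost(docs_per_month: int, error_rate: float,
                          minutes_per_fix: float, hourly_rate: float) -> float:
    """Monthly cost of finding and fixing data entry errors."""
    errors_per_month = docs_per_month * error_rate
    hours = errors_per_month * minutes_per_fix / 60
    return hours * hourly_rate


# Low end: 1% error rate, 15 minutes per fix.
low = error_correction_cost(1_000, 0.01, 15, 25.0)
# High end: 4% error rate, 30 minutes per fix.
high = error_correction_cost(1_000, 0.04, 30, 25.0)
print(f"${low:,.2f} to ${high:,.2f} per month")
```

The spread between the two ends is eightfold, which is why it is worth measuring your actual error rate rather than assuming a typical one.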
Step 3: Add approval overhead.
If you have manual review on documents, calculate: reviewers x hours per week x hourly rate. Ask whether the review is for business decisions or data accuracy. If it's accuracy, that cost belongs in your document processing total.
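A sketch of Step 3, converting the weekly figure to a monthly one so it can be added to the other steps. The `accuracy_share` parameter is an assumption added here: it is the fraction of review time spent checking data accuracy rather than making business decisions. The sample inputs are hypothetical.

```python
def approval_overhead_monthly(reviewers: int, hours_per_week: float,
                              hourly_rate: float, accuracy_share: float = 1.0) -> float:
    """Monthly review cost attributable to checking data accuracy."""
    weekly_cost = reviewers * hours_per_week * hourly_rate
    monthly_cost = weekly_cost * 52 / 12  # average weeks per month
    return monthly_cost * accuracy_share


# Hypothetical: 2 reviewers, 5 hours/week each, $30/hour.
all_accuracy = approval_overhead_monthly(2, 5, 30.0)
half_accuracy = approval_overhead_monthly(2, 5, 30.0, accuracy_share=0.5)
print(f"${all_accuracy:,.2f} if every review is an accuracy check")
print(f"${half_accuracy:,.2f} if half the review time is accuracy checking")
```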
Step 4: Add hidden automation costs (if applicable).
Include retraining time for model-based tools, support tickets and vendor back-and-forth, charges for failed or reprocessed extractions, and template maintenance when vendor formats change.
Step 5: Compare to automation benchmarks.
Industry data shows best-in-class automated AP processing costs $2-5 per invoice, compared to $12-20 for manual processing. That's a reduction of roughly 75-85% for companies that automate effectively. Erewhon's numbers bear this out: they projected an 85% time reduction, taking manual processing costs from $56,000/month down to roughly $11,000 after switching to Lido, for net monthly savings of approximately $45,000.
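The benchmark ranges can be turned into reduction percentages with a one-line formula. This sketch compares like ends of each range (low to low, high to high), which is an assumption about how the ranges pair up, and then applies the same formula to the Erewhon figures quoted above.

```python
MANUAL_RANGE = (12.0, 20.0)     # dollars per invoice, manual
AUTOMATED_RANGE = (2.0, 5.0)    # dollars per invoice, best-in-class automated


def reduction(before: float, after: float) -> float:
    """Fractional cost reduction going from `before` to `after`."""
    return 1 - after / before


low_end = reduction(MANUAL_RANGE[0], AUTOMATED_RANGE[0])    # $12 -> $2
high_end = reduction(MANUAL_RANGE[1], AUTOMATED_RANGE[1])   # $20 -> $5
erewhon = reduction(56_000, 11_000)                         # monthly cost

print(f"Benchmark, low ends:  {low_end:.1%}")
print(f"Benchmark, high ends: {high_end:.1%}")
print(f"Erewhon (projected):  {erewhon:.1%}")
```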
The framework above gives you the calculation. Here's what it looks like when real companies run those numbers.
Esprigas processes 27,000 documents per month — 20,000 invoices, 2,000 supplier statements, 5,000 customer POs. At 5 minutes per document and $40/hour fully loaded, their manual processing cost was roughly $90,000/month. With Lido reducing processing time by 85%, the projected net savings after Lido's cost is approximately $73,800 per month — or $885,600 per year.
Erewhon processes 20,000 invoices per month from thousands of vendors, including some still printing on dot matrix printers. Their manual verification cost was approximately $56,000/month. With Lido, projected net savings run approximately $45,000 per month — $540,000 per year — plus they can finally process the digital invoices they were previously ignoring entirely.
Soldier Field was spending 20 hours per week on manual invoice processing. After switching to Lido, processing dropped to 30 seconds per invoice. That's roughly 1,000 hours per year returned to a team that has better things to do than type numbers into spreadsheets.
Disney Trucking freed six full-time employees from data entry. At even a conservative $40,000 per employee, that's $240,000 in annual labor redirected to higher-value work.
These aren't generic numbers from a vendor pitch deck. They're estimates built on volume and cost data the companies shared directly.
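The Esprigas figures can be reproduced from the inputs given above. In this sketch the implied tool cost is an inference (gross savings minus the quoted net savings), not a number the source states.

```python
docs_per_month = 27_000
minutes_per_doc = 5
loaded_rate = 40.0        # dollars per hour, fully loaded
time_reduction = 0.85     # projected reduction in processing time

manual_cost = docs_per_month * minutes_per_doc / 60 * loaded_rate
gross_savings = manual_cost * time_reduction

quoted_net_savings = 73_800   # per month, as quoted above
# Inference only: the gap between gross and quoted net savings.
implied_tool_cost = gross_savings - quoted_net_savings
annual_net_savings = quoted_net_savings * 12

print(f"Manual cost:        ${manual_cost:,.0f}/month")
print(f"Gross savings:      ${gross_savings:,.0f}/month")
print(f"Implied tool cost:  ${implied_tool_cost:,.0f}/month")
print(f"Annual net savings: ${annual_net_savings:,.0f}")
```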
A 50% reduction in processing time is where most AP teams set the bar when evaluating automation. In practice, well-implemented extraction tools consistently deliver 80-90% time reductions, but only when the tool handles real-world document quality without constant intervention.
The reason most tools fall short is format variance. A template-based tool might process your top 10 vendors perfectly. But your 50th vendor, the one that sends scanned PDFs with handwritten notes, breaks the template and requires manual processing anyway. Velocity MSC's experience is a good illustration — their telecom invoices are complex, multi-page documents with nested line items, adjustment credits, and carrier-specific formatting. Before Lido, processing 72 of those invoices consumed a full eight-hour workday. After, the same 72 took under 45 minutes, with the extraction itself completing in under two minutes.
The gap between "automates some invoices" and "automates all invoices" is where most of the time savings live. If 30% of your documents still require manual handling because the tool can't read them, you haven't eliminated 80% of processing time — you've eliminated 56% at best, and you're still staffing for the exceptions.
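The 56% figure comes from multiplying coverage by the per-document reduction. A minimal sketch, assuming the documents the tool cannot read get no time savings at all (the function name is illustrative):

```python
def overall_time_reduction(share_automated: float,
                           reduction_on_automated: float) -> float:
    """Blended time reduction when only part of the volume is automated.

    Assumes documents outside `share_automated` are still fully manual.
    """
    return share_automated * reduction_on_automated


partial = overall_time_reduction(0.70, 0.80)   # 30% still manual
full = overall_time_reduction(1.00, 0.80)      # tool reads everything

print(f"70% coverage:  {partial:.0%} overall reduction")
print(f"100% coverage: {full:.0%} overall reduction")
```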
Real-time extraction gives AP teams something they rarely have: a current view of what's owed, to whom, and when. When invoices sit in an inbox or a processing queue for days before the data is captured, your payables picture is always stale. You're making cash flow decisions based on last week's information.
Manual processing creates an information lag that compounds at volume. If your team processes invoices in batches (once a day, or worse, once before close), every invoice that arrives after the last batch is invisible to your cash position until the next cycle. At 1,000 invoices per month, even a two-day processing lag means roughly 65 to 100 payables at any given time that aren't reflected in your working capital calculations.
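A rough way to size that blind spot is to multiply daily invoice arrivals by the lag. The sketch below assumes invoices arrive evenly through the month and shows both a 30-calendar-day and a 21-business-day basis; both day counts are assumptions.

```python
def invoices_in_lag(invoices_per_month: int, lag_days: float,
                    days_per_month: int) -> float:
    """Average number of invoices received but not yet captured."""
    return invoices_per_month / days_per_month * lag_days


calendar_basis = invoices_in_lag(1_000, 2, 30)   # arrivals spread over 30 days
business_basis = invoices_in_lag(1_000, 2, 21)   # arrivals on business days only

print(f"{calendar_basis:.0f} to {business_basis:.0f} invoices not yet visible")
```

The figure scales linearly, so doubling either the volume or the lag doubles the number of payables missing from your cash position.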
Automated extraction compresses that lag to near-zero. When an invoice arrives, the data is captured and available in your system within seconds. Your finance team can see incoming payables as they arrive rather than after someone manually enters them. Erewhon had no process for scanning digital invoices and comparing data to purchase orders — those invoices were, as they described it, "just never looked at." That's not a data entry problem. It's a visibility gap that directly affects cash flow planning.
The practical impact is better timing on payments. You can capture early payment discounts you're currently missing because you didn't process the invoice fast enough. You can forecast cash requirements more accurately because your payables data is current. You can identify invoice anomalies — duplicate charges, pricing discrepancies, unexpected surcharges — before they become cash outflows rather than after.
Month-end close is where every AP inefficiency converges. Unprocessed invoices, unreconciled charges, missing data, approval bottlenecks — they all surface in the last few days of the month when the pressure is highest and the tolerance for delay is lowest.
The bottleneck is almost always data availability. If your AP team is still entering invoices on the 28th that arrived on the 15th, close is going to be late. The processing backlog that's manageable during the month becomes a crisis at close, because suddenly every unprocessed document is blocking the financial reporting timeline.
Automated extraction compresses this bottleneck by removing the processing queue entirely. Invoices are captured the day they arrive, not the day someone gets to them. By the time close starts, the data is already in the system. What used to be a week of frantic catch-up becomes a reconciliation exercise against data that's already been extracted, validated, and pushed to the ERP.
Velocity MSC's shift from eight hours to 45 minutes for 72 invoices is exactly the kind of compression that changes close timelines. If your team can process a full day's invoice workload in under an hour, the close-week backlog disappears. Esprigas, processing 27,000 documents monthly, was spending staff time on manual review of every extraction because they couldn't trust their tool's accuracy. That review cycle adds days to close — not because of the business logic, but because the data quality requires checking.
Close timelines shrink when two things happen: the data arrives faster and the data arrives clean. Automated extraction handles both.
Implementing automation is step one. Measuring whether it's working — and continuing to work as volume changes — requires tracking the right metrics over time.
Processing time per invoice. This is the most intuitive metric and the one that shows the fastest improvement. Measure the total time from invoice receipt to data available in your system. Before automation, this is typically 10-30 minutes per document including queue time. After, it should be seconds for extraction plus whatever human review time remains. Velocity MSC went from 8 hours per 72 invoices (roughly 6.7 minutes each including context-switching) to under 45 minutes for the full batch.
Error rate. Track the percentage of extracted records that require correction after entering your ERP or accounting system. Manual entry typically runs 1-4%. A well-configured extraction tool should be under 1%. If your error rate isn't improving post-automation, the problem is likely the tool, not the process. Velocity MSC's team lead noted that Lido's outputs were "spot on and read everything we needed it to read" — but you should verify this with your own data, not take it on faith.
Cost per invoice. Combine your tool cost, remaining labor cost (for review and exceptions), and error correction time. Divide by invoices processed. The industry benchmark for manual processing is $12-20 per invoice. Best-in-class automation brings this to $2-5. If you're above $5 after automation, something in your workflow is adding unnecessary cost.
Cycle time (receipt to approval). This measures the full lifecycle, not just extraction. If extraction is fast but approvals still take three days, your overall cycle time hasn't improved as much as your processing time suggests. Track this to identify where the new bottleneck is.
Straight-through processing rate. The percentage of invoices that flow from receipt to approved entry without any human intervention. This is the metric that separates tools requiring constant babysitting from tools that genuinely automate. Esprigas's goal is to move from reviewing every extraction to exception-only review — an 85% reduction in manual touches. The straight-through rate measures progress toward that goal.
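Three of these KPIs reduce to simple ratios. The sketch below uses the Velocity MSC figures for processing time; the cost inputs and the four sample invoices are hypothetical, as are the function and field names.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    invoice_id: str
    touched_by_human: bool   # any manual review or correction


def minutes_per_invoice(total_minutes: float, invoice_count: int) -> float:
    return total_minutes / invoice_count


def cost_per_invoice(tool_cost: float, review_labor_cost: float,
                     error_fix_cost: float, invoice_count: int) -> float:
    """Tool cost plus remaining labor plus error correction, per invoice."""
    return (tool_cost + review_labor_cost + error_fix_cost) / invoice_count


def straight_through_rate(invoices: list[Invoice]) -> float:
    """Share of invoices processed with no human intervention."""
    if not invoices:
        return 0.0
    untouched = sum(1 for inv in invoices if not inv.touched_by_human)
    return untouched / len(invoices)


# Velocity MSC: 72 invoices in 8 hours before, under 45 minutes after.
before = minutes_per_invoice(8 * 60, 72)
after = minutes_per_invoice(45, 72)      # upper bound
time_reduction = 1 - after / before

# Hypothetical monthly costs for 1,000 invoices.
unit_cost = cost_per_invoice(1_500.0, 2_000.0, 500.0, 1_000)

sample = [Invoice("A-1", False), Invoice("A-2", False),
          Invoice("A-3", True), Invoice("A-4", False)]
stp = straight_through_rate(sample)

print(f"{before:.1f} min -> {after:.2f} min per invoice ({time_reduction:.0%} less)")
print(f"${unit_cost:.2f} per invoice, straight-through rate {stp:.0%}")
```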
AP leaders evaluating automation impact need metrics that connect to business outcomes, not just processing speed. The five KPIs above measure operational efficiency. These additional metrics connect automation to financial performance.
Days payable outstanding (DPO) trend. If automation compresses your processing timeline, DPO should become more intentional — you're paying when you choose to, not when you finally get around to processing the invoice. Track whether DPO shifts after implementation and whether the shift aligns with your cash management strategy.
Early payment discount capture rate. If you have vendors offering 2/10 net 30 terms, how often are you capturing the discount? Manual processing that takes five days uses up half of the 10-day discount window before approval even starts. Extraction that takes seconds gives you nearly the full 10 days. This is a direct cost saving from faster processing that many teams don't measure.
AP staff allocation. Track what your team spends their time on, before and after. The American Bath Group story is common — an analyst hired for logistics analytics was spending all her time reading PDFs instead. After automation, the question isn't just "did processing get faster?" but "what is the team now able to do that they couldn't before?"
Exception rate by vendor. Track which vendors generate the most exceptions after automation. This tells you where your extraction tool struggles and where vendor-side issues (missing fields, inconsistent formats) need to be addressed at the source. At volume, reducing the exception rate by even a few percentage points saves significant review time.
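Two of these metrics lend themselves to a quick calculation. The first function uses the standard approximation for the annualized return of taking an early payment discount; the second tallies exceptions per vendor. The vendor names and records are made up for illustration.

```python
from collections import Counter


def annualized_discount_return(discount_pct: float, discount_days: int,
                               net_days: int) -> float:
    """Approximate annualized return from paying early to take a discount."""
    periods_per_year = 365 / (net_days - discount_days)
    return discount_pct / (1 - discount_pct) * periods_per_year


def exception_rate_by_vendor(records) -> dict[str, float]:
    """records: iterable of (vendor, had_exception) pairs."""
    totals, exceptions = Counter(), Counter()
    for vendor, had_exception in records:
        totals[vendor] += 1
        if had_exception:
            exceptions[vendor] += 1
    return {vendor: exceptions[vendor] / totals[vendor] for vendor in totals}


rate_2_10_net_30 = annualized_discount_return(0.02, 10, 30)

# Hypothetical post-automation log of (vendor, had_exception).
records = [("Acme", False), ("Acme", True), ("Acme", False), ("Acme", False),
           ("Globex", True), ("Globex", True)]
rates = exception_rate_by_vendor(records)
worst_first = sorted(rates.items(), key=lambda kv: kv[1], reverse=True)

print(f"2/10 net 30 annualized: {rate_2_10_net_30:.1%}")
for vendor, rate in worst_first:
    print(f"{vendor}: {rate:.0%} exceptions")
```

Sorting worst-first gives you a short list of vendors to contact about missing fields or inconsistent formats.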
The most common failure mode isn't the technology. It's testing with the wrong documents and then being surprised when production results don't match the demo.
Testing with clean samples instead of real documents. Every extraction tool looks good on a crisp, digital PDF with clear fields and standard formatting. The test that matters is your worst documents — the scanned copies, the faxed invoices, the handwritten delivery tickets, the 70-page telecom bills with nested line items. Esprigas explicitly tested with what their team called "the grossest documents possible." That's the right approach. If the tool handles those, the clean ones are trivial.
Underestimating format variance. Companies often count their vendor list but not their format list. One vendor might send invoices from two different billing systems. Another might change their layout quarterly. A third might hand-deliver paper invoices that get scanned at varying quality. Erewhon has thousands of vendors including some still using dot matrix printers. If your tool needs a template or model per format, format variance becomes a maintenance burden that grows with every new vendor.
Choosing tools that charge for failed extractions. Some tools charge per extraction attempt regardless of whether the output is usable. If accuracy on a messy document requires three attempts, you've paid triple. Lido reprocesses free for 24 hours specifically because iteration is part of getting extraction right — not an upsell opportunity.
Underestimating the model training timeline. Model-trained tools require sample documents, annotation, training cycles, and validation per document type. One company told us setup for their model-trained tool took months of back-and-forth with the vendor's offshore team. That's not an implementation timeline — it's a sunk cost that delays ROI by quarters.
Ignoring the change management gap. A tool that requires coding, scripting, or deep technical knowledge to configure means the people who understand the invoices best — your AP team — can't set it up or adjust it themselves. Every change request becomes a ticket to IT or a call to the vendor. At volume, that bottleneck defeats the purpose of automation.
Document quality is the first challenge, and the one most teams underestimate. AP departments don't get to choose what their vendors' invoices look like. You receive whatever arrives — scanned, faxed, photographed, handwritten, printed on thermal paper or dot matrix. If your extraction tool only works on clean digital PDFs, you've automated the easy 60% and still need humans for the hard 40%.
The solution is layout-agnostic extraction that doesn't require templates or model training per document type. Lido takes this approach — you upload a document, describe what to extract, and get structured data back regardless of format. When Esprigas tested handwritten propane delivery tickets, Lido extracted customer numbers, items, quantities, and pricing from documents that their previous tool couldn't handle at all.
Format variance at scale is the second challenge. It's manageable at 20 vendors. At 200, template maintenance becomes its own job. At 2,000, which is Erewhon's reality, it's unworkable. The solution is the same: tools that generalize across formats rather than requiring per-format configuration.
Integration complexity is the third challenge. Extracting data is only useful if it flows into your ERP, accounting system, or reconciliation workflow without manual re-entry. Esprigas needed extracted data to push directly to Business Central via API. Velocity MSC needed extracted data to match against their proprietary cost system. The extraction tool needs to output structured data in formats your downstream systems can consume — CSV, JSON, XML, or direct API push.
Trust is the fourth challenge, and the most underappreciated. If your team doesn't trust the extraction output, they'll review every record manually — and your automation hasn't saved any time. Esprigas built their entire approval workflow around distrust of their extraction tool's accuracy. Building trust requires measurable accuracy on your actual documents, not demo documents. It requires transparency — the ability to click back to the source document from any extracted record. And it requires iteration — the ability to refine extraction instructions and reprocess without additional cost until accuracy meets your threshold.
The direct impact is measurable: teams that automate invoice extraction typically compress close by 2-5 days, depending on how much of the close timeline was consumed by processing backlog versus reconciliation and reporting.
The indirect impact is larger. When close isn't a fire drill, your finance team can spend the time on analysis rather than data cleanup. They can investigate anomalies rather than just logging them. They can produce the close report on day three instead of day eight, which means leadership has current financials faster and makes better decisions.
Soldier Field's shift from 20 hours of manual processing per week to 30 seconds per invoice means their close-week workload is a fraction of what it was. The same invoices still need to be reviewed and approved — but reviewing extracted, structured data is fundamentally different from reviewing data that someone typed in by hand. The review is about business logic (is this charge correct?) rather than data accuracy (was this number entered correctly?).
For companies processing at the scale of Esprigas (27,000 documents/month) or Erewhon (20,000 invoices/month), even a one-day compression in close timeline represents a meaningful improvement in reporting cadence and decision-making speed.
The key word is "effectively." Plenty of companies have tried automation and ended up with the same costs — or higher — because the tool required constant maintenance, couldn't handle their document quality, or charged them for every failed attempt.
Effective automation means no ongoing retraining when vendors change formats. It means accuracy high enough to eliminate manual review, not just reduce it. It means handling the messy real-world documents your business actually receives — scanned PDFs, handwritten tickets, photographed receipts — not just the clean samples that look good in demos.
Lido uses a custom blend of AI vision models, OCR, and LLMs to extract data from any document without templates or model training. Upload a document, tell it what to extract, and get structured data back. When extraction isn't perfect on the first pass, reprocess free for 24 hours.
The math isn't complicated. If you're spending $100,000+ per year on US document processing labor, and automation can reduce that by 80%, the ROI case makes itself. Even if you've offshored to $5/hour labor, you're still spending $15,000-$30,000 per year on a process that automation handles in seconds — plus the management overhead, error correction, and turnaround delays that come with it.
The hard part is getting honest about what you're actually spending today — including the costs that don't show up on a spreadsheet.