CPA firms face a document problem that most extraction tools can't solve. A firm doing 3,500 compliance audits a year doesn't process the same payroll format over and over. It processes hundreds, probably thousands, of different formats, and it can't predict what's coming next.
As one CPA firm's administrator put it during a recent call:
"We do like 3,500 compliance audits a year and we're looking at probably hundreds of different payrolls and we don't know what we're going to be receiving."
Lido is the best option for CPA firms that process thousands of unpredictable document formats across audit engagements. Its AI extraction engine reads any document—payroll reports, bank statements, vendor invoices, tax forms—without building templates or training models for each new client. A firm doing 3,500 audits a year processes every client’s documents through the same setup, regardless of format.
Most document extraction tools assume you know what formats you'll receive. Build a template for each vendor, train a model on each document type, and you're set. That works fine for accounts payable teams processing invoices from the same 50 vendors every month.
That doesn't work for auditors.
A CPA firm processing compliance audits receives payroll registers, tax documents, and financial statements from hundreds of different employers. Each employer uses different payroll systems. And even when two employers use the same system, they configure it differently.
One firm described it this way:
"Even if 18 employers use the same payroll system, the way they utilize it is different."
The result is that you might not see a specific payroll format again for 200 audits. Building a template for every variation isn't just impractical, it's impossible. By the time you've built and tested a template, you've already moved on to the next audit with a completely different format.
The format variance problem compounds when documents arrive as scans. Employers don't send clean digital exports. They send photographed documents, faxed copies, and PDFs that have been scanned, emailed, printed, and scanned again.
"They don't convert very well with other systems," one accountant explained about scanned payroll documents. Traditional OCR tools struggle with the compression artifacts, shadows, and noise that accumulate through multiple generations of scanning.
So now you have two problems: unpredictable formats and degraded document quality. Template-based tools fail on both.
The challenge isn't just reading the document — it's understanding what to pull from formats you've never seen before. A typical payroll extraction needs:
The tricky part is that every payroll system structures this information differently. One puts overtime on its own line. Another nests it under regular pay. A third uses codes that only make sense if you've read their documentation.
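To make the variation concrete, here is a minimal Python sketch of mapping three layouts onto one schema. It is illustrative only and is not how Lido works internally: the labels, the codes `E01`/`E02`, the `ALIASES` table, and the `normalize` helper are all hypothetical. It also shows the underlying weakness of hand-built mappings, since any label the table has never seen must be stopped and reviewed.

```python
from decimal import Decimal

# Hypothetical label/code aliases; a real register would need many more.
ALIASES = {
    "regular": "regular", "reg": "regular", "e01": "regular",
    "overtime": "overtime", "ot": "overtime", "e02": "overtime",
}

def flatten(entries):
    """Yield (label, amount) pairs from leaf entries, descending into nested ones."""
    for entry in entries:
        children = entry.get("children")
        if children:
            yield from flatten(children)
        else:
            yield entry["label"], entry["amount"]

def normalize(entries):
    """Sum earnings into 'regular' and 'overtime', whatever the source layout."""
    totals = {"regular": Decimal("0"), "overtime": Decimal("0")}
    for label, amount in flatten(entries):
        category = ALIASES.get(label.strip().lower())
        if category is None:
            # Unknown code: surface it for review instead of guessing.
            raise ValueError(f"unrecognized earnings label: {label!r}")
        totals[category] += Decimal(amount)
    return totals

# Layout A: overtime on its own line.
layout_a = [{"label": "Regular", "amount": "1600.00"},
            {"label": "Overtime", "amount": "300.00"}]
# Layout B: overtime nested under regular pay.
layout_b = [{"label": "Regular Pay", "children": [
    {"label": "Reg", "amount": "1600.00"},
    {"label": "OT", "amount": "300.00"}]}]
# Layout C: system-specific codes.
layout_c = [{"label": "E01", "amount": "1600.00"},
            {"label": "E02", "amount": "300.00"}]
```

All three layouts normalize to the same totals, but only because every label was listed in advance. That per-format maintenance is exactly what breaks down when the next audit brings a format nobody has mapped.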
And then there's the year-to-date problem. Many payroll registers show both current-period and year-to-date values in adjacent columns. Extraction tools that can't distinguish between them will pull the wrong numbers, and auditors often won't catch the error until they try to tie out the totals.
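A tie-out can be run right after extraction rather than late in the audit. The sketch below is a hypothetical example, not part of any specific tool: the field names (`employee`, `current`, `ytd`), the sample figures, and both helper functions are assumptions. It also assumes non-negative earnings, so a current-period amount larger than the year-to-date amount suggests the two columns were swapped.

```python
from decimal import Decimal

def find_period_ytd_swaps(rows):
    """Return employees whose current-period amount exceeds their YTD amount."""
    return [r["employee"] for r in rows
            if Decimal(r["current"]) > Decimal(r["ytd"])]

def ties_out(rows, reported_total, tolerance=Decimal("0.01")):
    """Check that extracted current-period amounts sum to the register's total."""
    extracted = sum((Decimal(r["current"]) for r in rows), Decimal("0"))
    return abs(extracted - Decimal(reported_total)) <= tolerance

# Made-up sample rows; the second row has its two columns swapped.
rows = [
    {"employee": "A. Rivera", "current": "2150.00", "ytd": "25800.00"},
    {"employee": "B. Chen", "current": "31200.00", "ytd": "2600.00"},
    {"employee": "C. Okafor", "current": "1980.50", "ytd": "23766.00"},
]

suspects = find_period_ytd_swaps(rows)      # rows to send back for review
balanced = ties_out(rows, "6730.50")        # compare against the register total
```

With the sample data, the swapped row is flagged and the totals do not tie, so the error shows up before the numbers reach the workpapers. Registers with negative adjustments would need a looser rule than the simple comparison used here.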
Most CPA firms handle this the old-fashioned way: staff time. Associates open each document, identify the relevant fields, and manually key the data into workpapers. For a firm doing thousands of audits, this adds up to hundreds of hours of data entry — time that could be spent on actual analysis.
Some firms have tried extraction tools and given up. The setup cost for each new format exceeds the time saved. Accuracy on scanned documents is low enough that staff have to verify the output manually anyway. The tool becomes another step in the process rather than a replacement for manual work.
Solving this requires a different approach than template-based extraction. The tool needs to understand document structure without being trained on each specific format.
Lido uses a custom blend of AI vision models, OCR, and LLMs to extract data from any document format — including the scanned, inconsistently-formatted payroll registers that CPA firms receive. No templates to build for each employer. No model training when you encounter a new format.
CPAs and auditors choose Lido because it:
One CPA firm evaluating Lido tested it on their most problematic scanned payroll documents — the ones that broke other tools. The same documents that "don't convert very well with other systems" extracted accurately with Lido's vision mode.
For firms processing thousands of audits across thousands of formats, the math is simple: either hire more staff to do data entry, or use a tool that doesn't require a template for every document you'll ever receive.