In this article:

How to Convert a PDF to a Spreadsheet

Being able to convert PDF data to a spreadsheet is an important way to streamline business operations. 

This automation saves time and reduces manual data entry errors. It's particularly useful for businesses and individuals who receive data in PDF format but need to use it in a spreadsheet for analysis and operations. 

We'll cover three different ways Lido can extract data from a PDF to a spreadsheet.

First, create a blank Lido File

We will use Lido, a new spreadsheet built to automate repetitive tasks, to extract our tabular data from a PDF into a spreadsheet. There is a free tier with paid plans as well. 

You can create a new account here: https://www.lido.app/go/signup

From the Lido Files page, click New File

Method 1: Extract all tabular data in a PDF

This method will extract all data that Lido can identify as tabular data in a file. This can be multiple tables of different formats.

  • Blank rows will be removed
  • Blank columns are ok
  • Stand-alone cells of text that are outside of a table will be ignored

For example, all of the tables in the following PDF will be extracted into a Lido spreadsheet, but the two stand-alone cell of data will not be extracted:

Use the EXTRACTTABLESFROMPDF spreadsheet formula

Pick any cell (lets say A1) and start typing the EXTRACTTABLESFROMPDF formula. The formula format is:

=EXTRACTTABLESFROMPDF(<drive credential>,<file path to pdf>,<output cell>)

Step 1: Add a Google Drive credential

If this is your first time using Lido, you will likely need to add a credential to allow Lido access your Google Drive. 

In the first argument of the formula for EXTRACTTABLESFROMPDF, choose “Add Credential” and follow the prompts. 

Step 2:  Choose the PDF file 

For the 2nd argument of the formula, use the file picker to choose the PDF file that is on your drive. Make sure that the credential you used has access to the PDF file. 

Step 3: Specify an output cell

For the last argument in the formula, pick a cell in the spreadsheet (lets say A3) where you would like Lido to put the extracted data from your PDF. Make sure there is no data in your spreadsheet below A3 that might be overwritten by the new data.

So the full formula for the example we're using looks like: 

```=EXTRACTTABLESFROMPDF(<drive-credential>,"https://drive.google.com/file/d/1As7PgjeKApwIqv16GCBsV3lT3-XqTEKb/view?usp=drive_web",A3)

Step 4:  Run the EXTRACTTABLESFROMPDF formula

Choose “Run Action” from the cell menu for the cell that has EXTRACTTABLESFROMPDF in it. This will run the formula and extract your data. 

^^^

info

Action Formula

Action formulas are a special kind of formula in Lido. Unlike regular spreadsheet formulas which evaluate as soon as you type them, action formulas need to be "run" explicitly because they change external information such as sending emails, extracting data, and posting to APIs. Examples of action formulas include SENDGMAIL, EXTRACTTABLESFROMPDF, SENDSMS, and CALLURL.

^^^

You should now see your data from your PDF in your Lido spreadsheet table starting at the cell you specified as the output cell. 

Method 2: Import everything from the PDF

The second method of converting a PDF to a spreadsheet in Lido will take all data, tabular or not.

For example, all data in the following PDF will be extracted into a Lido spreadsheet:

Use the IMPORTPDF spreadsheet formula

Pick any cell (lets say A1) and start typing the IMPORTPDF formula. The formula format is:

=IMPORTPDF(<drive credential>,<file path to pdf>,<output cell>)

Step 1: Add credential to Google Drive

If this is your first time using Lido, you will likely need to add a credential to allow Lido access your Google Drive. 

In the first argument of the formula for IMPORTPDF, choose “Add Credential” and follow the prompts. 

Step 2:  Choose the PDF file 

For the 2nd argument of the formula, use the file picker to choose the PDF file that is on your drive. Make sure that the credential you used has access to the PDF file. 

Step 3: Specify an output cell

For the last argument in the formula, pick a cell in the spreadsheet (lets say A3) where you would like Lido to put the extracted data from your PDF. Make sure there is no data in your spreadsheet below A3 that might be overwritten by the new data.

So the full formula for the example we're using looks like: 

```=IMPORTPDF(<drive-credential>,"https://drive.google.com/file/d/1As7PgjeKApwIqv16GCBsV3lT3-XqTEKb/view?usp=drive_web",A3)

Step 4:  Run the IMPORTPDF formula

Choose “Run Action” from the cell menu for the cell that has IMPORTPDF in it. This will run the formula and extract your data. 

Method 3: Extract a specific range of cells you choose

The third method of converting a PDF to a spreadsheet in Lido is to visually select a range of cells to import as data into Lido.

Step 1: Click File > Import from PDF and select your PDF file

Click on the File menu, choose "Import from PDF" and select your PDF file.

Step 2: Select the rectangle of data you want to import

Use the visual rectangle tool to select the range of the spreadsheet that you want to import. Click on "Extract Data"

Step 3:  Confirm the data

Confirm that the data is correct and choose "Insert at Active Cell"

Your data will now be added to the Lido spreadsheet at the active cell.

Review

We've shown you how you can use three methods to convert a PDF to a spreadsheet.

  1. Extract just tabular data (one or more tables) using EXTRACTTABLESFROMPDF
  2. Extract all data from a PDF using IMPORTPDF
  3. Visualy select a specific range of data to convert using the "Import PDF" tool

Automate repetitive tasks with Lido

Save hours on repetitive and tedious work. Lido is a new spreadsheet that connects your spreadsheets, forms, PDFs, and email inbox.

Trigger
Action
Select trigger
When a cell value in Google Sheets changes
Check Google Sheets for today's date
When a new row is added to Google Sheets
On new Google Form submission
Send me a daily reminder
On new Typeform submission
When a cell value in database changes
Check database for today's date
When a new row is added in database
When a new HubSpot customer is created
Select action
ADDCALENDAREVENT
Creates a one- or multiple-day calendar event with optional attendees
ADDHUBSPOT
Adds an object to Hubspot
ADDSLACKCHANNEL
Create a Slack channel, and optionally add a topic or members
CALLURL
Makes any HTTP request
CREATEGOOGLEDOC
Replaces [@column_name] values in a Google Doc with the corresponding table row's values
CREATEPDF
Replaces [@column_name] values in a Google Doc with the corresponding table row's values, then export it as a PDF
FETCH
Makes any HTTP request and returns the response
INSERTROWS
Inserts given array below defined values in given worksheet
SENDGMAIL
Sends an email using your Google account
SENDOUTLOOK
Sends an email using your Microsoft Outlook account
SENDSLACK
Sends a Slack message
SENDSMS
Sends an SMS message using your Twilio account
UPDATECELL
Updates cells with given values
UPDATEHUBSPOT
Updates a property of a Hubspot object
Trigger is required
Action is required
Get started