Extract Data from PDF (Easiest Way in 2024)

In this article, we show you three ways to extract data from a PDF in your browser: a point-and-click import tool and two spreadsheet formulas. Follow the steps below.

How to Extract Data from PDF

We will use Lido, a tool designed to automate repetitive spreadsheet tasks. You can get set up by going to the following link: https://www.lido.app/go/signup.

Method 1: Using a PDF Extracting Tool

Step 1: Start with a blank Lido spreadsheet.

Open Lido and create a new spreadsheet. This blank spreadsheet is where the data you extract from the PDF will go, so you can organize and analyze it.

Step 2: From the File menu, click "Import from PDF".

In your Lido spreadsheet, navigate to the File menu and select the “Import from PDF” option. This activates the PDF Importer, a tool specifically designed to convert information from PDFs into a structured spreadsheet format.

Step 3: Upload the PDF document from which you wish to extract data.

Select and upload the PDF file from which you need to extract data. Ensure you have the correct file that contains the information you need.

Step 4: Adjust the selection to encompass the area of the data you wish to extract and click "Extract data".

Once your PDF is uploaded, use the selection tool to mark exactly which area of the PDF you want to convert, whether it contains text or tables. Select only the area that contains the data you need, then click "Extract data" to begin the extraction.

Step 5: Verify that the data has been accurately extracted from the PDF, then click "Insert at active cell".

Inspect the extracted data shown in the preview to ensure it matches what you intended to extract, whether it's text lines or tabular data.

If everything is good, click “Insert at active cell” to transfer this data into the spreadsheet.

If the area contains both tabular data and text, remember that the tool prioritizes tabular data, so text outside tables may be ignored. If you need to adjust your selection or extract more data, click “Back.”

Step 6: The extracted data is now in your Lido spreadsheet.

Review your Lido spreadsheet to confirm that the data appears as expected and is correctly formatted. Each piece of information should be in its respective cell, aligned as per the structure of the original PDF. You can now save your work or continue manipulating the data as needed.
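
For larger extractions, part of this review can be scripted. The sketch below is only an illustration and is not part of Lido: it assumes you have copied the extracted rows out of the spreadsheet as CSV text (a step this article does not cover), and the function name find_suspect_rows and the sample data are hypothetical. It flags rows whose number of filled cells differs from the header row, which is often a sign that a table row was split or merged during extraction.

```python
import csv
import io


def find_suspect_rows(rows):
    """Return 1-based numbers of rows whose filled-cell count differs from the header's."""
    if not rows:
        return []
    # Count the non-empty cells in the first row, treated here as the header.
    expected = sum(1 for cell in rows[0] if cell.strip())
    suspects = []
    for number, row in enumerate(rows[1:], start=2):
        filled = sum(1 for cell in row if cell.strip())
        if filled != expected:
            suspects.append(number)
    return suspects


# Hypothetical sample standing in for data copied out of the spreadsheet.
sample = "Item,Qty,Price\nWidget,3,4.50\nGadget,,\nSprocket,7,1.25\n"
rows = list(csv.reader(io.StringIO(sample)))
print(find_suspect_rows(rows))
```

A flagged row is not necessarily wrong, since the original PDF may contain genuinely blank cells; treat the result as a list of rows to compare against the PDF by eye.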

Method 2: Using the IMPORTPDF Formula

In this approach, we use Lido’s custom formula IMPORTPDF, which extracts all content from the provided PDF in one go. Note that IMPORTPDF is not effective with scanned PDF documents; if you're working with a scanned PDF, use Method 3 below, which relies on the EXTRACTTABLESFROMPDF formula instead.

Step 1: Log into your Google Drive and upload the PDF from which you wish to extract data.

Access your Google Drive account and upload the PDF file you want to work with. The PDF must be in Google Drive so that Lido’s IMPORTPDF formula can access it.

Step 2: Start with a blank Lido spreadsheet.

Open Lido and set up a new, blank spreadsheet. This serves as your working area where the PDF data will be imported.

Step 3: Create a new worksheet by clicking the plus button at the top left of the screen.

Add a new worksheet within your Lido spreadsheet by clicking the "+" icon. This is where you'll insert the IMPORTPDF formula and eventually view the extracted data.

Step 4: Type "=IMPORTPDF(" into cell A1.

Go to the newly created worksheet and type "=IMPORTPDF(" into cell A1. Leave the formula open for now; the following steps fill in its arguments.

Step 5: Click "Add Credential" and follow the instructions to connect the Google account where you uploaded the PDF file.

Click "Add Credential" and authorize Lido to access your Google Drive. Lido needs this authorization to read the PDF you uploaded.

Step 6: Type "," to go to the next argument and click "Select a file".

After adding your Google Drive credentials, continue the formula by typing a comma and then clicking "Select a file" to browse and choose the appropriate PDF from your Google Drive.

Step 7: Find the PDF file you uploaded to Google Drive and select it.

Navigate through your files to locate the previously uploaded PDF and select it. This links the PDF directly to your formula.

Step 8: Complete the formula with ", Sheet1!B2)" and press ENTER.

The final argument of the IMPORTPDF formula determines where the extracted data is placed. Here, Sheet1!B2 places the data in worksheet Sheet1, starting at cell B2.
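
To make the destination argument concrete, here is a minimal Python sketch of how a reference such as Sheet1!B2 maps extracted rows onto cells. It is not Lido's implementation. It assumes that rows fill to the right and downward from the starting cell, which is the usual spreadsheet convention, and the helper names (column_to_index, index_to_column, place_rows) are hypothetical.

```python
import re


def column_to_index(letters):
    """Convert a column label such as 'B' or 'AA' to a 1-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def index_to_column(index):
    """Convert a 1-based column index back to a label such as 'B' or 'AA'."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def place_rows(destination, rows):
    """Return the sheet name and a mapping of cell address to value,
    assuming rows are written rightward and downward from the anchor cell."""
    match = re.fullmatch(
        r"(?:(?P<sheet>[^!]+)!)?(?P<col>[A-Za-z]+)(?P<row>\d+)", destination
    )
    if match is None:
        raise ValueError(f"not a cell reference: {destination!r}")
    start_col = column_to_index(match.group("col"))
    start_row = int(match.group("row"))
    cells = {}
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            address = f"{index_to_column(start_col + c)}{start_row + r}"
            cells[address] = value
    return match.group("sheet"), cells


# Hypothetical two-row table standing in for extracted PDF content.
sheet, cells = place_rows("Sheet1!B2", [["Item", "Qty"], ["Widget", "3"]])
print(sheet, cells)
```

Under these assumptions, the first extracted value lands in B2, the next value in the same row lands in C2, and the second row starts at B3. Choosing a starting cell other than A1 leaves room for your own labels above and to the left of the imported data.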

Step 9: Right-click on cell A1 and select "Run action".

Once your formula is set, right-click on cell A1 where you entered the formula and select "Run action" from the menu. This will execute the formula and start the data extraction process.

Step 10: Navigate to worksheet "Sheet1" and make sure that the data is extracted correctly.

Switch to "Sheet1" to check the results. Ensure that the data from the PDF has been correctly extracted and is accurately displayed starting from cell B2 as intended.

Method 3: Using the EXTRACTTABLESFROMPDF Formula

In this method, we use Lido’s custom formula EXTRACTTABLESFROMPDF, which extracts everything in the PDF that it recognizes as a table. Unlike IMPORTPDF, this formula works on scanned documents.

Step 1: Log into your Google Drive and upload the PDF from which you wish to extract data.

Log into your Google Drive account and upload the PDF document that contains the tables you want to extract. This makes the file accessible to Lido through your connected account.

Step 2: Create a new Lido spreadsheet from the files page.

Go to Lido and create a new spreadsheet from the files page. The extracted table data will be stored in this spreadsheet.

Step 3: Create a new worksheet by clicking the plus button at the top left of the screen.

Add a fresh worksheet in your Lido spreadsheet by clicking on the plus icon located at the top left. This will be the workspace for entering your formula and viewing the extracted data.

Step 4: Type the formula "=EXTRACTTABLESFROMPDF(" into cell A1.

In the new worksheet, type the opening of the EXTRACTTABLESFROMPDF formula into cell A1. Leave it open for now; the following steps fill in its arguments.

Step 5: Click "Add Credential" and follow the instructions to connect the Google account where you uploaded the PDF file.

This step involves setting up a connection between Lido and your Google Drive by adding your credentials. Follow the on-screen instructions to authorize Lido, which enables access to your uploaded PDF.

Step 6: Type "," to go to the next argument and click "Select a file".

After establishing the connection, continue the formula by typing a comma to proceed to the next part, then click “Select a file” to open a file picker dialog. This allows you to navigate through your Google Drive to find your PDF.

Step 7: Find the PDF file you uploaded to Google Drive and select it.

Locate and select the previously uploaded PDF within your Google Drive. This is the file from which the tables will be extracted.

Step 8: Complete the formula with ", Sheet1!B2)" and press ENTER.

The final argument of the EXTRACTTABLESFROMPDF formula determines where the extracted data is placed. Here, Sheet1!B2 places the data in worksheet Sheet1, starting at cell B2.

Step 9: Right-click on cell A1 and select "Run action".

With the formula ready, right-click on cell A1 and choose “Run action” from the context menu. This command triggers the formula to start extracting table data from the PDF.

Step 10: Navigate to Sheet1 and make sure that the data is extracted correctly.

Finally, go to "Sheet1" to verify the results. Check that the tables from the PDF have been accurately extracted and are correctly displayed starting at cell B2.

Remember, this method extracts only tabular data; for non-tabular text, consider Method 1 or Method 2.

We hope that you now have a better understanding of how to extract data from a PDF.
