A lot of data is available over the internet in various formats. Fortunately, Google Sheets has robust functions to import them to your spreadsheet. These functions are the IMPORTDATA, IMPORTFEED, IMPORTHTML, and IMPORTRANGE. We will learn how to use them in this tutorial.
One of the simplest data file formats available over the Internet is the CSV file, where the acronym means “comma-separated values.” It is simply a file containing entries in a table, with cells separated by commas and rows separated by line breaks. A CSV file can look like this:
The format is simple but is nonetheless effective in storing arrays of data. CSV is still a popular format for storing data as CSV files are readable in simple apps such as Notepad.
Another related format is the TSV file, where the T stands for “tab”. Instead of a comma separating the cells, the tab does the job and makes it more readable since the file is formatted visually like an actual table, minus the cell borders. A TSV file can look like this:
Both CSV and TSV files hosted over the internet can be directly imported to Google Sheets using the IMPORTDATA function. The syntax is as follows:
... where URL is the URL to the hosted CSV or TSV file.
You can use IMPORTDATA for publicly-available data, as a lot of them are encoded in CSV format.
The data is in csv format. We simply copy the link address, and then we go to Google Sheets to use the IMPORTDATA function: =IMPORTDATA("http://www2.census.gov/programs-surveys/popest/datasets/2010-2019/national/totals/nst-est2019-alldata.csv")
The result looks like this:
That’s it! IMPORTDATA is as simple as that.
RSS and ATOM feeds are a format for delivering news from a source to a device. You can subscribe to a certain RSS or ATOM feed to stay updated.
You can use the IMPORTFEED function to spice up your dashboard a bit. Here is the syntax of the IMPORTFEED function:
IMPORTFEED(URL, [query], [headers], [num_items])
... where URL is the URL of the feed; query specifies what to fetch from the feed (by default, it’s the items of the feed); headers specify whether to include the feed headers or not; and num_items to limit the number of items loaded in the sheet.
As an example, we want to import the latest news from CNBC using the IMPORTFEED function: =importfeed("http://www.cnbc.com/id/19746125/device/rss/rss.xml")
Our sheet now looks like this:
You may need to do some adjustments to your sheet to make the feed look more aesthetically pleasing.
Sometimes the relevant data is presented as a table in the webpage that is being updated regularly. Fortunately, Google Sheets has the IMPORTHML function that can scan a webpage for a table or a list and then import them to the sheet. The syntax of the function is as follows:
IMPORTHTML(URL, query, index)
... where URL is the URL of the webpage; query can either be a “list” or a “table”, but must be specified; and index is to help the function identify which table or list should be imported, especially if the webpage has more than 1 table or list.
Both the URL and the data range should be enclosed by quotation marks. The workbook must be either a public workbook or that you have access to it. If the workbook you are including is also yours, a prompt will pop up, asking you to connect the sheets. Click Allow access.
Afterward, the data will be loaded.
If the sheets are located in the same workbook (thus having the same URL), they still need to be added through the IMPORTRANGE function individually.
Summary, plus a sample sheet
After reading this tutorial, you are now ready to extract more data from the Internet to your sheet!