Using Google Sheets for Basic Data Analysis (Functions, etc)
The ultimate beginner's guide on how to get started with data analytics on Google Sheets. Learn about functions like VLOOKUP, using macros, and organizing data.
Spreadsheets are an excellent tool for organizing large amounts of data in an easy-to-understand format. You can set up dynamically updating calculations, or convert your spreadsheets into visual forms like graphs and charts for presentations.
But did you know that you can also use a spreadsheet to analyze your data?
Data analytics has emerged as a big buzzword in recent years, with companies working in all kinds of fields investing millions into acquiring and analyzing data. This is because data analytics can yield powerful insights that help drive better business decisions.
And you don’t have to be a data scientist armed with knowledge of cutting-edge programming languages to analyze your data. You can glean useful information from your spreadsheets just by leveraging the functions and tools already built into the applications used to create them.
How? Let’s find out.
What is Analytics?
Data analytics is a very broad and involved field. Advanced computing techniques and complex statistical models are often used to extract trends and patterns from large datasets.
At its very heart, however, data analytics is simply a way of making sense of data. While a spreadsheet is good for laying out your finances and calculating expenses, by itself it gives no outlook on what that data means.
By using functions to compare the data and filter through it, you can determine the areas of interest and draw inferences from it. For example, while a quick glance can tell you which of the products in your company’s portfolio has been selling well, only data analysis can reveal which products are increasing their sales and are generating more profits.
Generally, you will be analyzing the data to figure out the real-world causes behind the numbers and piece together the ‘big picture’.
Everyone knows the common functions like SUM, AVERAGE, or COUNT (in case you need a primer on spreadsheet basics, check out our detailed guide).
For most purposes, these basic functions are usually enough. To analyze your data effectively, however, you need a little more.
Fortunately for us, spreadsheet applications like Google Sheets or MS Excel include a number of functions suited for analytics.
ABS
The ABS, or ‘absolute’, function is one of the lesser-known functions in a spreadsheet. For those well versed in mathematics, it is equivalent to the modulus function, i.e. it gives the absolute value of a number, so that a value and its negative are treated as equal. (It should not be confused with the modulo operation, which returns a remainder.)
Basically, applying ABS to every cell in a range containing a mix of positive and negative numbers gives a set of non-negative values that differ only in magnitude, not in sign. That may not seem like a big thing, but when you are incorporating data from a number of columns into a complex calculation, the occasional negative value can throw off your results.
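To see why a stray negative value matters, here is a minimal sketch in Python rather than spreadsheet syntax. The figures are made up for illustration, and Python's built-in `abs` stands in for the ABS function.

```python
# Hypothetical daily deviations from a sales target (illustrative numbers only).
deviations = [120, -80, 45, -85]

# Raw sum: positive and negative values cancel each other out.
net = sum(deviations)

# Sum of magnitudes: abs() plays the role of the ABS function here.
total_magnitude = sum(abs(d) for d in deviations)

print(net)              # 0
print(total_magnitude)  # 330
```

The raw sum suggests that nothing happened, while the sum of magnitudes shows how much the figures actually moved. In a sheet, `=ABS(A1)` gives the same magnitude for a single cell.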
VLOOKUP
A big part of analytics ultimately comes down to finding information. It is not much of an issue when you are working with a single sheet with barely a dozen entries, but when your project spans multiple spreadsheets with hundreds of lines of data, you need a more efficient means of searching.
The VLOOKUP formula serves exactly this purpose. VLOOKUP essentially means ‘vertical lookup’, i.e. it searches down the first column of a range for a given value and returns the value from a chosen column in the same row. This is why you are mostly advised to use columns as fields, as it simplifies VLOOKUP operations later.
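The main catch is the way it matches: with its last argument set to FALSE, the function requires an exact value, and an approximate match is only reliable when the search column is sorted in ascending order. If your data is unsorted and you are looking for an approximate match, then VLOOKUP is not what you need.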
MATCH and INDEX
The MATCH function is quite useful in searching for cells containing an approximate (or even a particular) value in the spreadsheet. It gives the relative position of the target cell within a specified range, which makes it especially important in sifting through sorted values.
But it is in conjunction with the INDEX function that MATCH truly shines, acting as a more dynamic version of the VLOOKUP formula.
This is because INDEX returns the value of the cell at a given position within a range, while MATCH finds the position of the value you are looking for. By combining the two functions, you can look up exact or approximate values anywhere in the spreadsheet, just as with the VLOOKUP function.
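The division of labour between the two functions can be sketched in a few lines of Python. This is only a rough model, not how Sheets implements them: the `match` and `index` helpers below are hypothetical, `match` handles exact matches only, and the product data is invented.

```python
def match(search_key, values):
    """Return the 1-based position of search_key in values (exact match only)."""
    for position, value in enumerate(values, start=1):
        if value == search_key:
            return position
    raise LookupError(f"{search_key!r} not found")  # a sheet would show #N/A


def index(values, position):
    """Return the value at a 1-based position, like INDEX on a single column."""
    return values[position - 1]


# Hypothetical columns; the names and figures are made up for illustration.
products = ["Widget", "Gadget", "Gizmo"]
prices = [150, 300, 450]

position = match(300, prices)         # find where 300 sits in the price column
product = index(products, position)   # read the same row from another column

print(position, product)  # 2 Gadget
```

MATCH answers "where is it?" and INDEX answers "what is there?", so the position found in one column can be used to read any other column.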
Built-in functions are all well and good, but projects spanning multiple spreadsheets and hundreds of rows of data often become too cumbersome to manage manually. What if you could automate some of the functionality?
That is where macros come in. Both MS Excel and Google Sheets support macros: recorded or scripted sequences of actions that can reproduce almost anything you can do by hand in the spreadsheet.
Tired of bolding and increasing the font size of the first row of every spreadsheet? Use a macro. Bored with applying currency formatting to all numerical cells in the sheet? Macro.
Generally, any repetitive, large-scale task should be delegated to a macro, minimizing mistakes and freeing you up from the ‘busy work’ of managing a spreadsheet.
But how do macros work?
Creating a Macro
There are two main ways of creating a new macro: you can ‘record’ one or ‘script’ it. Recording is the simplest, if limited, way of getting started with macros. Scripting requires knowledge of coding but affords far more power and control.
For this guide, we will be sticking to recorded macros, but if you are interested in learning how to write one in script, refer to these dedicated guides for Google Sheets and Excel.
Both Excel and Google Sheets allow users to record macros. We will be using Google Sheets in our demonstration, but the same process can be carried over to Excel with little change.
Let us try this out in a new blank sheet.
To start recording your macro, head to Tools > Macros > Record Macro (in newer versions of Google Sheets, the Macros menu sits under Extensions instead).
A dialog box appears, telling you that a macro is being recorded, and giving you two options to toggle between.
Absolute Reference stores the exact locations of the cells you edit, which means that each time you run the macro, the same cells will be affected. As a result, this option is useful only for setting up title cells or the header row.
Relative Reference, on the other hand, adjusts according to the position of your cursor. This means that if you record a macro in which you edit properties of the four cells to the right of your cursor, the macro will affect those cells with the same position relative to your cursor when called.
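The difference between the two options can be modelled with a toy Python sketch. This is not how Google Sheets stores macros internally: the `record` and `run` helpers are hypothetical, and cells are represented as simple (row, column) pairs.

```python
def record(edits, start_cell, relative):
    """Store edits either as absolute cells or as offsets from start_cell."""
    if not relative:
        return list(edits)
    r0, c0 = start_cell
    return [((r - r0, c - c0), text) for (r, c), text in edits]


def run(macro, active_cell, relative):
    """Return {cell: text} that the macro would write when run from active_cell."""
    if not relative:
        return dict(macro)  # absolute: the recorded cells are reused as-is
    r0, c0 = active_cell
    return {(r0 + dr, c0 + dc): text for (dr, dc), text in macro}


# Edits made while recording, with cell (1, 1) selected at the start.
edits = [((1, 1), "Name"), ((2, 1), "Class"), ((3, 1), "Roll No")]

absolute_macro = record(edits, start_cell=(1, 1), relative=False)
relative_macro = record(edits, start_cell=(1, 1), relative=True)

# Run both macros with cell (5, 3) selected.
print(run(absolute_macro, (5, 3), relative=False))
print(run(relative_macro, (5, 3), relative=True))
```

The absolute macro writes to rows 1 to 3 of column 1 no matter which cell is selected, while the relative macro writes to rows 5 to 7 of column 3, following the selection.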
We will use relative referencing. Now perform a set of actions (we have just entered ‘Name’, ‘Class’, and ‘Roll No’ in three cells of a column). Note how the dialog box informs you of every action being recorded, be it entering something or changing the formatting. Remember that this is not a video recording; time is not a factor. You can take as long as you want, for only the sequence of actions is recorded.
Once you are done, click Save.
Not only can you save your macro with a name, but you can also set a custom shortcut for it, which can be very helpful if you plan on using it a lot. Save the macro.
To test out our new macro, click anywhere else on the spreadsheet and run it. You can use the shortcut if you designated one, or access the macro from the Tools > Macros menu.
As you can see, our macro performs flawlessly. You can use this technique to record all manner of repetitive tasks that you regularly carry out on your spreadsheet, automating them.
Analyzing a Simple Dataset
We have discussed some functions that can be useful in analyzing a spreadsheet, but descriptions can only go so far. Let us see some of those methods in action.
Here is the spreadsheet we are going to be analyzing:
Now, let us consider a few scenarios that might require analyzing this data. For example, suppose that you need the sales of the product with a selling price of $300. You can trawl through the spreadsheet, manually looking for the value, or you could just use VLOOKUP.
This is what the formula would look like:
The first ‘argument’ within the bracket is the value to look for. The second is the range of cells to search within. The third argument can be a little confusing; when multiple columns are included within the range, you need to specify which column should be selected for the results.
In our current example, we are going with an index of 2, as we need the sales numbers, which are in the second column of our range. The final argument specifies whether the search column is sorted; we set it to FALSE here, which also tells VLOOKUP to accept only an exact match.
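To make the four arguments concrete, here is a rough Python model of the lookup. It is a sketch, not the Sheets implementation: the `vlookup` helper is hypothetical, the table is invented and does not reproduce the spreadsheet above, and the sorted branch assumes the first column is in ascending order.

```python
import bisect


def vlookup(search_key, table, index, is_sorted=True):
    """Rough model of VLOOKUP over a list of rows.

    search_key: value to look for in the first column of table.
    table: list of rows (each row a list), standing in for the range.
    index: 1-based column number of the value to return.
    is_sorted: False for an exact match; True for the nearest value at or
               below search_key, assuming the first column is sorted ascending.
    """
    first_column = [row[0] for row in table]
    if is_sorted:
        pos = bisect.bisect_right(first_column, search_key) - 1
        if pos < 0:
            raise LookupError("#N/A")
        return table[pos][index - 1]
    for row in table:
        if row[0] == search_key:
            return row[index - 1]
    raise LookupError("#N/A")


# Hypothetical data: selling price in column 1, units sold in column 2.
table = [
    [150, 1200],
    [300, 860],
    [450, 430],
]

print(vlookup(300, table, 2, is_sorted=False))  # 860
```

With `is_sorted=False`, a search key of 320 would fail because no row has exactly that price. With the default `is_sorted=True`, it would fall back to the 300 row.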
As you can see, VLOOKUP gives us the information we need immediately, saving us a lot of time and effort. The best thing is that it can work across multiple sheets, searching through thousands of rows, which makes it an indispensable tool in analyzing large data sets.
Let us consider another scenario. Suppose that you need to find a product with a manufacturing price of around $8. We can use the MATCH formula.
This returns a value of 16. Since our range began at row number 1, we can use this directly as the row number, i.e. F16 is the cell we are looking for.
In a small sheet like this, we can simply navigate to the cell to check out its value, but in larger spreadsheets, the INDEX function comes in handy.
Note that we don’t have to specify the exact cell number, but only the array of cells and the offset. This means that you can directly plug in the results of the MATCH formula into INDEX without having to calculate which cell it refers to.
Also, you might notice that the value of the cell is actually $7, not $8. When MATCH cannot find an exact match in a range sorted in ascending order, it returns the position of the largest value below the one you searched for. If you want exact matches only, pass ‘0’ as the third argument of the formula.
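The fallback to the next lower value can be sketched in Python as follows. The `match` helper is a hypothetical stand-in for MATCH, the prices are made up rather than taken from the spreadsheet above, and the approximate branch assumes the values are sorted in ascending order.

```python
import bisect


def match(search_key, values, search_type=1):
    """Return a 1-based position, loosely following MATCH.

    search_type=1: values assumed sorted ascending; returns the position of
                   the largest value less than or equal to search_key.
    search_type=0: exact match only.
    Returns None where a sheet would show #N/A.
    """
    if search_type == 0:
        if search_key in values:
            return values.index(search_key) + 1
        return None
    pos = bisect.bisect_right(values, search_key)
    return pos if pos > 0 else None


manufacturing_prices = [3, 5, 7, 10, 12]  # made-up values, sorted ascending

print(match(8, manufacturing_prices))     # 3, the position of the cell holding 7
print(match(8, manufacturing_prices, 0))  # None, since no cell holds exactly 8
```

Searching for 8 lands on the cell holding 7 because 7 is the largest value that does not exceed the search key. With a search type of 0, the same search finds nothing.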
We have barely scratched the surface of what data analytics is capable of. While there is much that you can accomplish with these simple tools alone, Google Sheets (and Excel, to some extent) offers many more ways of analyzing your data. Here are some resources if you wish to explore more: