Learn the various facets of different data analysis tools such as statistics and parameters. Additionally, learn the role they can play in your business.
In the tutorial on calculating standard deviation in Google Sheets, I mentioned that Google Sheets has different formulas for standard deviation depending on whether you are dealing with a sample or a population.
The statistic versus parameter distinction follows directly from the sample versus population distinction in statistics.
The following are the definitions of population, parameter, sample, and statistic, taken from PennState Online:
Essentially, a statistic describes the sample you draw from the population you are studying, while a parameter describes that entire population. The sample serves as a representative of the population.
One question you may ask is: why not gather data about all the members of the population? The reason is that it is often impossible or impractical. One example of gathering data about an entire population is the U.S. Census, which occurs every ten years. The budget for the 2020 Census is $7.2 billion! (Source) The large budget is meant to ensure that all Americans are accounted for in the Census data. The resulting figures describe the whole population, so they can all be considered parameters.
To get the statistic as close to the parameter as possible, the sample has to be drawn carefully. Sampling is designed so that most, if not all, of the possible subgroups of the population are included in the sample and therefore represented in your data analysis. It is a meticulous process.
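One way to make sure every subgroup is represented is stratified sampling, where the same fraction is drawn from each subgroup. The following is a minimal sketch in Python, not a prescribed method: the customer records, region names, order values, and the `stratified_sample` helper are all hypothetical and made up for illustration.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_sample(records, key, fraction, rng):
    """Draw the same fraction from every subgroup, keeping at least one member of each."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    sample = []
    for members in groups.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 1,000 customers in three regions with made-up order values.
rng = random.Random(0)
strata = [("north", 600, 20, 60), ("south", 300, 40, 90), ("west", 100, 80, 150)]
population = [
    {"region": region, "order_value": rng.uniform(low, high)}
    for region, size, low, high in strata
    for _ in range(size)
]

sample = stratified_sample(population, key="region", fraction=0.1, rng=rng)

parameter = mean(c["order_value"] for c in population)  # describes the whole population
statistic = mean(c["order_value"] for c in sample)      # describes only the sample

print(f"sample size: {len(sample)}")
print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

Because each region contributes in proportion to its size, the sample mean should land close to the population mean, although the two will rarely be identical.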
Here are some examples that highlight how the statistic and the parameter are related and how they differ:
To better grasp their differences, let’s put them in context. In running our business, we want to reach our target market and convince them to buy our products and use our services. To know how best to reach them, we conduct market research. In conducting market research, we process the available business data that describes a slice of our target market. This can include the data we gather from our site visitors, customer transaction data, and so on. These are data about a sample of our target market. We apply data analysis to summarize the data into meaningful metrics, and we use those metrics to infer how our target market, which is our population, behaves.
The framing of the question is important, too. If you want to study the behavior of your own customers, then all your customers now serve as the population. Since you have data on all of your customers, you can calculate the parameters of that population directly.
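Whether the same numbers are treated as a population or as a sample changes which formula applies. Here is a minimal sketch using Python's standard `statistics` module; the five order values are made up for illustration. `pstdev` divides by N (the population version, comparable to `STDEVP` in Google Sheets), while `stdev` divides by N - 1 (the sample version, comparable to `STDEV`).

```python
from statistics import pstdev, stdev

# Hypothetical order values for five customers (made-up numbers).
order_values = [20, 25, 30, 35, 40]

# Treat the five customers as the entire population: divide by N.
population_sd = pstdev(order_values)

# Treat them as a sample drawn from a larger population: divide by N - 1.
sample_sd = stdev(order_values)

print(f"population SD (parameter): {population_sd:.3f}")
print(f"sample SD (statistic):     {sample_sd:.3f}")
```

The sample version comes out slightly larger because dividing by N - 1 compensates for the fact that a sample tends to understate the spread of the full population.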
The choice between the statistic and the parameter matters because the formulas for the statistic and parameter versions of the same metric can differ, giving you different values. Google Sheets accounts for this. The table below contains a short list of formulas where the statistic and parameter versions are different: