In this simple guide, we are going to explain how to normalize data in Excel by using the built in STANDARDIZE function.
This guide will cover:
- Why we normalize data
- Calculating the mean
- Calculating the standard deviation
- The STANDARDIZE formula
- Example output
Why we normalize data
When working with data sets, the overall range and scale of values we are working with can vary greatly – especially when working with multiple types of measurements we are looking to compare.
This is where normalization comes in. By changing the overall scale of the values we are working with it can become far easier to analyse the results.
By normalizing a set of data, we are looking to convert each value into a z-score.
A z-score is a value that provides information on how far an original value is from the overall mean of the dataset. As the mean is set to zero in this output, this is done in terms of positives and negatives. For example, a z-score of 1.04 is a value that is above the average score, while -0.65 is below the average.
To normalize data into a set of z-scores, we will need two additional values:
Mean: The mean value is simply the overall average value of the numbers we are working with.
Standard Deviation: The Standard Deviation is a value that represents the amount of variation that sits within a dataset, and provides an indication of how much the numbers differ from one another.
Now that we have summarized the basics of what each value means and what is required, lets have a look at how we go about calculating how to normalize data in Excel.
Excel is surprisingly powerful when it comes to statistical analysis, with a series of built in functions to make this process quick and easy. There are three key steps involved all using simple formulas which will be covered below. We will also explain the manual approaches to these formulas.
Calculating the mean
The mean value is nice and easy to calculate in Excel.
Lets say we have a single column worth of values as seen in our example below, all we need to do is use the AVERAGE formula across all of these cells, with the resulting value being the mean.
In our data set we have 13 values, sitting across cells B3 down to B15. The formula in this case will be:
Using the average formula in excel makes this a very simple formula, however we can also calculate this manually using addition and division.
In this case the formula would be –
Either of these formulas will do the job, although the AVERAGE function is a bit less effort and has less room for error by typing in the wrong number to divide by. In this case we had 13 values and therefore entered 13 in the formula.
Now that we have the mean for our set of numbers, lets move on to the Standard Deviation
Calculating the Standard Deviation
The approach for calculating the standard deviation within our dataset is also quite easy, and similar to the process used to calculate the mean thanks to Excel’s built in STDEV function.
Using the same values as our previous example, the formula we will need to use is:
Note the .P within this formula. There is also the STDEV.S option which takes a sample of the data rather than the full amount.
The output of this formula will look like this:
We now have both of the required values in our dataset to move on to the next step – normalizing the data.
The Excel STANDARDIZE function
To normalize our data in Excel, we will be using the built in STANDARDIZE function, which takes our original values, and brings in both the mean and standard deviation of the data we calculated earlier and outputs a z-score. This makes the process quick and easy.
The formula for STANDARDIZE is as follows:
=STANDARDIZE (value, mean, standard_deviation)
To calculate the z-scores we simply select the original numbers under value, and then select the cells where our mean and standard deviation are sitting.
The formulas and resulting z-scores for our set of numbers will look like the below:
As we can see, all 13 of our numbers have now been converted into a far smaller range, with every number being based on the mean. Anything above our mean of 254 is shown as a positive value, with anything below that as a negative value. This will make it far easier to analyse the data further, and compare to different data sets.
There are two additional points worth discussing for this formula:
As seen in the above image – we have used a dollar sign in a few spots. This is simply locking the formula to the F3 and F7 cells, so when we drag the formula down through the rows those values dont change to F4, F5 etc and look in blank cells or use incorrect values. The dollar sign before the letter locks the column, and the dollar sign before the number locks the row.
Secondly, this can also be calculated in a slightly more manual way by simply using the z-score calculation. The mathematical formula for a z-score is as follows:
In the above formula, the values represent:
x: The original value
μ: The Mean
σ: The Standard Deviation
With this in mind, to calculate the same z-score of -0.68 that we saw for the value of 54 in cell B3, we could also use the below formula:
The STANDARDIZE function is easier, and has less room for error, but its useful to know the process behind this, and allows for quick calculations when using other spreadsheet software outside of Excel where the function is not available.
This sums up our guide on how to normalize data in Excel using the STANDARDIZE function. As we can see, Excel provides some surprisingly powerful statistical analysis that can be performed quickly and easily using a few basic formulas.
For more tips and tutorials on using Excels built in functions be sure to check out our Excel Tips page