How to Normalize Data in Excel

If you need to convert individual raw data values into normalized data which shows the data as a percentage of a given time point in Excel, there’s an easy method to do so. This method allows you to show the percentage of each group’s time 0, making it easier to compare data sets.

To begin, we start with a data set that includes replicates, which are raw data, and we’re showing this in an arbitrary unit as a response to a stimulus. We have time down the side here, from naught to 6 minutes, and each of our replicates, which could be different subjects, for instance, 1, 2, 3 & 4, have responded to the stimulus in a way that has changed their response measurement.

On average, the response goes up and then comes down again, but there is a large standard deviation here simply because the data themselves are very variable. If we plot these data as a mean, there is indeed a peak at 3 minutes and then it comes down at 6, but the variation of the variance indicated by the standard deviation bars is very large. Therefore, the group data may not be suitable for presenting because any differences might be masked within this large variance.

To plot these data as a percentage of time 0, we need to set up a little formula that divides this cell by itself and then multiplies it by a hundred to give us the hundred-percent. Then, the second formula down is this cell divided by this cell multiplied by a hundred and then this cell multiplied by this cell, creating a series of percentages that show a percentage of our zero time point for subject one.

To do this, you need to enter the formula into a pre-prepared little space. You type in equals and click on that cell. The best thing to do is actually put these in brackets first, which makes life a little bit easier for the maths. Then, you select d6, divide it by D, and type in dollar six, which limits you to row number six.

Next, you multiply it by a hundred to give you a percentage and hit the Return button. You can copy this down to the next cell by clicking on the box in the corner and expanding. The second cell will also have the D dollar six in the formula, so it is still referring to d6. You can then click on that cell and drop that down all the way to the bottom.

You can then do the same with cell number two, type equals open parentheses the cell divided by e dollar six and then multiply that by 100. You can make life easy for yourself by then just dragging that across all the way and then dragging it all the way down. You will get a replica of your data there, and you can move the decimal places so it looks a little bit easier on the eye and reduce the decimal places our standard deviation just so you can compare them.

What you have now is a data set where the standard deviation of your times zero is zero because they’re all 100% of themselves. This is one of the reasons why I’ve neglected to put these data on this graph because we don’t really want to show the hundred percent with no variation because all the subjects are a hundred percent of themselves.

Now you can see the data go up and peak at three minutes and then fall down and trough back at six minutes. You might want to expand your graph to fill this zero time point so you can click on your graph and bring that up to show time point zero as well just for continuity sake. You can now see that we’ve plotted a very similar data set with the mean number looking the same shape as it was in the group data, but now all of these individual data points are plotted, shown as a percentage of their own time zero.

There’s also another way of doing this, which would be all the numbers as a percentage of the average of time zero, but given that we have quite a high variation in our baseline, you wouldn’t always want to choose this method. If you’re doing the correct analyses, for instance, if you’re doing a two-way or one-way ANOVA with this data set, you might well pick up differences between the groups without needing to show it graphically.

However, it’s worth being aware that normalizing data isn’t always the right thing to do, but it’s a very simple math that may help you understand your data.