It’s important to look at the distribution of your data when considering your performance. I’ve written before about the dangers of only looking at your average loading time. Averages can be very misleading. I’ve seen plenty of sites that have a 4-second average loading time but a 20-second 90th percentile loading time. That’s why we offer a histogram view and always encourage our customers to track their goals using their 90th or 95th percentile loading time.
We’ve also had requests to include the geometric mean as one of our featured metrics. We thought that was a great idea and geometric mean is now featured on your Torbit dashboard along with your existing metrics (Median, Average, 90th Percentile, 95th Percentile, and 99th Percentile).
For those of you who are unfamiliar with a geometric mean, here is a quick explanation of what this new metric means for you and your performance data.
As you know, there are a lot of different factors that influence how fast your website loads. The geographic proximity of your server to your visitors has a big impact on your speed. It also makes a difference which browser each visitor is using and whether they are on a fast internet connection or not. When you look at your performance data as a whole, you are seeing the combined effect of many independent variables, and the result usually looks like the graph below. The data does not follow a normal distribution; it is skewed to the right, with a long tail of slow page loads. However, if you took the logarithm of every data point and re-graphed, you would see something very close to a normal distribution, the standard bell curve. That is why this shape is called a log-normal distribution.
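To make the log-normal idea concrete, here is a minimal sketch in Python. It does not use real Torbit data: the load times are simulated, and the seed and the `mu`/`sigma` parameters are arbitrary assumptions chosen only for illustration. It compares the gap between mean and median before and after taking logarithms; a large gap suggests skew, while a gap near zero is consistent with a symmetric bell curve.

```python
import math
import random
import statistics

# Simulated page load times in seconds (hypothetical, not real measurements).
# lognormvariate(mu, sigma) draws values whose logarithm is normally distributed.
rng = random.Random(42)
load_times = [rng.lognormvariate(1.0, 0.8) for _ in range(10_000)]

# Take the logarithm of every data point.
log_times = [math.log(t) for t in load_times]

# In a right-skewed distribution the mean sits well above the median.
raw_gap = statistics.mean(load_times) - statistics.median(load_times)

# After the log transform the mean and median should nearly coincide.
log_gap = statistics.mean(log_times) - statistics.median(log_times)

print(f"raw data: mean - median = {raw_gap:.3f} s")
print(f"log data: mean - median = {log_gap:.3f}")
```

If your own data behaves the same way, with a large gap on the raw values and almost none on the logged values, that is a reasonable hint that it is roughly log-normal.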
The arithmetic mean (what we usually think of as an average) is very susceptible to outliers. In pageload times, it’s easy to have a few really slow data points that skew your data. That’s not a problem if you have a normal distribution, since the outliers on either side balance each other out (both visually and mathematically). The problem is that we don’t have a normal distribution; we have a log-normal distribution. As it turns out, when you have a log-normal distribution, the geometric mean is a much better way of representing the central tendency of your data.
A geometric mean is calculated by multiplying your data points together and taking the nth root (n being the number of data points you have) of the resulting product. Equivalently, it is the exponential of the average of the logarithms of your data points. Because the calculation is multiplicative, a given percentage change in any data point has the same effect on the geometric mean, no matter how large or small that data point is. In this way, the geometric mean keeps outliers from carrying undue weight. To learn more about the geometric mean, I’d recommend heading to Wikipedia for a more in-depth explanation of how it is calculated and when it’s most useful.
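Here is a minimal sketch of that calculation in Python. This is not how the Torbit dashboard computes the metric; the `geometric_mean` helper and the sample load times are hypothetical and exist only for illustration. The sketch averages the logarithms rather than multiplying everything together, which gives the same result as the nth root of the product while avoiding overflow on large data sets.

```python
import math
import statistics


def geometric_mean(values):
    """Return the nth root of the product of n positive values.

    Computed as exp(mean(log(v))), which is mathematically equivalent
    and does not overflow when there are many data points.
    """
    if not values:
        raise ValueError("need at least one data point")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical load times in seconds, with one very slow outlier.
load_times = [1.2, 1.5, 1.8, 2.0, 2.4, 3.1, 20.0]

arithmetic = statistics.mean(load_times)
geometric = geometric_mean(load_times)

print(f"arithmetic mean: {arithmetic:.2f} s")
print(f"geometric mean:  {geometric:.2f} s")
```

With this sample, the single 20-second page load pulls the arithmetic mean well above most of the data points, while the geometric mean stays much closer to the typical load time. On Python 3.8 and later, the standard library’s `statistics.geometric_mean` should give the same result.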
Your geometric mean will likely be the lowest value on your dashboard, but we didn’t just add this to make you feel better about your site speed. Our goal is always to give you more transparency and a more holistic view into your website performance.