How to Calculate Variance in Excel
So you’ve been asked to calculate variance using Excel, but you aren’t sure what that means or how to do it. Don’t worry, it’s an easy concept and an even easier process. You’ll be a variance pro in no time!
What Is Variance?
“Variance” is a way to measure how far the values in a dataset sit, on average, from the mean. Strictly speaking, it is the average of the squared distances from the mean. The “mean” is the sum of all values in a dataset divided by the number of values. Variance gives us an idea of whether the values in that dataset tend to stick close to the mean or scatter all over the place.
Mathematically, variance isn’t that complex:
- Calculate the mean of a set of values. To calculate the mean, take the sum of all the values divided by the number of values.
- Take every value in your set and subtract the mean from it.
- Square the resulting values (to cancel out negative numbers).
- Add all the squared values together.
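- Calculate the mean of the squared values to get the variance. Dividing by the number of values gives the population variance; for sample variance, divide by one less than the number of values instead.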
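The steps above can be sketched in a few lines of Python as a cross-check. This is a minimal illustration rather than a description of how Excel works internally; the function name `population_variance` and the sample values are made up for the example.

```python
def population_variance(values):
    """Follow the manual steps: mean, deviations, squares, sum, average."""
    n = len(values)
    mean = sum(values) / n                    # step 1: the mean
    deviations = [v - mean for v in values]   # step 2: each value minus the mean
    squared = [d ** 2 for d in deviations]    # step 3: square to remove negative signs
    total = sum(squared)                      # step 4: add the squares together
    return total / n                          # step 5: average of the squares


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]
result = population_variance(data)
print(result)  # 4.0
```

Because the last step divides by the number of values, this sketch gives the population variance. Dividing by one less than the number of values would give the sample variance.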
So as you can see, it’s not a hard value to calculate. However, if you have hundreds or thousands of values, it would take forever to do manually. So it’s a good thing that Excel can automate the process!
What Do You Use Variance For?
Variance by itself has a number of uses. From a purely statistical perspective, it’s a good shorthand way to express how spread out a set of data is. Investors use variance to estimate the risk of a given investment.
For example, by taking a stock’s value over a period of time and calculating its variance, you’ll get a good idea of its volatility in the past. Under the assumption that the past predicts the future, it would mean that something with low variance is safer and more predictable.
You can also compare the variances of something across different time periods. This can help detect when another hidden factor is influencing something, changing its variance.
Variance is also closely related to another statistic known as the standard deviation. Remember that the values used to calculate variance are squared. This means that variance is not expressed in the same unit as the original values. The standard deviation is the square root of the variance, which returns the value to the original unit. So if the data was in kilograms, the variance is in square kilograms, while the standard deviation is in kilograms.
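To make the unit relationship concrete, here is a small Python sketch using the standard `statistics` module. The weights are hypothetical, and the population version of the calculation is used purely for illustration.

```python
import math
import statistics

# Hypothetical weights in kilograms.
weights_kg = [60.0, 62.0, 66.0, 72.0]

variance = statistics.pvariance(weights_kg)  # expressed in square kilograms
std_dev = math.sqrt(variance)                # back in kilograms

print(variance)  # 21.0
print(std_dev)
```

In Excel itself, the STDEV.S and STDEV.P functions return the standard deviation directly, so there is no need to take the square root by hand.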
Choosing Between Population and Sample Variance
There are two subtypes of variance with slightly different formulas in Excel. Which one you should choose depends on your data. If your data includes the entire “population” then you should use population variance. In this case “population” means that you have every value for every member of the target population group.
For example, if you’re looking at the weight of left-handed people, then the population includes every individual on Earth who’s left-handed. If you’ve weighed them all, you’d use population variance.
Of course, in real life we usually settle for a smaller sample from a larger population, in which case you’d use sample variance. Population variance is still practical with smaller populations. For example, a company may have a few hundred or a few thousand employees with data on each employee. They represent a “population” in the statistical sense.
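For reference, the two variants are usually written as follows. The symbols are standard textbook notation rather than anything taken from Excel: N is the size of the population, n is the size of the sample, \mu is the population mean and \bar{x} is the sample mean.

```latex
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2
\qquad
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```

The only structural difference is the divisor: the population formula divides by the number of values, while the sample formula divides by one less than that, which makes the sample result slightly larger for the same data.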
Choosing the Right Variance Formula
There are three sample variance formulas and three population variance formulas in Excel:
- VAR, VAR.S and VARA for sample variance.
- VARP, VAR.P and VARPA for population variance.
You can ignore VAR and VARP. These are outdated and are only around for compatibility with legacy spreadsheets.
That leaves two pairs: VAR.S and VAR.P, which calculate the variance of a set of numerical values, and VARA and VARPA, which also take text strings and logical values into account.
VARA and VARPA count any text string as the numerical value 0. The logical values TRUE and FALSE are counted as 1 and 0 respectively.
The biggest difference is that VAR.S and VAR.P skip over any non-numerical values in the range. Skipped cells are left out of the count of values, so both the mean and the resulting variance can differ from what VARA and VARPA return for the same range: you’re dividing by a smaller number of cases, and no zeros are added to the data.
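The effect of the two approaches can be illustrated with a rough Python emulation. This is only a sketch of the behaviour described above, not Excel’s actual implementation; the helper names `numeric_only` and `text_as_zero` and the column contents are invented for the example.

```python
import statistics


def numeric_only(cells):
    """Mimic VAR.S on a range: keep numbers, skip text and logical values."""
    return [c for c in cells
            if isinstance(c, (int, float)) and not isinstance(c, bool)]


def text_as_zero(cells):
    """Mimic VARA on a range: TRUE -> 1, FALSE -> 0, any text -> 0."""
    converted = []
    for c in cells:
        if isinstance(c, bool):     # logical value (checked before numbers)
            converted.append(1 if c else 0)
        elif isinstance(c, str):    # any text string
            converted.append(0)
        else:                       # ordinary number
            converted.append(c)
    return converted


# Hypothetical column: three numbers, one text entry, one logical TRUE.
column = [10, 12, "n/a", 14, True]

var_s_like = statistics.variance(numeric_only(column))  # uses 10, 12, 14
vara_like = statistics.variance(text_as_zero(column))   # uses 10, 12, 0, 14, 1

print(var_s_like)
print(vara_like)
```

With this made-up column, the emulated VARA result is far larger than the VAR.S one, because the text entry and the logical value are pulled in as 0 and 1 next to numbers in the 10 to 14 range. If text cells in your data really mean "no measurement", VAR.S or VAR.P is usually the safer choice.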
How to Calculate Variance in Excel
All you need to calculate variance in Excel is a set of values. We’re going to use VAR.S in the example below, but the formula and methods are exactly the same regardless of which variance formula you use:
- Assuming you have a range or discrete set of values ready, select the empty cell of your choice.
- In the formula field, type =VAR.S(XX:YY), where XX and YY are replaced by the references of the first and last cells of the range, for example =VAR.S(A1:A10).
- Press Enter to complete the calculation.
Alternatively, you can specify the values directly, in which case the formula looks like =VAR.S(1,2,3,4), with the numbers replaced by whatever you need to calculate the variance of. You can enter up to 254 values manually like this, but unless you only have a handful of values, it’s almost always better to enter your data in a cell range and then use the cell range version of the formula discussed above.
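If you want to sanity-check a result outside Excel, Python’s standard `statistics` module offers counterparts to both formulas. The sketch below assumes the same four values as the example formula above.

```python
import statistics

# Python counterpart of =VAR.S(1,2,3,4): sample variance, dividing by n - 1.
sample_var = statistics.variance([1.0, 2.0, 3.0, 4.0])

# Python counterpart of =VAR.P(1,2,3,4): population variance, dividing by n.
population_var = statistics.pvariance([1.0, 2.0, 3.0, 4.0])

print(round(sample_var, 4))  # 1.6667
print(population_var)        # 1.25
```

The spreadsheet should show matching values, up to how many decimal places the cell is formatted to display.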
You Can Excel at, Er, Excel
Calculating variance is a useful trick to know for anyone who needs to do some statistical work in Excel. But if any of the Excel terminology we used in this article was confusing, consider checking out Microsoft Excel Basics Tutorial – Learning How to Use Excel.
If, on the other hand, you’re ready for more, check out Add a Linear Regression Trendline to an Excel Scatter Plot so you can visualize variance or any other aspect of your data set in relation to the arithmetic mean.