Given a discrete data set, the calculator computes the mean, variance, and standard deviation of a sample or a population and shows all the intermediate steps of the calculation.
Standard deviation is one of the most commonly used metrics to characterize the statistics of a given data set. The standard deviation, in simple terms, is a measure of how scattered the data set is. By calculating standard deviation, you can find out whether the numbers are close to or far from the mean. If the data points are far from the mean, then there is a large deviation in the data set. Thus, the greater the scatter in the data, the higher the standard deviation.
This calculator computes the standard deviation of a given data set and displays the mathematical steps involved in the calculation.
The calculator accepts the input as a list of numbers separated by a delimiter. A few examples of possible input are shown in the table below.
| Row input | Column input (one value per line) | Column input (trailing commas) | Column input (several values per line) |
|---|---|---|---|
| 44, 63, 72, 75, 80, 86, 87, 89 | 44 | 44, | 44,63,72 |
| 44 63 72 75 80 86 87 89 | 63 | 63, | 75,80 |
| 44,, 63,, 72, 75, 80, 86, 87, 89 | 72 | 72, | 86,87 |
| 44 63 72 75, 80, 86, 87, 89 | 75 | 75, | 89 |
| 44; 63; 72, 75,, 80, 86, 87, 89 | 80 | 80, | |
| 44,,, 63,, 72, 75, 80, 86, 87, 89 | 86 | 86, | |
| 44 63,, 72,,,, 75, 80, 86, 87, 89 | 87 | 87, | |
| | 89 | 89, | |
The numbers can be separated by commas, semicolons, spaces, line breaks, or a mix of them, and repeated delimiters are ignored. They can be entered in either row or column format. For all the formats shown in the above table, the calculator processes the input as 44, 63, 72, 75, 80, 86, 87, and 89.
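The delimiter handling described above can be sketched in a few lines of Python. This is only an illustration of one way to tokenize such input, not the calculator's actual implementation; the function name `parse_numbers` and the decision to treat semicolons as delimiters (as one row of the table suggests) are assumptions.

```python
import re


def parse_numbers(text: str) -> list[float]:
    """Split text on any run of commas, semicolons, or whitespace.

    Hypothetical helper: repeated or mixed delimiters collapse into one,
    and empty tokens (for example after a trailing comma) are dropped.
    """
    tokens = re.split(r"[,;\s]+", text)
    return [float(token) for token in tokens if token]


# Row input with mixed and repeated delimiters.
print(parse_numbers("44; 63; 72, 75,, 80, 86, 87, 89"))

# Column input with trailing commas, one value per line.
print(parse_numbers("44,\n63,\n72,\n75,\n80,\n86,\n87,\n89,"))
```

Both calls should yield the same eight values, matching the behavior described for the table.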
After entering the data, select whether it is sample or population data and press Enter. The calculator displays five statistics of the data set: count (number of observations), mean, sum of squared deviations, variance, and standard deviation.
The calculator is designed to compute the standard deviation of a discrete data set and provides an insight into the theory behind the calculation.
The data may represent a population, that is, all possible observations in an experiment (of any kind) under the specified conditions. In many cases, it is impossible to measure every member of the population.
In statistical practice, it is common to work with a subset of a larger 'population', which we refer to as a 'sample'. This is because it is often impractical or impossible to collect data from every individual in the population. We make estimates or inferences about the population based on the information gathered from the sample.
When calculating standard deviation, the formula depends on whether we are dealing with a sample or the entire population. The difference lies in the divisor, which reflects the 'degrees of freedom'. For a sample, we divide the sum of squared deviations by n - 1 (where n is the sample size) instead of n when calculating the variance, and then take the square root of the variance to find the standard deviation. This correction compensates for the fact that we are using sample data to estimate a population value, and it makes the variance estimate unbiased.
Standard deviation measures the typical dispersion (variability) of a data set relative to its mean. It is often denoted by the Greek letter σ for a population or s for a sample. A larger value of σ or s implies a larger dispersion of data points around the mean, and vice versa.
Consider the following examples of data sets.
(Set I)
11, 3, 5, 21, 10, 15, 20, 25, 13, 26, 27
(Set II)
12, 14, 14, 15, 15, 16, 16, 17, 18, 19, 20
Entering these data sets into the calculator as samples gives s = 8.39 for Set I and s = 2.37 for Set II.
Both sets have the same mean (16), but in Set I the numbers deviate significantly from the sample mean (s = 8.39), while in Set II the variability is much smaller (s = 2.37).
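The comparison of the two sets can be reproduced with Python's standard `statistics` module. This is a quick cross-check rather than the calculator's own code; `statistics.stdev` uses the sample formula (division by n - 1), and the printed values should agree with those quoted above up to rounding.

```python
import statistics

set_1 = [11, 3, 5, 21, 10, 15, 20, 25, 13, 26, 27]
set_2 = [12, 14, 14, 15, 15, 16, 16, 17, 18, 19, 20]

for name, data in (("Set I", set_1), ("Set II", set_2)):
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
    print(f"{name}: n = {len(data)}, mean = {mean}, s = {s:.2f}")
```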
The population formula applies when every value in the population is analyzed.
$$σ = \sqrt{\frac{\sum_{i=1}^{N}(x_i-μ)^2}{N}}$$
The formula below is used when the population is too large to measure in full and only a sample of it is analyzed.
$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}}$$
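To see how the two formulas differ in practice, the short sketch below applies both to the same eight values used in the input-format table. It relies on `statistics.pstdev` (divisor N) and `statistics.stdev` (divisor n - 1) from the Python standard library; treating the same list once as a population and once as a sample is purely for illustration.

```python
import math
import statistics

data = [44, 63, 72, 75, 80, 86, 87, 89]
n = len(data)

sigma = statistics.pstdev(data)  # population formula: divide by N
s = statistics.stdev(data)       # sample formula: divide by n - 1

print(f"population sigma = {sigma:.3f}")
print(f"sample s         = {s:.3f}")

# The two results differ only by the factor sqrt(n / (n - 1)),
# so s is always slightly larger than sigma for the same data.
print(math.isclose(s, sigma * math.sqrt(n / (n - 1))))
```

The gap between the two values shrinks as n grows, because n / (n - 1) approaches 1.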
The following steps are involved in the calculation of standard deviation.
Step 1: Calculate the sample/population mean. It is the sum of all data points divided by the number of data points (n for a sample, N for a population), i.e.
Sample mean:
$$\bar{x}=\frac{x_1+x_2+x_3+\dots+x_n}{n}$$
Population mean:
$$\mu=\frac{x_1+x_2+x_3+\dots+x_N}{N}$$
Step 2: Calculate the deviations by subtracting sample/population mean from each data point, i.e.
Sample deviations:
$$(x_1-\bar{x}),\ (x_2-\bar{x}),\ (x_3-\bar{x}),\ \dots,\ (x_n-\bar{x})$$
Population deviations:
$$(x_1-\mu),\ (x_2-\mu),\ (x_3-\mu),\ \dots,\ (x_N-\mu)$$
Step 3: Calculate the squared deviations for each data point.
Sample squared deviations:
$$(x_1-\bar{x})^2,\ (x_2-\bar{x})^2,\ (x_3-\bar{x})^2,\ \dots,\ (x_n-\bar{x})^2$$
Population squared deviations:
$$(x_1-\mu)^2,\ (x_2-\mu)^2,\ (x_3-\mu)^2,\ \dots,\ (x_N-\mu)^2$$
Step 4: Calculate the sum of the squared deviations by adding all individual squared deviations.
Sample sum of squared deviations:
$$SS=(x_1-\bar{x})^2+(x_2-\bar{x})^2+(x_3-\bar{x})^2+\dots+(x_n-\bar{x})^2$$
Population sum of squared deviations:
$$SS=(x_1-\mu)^2+(x_2-\mu)^2+(x_3-\mu)^2+\dots+(x_N-\mu)^2$$
Step 5: Divide the sum of the squared deviations by the number of degrees of freedom to get the variance. For a population, divide by N, and for a sample, divide by n-1.
Sample variance
$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}$$
Population variance
$$\sigma^2 = \frac{\sum_{i=1}^{N}(x_i - \mu)^2}{N}$$
When calculating the variance of a sample, it might seem natural to use the expression
$$\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n}$$
where x̄ is the sample mean and n is the sample size. However, this formula is not used.
Such an expression would not give a good estimate of the population variance. The deviations are measured from the sample mean, which is computed from the same data and therefore lies closer to the sample points than the true population mean does. As a result, dividing by n tends to underestimate the population variance, and the effect is most noticeable when the sample is small.
Instead of dividing by n, we find the sample variance by dividing by n-1. This gives a slightly larger value that, on average, matches the population variance.
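The effect of the divisor can be checked exactly on a toy case. The sketch below uses a made-up three-value population (an assumption chosen only for illustration), enumerates every possible sample of size 2 drawn with replacement, and averages the variance computed with each divisor. Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product

population = [2, 4, 6]  # hypothetical toy population, for illustration only


def variance(values, correction):
    """Sum of squared deviations divided by (len(values) - correction)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    ss = sum((Fraction(v) - mean) ** 2 for v in values)
    return ss / (n - correction)


population_variance = variance(population, 0)  # divide by N

# Every ordered sample of size 2, drawn with replacement (9 samples).
samples = list(product(population, repeat=2))

average_dividing_by_n = sum(variance(s, 0) for s in samples) / len(samples)
average_dividing_by_n_minus_1 = sum(variance(s, 1) for s in samples) / len(samples)

print("population variance:      ", population_variance)
print("average with divisor n:   ", average_dividing_by_n)
print("average with divisor n-1: ", average_dividing_by_n_minus_1)
```

In this enumeration the average with divisor n comes out at half the population variance, while the average with divisor n - 1 equals it exactly. The exact match relies on the samples being drawn independently (with replacement).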
Step 6: Take the square root of the resulting number. The standard deviation is the square root of the variance.
Sample standard deviation
$$s=\sqrt{s^2}=\sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}}$$
Population standard deviation
$$\sigma=\sqrt{\sigma^2}=\sqrt{\frac{\sum_{i=1}^{N}(x_i-\mu)^2}{N}}$$
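The six steps above map directly onto a small function. The following is a minimal sketch, not the calculator's source code; the function name `standard_deviation`, the `sample` flag, and the returned dictionary keys are all assumptions made for this example.

```python
import math


def standard_deviation(data, sample=True):
    """Follow the six steps and return every intermediate result.

    Hypothetical helper: set sample=False to use the population formula.
    """
    n = len(data)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough data points for the chosen formula")

    mean = sum(data) / n                           # Step 1: mean
    deviations = [x - mean for x in data]          # Step 2: deviations
    squared = [d ** 2 for d in deviations]         # Step 3: squared deviations
    sum_of_squares = sum(squared)                  # Step 4: sum of squares (SS)
    divisor = n - 1 if sample else n               # Step 5: n - 1 or N
    variance = sum_of_squares / divisor
    std_dev = math.sqrt(variance)                  # Step 6: square root

    return {
        "count": n,
        "mean": mean,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "standard_deviation": std_dev,
    }


# Small illustrative data set, treated as a full population.
result = standard_deviation([2, 4, 4, 4, 5, 5, 7, 9], sample=False)
print(result)
```

Returning the intermediate values mirrors the five statistics the calculator reports: count, mean, sum of squared deviations, variance, and standard deviation.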
Let us consider the following scores of n=8 students in the Physics final:
45, 67, 70, 75, 80, 81, 82, and 84
The calculator computes the standard deviation of the sample using the following steps:
Step 1: Compute the mean.
$$\bar{x}=\frac{\sum_{i=1}^{n} x_i}{n}=\frac{45+67+70+75+80+81+82+84}{8}=73$$
Step 2: Compute the deviations
x₁-x̄ | x₂-x̄ | x₃-x̄ | x₄-x̄ | x₅-x̄ | x₆-x̄ | x₇-x̄ | x₈-x̄ |
---|---|---|---|---|---|---|---|
45-73 | 67-73 | 70-73 | 75-73 | 80-73 | 81-73 | 82-73 | 84-73 |
-28 | -6 | -3 | 2 | 7 | 8 | 9 | 11 |
Step 3: Compute the squares of deviations
(x₁-x̄)² | (x₂-x̄)² | (x₃-x̄)² | (x₄-x̄)² | (x₅-x̄)² | (x₆-x̄)² | (x₇-x̄)² | (x₈-x̄)² |
---|---|---|---|---|---|---|---|
784 | 36 | 9 | 4 | 49 | 64 | 81 | 121 |
Step 4: Sum the squared deviations.
$$SS=\sum_{i=1}^{n}(x_i-\bar{x})^2=784+36+9+4+49+64+81+121=1148$$
Step 5: Calculate the variance by dividing the sum of squared deviations by the degrees of freedom (n-1). For a population, the sum of squared deviations would be divided by N rather than n-1. In this case, we have a sample, that is, data on a portion of the student population, not the entire population.
$$s^2=\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}=\frac{1148}{8-1}=164$$
Step 6: Take the square root of the variance to get the standard deviation.
$$s=\sqrt{s^2}=\sqrt{164}\approx 12.81$$
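The worked example can be verified by repeating the same six steps in Python and comparing the result with `statistics.stdev` from the standard library. This is a cross-check under the assumption that the scores are treated as a sample, as in the text; variable names are chosen for this illustration only.

```python
import math
import statistics

scores = [45, 67, 70, 75, 80, 81, 82, 84]
n = len(scores)

mean = sum(scores) / n                         # Step 1
deviations = [x - mean for x in scores]        # Step 2
squared = [d ** 2 for d in deviations]         # Step 3
sum_of_squares = sum(squared)                  # Step 4
variance = sum_of_squares / (n - 1)            # Step 5 (sample: n - 1)
s = math.sqrt(variance)                        # Step 6

print("mean:", mean)
print("deviations:", deviations)
print("squared deviations:", squared)
print("SS:", sum_of_squares)
print("variance:", variance)
print("standard deviation:", s)

# The library routine should agree with the manual calculation.
print(math.isclose(s, statistics.stdev(scores)))
```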
Variance and standard deviation can be used to determine the scatter of the data. If the variance or standard deviation is large, the data is more scattered. This information is useful when comparing two (or more) data sets to determine which is more (most) variable.
In industry, the standard deviation is widely used for quality control. In large-scale production, certain product characteristics must fall within a defined range, and whether they do can be assessed by calculating the standard deviation. For example, in the production of nuts and bolts, the variation in their diameters must be small; otherwise, the parts will not fit together.
The standard deviation is used in finance and many other areas to assess risk, most commonly as a measure of volatility. In technical analysis, it is used to construct Bollinger Bands.
In sociology, it is used in public opinion polls to help calculate uncertainty.
The standard deviation can also be used to bound the proportion of data values that fall within a given interval around the mean. For example, Chebyshev's theorem shows that for any distribution with a finite variance, at least 75% of the data values will be within 2 standard deviations of the mean.
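Chebyshev's bound generalizes to any k > 1: at least 1 - 1/k² of the values lie within k standard deviations of the mean. The sketch below checks this on the eight exam scores from the worked example, treating them as a complete population so that the bound applies to the data set itself. The function names are assumptions made for this illustration.

```python
import statistics


def chebyshev_bound(k):
    """Minimum fraction of values within k standard deviations (k > 1)."""
    return 1 - 1 / k ** 2


def fraction_within(data, k):
    """Observed fraction of values strictly within k population SDs of the mean."""
    mu = statistics.mean(data)
    sigma = statistics.pstdev(data)
    if sigma == 0:
        return 1.0  # all values are identical, so all sit at the mean
    inside = [x for x in data if abs(x - mu) < k * sigma]
    return len(inside) / len(data)


scores = [45, 67, 70, 75, 80, 81, 82, 84]

for k in (2, 3):
    observed = fraction_within(scores, k)
    print(f"k = {k}: observed {observed:.3f}, guaranteed at least {chebyshev_bound(k):.3f}")
```

For real data the observed fraction is usually well above the guaranteed minimum, since the bound has to hold for every possible distribution.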
Let's take a simple example involving climate. Suppose we study the daily temperature of two cities in the same region. One city is on the coast and the other is inland. The average maximum daily temperature in these two cities may be the same. But the standard deviation, that is, the spread of maximum daily temperatures, will be greater for the inland city, and the coastal city will have a smaller standard deviation of maximum daily temperatures.
This means that over the year, the maximum air temperature in the inland city varies more from day to day than in the coastal city. That is, the coastal city has a milder climate.