Covariance Calculator

 
Number of samples  =  5
Mean `\bar{X}`  =  20.6
Mean `\bar{Y}`  =  15.2
Covariance `s_(XY)`  =  146.1

Covariance - work with steps

Input Data :
Data set x = 5, 12, 18, 23, 45
Data set y = 2, 8, 18, 20, 28
Total number of elements = 5

Objective :
Find the sample covariance of the given input data.

Formula :
$\begin{align} s_{XY} &=\frac{\sum_{i=1}^n(x_i-\bar{X})(y_i-\bar{Y})}{n-1}\end{align}$

Solution :

| `x_i` | `x_i - \bar{X}` | `y_i` | `y_i - \bar{Y}` | `(x_i - \bar{X})(y_i - \bar{Y})` |
|---|---|---|---|---|
| 5 | -15.6 | 2 | -13.2 | `205.92` |
| 12 | -8.6 | 8 | -7.2 | `61.92` |
| 18 | -2.6 | 18 | 2.8 | `-7.28` |
| 23 | 2.4 | 20 | 4.8 | `11.52` |
| 45 | 24.4 | 28 | 12.8 | `312.32` |
| `\sum x_i = 103` | | `\sum y_i = 76` | | `\sum (x_i - \bar{X})(y_i - \bar{Y}) = 584.4` |
| `\bar{X} = 103/5 = 20.6` | | `\bar{Y} = 76/5 = 15.2` | | |
`s_(XY) = 584.4/(5-1)`
= `584.4/4`
`s_(XY) = 146.1`
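
The tabular method above can also be reproduced programmatically. The following is a minimal Python sketch, not the calculator's own implementation; the variable names (`rows`, `sum_of_products`, `sample_cov`) are chosen here for illustration.

```python
# Data from the worked example above.
xs = [5, 12, 18, 23, 45]
ys = [2, 8, 18, 20, 28]

n = len(xs)
mean_x = sum(xs) / n  # 103 / 5
mean_y = sum(ys) / n  # 76 / 5

# One table row per pair: x, x - mean_x, y, y - mean_y, product of deviations.
rows = [
    (x, x - mean_x, y, y - mean_y, (x - mean_x) * (y - mean_y))
    for x, y in zip(xs, ys)
]

for x, dx, y, dy, prod in rows:
    print(f"{x:>3} {dx:>7.1f} {y:>3} {dy:>7.1f} {prod:>8.2f}")

# Sum the last column, then divide by n - 1 because the data are a sample.
sum_of_products = sum(row[4] for row in rows)
sample_cov = sum_of_products / (n - 1)
print(f"sum = {sum_of_products:.1f}, s_XY = {sample_cov:.1f}")
```

Each printed row corresponds to one row of the table, and the last line repeats the final division by `n - 1`.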

The Covariance Calculator estimates the statistical relationship (linear dependence) between two data sets `X` and `Y`. It is an online statistics and probability tool that takes two data sets `X` and `Y` of equal length and measures how much they vary together, i.e. it helps us understand how the two data sets are related to each other.
To use the calculator, follow these steps:

  1. Enter two data sets `X` and `Y` (observed values) in the box. These values must be real numbers or variables, separated by commas. The number of values must be the same for `X` and `Y`. The values can be copied from a text document or a spreadsheet.
  2. Press the "GENERATE WORK" button to make the computation.
  3. The covariance calculator will estimate the statistical relationship between the two data sets `X` and `Y`.
Input : Two lists of real numbers separated by commas;
Output : A real number.

The calculator shows the stepwise procedure and gives insight into every step of the calculation. Before deriving the final covariance, it calculates the sample means of the two data sets. These sample means can be useful in further problems and applications.

Covariance Formulas
Sample Covariance Formula: The sample covariance, $s_{XY}$, of two samples `X` and `Y` is determined by the formula
$$\begin{align} s_{XY} &=\frac{\sum_{i=1}^n(x_i-\bar{X})(y_i-\bar{Y})}{n-1}\end{align}$$
where $\bar{X}$ and $\bar{Y}$ are the sample means.

Population Covariance Formula: Population covariance, $\sigma_{XY}$, between two random variables `X` and `Y` is determined by the formula
$$\begin{align} \sigma_{XY}=\sum_{i=1}^N\frac{(x_i-\mu_X)(y_i-\mu_Y)}{N}\end{align}$$
where $\mu_X$ and $\mu_Y$ are the population means.
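
The two formulas differ only in the divisor. A small Python sketch makes this explicit; the function name `covariance` and its `population` flag are assumptions made for this illustration and are not part of the calculator.

```python
def covariance(xs, ys, population=False):
    """Covariance of paired data.

    Divides by n - 1 (sample covariance) by default,
    or by N (population covariance) when population=True.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < (1 if population else 2):
        raise ValueError("not enough data points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n if population else n - 1)


xs = [5, 12, 18, 23, 45]
ys = [2, 8, 18, 20, 28]

s_xy = covariance(xs, ys)                       # 584.4 / 4
sigma_xy = covariance(xs, ys, population=True)  # 584.4 / 5
print(s_xy, sigma_xy)
```

For the same data the population value is always smaller in magnitude than the sample value, because the same sum of products is divided by `N` instead of `n - 1`.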

What is Covariance?

Covariance indicates whether two variables `X` and `Y` are related by measuring how they change in relation to each other. It tells us whether there is a linear relationship between the two variables and in which direction that relationship runs. A positive covariance means that the variables are positively related: they tend to move in the same direction. A negative covariance means that the variables are negatively related: they tend to move in opposite directions.

The variance is a special case of the covariance in which the two sets of data are identical. So, if $X\equiv Y$, then covariance becomes variance.
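
This special case can be checked numerically. The sketch below compares a hand-written sample covariance of a data set with itself against `statistics.variance` from the Python standard library; the helper name `sample_covariance` is an assumption made for this example.

```python
import statistics


def sample_covariance(xs, ys):
    """Sample covariance with divisor n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


xs = [5, 12, 18, 23, 45]

cov_xx = sample_covariance(xs, xs)  # covariance of X with itself
var_x = statistics.variance(xs)     # sample variance from the standard library
print(cov_xx, var_x)
```

Both values agree up to floating-point rounding, since the product of deviations $(x_i-\bar{X})(x_i-\bar{X})$ is just the squared deviation used in the variance.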

As we have mentioned, both covariance and correlation indicate whether non-identical variables are positively or negatively related. Correlation also gives the degree to which the variables tend to move together. Covariance can be computed for variables that do not have the same units of measurement. From the covariance we can determine whether the variables tend to increase or decrease together, but we cannot quantify the degree to which they move together, because the covariance is not expressed in a standardized unit of measurement.
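
The dependence on units can be illustrated with a short Python sketch. It rescales `X` by a factor of 100 (a hypothetical change of unit, chosen only for this example) and compares the covariance with the correlation, computed here as the covariance divided by the product of the sample standard deviations. The helper names are assumptions made for this illustration.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance with divisor n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the sample standard deviations."""
    std_x = math.sqrt(sample_covariance(xs, xs))
    std_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (std_x * std_y)


xs = [5, 12, 18, 23, 45]
ys = [2, 8, 18, 20, 28]
xs_scaled = [100 * x for x in xs]  # same measurements in a unit 100 times smaller

cov_original = sample_covariance(xs, ys)
cov_scaled = sample_covariance(xs_scaled, ys)  # grows by the factor 100
r_original = correlation(xs, ys)
r_scaled = correlation(xs_scaled, ys)          # unaffected by the rescaling

print(cov_original, cov_scaled)
print(r_original, r_scaled)
```

The covariance changes with the unit of `X`, while the correlation stays the same, which is why the magnitude of a covariance alone does not tell us how strongly the variables move together.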

How to Calculate Covariance?

Let us consider two samples $X=(x_1,\ldots,x_n)$ and $Y=(y_1,\ldots, y_n)$ of $n$ outcomes. The sample covariance, $s_{XY}$, of two samples `X` and `Y` is determined by the formula

$$\begin{align} s_{XY} &=\frac{\sum_{i=1}^n(x_i-\bar{X})(y_i-\bar{Y})}{n-1}\end{align}$$
where $\bar{X}$ and $\bar{Y}$ are the sample means.
To find the sample covariance, we need to follow these steps:
  1. Find the sample mean $\bar{X}$ for data set `X`;
  2. Find the sample mean $\bar{Y}$ for data set `Y`;
  3. Substitute the values into the formula for the sample covariance to get the result.
If we have data for the entire population, we can find the population covariance. For a set of `N` ordered pairs of observations $(x_i, y_i), i = 1, \ldots , N$, the population covariance of `X` and `Y`, usually denoted by $\sigma_{XY}$, is defined by the following formula
$$\begin{align} \sigma_{XY}=\sum_{i=1}^N\frac{(x_i-\mu_X)(y_i-\mu_Y)}{N}\end{align}$$
where $\mu_X$ and $\mu_Y$ are the population means. This formula can be written equivalently,
$$\sigma_{XY}=\sum_{i=1}^N\frac{x_iy_i}{N}-\mu_X\mu_Y$$
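The equivalence follows by expanding the product and using $\sum_{i=1}^N x_i = N\mu_X$ and $\sum_{i=1}^N y_i = N\mu_Y$:
$$\begin{align}
\sigma_{XY} &= \frac{1}{N}\sum_{i=1}^N (x_i-\mu_X)(y_i-\mu_Y)\\
&= \frac{1}{N}\sum_{i=1}^N x_iy_i - \mu_Y\cdot\frac{1}{N}\sum_{i=1}^N x_i - \mu_X\cdot\frac{1}{N}\sum_{i=1}^N y_i + \mu_X\mu_Y\\
&= \frac{1}{N}\sum_{i=1}^N x_iy_i - \mu_X\mu_Y - \mu_X\mu_Y + \mu_X\mu_Y\\
&= \sum_{i=1}^N\frac{x_iy_i}{N} - \mu_X\mu_Y
\end{align}$$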
If $\sigma_{XY}\ne0$, then `X` and `Y` are linearly related. There are some special cases:
  • If higher values of `X` are related with higher values of `Y`, then $\sigma_{XY}$ is positive. So, `X` and `Y` are directly related;
  • If higher values of `X` are related with lower values of `Y`, then $\sigma_{XY}$ is negative. So, `X` and `Y` are inversely related;
  • If higher values of `Y` are associated with neither higher nor lower values of `X`, then $\sigma_{XY}$ is zero and there is no linear relationship between `X` and `Y`;
  • If $\sigma_{XY} = \sigma_X\sigma_Y,$ then there is a perfect positive linear relationship between `X` and `Y`;
  • If $\sigma_{XY} = -\sigma_X\sigma_Y,$ then there is a perfect negative linear relationship between `X` and `Y`.
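
The special cases above can be checked on small made-up data sets. The following Python sketch uses population formulas (divisor `N`); the data and the helper names are assumptions chosen only for this illustration.

```python
import math


def population_covariance(xs, ys):
    """Population covariance with divisor N."""
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    return sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n


def population_std(xs):
    """Population standard deviation: square root of the covariance of xs with itself."""
    return math.sqrt(population_covariance(xs, xs))


xs = [1, 2, 3, 4]
ys_up = [2 * x + 1 for x in xs]      # points on an increasing straight line
ys_down = [-3 * x + 10 for x in xs]  # points on a decreasing straight line

cov_up = population_covariance(xs, ys_up)
cov_down = population_covariance(xs, ys_down)
limit_up = population_std(xs) * population_std(ys_up)
limit_down = population_std(xs) * population_std(ys_down)

# Zero covariance rules out a linear relationship only:
# here y = x^2 is fully determined by x, yet the covariance is zero.
xs_sym = [-2, -1, 0, 1, 2]
ys_sq = [x * x for x in xs_sym]
cov_zero = population_covariance(xs_sym, ys_sq)

print(cov_up, limit_up)
print(cov_down, -limit_down)
print(cov_zero)
```

In the two linear cases the covariance reaches $\sigma_X\sigma_Y$ and $-\sigma_X\sigma_Y$ (up to floating-point rounding), while the last case shows that a zero covariance excludes only a linear relationship.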
In many cases, especially for small data sets, we can calculate the covariance by hand using the tabular method. But if we have a large data set or want to avoid arithmetic errors, we should use the covariance calculator.

The covariance work with steps shows the complete step-by-step calculation of the covariance of the two samples `X: 5,12,18,23,45` and `Y: 2,8,18,20,28` using the tabular method. For any other samples, just supply two lists of numbers and click on the "GENERATE WORK" button. Grade school students may use this covariance calculator to generate the work, verify results derived by hand, or do their homework problems efficiently.

Covariance Practice Problems

The first application of covariance is in determining the correlation coefficient: dividing the covariance by the product of the standard deviation of `X` and the standard deviation of `Y` gives the correlation coefficient. Covariance is frequently used in statistics and probability theory because it measures the directional relationship between two random variables `X` and `Y`. It is also very useful in finance, where the covariance of the returns on two investments describes whether those returns tend to move together over a period of time.
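
As a worked illustration, take the samples `X: 5,12,18,23,45` and `Y: 2,8,18,20,28`, for which $s_{XY} = 146.1$. The sample variances are $s_X^2 = 925.2/4 = 231.3$ and $s_Y^2 = 420.8/4 = 105.2$, so the correlation coefficient (written $r_{XY}$ here, a symbol introduced only for this example) is approximately
$$\begin{align} r_{XY} &= \frac{s_{XY}}{s_X\, s_Y} = \frac{146.1}{\sqrt{231.3}\,\sqrt{105.2}} \approx 0.937\end{align}$$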

Practice Problem 1:
Find the covariance between the given two sets of data $X: 13, 12, 15, 18, 21$ and $Y: 15, 29, 11, 14, 34$.

Practice Problem 2:
The table below shows the number of geometry and statistics books borrowed each day of a week, from Monday to Friday. Find the covariance between them.

| | Geometry | Statistics |
|---|---|---|
| Monday | 134 | 231 |
| Tuesday | 156 | 127 |
| Wednesday | 234 | 276 |
| Thursday | 214 | 265 |
| Friday | 301 | 124 |

Practice Problem 3:
The scores of five students on Geometry and Statistics tests are given in the table below. Find the covariance between them.

| Geometry | 65 | 43 | 89 | 76 | 34 |
|---|---|---|---|---|---|
| Statistics | 46 | 78 | 89 | 56 | 100 |

The covariance calculator, formula, step-by-step calculation, and practice problems are useful for grade school students (K-12 education) who are learning what the covariance of two data sets is in statistics and probability and how to find it. This concept is needed in many real-life situations.