Covariance and correlation are related parameters that indicate
the extent to which two random variables co-vary. Suppose there are two
technology stocks. If they are affected by the same industry trends, their
prices will tend to rise or fall together. They co-vary. Covariance and correlation
measure such a tendency.
Let's formalize this. Consider a random vector $X$ whose components are random variables $X_1, X_2, \ldots, X_n$:
$$X = (X_1, X_2, \ldots, X_n)^{\mathsf{T}} \qquad [1]$$
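Given any pair of components, $X_i$ and $X_j$, we denote their covariance as either $\operatorname{cov}(X_i, X_j)$ or $\Sigma_{i,j}$. The covariance is defined by the expectation
$$\operatorname{cov}(X_i, X_j) = E\big[(X_i - \mu_i)(X_j - \mu_j)\big] \qquad [2]$$
where $\mu_i$ and $\mu_j$ are the means of $X_i$ and $X_j$. By definition, covariance is symmetric, with $\operatorname{cov}(X_i, X_j) = \operatorname{cov}(X_j, X_i)$.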
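As an illustration of the definition, the expectation can be estimated from data by averaging the product of deviations from the sample means. The following is a minimal sketch, not a definitive implementation: the return series `x` and `y` are made-up numbers for two hypothetical technology stocks, and `sample_cov` is a helper name introduced only for this example.

```python
import numpy as np

# Hypothetical daily returns for two technology stocks (illustrative data only).
x = np.array([0.012, -0.004, 0.021, -0.015, 0.008])
y = np.array([0.010, -0.001, 0.018, -0.020, 0.003])


def sample_cov(a, b):
    """Estimate cov(a, b) as the average product of deviations from the means."""
    return np.mean((a - a.mean()) * (b - b.mean()))


cov_xy = sample_cov(x, y)
cov_yx = sample_cov(y, x)  # same value: covariance is symmetric

# NumPy's built-in estimator agrees when it also divides by N (bias=True).
cov_numpy = np.cov(x, y, bias=True)[0, 1]
```

Here the two series tend to move in the same direction on the same days, so the estimated covariance comes out positive.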
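Also, the covariance of any component $X_i$ with itself is that component’s variance:
$$\operatorname{cov}(X_i, X_i) = E\big[(X_i - \mu_i)^2\big] = \operatorname{var}(X_i) = \sigma_i^2 \qquad [3]$$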
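We summarize all the covariances of a random vector $X$ with a covariance matrix $\Sigma$, whose $(i, j)$ entry is $\Sigma_{i,j} = \operatorname{cov}(X_i, X_j)$:
$$\Sigma = \begin{pmatrix} \Sigma_{1,1} & \Sigma_{1,2} & \cdots & \Sigma_{1,n} \\ \Sigma_{2,1} & \Sigma_{2,2} & \cdots & \Sigma_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ \Sigma_{n,1} & \Sigma_{n,2} & \cdots & \Sigma_{n,n} \end{pmatrix} \qquad [4]$$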
Due to the symmetry property of covariances, this is
necessarily a symmetric matrix. It can be shown that covariance matrices
are positive definite or
positive semidefinite.
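The properties just stated can be checked numerically on an estimated covariance matrix. The sketch below assumes simulated data: the mixing matrix `A`, the sample size, and the seed are arbitrary choices made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# 500 simulated observations of a 3-component random vector.
# The mixing matrix A is arbitrary; it just makes the components co-vary.
A = np.array([[1.0, 0.0, 0.0],
              [0.8, 0.6, 0.0],
              [0.2, -0.5, 1.0]])
samples = rng.standard_normal((500, 3)) @ A.T

# rowvar=False: each column is one component, each row one observation.
sigma = np.cov(samples, rowvar=False)

# eigvalsh is intended for symmetric matrices and returns real eigenvalues.
eigenvalues = np.linalg.eigvalsh(sigma)

is_symmetric = np.allclose(sigma, sigma.T)
is_psd = bool(np.all(eigenvalues >= -1e-12))  # small tolerance for rounding
# The diagonal holds the variances (np.cov divides by N - 1 by default).
diag_is_variance = np.allclose(np.diag(sigma), samples.var(axis=0, ddof=1))
```

All three checks should come out true: the matrix is symmetric, has no negative eigenvalues, and carries the component variances on its diagonal.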
The magnitude of a covariance $\operatorname{cov}(X_i, X_j)$
depends upon the standard deviations of the two components $X_i$ and $X_j$.
To obtain a more direct indication of how two components co-vary, we scale
covariance to obtain correlation.
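Given any pair of components, $X_i$ and $X_j$, we denote their correlation as either $\operatorname{cor}(X_i, X_j)$ or $\rho_{i,j}$. The correlation is defined as
$$\operatorname{cor}(X_i, X_j) = \frac{\operatorname{cov}(X_i, X_j)}{\sigma_i \sigma_j} \qquad [5]$$
where $\sigma_i$ and $\sigma_j$ are the standard deviations of $X_i$ and $X_j$, both assumed to be nonzero. By construction, a correlation is always a number between -1 and 1. Correlation inherits the symmetry property of covariance: $\operatorname{cor}(X_i, X_j) = \operatorname{cor}(X_j, X_i)$.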
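To see why the scaling matters, consider what happens when one variable is expressed in different units. The sketch below uses made-up data, and `covariance` and `correlation` are helper names introduced here for illustration. Multiplying `x` by 100 multiplies the covariance by 100 but leaves the correlation unchanged.

```python
import numpy as np


def covariance(a, b):
    """Average product of deviations from the means (divides by N)."""
    return np.mean((a - a.mean()) * (b - b.mean()))


def correlation(a, b):
    """Covariance scaled by both standard deviations.

    np.std divides by N by default, matching covariance() above.
    """
    return covariance(a, b) / (a.std() * b.std())


# Illustrative data only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# The same quantity as x, expressed in different units.
x_scaled = 100.0 * x

cov_original = covariance(x, y)
cov_scaled = covariance(x_scaled, y)    # 100 times cov_original
cor_original = correlation(x, y)
cor_scaled = correlation(x_scaled, y)   # same as cor_original
```

Because the unit change cancels between numerator and denominator, correlation gives a more direct reading of how strongly two components co-vary than covariance does.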
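From [3] and [5], $\operatorname{cor}(X_i, X_i) = 1$, which indicates that a random variable co-varies perfectly with itself. If $X_i$ and $X_j$ are independent, their correlation is 0. The converse is not true. For example, if $Z$ is standard normal, then $Z$ and $Z^2$ have correlation 0 even though $Z^2$ is completely determined by $Z$. As with covariances, we can summarize all the correlations of a random vector $X$ with a symmetric correlation matrix $\rho$, whose $(i, j)$ entry is $\rho_{i,j} = \operatorname{cor}(X_i, X_j)$:
$$\rho = \begin{pmatrix} \rho_{1,1} & \rho_{1,2} & \cdots & \rho_{1,n} \\ \rho_{2,1} & \rho_{2,2} & \cdots & \rho_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{n,1} & \rho_{n,2} & \cdots & \rho_{n,n} \end{pmatrix} \qquad [6]$$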
Like covariance matrices, correlation matrices must be positive
definite or positive semidefinite.
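A correlation matrix can be obtained from a covariance matrix by dividing each entry by the product of the two corresponding standard deviations. The following is a minimal sketch under stated assumptions: the entries of `sigma` are made up for the example, chosen so that the matrix is positive definite.

```python
import numpy as np

# An illustrative covariance matrix (values are made up for the example).
sigma = np.array([[4.0, 2.4, -1.0],
                  [2.4, 9.0, 1.5],
                  [-1.0, 1.5, 1.0]])

# Standard deviations are the square roots of the diagonal (the variances).
std = np.sqrt(np.diag(sigma))

# rho[i, j] = sigma[i, j] / (std[i] * std[j])
rho = sigma / np.outer(std, std)

# Real eigenvalues of the symmetric matrix rho, used to check definiteness.
eigenvalues = np.linalg.eigvalsh(rho)
```

The result has ones on its diagonal, is symmetric, has every entry between -1 and 1, and for this input has strictly positive eigenvalues.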
Cholesky matrix: A lower-triangular matrix that acts as a matrix "square root" for a positive definite matrix.
expected value: A parameter describing the "center of gravity" of a distribution.
joint normal distribution: A multivariate distribution in which every linear combination of the components is normally distributed; in particular, all marginal distributions are normal.
kurtosis: A parameter describing the peakedness and tails of a distribution.
linear polynomial of a random vector: A random variable or random vector that is defined as a linear polynomial of a random vector.
multicollinear: A covariance matrix is multicollinear if it is "almost" singular.
positive definite matrix: A real symmetric matrix, all of whose eigenvalues are real and positive.
quantile: A notion from probability that can be used as a parameter.
skewness: A parameter that describes the lack of symmetry of a distribution.
standard deviation: A parameter describing the dispersion of a distribution.
Salsburg (2001) is a wonderful history of probability and statistics. DeGroot and Schervish (2002) is a standard university text.