A random vector is multicollinear if it is “almost” singular. Let's consider an example from finance. Suppose we are analyzing the market risk of a natural gas trading portfolio. Random variables represent tomorrow’s values for each price the portfolio is exposed to. The portfolio holds New York Mercantile Exchange (NYMEX) Henry Hub futures out to 24 months, so there are 24 futures prices. It also has forward positions out to 18 months for 30 delivery points, for another 18 × 30 = 540 prices. In total, our model depends upon a vector of 564 random variables! Based upon a time series analysis of historical price data, we construct a 564 × 564 covariance matrix for these random variables.
Intuitively, we know that the random variables are interdependent. Prices for 6-month and 7-month Transco Zone 2 delivery are highly correlated. So are 3-month prices for adjacent Transco Zones 1 and 2. Because of such interdependencies, it is conceivable that our random vector is singular, but this is probably not the case. Singularity arises infrequently in applications. A more common situation is “almost” singularity, which is known as multicollinearity. We illustrate with two four-dimensional random vectors.
Random vector X is singular. Its first three components
Random vector Z is multicollinear. Like
X, its first three components
The covariance matrix of X is singular. It has determinant 0. The covariance matrix of Z is not singular, but with a determinant of 0.000001, it is “almost” singular. The
random variable
Realizations of a multicollinear random vector tend to
cluster near a plane within
We may think of a random vector Z as being
“almost” singular if its covariance matrix has a determinant |
As described in the article on positive definite matrices, the dimensionality of a singular random vector X can be reduced with a simple change of variables. No information is lost, as we only eliminate extraneous random variables. Multicollinearity is more problematic. Reducing the dimensionality of a multicollinear random vector Z requires an approximation that somehow identifies and discards the minor randomness that is preventing the covariance matrix from being singular. This is the situation we face with our natural gas portfolio. We feel confident that the natural gas market can reasonably be modeled with fewer than 564 random variables, but we can’t arbitrarily discard random variables! If our covariance matrix isn’t singular, how can we replace our 564 random variables with a smaller set that conveys essentially the same information? Principal component analysis provides a solution.
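To make the idea concrete, here is a minimal sketch in Python, assuming NumPy is available. It does not use the article's X or Z. Instead it builds a hypothetical four-dimensional vector (V1, V2, V3, V4) in which V1, V2 and V3 are uncorrelated with unit variance and V4 = V1 + V2 + V3 + E, where E is uncorrelated noise. The helper `covariance` and all variable names are illustrative assumptions. With zero noise the covariance matrix is singular. With a noise variance of 0.000001 it is only “almost” singular, and an eigenvalue decomposition (the calculation underlying principal component analysis) shows that nearly all of the variance lies in three directions.

```python
import numpy as np


def covariance(noise_variance: float) -> np.ndarray:
    """Covariance matrix of a hypothetical vector (V1, V2, V3, V4).

    Assumption for illustration only: V1, V2, V3 are uncorrelated with
    unit variance, and V4 = V1 + V2 + V3 + E, where E is uncorrelated
    noise with the given variance.
    """
    return np.array([
        [1.0, 0.0, 0.0, 1.0],
        [0.0, 1.0, 0.0, 1.0],
        [0.0, 0.0, 1.0, 1.0],
        [1.0, 1.0, 1.0, 3.0 + noise_variance],
    ])


cov_singular = covariance(0.0)         # V4 is an exact linear combination
cov_multicollinear = covariance(1e-6)  # V4 is "almost" a linear combination

det_singular = np.linalg.det(cov_singular)
det_multicollinear = np.linalg.det(cov_multicollinear)

# eigh returns eigenvalues in ascending order; reverse both outputs so
# the direction with the largest variance comes first.
eigenvalues, eigenvectors = np.linalg.eigh(cov_multicollinear)
eigenvalues = eigenvalues[::-1]
eigenvectors = eigenvectors[:, ::-1]

# Cumulative share of total variance captured by the leading directions.
explained = np.cumsum(eigenvalues) / eigenvalues.sum()

# Keep the three leading directions and rebuild an approximate covariance
# matrix from them, discarding the direction with the tiny eigenvalue.
k = 3
top_vectors = eigenvectors[:, :k]
cov_approx = top_vectors @ np.diag(eigenvalues[:k]) @ top_vectors.T
max_error = np.max(np.abs(cov_approx - cov_multicollinear))

print("determinant, singular case:      ", det_singular)
print("determinant, multicollinear case:", det_multicollinear)
print("eigenvalues, largest first:      ", eigenvalues)
print("cumulative variance explained:   ", explained)
print("largest entry-wise error, k = 3: ", max_error)
```

In this sketch the smallest eigenvalue is the “minor randomness” that keeps the covariance matrix from being singular. Dropping it changes every entry of the covariance matrix by less than the noise variance, which suggests why three variables can stand in for four here. The same reasoning would apply to the 564-dimensional natural gas example, although the number of directions worth keeping would have to be judged from the actual eigenvalues.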