With principal component analysis, we transform a random vector Z with correlated components into a random vector D with uncorrelated components.
Principal component analysis can be performed on any random vector Z whose second moments exist, but it is most useful with multicollinear random vectors. Principal component analysis takes the hyperplane in which realizations of a multicollinear random vector almost sit and aligns it with the coordinate system of
Example: European Currencies
Exhibit 2 graphs 18 months of daily exchange-rate data drawn from the period immediately following the launch of the new euro (EUR) currency. In our data, the EUR weakens following its launch, and the remaining European currencies (those that did not join the EUR on January 1, 1999) weaken in sympathy. All the currencies track the EUR, but the GBP does so the least. It is less correlated with the EUR and loses value more slowly.
We assume
The corresponding correlation matrix is:
The correlations are all positive. Several exceed 0.90. The one between DKK and EUR exceeds 0.99. The smallest is a respectable 0.45 between GBP and SEK. With such pronounced interdependencies between its components, we expect Z to be multicollinear, and it is.
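To see why strong correlations signal multicollinearity, it can help to look at the determinant and eigenvalues of a correlation matrix. The sketch below uses a small hypothetical 3×3 correlation matrix (it is not the currency data of this example) and NumPy; the variable names are ours. A determinant close to zero, or equivalently at least one eigenvalue close to zero, indicates that realizations almost sit in a lower-dimensional hyperplane.

```python
import numpy as np

# Hypothetical correlation matrix for illustration only;
# these are NOT the correlations of the currency example.
R = np.array([
    [1.00, 0.99, 0.95],
    [0.99, 1.00, 0.96],
    [0.95, 0.96, 1.00],
])

# For uncorrelated components the correlation matrix is the identity,
# whose determinant is 1. Strong correlations push it toward 0.
det_R = np.linalg.det(R)

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
# They sum to the dimension (3 here); one dominant eigenvalue and several
# tiny ones is the signature of multicollinearity.
eigenvalues = np.linalg.eigvalsh(R)

print("determinant:", det_R)
print("eigenvalues:", eigenvalues)
```

Here most of the total variance is concentrated in a single direction, which is the situation in which discarding low-variance directions loses little.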
The correlation matrix has determinant
To define principal components of Z, we calculate orthonormal (orthogonal and of unit length) eigenvectors
The eigenvectors
The eigenvectors may be thought of as modes of fluctuation of random vector Z. We observed in our historical data a tendency for the European currencies to move together. This is reflected in the first eigenvector. It describes a broad move in all the currencies, with the GBP participating about half as much as the other currencies. The second eigenvector has the GBP moving in opposition to the NOK and SEK, with the CHF moving modestly with the GBP. The third eigenvector describes the GBP, NOK, and SEK moving together in opposition to the other currencies. The remaining eigenvectors describe other modes of fluctuation. If the eigenvectors
The
We have ordered our principal components according to their variances. From our covariance matrix
We can approximate Z by discarding from [5] insignificant principal components. The more we discard, the simpler (and cruder!) will be our approximation. If we want to be aggressive in our approximation, we can discard the contributions of the last four principal components, and approximate Z with just the first three. A more accurate approximation can be obtained by discarding only the last two. For this example, we pursue the more aggressive course. We define
and approximate Z with
Comparing this covariance matrix with [2], you can judge for yourself the quality of the approximation.
Principal Components
Specifically, the first principal component
where
The second principal component
where
Proceeding in this manner, we define the remaining principal components. There will be m principal components
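In practice, the principal components are usually obtained in one step from an eigen-decomposition of the covariance matrix rather than one at a time. The following is a minimal sketch of that computation, not a definitive implementation. The function name `principal_components`, the 3×3 covariance matrix, and the convention that D is formed by projecting the centered vector onto the eigenvectors are all assumptions made for illustration.

```python
import numpy as np

def principal_components(cov):
    """Return (variances, eigenvectors), ordered by decreasing variance.

    Hypothetical helper: column j of the returned matrix is the j-th
    orthonormal eigenvector of the symmetric matrix `cov`.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # ascending order
    order = np.argsort(eigenvalues)[::-1]            # largest variance first
    return eigenvalues[order], eigenvectors[:, order]

# Hypothetical covariance matrix of a 3-dimensional random vector Z.
cov = np.array([
    [4.0, 3.0, 1.0],
    [3.0, 5.0, 2.0],
    [1.0, 2.0, 3.0],
])

variances, V = principal_components(cov)

# Assuming D = V' (Z - mean), the covariance matrix of D is V' cov V.
# It is diagonal: the principal components are uncorrelated, and their
# variances are the eigenvalues.
cov_D = V.T @ cov @ V

# Discarding all but the first k principal components gives an
# approximate covariance matrix for Z.
k = 2
cov_approx = V[:, :k] @ np.diag(variances[:k]) @ V[:, :k].T

print("variances of principal components:", variances)
print("largest entrywise approximation error:", np.abs(cov - cov_approx).max())
```

Keeping all m components reproduces the covariance matrix exactly. The error introduced by truncation is governed by the variances of the discarded components.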
The vector of principal components D has mean
If
Choice of Weights
Unfortunately, there may be no correspondence between a random variable's standard deviation and its significance. Standard deviations depend upon the units in which a random variable is measured. Suppose a random variable reflects the time it takes for some event to occur. If the random variable is measured in days, it has a standard deviation of 13.5. Measured in hours, the standard deviation is 324. Measured in minutes, it becomes 19,440. Certainly, the 19,440 standard deviation is no more significant than the 13.5 standard deviation, but principal component analysis will treat it as more significant!
If we use principal components only to orthogonalize a random vector, this will not be a problem. No information is lost. It will be a problem if principal components are discarded to form an approximation. In this case, information is lost. Before we discard principal components that appear insignificant, we should make sure that they truly are insignificant.
There are various solutions to this problem. We might insist that all random variables be measured in the same units, but this is not always feasible. If one random variable represents temperature and another represents volume, these are fundamentally different quantities. Also, identical units do not necessarily correspond to identical significance. Suppose we are analyzing blood samples for lead, and we have a random variable for each component of the blood. All components are measured in parts per million (ppm). Measured in ppm, the standard deviation of lead will be trivial compared to standard deviations for other constituents of the blood. Yet, the lead component is the most important random variable!
Alternatively, we might apply principal component analysis to normalized random variables obtained by dividing each random variable by its standard deviation:
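In symbols, the normalization just described can be written as follows. The notation is assumed for illustration: Z_i denotes the i-th component of Z, σ_i its standard deviation, and the tilde marks the normalized variable.

```latex
\tilde{Z}_i = \frac{Z_i}{\sigma_i}, \qquad i = 1, \ldots, m
```

Each normalized variable has a standard deviation of 1, so the covariance between any two of them equals the correlation between the corresponding original variables.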
With this approach, we effectively apply principal component analysis to the random variables' correlation matrix. This represents a different weighting from that obtained by measuring all random variables in identical units, but not necessarily a better one. Any solution may be reasonable in certain contexts and unreasonable in others. Each one weights the random variables in some manner. There is no objective way to assign weights, just as there is no objective way to assign significance. Weights and significance can and should vary from one application to another. When we use principal components to reduce the dimensionality of a random vector, there is subjectivity in the process.
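The effect of units on the weighting can be illustrated numerically. The sketch below is an illustration under stated assumptions, not a prescribed procedure. It takes a hypothetical two-component random vector whose first component is a duration with a standard deviation of 13.5 days (as in the example above) and whose second component is some other quantity with an assumed standard deviation of 20 and an assumed correlation of 0.5 with the first. It then compares the first eigenvector of the covariance matrix when the duration is expressed in days and in minutes. The helper names are ours.

```python
import numpy as np

def first_eigenvector(matrix):
    """Unit eigenvector for the largest eigenvalue of a symmetric matrix."""
    _, vectors = np.linalg.eigh(matrix)  # eigenvalues in ascending order
    return vectors[:, -1]

def correlation_from_cov(cov):
    """Correlation matrix implied by a covariance matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# Assumed values: standard deviations (days, other units) and correlation.
sd_days = np.array([13.5, 20.0])
rho = 0.5
corr = np.array([[1.0, rho], [rho, 1.0]])
cov_days = corr * np.outer(sd_days, sd_days)

# Re-express the duration in minutes: 1 day = 1440 minutes,
# so its standard deviation becomes 13.5 * 1440 = 19,440.
scale = np.diag([1440.0, 1.0])
cov_minutes = scale @ cov_days @ scale

v_days = first_eigenvector(cov_days)
v_minutes = first_eigenvector(cov_minutes)
v_corr = first_eigenvector(correlation_from_cov(cov_minutes))

print("first eigenvector, duration in days:   ", v_days)
print("first eigenvector, duration in minutes:", v_minutes)
print("first eigenvector, correlation matrix: ", v_corr)
```

With the duration in minutes, the first eigenvector points almost entirely along the duration, so the second variable would be nearly lost if only one principal component were kept. The correlation matrix is unchanged by the change of units, and its first eigenvector weights the two variables equally. Neither weighting is objectively correct, which is the point made above.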