Eigenvalues and eigenvectors


In mathematics, and in particular in linear algebra, the eigenvalues, eigenvectors, and eigenspaces of a transformation are important properties of that transformation. These key concepts play a major role in pure mathematics as well as in applied mathematics, physics, theoretical chemistry, and engineering.

Definition

In the transformation of the Mona Lisa shown from left to right, the blue vector has been rotated but the red one has not been modified. The red vector is therefore an eigenvector of the transformation, associated with an eigenvalue equal to one.

Transformations of space, such as translation, rotation, reflection, stretching, compression, or any combination of these (other, nonlinear transformations could also be listed), may be visualized by the effect they produce on vectors, i.e. arrows pointing from one point to another. In this context, eigenvectors of a transformation are vectors that are left unaffected, or merely scaled by a factor, by the transformation. This factor is called the eigenvalue. The corresponding eigenspace is the space formed by all eigenvectors associated with a particular eigenvalue. For instance, an eigenvector of a rotation is any vector lying along the axis about which the rotation is performed. The corresponding eigenvalue is one, and all vectors parallel to the axis form the associated eigenspace.

Examples

For example, as the Earth rotates, every arrow from the center of the Earth also rotates, apart from those arrows that lie on the axis of rotation. Consider the transformation of the Earth after one hour of rotation. An arrow from the center of the Earth to the South Pole would be an example of an eigenvector of this transformation, while an arrow from the center of the Earth to Paris would not be an eigenvector. Since the arrow pointing at the pole is not stretched by the rotation of the Earth, its eigenvalue would be 1.

As another example, consider a thin metal sheet expanding uniformly about a fixed point. The transformation that expands every point on the sheet to twice its original distance from the fixed point has eigenvalue 2. Every vector from the fixed point to a point on the sheet is an eigenvector, and the eigenspace is the set of all these vectors.

A standing wave in a rope fixed at its boundaries can be seen as an example of an eigenvector (more precisely, an eigenfunction) of the transformation corresponding to the elapse of time. As time elapses, the standing wave is scaled, but its shape is not modified. In this case the eigenvalue is time-dependent.

However, vector spaces are not limited to three-dimensional geometrical space. Consider a stretched rope fixed at both ends, like the vibrating strings of a string instrument. The displacement of each atom of the rope from its rest position can be seen as a component of a vector in a space with as many dimensions as there are atoms in the rope. If one considers the transformation of the rope as time elapses, its eigenvectors (or eigenfunctions, if one assumes the rope is a continuous medium) are its standing waves, well known to musicians, who label them according to musical notes. The standing waves correspond to particular oscillations of the rope such that the shape of the rope is scaled by a factor (the eigenvalue) as time evolves. Each component of the vector associated with the rope is multiplied by this time-dependent factor. If one takes the damping of the oscillations into account, the amplitudes of the standing waves (their eigenvalues) decrease with time. One can then associate a lifetime with the eigenvector, and relate the concept of an eigenvector to the concept of resonance, a key concept in physics.

Eigenvalue equation

Mathematically, the equation for the eigenvalues λ and eigenvectors vλ of a transformation \mathcal{T} can be written

\mathcal{T}(v_\lambda) = \lambda\, v_\lambda,

where \mathcal{T}(vλ) stands for the vector obtained when applying the transformation \mathcal{T} to vλ.

In the case where \mathcal{T} is a linear transformation, i.e. if

\mathcal{T}(a v + b w) = a\,\mathcal{T}(v) + b\,\mathcal{T}(w)

for all scalars a, b and all vectors v, w, one can write the eigenvalue equation in matrix representation: in a given basis set, in which \mathcal{T} is represented by the matrix (or two-dimensional array) T and vλ by the one-dimensional vertical array vλ, it reads

T v_\lambda = \lambda\, v_\lambda,

which is a set of N linear equations if N is the number of basis vectors in the basis set. In this equation both the eigenvalue λ and the N components of vλ are unknown.
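
As a concrete illustration of the matrix form of the eigenvalue equation, the following minimal sketch (using NumPy; the matrix and vector are our own illustrative choices, not taken from the text) checks that T vλ = λ vλ holds for one particular pair and lets the library recover all eigenvalues:

    import numpy as np

    # An illustrative 2x2 matrix T and a vector v chosen so that T v = 4 v.
    T = np.array([[3.0, 1.0],
                  [1.0, 3.0]])
    v = np.array([1.0, 1.0])

    print(T @ v)               # [4. 4.]
    print(4.0 * v)             # [4. 4.]  -- both sides of T v = lambda v agree

    vals, vecs = np.linalg.eig(T)
    print(vals)                # the eigenvalues 4 and 2 (in some order)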

Linear algebra

Linear algebra is an area of mathematics which studies linear transformations of space. A linear transformation involves rotation, reflection, directional stretching or compression, and shear. Such a linear transformation is represented numerically as an array of numbers called a matrix.

Matrices with only one column or only one row (one-dimensional matrices) are also known as vectors. A vector can be thought of as an arrow in space. Linear transformations can change the length and/or direction of a vector. An eigenvector of a given linear transformation is a vector that is stretched but not rotated, possibly with reversed direction; the corresponding eigenvalue can be thought of as the degree of stretching, with a minus sign in case of reversal. An eigenspace is the set of eigenvectors which share the same eigenvalue, for a given linear transformation.

Mathematics and physics

In linear algebra a linear transformation can be represented by a matrix relative to a basis. If we can choose a basis so the matrix is diagonal, the entries of this matrix are the eigenvalues, and the basis is given by eigenvectors.

In applied mathematics and physics, the eigenvectors of a matrix or a differential operator often have important physical significance. In classical mechanics, the eigenvectors of the governing equations typically correspond to natural modes of vibration in a body, and the eigenvalues to their frequencies. In quantum mechanics, operators correspond to observables, eigenvectors are also called eigenstates, and the eigenvalues of an operator represent those values of the corresponding variable that have non-zero probability of being measured. In factor analysis, the eigenvectors of a covariance matrix correspond to factors, and eigenvalues to factor loadings.

Definition in linear algebra

An eigenvector of a linear transformation is an element in the domain of the linear transformation sent to a scalar multiple of itself; this scalar is called an eigenvalue of the linear transformation. An eigenspace of the linear transformation is a subspace of the domain all of whose elements are eigenvectors associated with the same eigenvalue. It is a subspace because any linear combination of such vectors is again such a vector.

Formally, eigenvectors and eigenvalues are defined as follows. Let A be an n-by-n matrix of real numbers or complex numbers (see below for generalizations). We say that a vector v ∈ Cn is an eigenvector of A with eigenvalue λ ∈ C if v is not zero and

A v = \lambda v.

The spectrum of A, denoted σ(A), is the set of all eigenvalues.

More generally, if A : V → V is a linear operator on some vector space V, v is a non-zero vector in V, and λ is a scalar (possibly zero) such that

A v = \lambda v,

then we say that v is an eigenvector of the operator A, and its associated eigenvalue is λ. Note that if v is an eigenvector with eigenvalue λ, then any non-zero multiple of v is also an eigenvector with eigenvalue λ. In fact, all the eigenvectors with associated eigenvalue λ, together with 0, form a subspace of V, the eigenspace for the eigenvalue λ.

Since a vector v satisfies A v = λ v, or equivalently (A − λ idV) v = 0, if and only if it is either zero or an eigenvector for λ, we can also define the eigenspace of λ by saying that it is the nullspace of the operator A − λ idV. Here idV denotes the identity operator on V, idV(v) = v.

A conjugate eigenvector or coneigenvector is an element in the domain of the linear transformation sent to a scalar multiple of its conjugate, where the scalar is called the conjugate eigenvalue or coneigenvalue of the linear transformation. The coneigenvectors and coneigenvalues represent essentially the same information and meaning as the regular eigenvectors and eigenvalues, but arise when an alternative coordinate system is used. The corresponding equation is

A v = \lambda \bar{v}.

For example, in coherent electromagnetic scattering theory, the linear transformation A represents the action performed by the scattering object, and the eigenvectors represent polarization states of the electromagnetic wave. In optics, the coordinate system is defined from the wave's viewpoint, known as the Forward Scattering Alignment (FSA), and gives rise to a regular eigenvalue equation, whereas in radar, the coordinate system is defined from the radar's viewpoint, known as the Back Scattering Alignment (BSA), and gives rise to a coneigenvalue equation.

Etymology

The word eigen is German for "own", "peculiar", or "individual": the most likely translation into English mathematical jargon would be "characteristic", and some older references do use the expressions "characteristic value", "characteristic vector" and so forth, but the more distinctive term "eigenvalue" has now become standard. The related "characteristic polynomial", however, retains this name. The reason might be that in German texts about linear algebra, the term "charakteristisches Polynom" is used.

Examples of linear transformation

Some special cases of linear transformations of two-dimensional space R2 are illuminating; concrete example matrices for each case, together with their eigenvalues, are sketched in the code below.

  • rotations: no real eigenvectors (complex eigenvalue/eigenvector pairs exist).
  • reflection: eigenvectors are perpendicular and parallel to the line of symmetry; the eigenvalues are -1 and 1, respectively.
  • scaling:
    • uniform scaling: all vectors are eigenvectors, and the eigenvalue is the scale factor.
    • directional scaling: the eigenvalues are the scale factor and 1.
    • directionally differential scaling: the eigenvalues are the scale factors.
  • projection onto a line: vectors on the line are eigenvectors with eigenvalue 1, and vectors in the direction of projection (which may or may not be perpendicular to the line) are eigenvectors with eigenvalue 0.
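
As an illustration, the following sketch (using NumPy) builds one concrete matrix for each of these cases and prints its eigenvalues; the specific matrices, such as a 90-degree rotation and a reflection across the x-axis, are our own choices:

    import numpy as np

    # Illustrative 2x2 matrices for the cases listed above (specific choices are ours).
    cases = {
        "rotation by 90 degrees":       np.array([[0.0, -1.0], [1.0,  0.0]]),
        "reflection across the x-axis": np.array([[1.0,  0.0], [0.0, -1.0]]),
        "uniform scaling by 3":         np.array([[3.0,  0.0], [0.0,  3.0]]),
        "directional scaling (x by 2)": np.array([[2.0,  0.0], [0.0,  1.0]]),
        "projection onto the x-axis":   np.array([[1.0,  0.0], [0.0,  0.0]]),
    }

    for name, M in cases.items():
        vals = np.linalg.eigvals(M)
        print(f"{name:32s} eigenvalues: {np.round(vals, 3)}")
    # The rotation yields the complex pair i, -i; the other cases give the real
    # eigenvalues described in the list (e.g. -1 and 1 for the reflection).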

Computing eigenvalues

Suppose that we want to compute the eigenvalues of a given matrix. If the matrix is small, we can compute them symbolically using the characteristic polynomial. However, this is often impossible for larger matrices, in which case we must use a numerical method.

Symbolic computations using the characteristic polynomial

An important tool for describing eigenvalues of square matrices is the characteristic polynomial: saying that λ is an eigenvalue of A is equivalent to stating that the system of linear equations (A - λidV) v = 0 (where idV is the identity matrix) has a non-zero solution v (namely an eigenvector), and so it is equivalent to the determinant det(A - λ idV) being zero. The function p(λ) = det(A - λidV) is a polynomial in λ since determinants are defined as sums of products. This is the characteristic polynomial of A: the eigenvalues of a matrix are the zeros of its characteristic polynomial.
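
As a small symbolic illustration (using SymPy; the 2×2 matrix here is an arbitrary example, not from the text), the characteristic polynomial can be formed as det(A − λ idV) and the eigenvalues read off as its roots:

    import sympy as sp

    lam = sp.symbols('lamda')           # the eigenvalue variable
    A = sp.Matrix([[2, 1],
                   [1, 2]])             # an illustrative matrix

    p = (A - lam * sp.eye(2)).det()     # characteristic polynomial det(A - lambda*I)
    print(sp.expand(p))                 # lamda**2 - 4*lamda + 3
    print(sp.solve(sp.Eq(p, 0), lam))   # the eigenvalues 1 and 3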

(Sometimes the characteristic polynomial is taken to be det(λidV - A) instead, which is the same polynomial if V is even-dimensional but has the opposite sign if V is odd-dimensional. This has the slight advantage that its leading coefficient is always 1 rather than -1.)

It follows that we can compute all the eigenvalues of a matrix A by solving the equation p(λ) = 0. If A is an n-by-n matrix, then p has degree n, and A can therefore have at most n eigenvalues. Conversely, the fundamental theorem of algebra says that this equation has exactly n roots (zeroes), counted with multiplicity. All real polynomials of odd degree have a real number as a root, so for odd n, every real matrix has at least one real eigenvalue. In the case of a real matrix, for both even and odd n, the non-real eigenvalues come in conjugate pairs.

An example of a matrix with no real eigenvalues is the 90-degree rotation

\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},

whose characteristic polynomial is λ² + 1, and so its eigenvalues are the pair of complex conjugates i, -i.

The Cayley-Hamilton theorem states that every square matrix satisfies its own characteristic polynomial, that is, p(A) = 0.
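
For the 90-degree rotation above, whose characteristic polynomial is λ² + 1, the Cayley-Hamilton theorem can be checked numerically in one line (a minimal NumPy sketch):

    import numpy as np

    A = np.array([[0.0, -1.0],
                  [1.0,  0.0]])     # the 90-degree rotation

    # Characteristic polynomial: p(lambda) = lambda**2 + 1.
    # Cayley-Hamilton: substituting A for lambda must give the zero matrix.
    print(A @ A + np.eye(2))        # [[0. 0.] [0. 0.]]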

Eigenvalues of 2×2 matrices

An analytic solution for the eigenvalues of 2×2 matrices can be obtained directly from the quadratic formula: if

A = \begin{pmatrix} a & b \\ c & d \end{pmatrix},

then the characteristic polynomial is

p(\lambda) = \det(A - \lambda I) = \lambda^2 - (a + d)\lambda + (ad - bc)

(notice that the coefficients are, up to sign, the trace a + d and the determinant ad − bc), so the solutions are

\lambda = \frac{(a + d) \pm \sqrt{(a + d)^2 - 4(ad - bc)}}{2}.

A formula for the eigenvalues of a 3x3 or 4x4 matrix could be derived in an analogous way, using the formulae for the roots of a cubic or quartic equation.
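
The 2×2 formula above translates directly into a small helper function; the sketch below (the function name and example matrices are ours) uses cmath so that complex eigenvalues, such as those of the 90-degree rotation, are handled as well:

    import cmath

    def eig2x2(a, b, c, d):
        """Eigenvalues of [[a, b], [c, d]] via the quadratic formula."""
        tr = a + d                      # trace
        det = a * d - b * c             # determinant
        disc = cmath.sqrt(tr * tr - 4 * det)
        return (tr + disc) / 2, (tr - disc) / 2

    print(eig2x2(2, 1, 1, 2))    # eigenvalues 3 and 1
    print(eig2x2(0, -1, 1, 0))   # eigenvalues i and -i (the 90-degree rotation)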

Numerical computations

Main article: eigenvalue algorithm.

In practice, eigenvalues of large matrices are not computed using the characteristic polynomial. Computing the polynomial becomes expensive in itself, and exact (symbolic) roots of a high-degree polynomial can be difficult to compute and express: the Abel-Ruffini theorem implies that the roots of polynomials of degree 5 or higher cannot, in general, be expressed simply using n-th roots. Effective numerical algorithms for approximating roots of polynomials exist, but small errors in the eigenvalues can lead to large errors in the eigenvectors. Therefore, general algorithms for finding eigenvectors and eigenvalues are iterative. The easiest method is the power method: we choose a random vector v and compute the sequence Av, A²v, A³v, ... After normalization, this sequence almost always converges to an eigenvector corresponding to the dominant eigenvalue. This algorithm is easy, but not very useful by itself; however, popular methods such as the QR algorithm are based on it.
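
A minimal sketch of the power method just described (using NumPy; the stopping tolerance, iteration cap and test matrix are arbitrary choices, not prescribed by the text):

    import numpy as np

    def power_method(A, iters=1000, tol=1e-12, seed=0):
        """Approximate a dominant eigenpair of A by repeated multiplication."""
        rng = np.random.default_rng(seed)
        v = rng.standard_normal(A.shape[0])
        v /= np.linalg.norm(v)
        lam = 0.0
        for _ in range(iters):
            w = A @ v
            w_norm = np.linalg.norm(w)
            if w_norm == 0.0:            # v landed in the nullspace; give up
                return 0.0, v
            v_new = w / w_norm
            lam_new = v_new @ A @ v_new  # Rayleigh quotient estimate of the eigenvalue
            if abs(lam_new - lam) < tol:
                return lam_new, v_new
            lam, v = lam_new, v_new
        return lam, v

    A = np.array([[2.0, 0.0], [0.0, -0.5]])
    lam, v = power_method(A)
    print(lam)                           # close to 2.0, the dominant eigenvalue
    print(v)                             # close to (+-1, 0), the corresponding eigenvector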

Example computation

Eigenvalues and eigenvectors can be computed by the following two-step procedure (a sketch in code follows below).

Consider an n-by-n matrix B.

1. Find the roots of the characteristic polynomial of B. These are the eigenvalues.
  • If n different roots are found, then the matrix can be diagonalized.
2. For each eigenvalue λ, find a basis for the kernel of the matrix B − λI. These basis vectors are the eigenvectors.
  • Eigenvectors obtained from different eigenvalues are linearly independent.
  • Eigenvectors obtained from a repeated root (i.e. a basis of the corresponding eigenspace) are also linearly independent.
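
The two-step procedure can be sketched with SymPy as follows (the helper name is ours, and exact symbolic arithmetic is assumed, so this is only practical for small matrices):

    import sympy as sp

    def eigen_pairs(B):
        """Step 1: roots of the characteristic polynomial; step 2: kernel of B - lambda*I."""
        lam = sp.symbols('lamda')
        n = B.shape[0]
        char_poly = (B - lam * sp.eye(n)).det()
        pairs = []
        for root, multiplicity in sp.roots(char_poly, lam).items():
            basis = (B - root * sp.eye(n)).nullspace()   # eigenvectors for this eigenvalue
            pairs.append((root, multiplicity, basis))
        return pairs

    B = sp.Matrix([[2, 1],
                   [0, 3]])
    for root, mult, basis in eigen_pairs(B):
        print(root, mult, [list(v) for v in basis])
    # prints each eigenvalue with its multiplicity and an eigenvector basis,
    # here 2 with eigenvector (1, 0) and 3 with eigenvector (1, 1).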


Let us determine the eigenvalues of the matrix

A = \begin{pmatrix} 0 & 1 & -1 \\ 1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix},

which represents a linear operator R3 → R3.

Identifying eigenvalues

We first compute the characteristic polynomial of A:

p(\lambda) = \det(A - \lambda I) = \det\begin{pmatrix} -\lambda & 1 & -1 \\ 1 & 1-\lambda & 0 \\ -1 & 0 & 1-\lambda \end{pmatrix} = -\lambda^3 + 2\lambda^2 + \lambda - 2.

This polynomial factors as -(\lambda - 2)(\lambda - 1)(\lambda + 1). Therefore, the eigenvalues of A are 2, 1 and −1.

Identifying eigenvectors

With the eigenvalues in hand, we can solve sets of simultaneous linear equations to determine the corresponding eigenvectors. For example, one can check that

A \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \\ -2 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix},

which confirms that 2 is an eigenvalue of A and gives us a corresponding eigenvector.
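
This computation is easy to cross-check numerically, for instance with NumPy (a sketch; np.linalg.eig does all the work, and its ordering of the eigenvalues is arbitrary):

    import numpy as np

    A = np.array([[ 0.0, 1.0, -1.0],
                  [ 1.0, 1.0,  0.0],
                  [-1.0, 0.0,  1.0]])

    v = np.array([1.0, 1.0, -1.0])
    print(A @ v)                       # [ 2.  2. -2.] = 2 * v, so 2 is an eigenvalue

    vals, vecs = np.linalg.eig(A)
    print(np.round(np.sort(vals), 6))  # [-1.  1.  2.]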

Note that if A is a real matrix, the characteristic polynomial will have real coefficients, but its roots will not necessarily all be real. The complex eigenvalues come in conjugate pairs. For a real matrix, the eigenvectors of a non-real eigenvalue z, which are the solutions of Av = zv, cannot be real.

If v1, ..., vm are eigenvectors with different eigenvalues λ1, ..., λm, then the vectors v1, ..., vm are necessarily linearly independent.

The spectral theorem for symmetric matrices states that if A is a real symmetric n-by-n matrix, then all its eigenvalues are real, and there exist n linearly independent eigenvectors for A which are mutually orthogonal. Symmetric matrices are commonly encountered in engineering.

Our example matrix from above is symmetric, and three mutually orthogonal eigenvectors of A are

v_1 = \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}, \quad v_2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \quad v_3 = \begin{pmatrix} 2 \\ -1 \\ 1 \end{pmatrix}.

These three vectors form a basis of R3. With respect to this basis, the linear map represented by A takes a particularly simple form: every vector x in R3 can be written uniquely as

x = x_1 v_1 + x_2 v_2 + x_3 v_3,

and then we have

A x = 2 x_1 v_1 + x_2 v_2 - x_3 v_3.

Multiplicity

The (algebraic) multiplicity of an eigenvalue λ of A is the order of λ as a zero of the characteristic polynomial of A; in other words, it is the number of factors t − λ in the characteristic polynomial. An n-by-n matrix has n eigenvalues, counted according to their algebraic multiplicity, because its characteristic polynomial has degree n.

An eigenvalue of algebraic multiplicity 1 is called a simple eigenvalue.

Occasionally, in an article on matrix theory, one may read a statement like

"the eigenvalues of a matrix A are 4,4,3,3,3,2,2,1,"

meaning that the algebraic multiplicity of 4 is two, of 3 is three, of 2 is two and of 1 is one. This style is used because algebraic multiplicity is the key to many mathematical proofs in matrix theory.

The geometric multiplicity of an eigenvalue λ is the dimension of the eigenspace associated with λ, which we defined above as the nullspace of the matrix λI − A.

The algebraic multiplicity can also be thought of as a dimension: it is the dimension of the associated generalized eigenspace, which is the nullspace of the matrix (λI − A)k for any sufficiently large k. That is, it is the space of generalized eigenvectors, where a generalized eigenvector is any vector which eventually becomes 0 if λI − A is applied to it enough times successively. Obviously any eigenvector is a generalized eigenvector, and so each eigenspace is contained in the associated generalized eigenspace. This provides an easy proof that the geometric multiplicity is always less than or equal to the algebraic multiplicity. (Do not confuse these with generalized eigenvalue problem, below.)

Consider for example the matrix

A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.

It has only one eigenvalue, namely λ = 1. The characteristic polynomial is (\lambda - 1)^2, so this eigenvalue has algebraic multiplicity 2. However, the associated eigenspace is spanned by (1, 0)T, so the geometric multiplicity is only 1.
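
Both multiplicities for this matrix can be read off with a short SymPy sketch:

    import sympy as sp

    A = sp.Matrix([[1, 1],
                   [0, 1]])
    lam = sp.symbols('lamda')

    char_poly = sp.factor((lam * sp.eye(2) - A).det())
    print(char_poly)                             # (lamda - 1)**2 -> algebraic multiplicity 2

    eigenspace = (sp.eye(2) - A).nullspace()     # eigenspace for lambda = 1
    print(len(eigenspace), list(eigenspace[0]))  # 1 [1, 0]      -> geometric multiplicity 1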

Decomposition theorem

An n by n matrix A has n linearly independent eigenvectors if and only if it can be decomposed into the form

A = U \Lambda U^{-1},

where Λ is a diagonal matrix. In this case A is said to be diagonalizable. The columns of U form a basis of eigenvectors, and the diagonal entries of Λ are the corresponding eigenvalues; the entries of U can be chosen to be real (as opposed to complex) if and only if there are n linearly independent real eigenvectors. Even with complex coefficients, however, such a matrix U does not always exist; for example,

\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}

has only one 1-dimensional eigenspace.
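
For a matrix that is diagonalizable, the decomposition can be formed and checked numerically; the sketch below uses NumPy and an illustrative symmetric 2×2 matrix of our own choosing:

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])             # a diagonalizable (symmetric) example

    vals, U = np.linalg.eig(A)             # columns of U are eigenvectors
    Lam = np.diag(vals)                    # Lambda: eigenvalues on the diagonal

    reconstructed = U @ Lam @ np.linalg.inv(U)
    print(np.allclose(reconstructed, A))   # True: A = U Lambda U^{-1}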

There are several generalizations of this decomposition which can cope with the non-diagonalizable case, suited for different purposes:

  • the singular value decomposition, A = U Σ V*, where Σ is diagonal but U is not necessarily equal to V;
  • the Jordan normal form, where A = U Λ U^{-1} but Λ is not quite diagonal;
  • any matrix A can be written uniquely as A = S + N, where S is semisimple (i.e. diagonalizable) and N is nilpotent, and S commutes with N; this is easily found from the Jordan form.
  • any invertible matrix A can be written uniquely as A = SJ, where S is semisimple and J is unipotent, and S commutes with J; this is found from the previous decomposition by taking J = 1 + S^{-1}N.

Similarly, an n by n matrix A has n linearly independent conjugate eigenvectors (such as arise from an alternate coordinate system) if and only if it can be decomposed into the form

A = \bar{U} \Lambda U^{-1},

where the columns of U are conjugate eigenvectors and the diagonal entries of Λ are the corresponding coneigenvalues.

Properties

The spectrum is invariant under similarity transformations: the matrices A and P^{-1}AP have the same eigenvalues for any matrix A and any invertible matrix P. The spectrum is also invariant under transposition: the matrices A and A^T have the same eigenvalues.

A matrix is invertible if and only if zero is not an eigenvalue of the matrix.

A matrix is diagonalizable if and only if the algebraic and geometric multiplicities coincide for all its eigenvalues, which is the same as requiring that all generalized eigenvectors are eigenvectors. In particular, an n-by-n matrix which has n different eigenvalues is always diagonalizable.

The vector space on which the matrix acts is always the direct sum of the generalised eigenspaces (i.e. is spanned by them and they are independent). This is true of the ordinary (non-generalised) eigenspaces if and only if they are equal to the generalized eigenspaces, i.e. if and only if the matrix is diagonalizable.

The location of the spectrum is often restricted if the matrix has a special form:

  • All eigenvalues of a Hermitian matrix (A = A*) are real; in particular, all eigenvalues of a real symmetric matrix are real.
  • All eigenvalues of a skew-Hermitian matrix (A = −A*) are purely imaginary.
  • All eigenvalues of a unitary matrix (A^{-1} = A*) have absolute value one.
  • All eigenvalues of a triangular matrix are its diagonal entries.

Generally, the trace of a matrix equals the sum of the eigenvalues, and the determinant equals the product of the eigenvalues (counted according to algebraic multiplicity).

Suppose that A is an m-by-n matrix, with m ≤ n, and that B is an n-by-m matrix. Then BA has the same eigenvalues as AB, together with n − m additional eigenvalues equal to zero.
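
A quick numerical illustration of this fact (the rectangular matrices below are arbitrary examples):

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])        # m = 2, n = 3
    B = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])             # n = 3, m = 2

    print(np.round(np.sort(np.linalg.eigvals(A @ B)), 3))   # the two eigenvalues of AB
    print(np.round(np.sort(np.linalg.eigvals(B @ A)), 3))   # the same two, plus one zero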

Extensions and generalizations

Infinite-dimensional spaces: eigenvalues of an operator

The concept of eigenvectors can be extended to linear operators acting on infinite-dimensional Hilbert spaces or Banach spaces.

Suppose we have a linear operator A mapping the vector space V to itself. As in the matrix case, we say that λ ∈ C is an eigenvalue of A if there exists a nonzero vV such that Av = λv.

Suppose now that A is a bounded linear operator on a Banach space V. We say that λ ∈ C is a spectral value of A if the operator A − λI is not invertible, where I denotes the identity operator. Note that by the closed graph theorem, if a bounded operator has an inverse, the inverse is necessarily bounded. The set of all spectral values is the spectrum of A.

If V is finite-dimensional, then the spectrum of A is the same as the set of eigenvalues of A. This follows from the fact that on finite-dimensional spaces, injectivity of a linear operator is equivalent to its surjectivity. However, an operator on an infinite-dimensional space may have no eigenvalues at all, while it always has spectral values.

There are operators on Banach spaces which have no eigenvectors at all. For example, take the bilateral shift on the Hilbert space ℓ²(Z); it is easy to see that any potential eigenvector cannot be square-summable, so none exist. However, any bounded linear operator on a Banach space V does have a non-empty spectrum. The spectrum σ(A) of the operator A : V → V is defined as

\sigma(A) = \{ \lambda \in \mathbf{C} : A - \lambda I \text{ is not invertible} \}.

Then σ(A) is a compact set of complex numbers, and it is non-empty. When A is a compact operator (and in particular when A is an operator between finite-dimensional spaces, as above), the spectrum of A is the same as the set of its eigenvalues, except that in the infinite-dimensional case 0 always belongs to the spectrum even if it is not an eigenvalue.

The spectrum of an operator is an important property in functional analysis.

Eigenvalues of a matrix with entries from a ring

Suppose that A is a square matrix with entries in a ring R. An element λ ∈ R is called a right eigenvalue of A if there exists a nonzero column vector x such that Ax = λx, or a left eigenvalue if there exists a nonzero row vector y such that yA = yλ.

If R is commutative, the left eigenvalues of A are exactly the right eigenvalues of A and are just called eigenvalues. If R is not commutative, e.g. quaternions, they may be different.

Eigenvalues of a graph

In spectral graph theory, an eigenvalue of a graph is defined as an eigenvalue of the graph's adjacency matrix A, or (increasingly) of the graph's Laplacian matrix I − T^{-1/2} A T^{-1/2}, where T is a diagonal matrix holding the degree of each vertex, and in T^{-1/2}, 0 is substituted for 0^{-1/2}.
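
As a small illustration (our own example, built directly with NumPy rather than a graph library), the eigenvalues of the adjacency matrix and of the Laplacian I − T^{-1/2} A T^{-1/2} of a triangle graph can be computed as follows:

    import numpy as np

    # Adjacency matrix of a triangle (3-cycle): every vertex has degree 2.
    A = np.array([[0.0, 1.0, 1.0],
                  [1.0, 0.0, 1.0],
                  [1.0, 1.0, 0.0]])
    deg = A.sum(axis=1)
    T_inv_sqrt = np.diag(1.0 / np.sqrt(deg))    # T^{-1/2}; all degrees are nonzero here

    laplacian = np.eye(3) - T_inv_sqrt @ A @ T_inv_sqrt

    print(np.round(np.linalg.eigvalsh(A), 3))          # [-1. -1.  2.]
    print(np.round(np.linalg.eigvalsh(laplacian), 3))  # [0.  1.5 1.5]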

Generalized eigenvalue problem

A generalized eigenvalue problem is of the form

A v = \lambda B v \qquad (1)

where A and B are matrices (with complex entries). The generalized eigenvalues λ can be obtained by solving the equation

\det(A - \lambda B) = 0.

If B is invertible, then problem (1) can obviously be written in the form

B^{-1} A v = \lambda v \qquad (2)

which is a standard eigenvalue problem. However, in most situations it is preferable not to perform the inversion, but rather to solve the generalized eigenvalue problem as stated originally.

If A and B are symmetric matrices with real entries and B is positive definite, then problem (1) has real eigenvalues. This would not have been easy to see from the equivalent formulation (2), because the matrix B^{-1}A is not necessarily symmetric even if A and B are.
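
In practice such problems are solved directly; for example, scipy.linalg.eig accepts a second matrix B and solves Av = λBv without forming B^{-1}A. A minimal sketch (the matrices are illustrative, with B symmetric positive definite):

    import numpy as np
    from scipy.linalg import eig

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])             # symmetric
    B = np.array([[2.0, 0.0],
                  [0.0, 1.0]])             # symmetric positive definite

    vals, vecs = eig(A, B)                 # solves A v = lambda B v
    print(np.round(np.real(vals), 3))      # real generalized eigenvalues, about 0.775 and 3.225
    print(np.allclose(A @ vecs[:, 0], vals[0] * (B @ vecs[:, 0])))  # True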

See also

  • Invariants of tensors

External links

  • Videos of MIT Linear Algebra Course, fall 1999 - See Lecture #21: Eigenvalues and Eigenvectors
  • MathWorld: Eigenvector
  • Earliest Known Uses of Some of the Words of Mathematics: E - see eigenvector and related terms
  • ARPACK is a collection of FORTRAN subroutines for solving large scale eigenvalue problems
  • "Eigenvalue (of a matrix)". PlanetMath.

References

  • Roger A. Horn and Charles R. Johnson, Matrix Analysis, Cambridge University Press (1985). ISBN 0-521-30586-1 (hardback), ISBN 0-521-38632-2 (paperback).
  • John B. Fraleigh and Raymond A. Beauregard, Linear Algebra (3rd edition), Addison-Wesley Publishing Company (1995). ISBN 0-201-83999-7 (international edition).