
Chapter 5 Eigenvalues and Eigenvectors



  1. Chapter 5 Eigenvalues and Eigenvectors

  2. 5.2 Eigenvalues and Eigenvectors. Definition: Let A be an n×n matrix. Suppose that x is a non-zero vector in Rn and λ is a number (possibly zero) such that Ax = λx. Then x is called an eigenvector of A and λ is called an eigenvalue of A. We say that λ is the eigenvalue associated with x, and x is an eigenvector corresponding to λ.
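A quick numerical check of this definition (a sketch; NumPy is my choice of tool, not something the slides use):

```python
import numpy as np

# A small symmetric example whose eigenvalues are 3 and 1.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(A)   # columns of eigvecs are eigenvectors

# Verify A x = lambda x for every eigenpair.
for lam, x in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ x, lam * x)
print(eigvals)                        # 3. and 1. (order may vary)
```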

  3. Example: Let T : R2 → R2 be a linear transformation defined by [matrix and vectors shown on slide].

  4. Remarks: If T is a linear transformation from an infinite-dimensional vector space V into itself, we can still define eigenvectors and eigenvalues for T (even though T cannot be represented by a matrix), namely by the same condition T(x) = λx. Example: Let V be the vector space of all twice-differentiable real functions on the real line, and let T be the second-derivative operator T(f) = f″. Then T(sin) = (sin)″ = −sin, hence sin(x) is an eigenvector of T with eigenvalue −1.

  5. Example: Let V be the vector space of all differentiable real functions on the real line, and let T be the derivative operator T(f) = f′. Actually, any real number λ is an eigenvalue for T, because T(e^(λx)) = λe^(λx), so e^(λx) is a corresponding eigenvector.
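Both operator examples can be verified symbolically; a sketch using SymPy (my choice of tool, with T reconstructed as the second- and first-derivative operators as above):

```python
import sympy as sp

x, lam = sp.symbols('x lam', real=True)

# Slide 4: T(f) = f''. Then T(sin x) = -sin x, so sin x is an
# eigenvector of T with eigenvalue -1.
assert sp.diff(sp.sin(x), x, 2) == -sp.sin(x)

# Slide 5: T(f) = f'. Then T(exp(lam*x)) = lam*exp(lam*x), so every
# real lam is an eigenvalue, with eigenvector exp(lam*x).
assert sp.simplify(sp.diff(sp.exp(lam * x), x) - lam * sp.exp(lam * x)) == 0
```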

  6. Theorem: The number λ is an eigenvalue of the n×n matrix A if and only if it satisfies the equation det(λIn − A) = 0. Definition: When expanded, the determinant det(xIn − A) is a polynomial in x of degree n, called the characteristic polynomial of A, denoted by χA(x). The equation det(xIn − A) = 0 is called the characteristic equation of A.
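For concreteness, a sketch of how the characteristic polynomial and its roots line up with the eigenvalues (NumPy assumed):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.poly returns the coefficients of det(x*In - A), highest degree first:
coeffs = np.poly(A)            # x^2 - 7x + 10  ->  [1., -7., 10.]

# The eigenvalues are exactly the roots of the characteristic equation.
print(np.roots(coeffs))        # 5. and 2.
print(np.linalg.eigvals(A))    # the same values
```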

  7. Let λ be an eigenvalue of the n×n matrix A. Theorem: The zero vector together with the set of eigenvectors of A associated with λ forms a subspace of Rn (this is actually the null space of λIn − A). Definition: The subspace mentioned above is called the eigenspace of A associated with λ, denoted by Eλ. Theorem: The dimension of Eλ is at most the multiplicity of (x − λ) in χA(x).
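The eigenspace can be computed directly as the stated null space; a sketch using SciPy (my choice of tool):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[3.0, 1.0],
              [0.0, 3.0]])    # (x - 3)^2: eigenvalue 3 with multiplicity 2

lam = 3.0
E = null_space(lam * np.eye(2) - A)  # orthonormal basis for E_lam
print(E.shape[1])  # 1: here dim E_lam is strictly less than the multiplicity
```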

  8. Eigenvalues of Block Triangular Matrices: A square matrix A is said to be in block triangular form if it can be partitioned as A = [ P Q ; 0 S ], where P and S are square matrices. In this case, the eigenvalues of A = the eigenvalues of P union the eigenvalues of S.
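A small numerical illustration of this fact (NumPy assumed; the blocks P, Q, S below are arbitrary choices of mine):

```python
import numpy as np

P = np.array([[1.0, 2.0],
              [0.0, 3.0]])           # eigenvalues 1 and 3
S = np.array([[5.0]])                # eigenvalue 5
Q = np.array([[7.0],
              [8.0]])                # Q does not affect the eigenvalues

A = np.block([[P, Q],
              [np.zeros((1, 2)), S]])

print(sorted(np.linalg.eigvals(A)))  # [1., 3., 5.] = eig(P) union eig(S)
```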

  9. Important Note: Given a square matrix A, performing any row operation on it will change its eigenvalues, and the change is unpredictable! Therefore, it is not practical to find the eigenvalues of A by reducing A to a triangular matrix with row operations.
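The warning is easy to see numerically; a sketch (NumPy assumed) in which a single elementary row operation changes the spectrum:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
print(np.linalg.eigvals(A))    # 3. and 1.

B = A.copy()
B[1] -= 0.5 * B[0]             # row operation: R2 <- R2 - (1/2) R1
print(np.linalg.eigvals(B))    # 2. and 1.5 -- the eigenvalues changed
```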

  10. Theorem: Let A be an n×n matrix with eigenvalues λ1, λ2, ···, λn (not necessarily distinct). Then • λ1 · λ2 · ··· · λn = det(A) • λ1 + λ2 + ··· + λn = a11 + a22 + ··· + ann (i.e. the trace of A)
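Both identities are easy to spot-check (NumPy assumed; the random matrix is an arbitrary example):

```python
import numpy as np

A = np.random.default_rng(0).normal(size=(4, 4))
eigs = np.linalg.eigvals(A)    # may be complex, in conjugate pairs

assert np.isclose(np.prod(eigs), np.linalg.det(A))   # product = det(A)
assert np.isclose(np.sum(eigs), np.trace(A))         # sum = trace(A)
```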

  11. 5.3 Diagonalization. Definition: A square n×n matrix A is said to be diagonalizable if there is an invertible n×n matrix P such that P⁻¹AP is a diagonal matrix Λ. In this case, we also say that P diagonalizes A. (Note: not all square matrices are diagonalizable.)

  12. Theorem: An n×n matrix A is diagonalizable if and only if A has n linearly independent eigenvectors. Theorem: If v1, …, vk are eigenvectors of a matrix A and the associated eigenvalues λ1, λ2, ···, λk are distinct, then v1, …, vk are linearly independent. Corollary: If the characteristic polynomial χA(x) of an n×n matrix A has n distinct roots (real or complex), then A is diagonalizable.

  13. Theorem: If v1, …, vn are n linearly independent eigenvectors of an n×n matrix A, and P is the n×n matrix whose i-th column is vi, then P is invertible and P⁻¹AP = diag(λ1, λ2, …, λn), the diagonal matrix of the associated eigenvalues.
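A sketch of this recipe (NumPy assumed): build P from the eigenvectors and check that P⁻¹AP comes out diagonal.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, P = np.linalg.eig(A)      # the i-th column of P is v_i
Lam = np.linalg.inv(P) @ A @ P     # should equal diag(lambda_1, ..., lambda_n)

assert np.allclose(Lam, np.diag(eigvals))
print(np.diag(Lam))                # 5. and 2.
```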

  14. 5.4 Symmetric Matrices. Recall that an n×n matrix A is symmetric if aij = aji for all i, j, i.e. Aᵀ = A. Theorem: • All eigenvalues of a real symmetric matrix are real. • Every real symmetric matrix is diagonalizable, i.e. we can find an invertible matrix P such that P⁻¹AP is a diagonal matrix (whose diagonal elements are exactly the eigenvalues of A, including multiplicities). • Every n×n real symmetric matrix A has n linearly independent eigenvectors.

  15. Theorem: If A is an n×n symmetric matrix, then eigenvectors of A associated with different eigenvalues are orthogonal. Theorem: If A is an n×n symmetric matrix, then we can find n linearly independent eigenvectors of A that are • orthogonal to each other and • of unit length (such a set is called an orthonormal set).
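These facts are exactly what a symmetric eigensolver returns; a sketch using np.linalg.eigh (NumPy assumed):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])            # symmetric: A.T == A

eigvals, Q = np.linalg.eigh(A)             # real eigenvalues, orthonormal columns

assert np.allclose(Q.T @ Q, np.eye(3))     # an orthonormal set of eigenvectors
assert np.allclose(Q.T @ A @ Q, np.diag(eigvals))   # Q diagonalizes A
```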

  16. What if A is not diagonalizable? In this case we have two choices: • Give up the diagonal form and accept a less beautiful form called the Jordan normal form, i.e. P⁻¹AP = a square matrix in Jordan normal form. • Use two different invertible matrices to make A diagonal, i.e. PAQ = a diagonal matrix (but the diagonal elements are not the eigenvalues of A; they are called the singular values of A). However, with this second choice there is no easy way to compute Aᵏ.

  17. Jordan Normal Form of a Matrix: A square matrix is in Jordan normal form if it is a block diagonal matrix where each Jordan block Ji is a square matrix of the form

      Ji = [ λi  1          ]
           [     λi  1      ]
           [         ⋱   ⋱  ]
           [             λi ]

  where λi is an eigenvalue (λi on the diagonal, 1's on the superdiagonal, 0's elsewhere).

  18. Example of a square matrix in Jordan normal form [matrix shown on slide]: from the stated multiplicities it is a 7×7 matrix built from the 4 Jordan blocks J3(2), J1(−1), J2(4), J1(4), so it has 4 Jordan blocks but only 3 different eigenvalues. The algebraic multiplicity of λ = 2 is 3, its geometric multiplicity is 1. The algebraic multiplicity of λ = −1 is 1, its geometric multiplicity is 1. The algebraic multiplicity of λ = 4 is 3, its geometric multiplicity is 2.

  19. How to convert a matrix into Jordan normal form: Suppose that we know that a 3×3 matrix A is similar to a given Jordan form J [worked example shown on slide].
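Since the slide's worked matrix was not recovered, here is a sketch with a stand-in 2×2 defective matrix, using SymPy's jordan_form (my choice of tool):

```python
import sympy as sp

A = sp.Matrix([[3, 1],
               [-1, 1]])    # characteristic polynomial (x - 2)^2, but only
                            # one independent eigenvector, so A is not
                            # diagonalizable

P, J = A.jordan_form()      # A = P * J * P^{-1}
sp.pprint(J)                # [[2, 1], [0, 2]]: a single 2x2 Jordan block
assert sp.simplify(P * J * P.inv() - A) == sp.zeros(2, 2)
```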

  20. In case II, we use the fact that AᵀA is always symmetric, and its eigenvalues are non-negative. Lemma: The eigenvalues of AᵀA are non-negative. Proof: Suppose that AᵀAx = λx for some nonzero x. Then ‖Ax‖² = Ax · Ax = x · (AᵀAx) = x · (λx) = λ‖x‖², but ‖Ax‖² ≥ 0 and ‖x‖² > 0, hence λ ≥ 0.
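A quick numerical confirmation of the lemma (NumPy assumed; the random A is an arbitrary example):

```python
import numpy as np

A = np.random.default_rng(1).normal(size=(4, 4))
gram = A.T @ A                       # always symmetric

eigvals = np.linalg.eigvalsh(gram)   # eigvalsh: solver for symmetric matrices
print(eigvals)                       # all >= 0, up to rounding error
assert np.all(eigvals >= -1e-10)
```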

  21. Theorem (Singular Value Decomposition of square matrices): Given any n×n matrix A, we can find two invertible n×n matrices U and V (in fact, they can be chosen orthogonal) such that A = UΣVᵀ, where Σ = diag(σ1, σ2, …, σk, 0, …, 0), and • k is the rank of A • σ1 ≥ σ2 ≥ ··· ≥ σk > 0 are called the singular values of A; their squares σ1², …, σk² are exactly the non-zero eigenvalues of AᵀA (note: all eigenvalues of AᵀA are non-negative).
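A sketch tying this theorem (and the definition on the next slide) to np.linalg.svd (NumPy assumed):

```python
import numpy as np

A = np.random.default_rng(2).normal(size=(5, 5))

U, s, Vt = np.linalg.svd(A)            # A = U @ diag(s) @ Vt, s descending
assert np.allclose(U @ np.diag(s) @ Vt, A)

# The squared singular values are the eigenvalues of A^T A.
assert np.allclose(np.sort(s**2), np.sort(np.linalg.eigvalsh(A.T @ A)))

# Each singular triple satisfies A v = sigma u and A^T u = sigma v.
u, sigma, v = U[:, 0], s[0], Vt[0]
assert np.allclose(A @ v, sigma * u) and np.allclose(A.T @ u, sigma * v)
```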

  22. Definition: A real number σ is said to be a singular value of an n×n real matrix A if there are unit vectors u and v such that Au = σv and Aᵀv = σu.

  23. Applications in data compression: For simplicity, we only consider square black-and-white pictures, in which each pixel is represented by a whole number between 0 and 255. Suppose that A is such a 100×100 matrix. It then takes up a bit more than 10,000 bytes of memory. If we use the previous theorem to decompose A into UΣVᵀ, then the σi's will be very small for large i when the picture contains only smooth gradation. Hence we can drop those small singular values and get a good approximation of A.

  24. For example, suppose we keep only the 20 largest singular values. In this case we only have to keep the first 20 columns of U and the first 20 rows of Vᵀ, because all the others do not contribute to the product at all. Hence we have A ≈ U1Σ1V1ᵀ, where the three factors have sizes (100×20), (20×20) and (20×100).


  26. The first matrix requires 100×20 = 2,000 bytes, the middle diagonal matrix requires only 20 bytes (its 20 diagonal entries), and the last matrix also requires 20×100 = 2,000 bytes. Hence the compressed image requires a bit more than 4,020 bytes, which is about 40% of the original 10,000. Let's look at an example and find out how lossy this compression is. Note: this method also works for rectangular matrices.
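A sketch of the whole compression pipeline (NumPy assumed; the random "image" is a stand-in for real pixel data, which would compress far better thanks to smooth gradation):

```python
import numpy as np

rng = np.random.default_rng(3)
img = rng.integers(0, 256, size=(100, 100)).astype(float)  # stand-in image

U, s, Vt = np.linalg.svd(img)

k = 20                                         # keep the 20 largest sigma_i
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Storage at one byte per entry, as on the slide:
full = 100 * 100                               # 10,000 bytes
compressed = 100 * k + k + k * 100             # 4,020 bytes
print(compressed / full)                       # ~0.40
print(np.abs(img - approx).max())              # worst-case pixel error
```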

  27. [Figure: a 276×400-pixel photo reconstructed from its 5, 10, 15, and 30 largest singular values, next to the original (which has 276 singular values), together with a plot of the singular values. Source: Stanford Exploration Project, 1997.]

  28. Final Remarks: Even though this compression method is very beautiful in theory, it is not used commercially (at least not today), possibly due to the complexity of the decomposition process. The most popular JPEG compression format uses the Discrete Cosine Transform on 8×8 blocks and then discards the insignificant elements in the transformed 8×8 matrix. This process requires only matrix multiplications, term-by-term division, and rounding; hence it is much faster than the singular value decomposition.
