1. Gaussian Elimination and Its Variants
Topics covered:
- Matrix Multiplication
- Systems of Linear Equations
- Triangular Systems
- Positive Definite Systems; Cholesky Decomposition
- Banded Positive Definite Systems
- Sparse Positive Definite Systems
- Gaussian Elimination and the LU Decomposition
- Gaussian Elimination with Pivoting
- Sparse Gaussian Elimination
Cholesky Decomposition
To solve a linear system Ax = b, if the matrix A is positive definite, we can use the Cholesky decomposition.
To solve Ax = b with A = R^T R:
- let y = Rx
- we have R^T y = b; since R^T is lower triangular, we can solve for y (by forward substitution)
- then from Rx = y, with R upper triangular, we can solve for x (by backward substitution)
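The two triangular solves can be sketched in plain Python (a minimal illustration assuming the Cholesky factor R is already known; the helper names are mine, not from the notes):

```python
def forward_substitution(L, b):
    """Solve L y = b for a lower triangular L (row by row, top down)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        s = sum(L[i][k] * y[k] for k in range(i))
        y[i] = (b[i] - s) / L[i][i]
    return y

def backward_substitution(U, y):
    """Solve U x = y for an upper triangular U (row by row, bottom up)."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x

def cholesky_solve(R, b):
    """Solve (R^T R) x = b: forward substitution with R^T, then backward with R."""
    n = len(b)
    Rt = [[R[j][i] for j in range(n)] for i in range(n)]  # R^T is lower triangular
    y = forward_substitution(Rt, b)
    return backward_substitution(R, y)

# A = R^T R with R = [[2, 1], [0, 1]] gives A = [[4, 2], [2, 2]];
# the solution of A x = [6, 4] is x = [1, 1].
x = cholesky_solve([[2.0, 1.0], [0.0, 1.0]], [6.0, 4.0])
```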
Positive Definite Matrix
Sylvester’s criterion
Sylvester’s criterion can help determine whether a matrix is positive definite.
A symmetric matrix A is positive definite if and only if all of its leading principal minors (the determinants of its leading principal submatrices) are positive.
Example: for A = [2 1; 1 2], A is positive definite since:
2 > 0, det(A) = 2·2 − 1·1 = 3 > 0.
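For small matrices, Sylvester’s criterion can be checked directly; a Python sketch (the matrices below are illustrative choices, not ones from the lecture):

```python
def det(M):
    """Determinant by cofactor expansion (fine for small matrices)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(n))

def is_positive_definite(A):
    """Sylvester: a symmetric A is positive definite iff every leading
    principal minor det(A[0:k, 0:k]) is positive."""
    n = len(A)
    return all(det([row[:k] for row in A[:k]]) > 0 for k in range(1, n + 1))

# Leading principal minors of this matrix are 2, 3, 4 -- all positive.
A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
```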
Properties:
- Let M be any n × n nonsingular matrix, and let A = M^T M. Then A is positive definite.
- If A is positive definite, then A is nonsingular.
- If A is positive definite, then a_ii > 0 for i = 1, 2, …, n.
- If A is positive definite, then A can be decomposed in exactly one way into a product A = L L^T such that L is lower triangular with l_ii > 0 (all main-diagonal entries are positive).
Cholesky Decomposition Algorithm
Pseudocode
Given A, compute the Cholesky factor R (A = R^T R) one row at a time:
for i = 1 : n do
    r_ii = sqrt( a_ii − Σ_{k<i} r_ki^2 )
    for j = i+1 : n do
        r_ij = ( a_ij − Σ_{k<i} r_ki r_kj ) / r_ii
    end for
end for
Cost
About n^3/3 flops, roughly half the cost of LU decomposition, since symmetry is exploited.
Matlab function
chol:
R = chol(A) produces an upper triangular matrix R with A = R^T R; L = chol(A,'lower') produces a lower triangular factor L with A = L L^T. From R^T R x = b, we obtain x with one forward and one backward substitution (in Matlab, x = R \ (R' \ b)).
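The pseudocode above can be sketched in Python (inner-product form, producing the upper triangular factor R with A = R^T R; a minimal illustration, not a library-grade routine):

```python
import math

def cholesky(A):
    """Return upper triangular R with A = R^T R, for symmetric positive definite A."""
    n = len(A)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # diagonal entry: r_ii = sqrt(a_ii - sum_{k<i} r_ki^2)
        s = A[i][i] - sum(R[k][i] ** 2 for k in range(i))
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        R[i][i] = math.sqrt(s)
        # entries to the right of the diagonal
        for j in range(i + 1, n):
            t = A[i][j] - sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = t / R[i][i]
    return R

A = [[4.0, 2.0], [2.0, 2.0]]
R = cholesky(A)   # R = [[2.0, 1.0], [0.0, 1.0]]
```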
Tridiagonal Matrix
A tridiagonal matrix is one in which a_ij = 0 for all i, j with |i − j| > 1,
so there are only n + (n − 1) + (n − 1) = 3n − 2 entries in matrix A that can be nonzero.
A recurrence can be applied to A for the Cholesky decomposition.
Since A = L L^T, we suppose L is lower bidiagonal. Partition A of order n as
A = [ a_11  c^T ; c  A1 ]
where, because A is tridiagonal, c = b_1 e_1 has only its first entry nonzero and A1 is tridiagonal of order n − 1.
If we are trying to find L with A = L L^T, partition L as
L = [ l_11  0 ; l  L1 ]
Then we can write A as
A = L L^T = [ l_11^2  l_11 l^T ; l_11 l  l l^T + L1 L1^T ]
From the above, we have
l_11 = sqrt(a_11), l = c / l_11.
Finally we keep finding the lower-level Cholesky factor L1, which satisfies
L1 L1^T = A1 − l l^T.
Because only the first entry of l is nonzero, A1 − l l^T is again tridiagonal, so with this recursion the Cholesky decomposition of a tridiagonal matrix can be computed efficiently.
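Because only the first entry of l is nonzero, the recursion collapses to a two-term recurrence; a Python sketch (the vector names d for the main diagonal and e for the subdiagonal are chosen here, not from the notes):

```python
import math

def tridiag_cholesky(d, e):
    """d: main diagonal (length n); e: subdiagonal (length n-1).
    Returns (l, m): diagonal and subdiagonal of the lower bidiagonal
    factor L, with A = L L^T."""
    n = len(d)
    l = [0.0] * n
    m = [0.0] * (n - 1)
    l[0] = math.sqrt(d[0])
    for i in range(n - 1):
        m[i] = e[i] / l[i]                            # l_{i+1,i} = a_{i+1,i} / l_{ii}
        l[i + 1] = math.sqrt(d[i + 1] - m[i] ** 2)    # Schur-complement update
    return l, m

# A = tridiag with diagonal [2, 2, 2] and off-diagonal [-1, -1]
l, m = tridiag_cholesky([2.0, 2.0, 2.0], [-1.0, -1.0])
```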
Envelope Storage Scheme
The envelope of a symmetric or upper-triangular matrix A is defined as:
Env(A) = {(i, j) : i < j and a_kj ≠ 0 for some k ≤ i}
The envelope storage scheme stores A in 3 one-dimensional arrays:
- DIAG, with length n, stores the main diagonal of the matrix (a_ii, i = 1, …, n)
- ENV stores the envelope entries by columns, one column after the other.
- IENV, with length n + 1, stores the position in ENV of the first stored entry of column j of the matrix A, so that
- IENV(1) = 1
- IENV(j+1) − IENV(j) = number of elements from column j of the matrix that lie in the envelope
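The scheme can be illustrated in Python on a small hypothetical symmetric matrix (entries chosen for illustration; 1-based IENV positions as above):

```python
def envelope_store(A):
    """Return DIAG, ENV, IENV for a symmetric matrix A (dense input).
    For column j, the envelope holds the entries from the first nonzero
    row of that column down to, but excluding, the diagonal."""
    n = len(A)
    DIAG = [A[i][i] for i in range(n)]
    ENV, IENV = [], [1]                     # IENV(1) = 1 (1-based positions)
    for j in range(n):
        rows = [i for i in range(j) if A[i][j] != 0]
        if rows:
            first = min(rows)               # first nonzero row of column j
            ENV.extend(A[i][j] for i in range(first, j))
        IENV.append(len(ENV) + 1)
    return DIAG, ENV, IENV

A = [[1, 2, 0, 0],
     [2, 3, 0, 4],
     [0, 0, 5, 6],
     [0, 4, 6, 7]]
DIAG, ENV, IENV = envelope_store(A)
# DIAG = [1, 3, 5, 7]; ENV = [2, 4, 6]; IENV = [1, 1, 2, 2, 4]
```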
Cuthill-McKee Algorithm
The Cuthill-McKee algorithm transforms a sparse symmetric matrix into a matrix with lower bandwidth by reordering its rows and columns.
Theorem
Let A be positive definite with Cholesky factor R. Then
Env(A) = Env(R)
Corollary
If A is a positive definite matrix with semi-bandwidth s, then its Cholesky factor R also has semi-bandwidth s.
With the above theorem, we can try to reduce the envelope of a sparse matrix A before its Cholesky decomposition.
Combining this with basic facts about elementary row/column operations and graphs leads to the Cuthill-McKee algorithm.
Goal of the Cuthill-McKee algorithm:
For a given symmetric sparse matrix M, the problem is to reduce its bandwidth B by permuting rows and columns so as to move all the nonzero elements of M into a band as close as possible to the diagonal.
If we view the given sparse matrix (the input matrix) and the resulting matrix (the output matrix) as adjacency matrices of graphs, the only difference between the permuted graph and the original graph is the node (vertex) labeling; the two graph structures are identical. The bandwidth-reduction problem can therefore also be viewed as a graph-labeling problem.
Solving the bandwidth-reduction problem is converted to finding the node labeling f that minimizes the bandwidth B of the adjacency matrix of the graph G(n),
where
B = max |f(i) − f(j)| over all pairs of adjacent nodes i and j, and
- f(i), i = 1, …, n, is the label of node i, and f(j) is the label of node j
- the nodes i and j are adjacent or neighbours (i.e. there is an edge in graph G(n) connecting nodes i and j)
On average, the Reverse Cuthill-McKee (RCM) algorithm finds solutions of roughly the same quality as the GPS algorithm, but GPS is up to ten times as fast.
Description of the RCM (Reverse Cuthill-McKee) algorithm:
1. Prepare an empty queue Q and an empty result array R.
2. Select the node in G(n) with the lowest degree (ties are broken arbitrarily) that hasn’t previously been inserted in the result array. Let us name it P.
3. Add P to the first free position of R.
4. Add to the queue all the nodes adjacent to P, in increasing order of their degree.
5. Extract the first node from the queue and examine it. Let us name it C. If C hasn’t previously been inserted in R, add it to the first free position of R, and add to Q all the neighbours of C that are not in R, in increasing order of their degree.
6. If Q is not empty, repeat from step 5.
7. If there are unexplored nodes (the graph is not connected), repeat from step 2.
8. Reverse the order of the elements in R: element R[i] is swapped with element R[n+1−i]. (This reversal is what makes it the Reverse Cuthill-McKee algorithm.)
The result array is interpreted as follows: R[L] = i means that the new label of node i will be L.
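The steps above can be sketched in Python for a graph given as an adjacency list (a dict of node → set of neighbours; this representation is an assumption, since the notes do not fix one):

```python
from collections import deque

def reverse_cuthill_mckee(adj):
    """adj: dict mapping each node to the set of its neighbours.
    Returns the result array R (after the final reversal)."""
    n = len(adj)
    result, visited = [], set()
    while len(result) < n:                   # step 7: handle disconnected graphs
        # step 2: unvisited node of minimum degree
        p = min((v for v in adj if v not in visited), key=lambda v: len(adj[v]))
        visited.add(p)
        result.append(p)                     # step 3
        q = deque(sorted((u for u in adj[p] if u not in visited),
                         key=lambda v: len(adj[v])))  # step 4
        visited.update(q)
        while q:                             # steps 5-6: BFS in degree order
            c = q.popleft()
            result.append(c)
            nbrs = sorted((u for u in adj[c] if u not in visited),
                          key=lambda v: len(adj[v]))
            visited.update(nbrs)
            q.extend(nbrs)
    result.reverse()                         # step 8: the "reverse" in RCM
    return result

# Path graph 0-1-2-3: RCM relabels it end to end.
order = reverse_cuthill_mckee({0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}})
```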
Gaussian Elimination
Three types of elementary operations are used in the Gaussian elimination method:
- Add a multiple of one equation to another equation.
- Interchange two equations.
- Multiply an equation by a nonzero constant.
What are pivots:
- The diagonal elements of U are called pivots. (The kth pivot is the coefficient of the kth variable in the kth equation at the kth step of the elimination.)
When need pivoting:
- If some pivot (in Gaussian elimination without pivoting) is exactly equal to 0, then the elimination fails (division by zero).
- If some pivot is very small in magnitude relative to other numbers in the matrix A, then the computation may be numerically unstable.
Gaussian Elimination Without Pivoting (LU Decomposition)
When a matrix A is not positive definite, the Cholesky decomposition algorithm cannot be applied.
But if A is an n×n matrix whose leading principal submatrices are all nonsingular, then there exists a unique factorization A = LU, where L is unit lower triangular and U is upper triangular
(the pivots u_11, …, u_kk are nonzero if and only if the leading principal submatrices A_1, …, A_k are nonsingular).
LU decomposition is a special form of the Gaussian elimination method (without pivoting); it uses only the first type of elementary operation.
Initialize U = A, L = I
for k =1 : n-1
for j = k+1:n
L(j,k) = U(j,k)/U(k,k)
U(j,k:n) = U(j,k:n) - L(j,k)*U(k,k:n)
end
end
- Operations count is about 2n^3/3 flops.
- In practice, the factors L and U generated by the LU decomposition are stored in the original matrix A (the unit diagonal of L need not be stored).
- LU factorization cannot be guaranteed to be stable.
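The pseudocode above translates almost line for line into Python (a sketch; with no pivoting it breaks down if a zero pivot appears):

```python
def lu(A):
    """LU decomposition without pivoting: A = L U, L unit lower triangular."""
    n = len(A)
    U = [row[:] for row in A]                    # U starts as a copy of A
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for j in range(k + 1, n):
            L[j][k] = U[j][k] / U[k][k]          # multiplier (fails if pivot is 0)
            for c in range(k, n):                # eliminate row j
                U[j][c] -= L[j][k] * U[k][c]
    return L, U

A = [[2.0, 1.0], [4.0, 5.0]]
L, U = lu(A)   # L = [[1, 0], [2, 1]], U = [[2, 1], [0, 3]]
```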
Gaussian Elimination With Pivoting
Because simple Gaussian elimination without pivoting (LU decomposition) may fail or be unstable when a pivot is exactly zero or small in magnitude, we need to consider pivoting when applying Gaussian elimination.
For any n×n nonsingular matrix A, there exists a permutation matrix P such that PA has an LU factorization.
- Unless matrix A has special properties (e.g., A is positive definite or diagonally dominant), pivoting must be done to ensure stability.
- A program that reduces A to upper triangular form must keep track of any interchanges made in order to solve Ax = b.
- As with the LU decomposition, L and U can be stored in one n×n matrix.
Partial Pivoting:
- Partial pivoting: exchange only rows, which does not affect the order of the unknowns.
- For increased numerical stability, make sure the largest possible pivot element is used. This requires searching in the partial column below the pivot element.
- Partial pivoting is usually sufficient.
Matlab Function
The Matlab function lu uses Gaussian elimination with partial pivoting.
[L,U,P] = lu(A)
Determines matrices L, U and P such that PA = LU. They can be used to solve Ax = b: since PAx = Pb, we have LUx = Pb; let y = Ux, solve Ly = Pb by forward substitution, then solve Ux = y by backward substitution.
The inverse of a unit lower triangular matrix with nonzeros below the diagonal in only one column (a Gauss transform) is obtained by simply negating those lower triangular elements.
If
M = [ 1 0 0 ; m_21 1 0 ; m_31 0 1 ]
then
M^-1 = [ 1 0 0 ; -m_21 1 0 ; -m_31 0 1 ]
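A sketch of the same factorization in Python, recording the permutation as a row-order list `perm` (so row i of PA is row perm[i] of A; this convention is chosen here, not taken from the notes):

```python
def lu_partial_pivot(A):
    """Gaussian elimination with partial pivoting: PA = LU."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n - 1):
        # choose the entry of largest magnitude in column k, rows k..n-1
        p = max(range(k, n), key=lambda i: abs(U[i][k]))
        if p != k:                       # interchange rows and record it
            U[k], U[p] = U[p], U[k]
            L[k], L[p] = L[p], L[k]
            perm[k], perm[p] = perm[p], perm[k]
        for j in range(k + 1, n):
            L[j][k] = U[j][k] / U[k][k]
            for c in range(k, n):
                U[j][c] -= L[j][k] * U[k][c]
    for i in range(n):
        L[i][i] = 1.0                    # unit diagonal of L
    return L, U, perm

A = [[1.0, 2.0], [3.0, 4.0]]
L, U, perm = lu_partial_pivot(A)   # rows swapped: perm = [1, 0]
```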
Complete (Total/Full) Pivoting:
- Complete pivoting (or total pivoting): exchange rows and columns to put the entry of largest magnitude in the remaining matrix at the pivot position, which requires changing the order of the unknowns.
- For increased numerical stability, make sure the largest possible pivot element is used. This requires searching the pivot row and all rows below it, starting at the pivot column.
- Full pivoting is less susceptible to roundoff, but the increase in stability comes at a cost of more complex programming and an increase in work associated with searching and data movement.
2. Sensitivity of Linear Systems
Topics covered in textbook:
- Vector and Matrix Norms
- Condition Numbers
- Perturbing the Coefficient Matrix
- A Posteriori Error Analysis Using the Residual
- Roundoff Errors: Backward Stability
- Propagation of Roundoff Errors
- Backward Error Analysis of Gaussian Elimination
- Scaling
- Componentwise Sensitivity Analysis
Sensitivity of Matrices (Condition Number)
Some matrices are very sensitive to small changes in the input data. The extent of this sensitivity is measured by the condition number.
Definition of the matrix condition number:
κ(A) = ||A|| ||A^-1|| (with respect to some matrix norm)
Formula for applying the condition number (first-order bound):
||δx|| / ||x|| <= κ(A) ( ||δA|| / ||A|| + ||δb|| / ||b|| )
- Consider small changes δA and δb in A and b, and the resulting change, δx, in the solution x.
- Changes in the input data get multiplied by the condition number to produce changes in the outputs.
- A high condition number is bad: it implies that small errors in the input can cause large errors in the output.
- A matrix with a condition number close to 1 is called 'well-conditioned'; if its condition number is very large, it is called 'ill-conditioned'.
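As a small numeric illustration (with matrices chosen here, not from the notes), the infinity-norm condition number of a 2×2 matrix can be computed from the explicit inverse formula; a nearly singular matrix has a huge κ, while the identity has κ = 1:

```python
def inf_norm(A):
    """Matrix infinity-norm: maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)

def cond_inf_2x2(A):
    """kappa(A) = ||A|| * ||A^-1|| using the closed-form 2x2 inverse."""
    (a, b), (c, d) = A
    det = a * d - b * c
    Ainv = [[d / det, -b / det], [-c / det, a / det]]
    return inf_norm(A) * inf_norm(Ainv)

well = cond_inf_2x2([[1.0, 0.0], [0.0, 1.0]])      # identity: kappa = 1
ill = cond_inf_2x2([[1.0, 1.0], [1.0, 1.0001]])    # nearly singular: huge kappa
```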
Vector Norm
A vector norm is a tool used to measure the ‘size’ of a vector. More precisely, a vector norm is a function that maps a vector to a non-negative real number, satisfying the following properties:
- If X ≠ 0, then ||X|| > 0.
- ||αX|| = |α| ||X|| for all scalars α.
- ||X + Y|| <= ||X|| + ||Y|| for vectors X and Y (triangle inequality).
Different Types of Norms
1-Norm
||X||_1 = |x_1| + |x_2| + … + |x_n|
2-Norm (Euclidean norm)
||X||_2 = ( x_1^2 + x_2^2 + … + x_n^2 )^(1/2)
Inf-Norm (Maximum norm)
||X||_inf = max_i |x_i|
Hölder Norms (p-norms), the general form of a norm
||X||_p = ( |x_1|^p + … + |x_n|^p )^(1/p), 1 <= p < inf
A-Norm
If A is any fixed n×n positive definite matrix, then ||X||_A = ( X^T A X )^(1/2) defines a norm.
When A is the identity matrix, the A-norm is just the 2-norm.
The norm properties are satisfied by the A-norm:
- X^T A X > 0 for all X ≠ 0 (this is the defining property of a positive definite matrix), so ||X||_A > 0 if X ≠ 0
- ||αX||_A = ( (αX)^T A (αX) )^(1/2) = |α| ( X^T A X )^(1/2), which is |α| ||X||_A
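The norms above are straightforward to sketch in plain Python (the test vector and matrix are arbitrary illustrations):

```python
import math

def norm_1(x):
    return sum(abs(v) for v in x)

def norm_2(x):
    return math.sqrt(sum(v * v for v in x))

def norm_inf(x):
    return max(abs(v) for v in x)

def norm_p(x, p):
    """Hoelder p-norm; p = 1, 2 recover the norms above."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def norm_A(x, A):
    """A-norm: sqrt(x^T A x), defined when A is positive definite."""
    n = len(x)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(x[i] * Ax[i] for i in range(n)))

x = [3.0, -4.0]
# norm_1(x) = 7, norm_2(x) = 5, norm_inf(x) = 4;
# with A = I, the A-norm equals the 2-norm.
```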
Application of Vector Norms
The distance between two vectors X and Y is ||X − Y||.
Convergence of a sequence of vectors
- For a sequence of vectors X^(1), X^(2), X^(3), …
Definition of vector-sequence convergence:
A sequence is said to be convergent if it approaches some limit: X^(k) → X if and only if x_i^(k) → x_i for all components i = 1, 2, …, n.
Equivalently, X^(k) → X if and only if ||X^(k) − X|| → 0 for any vector norm. For a given vector norm ||·||, the size of ||X^(k) − X|| is used to determine whether the sequence of vectors is converging to some limit vector X. The choice of norm is arbitrary, since all norms on R^n are equivalent.
Matrix Norm
A matrix norm has one more property than a vector norm:
- ||AB|| <= ||A|| ||B|| for all n×n matrices A and B (submultiplicativity)
Frobenius Matrix Norm
||A||_F = ( Σ_i Σ_j |a_ij|^2 )^(1/2)
Matrix Norm Induced by a Vector Norm
Given any fixed vector norm ||·||, the matrix norm induced by it is defined by
||A|| = max_{X ≠ 0} ||AX|| / ||X||
- The ratio ||AX|| / ||X|| is the magnification that X undergoes when X is transformed to AX.
- The number ||A|| is the maximum magnification that A can cause.
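For the induced 1-norm and infinity-norm, the maximum magnification has a closed form (maximum absolute column sum and maximum absolute row sum, respectively); a Python sketch with an arbitrary example matrix:

```python
def matrix_norm_1(A):
    """Induced 1-norm: maximum absolute column sum."""
    rows, cols = len(A), len(A[0])
    return max(sum(abs(A[i][j]) for i in range(rows)) for j in range(cols))

def matrix_norm_inf(A):
    """Induced infinity-norm: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in A)

A = [[1.0, -2.0], [3.0, 4.0]]
# column sums: 4 and 6 -> ||A||_1 = 6; row sums: 3 and 7 -> ||A||_inf = 7
```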