Square Root Solution Methods for the Orbit Determination Problem
5.1 INTRODUCTION
In the previous chapter, the solution to the least squares estimation problem including a priori information, P̄ and x̄, is represented in the normal equation form as

(HᵀWH + P̄⁻¹)x̂ = HᵀWy + P̄⁻¹x̄, (5.1.1)

which can be expressed as

Mx̂ = N, (5.1.2)

where M = HᵀWH + P̄⁻¹ and N = HᵀWy + P̄⁻¹x̄. The solution for x̂ is obtained by computing the inverse of M. In practice, computational problems are encountered in forming and inverting the matrix M = HᵀWH + P̄⁻¹. An orthogonal transformation approach can be used to write Eq. (5.1.2) in the form

Rx̂ = b,
where R is an n × n upper triangular matrix (R is not the observation error covariance matrix). Then x̂ can be obtained by backward substitution. This approach has the advantage that increased numerical accuracy is achieved over conventional matrix inversion methods for solving Eq. (5.1.2). Using the orthogonal transformation approach, accuracy can be achieved with a single-precision computation that is equal to the accuracy obtained by inverting M with the normal equation approach using double-precision computations.
Solution Methods
The normal equation approach has several operational and conceptual advantages that have led to the widespread adoption of this technique for many operational orbit determination systems. Both the normal equation and orthogonal transformation approaches are described in the subsequent sections. The solution of the linear system

Mx̂ = N

is expressed as

x̂ = M⁻¹N, (5.1.4)
where the operation M⁻¹N implies that the inverse of the (n × n) matrix M is computed and then postmultiplied by the column vector N. A solution based on the Cholesky decomposition of M is more efficient and, in most cases, more accurate. The Cholesky decomposition is applicable only if M is symmetric and positive definite, a condition satisfied for the case considered here. The following discussion first outlines the Cholesky decomposition and describes this approach to solving the normal equations. Then a discussion of solutions based on orthogonal transformations is presented in subsequent sections.
5.2 CHOLESKY DECOMPOSITION
Let M be a symmetric positive definite matrix, and let R be an upper triangular matrix computed such that

M = RᵀR. (5.2.1)

(Note that R is not the observation error covariance matrix.) Then Eq. (5.1.4) can be expressed as

RᵀRx̂ = N.

Defining z = Rx̂ gives the pair of triangular systems

Rᵀz = N, (5.2.2)
Rx̂ = z, (5.2.3)
where Rᵀ is lower triangular. The components of z can be determined using a forward recursion relation. Then, Eq. (5.2.3) can be solved using a backward recursion to obtain the elements of x̂.
The elements of the estimation error covariance matrix, P = (HᵀWH + P̄⁻¹)⁻¹ = M⁻¹, can be obtained from the condition

P = M⁻¹ = (RᵀR)⁻¹ = SSᵀ,

where S, the inverse of the upper triangular matrix R, can be computed by an efficient backward recursion.
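The solution sequence just described, factoring M, a forward recursion for z, a backward recursion for x̂, and S = R⁻¹ for the covariance, can be sketched in a few lines. This is a minimal illustration assuming NumPy is available; the function name is illustrative, not part of the text's algorithm:

```python
import numpy as np

def cholesky_solve(M, N):
    """Solve the normal equations M x = N via M = R^T R (sketch)."""
    # np.linalg.cholesky returns lower-triangular L with M = L L^T,
    # so R = L^T is the upper triangular factor used in the text.
    L = np.linalg.cholesky(M)
    R = L.T
    # Forward recursion (Eq. 5.2.2): solve R^T z = N; R^T is lower triangular.
    z = np.linalg.solve(R.T, N)
    # Backward recursion (Eq. 5.2.3): solve R x = z.
    x = np.linalg.solve(R, z)
    # Covariance P = M^{-1} = S S^T with S = R^{-1}.
    S = np.linalg.inv(R)
    P = S @ S.T
    return x, P
```

In a real implementation the two triangular systems would be solved by dedicated forward/backward substitution routines rather than a general solver; the structure above only mirrors the steps of the text.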
Equation (5.2.1) represents a set of (n² + n)/2 equations for the (n² + n)/2 unknown elements of the upper triangular matrix R. The expression for the Cholesky decomposition of M is obtained by expanding Eq. (5.2.1) and solving term by term for the elements of R (e.g., R₁₁ is determined in terms of the elements of M).
Given the elements of the n × n positive definite matrix M and the n × 1 column vector N, the following Cholesky algorithm will yield a solution for the elements of R, z, x̂, and the upper triangular matrix S. Step 1 of the following algorithm determines the elements of R and the vector z. Steps 2 and 3 perform a backward recursion to form x̂ and the elements of the matrix S = R⁻¹.
5.2.1 The Cholesky Algorithm
The Cholesky algorithm for R is derived easily by equating the elements of RᵀR to the known elements of M. For example, expanding RᵀR shows that R₁₁ = √M₁₁.
The elements of z are obtained from an expansion of Eq. (5.2.4):
The elements of S are obtained from an expansion of SR = I.
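The term-by-term expansion of M = RᵀR can be transcribed directly into code. The sketch below is a plain, nonblock transcription of that expansion (the function name is illustrative, and no attention is paid to efficiency):

```python
import math

def cholesky_upper(M):
    """Element-by-element Cholesky decomposition M = R^T R,
    with R upper triangular (nonblock sketch)."""
    n = len(M)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Diagonal term: R_ii = sqrt(M_ii - sum_{k<i} R_ki^2)
        s = M[i][i] - sum(R[k][i] ** 2 for k in range(i))
        R[i][i] = math.sqrt(s)
        # Off-diagonal terms of row i: R_ij for j > i
        for j in range(i + 1, n):
            s = M[i][j] - sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = s / R[i][i]
    return R
```

For a 2 × 2 example, M = [[4, 2], [2, 3]] yields R₁₁ = 2, R₁₂ = 1, R₂₂ = √2, which reproduces M when RᵀR is formed.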
Examples of the application of this algorithm are given in Sections 5.6.1 and 5.6.5.
This Cholesky algorithm is a nonblock algorithm; that is, it does not use matrix-matrix multiplication. Because matrix-matrix multiplication achieves many more floating-point operations per second than matrix-vector operations on modern computers, a block Cholesky algorithm often will be faster than a nonblock version. In fact, the increase in speed may be a factor of three or more. See Golub and Van Loan (1996) for examples of a block Cholesky algorithm.
5.2.2 The Square Root Free Cholesky Algorithm
Notice that calculation of the diagonal elements of R requires that n square roots be taken. Computationally, the square root operation is expensive compared to multiplication, division, addition, or subtraction; hence, it is desirable to avoid square roots if possible. A square root free Cholesky algorithm may be developed by defining

M = UDUᵀ, (5.2.10)

where U is unit upper triangular (i.e., has ones on the diagonal) and D is a diagonal matrix. As in the previous section, Eq. (5.2.10) represents (n² + n)/2 equations in the (n² + n)/2 unknown elements of U and D. The algorithm for the elements of U and D is obtained by expanding Eq. (5.2.10).
The procedure is to set j = n and cycle through the algorithm for i = 1, …, n − 1, solving for Dₙ and the elements of Uᵢₙ (i.e., the last column of U). Then set j = n − 1 and cycle through for i = 1, …, n − 2, solving for Dₙ₋₁ and the (n − 1)st column of U. Repeat this procedure for the remaining values of j.
Knowing U and D, it is possible to solve for x̂ from Eq. (5.1.2) and Eq. (5.2.10) as follows. Note that

Mx̂ = UDUᵀx̂ = N.

Defining

z = DUᵀx̂ (5.2.14)

gives

Uz = N. (5.2.15)
The intermediate vector z can be determined from Eq. (5.2.15). The solution is a backward substitution (i.e., we solve for zₙ, then zₙ₋₁, …, z₁).
Then the elements of x̂ can be determined from Eq. (5.2.14) by using a forward substitution.
The associated estimation error covariance is obtained from

P = M⁻¹ = (UDUᵀ)⁻¹ = U⁻ᵀD⁻¹U⁻¹,

where the inverse of the unit upper triangular matrix U can be computed by an efficient backward recursion.
Note that none of the algorithms required to compute x̂ involves the calculation of a square root.
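The full square root free procedure, factoring M = UDUᵀ column by column from the last column, then backward substitution for z and forward substitution for x̂, can be sketched as follows. This is an illustrative transcription under the assumptions above (the function name is hypothetical); note that no square root call appears anywhere:

```python
def udu_solve(M, N):
    """Square-root-free solution of M x = N via M = U D U^T,
    U unit upper triangular, D diagonal (sketch)."""
    n = len(M)
    U = [[float(i == j) for j in range(n)] for i in range(n)]
    D = [0.0] * n
    # Factor: work through columns j = n, n-1, ..., 1 (last column first).
    for j in range(n - 1, -1, -1):
        D[j] = M[j][j] - sum(D[k] * U[j][k] ** 2 for k in range(j + 1, n))
        for i in range(j):
            s = M[i][j] - sum(D[k] * U[i][k] * U[j][k] for k in range(j + 1, n))
            U[i][j] = s / D[j]
    # Backward substitution (Eq. 5.2.15): solve U z = N, z_n first.
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = N[i] - sum(U[i][j] * z[j] for j in range(i + 1, n))
    # Forward substitution (Eq. 5.2.14): solve D U^T x = z, x_1 first.
    x = [0.0] * n
    for i in range(n):
        x[i] = z[i] / D[i] - sum(U[j][i] * x[j] for j in range(i))
    return x
```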
5.3 LEAST SQUARES SOLUTION VIA ORTHOGONAL TRANSFORMATION
An alternative approach that avoids some of the numerical problems encountered in the normal equation approach is described in the following discussion. The method obtains the solution by applying successive orthogonal transformations to the information array, (H, y). Enhanced numerical accuracy is obtained by this approach. Consider the quadratic performance index J(x), the weighted sum of squares of the observation errors, e = y − Hx (for the moment we will assume no a priori information; i.e., P̄⁻¹ = 0, x̄ = 0):
J(x) = eᵀWe = ‖W^(1/2)(Hx − y)‖² = (Hx − y)ᵀW(Hx − y). (5.3.1)
If W is not diagonal, W^(1/2) can be computed by the Cholesky decomposition, or the prewhitening transformation described at the end of Section 5.7.1 can be applied so that W = I. For notational convenience we are using −e in Eq. (5.3.1).
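For a full (non-diagonal) W, one concrete choice of W^(1/2) comes from the Cholesky factorization W = LLᵀ: taking V = Lᵀ gives VᵀV = W, so ‖V(Hx − y)‖² equals the weighted sum of squares and the transformed problem (VH, Vy) has identity weighting. A minimal sketch assuming NumPy (`prewhiten` is an illustrative name):

```python
import numpy as np

def prewhiten(H, y, W):
    """Transform (H, y) so the weight matrix becomes I (sketch).
    W must be symmetric positive definite."""
    L = np.linalg.cholesky(W)   # W = L L^T
    V = L.T                     # V^T V = W, so V acts as W^(1/2)
    return V @ H, V @ y         # transformed information array
```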
The solution to the least squares estimation problem (as well as the minimum variance and the maximum likelihood estimation problem, under certain restrictions) is obtained by finding the value x̂ that minimizes the performance index J(x). To achieve the minimum value of J(x), we introduce the m × m orthogonal matrix Q. An orthogonal matrix has the following properties:
1. QQᵀ = QᵀQ = I.
2. Q⁻¹ = Qᵀ.
3. If Q1 and Q2 are orthogonal matrices, then so is Q1Q2.
4. For any vector x, ‖Qx‖ = ‖x‖; multiplying by Q does not change the Euclidean norm of a vector.
5. If e is an m-vector of random variables with e ∼ (0, I) (i.e., E(e) = 0 and E(eeᵀ) = I), then ẽ = Qe has the same properties, since E(ẽ) = QE(e) = 0 and E(ẽẽᵀ) = QE(eeᵀ)Qᵀ = QQᵀ = I.
It follows from property 4 that Eq. (5.3.1) can be expressed as

J(x) = ‖QW^(1/2)(Hx − y)‖². (5.3.5)
Select Q such that

QW^(1/2)H = [ R ]
            [ O ], (5.3.6)

and define

QW^(1/2)y = [ b ]
            [ e ],

where

R is an n × n upper triangular matrix of rank n,
O is an (m − n) × n null matrix,
b is an n × 1 column vector, and
e is an (m − n) × 1 column vector.
The results given by Eq. (5.3.6) assume that m > n and H is of rank n. Using Eq. (5.3.6), Eq. (5.3.5) can be written as
Expanding leads to

J(x) = ‖Rx − b‖² + ‖e‖². (5.3.8)
Only the first term in Eq. (5.3.8) is a function of x, so the value of x that minimizes J(x) is obtained by requiring that

Rx̂ = b,

and the minimum value of the performance index becomes (equating J(x) in Eq. (5.3.1) and Eq. (5.3.8))

J(x̂) = ‖e‖².
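The whole orthogonal-transformation solution can be sketched end to end, with a library QR factorization standing in for the sequence of orthogonal transformations (in practice these are Householder reflections, discussed later in the chapter). Function and variable names are illustrative:

```python
import numpy as np

def qr_least_squares(H, y, W=None):
    """Sketch of the orthogonal-transformation least squares solution.
    Assumes m > n and H of rank n."""
    m, n = H.shape
    if W is not None:
        # Prewhiten: W = L L^T, use L^T as W^(1/2) so the weight becomes I.
        L = np.linalg.cholesky(W)
        H, y = L.T @ H, L.T @ y
    # 'complete' mode returns the full m x m orthogonal factor; numpy's
    # H = Q R means Q^T plays the role of the text's Q: Q^T H = [R; O].
    Q, R_full = np.linalg.qr(H, mode="complete")
    R = R_full[:n, :]            # n x n upper triangular block
    c = Q.T @ y
    b, e = c[:n], c[n:]          # b drives the solution; ||e||^2 = min J
    x = np.linalg.solve(R, b)    # backward substitution (R is triangular)
    return x, e @ e              # estimate x_hat and minimum J(x_hat)
```

The key point of the method survives in the sketch: x̂ comes from the triangular system Rx̂ = b without ever forming HᵀWH, and the leftover block e delivers the minimum of the performance index for free.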