Convergence properties of ART and SOR algorithms

Numer. Math. 59, 91–106 (1991)
Numerische Mathematik © Springer-Verlag 1991

L. Elsner¹, I. Koltracht² and P. Lancaster³

¹ Institute of Mathematics, University of Bielefeld, W-4800 Bielefeld, Federal Republic of Germany
² Department of Mathematics, University of Connecticut, Storrs, CT 06268, USA
³ Department of Mathematics and Statistics, University of Calgary, Calgary, Alberta T2N 1N4, Canada

Received November 4, 1988

Summary. ART algorithms with relaxation parameters are studied for general (consistent or inconsistent) linear algebraic systems Rx = f, and a general convergence theorem is formulated. The advantage of severe underrelaxation is re-examined and clarified. The relationship to solutions obtained by applying SOR methods to the equation RRᵀy = f is investigated.

Subject classifications: AMS(MOS): 65F10, 15A06; CR: G1.3

Offprint requests to: P. Lancaster. The work of this author was supported by a research grant from the Natural Sciences and Engineering Research Council of Canada.

1. Introduction

In this paper we are concerned with iterative methods related to the ubiquitous equation Rx = f, where R is a given m × n real matrix (i.e., R ∈ ℝ^(m×n)) and f is a given real vector of length m (i.e., f ∈ ℝ^m). More specifically, we are concerned with the cases when m and n are so large that storing the coefficients of R in a computer can be a problem, but the matrix has the property that its entries are easily generated as required. The equation Rx = f may be consistent or inconsistent and, in the applications we have in mind, is generally inconsistent.

Problems of this kind arise in a variety of applications that come under the broad heading of computerized tomography with incomplete data (see [DL], [He2] and [Na], for example). They have given rise to the development of a great variety of iterative algorithms based on updating an approximate solution vector with a suitable linear combination of some of the rows (or just one row) of R. Prototypical among these algorithms is the Algebraic Reconstruction Technique (ART), which we briefly review together with some relaxation strategies.

Let r₁ᵀ, ..., r_mᵀ (ᵀ denotes transposition) be the rows of R, and fᵀ = (f₁, f₂, ..., f_m). With a given starting vector x^(0) ∈ ℝⁿ we generate the sequence {x^(k)} by the relation

(1.1)  x^(k) = x^(k−1) + ω_k ‖r_k‖⁻² (f_k − r_kᵀ x^(k−1)) r_k,  k = 1, 2, ...,

where, when k ≡ j (mod m) and j ∈ {1, 2, ..., m},

(1.2)  r_k = r_j,  f_k = f_j.

Note that the scalar f_k − r_kᵀx^(k−1) appearing in (1.1) is just the residual error in the k-th equation. The norm used in (1.1) and elsewhere is euclidean, and the real numbers ω_k, k ≥ 1, are a sequence of relaxation parameters in the interval (0, 2). With ω_k ≡ 1 we obtain the classical ART algorithm originating with Kaczmarz [K] in 1937.

Algorithms of this kind can be obtained in several different ways, including minimization of residual vectors, filtering, gradient methods, or projection methods (see [He2] and [Tr1], for example; see also [EHL] for a treatment of these and other schemes). There are also several variations that can be played on the theme of equations (1.1) and (1.2), including Richardson (or SIRT) methods [Gi, He2, I] and constrained systems [He1, He2]. We confine attention to processes of type (1.1), (1.2) and also related Successive Overrelaxation (SOR) methods for the system

(1.3)  RRᵀy = f

(see [SB] and [Na], for example).

Severe underrelaxation has been found to be beneficial in practice (see [HLL] and [He2]), and this has been explained to some extent by theory (see [I], [CEG] and [Na]) when {ω_k} is a constant sequence. More general underrelaxation strategies have been investigated in [Tr1], [Tr2] and [B], in which ω_k depends only on the quotient obtained on division of k by m in (1.1). We are concerned with a cyclic choice of m parameters ω₁, ..., ω_m. Thus, as in (1.2),

(1.4)  ω_k = ω_j when k ≡ j (mod m),

with the special case ω₁ = ⋯ = ω_m = ω of a fixed relaxation parameter.

We first present a short and self-contained proof of the convergence of ART with our admissible relaxation strategy. The main new feature here is the representation of the limit point(s) in terms of the residual vector f − Rx^I, where x^I = R^I f is the best approximate solution in the ℓ₂ sense and R^I is the Moore–Penrose generalized inverse of R.
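The row-action update (1.1)–(1.2) is easy to state in code. The following is a minimal NumPy sketch (the matrix, right-hand side and parameter values are illustrative choices, not from the paper): each step moves the current iterate toward the hyperplane of one equation, scaled by the cyclic relaxation parameter ω_j.

```python
import numpy as np

def art_sweeps(R, f, omega, x0=None, sweeps=100):
    """Kaczmarz/ART iteration (1.1)-(1.2): one row update per step,
    cycling through the rows of R with cyclic relaxation parameters
    omega = (omega_1, ..., omega_m), each in (0, 2)."""
    m, n = R.shape
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float).copy()
    row_norms2 = np.sum(R * R, axis=1)           # ||r_j||^2
    for _ in range(sweeps):
        for j in range(m):                        # one full sweep through the rows
            residual = f[j] - R[j] @ x            # residual of the j-th equation
            x = x + omega[j] * residual / row_norms2[j] * R[j]
    return x

# Illustrative consistent 3x2 system: ART converges to the unique solution.
R = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_true = np.array([1.0, 2.0])
f = R @ x_true
x = art_sweeps(R, f, omega=[1.0, 1.0, 1.0], sweeps=200)
assert np.allclose(x, x_true, atol=1e-8)
```

Since this system is consistent and R has full column rank, the iterates approach the unique solution regardless of the relaxation parameters chosen in (0, 2); the inconsistent case is the subject of Theorem 1 below.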
This argument admits immediate generalization to a problem posed in Hilbert space (see the Appendix). It demonstrates the robustness of these algorithms, in the sense of independence from the consistency of the original equation and from the choice of initial vector x^(0). Our analysis also admits an improved explanation of the advantage to be gained by underrelaxation.

Our second major topic is the connection of ART algorithms with solutions of Rx = f obtained by applying SOR methods to equation (1.3). In contrast with the algorithms of ART type, it has been shown by O'Carroll [O'C] that the SOR algorithms actually diverge if the system Rx = f is inconsistent (f ∉ Im R). This suggests a serious disadvantage of the SOR strategy, in the form of potential error accumulation, and is discussed in our formalism in Section 5.

Consider the n × n matrices Q₁, ..., Q_m defined by

(1.5)  Q_j = I − ω_j ‖r_j‖⁻² r_j r_jᵀ,

and note that ‖Q_j‖ ≤ 1 as long as ω_j ∈ (0, 2). It is easily seen that the iteration matrix for algorithms using equations (1.1), (1.2) and (1.4) is

(1.6)  A(ω) = Q_m ⋯ Q₂Q₁,

where ω denotes the vector of relaxation parameters (ω₁, ..., ω_m). Thus ‖A(ω)‖ ≤ 1. Furthermore, if ρ ⊥ span{r₁, ..., r_m} = Im Rᵀ, i.e., ρ ∈ Ker R, then A(ω)ρ = ρ, and we must generally admit that 1 ∈ σ(A(ω)), the spectrum of A(ω). It will be seen (and is well known) that the speed of convergence depends on the quantity

(1.7)  γ(A) = max{ |λ| : λ ∈ σ(A), λ ≠ 1 },

which is usually the magnitude of the sub-dominant eigenvalue of A. This quantity γ(A) will depend on ω as well as on the ordering of the rows of R, and will also be investigated via the SOR connection.

In the paper by Smith, Solmon and Wagner [SSW], a bound is given on the norm of A restricted to the orthogonal complement of its spectral subspace corresponding to the eigenvalue 1 — say c(A) — in the case ω_k ≡ 1. Since γ(A) ≤ c(A) ≤ 1, the minimization of this bound can be used for finding certain suboptimal orderings of the rows of R (see [HS]). Here we wish to emphasize the importance of working with γ(A) rather than its upper bound c(A). A generalization of the bound c(A) for products of paracontracting matrices (which include the matrices Q_j) is given by Nelson and Neumann [NN], and the quantity c(A) is used by Natterer [Na] in a careful study of the effects of underrelaxation.

2. ART algorithms

We consider algorithms of the form (1.1), (1.2), (1.4) in which x^(0) is chosen arbitrarily. Three lines of argument can be found in the literature. One seems to originate with Tanabe [Ta], and we give a short, self-contained proof in that style (see also [CEG]); our proof also admits generalization to a Hilbert space setting, which we discuss briefly in an Appendix to this paper. Another line of argument depends on a general theorem of Halperin [Ha] concerning the powers of a product of projection operators; this is used by Natterer [Na], for example, for the consistent problem with x^(1) ∈ Im Rᵀ. The third approach uses the SOR connection and will be discussed below (see also [BE], [Na], for example).

For s = 0, 1, 2, ..., write x̂^(s) = x^(sm), the iterate obtained after s complete sweeps through the rows of R. It follows from (1.1) that

(2.1)  x̂^(s+1) = A(ω) x̂^(s) + K(ω) f,  s = 0, 1, 2, ...,

where A(ω) is defined by (1.5) and (1.6). Also,

(2.2)  K(ω) = Σ_{j=1}^{m} ω_j ‖r_j‖⁻² (Q_m Q_{m−1} ⋯ Q_{j+1}) r_j e_jᵀ,

where e_j denotes the j-th unit coordinate vector in ℝ^m and, when j = m, we put Q_m ⋯ Q_{j+1} = I. It is easily verified that

(2.3)  K(ω) R = I − A(ω).

Furthermore, if ω₁ = ω₂ = ⋯ = ω_m = ω, then as ω → 0,

(2.4)  K(ω) = ω Rᵀ D + O(ω²),  where D = diag(‖r₁‖⁻², ..., ‖r_m‖⁻²).

Now let P be the orthogonal projector onto Im Rᵀ (along Ker R); thus I − P is the projector onto Ker R along Im Rᵀ. The results of the first two lemmas are familiar (see [Ta] or [EHL]). The proof of the first is included for completeness, and because it is short. The proof of the second is new and, being more technical, is relegated to the Appendix.

Lemma 1. Im Rᵀ and Ker R are invariant under A(ω).

Proof. Let P_j be the orthogonal projector onto span{r_j}; then in (1.5) we have Q_j = I − ω_j P_j, j = 1, 2, ..., m, and from (1.6) we deduce that Im(I − A(ω)) ⊆ Im P₁ + ⋯ + Im P_m = Im Rᵀ.
Thus Im(I − A(ω)) ⊆ Im Rᵀ and, since I − P annihilates Im Rᵀ, (I − P)(I − A(ω)) = 0, or (I − P)A(ω) = I − P. But R(I − P) = 0, and so equation (2.3) gives

(2.5)  A(ω)(I − P) = I − P.

Combining the last two identities, A(ω)P = PA(ω), as required. ∎

Lemma 2. If ω_j ∈ (0, 2) for j = 1, 2, ..., m, then ‖A(ω)‖ ≤ 1 and ‖A(ω)P‖ < 1.

This lemma requires a short technical proof that will be presented in the Appendix, in a more general Hilbert space context. It is clear from the lemmas and equation (1.6) that the spectrum of the restriction of A(ω) to Im P = Im Rᵀ lies inside the open unit disc, while the restriction to Im(I − P) = Ker R has spectrum only at the point 1. In fact, it is not difficult to show that Ker R = Ker(I − A) (see Corollary 4 of [Ta]).

We may now prove the convergence theorem. Recall that R^I denotes the Moore–Penrose inverse of R, and we write x^I = R^I f. The corresponding residual vector is

(2.6)  g := f − Rx^I = (I − RR^I) f,

and we note that, because RR^I is the orthogonal projector onto Im R, g ∈ (Im R)^⊥ = Ker Rᵀ, whatever f may be.

Theorem 1. Let ω_j ∈ (0, 2) for j = 1, 2, ..., m, and let x^(0) be an arbitrary vector in ℝⁿ. Then the sequence {x̂^(s)} defined by (1.1) is convergent and

(2.7)  lim_{s→∞} x̂^(s) = x^I + (I − P)x^(0) + (I − PA(ω))⁻¹ K(ω) g,

where P is the orthogonal projector onto Im Rᵀ and g = f − Rx^I is the residual vector.

Proof. It follows from (2.1) that

x̂^(s) = A^s(ω) x^(0) + (I + A(ω) + ⋯ + A^{s−1}(ω)) K(ω) f,

and, from (2.3) together with f = Rx^I + g,

K(ω) f = K(ω) R x^I + K(ω) g = (I − A(ω)) x^I + K(ω) g.

Hence

x̂^(s) − x^I = A^s(ω)(x^(0) − x^I) + (I + A(ω) + ⋯ + A^{s−1}(ω)) K(ω) g.

In this equation, use (2.5) to write A^s(ω) = (A(ω)P)^s + (I − P); since Px^I = x^I (because x^I ∈ Im Rᵀ), it follows that

A^s(ω)(x^(0) − x^I) = (A(ω)P)^s (x^(0) − x^I) + (I − P)x^(0).

Then observe that, for j = 1, 2, ..., m, PP_j = P_jP = P_j, so that PQ_j = Q_jP and hence PK(ω) = K(ω). It follows that A^j(ω)K(ω) = (A(ω)P)^j K(ω), and so

x̂^(s) = x^I + (I − P)x^(0) + (A(ω)P)^s (x^(0) − x^I) + Σ_{j=0}^{s−1} (A(ω)P)^j K(ω) g.

Now use Lemma 2 and take the limit as s → ∞: the term (A(ω)P)^s(x^(0) − x^I) tends to zero, and the series converges to (I − PA(ω))⁻¹K(ω)g (recall A(ω)P = PA(ω) by Lemma 1). This gives (2.7). ∎

3. Discussion of the theorem

(a) In general, the algorithm of Theorem 1 is applied in order to find or estimate x^I. First observe that, by choosing x^(0) ∈ Im Rᵀ (a linear combination of r₁, ..., r_m), we have (I − P)x^(0) = 0 in equation (2.7), and the limit is therefore independent of x^(0). Now it is easily seen (and already well known) that each subsequence of {x^(j)} obtained by taking indices j that are congruent mod m will be convergent. Thus the ART algorithm of equations (1.1), (1.2) and (1.4) converges cyclically from any initial x^(0) ∈ Im Rᵀ, and the limits will depend on the ordering of the rows of R. Let Π denote the set of all permutations of the indices 1, 2, ..., m and, for any π ∈ Π, let b_π = ‖(I − PA_π(ω))⁻¹ P K_π(ω)‖, where the subscript π indicates the quantities generated as in (2.1) after applying π to the rows of R. It follows from (2.7) that, when x^(0) ∈ Im Rᵀ, all cyclic limits of the ART algorithm lie in the sphere with centre x^I and radius b‖g‖, where b = max_{π∈Π} b_π.

(b) Consider now the behaviour of the limit in (2.7) as ω → 0, where ω₁ = ⋯ = ω_m = ω. Writing I − PA(ω) = I − P + ωA₁ + O(ω²), where A₁ = Σ_{j=1}^m ‖r_j‖⁻² r_j r_jᵀ (so that A₁P = A₁ and Im A₁ = Im P), there are matrices B_j, j = −k, −k+1, ..., such that

(3.1)  (B_{−k}ω^{−k} + ⋯ + B_{−1}ω^{−1} + B₀ + B₁ω + ⋯)(I − P + ωA₁ + ⋯) = I

for sufficiently small ω ≠ 0. If k > 1 then, equating coefficients of ω^{−k} and ω^{−k+1}, B_{−k}(I − P) = 0 and B_{−k}A₁ + B_{−k+1}(I − P) = 0. Multiplying the latter on the right by P and using A₁P = A₁, this implies B_{−k}A₁ = 0. But Im A₁ = Im P, and so Im P ⊆ Ker B_{−k}, i.e., B_{−k}P = 0. Then B_{−k} = B_{−k}P + B_{−k}(I − P) = 0. Thus k = 1 and, in a deleted neighbourhood of ω = 0, we have an expansion

(3.2)  (I − PA(ω))⁻¹ = B_{−1}ω⁻¹ + B₀ + B₁ω + ⋯

with B_{−1} ≠ 0.

Proposition. Let x^(0) ∈ Im Rᵀ. Then, as ω → 0,

(3.3)  lim_{s→∞} x̂^(s) = x^I + B_{−1}RᵀDg + O(ω),

and B_{−1}RᵀDg = 0 if and only if Dg ∈ Ker Rᵀ.

Proof. It is easily seen from (3.1) that B_{−1}A₁ = P = A₁B_{−1}; in particular, since Im A₁ = Im Rᵀ, B_{−1}Rᵀx = 0 if and only if x ∈ Ker Rᵀ. Using (3.2) and (2.4) in (2.7), we obtain (3.3). Furthermore, B_{−1}RᵀDg = 0 if and only if Dg ∈ Ker Rᵀ. ∎

Natterer considers this limiting case when Rx = f is consistent, i.e., f ∈ Im R (see [Na]). With this hypothesis it follows from (2.6) that g = 0, and so B_{−1}RᵀDg = 0 in (3.3). Now, in any case, g ∈ Ker Rᵀ, so B_{−1}RᵀDg = 0 provided D = I — in other words, when the rows of R are normalized to have unit length. This is the case considered by Censor, Eggermont and Gordon [CEG].

(c) Now it is easily seen that, if the starting vector x^(0) is fixed, the sequence {x^(j)} (and hence the subsequence {x̂^(s)}) is invariant under row scaling of the system (R, f). Thus the image produced is independent of the scaling, and (3.3) implies that x^I + B_{−1}RᵀDg = x̃^I, the best approximate solution after row normalization, which is approached in the limit ω → 0 with or without normalization.
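Theorem 1 lends itself to a direct numerical check. The sketch below (NumPy; the 4 × 3 matrix and inconsistent right-hand side are illustrative choices, not from the paper) builds A(ω) and K(ω) from (1.5), (1.6) and (2.2), verifies the identity K(ω)R = I − A(ω) of (2.3), and compares the iterates of (2.1) with the limit formula (2.7).

```python
import numpy as np

# Illustrative inconsistent system: rows e1, e2, e3 and (1, 1, 1).
R = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
f = np.array([1.0, 2.0, 3.0, 10.0])    # f is not in Im R
m, n = R.shape
omega = 0.8                             # single relaxation parameter in (0, 2)

# Q_j = I - omega ||r_j||^-2 r_j r_j^T, equation (1.5)
Qs = [np.eye(n) - omega * np.outer(R[j], R[j]) / (R[j] @ R[j]) for j in range(m)]

# A(omega) = Q_m ... Q_2 Q_1, equation (1.6)
A = np.eye(n)
for Q in Qs:
    A = Q @ A

# K(omega): column j is omega ||r_j||^-2 Q_m ... Q_{j+1} r_j, equation (2.2)
K = np.zeros((n, m))
for j in range(m):
    v = omega * R[j] / (R[j] @ R[j])
    for Q in Qs[j + 1:]:
        v = Q @ v
    K[:, j] = v

assert np.allclose(K @ R, np.eye(n) - A)        # identity (2.3)

xI = np.linalg.pinv(R) @ f                      # best approximate solution
g = f - R @ xI                                  # residual vector, equation (2.6)
P = R.T @ np.linalg.pinv(R.T)                   # orthogonal projector onto Im R^T

# Limit formula (2.7) with x0 = 0, so the (I - P)x0 term vanishes.
limit = xI + np.linalg.solve(np.eye(n) - P @ A, K @ g)

x = np.zeros(n)
for _ in range(300):                            # sweeps of (2.1)
    x = A @ x + K @ f
assert np.allclose(x, limit, atol=1e-8)
```

For this example x^I = (2, 3, 4)ᵀ and g = (−1, −1, −1, 1)ᵀ ≠ 0, so the system is genuinely inconsistent; the iterates nevertheless converge, to x^I plus the correction term of (2.7).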
Note that x̃^I = (D^{1/2}R)^I D^{1/2} f and, in general, x̃^I ≠ x^I because (D^{1/2}R)^I ≠ R^I D^{−1/2}. As one might expect, the best approximate solution of Rx = f depends on the relative sizes of the residuals r_jᵀx − f_j, j = 1, 2, ..., m. The effects of row normalization on SIRT algorithms have been discussed by van der Sluis and van der Vorst [VV].

(d) These results show that, when ‖g‖ is small in an appropriate sense, the cluster of limit points described in item (a) above will have a diameter that decreases to zero as ω → 0. Indeed, all limit points converge to x̃^I as ω → 0.

(e) For any given ordering of the rows of R, the speed of convergence of the ART algorithm will depend on the choice of the relaxation parameters ω_j ∈ (0, 2). The optimal ω minimizes the spectral radius of the restriction of A(ω) to Im Rᵀ, i.e., γ(A(ω)). As we have noted, it is not sufficient to minimize the norm of this restriction, c(A(ω)), which provides only an upper bound for γ(A(ω)). This will be demonstrated explicitly for the simplest nontrivial example in Section 6, i.e., when m = 2, a case that we introduce here. Using a single parameter ω, we show that in this case c(A(ω)) ≥ c(A(1)) when ω ∈ (0, 2), a property that is not shared by the function γ(A(ω)).

Example. Let m = 2 and A(ω) = (I − ωr₂r₂ᵀ)(I − ωr₁r₁ᵀ), where ‖r₁‖ = ‖r₂‖ = 1. It is known ([GV], §12.4) that the singular values of A(1) are 1 (repeated n − 2 times), |r₁ᵀr₂| and 0. Since the spectrum of A(1) coincides with the spectrum of

(I − r₂r₂ᵀ)(I − r₁r₁ᵀ)(I − r₁r₁ᵀ)(I − r₂r₂ᵀ) = A(1)A(1)ᵀ

(using the idempotency of the factors and σ(XY) = σ(YX)), the eigenvalues of A(1) are 1 (repeated n − 2 times), (r₁ᵀr₂)² and 0. Thus c(A(1)) = |r₁ᵀr₂| and γ(A(1)) = (r₁ᵀr₂)². Using the minimax characterization of singular values [GV], it can be shown that c(A(ω)) ≥ c(A(1)) for any ω ∈ (0, 2). In contrast, it will be seen in Section 5 that, unless r₁ ⊥ r₂, γ(A(ω)) is not minimized at ω = 1.

4. An equivalent characterization of A(ω)

In this section a second characterization of A(ω) is derived, where A(ω) is defined by equations (1.5) and (1.6) with ω₁ = ⋯ = ω_m = ω. Also, we assume that the rows of R are normalized, so that

A(ω) = (I − ωr_mr_mᵀ) ⋯ (I − ωr₁r₁ᵀ).

Since γ(A(ω)) = γ(Aᵀ(ω)), we consider Aᵀ(ω) for convenience. Clearly there are (possibly non-unique) numbers β_ij, 1 ≤ i, j ≤ m, such that

(4.1)  r_iᵀAᵀ(ω) = r_iᵀ − (β_{i1}r₁ᵀ + ⋯ + β_{im}r_mᵀ),  i = 1, ..., m.

It is straightforward to check that the β_ij defined by the following recursive relations satisfy (4.1):

β_{i1} = ωc_{i1},
β_{i2} = ω(c_{i2} − β_{i1}c_{12}),
⋮
β_{im} = ω(c_{im} − β_{i1}c_{1m} − ⋯ − β_{i,m−1}c_{m−1,m}),

where c_ij = r_iᵀr_j = c_ji. Denote M(ω) = (β_ij)_{i,j=1}^m and write the Gram matrix RRᵀ = I + L + Lᵀ, where L is the strictly lower triangular part of RRᵀ, i.e., (L)_{ij} = c_{ji} for i > j and zero otherwise. Then it follows from the recursive relations for the β_ij that

(4.2)  M(ω) = ωRRᵀ − ωM(ω)Lᵀ.

Hence

(4.3)  M(ω) = ωRRᵀ(I + ωLᵀ)⁻¹.

Subtracting both sides of (4.3) from I and using RRᵀ = I + L + Lᵀ, we obtain the triangular factorization

(4.4)  I − M(ω) = ((1 − ω)I − ωL)(I + ωLᵀ)⁻¹.

Given the importance of the function γ(A(ω)), the significance of the matrix M(ω) for our analysis is apparent from the next theorem.

Theorem 2. For any number λ ≠ 1, λ ∈ σ(A(ω)) if and only if λ ∈ σ(I − M(ω)).

Proof. The relation (4.1) can be written in the form

(4.5)  RAᵀ(ω) = (I − M(ω))R.

Suppose λ ≠ 1 and λ ∈ σ(Aᵀ(ω)). Then there is an x ≠ 0 such that Aᵀ(ω)x = λx. It follows that Rx ≠ 0, for otherwise r_iᵀx = 0 for each i and hence Aᵀ(ω)x = x, contrary to hypothesis. Thus it follows from (4.5) that (I − M(ω))Rx = λRx and λ ∈ σ(I − M(ω)).

Conversely, let λ ∈ σ(I − M(ω)), λ ≠ 1. Then there is a y ≠ 0 such that (I − Mᵀ(ω))y = λy. Clearly Mᵀ(ω)y ≠ 0, or we contradict λ ≠ 1. It follows from equation (4.3) that also Rᵀy ≠ 0, and from the transpose of (4.5) we obtain A(ω)Rᵀy = λRᵀy. ∎

Note in particular that the theorem implies γ(A(ω)) = γ(I − M(ω)).

Remarks. 1. 1 ∉ σ(I − M(ω)) if and only if r₁, ..., r_m are linearly independent. Indeed, (I − Mᵀ(ω))x = x implies Mᵀ(ω)x = 0 and, by (4.3), RRᵀx = 0; thus Rᵀx = 0. On the other hand, if Rᵀx = 0 then Mᵀ(ω)x = 0 and (I − Mᵀ(ω))x = x.
2. 1 ∉ σ(A(ω)) if and only if Im Rᵀ = ℝⁿ. This is obvious.
3. In case ω = 1, I − Mᵀ(1) = −(I + L)⁻¹Lᵀ.

5. Relaxation with a sequence of parameters

Now let us return to the possibility of choosing m relaxation parameters in cyclic order. Thus, with ω_j ∈ (0, 2), j = 1, 2, ..., m,

Aᵀ(ω) = (I − ω₁r₁r₁ᵀ) ⋯ (I − ω_mr_mr_mᵀ).

Repeating the argument of Theorem 2, it is easy to see that, except possibly for the eigenvalue 1, A(ω₁, ..., ω_m) and I − M(ω₁, ..., ω_m) have the same eigenvalues.
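Theorem 2 and the factorization (4.4) can be verified numerically on a small example. The sketch below (NumPy; the 3 × 3 matrix is an illustrative choice, not from the paper) forms A(ω) from normalized rows, builds M(ω) = ωRRᵀ(I + ωLᵀ)⁻¹ as in (4.3), and checks both the triangular factorization (4.4) and the agreement of the spectra of A(ω) and I − M(ω).

```python
import numpy as np

R = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])
R = R / np.linalg.norm(R, axis=1, keepdims=True)   # normalize rows, as Sect. 4 assumes
m, n = R.shape
omega = 0.7

# A(omega) = (I - w r_m r_m^T) ... (I - w r_1 r_1^T)
A = np.eye(n)
for j in range(m):
    A = (np.eye(n) - omega * np.outer(R[j], R[j])) @ A

G = R @ R.T                       # Gram matrix, G = I + L + L^T
L = np.tril(G, -1)                # strictly lower triangular part
M = omega * G @ np.linalg.inv(np.eye(m) + omega * L.T)    # equation (4.3)

# Factorization (4.4): I - M = ((1 - w)I - wL)(I + wL^T)^{-1}
lhs = np.eye(m) - M
rhs = ((1 - omega) * np.eye(m) - omega * L) @ np.linalg.inv(np.eye(m) + omega * L.T)
assert np.allclose(lhs, rhs)

# Theorem 2: sigma(A) and sigma(I - M) agree except possibly at 1.
eigA = np.linalg.eigvals(A)
eigT = np.linalg.eigvals(lhs)
for lam in eigA:
    assert np.min(np.abs(eigT - lam)) < 1e-8
```

For this example m = n and the rows are linearly independent, so by Remarks 1 and 2 the eigenvalue 1 occurs in neither spectrum and the two spectra coincide entirely.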
Moreover (cf. equation (4.4)),

I − M(ω₁, ..., ω_m) = (I − W − LW)(I + LᵀW)⁻¹,

where W = diag(ω₁, ..., ω_m).

In case ω₁ = 1 and ω₂, ..., ω_m are arbitrary numbers in the interval (0, 2), the first row of the matrix RAᵀ(ω) is zero. Indeed,

r₁ᵀ(I − r₁r₁ᵀ)(I − ω₂r₂r₂ᵀ) ⋯ (I − ω_mr_mr_mᵀ) = 0.

Rewriting the equality (4.5) for the submatrices with indices i, j = 2, ..., m — with M̃(ω₂, ..., ω_m) = (β_ij)_{i,j=2}^m, W̃ = diag(ω₂, ..., ω_m), and L̃ the strictly lower triangular part of the Gram matrix of r₂, ..., r_m — we have

I − M̃(ω₂, ..., ω_m) = (I − W̃ − L̃W̃ + ccᵀW̃)(I + L̃ᵀW̃)⁻¹,

where cᵀ = (c₁₂, ..., c₁m). It follows now from (4.5), as in Theorem 2, that if λ ≠ 1,

λ ∈ σ(A(1, ω₂, ..., ω_m))  if and only if  λ ∈ σ(I − M̃(ω₂, ..., ω_m)).

This reduction by one dimension can be useful for analyzing cases with small m. For example, for m = 2,

A(1, ω) = (I − ωr₂r₂ᵀ)(I − r₁r₁ᵀ).

Thus I − M̃(ω) = 1 − ω + c₁₂²ω, and therefore

γ(A(1, ω)) = |1 − ω + c₁₂²ω|.

Clearly, for ω = 1, γ(A(1, 1)) = c₁₂². In case the angle between r₁ and r₂ is larger than π/4 (i.e., c₁₂² < 1/2), one can take ω = 1/(1 − c₁₂²) ∈ (0, 2), so that γ(A(1, 1/(1 − c₁₂²))) = 0. If the angle is less than or equal to π/4, then γ(A(1, ω)) decreases as ω approaches 2.

6. Relation to the SOR algorithm

It follows from equation (4.4) that, in the case of a single relaxation parameter and with row normalization, the matrix I − Mᵀ(ω) is the iteration matrix of the SOR algorithm for the equation RRᵀy = f (see [SB], p. 546, for example). Therefore (see also equation (2.4) of Nicolaides [Ni]),

(6.1)  I − Mᵀ(ω) = (I − ωe_mr_mᵀRᵀ) ⋯ (I − ωe₁r₁ᵀRᵀ).

We observe that A(ω) is expressed in terms of the orthogonal projectors r_kr_kᵀ, and I − Mᵀ(ω) is expressed in a similar way in terms of the (generally non-orthogonal) projectors e_kr_kᵀRᵀ. For brevity, let us denote I − Mᵀ(ω) by T(ω). We conclude from Theorem 2 that

(6.2)  σ(A(ω)) ∖ {1} = σ(T(ω)) ∖ {1}.

Let us show that the matrix T(ω) is in fact an iteration matrix for a certain iterative process of ART type in ℝ^m; this is also pointed out by Björck and Elfving [BE]. If x^(0) ∈ Im Rᵀ, it follows from (1.1) that x^(k) ∈ Im Rᵀ for all k. Hence there exist y^(k) ∈ ℝ^m (in general not unique) such that

x^(k) = Rᵀy^(k),  k = 0, 1, 2, ....

Suppose we start with some y^(0) ∈ ℝ^m and define the iterations

(6.3)  y^(k+1) = y^(k) + δ^(k)e_{k+1},  k = 0, 1, 2, ...,

where e_k denotes the k-th coordinate vector in ℝ^m (indices taken cyclically, as in (1.2)) and

(6.4)  δ^(k) = ω(f_{k+1} − r_{k+1}ᵀRᵀy^(k)).
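The two descriptions of T(ω) — the projector product (6.1) and the SOR/triangular form obtained by transposing (4.4) — and the coupling x^(k) = Rᵀy^(k) between the iterations (1.1) and (6.3)–(6.4) can be checked in a few lines. This NumPy sketch uses an illustrative 2 × 2 example with normalized rows (the matrix, ω and f are not from the paper).

```python
import numpy as np

R = np.array([[1.0, 0.0],
              [1.0, 1.0]])
R = R / np.linalg.norm(R, axis=1, keepdims=True)   # normalized rows
m, n = R.shape
omega = 0.9
f = np.array([1.0, 2.0])

# (6.1): T(omega) = (I - w e_m r_m^T R^T) ... (I - w e_1 r_1^T R^T)
T = np.eye(m)
for j in range(m):
    T = (np.eye(m) - omega * np.outer(np.eye(m)[j], R @ R[j])) @ T

# Transpose of (4.4): I - M^T = (I + wL)^{-1}((1 - w)I - wL^T)
G = R @ R.T
L = np.tril(G, -1)
T_sor = np.linalg.inv(np.eye(m) + omega * L) @ ((1 - omega) * np.eye(m) - omega * L.T)
assert np.allclose(T, T_sor)

# ART in x and the y-iteration (6.3)-(6.4) stay linked by x^(k) = R^T y^(k).
x = np.zeros(n)
y = np.zeros(m)
for k in range(20):
    j = k % m
    x = x + omega * (f[j] - R[j] @ x) * R[j]                  # (1.1), ||r_j|| = 1
    y = y + omega * (f[j] - R[j] @ (R.T @ y)) * np.eye(m)[j]  # (6.3)-(6.4)
    assert np.allclose(x, R.T @ y)
```

The last assertion is exact at every step: applying Rᵀ to the y-update (6.3)–(6.4) reproduces the x-update (1.1), since Rᵀe_j = r_j.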
