HELP LAPOP David Young September 2003 LIB * LAPOP provides a set of procedures for linear algebra. CONTENTS - (Use g to access required sections) 1 Introduction 2 Conventions applying to all procedures 2.1 Input, output and update array arguments 2.2 Array-region arguments 2.3 Vector and matrix arguments 2.4 Precision and real-complex mode 2.5 Types of array arguments and efficiency 3 Procedures 4 Examples 4.1 Setup and simple operations 4.2 Eigenvectors and eigenvalues 4.3 Recycling arrays 4.4 Array regions 4.5 Complex data ----------------------------------------------------------------------- 1 Introduction ----------------------------------------------------------------------- The library provides a set of procedures for performing some of the standard operations of linear algebra. It does not attempt to cover all possible operations on arrays, but rather those specific to their use as representations of matrices: matrix multiplication; exact and least-squares solution of linear equations; eigenvalues and eigenvectors; singular value decomposition. Both real and complex calculations are supported. The calling sequences and conventions are as simple as possible, and the library is flexible with respect to the types of data arrays. However, it may be used in a way that maximises efficiency - for example, garbage generation may be kept low if that is important. The library is supported by the external Lapack package. A fuller and more direct interface is provided by the Pop-11 *LAPACK library, and this may be used if an operation is not covered by LAPOP, or the maximum possible efficiency is needed. See HELP * LAPACK for what to do if the library does not load successfully. The *VEC_MAT library also provides some relevant facilities, but does not deal with complex data or provide least-squares solutions, eigendecomposition or singular value decomposition. ----------------------------------------------------------------------- 2 Conventions applying to all procedures ----------------------------------------------------------------------- 2.1 Input, output and update array arguments --------------------------------------------- There are three kinds of array arguments to LAPOP procedures. 1. Inputs. Arguments described as "input array" must be arrays when the procedure is called. The data in the active region (see below) must be numerical. Input arrays are never overwritten. 2. Outputs. Arguments described as "output array" may have one of 3 forms when the procedure is called: o An array. Results will be stored in this array and the array given will be returned as a result. Data stored in the active region on entry will be ignored and overwritten. o . A new array of an appropriate type will be created, results stored in it, and it will be returned. o Anything else. An array will be obtained using oldanyarray (see HELP * OLDARRAY); this array will be filled with results and returned. The argument will be used as the tag. The effect of this is that the array returned may share storage with any other array obtained using the same (identical) tag. This can greatly reduce the need for garbage collections. 3. Update arrays. Data are both read from the array and written to it. Arguments described as "update array" must be arrays when the procedure is called; data in the active region will be used and then overwritten. Update arrays are returned on the stack like output arrays as well as being overwritten. The same array should not be used as both an input (or update) array, and as an output (or update) array in one call to a procedure, unless separate non-overlapping active regions (see below) are specified. 2.2 Array-region arguments --------------------------- Wherever an array argument is given in the procedure descriptions, the array (or or the tag for an output array) may be followed by a region specification. The region spec is a list giving the "active region" of the array. For input arrays, data are read from this region; for output arrays this region is updated. It has the form [element1 element2] or [row1 row2 col1 col2] for 1-D and 2-D arrays respectively, giving the minimum and maximum subscripts in each dimension. This is the same as the form of the *boundslist used to specify array dimensions when an array is created. If the region spec for an input array is omitted, the whole of the array is used. The same is true if the region spec is the empty list []. Thus the following are equivalent: la_transpose(a, b) -> b; ;;; a is input, b is output la_transpose(a, [], b) -> b; la_transpose(a, boundslist(a), b) -> b; This also applies for output arrays if an actual array is passed. Otherwise, if a non-empty region spec is given it will be used as the boundslist of the array returned (and so the elements of the list must not be later updated). If neither an array nor a region spec are given for an output array, a suitable output region will be worked out from the input array regions unless otherwise stated in the procedure description. Thus the following are equivalent (assuming a encompasses the region given): la_transpose(a, [1 2 1 7], false) -> b; la_transpose(a, [1 2 1 7], false, [1 7 1 2]) -> b; The sizes of the active regions for all array arguments must be consistent, given the operation being performed. Thus for example la_transpose(a, [1 2 1 7], false, [101 107 201 202]) -> b; is correct, whilst la_transpose(a, [1 2 1 7], false, [1 3 1 7]) -> b; will cause a mishap. Rows and columns in matrices are always taken to be numbered from the top left corner of the active region. Thus the array indices of an element will differ from its matrix row and column indices if the active region is not of the form [1 m 1 n]. In the procedure descriptions below, expressions like A(i,j) refer to the matrix indices. 2.3 Vector and matrix arguments -------------------------------- Array arguments may be 1-D or 2-D. 2-D arrays represent matrices in the obvious way; note that the first subscript refers to the row, and the second to the column. 1-D arrays are taken to represent matrices with a single column; the index then refers to the row. These can also be regarded as vectors. Thus the boundslist/region spec forms [r0 r1] and [r0 r1 c c] are equivalent as far as the LAPOP procedure are concerned - both represent a single column of data. 2.4 Precision and real-complex mode ------------------------------------ The computation precision is controlled by the value of *popdprecision at run-time. Computations are normally done using real arithmetic. Complex arithmetic will be used if: o any input array has a complex type (e.g. it was created using *newcfloatarray or *newzfloatarray); or o any input array has a "full" arrayvector (e.g. it is an ordinary Pop-11 array created with *newarray) and its active region contains a complex value; or o any numerical input is complex. In the second and third cases, a value is regarded as complex if *iscomplex returns , even if the imaginary part is zero. The second case is, incidentally, normally detected in a way that does not compromise the efficiency of computations on real data. 2.5 Types of array arguments and efficiency -------------------------------------------- In general, arrays of any type can be used as input or output arguments, with a few restrictions. The active regions of input arrays must contain only numbers, and output arrays where supplied must be such that the results can be assigned to them. Arrays must be created with the variable *poparray_by_row set to (the default). The array arguments in a call to a procedure do not need to be the same type. Thus "ordinary" Pop-11 arrays created with *newarray can always be used for input or output. On the other hand, for example, byte arrays such as those created with *newbytearray are OK for inputs but not suitable for outputs, because floating point numbers cannot be assigned to them. Real floating point arrays such as those created with *newsfloatarray are OK except as output arrays when complex results are generated. In many cases, however, the procedures may need to copy data, and if computation time is to be minimised this is best avoided. This can be done by using "packed" arrays - that is, arrays made specially to contain numerical data - of the right type for the required precision. You can create suitable arrays using the procedures in *popvision: If popdprecision is , use *newsfloatarray for real data, *newcfloatarray for complex data. If popdprecison is , use *newdfloatarray for real data, *newzfloatarray for complex data. For maximum efficiency all the array arguments should be created with the same procedure, except where an array is specifically described as "real" or "complex". In these cases the appropriate real or complex routine should be used to make it, depending on the current precision. Further details of array types may be found in HELP * LAPACK. Output arrays that are created by the procedures and returned are made using one of the four procedures listed above, depending on the precision and whether the calculation was in real or complex mode. ----------------------------------------------------------------------- 3 Procedures ----------------------------------------------------------------------- la_print(A) la_print(A WID) la_print(A, PLACES, WID) Prints a matrix. A: input array. WID: optional integer; the width of the field used to print each element. Defaults to 16 for complex data or 8 for real data. PLACES: optional integer; *pop_pr_places is locally set to this. Defaults to 3. la_copy(A, B) -> B Copies a matrix. (Note: This should rarely be needed for matrices that are to be used wholly within LAPOP, because input matrices are not updated and region specifications can be given to all procedures. It may be useful if data have to be copied to a packed matrix for efficiency reasons, or if array regions are to be passed to procedures from other libraries.) A: input array. B: output array. la_col(A, JA, B, JB) -> B Copies a column of a matrix. A: input array. JA: integer. If A is 1-D, JA is ignored; otherwise JA specifies which column of the active region is to be copied. JA gives the index in the array A, not the matrix index within the active region. B: output/update array. If B is not an array on entry, or its active region is a single column, B is an output array. Otherwise B acts, strictly speaking, like an update array in that data in the active region but outside column JB are preserved. If B is a tag on entry, and there is a region spec of more than a single column, the values of elements of the array returned are undefined except within column JB. JB: integer or . If B is a 1-D array, or it is not an array on entry and its region specification has 2 elements, JB is ignored; otherwise JB specifies which column of the active region is to receive data. If B is not an array on entry, and no region specification is given for it, then JB may be and the array returned will be 1-D. JB gives the index in the array B, not the matrix index within the active region. la_row(A, IA, B, IB) -> B Copies a row of a matrix. A: input array. IA: integer. Specifies which row of the active region is to be copied. IA gives the index in the array A, not the matrix index within the active region. B: output/update array. If B is not an array on entry, or its active region is a single row, B is an output array. Otherwise B acts, strictly speaking, like an update array in that data in the active region but outside row IB are preserved. If B is a tag on entry, and there is a region spec of more than a single row, the values of elements of the array returned are undefined except within row IB. IB: integer. Specifies which row of the active region is to receive data. IB gives the index in the array B, not the matrix index within the active region. la_ritoc(A, B, C) -> C Copies two real arrays into the real and imaginary parts of a complex array; C = A + i * B. A: real input array. B: real input array. C: complex output array. la_ctori(C, A, B) -> (A, B) Copies the real and imaginary parts of a complex array into two real arrays; A = real(C), B = imag(C). C: complex input array. A: real output array. B: real output array. la_transpose(A, B) -> B Returns the transpose of a matrix. A: input array. B: output array. la_adjoint(A, B) -> B Returns the adjoint, that is, the complex conjugate of the transpose, of a matrix. If the data in A are real, this is the same as la_transpose. A: input array. B: output array. la_reshape(A, AORDER, B, BORDER) -> B Reshapes an array by reordering its elements. Data are taken sequentially from A and entered sequentially into B. The active regions of A and B must contain the same number of elements, but can be different shapes. A: input array. AORDER: string. If the first character of AORDER is c or C, data are taken from A "by column", that is with the first index varying fastest. If the first character of AORDER is r or R, data are taken from A "by row", that is with the second index varying fastest. B: output array. If B is not an array on entry, the region spec argument for B must be supplied and must not be empty. BORDER: string. If BORDER is c or C, data are written into B "by column", that is with the first index varying fastest. If the first character of BORDER is r or R, data are written into B "by row", that is with the second index varying fastest. la_+(A, B, C) -> C Calculates the matrix sum A+B. A: input array. B: input array. C: output array. la_accum(ALPHA, A, B) -> B Calculates ALPHA*A+B, where ALPHA is a scalar, and stores the result in B. ALPHA: number. A: input array. B: update array. la_scale(ALPHA, A, B) -> B Multiplies each element of A by the scalar ALPHA. ALPHA: number. A: input array. B: output array. la_*(A, B, C) -> C Calculates the matrix product A*B. A: input array. B: input array. C: output array. la_trans_*(A, AOP, B, BOP, C) -> C Calculates the matrix product P*Q, where P is A, the transpose of A, or the adjoint (conjugate transpose) of A, and Q is B, the transpose of B, or the adjoint of B. This is done efficiently, not by explicitly forming the transpose or adjoint. A: input array. AOP: string. If the first character of AOP is n or N, then P is A; if it is t or T, P is transpose(A); if it is c or C, P is adjoint(A). (If A has no complex data, the C and T operations are the same.) B: input array. BOP: string. If the first character of BOP is n or N, then Q is B; if it is t or T, Q is transpose(B); if it is c or C, Q is adjoint(B). (If B has no complex data, the C and T operations are the same.) C: output array. la_diag_*(A, B, C) -> C Calculates either A*diag(B) or diag(A)*B where diag(x) means the square matrix with the elements of x on its leading diagonal and zeros elsewhere. If the active region of A is a single column, and the active region of B has more than one element, then C=diag(A)*B, and the number of rows of B must equal the number of elements of A. Otherwise, the active region of B must be a single column, C=A*diag(B), and the number of columns of A must equal the number of elements of B. To put it another way, if A is a column, then the i'th element of A scales the i'th row of B; if B is a column, then the i'th element of B scales the i'th column of A. A: input array. B: input array. C: output array. la_linsolve(A, B, X) -> X Solves the matrix equation A*X = B. A: input array. The active region must be square, and the matrix must be of full rank. B: input array. X: output array. On exit, the j'th column of X is the solution of the set of simultaneous linear equations whose right-hand-sides are the j'th column of B; the coefficients of the unknowns for the i'th equation are given by the i'th row of A. la_lsqsolve(A, B, X) -> X Finds least squares solutions to overdetermined or underdetermined linear equations. For each column b of the matrix B, the corresponding column x of X returns a solution vector with the following properties: If A is m x n, and m >= n, the vector x minimises ||b-A*x||. If n > m, the vector x minimises ||x|| and satisfies A*x=b. It is assumed the A has full rank (i.e. its rank is min(m,n)). A: input array. B: input array. X: output array. la_lsqsolve2(A, B, X, ERRS, RCOND) -> (X, ERRS, RANK) As la_lsqsolve but does not necessarily assume that A has full rank, and returns information about the residuals of the solutions. A: input array. B: input array. X: output array. ERRS: output array. If A is m x n, m > n, and the estimated rank of A is equal to n, the residual sum of squares for the solution in the j'th column of X is given by the sum of squares of the j'th column of ERRS. If an input array is supplied, its active region must have dimensions m-n x nrhs where nrhs is the number of columns of B. If m <= n, ERRS will be returned as . RCOND: real number or . Used to determine the effective rank of A, which is defined as the order of the largest leading triangular submatrix in the QR factorisation with pivoting of A, whose estimated condition number is less than 1/RCOND. If RCOND is given as , a value of 0.0 is used, which is equivalent to assuming that A has full rank. RANK: integer. Returns the estimated rank of A. la_eigvals(A, W) -> W Computes the eigenvalues of a square matrix. A: input array. W: complex output array. The active region is a single column with length equal to one dimension of A. Complex conjugate pairs of eigenvalues are adjacent with the eigenvalue having the positive imaginary part first. la_eig(A, VL, W, VR) -> (VL, W, VR) Computes the eigenvalues and eigenvectors of a square matrix. A: input array. VL: complex output array. The columns of VL return the left eigenvectors of A, normalised to have Euclidean norm equal to 1 and largest component real. VL has the property that adjoint(VL)*A = diag(W)*adjoint(VL). W: complex output array. The active region is a single column with length equal to one dimension of A. Complex conjugate pairs of eigenvalues are adjacent with the eigenvalue having the positive imaginary part first. VR: complex output array. The columns of VR return the right eigenvectors of A, normalised to have Euclidean norm equal to 1 and largest component real. VR has the property that A*VR = VR*diag(W). la_eigvals_herm(A, W) -> W Computes the eigenvalues of a Hermitian (i.e. self-adjoint) matrix. In the real case, this is the same as a symmetrical matrix. A: input array. The active region must be square and it is assumed that A(i,j) = conjugate(A(j,i)). Only values in the upper triangle (i.e. elements A(i,j) for which j >= i) are accessed. If A is complex, the values on the leading diagonal (i.e. A(i,i)) are assumed to be real, and any non-zero imaginary part for these elements is ignored. W: real output array. The active region is a single column with length equal to one dimension of A. Returns the eigenvalues in ascending order. la_eig_herm(A, W, E) -> (W, E) Computes the eigenvalues and eigenvectors of a Hermitian (i.e. self-adjoint) matrix. In the real case, this is the same as a symmetrical matrix. A: input array. The active region must be square and it is assumed that A(i,j) = conjugate(A(j,i)). Only values in the upper triangle (i.e. elements A(i,j) for which j >= i) are accessed. If A is complex, the values on the leading diagonal (i.e. A(i,i)) are assumed to be real, and any non-zero imaginary part for these elements is ignored. W: real output array. The active region is a single column with length equal to one dimension of A. Returns the eigenvalues in ascending order. E: output array. Returns the orthonormal eigenvectors of A, such that A = E * diag(W) * adjoint(E). la_singvals(A, S) -> S Computes the singular values of a matrix. A: input array. S: real output array. The active region is a single column with length equal to the smaller dimension of A. Returns the singular values in descending order. la_svd(A, U, S, VT) -> (U, S, VT) Computes the singular value decomposition of a matrix. A: input array. A is m x n. U: output array. U is m x min(m, n). The columns of U return the first min(m, n) left singular vectors. S: real output array. The active region is a single column with length min(m, n). Returns the singular values in descending order. VT: output array. V is min(m, n) x n. The rows of VT return the first min(m, n) right singular vectors. ----------------------------------------------------------------------- 4 Examples ----------------------------------------------------------------------- The examples are intended to be run in order. Output from print statements is included after each example. 4.1 Setup and simple operations -------------------------------- Load the libraries: uses popvision, lapop; Obtain a matrix to work with, and print it. The following generates an array whose contents reflect the row and column indices of each element: vars a = newarray([1 3 1 4], procedure(i,j); 10*i+j endprocedure); la_print(a); 11 12 13 14 21 22 23 24 31 32 33 34 Get its transpose, creating a new array by giving as the second argument: vars at = la_transpose(a, false); la_print(at); 11.0 21.0 31.0 12.0 22.0 32.0 13.0 23.0 33.0 14.0 24.0 34.0 The results are decimals, not integers - LAPOP output is always floating point. Multiply the matrix by its transpose, making a new 3x3 matrix: vars c = la_*(a, at, false); la_print(c, 15); 630.0 1130.0 1630.0 1130.0 2030.0 2930.0 1630.0 2930.0 4230.0 4.2 Eigenvectors and eigenvalues --------------------------------- We start with a symmetric matrix: vars as = newarray([1 3 1 3], nonop + <> sqrt); la_print(as); 1.414 1.732 2.0 1.732 2.0 2.236 2.0 2.236 2.449 and find its eigenvalues and eigenvectors: vars (w, e) = la_eig_herm(as, false, false); la_print(w); -0.134 -0.001 5.999 We can see that the eigenvectors are orthonormal by finding e'*e: la_print(la_trans_*(e, 't', e, 'n', false)); 1.0 -0.0 0.0 -0.0 1.0 -0.0 0.0 -0.0 1.0 and we can reconstruct the original matrix by finding e * diag(w) * e': vars anew = la_trans_*(la_diag_*(e, w, false), 'n', e, 't', false); la_print(anew); 1.414 1.732 2.0 1.732 2.0 2.236 2.0 2.236 2.449 As an exercise, find the singular value decomposition of some non-symmetrical and non-square matrices, and look at whether U and V' are orthonormal. 4.3 Recycling arrays --------------------- The examples so far have created new arrays for their results. This can make a lot of garbage. It is easy to recycle arrays to avoid this. If the problem size is constant, a simple approach is to make the LAPOP procedure return a new array the fist time it is used, then to re-use this array, something like this template: vars c = false; ;;; get a new array first time repeat 3 times ;;; Processing to set up inputs would be here la_*(a, at, c) -> c; ;;; processing to make use of output would be here endrepeat; The first time round the loop a new array is made and assigned to c; the second and subsequent times results are stored in the existing array, overwriting the previous results. If the problem size can change between calls, the tag mechanism can be used to allow efficient recycling of storage. For details see HELP *OLDARRAY. Essentially a constant object is used as a "tag" for arrays where overwriting would be OK if the problem size did not change. Array storage is then saved and re-used where possible. Here is an example, where the input and output arrays are potentially recycled using this mechanism: vars a1 = oldsfloatarray("a", [1 5]); vars at1 = la_transpose(a1, "at"); vars a2 = oldsfloatarray("a", [1 4]); vars at2 = la_transpose(a2, "at"); Note that since we are now interested in efficiency, we have created the input arrays using newsfloatarray (assuming popdprecision is ) rather than the standard newarray. Now we can test whether the two result arrays share storage: 123456 -> at1(1, 1); at2(1, 1) => ** 123456.0 You can see that at1 and at2 share their arrayvector. Clearly, this has to be used with care. The common cases where it is applicable are in a loop, like the first example in this section, and where an array is made and used inside a procedure but is not returned as a result. 4.4 Array regions ------------------ If the matrix is represented by only part of an array, this can be stated using a region spec. For example, we can take the transpose of all but the top row and left column of our 3x4 matrix: vars atpart = la_transpose(a, [2 3 2 4], false); la_print(atpart); 22.0 32.0 23.0 33.0 24.0 34.0 We can update a section of an array this way too: vars big = newsfloatarray([1 4 1 5], 0); la_transpose(a, [2 3 2 4], big, [2 4 2 3]) -> _; la_print(big); 0.0 0.0 0.0 0.0 0.0 0.0 22.0 32.0 0.0 0.0 0.0 23.0 33.0 0.0 0.0 0.0 24.0 34.0 0.0 0.0 and if necessary, you can take all the inputs and outputs to a routine to be sub-arrays: la_*(big, [2 3 2 3], big, [1 2 2 2], big, [2 3 5 5]) -> _; la_print(big); 0.0 0.0 0.0 0.0 0.0 0.0 22.0 32.0 0.0 704.0 0.0 23.0 33.0 0.0 726.0 0.0 24.0 34.0 0.0 0.0 As in this example, it is OK for input regions to overlap, but not for an output region to overlap with any input regions, or any other output regions. 4.5 Complex data ----------------- If we assign a complex value to an element of a full array, the calculation will be in complex mode, and any result array created will be a complex type: 10+:10 -> a(3, 3); la_*(a, at, false) -> c; la_print(c, 16); 630.0_+:0.0 1130.0_+:0.0 1630.0_+:0.0 1130.0_+:0.0 2030.0_+:0.0 2930.0_+:0.0 1331.0_+:130.0 2401.0_+:230.0 3471.0_+:330.0 Note that if we tried to reuse the previous array that was assigned to c, a mishap would result because it would not be able to accept complex values. Alternatively, we can create an array of complex type. This one has data that reflect the row and column indices: newcfloatarray([1 3 1 4], nonop +:) -> a; la_print(a); 1.0_+:1.0 1.0_+:2.0 1.0_+:3.0 1.0_+:4.0 2.0_+:1.0 2.0_+:2.0 2.0_+:3.0 2.0_+:4.0 3.0_+:1.0 3.0_+:2.0 3.0_+:3.0 3.0_+:4.0 We can multiply this by its adjoint to produce a Hermitian matrix: la_print(la_trans_*(a, 'n', a, 'c', false)); 34.0_+:0.0 38.0_+:10.0 42.0_+:20.0 38.0_-:10.0 46.0_+:0.0 54.0_+:10.0 42.0_-:20.0 54.0_-:10.0 66.0_+:0.0 The examples above can be rerun using complex data - they will all generalise correctly to the complex case. --- $popvision/help/lapop --- Copyright University of Sussex 2003. All rights reserved.