Linear Algebra
SGESV
SGESV
is a LAPACK
routine used to solve a system of linear equations \(AX=B\) using LU decomposition.
Useful resources:
Syntax: CALL SGESV | DGESV | CGESV | ZGESV (n, nrhs, a, lda, ipvt, bx, ldb, info )
Output:
- bx: vector \(B\) will be overwritten by the solution vector \(X\). In other words, \(bx \gets a^{-1}*bx \Leftrightarrow bx \gets x\)
- Info: indicator of success. If info is 0, the solution vector X is printed; If info is non-zero, an error message is printed.
Example
Step1: Install LAPACK and BLAS (for MacOS)
Run the fllowing code in the terminal:
brew install lapack
brew install openblas
Step2: Fortran code
PROGRAM SolveLinearSystem
IMPLICIT NONE
INTEGER, PARAMETER :: n = 3, nrhs = 1, lda = 3, ldb = 3
INTEGER :: ipiv(n), info
REAL :: a(lda, n), b(ldb, nrhs)
INTEGER :: output_unit
! Initialize the matrix A
a = RESHAPE([ &
3.0, 1.0, 2.0, &
6.0, 3.0, 4.0, &
3.0, 1.0, 5.0], &
SHAPE(a))
! Initialize the vector B
b = RESHAPE([1.0, 2.0, 3.0], SHAPE(b))
! Call SGESV to solve the system AX = B
CALL SGESV(n, nrhs, a, lda, ipiv, b, ldb, info)
! Open a file to write the results
output_unit = 10
OPEN(UNIT=output_unit, FILE='play_solution.txt', STATUS='REPLACE')
! Check for success
IF (info == 0) THEN
WRITE(output_unit, *) 'Solution X:'
WRITE(output_unit, *) b
ELSE
WRITE(output_unit, *) 'SGESV failed with info =', info
END IF
! Close the file
CLOSE(output_unit)
END PROGRAM SolveLinearSystem
Step3: Compilation Command
- Assuming Homebrew installs the libraries in standard locations, you can compile your code with the following command:
gfortran /path/to/your/code/yourFortranFile.f90 -llapack -lblas
- If the libraries are installed in non-standard locations, specify the paths using the -L flag. For example:
gfortran /path/to/your/code/yourFortranFile.f90 -L/usr/local/opt/lapack/lib -llapack -L/usr/local/opt/openblas/lib -lblas
Step4: Execute result
./my_program # Run the executable
ls # check where is the play_solution.txt file in your current directory.
Result:
Solution X:
-3.77777839 1.66666687 0.777777851
Useful Resources
LUDCMP
Consider the following 3x3 matrix ( A ) \( A = \begin{bmatrix} 2 & 3 & 1 \newline 4 & 4 & 2 \newline 2 & 1 & 2 \newline \end{bmatrix} \)
We want to perform LU decomposition on ( A ) such that \( A = L \cdot U \).
Decomposition
Initialization: \( L = \begin{bmatrix} 1 & 0 & 0 \newline 0 & 1 & 0 \newline 0 & 0 & 1 \newline \end{bmatrix}, \quad U = \begin{bmatrix} 2 & 3 & 1 \newline 0 & 0 & 0 \newline 0 & 0 & 0 \newline \end{bmatrix} \)
First Column:
- Update L: Eliminating the elements below the pivot in the first column (using row operations) \( l_{21} = \frac{4}{2} = 2 \), \( l_{31} = \frac{2}{2} = 1 \), get updated ( L ): \( L = \begin{bmatrix} 1 & 0 & 0 \newline 2 & 1 & 0 \newline 1 & 0 & 1 \newline \end{bmatrix}\)
- Update U:
Subtract Multiples of the First Row from the Second and Third Rows:
For the second row: \( \text{Row 2} = \text{Row 2} - l_{21} \cdot \text{Row 1} = \begin{pmatrix} 4 & 4 & 2 \end{pmatrix} - 2 \cdot \begin{pmatrix} 2 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 4 & 4 & 2 \end{pmatrix} - \begin{pmatrix} 4 & 6 & 2 \end{pmatrix} = \begin{pmatrix} 0 & -2 & 0 \end{pmatrix} \)
For the third row: \( \text{Row 3} = \text{Row 3} - l_{31} \cdot \text{Row 1} = \begin{pmatrix} 2 & 1 & 2 \end{pmatrix} - 1 \cdot \begin{pmatrix} 2 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 2 \end{pmatrix} - \begin{pmatrix} 2 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 0 & -2 & 1 \end{pmatrix} \)
Updated ( U ): \( \quad U = \begin{bmatrix} 2 & 3 & 1 \newline 0 & -2 & 0 \newline 0 & -2 & 1 \newline \end{bmatrix} \)
- Second Column:
Update L: Eliminating the elements below the pivot in the second column \( l_{32} = \frac{-2}{-2} = 1 \), get updated ( L ): \( L = \begin{bmatrix} 1 & 0 & 0 \newline 2 & 1 & 0 \newline 1 & 1 & 1 \newline \end{bmatrix}\)
Update U: Subtract Multiples of the Second Row from the Third Row
\( \text{Row 3} = \text{Row 3} - l_{32} \cdot \text{Row 2} = \begin{pmatrix} 0 & -2 & 1 \end{pmatrix} - 1 \cdot \begin{pmatrix} 0 & -2 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -2 & 1 \end{pmatrix} - \begin{pmatrix} 0 & -2 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \end{pmatrix} \)
Updated ( U ): \( \quad U = \begin{bmatrix} 2 & 3 & 1 \newline 0 & -2 & 0 \newline 0 & 0 & 1 \newline \end{bmatrix} \)
Results
After LU decomposition: \( L = \begin{bmatrix} 1 & 0 & 0 \newline 2 & 1 & 0 \newline 1 & 1 & 1 \newline \end{bmatrix}, \quad U = \begin{bmatrix} 2 & 3 & 1 \newline 0 & -2 & 0 \newline 0 & 0 & 1 \newline \end{bmatrix} \)
Output
- \( A \) In practice, \( A \) is overwritten to store both \( L \) and \( U \): \( A = \begin{bmatrix} 2 & 3 & 1 \newline 2 & -2 & 0 \newline 1 & 1 & 1 \newline \end{bmatrix} \)
Here:
- The upper triangular part (including the diagonal) contains ( U ).
- The lower triangular part (excluding the diagonal - all 1s) contains ( L ).
- Pivot Array and Determinant Sign Indicator
INDX
would be[1, 2, 3]
(identity permutation). Explanation: The pivot array INDX keeps track of the row permutations that occur during the LU decomposition process. In this example, since no row swaps are needed (the pivot elements are already the largest available), the permutation arrayINDX
would remain[1, 2, 3]
.DD
would be1
(even number of row swaps, which is zero).
Determinant
The determinant of ( A ) can be calculated as: \( \text{det}(A) = \text{det}(L) \cdot \text{det}(U) \). Since \(\text{det}(L) = 1\) (product of its diagonal elements, which are all 1 for unit lower triangular matrices): \( \text{det}(A) = \text{det}(U) = 2 \cdot (-2) \cdot 1 = -4 \)
det = DFLOAT(DD)
DO J = 1, 3
det = det * A(J, J)
END DO
Useful Resources
spotrf and vsrnggaussianmv
Generating Multivariate Gaussian Random Variables
This example demonstrates how to generate a 3-dimensional multivariate Gaussian random vector using Cholesky decomposition and the vsrnggaussianmv
function in Fortran. Assuming independence, so the off-diagonal elements are zero.
Step 1: Cholesky
Given a covariance matrix \( \Sigma \):
\( \Sigma = \begin{bmatrix} 0.5 & 0 & 0 \newline 0 & 0.2 & 0 \newline 0 & 0 & 0.3 \end{bmatrix} \)
The Cholesky decomposition factorizes \( \Sigma \) into:
\( \Sigma = R^\top R \) ,
where \( R \) is the upper triangular matrix:
\( R = \begin{bmatrix} \sqrt{0.5} & 0 & 0 \newline 0 & \sqrt{0.2} & 0 \newline 0 & 0 & \sqrt{0.3} \end{bmatrix} = \begin{bmatrix} 0.7071 & 0 & 0 \newline 0 & 0.4472 & 0 \newline 0 & 0 & 0.5477 \end{bmatrix} \)
REAL :: COV_lambda_1(3,3)
INTEGER :: info
! Initialize the covariance matrix
Sigma = RESHAPE([0.5, 0.0, 0.0, &
0.0, 0.2, 0.0, &
0.0, 0.0, 0.3], (/3, 3/))
! Perform Cholesky decomposition (upper triangular)
CALL spotrf('U', 3, COV_lambda_1, 3, info)
After running this code, the \( \Sigma\) matrix is modified to store the upper triangular matrix \( R \):
\(\Sigma = R = \begin{bmatrix} 0.7071 & 0 & 0 \newline 0 & 0.4472 & 0 \newline 0 & 0 & 0.5477 \end{bmatrix} \)
Step 2: Gaussian R.V.
To generate a random vector ( \lambda_1 ) from the multivariate normal distribution:
\( \lambda_1 = \lambda_0 + R^\top Z\),
where \( Z \sim N(0,I_3)\).
REAL :: lambda_1(3), lambda_0(3)
INTEGER :: method, stream, errcode, me
! Assume mean vector lambda_0 is [1.0, 2.0, 3.0]
lambda_0 = (/ 1.0, 2.0, 3.0 /)
! Set method, stream, and me (as required by the random number generator)
method = 0
stream = 0
me = 0 ! Assuming full storage for simplicity
! Generate the random vector lambda_1
errcode = vsrnggaussianmv(method, stream, 1, lambda_1, 3, me, lambda_0, Sigma)
Output
After running the vsrnggaussianmv
function, the vector \( \lambda_1 \) will be a sample from the multivariate normal distribution with mean \( \lambda_0 = [1.0, 2.0, 3.0] \) and covariance \( \Sigma \). For example, the output might be: \( \lambda_1 =[1.1, 2.2, 3.1]\).
Summary
- Cholesky Decomposition (
spotrf
): Factorizes the covariance matrix \( \Sigma \) into an upper triangular matrix \( R \). - Random Vector Generation (
vsrnggaussianmv
): Generates a random vector \( \lambda_1 \) using the upper triangular matrix from the Cholesky decomposition and the specified mean vector.
Useful Resources
Parallel
OpenMP
The OpenMP parallelized block in Fortran is utilized to enhance computational efficiency by executing multiple threads simultaneously. By designating variables as private, it ensures that data is analyzed independently within each thread/block.
Typical structure
Call OMP_SET_NUM_THREADS(Nthreads)
!$omp parallel default(shared) private(...)
! Your parallel code here
!$omp end parallel
Tips: In Fortran, appending &
at the end of a line and !$omp &
at the beginning of the next indicates a continuation of an OpenMP directive across lines.
!$omp parallel default(shared) private(a, b, c, d, e, &
!$omp & f, g, h, i, ... , z)
Explanation
- Set the Number of Threads
Call OMP_SET_NUM_THREADS(Nthreads)
Nthreads specifies the number of threads. For example, in Hsieh, C. S., & Lin, X. (2017), the Nthreads is set to 14, corresponding to the number of groups or schools. Each thread will handle a portion of the computation, as processing data related to a specific group.
- OpenMP Parallel Region
!$omp parallel default(shared) private(...)
This directive marks the beginning of a parallel region, where the code will be executed by multiple threads simultaneously.
default(shared): This means that all variables are shared among threads by default, except those explicitly declared as private.
private(…): This lists the variables that will be private to each thread, meaning each thread will have its own copy of these variables.
- End
!$omp end parallel
This ends the parallel region. Once the threads reach this point, they will synchronize, and the program will continue with only the main thread executing the subsequent code sequentially.