
Fortran


Linear Algebra

SGESV

SGESV is a LAPACK routine that solves a system of linear equations \(AX=B\) using LU decomposition with partial pivoting.

Syntax: CALL SGESV | DGESV | CGESV | ZGESV (n, nrhs, a, lda, ipvt, bx, ldb, info)

Output:

  • bx: on exit, the right-hand side \(B\) is overwritten by the solution \(X\). In other words, \(bx \gets A^{-1} B\), so \(bx\) holds \(X\).
  • info: success indicator. info = 0 means the system was solved; info = -i means the i-th argument had an illegal value; info = i > 0 means \(U(i,i)\) is exactly zero, so the matrix is singular and no solution was computed.

Example

Step 1: Install LAPACK and BLAS (macOS)

Run the following commands in the terminal:

brew install lapack
brew install openblas

Step 2: Fortran code

PROGRAM SolveLinearSystem
    IMPLICIT NONE
    INTEGER, PARAMETER :: n = 3, nrhs = 1, lda = 3, ldb = 3
    INTEGER :: ipiv(n), info
    REAL :: a(lda, n), b(ldb, nrhs)
    INTEGER :: output_unit

    ! Initialize the matrix A
    ! (RESHAPE fills arrays in column-major order, so each line below is a column of A)
    a = RESHAPE([ &
        3.0, 1.0, 2.0, &
        6.0, 3.0, 4.0, &
        3.0, 1.0, 5.0], &
        SHAPE(a))

    ! Initialize the vector B
    b = RESHAPE([1.0, 2.0, 3.0], SHAPE(b))

    ! Call SGESV to solve the system AX = B
    CALL SGESV(n, nrhs, a, lda, ipiv, b, ldb, info)

    ! Open a file to write the results
    output_unit = 10
    OPEN(UNIT=output_unit, FILE='play_solution.txt', STATUS='REPLACE')

    ! Check for success
    IF (info == 0) THEN
        WRITE(output_unit, *) 'Solution X:'
        WRITE(output_unit, *) b
    ELSE
        WRITE(output_unit, *) 'SGESV failed with info =', info
    END IF

    ! Close the file
    CLOSE(output_unit)

END PROGRAM SolveLinearSystem

Step 3: Compilation Command

  • Assuming Homebrew installs the libraries in standard locations, you can compile the code with the following command:
gfortran /path/to/your/code/yourFortranFile.f90 -o my_program -llapack -lblas
  • If the libraries are installed in non-standard locations, specify the paths using the -L flag. For example:
gfortran /path/to/your/code/yourFortranFile.f90 -o my_program -L/usr/local/opt/lapack/lib -llapack -L/usr/local/opt/openblas/lib -lblas
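  • On Apple Silicon Macs, Homebrew typically installs under /opt/homebrew rather than /usr/local, so the paths would look something like this (adjust them to your installation):
gfortran /path/to/your/code/yourFortranFile.f90 -o my_program -L/opt/homebrew/opt/lapack/lib -llapack -L/opt/homebrew/opt/openblas/lib -lblas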

Step 4: Run and check the result

./my_program # Run the executable
ls # Check that play_solution.txt was created in the current directory

Result:

 Solution X:
  -3.77777839       1.66666687      0.777777851   
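
As an optional sanity check (not part of the original program), you can save copies of \(A\) and \(B\) before calling SGESV and confirm that the computed solution reproduces \(B\). The names a_copy, b_copy, and residual below are illustrative additions to the program above:

REAL :: a_copy(lda, n), b_copy(ldb, nrhs), residual(ldb, nrhs)

! Save the inputs before SGESV overwrites them with the LU factors and the solution
a_copy = a
b_copy = b
CALL SGESV(n, nrhs, a, lda, ipiv, b, ldb, info)

IF (info == 0) THEN
    residual = MATMUL(a_copy, b) - b_copy   ! should be ~0 up to rounding error
    WRITE(*, *) 'Max residual:', MAXVAL(ABS(residual))
END IF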

Useful Resources

  1. SGESV, DGESV, CGESV, ZGESV (General Matrix Factorization and Multiple Right-Hand Side Solve) - IBM Documentation

LUDCMP

Consider the following 3x3 matrix: \( A = \begin{bmatrix} 2 & 3 & 1 \newline 4 & 4 & 2 \newline 2 & 1 & 2 \newline \end{bmatrix} \)

We want to perform LU decomposition on \( A \) such that \( A = L \cdot U \).

Decomposition

  1. Initialization: \( L = \begin{bmatrix} 1 & 0 & 0 \newline 0 & 1 & 0 \newline 0 & 0 & 1 \newline \end{bmatrix}, \quad U = \begin{bmatrix} 2 & 3 & 1 \newline 0 & 0 & 0 \newline 0 & 0 & 0 \newline \end{bmatrix} \)

  2. First Column:

  • Update L: eliminate the elements below the pivot in the first column (using row operations): \( l_{21} = \frac{4}{2} = 2 \), \( l_{31} = \frac{2}{2} = 1 \), which gives the updated \( L \): \( L = \begin{bmatrix} 1 & 0 & 0 \newline 2 & 1 & 0 \newline 1 & 0 & 1 \newline \end{bmatrix}\)
  • Update U: Subtract Multiples of the First Row from the Second and Third Rows:
    • For the second row: \( \text{Row 2} = \text{Row 2} - l_{21} \cdot \text{Row 1} = \begin{pmatrix} 4 & 4 & 2 \end{pmatrix} - 2 \cdot \begin{pmatrix} 2 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 4 & 4 & 2 \end{pmatrix} - \begin{pmatrix} 4 & 6 & 2 \end{pmatrix} = \begin{pmatrix} 0 & -2 & 0 \end{pmatrix} \)

    • For the third row: \( \text{Row 3} = \text{Row 3} - l_{31} \cdot \text{Row 1} = \begin{pmatrix} 2 & 1 & 2 \end{pmatrix} - 1 \cdot \begin{pmatrix} 2 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 & 2 \end{pmatrix} - \begin{pmatrix} 2 & 3 & 1 \end{pmatrix} = \begin{pmatrix} 0 & -2 & 1 \end{pmatrix} \)

    • Updated \( U \): \( U = \begin{bmatrix} 2 & 3 & 1 \newline 0 & -2 & 0 \newline 0 & -2 & 1 \newline \end{bmatrix} \)

  3. Second Column:
  • Update L: eliminate the element below the pivot in the second column: \( l_{32} = \frac{-2}{-2} = 1 \), which gives the updated \( L \): \( L = \begin{bmatrix} 1 & 0 & 0 \newline 2 & 1 & 0 \newline 1 & 1 & 1 \newline \end{bmatrix}\)

  • Update U: Subtract Multiples of the Second Row from the Third Row

    • \( \text{Row 3} = \text{Row 3} - l_{32} \cdot \text{Row 2} = \begin{pmatrix} 0 & -2 & 1 \end{pmatrix} - 1 \cdot \begin{pmatrix} 0 & -2 & 0 \end{pmatrix} = \begin{pmatrix} 0 & -2 & 1 \end{pmatrix} - \begin{pmatrix} 0 & -2 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 1 \end{pmatrix} \)

    • Updated \( U \): \( U = \begin{bmatrix} 2 & 3 & 1 \newline 0 & -2 & 0 \newline 0 & 0 & 1 \newline \end{bmatrix} \)

Results

After LU decomposition: \( L = \begin{bmatrix} 1 & 0 & 0 \newline 2 & 1 & 0 \newline 1 & 1 & 1 \newline \end{bmatrix}, \quad U = \begin{bmatrix} 2 & 3 & 1 \newline 0 & -2 & 0 \newline 0 & 0 & 1 \newline \end{bmatrix} \)
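
The same factorization can be reproduced numerically. The program below is a minimal sketch of Doolittle elimination without pivoting for the example matrix (the names DoolittleLU, L, and U are chosen for illustration and do not come from any library):

PROGRAM DoolittleLU
    IMPLICIT NONE
    INTEGER, PARAMETER :: n = 3
    REAL :: A(n, n), L(n, n), U(n, n)
    INTEGER :: i, k

    ! Example matrix from above (RESHAPE fills column by column)
    A = RESHAPE([2.0, 4.0, 2.0, &
                 3.0, 4.0, 1.0, &
                 1.0, 2.0, 2.0], [n, n])

    L = 0.0
    U = A
    DO i = 1, n
        L(i, i) = 1.0
    END DO

    ! Eliminate below the pivot in each column
    DO k = 1, n - 1
        DO i = k + 1, n
            L(i, k) = U(i, k) / U(k, k)                 ! multiplier l_ik
            U(i, k:n) = U(i, k:n) - L(i, k) * U(k, k:n)
        END DO
    END DO

    WRITE(*, '(A)') 'L ='
    DO i = 1, n
        WRITE(*, '(3F8.3)') L(i, :)
    END DO
    WRITE(*, '(A)') 'U ='
    DO i = 1, n
        WRITE(*, '(3F8.3)') U(i, :)
    END DO
END PROGRAM DoolittleLU

Running it prints the same \( L \) and \( U \) as the hand computation above.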

Output

  1. \( A \): in practice, \( A \) is overwritten to store both \( L \) and \( U \): \( A = \begin{bmatrix} 2 & 3 & 1 \newline 2 & -2 & 0 \newline 1 & 1 & 1 \newline \end{bmatrix} \)

Here:

  • The upper triangular part (including the diagonal) contains \( U \).
  • The strictly lower triangular part contains the multipliers of \( L \); the unit diagonal of \( L \) (all 1s) is not stored (see the short snippet after this list).
  2. Pivot Array and Determinant Sign Indicator
  • INDX would be [1, 2, 3] (identity permutation). Explanation: the pivot array INDX records the row permutations performed during the LU decomposition. In this worked example no row interchanges are performed, so INDX remains [1, 2, 3].
  • DD would be 1 (an even number of row swaps, namely zero).
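
To make the packed storage concrete, the fragment below (an illustrative sketch with hypothetical array names) unpacks a combined LU matrix stored in A, as described above, into separate L and U arrays:

REAL :: A(3, 3), L(3, 3), U(3, 3)
INTEGER :: i, j

L = 0.0
U = 0.0
DO j = 1, 3
    DO i = 1, 3
        IF (i > j) THEN
            L(i, j) = A(i, j)   ! strict lower triangle holds the multipliers of L
        ELSE
            U(i, j) = A(i, j)   ! diagonal and upper triangle hold U
        END IF
    END DO
    L(j, j) = 1.0               ! the unit diagonal of L is implicit
END DO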

Determinant

The determinant of \( A \) can be calculated as: \( \text{det}(A) = \text{det}(L) \cdot \text{det}(U) \). Since \(\text{det}(L) = 1\) (the product of its diagonal elements, which are all 1 for a unit lower triangular matrix): \( \text{det}(A) = \text{det}(U) = 2 \cdot (-2) \cdot 1 = -4 \)

! DD is +1 or -1 according to whether the number of row interchanges was even or odd;
! the diagonal of the overwritten A holds the diagonal of U.
det = DFLOAT(DD)
DO J = 1, 3
    det = det * A(J, J)   ! multiply by the diagonal entries of U
END DO

Useful Resources

  1. CEE 618 Scientific Parallel Computing (Lecture 3) - Linear Algebra Basics using LAPACK

spotrf and vsrnggaussianmv

Generating Multivariate Gaussian Random Variables

This example demonstrates how to generate a 3-dimensional multivariate Gaussian random vector in Fortran using a Cholesky decomposition (spotrf) and the vsrnggaussianmv routine. The components are assumed to be independent, so the off-diagonal elements of the covariance matrix are zero.

Step 1: Cholesky

Given a covariance matrix \( \Sigma \):

\( \Sigma = \begin{bmatrix} 0.5 & 0 & 0 \newline 0 & 0.2 & 0 \newline 0 & 0 & 0.3 \end{bmatrix} \)

The Cholesky decomposition factorizes \( \Sigma \) into:

\( \Sigma = R^\top R \) ,

where \( R \) is the upper triangular matrix:

\( R = \begin{bmatrix} \sqrt{0.5} & 0 & 0 \newline 0 & \sqrt{0.2} & 0 \newline 0 & 0 & \sqrt{0.3} \end{bmatrix} = \begin{bmatrix} 0.7071 & 0 & 0 \newline 0 & 0.4472 & 0 \newline 0 & 0 & 0.5477 \end{bmatrix} \)

REAL :: Sigma(3,3)
INTEGER :: info

! Initialize the covariance matrix
Sigma = RESHAPE([0.5, 0.0, 0.0, &
                 0.0, 0.2, 0.0, &
                 0.0, 0.0, 0.3], (/3, 3/))

! Perform Cholesky decomposition; 'U' requests the upper triangular factor R
CALL spotrf('U', 3, Sigma, 3, info)

After this call (when info = 0), the upper triangle of \( \Sigma\) is overwritten with the upper triangular factor \( R \) (spotrf does not reference the strictly lower triangular part):

\(\Sigma = R = \begin{bmatrix} 0.7071 & 0 & 0 \newline 0 & 0.4472 & 0 \newline 0 & 0 & 0.5477 \end{bmatrix} \)

Step 2: Gaussian R.V.

To generate a random vector \( \lambda_1 \) from the multivariate normal distribution:

\( \lambda_1 = \lambda_0 + R^\top Z\),

where \( Z \sim N(0,I_3)\).

REAL :: lambda_1(3), lambda_0(3)
INTEGER :: method, stream, errcode, me

! Assume the mean vector lambda_0 is [1.0, 2.0, 3.0]
lambda_0 = (/ 1.0, 2.0, 3.0 /)

! Set method, stream, and me (matrix storage scheme) as required by the generator.
! Note: in MKL's Fortran interface the stream is actually TYPE(VSL_STREAM_STATE)
! (from the MKL_VSL_TYPE module) and must be created with vslnewstream; it is shown
! schematically as an integer here. MKL provides named constants such as
! VSL_RNG_METHOD_GAUSSIANMV_BOXMULLER2 and VSL_MATRIX_STORAGE_FULL for these arguments.
method = 0
stream = 0
me = 0  ! assuming full storage of the Cholesky factor

! Generate one 3-dimensional random vector; Sigma now holds the factor R from spotrf
errcode = vsrnggaussianmv(method, stream, 1, lambda_1, 3, me, lambda_0, Sigma)

Output

After running the vsrnggaussianmv function, the vector \( \lambda_1 \) will be a sample from the multivariate normal distribution with mean \( \lambda_0 = [1.0, 2.0, 3.0] \) and covariance \( \Sigma \). For example, the output might be: \( \lambda_1 =[1.1, 2.2, 3.1]\).

Summary

  • Cholesky Decomposition (spotrf): Factorizes the covariance matrix \( \Sigma \) into an upper triangular matrix \( R \).
  • Random Vector Generation (vsrnggaussianmv): Generates a random vector \( \lambda_1 \) using the upper triangular factor from the Cholesky decomposition and the specified mean vector (both steps are combined in the sketch below).
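
The following is a minimal end-to-end sketch of the two steps, assuming Intel MKL's VSL Fortran interface (the MKL_VSL_TYPE and MKL_VSL modules, a stream created with vslnewstream, and the named constants shown). It follows the use of the upper triangular factor above and is illustrative rather than a drop-in program; it must be compiled and linked against MKL (e.g. with the Intel Fortran compiler's -qmkl option).

PROGRAM MVGaussianSketch
    USE MKL_VSL_TYPE
    USE MKL_VSL
    IMPLICIT NONE
    REAL :: Sigma(3,3), lambda_0(3), lambda_1(3)
    TYPE(VSL_STREAM_STATE) :: stream
    INTEGER :: info, errcode

    ! Covariance matrix (diagonal) and mean vector
    Sigma = RESHAPE([0.5, 0.0, 0.0, &
                     0.0, 0.2, 0.0, &
                     0.0, 0.0, 0.3], (/3, 3/))
    lambda_0 = (/ 1.0, 2.0, 3.0 /)

    ! Step 1: Cholesky factorization, Sigma <- R (upper triangle)
    CALL spotrf('U', 3, Sigma, 3, info)
    IF (info /= 0) STOP 'spotrf failed'

    ! Step 2: draw one 3-dimensional Gaussian vector
    errcode = vslnewstream(stream, VSL_BRNG_MT19937, 777)   ! 777 is an arbitrary seed
    errcode = vsrnggaussianmv(VSL_RNG_METHOD_GAUSSIANMV_BOXMULLER2, stream, &
                              1, lambda_1, 3, VSL_MATRIX_STORAGE_FULL, &
                              lambda_0, Sigma)
    errcode = vsldeletestream(stream)

    WRITE(*, *) 'lambda_1 =', lambda_1
END PROGRAM MVGaussianSketch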

Useful Resources

  1. ?potrf
  2. vRngGaussian

Parallel

OpenMP

An OpenMP parallel block in Fortran improves computational efficiency by executing the enclosed code on multiple threads simultaneously. Declaring variables as private ensures that each thread works on its own independent copy of that data.

Typical structure

Call OMP_SET_NUM_THREADS(Nthreads)
!$omp parallel default(shared) private(...)
    ! Your parallel code here
!$omp end parallel

Tip: In Fortran, ending a line with & and beginning the next line with !$omp & continues an OpenMP directive across lines.

!$omp parallel default(shared) private(a, b, c, d, e, &
!$omp & f, g, h, i, ... , z)

Explanation

  1. Set the Number of Threads
Call OMP_SET_NUM_THREADS(Nthreads)

Nthreads specifies the number of threads. For example, in Hsieh, C. S., & Lin, X. (2017), Nthreads is set to 14, corresponding to the number of groups (schools); each thread then handles a portion of the computation, such as processing the data for one group.

  2. OpenMP Parallel Region
!$omp parallel default(shared) private(...)

This directive marks the beginning of a parallel region, where the code will be executed by multiple threads simultaneously.

  • default(shared): This means that all variables are shared among threads by default, except those explicitly declared as private.

  • private(…): This lists the variables that will be private to each thread, meaning each thread will have its own copy of these variables.

  3. End
!$omp end parallel

This ends the parallel region. Once the threads reach this point, they will synchronize, and the program will continue with only the main thread executing the subsequent code sequentially.
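
As a concrete illustration (a minimal sketch, not taken from Hsieh & Lin, 2017), the program below sums an array in parallel: each thread accumulates its own private partial sum inside the parallel region, and the partial sums are then combined one thread at a time in a critical section. Compile it with gfortran -fopenmp.

PROGRAM OmpSumExample
    USE OMP_LIB
    IMPLICIT NONE
    INTEGER, PARAMETER :: n = 1000000
    INTEGER, PARAMETER :: Nthreads = 4
    REAL(8) :: x(n), total, partial
    INTEGER :: i

    x = 1.0D0
    total = 0.0D0

    CALL OMP_SET_NUM_THREADS(Nthreads)

    !$omp parallel default(shared) private(i, partial)
        partial = 0.0D0
        ! Split the loop iterations among the threads
        !$omp do
        DO i = 1, n
            partial = partial + x(i)
        END DO
        !$omp end do
        ! Combine the private partial sums one thread at a time
        !$omp critical
        total = total + partial
        !$omp end critical
    !$omp end parallel

    WRITE(*, *) 'Sum =', total   ! expected: 1000000.0
END PROGRAM OmpSumExample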

Useful Resources

  1. F95_OpenMPv1_v2.dvi