Multigrid Poisson Solver

Specified on Laplacian Operator.

Basic Notation

Suppose the interest region is a square only.

Variable Name	Definition	Type
L	Width of the interest region.	double
U	Approximate solution of size N x N.	double*, 1-D array address
F	Discretize source function to size N x N.	double*, 1-D array address
D	Residual of size N x N.	double*, 1-D array address

Workflow and Structure

Compile

Compile directly

Remember to change SM Version for the GPU.

g++ -fopenmp -o MG_CPU MG_solver_CPU.cpp linkedlist.cpp
nvcc -arch=compute_52 -code=sm_52 -O3 --compiler-options -fopenmp -o MG_GPU MG_solver_GPU.cu linkedlist.cpp

Using Makefile

Change SMVERSION to your GPU version.

make

Clean up.

make clean

Cycle Structure File

Cycle.txt:

(Interest region length L) (min_x) (min_y)
(con_step) (con_N)
(N_max) (N_min)
(node)
(node option if needed)
(node)
(node option if needed)
(node)
(node option if needed)
...

Example:
- V-cycle
- W-cycle
Interest region length L, min_x, min_y

Input Parameter	Description
Interest region length L	Length of the square region.
min_x, min_y	The lower left point of the region.

con_step Ooptions

con_step	Operations
-1	Use trigger that depends when to go to next level.
0	Assign smoothing steps in each level manually.
INT	Do smoothing for this many steps at any level.

con_N Options

con_N	Operations
0	Assign grid size N at each level manually, and input minimum N = 1.
1	Grid size N goes to N/2 at next level, should input minimum N.
2	Grid size N goes to N-1 at next level, should input minimum N.

N_max, N_min

Input Parameter	Description
N_max	Initialize grid size, or maximum grid size for auto generate N.
N_min	Minimum grid size.

node

node	Operations
-1	Smoothing and then do restriction.
0	Use the exact solver.
1	Do prolongation and the smoothing.
2	End of the file, break the loop.

Node Option	con_N = 0: Input Manually	con_N = 1: N = N / 2	con_N = 2: N = N - 1
con_step = -1: Error Trigger	next_N	(ignore)	(ignore)
con_step = 0: Input Manually	step next_N	step	step
con_step = INT: Steps	next_N	(ignore)	(ignore)

Run

./MG_CPU N_THREADS_OMP file_name

Input Parameter	Description
N_THREADS_OMP	Number of OpenMP threads.
file_name	Cycle Structure file.

Link List Data Structure

Node

Functions

V-cycle / W-cycle

for i in cycle:
if(cycle[i] == -1): doRestriction
if(cycle[i] == 0): doExactSolver
if(cycle[i] == 1): doProlongation

FMG

while():
triggers:

CPU Functions

All of the grids are stored as 1D array, with size N x N, including the boundary.

Source

void getSource: Get the discretize form of the source function of size N x N. Change made inside double *F.
- Input Variable:
  - Grid size: int N
  - Interest Region length L: double L
  - Source f [1D-array address]: double *F
  - x minimum range: double min_x
  - y minimum range: double min_y

Boundary

void getBoundary: Get the discretize form of the boundary of size N x N, only the boundary are setted, the others are 0.
- Input Variable:
  - Grid size: int N
  - Interest Region length L: double L
  - Boundary [1D-array address]: double *F
  - x minimum range: double min_x
  - y minimum range: double min_y

Residual

void getResidual: Get the residual d = Lu - f
- Input Variable:
  - Grid size: int N
  - Interest Region length L: double L
  - Approximate solution [1D-array address]: double *U
  - Source f [1D-array address]: double *F
  - Residual d [1D-array address]: dobule *D

Grid Addition

void doGridAddition: Add two Grids together U1 + U2, and store the result inside double *U1.
- Input Variable:
  - Grid size: int N
  - Grid 1 [1D-array address]: double *U1
  - Grid 2 [1D-array address]: double *U2

Smoothing

void doSmoothing: Change made inside double *U, and save the error from the latest smoothing in double *error.
- Input Variable:
  - Grid size: int N
  - Interest Region length L: double L
  - Approximate solution [1D-array address]: double *U
  - Source f [1D-array address]: double *F
  - Steps: int step
  - Error: double *error

Exact Solver

void doExactSolver: Change made inside double *U
- Input Variable:
  - Grid size: int N
  - Interest Region length L: double L
  - Approximation solution [1D-array address]: double *U
  - Souce f [1D-array address]: double *F
  - Target error: double target_error
  - Solver options: int option
    - option == 0: use Inverse Matrix
    - option == 1: use Gauss-Seidel, with even/odd method.

Inverse Matrix Method

void InverseMatrix: Calculate the Inverse Matrix of the current discretized Laplacian, and do the multiplication to get the answer double *U.
- Input Variable:
  - Grid size: int N
  - Interest region length L: double Length
  - Exact solution [1D-array address]: double *X
  - Source Term f [1D-array address]: double *F
NOTES:
- Parallel with OpenMP later, since the performance is generally same as Gauss-Seidel, and it is hard to parallelize.

Gauss-Seidel Relaxation Method

void GaussSeidel: Change made inside exact solution double *U. Relax till it reaches the target error.
- Input Variable:
  - Grid size: int N
  - Interest Region length L: double L
  - Exact solution [1D-array address]: double *U
  - Source Term [1D-array address]: double *F
  - Target error: double target_error

Restriction

void doRestriction: Change made inside double *U_c
- Input Variable:
  - Grid size: int N
  - To be restrict [1D-array address]: double *U_f
  - Grid size: int M
  - After restriction[1D-array address]: double *U_c
- NOTES:
  1. Restriction is specific on residual, and since we only don't do relaxation on the boundary, so the boundary of restriction target grid is always "0".

Prolongation

void doProlongation: Change made inside double *U_f
- Input Variable:
  - Grid size: int N
  - To be prolongate [1D-array address]: double *U_c
  - Grid size: int M
  - After prolongation [1D-array address]: double *U_f

Print

void doPrint: Print out the grid on screen, with the normal x-y coordinate.
- Input Variable:
  - Grid size: int N
  - To be printed [1D-array address]: double *U

Output as file

void doPrint2File: Print in file, with normal x-y coordinate.
- Input Variable:
  - Grid size: int N
  - To be printed [1D-array address]: double *U
  - output file name: char *filename

GPU Functions

All of the grids are stored as 1D array, with size N x N, including the boundary. Since the GPU is specialized in doing single precision computation, all the subroutine function of calling GPU kernal are using single precision. After calling GPU kernel, typecasting back to double precision. Originally, I use double precision for doExactSolver_GPU, but it turns out that it's tooooo slow.

Source (Problem Dependent)

void getSource_GPU: Get the discretize form of the source function of size N x N. Change made inside double *F.
- Input Variable:
  - Grid size: int N
  - Interest Region length L: double L
  - Source f [1D-array address]: double *F
  - x minimum range: double min_x
  - y minimum range: double min_y
__global__ void ker_Source_GPU: Get the discretize form of the source function of size N x N. Change made inside float *F.
- Input Variable:
  - Grid size: int N
  - delta x: float h
  - Source f [1D-array address]: float *F
  - x minimum range: float min_x
  - y minimum range: float min_y

Residual

void getResidual_GPU: Get the residual d = Lu - f
- Input Variable:
  - Grid size: int N
  - Interest Region length L: double L
  - Approximate solution [1D-array address]: double *U
  - Source f [1D-array address]: double *F
  - Residual d [1D-array address]: dobule *D
__global__ void ker_Residual_GPU: Get the residual d = Lu - f, changes made inside float D
- Input Variable:
  - Grid size: int N
  - delta x: float h
  - Residual d [1D-array address]: float *D
- NOTES:
  1. Bind float *U and float *F to texture memory, so need to pass the variable to the kernel.

Grid Addition

void doGridAddition_GPU: Add two Grids together U1 + U2, and store the result inside double *U1.
- Input Variable:
  - Grid size: int N
  - Grid 1 [1D-array address]: double *U1
  - Grid 2 [1D-array address]: double *U2
__global__ void ker_GridAddition_GPU: Add two Grids together U1 + U2, and store the result inside float *U1.
- Input Variable:
  - Grid size: int N
  - Grid 1 [1D-array address]: float *U1
- NOTES:
  1. Bind float *U2 to texture memory, so need to pass the variable to the kernel.

Smoothing

void doSmoothing_GPU: Change made inside double *U, and save the error from the latest smoothing in double *error.
- Input Variable:
  - Grid size: int N
  - Interest Region length L: double L
  - Approximate solution [1D-array address]: double *U
  - Source f [1D-array address]: double *F
  - Steps: int step
  - Error: double *error
- NOTES:
  1. Using parallel reduction, so threadsPerBlock = pow(2,m), and threadsPerBlock * blocksPerGrid <= N*N.
  2. The function sets the range of the block size is 2^0 ~ 2^10, and grid size is 10^0 ~ 10^5.
  3. The selecting threadsPerBlock and blocksPerGrid method here assumes that the greater the threadsPerBlock * blocksPerGrid and threadsPerBlock is, the faster it is.
  4. The error is defined as Sum( | Lu - f | ) / (N * N) = Sum( | U - U0 | ) * 4 / ((h * h) * (N * N)).
- TODOs if have time:
  - Improve the performance with sync with all threads within a grid Cooperative Kernel, so that we can save some time on calculating unwanted errors during the iterations.
__global__ void ker_Smoothing_GPU: Change made inside float *U or float *U0, and save the error from the latest smoothing in float *err. Using Jacobi Method, without even / odd method.
- Input Variable:
  - Grid size: int N
  - delta x: float h
  - Approximate solution [1D-array address]: float *U
  - U_old [1D-array addreses]: float *U0
  - Current numbers of iterations: int iter
  - Error array [1D-array address]: float *err
- NOTES:
  - Bind float *F to texture memory.
  - Changes made after step steps
    - If step is odd : get d_U
    - If step is even : get d_U0

Exact Solver

void doExactSolver_GPU: Change made inside double *U.
- Input Variable:
  - Grid size: int N
  - Interest Region length L: double L
  - Approximation solution [1D-array address]: double *U
  - Souce f [1D-array address]: double *F
  - Target error: double target_error
  - Solver options: int option
    - option == 0: blank
    - option == 1: use Gauss-Seidel, with even/odd method, GPU kernel in double precision.
    - option == 2: use Gauss-Seidel, with even/odd method, GPU kernel in single precision.

Gauss-Seidel Relaxation Method

Double Precision
- void GaussSeidel_GPU_Double: Change made inside exact solution double *U. Relax till it reaches the target error.
  - Input Variable:
    - Grid size: int N
    - Interest Region length L: double L
    - Exact solution [1D-array address]: double *U
    - Source Term [1D-array address]: double *F
    - Target error: double target_error
- __global__ void ker_GaussSeideleven_GPU_Double: Change made inside double *U, update even index only.
  - Input Variable:
    - Grid size: int N
    - delta x: double h
    - Approximation solution [1D-array address]: double *U
    - Souce f [1D-array address]: double *F
- __global__ void ker_GaussSeidelodd_GPU_Double: Change made inside double *U, update odd index only.
  - Input Variable:
    - Grid size: int N
    - delta x: double h
    - Approximation solution [1D-array address]: double *U
    - Souce f [1D-array address]: double *F
- __global__ void ker_Error_GPU_Double: Get the error of U, define as Sum( | Lu - F | ) / (N * N).
  - Input Variable:
    - Grid size: int N
    - delta x: double h
    - Approximation solution [1D-array address]: double *U
    - Souce f [1D-array address]: double *F
    - Error array [1D-array address]: double *err
Single Precision
- void GaussSeidel_GPU_Single: Change made inside exact solution double *U. Relax till it reaches the target error.
  - Input Variable:
    - Grid size: int N
    - Interest Region length L: double L
    - Exact solution [1D-array address]: double *U
    - Source Term [1D-array address]: double *F
    - Target error: double target_error
- __global__ void ker_GaussSeideleven_GPU_Single: Change made inside float *U, update even index only.
  - Input Variable:
    - Grid size: int N
    - delta x: float h
    - Approximation solution [1D-array address]: float *U
    - Souce f [1D-array address]: float *F
- __global__ void ker_GaussSeidelodd_GPU_Single: Change made inside float *U, update odd index only.
  - Input Variable:
    - Grid size: int N
    - delta x: float h
    - Approximation solution [1D-array address]: float *U
    - Souce f [1D-array address]: float *F
- __global__ void ker_Error_GPU_Single: Get the error of U, define as Sum( | Lu - F | ) / (N * N).
  - Input Variable:
    - Grid size: int N
    - delta x: float h
    - Approximation solution [1D-array address]: float *U
    - Souce f [1D-array address]: float *F
    - Error array [1D-array address]: float *err
NOTES:
1. Write two kernel, one deals with even index, the other deals with odd index, so that we can make sure that updates on even/odd are all done.
TODOs if have time:
- Try using sync with all threads within a grid Cooperative Kernel, which means forge two kernels together.

Restriction / Prolongation

void doRestriction: Change made inside double *U_c
- Input Variable:
  - Grid size: int N
  - To be restrict [1D-array address]: double *U_f
  - Grid size: int M
  - After restriction[1D-array address]: double *U_c
void doProlongation: Change made inside double *U_f
- Input Variable:
  - Grid size: int N
  - To be prolongate [1D-array address]: double *U_c
  - Grid size: int M
  - After prolongation [1D-array address]: double *U_f
__global__ void ker_Zoom_GPU: Zoom in/out from grid size int N to grid size int M. The final result is inside float *U_m.
- Input Variable:
  - Grid size before zooming: int N
  - Spacing between points before zooming: float h_n
  - Grid size after zooming: int M
  - Spacing between points after zooming: float h_m
  - Result of the grid: float *U_m
- NOTES:
  1. The grid before zooming is binded to the texture memory, so no need to pass in the pointer to the kernel.
NOTES:
1. Restriction is specific on residual, and since we only don't do relaxation on the boundary, so the boundary of restriction target grid is always "0".
2. The boundary of prolongation target grid is "0".

Name		Name	Last commit message	Last commit date
Latest commit History 142 Commits
Note		Note
note		note
plot		plot
res		res
src		src
testFunction		testFunction
.gitignore		.gitignore
README.md		README.md

cindytsai/multigrid_poisson_solver

Folders and files

Latest commit

History

Repository files navigation

Multigrid Poisson Solver

Contents

Basic Notation

Workflow and Structure

Compile

Cycle Structure File

Run

Link List Data Structure

Node

Functions

V-cycle / W-cycle

FMG

CPU Functions

Source

Boundary

Residual

Grid Addition

Smoothing

Exact Solver

Inverse Matrix Method

Gauss-Seidel Relaxation Method

Restriction

Prolongation

Print

Output as file

GPU Functions

Source (Problem Dependent)

Residual

Grid Addition

Smoothing

Exact Solver

Gauss-Seidel Relaxation Method

Restriction / Prolongation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages