Add OpenCL/prim add_diag, diag_matrix, subtract and minor OpenCL fixes #2250

rok-cesnovar · 2020-12-09T13:55:49Z

Summary

This PR does the following:

fixes inv_cloglog header guard
adds OpenCL prim version of add_diag, diag_matrix, subtract
add is now handled by ADD_BINARY_OPERATION
adds an OpenCL prim test utility function and uses it in the tests for add, subtract, and the new functions
removes the OpenCL/prim cholesky_decompose test (prim is already tested in opencl/rev/) and expands the opencl/rev/cholesky test a bit
removes the zeros function and fill kernel, as they can be replaced by the kernel generator (KG); the _strict_tri versions remain for now
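
For readers unfamiliar with the three functions named in the list above, here is a minimal CPU-only sketch of the results they are expected to produce. It is not the OpenCL implementation from this PR: the `sketch::matrix` type and all function bodies are hypothetical stand-ins written with the standard library only, and only the scalar overload of add_diag is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical minimal column-major matrix, used only for this illustration.
struct matrix {
  std::size_t rows;
  std::size_t cols;
  std::vector<double> data;  // element (i, j) is stored at data[i + j * rows]

  double& operator()(std::size_t i, std::size_t j) {
    return data[i + j * rows];
  }
  double operator()(std::size_t i, std::size_t j) const {
    return data[i + j * rows];
  }
};

// diag_matrix: square matrix with v on the diagonal and zeros elsewhere.
inline matrix diag_matrix(const std::vector<double>& v) {
  matrix m{v.size(), v.size(), std::vector<double>(v.size() * v.size(), 0.0)};
  for (std::size_t i = 0; i < v.size(); ++i) {
    m(i, i) = v[i];
  }
  return m;
}

// add_diag: copy of m with a scalar added to every diagonal element.
inline matrix add_diag(matrix m, double to_add) {
  const std::size_t n = m.rows < m.cols ? m.rows : m.cols;
  for (std::size_t i = 0; i < n; ++i) {
    m(i, i) += to_add;
  }
  return m;
}

// subtract: elementwise difference of two equally sized matrices.
inline matrix subtract(const matrix& a, const matrix& b) {
  assert(a.rows == b.rows && a.cols == b.cols);
  matrix r = a;
  for (std::size_t k = 0; k < r.data.size(); ++k) {
    r.data[k] -= b.data[k];
  }
  return r;
}

}  // namespace sketch
```

A reference of this kind is what an OpenCL result would typically be compared against in a unit test.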

Tests

All new functions have them, add/subtract now use the utility function.
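
The idea behind such a test utility can be sketched as follows. This is only an illustration under stated assumptions: `results_match` is a hypothetical name, it works on plain `std::vector<double>` rather than Eigen or matrix_cl types, and it is not the helper added in test/unit/math/opencl/util.hpp.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical helper: run a reference implementation and a candidate
// implementation on the same input and compare the outputs elementwise.
template <typename F_ref, typename F_test>
inline bool results_match(F_ref&& reference, F_test&& candidate,
                          const std::vector<double>& input,
                          double tol = 1e-12) {
  const std::vector<double> expected = reference(input);
  const std::vector<double> actual = candidate(input);
  if (expected.size() != actual.size()) {
    return false;
  }
  for (std::size_t i = 0; i < expected.size(); ++i) {
    if (std::fabs(expected[i] - actual[i]) > tol) {
      return false;
    }
  }
  return true;
}

}  // namespace sketch
```

Passing the function under test as a callable lets one helper serve add, subtract, and the new functions without repeating the comparison logic in each test file.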

Side Effects

/

Release notes

OpenCL: added prim support for add_diag(), diag_matrix(), subtract(), inv_cloglog() and cleaned up superseded kernels/functions zeros and fill.

Checklist

Math issue Implement matrix_cl overloads for prim & rev functions #1854
Copyright holder: Rok Češnovar, Uni. of Ljubljana

The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

t4c1 · 2020-12-09T14:00:14Z

stan/math/opencl/kernel_generator/binary_operation.hpp

@@ -185,7 +185,10 @@ class binary_operation : public operation_cl<Derived, T_res, T_a, T_b> {
  }

 ADD_BINARY_OPERATION(addition_, operator+, common_scalar_t<T_a COMMA T_b>, "+");
-ADD_BINARY_OPERATION(subtraction_, operator-, common_scalar_t<T_a COMMA T_b>,
+ADD_BINARY_OPERATION(addition_operator_, add, common_scalar_t<T_a COMMA T_b>, "+");


You have the class names for addition the wrong way around: the other one is the operator. Also, it might be better to add a wrapper function for add that delegates to the operator instead of constructing another class.
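
A minimal sketch of the wrapper approach suggested in this comment, assuming nothing about the real kernel generator types: `sketch::expr` and `sketch::addition_` below are hypothetical stand-ins for an expression and for the class that ADD_BINARY_OPERATION would generate. The point is only that `add` forwards to `operator+`, so a single operation class is enough.

```cpp
#include <cassert>
#include <utility>

namespace sketch {

// Hypothetical stand-in for a kernel generator expression.
struct expr {
  double value;
};

// Hypothetical stand-in for the single addition operation class.
struct addition_ {
  expr a;
  expr b;
  double eval() const { return a.value + b.value; }
};

// The operator builds the operation object.
inline addition_ operator+(const expr& a, const expr& b) { return {a, b}; }

// Wrapper: add() delegates to operator+ instead of having its own class.
template <typename T_a, typename T_b>
inline auto add(T_a&& a, T_b&& b)
    -> decltype(std::forward<T_a>(a) + std::forward<T_b>(b)) {
  return std::forward<T_a>(a) + std::forward<T_b>(b);
}

}  // namespace sketch
```

With this shape, `add(a, b)` and `a + b` return the same type, which avoids maintaining two nearly identical classes.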

t4c1 · 2020-12-09T14:01:27Z

stan/math/opencl/err/check_diagonal_zeros.hpp

@@ -33,8 +33,7 @@ inline void check_diagonal_zeros(const char* function, const char* name,
  cl::Context ctx = opencl_context.context();
  try {
    int zero_on_diagonal_flag = 0;
-    matrix_cl<int> zeros_flag(1, 1);
-    zeros_flag = to_matrix_cl(zero_on_diagonal_flag);
+    matrix_cl<int> zeros_flag = constant(0, 1, 1);


[optional] This looks good, but we could completely replace this function with a kernel generator implementation.
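
For context, the flag-based check in the diff above can be illustrated on the CPU as follows. This is a sketch under assumptions, not the OpenCL kernel or a kernel generator version: the function names are hypothetical, the matrix is a plain column-major `std::vector<double>`, and reporting the failure through `std::domain_error` is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

namespace sketch {

// Returns 1 if any diagonal entry of a column-major rows x cols matrix is
// zero, and 0 otherwise. The flag starts at 0, as in the diff.
inline int zero_on_diagonal_flag(const std::vector<double>& data,
                                 std::size_t rows, std::size_t cols) {
  int flag = 0;
  const std::size_t n = rows < cols ? rows : cols;
  for (std::size_t i = 0; i < n; ++i) {
    if (data[i + i * rows] == 0.0) {
      flag = 1;
    }
  }
  return flag;
}

// Hypothetical CPU counterpart of the check: throws if the flag was set.
inline void check_diagonal_zeros_cpu(const char* function, const char* name,
                                     const std::vector<double>& data,
                                     std::size_t rows, std::size_t cols) {
  if (zero_on_diagonal_flag(data, rows, cols) != 0) {
    throw std::domain_error(std::string(function) + ": " + name
                            + " has zeros on the diagonal.");
  }
}

}  // namespace sketch
```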

stan/math/opencl/kernel_generator/check_cl.hpp

stan/math/opencl/multiply.hpp

stan/math/opencl/prim/add_diag.hpp

stan/math/opencl/tri_inverse.hpp

test/unit/math/opencl/prim/add_test.cpp

test/unit/math/opencl/prim/subtract_test.cpp

test/unit/math/opencl/util.hpp

stan-buildbot · 2020-12-11T02:39:19Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	3.45	3.48	0.99	-0.77% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	0.99	-0.71% slower
eight_schools/eight_schools.stan	0.11	0.11	1.01	1.45% faster
gp_regr/gp_regr.stan	0.15	0.15	0.99	-1.4% slower
irt_2pl/irt_2pl.stan	5.74	5.81	0.99	-1.31% slower
performance.compilation	87.77	86.0	1.02	2.02% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	8.38	8.39	1.0	-0.15% slower
pkpd/one_comp_mm_elim_abs.stan	29.49	30.53	0.97	-3.52% slower
sir/sir.stan	144.73	144.01	1.01	0.5% faster
gp_regr/gen_gp_data.stan	0.04	0.04	1.0	0.34% faster
low_dim_gauss_mix/low_dim_gauss_mix.stan	2.94	2.94	1.0	0.1% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.38	0.38	1.0	0.2% faster
arK/arK.stan	2.48	2.47	1.0	0.36% faster
arma/arma.stan	0.59	0.59	1.0	-0.17% slower
garch/garch.stan	0.61	0.61	1.0	0.24% faster
Mean result: 0.998278870763
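
The two derived columns can be reproduced from the Old Result and New Result columns. The sketch below assumes, judging from the rows above, that Ratio is old / new, and uses the formula from the table header, 1 - new / old, for the performance change. The function names are hypothetical, and small differences from the table are expected because the printed results are rounded.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// Ratio column, assumed to be old / new (values above 1 mean "faster").
inline double ratio(double old_result, double new_result) {
  return old_result / new_result;
}

// Performance change column, as given in the table header: 1 - new / old.
inline double performance_change(double old_result, double new_result) {
  return 1.0 - new_result / old_result;
}

}  // namespace sketch
```

For example, the performance.compilation row (87.77 to 86.0) gives a ratio of about 1.02 and a change of about +2.02%, while pkpd/one_comp_mm_elim_abs.stan (29.49 to 30.53) gives about 0.97 and roughly -3.5%.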

Jenkins Console Log
Blue Ocean
Commit hash: e87f2b3

Machine information

ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

rok-cesnovar and others added 10 commits December 8, 2020 17:54

fix inv_cloglog header guard

303e974

Merge branch 'develop' into opencl_prim_misc

5dabb84

fix add and subtract

33db9ce

add prim test util

f4cc731

use prim util in add tests

merge prim and rev cholesky_decompose tests

34f1150

use util in subtract test

7d5ad8d

add add_diag

44f06bf

expand add_diag test

ae1cda7

add diag_matrix and remove zeros()

c4bdbf1

[Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0…

eda2ca2

…4.1 (tags/RELEASE_600/final)

rok-cesnovar requested a review from t4c1 December 9, 2020 14:18

t4c1 requested changes Dec 9, 2020

rok-cesnovar and others added 5 commits December 9, 2020 15:30

fix headers

0fed529

Apply suggestions from code review

a99db84

Co-authored-by: Tadej Ciglarič <[email protected]>

[Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0…

cf3dd28

…4.1 (tags/RELEASE_600/final)

apply other review suggestions

a2d4ec3

copy in add_diag

666da6f

t4c1 reviewed Dec 9, 2020

stan/math/opencl/prim/add_diag.hpp (outdated, resolved)

Update stan/math/opencl/prim/add_diag.hpp

fc18f42

Co-authored-by: Tadej Ciglarič <[email protected]>

t4c1 previously approved these changes Dec 9, 2020

remove remaining zeros from tests

ed8c822

rok-cesnovar dismissed t4c1’s stale review via ed8c822 December 10, 2020 14:30

[Jenkins] auto-formatting by clang-format version 6.0.0-1ubuntu2~16.0…

f61b823

…4.1 (tags/RELEASE_600/final)

rok-cesnovar requested a review from t4c1 December 10, 2020 14:35

rok-cesnovar added 2 commits December 10, 2020 18:37

cleanup zeros

892b39d

bugfix zeros return in multiply

e87f2b3

t4c1 approved these changes Dec 11, 2020

t4c1 merged commit 0bb9e01 into develop Dec 11, 2020

rok-cesnovar deleted the opencl_prim_misc branch December 11, 2020 08:34
