Add OpenCL/prim add_diag, diag_matrix, subtract and minor OpenCL fixes #2250

Merged: 20 commits into develop from opencl_prim_misc on Dec 11, 2020

Conversation

rok-cesnovar
Member

Summary

This PR does the following:

  • fixes the inv_cloglog header guard
  • adds OpenCL prim versions of add_diag, diag_matrix, and subtract
  • add is now handled by ADD_BINARY_OPERATION
  • adds an OpenCL prim test utility function and uses it in the tests for add, subtract, and the new functions
  • removes the OpenCL/prim cholesky_decompose test (prim is already tested in opencl/rev/) and expands the opencl/rev/cholesky test a bit
  • removes the zeros function and the fill kernel, as they can be replaced by the kernel generator (KG); the _strict_tri versions remain for now
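For reference, here is a minimal CPU sketch of the results the new OpenCL prim functions are expected to match. It assumes the usual Stan Math semantics: diag_matrix builds a square matrix from a vector, add_diag adds values to the diagonal, and subtract is elementwise. It uses std::vector rather than matrix_cl, and the `matrix` alias and the `_ref` function names are hypothetical, not part of the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dense row-major matrix type, used for illustration only.
using matrix = std::vector<std::vector<double>>;

// diag_matrix: square matrix with v on the diagonal and zeros elsewhere.
inline matrix diag_matrix_ref(const std::vector<double>& v) {
  matrix m(v.size(), std::vector<double>(v.size(), 0.0));
  for (std::size_t i = 0; i < v.size(); ++i) {
    m[i][i] = v[i];
  }
  return m;
}

// add_diag: copy of m with d[i] added to the diagonal element m[i][i].
inline matrix add_diag_ref(matrix m, const std::vector<double>& d) {
  for (std::size_t i = 0; i < m.size() && i < d.size(); ++i) {
    if (i < m[i].size()) {
      m[i][i] += d[i];
    }
  }
  return m;
}

// subtract: elementwise a - b; assumes a and b have the same dimensions.
inline matrix subtract_ref(matrix a, const matrix& b) {
  for (std::size_t i = 0; i < a.size(); ++i) {
    for (std::size_t j = 0; j < a[i].size(); ++j) {
      a[i][j] -= b[i][j];
    }
  }
  return a;
}
```

A test utility of the kind described above would typically compare the OpenCL result against a CPU reference like this one.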

Tests

All new functions have tests; add and subtract now use the utility function.

Side Effects

/

Release notes

OpenCL: added prim support for add_diag(), diag_matrix(), subtract(), inv_cloglog() and cleaned up superseded kernels/functions zeros and fill.

Checklist

@rok-cesnovar rok-cesnovar requested a review from t4c1 December 9, 2020 14:18
@@ -185,7 +185,10 @@ class binary_operation : public operation_cl<Derived, T_res, T_a, T_b> {
}

ADD_BINARY_OPERATION(addition_, operator+, common_scalar_t<T_a COMMA T_b>, "+");
ADD_BINARY_OPERATION(subtraction_, operator-, common_scalar_t<T_a COMMA T_b>,
ADD_BINARY_OPERATION(addition_operator_, add, common_scalar_t<T_a COMMA T_b>, "+");
Contributor

You have the class names for addition the wrong way around: the other one is the operator. Also, it might be better to add a wrapper function for add that delegates to the operator instead of constructing another class.
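The wrapper suggestion can be sketched generically as follows. This is only an illustration of the delegation pattern under stated assumptions, not the kernel generator's actual code: `expr` is a hypothetical stand-in for an expression type, and the real signature of `add` in the library may differ.

```cpp
#include <utility>

// Hypothetical stand-in for an expression type; not a Stan Math class.
struct expr {
  int value;
};

// The operator is the single place where the operation is defined.
constexpr expr operator+(const expr& a, const expr& b) {
  return expr{a.value + b.value};
}

// Named wrapper: forwards its arguments to operator+ and returns whatever
// the operator returns, so no second operation class is needed.
template <typename T_a, typename T_b>
constexpr auto add(T_a&& a, T_b&& b)
    -> decltype(std::forward<T_a>(a) + std::forward<T_b>(b)) {
  return std::forward<T_a>(a) + std::forward<T_b>(b);
}
```

With this shape, `add(a, b)` and `a + b` produce the same type and value by construction, so only one operation class has to be maintained.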

@@ -33,8 +33,7 @@ inline void check_diagonal_zeros(const char* function, const char* name,
cl::Context ctx = opencl_context.context();
try {
int zero_on_diagonal_flag = 0;
matrix_cl<int> zeros_flag(1, 1);
zeros_flag = to_matrix_cl(zero_on_diagonal_flag);
matrix_cl<int> zeros_flag = constant(0, 1, 1);
Contributor

[optional] This looks good, but we could completely replace this function with kernel generator implementation.

t4c1
t4c1 previously approved these changes Dec 9, 2020
@rok-cesnovar rok-cesnovar requested a review from t4c1 December 10, 2020 14:35
@stan-buildbot
Contributor


| Name | Old Result | New Result | Ratio | Performance change (1 - new / old) |
|---|---|---|---|---|
| gp_pois_regr/gp_pois_regr.stan | 3.45 | 3.48 | 0.99 | -0.77% slower |
| low_dim_corr_gauss/low_dim_corr_gauss.stan | 0.02 | 0.02 | 0.99 | -0.71% slower |
| eight_schools/eight_schools.stan | 0.11 | 0.11 | 1.01 | 1.45% faster |
| gp_regr/gp_regr.stan | 0.15 | 0.15 | 0.99 | -1.4% slower |
| irt_2pl/irt_2pl.stan | 5.74 | 5.81 | 0.99 | -1.31% slower |
| performance.compilation | 87.77 | 86.0 | 1.02 | 2.02% faster |
| low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan | 8.38 | 8.39 | 1.0 | -0.15% slower |
| pkpd/one_comp_mm_elim_abs.stan | 29.49 | 30.53 | 0.97 | -3.52% slower |
| sir/sir.stan | 144.73 | 144.01 | 1.01 | 0.5% faster |
| gp_regr/gen_gp_data.stan | 0.04 | 0.04 | 1.0 | 0.34% faster |
| low_dim_gauss_mix/low_dim_gauss_mix.stan | 2.94 | 2.94 | 1.0 | 0.1% faster |
| pkpd/sim_one_comp_mm_elim_abs.stan | 0.38 | 0.38 | 1.0 | 0.2% faster |
| arK/arK.stan | 2.48 | 2.47 | 1.0 | 0.36% faster |
| arma/arma.stan | 0.59 | 0.59 | 1.0 | -0.17% slower |
| garch/garch.stan | 0.61 | 0.61 | 1.0 | 0.24% faster |

Mean result: 0.998278870763

Jenkins Console Log
Blue Ocean
Commit hash: e87f2b3


Machine information
ProductName: Mac OS X
ProductVersion: 10.11.6
BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@t4c1 t4c1 merged commit 0bb9e01 into develop Dec 11, 2020
@rok-cesnovar rok-cesnovar deleted the opencl_prim_misc branch December 11, 2020 08:34