
Replace sqrt(dot_self()) with norm2 #2648

Closed
wants to merge 4 commits

Conversation

lyndond
Contributor

@lyndond lyndond commented Jan 9, 2022

Summary

I'm hoping this one won't be too controversial. This PR addresses the remaining TODO in #2562.
PR #2636 introduced specialized prim/fwd/rev functions for L1 and L2 norms. This PR tidies some instances in the codebase where r = sqrt(dot_self(x)); can now be replaced with r = norm2(x);. The main ones I found were:

  • unit_vector_constrain in prim/fwd/rev

If there are any instances I may have missed, please let me know.
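
For readers unfamiliar with the functions involved, here is a minimal sketch of the pattern being swapped. The vector and values are illustrative only and are not taken from the diff; norm2 is the L2-norm function introduced in #2636.

    #include <stan/math.hpp>
    #include <Eigen/Dense>
    #include <cmath>
    #include <iostream>

    int main() {
      // Illustrative data only; not from the PR.
      Eigen::VectorXd x(3);
      x << 1.0, 2.0, 3.0;

      // Before: square root of the self dot product.
      double r_old = std::sqrt(stan::math::dot_self(x));

      // After: a single call to the L2 norm.
      double r_new = stan::math::norm2(x);

      // Both compute the Euclidean length of x (sqrt(14) here).
      std::cout << r_old << " " << r_new << "\n";
      return 0;
    }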

Tests

No new tests. Existing tests pass.

Side Effects

N/A

Release notes

Replaces sqrt(dot_self()) with norm2() in the unit_vector_constrain implementations (prim/fwd/rev).

Checklist

  • Math issue: L1 and L2 norms (Euclidean and taxicab lengths) #2562

  • Copyright holder: Lyndon Duong
    The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
    - Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
    - Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

  • the basic tests are passing

    • unit tests pass (to run, use: ./runTests.py test/unit)
    • header checks pass (make test-headers)
    • dependencies checks pass (make test-math-dependencies)
    • docs build (make doxygen)
    • code passes the built-in C++ standards checks (make cpplint)
  • the code is written in idiomatic C++ and changes are documented in the doxygen

  • the new changes are tested

@stan-buildbot
Contributor


Name Old Result New Result Ratio Performance change (1 - new / old)
gp_pois_regr/gp_pois_regr.stan 3.6 3.64 0.99 -1.08% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan 0.02 0.02 1.0 -0.48% slower
eight_schools/eight_schools.stan 0.08 0.08 1.01 0.91% faster
gp_regr/gp_regr.stan 0.14 0.14 0.99 -0.63% slower
irt_2pl/irt_2pl.stan 5.76 5.71 1.01 0.81% faster
performance.compilation 93.18 91.22 1.02 2.1% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan 8.22 8.15 1.01 0.84% faster
pkpd/one_comp_mm_elim_abs.stan 31.78 31.02 1.02 2.41% faster
sir/sir.stan 122.21 121.79 1.0 0.34% faster
gp_regr/gen_gp_data.stan 0.03 0.03 0.99 -1.36% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan 2.97 3.02 0.98 -1.67% slower
pkpd/sim_one_comp_mm_elim_abs.stan 0.42 0.38 1.11 9.92% faster
arK/arK.stan 2.08 2.06 1.01 1.02% faster
arma/arma.stan 0.28 0.27 1.01 0.56% faster
garch/garch.stan 0.6 0.61 0.99 -0.69% slower
Mean result: 1.00954507309

Jenkins Console Log
Blue Ocean
Commit hash: d280811


Machine information
ProductName: Mac OS X
ProductVersion: 10.11.6
BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

@andrjohns
Collaborator

andrjohns commented Jan 10, 2022

Thanks for the PR! I'm not sure whether this change is actually necessary though, as it looks like it essentially just replaces sqrt(dot_self()) with calls to square(norm2()), so the number of operations and the overall complexity are unchanged.

@lyndond
Contributor Author

lyndond commented Jan 10, 2022

That's fair -- I was thinking that too as I was making the changes. Perhaps there was something else @bob-carpenter had in mind in issue #2562.

@bob-carpenter
Member

@andrjohns: We want to replace sqrt(dot_self(x)) with norm2(x), not with square(norm2(x)), because norm2(x) = sqrt(dot_self(x)).
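
In symbols, the identity behind the request (standard definitions, written out here for clarity rather than quoted from the thread):

    \[
      \texttt{dot\_self}(x) = x^\top x = \sum_i x_i^2,
      \qquad
      \texttt{norm2}(x) = \sqrt{x^\top x} = \sqrt{\texttt{dot\_self}(x)},
    \]

so norm2(x) is the exact drop-in for sqrt(dot_self(x)).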

@lyndond
Contributor Author

lyndond commented Jan 10, 2022

Right. I replaced instances of sqrt(dot_self()) with norm2(). There were a few instances where the squared norm is required (e.g. line 46 in stan/math/fwd/fun/unit_vector_constrain.hpp), and in those cases I left dot_self() as is.
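
A hedged sketch of the two cases (function and variable names here are illustrative, not the actual file contents): where only the norm is needed, norm2 is a one-for-one replacement; where the squared norm is needed, dot_self stays, since squaring norm2's result would pay for a sqrt and then immediately undo it.

    #include <stan/math.hpp>
    #include <Eigen/Dense>
    #include <cmath>

    // Case 1: only the norm is needed -- norm2 replaces sqrt(dot_self(...)).
    double norm_only(const Eigen::VectorXd& x) {
      return stan::math::norm2(x);  // was: std::sqrt(stan::math::dot_self(x))
    }

    // Case 2: the squared norm is needed -- dot_self is left as is.
    double squared_norm_only(const Eigen::VectorXd& x) {
      return stan::math::dot_self(x);  // norm2(x) * norm2(x) would waste the sqrt
    }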

Member

@bob-carpenter bob-carpenter left a comment


Hmm. These changes are OK semantically, but I think the original versions are slightly faster in all cases, because it's not just the norm2 calculation we need -- we also need its square.

If you want to work on speed here, what we could use is the autodiff implemented analytically, as for any other special function. The twist is that we also need to set lp (right now it just does lp +=, but we should set it to a new var that holds a pointer to lp and to y), and also define the version that doesn't increment lp.

What we'd really like instead of this replacement is analytic gradients for the constraint functions. We should be able to define this just like any other autodiffable function but have it also set lp to a new value.
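
For reference, the derivatives such a specialized implementation would need are standard calculus (a sketch, not taken from the thread or from Stan's docs). Writing r = ||y||_2, the transform returns u = y / r and the lp overload subtracts r^2 / 2, so

    \[
      u_i = \frac{y_i}{r},
      \qquad
      \frac{\partial u_i}{\partial y_j} = \frac{\delta_{ij}}{r} - \frac{y_i\,y_j}{r^3},
      \qquad
      \frac{\partial}{\partial y_j}\!\left(-\tfrac{1}{2}\,r^2\right) = -\,y_j,
    \]

which is what an analytic reverse-mode implementation would accumulate directly instead of tracing dot_self, sqrt, and the division through autodiff.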

-  eig_partial squared_norm = dot_self(y_val);
-  eig_partial norm = sqrt(squared_norm);
+  eig_partial norm = norm2(y_val);
+  eig_partial squared_norm = norm * norm;
Member


These are going to be very similar operationally. I think the original code may require fewer operations and be more stable arithmetically, because the new one has the sqrt inside norm2 and then you have to do the additional squaring. So unless there's something I'm not seeing, I think this should be switched back. If for some reason you do want to keep the new code, line 27 should be square(norm).

@@ -34,7 +34,7 @@ inline auto unit_vector_constrain(const T& y) {
   arena_t<T> arena_y = y;
   arena_t<promote_scalar_t<double, T>> arena_y_val = arena_y.val();

-  const double r = arena_y_val.norm();
+  const double r = norm2(arena_y_val);
Member


For double arguments, it's probably better to use the built-in Eigen functions, as they're more highly optimized for this. Of course, norm2 itself should delegate to this .norm() for double-valued vectors.
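
A minimal sketch of the delegation being suggested, assuming a plain Eigen::VectorXd overload purely for illustration (this is not the library's actual overload set or metaprogramming):

    #include <Eigen/Dense>

    // Hypothetical double-only overload: hand the work to Eigen's optimized .norm().
    inline double norm2(const Eigen::VectorXd& v) {
      return v.norm();  // Eigen computes sqrt(v.squaredNorm()) with vectorized code
    }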

    return y_ref.array() / sqrt(SN);
    value_type_t<T1> r = norm2(y_ref);
    check_positive_finite("unit_vector_constrain", "norm", r);
    lp -= 0.5 * (r * r);
Member


This looks like it's also just paying that squaring cost again here on line 51, so I think the original code may again be faster here.

What we really want here is a specialized function for unit_vector_constrain that does all of this with analytic derivatives for lp and the return type.

@lyndond
Contributor Author

lyndond commented Jan 11, 2022

Thanks for the review @bob-carpenter. It looks like this is a bit more nuanced than initially anticipated. I think I'll close the PR for now until we settle on something that requires a more significant change.

@lyndond lyndond closed this Jan 11, 2022