Get `sd`, `log_softmax`, and `log_sum_exp` fully `var<mat>` compatible (Issue #2101) #2169
Conversation
Questions & comments
stan/math/prim/meta/is_container.hpp
Outdated
```diff
@@ -19,7 +20,8 @@ namespace stan {
  */
 template <typename Container>
 using is_container = bool_constant<
-    math::disjunction<is_eigen<Container>, is_std_vector<Container>>::value>;
+    math::disjunction<is_eigen<Container>, is_std_vector<Container>,
+                      is_var_matrix<Container>>::value>;
```
A `var<mat>` should count as a container, right? This might break template logic elsewhere but we can fix it elsewhere.
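For intuition, here is a minimal self-contained sketch of this disjunction-of-traits pattern. The names (`var_matrix_stub`, `is_std_vec`, `is_var_matrix_stub`, `is_container_sketch`) are hypothetical stand-ins, not the real Stan metaprograms; it only illustrates how adding one more trait to the disjunction widens what counts as a container.

```cpp
#include <type_traits>
#include <vector>

// Stand-in for var_value<Eigen::MatrixXd>; not the real Stan type.
struct var_matrix_stub {};

template <typename T>
struct is_std_vec : std::false_type {};
template <typename T>
struct is_std_vec<std::vector<T>> : std::true_type {};

template <typename T>
struct is_var_matrix_stub : std::is_same<T, var_matrix_stub> {};

// "Container" means any one of the listed traits matches (a disjunction),
// mirroring how is_container gains is_var_matrix as a new alternative.
template <typename T>
using is_container_sketch
    = std::disjunction<is_std_vec<T>, is_var_matrix_stub<T>>;
```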
I'll take a look at this in the morning

Ping
@SteveBronder this is ready to look at. Not 100% there but I have a couple questions. Once we get those ironed out it will make it easy to finish `mat<var>`-ing `log_sum_exp` and `log_softmax`.
Couple quick comments
stan/math/rev/fun/log_softmax.hpp
Outdated
```cpp
alpha.adj().noalias()
    += res.adj_
       - (res.adj_.sum() * res.val_.array().exp()).matrix();
});
```
@t4c1, @SteveBronder writing this code I wanted a `make_callback_var` to wrap `make_callback_vari`. The argument passed to the functor can still be the vari to avoid the pointer chasing.

I also want `.adj()` and `.val()` on `vari_value`.

Does that all sound okay to add? (Edit: if so I'll just do it here)
I want `make_callback_var` to avoid type algebra. In this case the input and output types will match; in other cases they won't, and I would end up writing:

```cpp
using ret_type = decltype(theta.array() - log(theta.exp().sum()));
return var_value<ret_type>(...);
```
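The "type algebra" point is just that `decltype` lets the compiler name the expression type for you. A toy illustration with plain scalars (the function name is mine; with Eigen the deduced type would be a lazy expression template rather than `double`):

```cpp
#include <cmath>

// Deduce the return type from the expression itself instead of
// spelling it out by hand.
template <typename T>
auto shifted_log_sum(const T& a, const T& b) {
  using ret_type = decltype(a - std::log(std::exp(a) + std::exp(b)));
  return ret_type(a - std::log(std::exp(a) + std::exp(b)));
}
```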
```cpp
          require_std_vector_vt<is_matrix, Type>* = nullptr>
void check_return_type(const ReturnType& ret, const Type& x) {
  if (ret.size() > 0 && x.size() > 0)
    check_return_type(ret[0], x[0]);
```
I will not be offended if you want me to take a closer look at `check_return_type`.

I think the logic we want is:

- If there are only `var_value<double>`s on the input, then there should only be `var_value<double>`s on the output
- If there are `var_value<not double>`s on the input, then there should be no `var_value<double>`s on the output

I think this would need an extra template program `var_value_t` to extract the `var_value` from a generic input type and I am too lazy to write it today.
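A sketch of what such a `var_value_t` extractor could look like, with a hypothetical `var_value_stub` standing in for Stan's `var_value` (none of these names are the real API):

```cpp
#include <type_traits>
#include <vector>

// Hypothetical stand-in for stan::math::var_value<T>.
template <typename T>
struct var_value_stub {};

// var_value_t<T>: pull the var_value type out of a generic input type.
template <typename T>
struct var_value_t {
  using type = void;  // no var_value inside
};
template <typename T>
struct var_value_t<var_value_stub<T>> {
  using type = var_value_stub<T>;
};
// Recurse through std::vector containers to the inner type.
template <typename T>
struct var_value_t<std::vector<T>> : var_value_t<T> {};
```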
I think the logic is fine here unless I'm missing something. This is a specialization for `std::vector<T>` that just checks that the inner type for the input/output is correct. If that's what this does then I think it's fine
(Title changed from "Get `sd` `var<mat>` compatible (Issue #2101)" to "Get `sd`, `log_softmax`, and `log_sum_exp` fully `var<mat>` compatible (Issue #2101)")
Just a couple little things to change and then this looks good.
stan/math/rev/fun/log_softmax.hpp
Outdated
```cpp
return make_callback_vari(
    (theta.array() - log(theta.exp().sum())).matrix(),
    [alpha](const auto& res) mutable {
      alpha.adj().noalias()
          += res.adj_ - (res.adj_.sum() * res.val_.array().exp()).matrix();
    });
```
[side note] It would be nice to have a `make_callback_var` so we could still use `.adj()` and `.val()` etc. Mostly a quality-of-life feature
I added this
stan/math/rev/fun/log_softmax.hpp
Outdated
```cpp
T log_softmax_impl(const T& alpha) {
  check_nonzero_size("log_softmax", "alpha", alpha);

  const auto& theta = to_ref(alpha.val().array() - alpha.val().maxCoeff());
```
Why do you need `to_ref()` here?
I want it to evaluate to a temporary -- switched to an `.eval()`.
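For reference, the max-subtraction that `theta` implements can be shown in plain C++ without Eigen or Stan types (function name is mine): shifting by the max keeps `exp()` from overflowing while leaving the result unchanged.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// log_softmax(alpha)[i] = alpha[i] - log(sum_j exp(alpha[j])),
// computed via theta = alpha - max(alpha) for numerical stability.
std::vector<double> log_softmax_plain(const std::vector<double>& alpha) {
  double mx = *std::max_element(alpha.begin(), alpha.end());
  double sum = 0.0;
  for (double a : alpha) sum += std::exp(a - mx);
  double lse = std::log(sum);
  std::vector<double> out;
  out.reserve(alpha.size());
  for (double a : alpha) out.push_back(a - mx - lse);
  return out;
}
```

Without the shift, inputs like `{1000.0, 1000.0}` would overflow `exp()`; with it they return `-log(2)` each as expected.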
```cpp
/**
 * Specialisation for use with var_value<T> types where T inherits from
 * EigenBase. Inputs are mapped to Eigen column vectors.
 *
 * The returned scalar type is deduced to allow for cases where the input and
 * return scalar types differ (e.g., functions implicitly promoting
 * integers).
 */
```
Double check these docs
```cpp
template <typename ResultMatVar, typename ResultVarMat, typename MatVar,
          typename VarMat,
          require_std_vector_vt<is_var, ResultMatVar>* = nullptr,
          require_std_vector_vt<is_var, ResultVarMat>* = nullptr>
inline void test_matvar_gradient(const ad_tolerances& tols,
                                 ResultMatVar& A_mv_f, ResultVarMat& A_vm_f,
                                 const MatVar& A_mv, const VarMat& A_vm) {
  for (size_t i = 0; i < A_vm_f.size(); ++i) {
    A_vm_f[i].adj() = 1;
    A_mv_f[i].adj() = 1;
    stan::math::grad();
    expect_near_rel_var("var<Matrix> vs Matrix<var> input", A_vm, A_mv, tols);
    stan::math::set_zero_all_adjoints();
  }
}
```
This is confusing me, the requires look like they're for `std::vector<var>`? I think that's the only way that `A_vm_f[i].adj() = 1;` would work, because you can't assign a constant to an entire Eigen matrix like that
Or is this for when a `var<mat>` function would return a `std::vector<var>`?
This is for `std::vector<var>` outputs, not `std::vector<var<mat>>`
Quick Q - Why the addition of …

@andrjohns it's separate implementations for … I could make them …

Is the separate implementations because … Found the branch I was thinking of, I thought I had both …

Also, sorry if this is retreading obvious stuff with …

@andrjohns yeah this pull has those changes too (so that …). It's possible to do the …

Ah that all makes sense thanks
got rid of _impl versions of sd, log_softmax, and log_sum_exp (Issue #2101)
@SteveBronder I made a bunch of changes: …
This all looks good to me! You have to fix up the docs but then ping me and I'll approve
```cpp
auto arena_diff = to_arena((x.val().array() - x.val().mean()).matrix());
double sum_of_squares = arena_diff.squaredNorm();
double sd = std::sqrt(sum_of_squares / (x.size() - 1));
```
[optional] You could do the little loop thing here to make this faster but it's fine as is
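For reference, the value computation above in plain C++ (no arena or autodiff machinery; the function name is mine), including the explicit loops being discussed:

```cpp
#include <cmath>
#include <vector>

// Sample standard deviation via centered differences, matching
// sqrt(sum((x_i - mean)^2) / (n - 1)) in the snippet above.
double sample_sd(const std::vector<double>& x) {
  double mean = 0.0;
  for (double v : x) mean += v;
  mean /= x.size();
  double sum_of_squares = 0.0;
  for (double v : x) {
    double d = v - mean;
    sum_of_squares += d * d;
  }
  return std::sqrt(sum_of_squares / (x.size() - 1));
}
```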
> You have to fix up the docs

Do you mean the doxygen docs or the function reference docs (the second definitely needs updating and I wouldn't doubt the first lol)?
Oh doxygen my bad
@SteveBronder I don't know what was going on with the docs. I changed all the variables to be named `x` to get them to work. That was a real hairtugger
Woof, I had to add …
Jenkins Console Log

Machine information: ProductName: Mac OS X; ProductVersion: 10.11.6; BuildVersion: 15G22010; CPU: …; G++: …; Clang: …
Summary

This makes `sd`, `log_sum_exp`, and `log_softmax` fully `var<mat>` compatible (and `apply_vector_unary` in the process).

Release notes

Updated `sd`, `log_softmax`, and `log_sum_exp` to work fully with `var<mat>`.

Checklist

- Math issue: Make functions with custom autodiff `var<mat>` friendly #2101

Copyright holder: Columbia University

The copyright holder is typically you or your assignee, such as a university or company. By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
- the basic tests are passing
  - unit tests (`./runTests.py test/unit`)
  - header checks (`make test-headers`)
  - dependency checks (`make test-math-dependencies`)
  - docs build (`make doxygen`)
  - lint (`make cpplint`)
- the code is written in idiomatic C++ and changes are documented in the doxygen
- the new changes are tested