-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrong result when simd-ing over BitArrays #27482
Comments
Minimal example: function g(f, op, R::AbstractArray, A::AbstractArray)
@inbounds for IA in axes(A, 2)
@simd for i in axes(A, 1)
R[i] = op(R[i], f(A[i,IA]))
end
end
return R
end
g(identity, |, falses(10), trues(10, 10)) |
Yeah, I just got to the same reduction. That |
We could have a less strong version of |
Oh, interesting. This restriction makes sense, but it's news to me. So we cannot |
Correct. Technically not over reads from BitArrays either. |
Dang. That really hamstrings our ability to use I'm guessing lots of packages will hit this, too. I'd really love it if we could just fail to vectorize in cases like this instead of returning wrong results. |
Remove |
Fixes #27482. Copies the information from the performance tips into a docstring; makes it a bit scarier.
Just for what it's worth — yesterday before we had totally figured this out I had kicked off a bisect. You probably already know this, but it was 56d7ebe that started causing the incorrect results. |
For what it's worth, I don't think our compiler "hints" should ever have license to produce incorrect results like this. I understand that this is a bug; working around it and documenting it is fine for now, but at some point this does need fixing. |
That's literally the only thing that there annotations are for. Giving the compiler license to do transformations that it otherwise thinks are illegal. If the compiler though it was allowed to do this, it'd have done it iself. |
Typical uses of |
Keno's definitely correct. Could we do something like having an |
That's a fair point that bit arrays are a bit unusual in this regard. |
Sparse arrays would also incur trouble here. And even structured arrays won't have a consistent read memory access pattern — does that mean you have liberty to give me garbage? Or does that just mean it'll fail to vectorize? |
The guarantee we appear to be making is that each iteration of the loop will not observe the memory write from any other loop iteration. Random access pattern shouldn't be a problem (it'll likely just not be possible to vectorize). The BitArray issue is that a write actually consists of a read-modify-write (RMW) that makes it appear race-y here. |
So we have to remove simd annotation on all generic code? |
Definitely generic |
Instead could we make |
This bug makes Pkg3 completely useless on Skylake machines. We know from bugreports that we have a significant numbers of users on such machines. Bumping this for the next 0.7 pre-release. |
Yeah, this is definitely a priority to get a fix in for. |
In some sense, the problem is not that BitArray get/set are bad for SIMD, it is that they are too good for SIMD. That is, BitArrays getindex/setindex should talk to the simd machinery in order to try to get data processed in 64 bit chunks (or even better 128/256/512) instead of dropping down to 1-bit sequential processing. |
It's not just |
Fixed by #27670 |
I'm guessing this is an AVX512 codegen issue with the new compiler, but #27182 does not fix it. Doesn't occur with a 44-day old build (9f5351c) which is before the new compiler was enabled.
The text was updated successfully, but these errors were encountered: