random_() is not supported for bfloat16 CUDA tensors on Windows #33793
Labels
module: bfloat16
module: cuda
Related to torch.cuda, and CUDA support in general
module: random
Related to random number generation in PyTorch (rng generator)
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Comments
pbelevich added a commit that referenced this issue on Feb 26, 2020:
This pull request solves 4 problems:
1. Migrates `Tensor.random_()` from TH to ATen, both CPU and CUDA versions.
2. Allows `random_()` to generate the full range of 64-bit numbers (including the unsigned 64-bit max value).
3. Implements `random_()` for boolean tensors on CUDA.
4. Makes the `random_()` methods templated, so they can be used with custom RNGs.

It is done by the following changes:
1. Drop the TH CPU implementations of `random_()`.
2. Change the API of `random_()` to make the `to` argument optional, so that calling `random_(from=min_value, to=None)` generates numbers over the full 64-bit range.
3. Make three native functions, `random_()`, `random_(to)`, and `random_(from, to)`, which call three different kernels (both CPU and CUDA).
4. Create three `random_` kernels (both CPU and CUDA) to handle the `random_()` (no params), `random_(from, to)`, and `random_(from=min_value, to=None)` cases.
5. Templatize all `random_` implementations and kernels so they can be used with custom RNGs.
6. Create C++ tests that use a custom RNG and the `random_()` templates to check correctness.
7. Create Python tests covering all `random_()` scenarios with all possible dtypes and devices.

Fixes #24752 Fixes #32510 Fixes #33299 Fixes #33725

Known issue: #33793 random_() is not supported for bfloat16 CUDA tensors on Windows

Differential Revision: [D20056350](https://our.internmc.facebook.com/intern/diff/D20056350)

[ghstack-poisoned]
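For reference, the three `random_()` overloads described in the commit message above can be exercised like this (a minimal sketch using CPU integer tensors, which are unaffected by this issue):

```python
import torch

t = torch.empty(4, dtype=torch.int32)

# random_(): fill with numbers over the dtype's representable range.
t.random_()

# random_(to): fill with numbers in the half-open interval [0, to).
t.random_(10)
assert ((t >= 0) & (t < 10)).all()

# random_(from, to): fill with numbers in [from, to).
t.random_(5, 15)
assert ((t >= 5) & (t < 15)).all()
```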
@pbelevich I solved an issue that may be related to this in #37302. Would you please try again based on that PR?
This is a known issue that requires further investigation. Currently, calling any `random_()` method on a bfloat16 CUDA tensor on Windows invalidates the CUDA context, and all subsequent CUDA calls fail with `CUDA error: unspecified launch failure`.

Assignee, please look for "TODO: https://github.com/pytorch/pytorch/issues/33793" in the source code.
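A minimal sketch of the failure mode described above (this assumes a Windows machine with a CUDA-capable GPU; on affected builds the CUDA context becomes invalid after the `random_()` call, so the later synchronize fails):

```python
import torch

if torch.cuda.is_available():
    t = torch.empty(8, dtype=torch.bfloat16, device="cuda")
    t.random_()  # on affected Windows builds this corrupts the CUDA context
    try:
        # Any subsequent CUDA call then surfaces the failure.
        torch.cuda.synchronize()
    except RuntimeError as e:
        print(e)  # e.g. 'CUDA error: unspecified launch failure'
else:
    # CPU bfloat16 tensors are not affected by this issue.
    t = torch.empty(8, dtype=torch.bfloat16)
    t.random_(0, 10)
    print(t)
```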
cc @ngimel