[WIP][JIT] Add benchmarking support of NV Fuser with FP16 dtype support #44101
Conversation
…outer dimension versions. Add input_iter option to cover ranges of tensor sizes. Fix SOL/Algo calculations for datatype support. Added NV Fuser support.
💊 CI failures summary (Dr. CI): as of commit 366578f, 💚 Looks good so far! There are no failures yet.
Codecov Report
```
@@ Coverage Diff @@
##           master   #44101   +/- ##
=====================================
  Coverage   67.93%   67.94%
=====================================
  Files         384      384
  Lines       49746    49746
=====================================
+ Hits        33797    33798      +1
+ Misses      15949    15948      -1
```
Continue to review the full report at Codecov.
We still have some work to do cleaning this up, but we wanted to test the waters for additions here with @ZolotukhinM and @bertmaher.
…nt benchmark to isolate the simplest fusions' bandwidth performance.
Cool! I have just a couple suggestions which you're welcome to take or leave :). (And, sorry this benchmark harness is maybe a bit clunky. Hopefully this change wasn't too onerous to write!)
benchmarks/tensorexpr/__main__.py (Outdated)
```python
configs = []
for dim in tensor_dim_specs:
    assert len(dim) == 3, "There should be three dimensions for: start, stop, inc!"
```
Minor: this block of code would be clearer if `dim` were unpacked as `start, stop, inc = dim`.
I like your way better. Thanks for the suggestion!
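The rewrite the reviewer suggested might look like the following sketch; `tensor_dim_specs`, the sample triples, and the use of `configs` to collect size ranges are illustrative assumptions, not the benchmark's actual values:

```python
# Sketch of the suggested unpacking. `tensor_dim_specs` is assumed to be
# a list of (start, stop, inc) triples describing a range of tensor sizes.
tensor_dim_specs = [(32, 256, 32), (64, 1024, 64)]

configs = []
for dim in tensor_dim_specs:
    assert len(dim) == 3, "There should be three dimensions for: start, stop, inc!"
    start, stop, inc = dim  # clearer than indexing dim[0], dim[1], dim[2]
    configs.append(list(range(start, stop + 1, inc)))  # stop is inclusive here
```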
benchmarks/tensorexpr/benchmark.py (Outdated)
```python
def dtype_to_bytes(self):
    import torch
    if self.dtype == torch.float32:
```
You could ask torch for the size of a dtype by doing `torch.tensor(0, dtype=self.dtype).element_size()`. I have mixed feelings about creating a tensor just for this purpose, but it does at least keep the source of truth in torch instead of our benchmark harness :-p.
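For reference, this suggestion in isolation; a minimal sketch as a free function rather than the harness's actual method:

```python
import torch

def dtype_to_bytes(dtype):
    # Create a scalar tensor of the given dtype and ask torch for its
    # per-element size; this keeps the size table in torch itself rather
    # than duplicating it in the benchmark harness.
    return torch.tensor(0, dtype=dtype).element_size()
```

For example, `dtype_to_bytes(torch.float32)` returns 4 and `dtype_to_bytes(torch.float16)` returns 2.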
I am perfectly fine with this. I just added the method because I wasn't finding an ideal choice. One other note: I think the dtype-based memory calculations will need to change again. I was about to add some ops that had mixed precision and quickly realized that generically multiplying by a single type size was not going to work.
@bertmaher I was curious why you chose to use numpy for reference checking versus just PyTorch eager mode? The only reason I ask is that it seemed easier, to me, to check against PyTorch eager mode. Easier isn't necessarily the best reason. :-)
@kevinstephano I think comparing with eager mode is definitely the right way to go! @zheng-xq and I started this benchmark harness in a different repo where we were comparing several different frameworks (tf, tvm, etc.) in addition to pytorch, and numpy was a sort of least common denominator to use as a reference. But that attempt fell by the wayside and we now use it for pytorch exclusively, so we should just compare with eager :-).
Hey @bertmaher, I think this PR is ready to be merged unless you have any more concerns/issues.
@bertmaher has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@bertmaher merged this pull request in 26a91a9.
…rt (#44101)

Summary: Modified files in `benchmarks/tensorexpr` to add support for NVIDIA's fuser in the JIT compiler. Besides an option to select the NVIDIA fuser, this support includes:

* FP16 datatype support
* Fixed SOL/Algo calculations to generally use the data type's size instead of being fixed to 4 bytes
* IR printing and kernel printing knobs
* A knob `input_iter` to create ranges of inputs, currently only for reductions
* Further reduction support for inner- and outer-dimension reductions that are compatible with the `input_iter` knob
* Added `simple_element`, `reduce2d_inner`, and `reduce2d_outer` to isolate performance on elementwise and reduction operations in the most minimal fashion

Pull Request resolved: #44101
Reviewed By: ngimel
Differential Revision: D23713658
Pulled By: bertmaher
fbshipit-source-id: d6b83cfab559aefe107c23b3c0f2df9923b3adc1
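The SOL/Algo fix described in the summary replaces a hard-coded 4-byte element size with the dtype's actual size when estimating memory traffic. A rough sketch of the idea, where the function name and the simple read-once/write-once traffic model are illustrative assumptions, not the benchmark's real code:

```python
import torch

def elementwise_bytes(numel, dtype):
    # A simple elementwise op reads each input element once and writes each
    # output element once, so total traffic is 2 * numel * element_size.
    # Using element_size() instead of a fixed 4 bytes makes the bandwidth
    # estimate correct for FP16 (2 bytes) as well as FP32 (4 bytes).
    element_size = torch.tensor(0, dtype=dtype).element_size()
    return 2 * numel * element_size
```

Dividing such a byte count by measured kernel time then gives an achieved-bandwidth figure to compare against the device's speed-of-light number.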
Summary: This PR modifies `benchmarks/tensorexpr`. It follows up #44101 and further supports characterizing fusers with dynamic-shape benchmarks. The dynamic-shape condition models the use case where the input tensor shape changes on each call to the graph. Changes include:

* Added an auxiliary class `DynamicShape` that provides a simple API for enabling dynamic shapes in existing test cases; an example can be found in `DynamicSimpleElementBench`.
* Created new bench_cls: `DynamicSimpleElementBench`, `DynamicReduce2DInnerBench`, `DynamicReduce2DOuterBench`, and `DynamicLSTM`. They are all dynamic-shaped versions of existing benchmarks and examples of enabling dynamic shapes with `DynamicShape`.

Pull Request resolved: #46107
Reviewed By: glaringlee
Differential Revision: D24229400
Pulled By: bertmaher
fbshipit-source-id: 889fece5ea87d0f6f6374d31dbe11b1cd1380683
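The actual `DynamicShape` helper lives in `benchmarks/tensorexpr` and its API may differ; the idea behind it can be sketched roughly like this, where the class shape, method names, and parameters are all assumptions for illustration:

```python
import random

class DynamicShape:
    """Hypothetical sketch: draws a fresh input shape for every call to the
    benchmarked graph, modeling workloads whose tensor sizes change between
    iterations (so a fuser cannot rely on one fixed compiled shape)."""

    def __init__(self, max_dim, seed=0):
        self.max_dim = max_dim
        self._rand = random.Random(seed)  # seeded for reproducible runs

    def rand_shape(self, ndims):
        # A new shape on each call; a fuser must handle the changing sizes
        # without recompiling from scratch every iteration to score well.
        return tuple(self._rand.randint(1, self.max_dim) for _ in range(ndims))

shapes = DynamicShape(max_dim=1024)
shape = shapes.rand_shape(2)
```

A benchmark mixing this in would build its input tensors from `rand_shape` inside the timed loop rather than once up front.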