Improve EC methods (focusing on the compressible Euler equations) #643
That blog post is very informative - thanks for sharing! Two minor typos in the blog post: "Divisions are more expansive on modern hardhare than multiplications."
Thanks 👍
Codecov Report
@@ Coverage Diff @@
## main #643 +/- ##
==========================================
- Coverage 93.69% 93.63% -0.06%
==========================================
Files 171 171
Lines 16600 16595 -5
==========================================
- Hits 15553 15539 -14
- Misses 1047 1056 +9
Great job in doing all these performance optimizations, including the two blog posts! I have some questions and remarks, some of which might be challenging ;-) In this case, we'll have to see how far we get here or if we should discuss it in person...
This version makes it considerably more difficult to work on Trixi due to a bug in Revise (timholy/Revise.jl#634). Hence, I strongly prefer the more verbose option to annotate each file until that bug is fixed. Is that okay for you, @sloede?
@sloede: Can we also put this on the agenda to be able to finish this PR next week?
No need. I propose to proceed with your original version but to open an issue that tracks the related Revise issue. I really like the
New benchmark from Rocinante
## Job Properties
* Time of benchmarks:
- Target: 28 Jun 2021 - 13:39
- Baseline: 28 Jun 2021 - 13:59
* Package commits:
- Target: cae2e2
- Baseline: a8d5f9
* Julia commits:
- Target: 6aaede
- Baseline: 6aaede
* Julia command flags:
- Target: `-Cnative,-J/mnt/hd1/opt/julia/1.6.1/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1`
- Baseline: `-Cnative,-J/mnt/hd1/opt/julia/1.6.1/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1`
* Environment variables:
- Target: None
- Baseline: None
| ID | time ratio | memory ratio |
|---|---|---|
| ["benchmark/elixir_2d_euler_vortex_structured.jl", "p3_rhs!"] | 0.93 (5%) ✅ | 1.05 (1%) ❌ |
| ["benchmark/elixir_2d_euler_vortex_structured.jl", "p7_rhs!"] | 0.89 (5%) ✅ | 1.04 (1%) ❌ |
| ["benchmark/elixir_2d_euler_vortex_tree.jl", "p3_rhs!"] | 0.93 (5%) ✅ | 1.05 (1%) ❌ |
| ["benchmark/elixir_2d_euler_vortex_tree.jl", "p7_rhs!"] | 0.91 (5%) ✅ | 1.04 (1%) ❌ |
| ["benchmark/elixir_2d_euler_vortex_unstructured.jl", "p3_rhs!"] | 0.94 (5%) ✅ | 1.04 (1%) ❌ |
| ["benchmark/elixir_2d_euler_vortex_unstructured.jl", "p7_rhs!"] | 0.90 (5%) ✅ | 1.03 (1%) ❌ |
| ["structured_2d_dgsem/elixir_euler_ec.jl", "p3_rhs!"] | 0.53 (5%) ✅ | 1.05 (1%) ❌ |
| ["structured_2d_dgsem/elixir_euler_ec.jl", "p7_rhs!"] | 0.52 (5%) ✅ | 1.04 (1%) ❌ |
| ["structured_2d_dgsem/elixir_euler_source_terms_nonperiodic.jl", "p3_rhs!"] | 0.95 (5%) | 1.05 (1%) ❌ |
| ["structured_2d_dgsem/elixir_euler_source_terms_nonperiodic.jl", "p7_rhs!"] | 0.92 (5%) ✅ | 1.04 (1%) ❌ |
| ["structured_2d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"] | 0.90 (5%) ✅ | 1.00 (1%) |
| ["structured_2d_dgsem/elixir_mhd_ec.jl", "p7_rhs!"] | 0.94 (5%) ✅ | 1.00 (1%) |
| ["structured_3d_dgsem/elixir_euler_ec.jl", "p3_rhs!"] | 0.51 (5%) ✅ | 1.05 (1%) ❌ |
| ["structured_3d_dgsem/elixir_euler_ec.jl", "p7_rhs!"] | 0.53 (5%) ✅ | 1.04 (1%) ❌ |
| ["structured_3d_dgsem/elixir_euler_source_terms_nonperiodic.jl", "p3_rhs!"] | 0.96 (5%) | 1.05 (1%) ❌ |
| ["structured_3d_dgsem/elixir_euler_source_terms_nonperiodic.jl", "p7_rhs!"] | 0.93 (5%) ✅ | 1.04 (1%) ❌ |
| ["structured_3d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"] | 0.91 (5%) ✅ | 1.00 (1%) |
| ["structured_3d_dgsem/elixir_mhd_ec.jl", "p7_rhs!"] | 0.89 (5%) ✅ | 1.00 (1%) |
| ["tree_2d_dgsem/elixir_advection_amr_nonperiodic.jl", "p3_analysis"] | 0.95 (5%) ✅ | 1.00 (1%) |
| ["tree_2d_dgsem/elixir_euler_ec.jl", "p3_rhs!"] | 0.61 (5%) ✅ | 1.05 (1%) ❌ |
| ["tree_2d_dgsem/elixir_euler_ec.jl", "p7_rhs!"] | 0.58 (5%) ✅ | 1.04 (1%) ❌ |
| ["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p3_rhs!"] | 0.92 (5%) ✅ | 1.05 (1%) ❌ |
| ["tree_2d_dgsem/elixir_euler_vortex_mortar.jl", "p7_rhs!"] | 0.90 (5%) ✅ | 1.04 (1%) ❌ |
| ["tree_2d_dgsem/elixir_euler_vortex_mortar_shockcapturing.jl", "p3_rhs!"] | 0.86 (5%) ✅ | 1.03 (1%) ❌ |
| ["tree_2d_dgsem/elixir_euler_vortex_mortar_shockcapturing.jl", "p7_rhs!"] | 0.80 (5%) ✅ | 1.03 (1%) ❌ |
| ["tree_2d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"] | 0.78 (5%) ✅ | 1.00 (1%) |
| ["tree_2d_dgsem/elixir_mhd_ec.jl", "p7_rhs!"] | 0.74 (5%) ✅ | 1.00 (1%) |
| ["tree_3d_dgsem/elixir_euler_ec.jl", "p3_analysis"] | 1.01 (5%) | 1.03 (1%) ❌ |
| ["tree_3d_dgsem/elixir_euler_ec.jl", "p3_rhs!"] | 0.66 (5%) ✅ | 1.04 (1%) ❌ |
| ["tree_3d_dgsem/elixir_euler_ec.jl", "p7_analysis"] | 1.00 (5%) | 1.02 (1%) ❌ |
| ["tree_3d_dgsem/elixir_euler_ec.jl", "p7_rhs!"] | 0.64 (5%) ✅ | 1.03 (1%) ❌ |
| ["tree_3d_dgsem/elixir_euler_mortar.jl", "p3_analysis"] | 0.98 (5%) | 1.03 (1%) ❌ |
| ["tree_3d_dgsem/elixir_euler_mortar.jl", "p3_rhs!"] | 0.95 (5%) | 1.04 (1%) ❌ |
| ["tree_3d_dgsem/elixir_euler_mortar.jl", "p7_analysis"] | 0.98 (5%) | 1.02 (1%) ❌ |
| ["tree_3d_dgsem/elixir_euler_mortar.jl", "p7_rhs!"] | 0.97 (5%) | 1.03 (1%) ❌ |
| ["tree_3d_dgsem/elixir_euler_shockcapturing.jl", "p3_analysis"] | 1.00 (5%) | 1.02 (1%) ❌ |
| ["tree_3d_dgsem/elixir_euler_shockcapturing.jl", "p3_rhs!"] | 0.67 (5%) ✅ | 1.03 (1%) ❌ |
| ["tree_3d_dgsem/elixir_euler_shockcapturing.jl", "p7_analysis"] | 0.99 (5%) | 1.02 (1%) ❌ |
| ["tree_3d_dgsem/elixir_euler_shockcapturing.jl", "p7_rhs!"] | 0.64 (5%) ✅ | 1.03 (1%) ❌ |
| ["tree_3d_dgsem/elixir_mhd_ec.jl", "p3_rhs!"] | 0.80 (5%) ✅ | 1.00 (1%) |
| ["tree_3d_dgsem/elixir_mhd_ec.jl", "p7_rhs!"] | 0.75 (5%) ✅ | 1.00 (1%) |
| ["unstructured_2d_dgsem/elixir_euler_wall_bc.jl", "p3_analysis"] | 0.93 (5%) ✅ | 1.00 (1%) |
| ["unstructured_2d_dgsem/elixir_euler_wall_bc.jl", "p3_rhs!"] | 0.95 (5%) ✅ | 1.04 (1%) ❌ |
| ["unstructured_2d_dgsem/elixir_euler_wall_bc.jl", "p7_rhs!"] | 0.90 (5%) ✅ | 1.03 (1%) ❌ |
Benchmark Group List
Here's a list of all the benchmark groups executed by this job:
["benchmark/elixir_2d_euler_vortex_structured.jl"]
["benchmark/elixir_2d_euler_vortex_tree.jl"]
["benchmark/elixir_2d_euler_vortex_unstructured.jl"]
["latency"]
["p4est_2d_dgsem/elixir_advection_extended.jl"]
["p4est_3d_dgsem/elixir_advection_basic.jl"]
["structured_2d_dgsem/elixir_advection_extended.jl"]
["structured_2d_dgsem/elixir_advection_nonperiodic.jl"]
["structured_2d_dgsem/elixir_euler_ec.jl"]
["structured_2d_dgsem/elixir_euler_source_terms_nonperiodic.jl"]
["structured_2d_dgsem/elixir_mhd_ec.jl"]
["structured_3d_dgsem/elixir_advection_nonperiodic.jl"]
["structured_3d_dgsem/elixir_euler_ec.jl"]
["structured_3d_dgsem/elixir_euler_source_terms_nonperiodic.jl"]
["structured_3d_dgsem/elixir_mhd_ec.jl"]
["tree_2d_dgsem/elixir_advection_amr_nonperiodic.jl"]
["tree_2d_dgsem/elixir_advection_extended.jl"]
["tree_2d_dgsem/elixir_euler_ec.jl"]
["tree_2d_dgsem/elixir_euler_vortex_mortar.jl"]
["tree_2d_dgsem/elixir_euler_vortex_mortar_shockcapturing.jl"]
["tree_2d_dgsem/elixir_mhd_ec.jl"]
["tree_3d_dgsem/elixir_advection_extended.jl"]
["tree_3d_dgsem/elixir_euler_ec.jl"]
["tree_3d_dgsem/elixir_euler_mortar.jl"]
["tree_3d_dgsem/elixir_euler_shockcapturing.jl"]
["tree_3d_dgsem/elixir_mhd_ec.jl"]
["unstructured_2d_dgsem/elixir_euler_wall_bc.jl"]
Julia versioninfo
Target
Julia Version 1.6.1
Commit 6aaedecc44 (2021-04-23 05:59 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
Ubuntu 20.04.2 LTS
uname: Linux 5.4.0-70-generic #78-Ubuntu SMP Fri Mar 19 13:29:52 UTC 2021 x86_64 x86_64
CPU: AMD Ryzen Threadripper 3990X 64-Core Processor:
speed user nice sys idle irq
#1-128 4030 MHz 143325542 s 9068 s 641775 s 8402251998 s 0 s
Memory: 251.6334342956543 GB (1520.875 MB free)
Uptime: 6.67736e6 sec
Load Avg: 2.0 2.05 2.16
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-11.0.1 (ORCJIT, znver2)
Baseline
Julia Version 1.6.1
Commit 6aaedecc44 (2021-04-23 05:59 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
Ubuntu 20.04.2 LTS
uname: Linux 5.4.0-70-generic #78-Ubuntu SMP Fri Mar 19 13:29:52 UTC 2021 x86_64 x86_64
CPU: AMD Ryzen Threadripper 3990X 64-Core Processor:
speed user nice sys idle irq
#1-128 2176 MHz 143343277 s 9068 s 642286 s 8403796597 s 0 s
Memory: 251.6334342956543 GB (9770.75 MB free)
Uptime: 6.678581e6 sec
Load Avg: 1.0 1.13 1.54
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-11.0.1 (ORCJIT, znver2)
@sloede: This PR should be ready for the final review
It's essentially only two questions: Greek letters vs. consistency and some additional comments. Otherwise it LGTM!
LGTM! Thanks and kudos for the great performance improvements! And thanks for the patience!
Well, thank you for your patience with me and your helpful review! This really improved the quality of this PR 👍
I wrote another blog post explaining the reasons for these changes. tl;dr: Trixi was hit quite a bit by the difference between LLVM and GCC. I could fix that by introducing `@muladd`. With some additional performance optimizations, I could improve the performance of the RHS computations at the initial datum for the following examples:
* `examples/2d/elixir_euler_ec.jl`
* `examples/2d/elixir_euler_ec_curved.jl`
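As background (not taken from this PR): as far as I understand, `@muladd` rewrites expressions of the form `a*b + c` into `muladd(a, b, c)`, which permits the compiler to emit a fused multiply-add (FMA). A fused operation rounds only once, so it can give a slightly different result than the separate multiply and add, which is presumably why a compiler does not always fuse on its own. The following is a minimal Python sketch of that rounding difference. The helper names `mul_add_unfused` and `mul_add_fused` are hypothetical, and the fused result is emulated with exact rational arithmetic instead of a hardware FMA instruction.

```python
from fractions import Fraction


def mul_add_unfused(a, b, c):
    # Two roundings: the product a*b is rounded to a float, then the sum is rounded.
    return a * b + c


def mul_add_fused(a, b, c):
    # One rounding: evaluate a*b + c exactly with rationals, then round once.
    # This emulates the result of an FMA; it is not how hardware computes it.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Inputs chosen so that the exact product is 1 - 2**-60, which rounds to 1.0.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print(mul_add_unfused(a, b, c))  # the tiny term is lost when a*b is rounded
print(mul_add_fused(a, b, c))    # the tiny term survives the single rounding
```

The performance benefit discussed in this PR comes from issuing one instruction instead of two; the sketch only illustrates why fusing is an opt-in transformation.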
Results from Rocinante:
1 Thread: Nice performance improvements overall, in particular for Euler EC; no significant runtime regressions
Job Properties
* Julia command flags:
    - Target: `-Cnative,-J/mnt/hd1/opt/julia/1.6.1/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1`
    - Baseline: `-Cnative,-J/mnt/hd1/opt/julia/1.6.1/lib/julia/sys.so,-g1,--check-bounds=no,--threads=1`
Results
A ratio greater than `1.0` denotes a possible regression (marked with ❌), while a ratio less than `1.0` denotes a possible improvement (marked with ✅). Only significant results - results that indicate possible regressions or improvements - are shown below (thus, an empty table means that all benchmark results remained invariant between builds).
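The legend above can be sketched as a small classification rule. This is a Python illustration and not the code that generated the report. It assumes that the percentage printed next to each ratio (for example 5% for time and 1% for memory) is a symmetric tolerance around `1.0`; the function name `judge_ratio` is hypothetical.

```python
def judge_ratio(ratio, tolerance):
    """Classify a target/baseline ratio given a relative tolerance.

    Assumption: a result counts as significant only if it leaves the
    band [1 - tolerance, 1 + tolerance].
    """
    if ratio > 1.0 + tolerance:
        return "regression"   # would be marked with ❌
    if ratio < 1.0 - tolerance:
        return "improvement"  # would be marked with ✅
    return "invariant"        # no mark


print(judge_ratio(0.53, 0.05))  # time ratio well below the 5% band
print(judge_ratio(1.04, 0.01))  # memory ratio above the 1% band
print(judge_ratio(0.98, 0.05))  # inside the band
```

Since the report prints ratios rounded to two digits, a displayed value that sits right at the edge of the band may appear both with and without a mark.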
["2d", "elixir_2d_euler_vortex_structured.jl", "p3_rhs!"]
["2d", "elixir_2d_euler_vortex_structured.jl", "p7_rhs!"]
["2d", "elixir_2d_euler_vortex_tree.jl", "p3_rhs!"]
["2d", "elixir_2d_euler_vortex_tree.jl", "p7_rhs!"]
["2d", "elixir_2d_euler_vortex_unstructured.jl", "p3_rhs!"]
["2d", "elixir_2d_euler_vortex_unstructured.jl", "p7_rhs!"]
["2d", "elixir_advection_amr_nonperiodic.jl", "p3_analysis"]
["2d", "elixir_euler_ec.jl", "p3_rhs!"]
["2d", "elixir_euler_ec.jl", "p7_rhs!"]
["2d", "elixir_euler_ec_curved.jl", "p3_rhs!"]
["2d", "elixir_euler_ec_curved.jl", "p7_rhs!"]
["2d", "elixir_euler_nonperiodic_curved.jl", "p3_rhs!"]
["2d", "elixir_euler_nonperiodic_curved.jl", "p7_rhs!"]
["2d", "elixir_euler_unstructured_quad_wall_bc.jl", "p3_analysis"]
["2d", "elixir_euler_unstructured_quad_wall_bc.jl", "p3_rhs!"]
["2d", "elixir_euler_unstructured_quad_wall_bc.jl", "p7_rhs!"]
["2d", "elixir_euler_vortex_mortar.jl", "p3_rhs!"]
["2d", "elixir_euler_vortex_mortar.jl", "p7_rhs!"]
["2d", "elixir_euler_vortex_mortar_shockcapturing.jl", "p3_rhs!"]
["2d", "elixir_euler_vortex_mortar_shockcapturing.jl", "p7_rhs!"]
["3d", "elixir_euler_ec.jl", "p3_analysis"]
["3d", "elixir_euler_ec.jl", "p3_rhs!"]
["3d", "elixir_euler_ec.jl", "p7_analysis"]
["3d", "elixir_euler_ec.jl", "p7_rhs!"]
["3d", "elixir_euler_ec_curved.jl", "p3_rhs!"]
["3d", "elixir_euler_ec_curved.jl", "p7_rhs!"]
["3d", "elixir_euler_mortar.jl", "p3_analysis"]
["3d", "elixir_euler_mortar.jl", "p3_rhs!"]
["3d", "elixir_euler_mortar.jl", "p7_analysis"]
["3d", "elixir_euler_mortar.jl", "p7_rhs!"]
["3d", "elixir_euler_nonperiodic_curved.jl", "p3_rhs!"]
["3d", "elixir_euler_nonperiodic_curved.jl", "p7_rhs!"]
["3d", "elixir_euler_shockcapturing.jl", "p3_analysis"]
["3d", "elixir_euler_shockcapturing.jl", "p3_rhs!"]
["3d", "elixir_euler_shockcapturing.jl", "p7_analysis"]
["3d", "elixir_euler_shockcapturing.jl", "p7_rhs!"]
Benchmark Group List
Here's a list of all the benchmark groups executed by this job:
["2d", "elixir_2d_euler_vortex_structured.jl"]
["2d", "elixir_2d_euler_vortex_tree.jl"]
["2d", "elixir_2d_euler_vortex_unstructured.jl"]
["2d", "elixir_advection_amr_nonperiodic.jl"]
["2d", "elixir_advection_extended.jl"]
["2d", "elixir_advection_extended_curved.jl"]
["2d", "elixir_advection_nonperiodic_curved.jl"]
["2d", "elixir_euler_ec.jl"]
["2d", "elixir_euler_ec_curved.jl"]
["2d", "elixir_euler_nonperiodic_curved.jl"]
["2d", "elixir_euler_unstructured_quad_wall_bc.jl"]
["2d", "elixir_euler_vortex_mortar.jl"]
["2d", "elixir_euler_vortex_mortar_shockcapturing.jl"]
["3d", "elixir_advection_extended.jl"]
["3d", "elixir_advection_nonperiodic_curved.jl"]
["3d", "elixir_euler_ec.jl"]
["3d", "elixir_euler_ec_curved.jl"]
["3d", "elixir_euler_mortar.jl"]
["3d", "elixir_euler_nonperiodic_curved.jl"]
["3d", "elixir_euler_shockcapturing.jl"]
["latency"]
Julia versioninfo
Target
Baseline
2 Threads: Nice performance improvements overall, in particular for Euler EC; no significant runtime regressions
Job Properties
* Julia command flags:
    - Target: `-Cnative,-J/mnt/hd1/opt/julia/1.6.1/lib/julia/sys.so,-g1,--check-bounds=no,--threads=2`
    - Baseline: `-Cnative,-J/mnt/hd1/opt/julia/1.6.1/lib/julia/sys.so,-g1,--check-bounds=no,--threads=2`
Results
A ratio greater than `1.0` denotes a possible regression (marked with ❌), while a ratio less than `1.0` denotes a possible improvement (marked with ✅). Only significant results - results that indicate possible regressions or improvements - are shown below (thus, an empty table means that all benchmark results remained invariant between builds).
["2d", "elixir_2d_euler_vortex_structured.jl", "p3_rhs!"]
["2d", "elixir_2d_euler_vortex_tree.jl", "p3_rhs!"]
["2d", "elixir_2d_euler_vortex_tree.jl", "p7_rhs!"]
["2d", "elixir_2d_euler_vortex_unstructured.jl", "p3_rhs!"]
["2d", "elixir_2d_euler_vortex_unstructured.jl", "p7_rhs!"]
["2d", "elixir_advection_amr_nonperiodic.jl", "p3_analysis"]
["2d", "elixir_euler_ec.jl", "p3_rhs!"]
["2d", "elixir_euler_ec.jl", "p7_rhs!"]
["2d", "elixir_euler_ec_curved.jl", "p3_rhs!"]
["2d", "elixir_euler_ec_curved.jl", "p7_rhs!"]
["2d", "elixir_euler_nonperiodic_curved.jl", "p3_rhs!"]
["2d", "elixir_euler_nonperiodic_curved.jl", "p7_rhs!"]
["2d", "elixir_euler_vortex_mortar.jl", "p3_rhs!"]
["2d", "elixir_euler_vortex_mortar.jl", "p7_rhs!"]
["2d", "elixir_euler_vortex_mortar_shockcapturing.jl", "p7_rhs!"]
["3d", "elixir_euler_ec.jl", "p3_analysis"]
["3d", "elixir_euler_ec.jl", "p3_rhs!"]
["3d", "elixir_euler_ec.jl", "p7_analysis"]
["3d", "elixir_euler_ec.jl", "p7_rhs!"]
["3d", "elixir_euler_ec_curved.jl", "p3_rhs!"]
["3d", "elixir_euler_ec_curved.jl", "p7_rhs!"]
["3d", "elixir_euler_mortar.jl", "p3_analysis"]
["3d", "elixir_euler_mortar.jl", "p7_analysis"]
["3d", "elixir_euler_nonperiodic_curved.jl", "p3_rhs!"]
["3d", "elixir_euler_nonperiodic_curved.jl", "p7_rhs!"]
["3d", "elixir_euler_shockcapturing.jl", "p3_analysis"]
["3d", "elixir_euler_shockcapturing.jl", "p3_rhs!"]
["3d", "elixir_euler_shockcapturing.jl", "p7_analysis"]
["3d", "elixir_euler_shockcapturing.jl", "p7_rhs!"]
Benchmark Group List
Here's a list of all the benchmark groups executed by this job:
["2d", "elixir_2d_euler_vortex_structured.jl"]
["2d", "elixir_2d_euler_vortex_tree.jl"]
["2d", "elixir_2d_euler_vortex_unstructured.jl"]
["2d", "elixir_advection_amr_nonperiodic.jl"]
["2d", "elixir_advection_extended.jl"]
["2d", "elixir_advection_extended_curved.jl"]
["2d", "elixir_advection_nonperiodic_curved.jl"]
["2d", "elixir_euler_ec.jl"]
["2d", "elixir_euler_ec_curved.jl"]
["2d", "elixir_euler_nonperiodic_curved.jl"]
["2d", "elixir_euler_unstructured_quad_wall_bc.jl"]
["2d", "elixir_euler_vortex_mortar.jl"]
["2d", "elixir_euler_vortex_mortar_shockcapturing.jl"]
["3d", "elixir_advection_extended.jl"]
["3d", "elixir_advection_nonperiodic_curved.jl"]
["3d", "elixir_euler_ec.jl"]
["3d", "elixir_euler_ec_curved.jl"]
["3d", "elixir_euler_mortar.jl"]
["3d", "elixir_euler_nonperiodic_curved.jl"]
["3d", "elixir_euler_shockcapturing.jl"]
["latency"]
Julia versioninfo
Target
Baseline