Avoid RecursiveFactorization except on simple number types #722

Merged
merged 1 commit into from
Dec 13, 2021

@ChrisRackauckas
Member

ChrisRackauckas commented Dec 13, 2021

using OrdinaryDiffEq, SnoopCompile, ForwardDiff

lorenz = (du,u,p,t) -> begin
        du[1] = 10.0(u[2]-u[1])
        du[2] = u[1]*(28.0-u[3]) - u[2]
        du[3] = u[1]*u[2] - (8/3)*u[3]
end

u0 = [1.0;0.0;0.0]; tspan = (0.0,100.0);
prob = ODEProblem(lorenz,u0,tspan); alg = Rodas5();
tinf = @snoopi_deep ForwardDiff.gradient(u0 -> sum(solve(ODEProblem(lorenz,u0,tspan),alg)), u0)

Before:

InferenceTimingNode: 1.720146/12.692469 on Core.Compiler.Timings.ROOT() with 32 direct children

After:

InferenceTimingNode: 1.243689/4.828318 on Core.Compiler.Timings.ROOT() with 33 direct children
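
To read the two figures: SnoopCompile prints an `InferenceTimingNode` as exclusive/inclusive seconds. As far as its documentation describes, for the `ROOT` node the exclusive part is time spent outside type inference and the inclusive part is the total. Assuming that convention, here is a small sketch of the arithmetic (the variable names are illustrative, the numbers are the ones above):

```julia
# (exclusive, inclusive) seconds copied from the two ROOT nodes above
before = (exclusive = 1.720146, inclusive = 12.692469)
after  = (exclusive = 1.243689, inclusive = 4.828318)

# Assumed convention: inference time on ROOT = inclusive - exclusive
inference_time(t) = t.inclusive - t.exclusive

before_inf = inference_time(before)   # about 10.97 s
after_inf  = inference_time(after)    # about 3.58 s

println("inference before: ", round(before_inf; digits = 2), " s")
println("inference after:  ", round(after_inf; digits = 2), " s")
println("total speedup:    ", round(before.inclusive / after.inclusive; digits = 2), "x")
```

Under that reading, most of the saving is inference time rather than time spent elsewhere.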

@chriselrod
Contributor

chriselrod commented Dec 13, 2021

RecursiveFactorization seems worse at large sizes? @YingboMa

julia> using BenchmarkTools, RecursiveFactorization, ForwardDiff, LinearAlgebra

julia> for i in 12:12:144
           @show i
           A = ForwardDiff.Dual.([randn(i,i) for _ in 1:9]...); B = similar(A);
           println("RecursiveFactorization:")
           display(@benchmark RecursiveFactorization.lu!(copyto!($B,$A)))
           println("Generic:")
           display(@benchmark LinearAlgebra.lu!(copyto!($B,$A)))
           println()
       end
i = 12
RecursiveFactorization:
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.572 μs …   5.131 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.591 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.642 μs ± 113.170 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▅██▆▂            ▅▇▆▅▄▃▂        ▂▂▁▁                        ▂
  █████▇▇▇▅▃▄▄▄▁▁▃█████████▇▆▆▄▆▆██████▇▆▆▃▄▅▃▄▄▄▅▅▃▃▃▄▅▃▃▁▁▃ █
  1.57 μs      Histogram: log(frequency) by time      1.93 μs <

 Memory estimate: 176 bytes, allocs estimate: 1.
Generic:
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.640 μs …   5.133 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.662 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.713 μs ± 107.547 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

    █▅
  ▂███▆▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▄▇█▆▅▄▃▂▂▂▂▂▂▂▂▂▂▂▃▃▂▂▂▂▂▂▂▂▂▁▂▁▂▁▂▂▂▂▂ ▃
  1.64 μs         Histogram: frequency by time        1.96 μs <

 Memory estimate: 176 bytes, allocs estimate: 1.

i = 24
RecursiveFactorization:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  11.616 μs …  46.104 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     11.716 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   11.833 μs ± 803.628 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▅██▆▂▁                           ▂▂▁                        ▂
  ███████▇▅▁▄▅▁▄▄▃▃▃▄▃▁▁▁▁▁▁▁▁▁▁▁▁▁████████▇▇▅▅▆▄▃▃▁▅▄▄▅▄▅▃▄▃▄ █
  11.6 μs       Histogram: log(frequency) by time      13.5 μs <

 Memory estimate: 272 bytes, allocs estimate: 1.
Generic:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  11.889 μs …  40.791 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     11.980 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   12.118 μs ± 953.985 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▆█▇▄                           ▁▁▂▁▁                        ▂
  ██████▆▅▄▁▃▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▇██████▇▇▆▆▅▅▆▅▄▃▄▄▅▄▃▅▃▃▄▁▆▅ █
  11.9 μs       Histogram: log(frequency) by time      13.9 μs <

 Memory estimate: 272 bytes, allocs estimate: 1.

i = 36
RecursiveFactorization:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  37.911 μs … 97.809 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     38.171 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   38.402 μs ±  1.438 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▂  ▁▄▃▁                                                   ▁
  ███▅▁██████▇▇▆▄▅▅▄▃▁▁▁▁▁▁▁▁▁▁▁▄▁▄▄▃▁▁▁▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▇ █
  37.9 μs      Histogram: log(frequency) by time      49.4 μs <

 Memory estimate: 368 bytes, allocs estimate: 1.
Generic:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  38.736 μs … 95.261 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     38.976 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   39.201 μs ±  1.350 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇█▁  ▁▄▂                                                    ▁
  ███▄▃███████▇▅▅▆▃▃▁▅▃▃▁▁▁▁▁▃▁▃▃▄▄▃▃▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅ █
  38.7 μs      Histogram: log(frequency) by time        50 μs <

 Memory estimate: 368 bytes, allocs estimate: 1.

i = 48
RecursiveFactorization:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  85.114 μs … 167.962 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     85.706 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   86.237 μs ±   2.429 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▇▆██▄▂▃▃▄▃▂▂▁                                            ▂▂▂ ▂
  ███████████████▇▆▆▆▅▄▅▁▄▃▄▃▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▃▁▃▁▁▁▁▁▁▁▁▁▃▄████ █
  85.1 μs       Histogram: log(frequency) by time      97.2 μs <

 Memory estimate: 496 bytes, allocs estimate: 1.
Generic:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  86.563 μs … 152.116 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     87.065 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   87.629 μs ±   2.680 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂▅█▄   ▃▄▁  ▁▁                                             ▂ ▁
  ████▅▆████████▇▇▆▆▆▆▄▅▅▄▃▄▁▃▃▁▁▃▁▁▃▁▄▁▃▁▄▁▁▁▃▃▁▁▃▁▃▁▁▁▁▁▄▆██ █
  86.6 μs       Histogram: log(frequency) by time      98.4 μs <

 Memory estimate: 496 bytes, allocs estimate: 1.

i = 60
RecursiveFactorization:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  162.841 μs … 241.807 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     164.033 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   164.581 μs ±   2.854 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁▆▇▁▄█▇▄▂▁▄▄▃▂▂▂▁                                       ▁▁▁▂▂ ▂
  ██████████████████▇▇▇▆▅▄▁▅▄▁▃▃▃▃▁▄▁▁▃▁▁▁▁▁▃▁▁▁▁▃▁▁▁▁▃▁▃▇█████ █
  163 μs        Histogram: log(frequency) by time        176 μs <

 Memory estimate: 576 bytes, allocs estimate: 1.
Generic:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  164.795 μs … 242.450 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     165.997 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   166.703 μs ±   2.622 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁▂▄▄▆█▇▁▁▂▄▄▂▂▃▂                                         ▁▁▂▂ ▂
  █████████████████▇██▆▅▄▄▄▄▃▅▄▁▁▁▄▁▁▁▁▁▁▁▁▁▃▁▁▁▃▁▁▁▁▁▁▃▁▅▇████ █
  165 μs        Histogram: log(frequency) by time        178 μs <

 Memory estimate: 576 bytes, allocs estimate: 1.

i = 72
RecursiveFactorization:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  277.741 μs … 354.358 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     279.339 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   280.107 μs ±   3.347 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

   █▁   ▅█
  ▄██▄▃▃██▆▄▃▄▄▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▁▂▂▁▁▁▁▂▁▁▁▁▁▁▂▂▂▃▂▂▂▃▃▃▂▂▂ ▃
  278 μs           Histogram: frequency by time          292 μs <

 Memory estimate: 672 bytes, allocs estimate: 1.
Generic:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  280.720 μs … 358.809 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     282.281 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   283.444 μs ±   3.266 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

   ▃▃ ▂▇█▆▄▃▄▅▄▄▃▂▁▁▁▁                                 ▁▄▃▁     ▂
  █████████████████████▆▅▅▆▄▃▅▃▃▃▄▄▄▃▃▃▁▁▄▁▁▁▁▁▁▁▁▄▆▇▇▆████▇▇██ █
  281 μs        Histogram: log(frequency) by time        295 μs <

 Memory estimate: 672 bytes, allocs estimate: 1.

i = 84
RecursiveFactorization:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  434.599 μs … 520.429 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     437.904 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   439.082 μs ±   4.263 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▃  ▁▅▅██▂
  ▁▄▆▅███▆███████▅▆▅▅▅▃▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▃▃▂▂▂▂▁▁▁▁ ▃
  435 μs           Histogram: frequency by time          452 μs <

 Memory estimate: 816 bytes, allocs estimate: 1.
Generic:
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  439.360 μs … 596.009 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     442.078 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   444.222 μs ±   5.427 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁▃▃▂▁▆██▆▅▄▆▆▅▄▃▃▂▁                   ▁▄▅▄▂▁▂▂▂▁              ▂
  ███████████████████▇▇▇▇▅▃▅▅▁▄▄▃▁▁▃▆██▇█████████████▇▅▅▆▅▃▁▁▄▃ █
  439 μs        Histogram: log(frequency) by time        460 μs <

 Memory estimate: 816 bytes, allocs estimate: 1.

i = 96
RecursiveFactorization:
BenchmarkTools.Trial: 7609 samples with 1 evaluation.
 Range (min … max):  646.730 μs … 731.978 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     650.949 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   653.262 μs ±   5.974 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

    ▂    ▃█▇▁
  ▃███▄▄▅████▅▆▇▆▄▃▃▃▂▂▂▂▂▂▂▂▃▃▄▄▃▄▄▅▄▄▃▃▃▃▃▂▂▂▂▁▂▂▁▂▁▂▂▂▂▂▂▂▂▂ ▃
  647 μs           Histogram: frequency by time          673 μs <

 Memory estimate: 896 bytes, allocs estimate: 1.
Generic:
BenchmarkTools.Trial: 7558 samples with 1 evaluation.
 Range (min … max):  650.726 μs … 762.345 μs  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     655.038 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   657.642 μs ±   5.787 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

         ▇█▂
  ▂▃▃▃▂▃▆███▅▅▇▆▅▄▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▃▅▅▃▃▃▃▃▃▂▂▂▂▂▂▁▂▁▂▂▂▂▂▂▂▂▂▂▂ ▃
  651 μs           Histogram: frequency by time          678 μs <

 Memory estimate: 896 bytes, allocs estimate: 1.

i = 108
RecursiveFactorization:
BenchmarkTools.Trial: 5368 samples with 1 evaluation.
 Range (min … max):  917.617 μs … 1.291 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     924.505 μs             ┊ GC (median):    0.00%
 Time  (mean ± σ):   927.544 μs ± 9.961 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

     ▁▂    ▃▆█▃
  ▂▅▇███▅▆█████▇▆██▆▃▃▂▂▂▂▂▃▄▄▄▄▅▅▇▆▅▄▃▃▃▂▂▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▂▁ ▃
  918 μs          Histogram: frequency by time         948 μs <

 Memory estimate: 1008 bytes, allocs estimate: 1.
Generic:
BenchmarkTools.Trial: 5336 samples with 1 evaluation.
 Range (min … max):  922.993 μs … 1.024 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     930.210 μs             ┊ GC (median):    0.00%
 Time  (mean ± σ):   933.049 μs ± 7.063 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

          ▂█▆  ▁▁
  ▂▂▂▃▂▂▂▄███▇▅███▅▄▃▃▂▂▂▂▂▂▂▂▂▃▅▆▅▄▃▄▄▄▃▃▂▂▂▂▂▁▂▂▂▂▂▂▃▃▃▂▂▃▂ ▃
  923 μs          Histogram: frequency by time         954 μs <

 Memory estimate: 1008 bytes, allocs estimate: 1.

i = 120
RecursiveFactorization:
BenchmarkTools.Trial: 2305 samples with 1 evaluation.
 Range (min … max):  2.123 ms …  2.443 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.163 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.165 ms ± 18.189 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                 ▃▄▂▄▅▅▇█▇▇▆█▇▅▆▄▄▁▃▃▁
  ▂▁▂▂▃▂▃▃▃▃▄▅▇▆▇█████████████████████▇██▆▆▅▆▅▄▄▃▄▄▃▃▃▃▃▃▂▃▂ ▅
  2.12 ms        Histogram: frequency by time        2.21 ms <

 Memory estimate: 15.78 KiB, allocs estimate: 14.
Generic:
BenchmarkTools.Trial: 3424 samples with 1 evaluation.
 Range (min … max):  1.438 ms …  2.019 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.453 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.456 ms ± 15.187 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

           ▃▂▆▅▅█▅▆▃▄▂▁▁
  ▁▁▁▂▂▄▅▇███████████████▇▇▆▇▆▇▅▄▅▄▃▄▃▄▃▃▃▂▃▃▂▂▂▂▂▂▂▂▂▁▂▁▂▁▁ ▄
  1.44 ms        Histogram: frequency by time        1.49 ms <

 Memory estimate: 1.06 KiB, allocs estimate: 1.

i = 132
RecursiveFactorization:
BenchmarkTools.Trial: 1750 samples with 1 evaluation.
 Range (min … max):  2.813 ms …  3.204 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.850 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.853 ms ± 19.074 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

               ▃▃▃▄▃▆▇█▆▇▆▄▅▅▅▃▄▃▁
  ▂▁▁▂▂▃▃▃▄▅▅▇██████████████████████▆▆▆▅▆▄▄▄▃▄▃▄▃▃▃▃▃▃▂▂▂▂▃▂ ▅
  2.81 ms        Histogram: frequency by time         2.9 ms <

 Memory estimate: 17.03 KiB, allocs estimate: 14.
Generic:
BenchmarkTools.Trial: 2542 samples with 1 evaluation.
 Range (min … max):  1.931 ms …  2.717 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     1.960 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.963 ms ± 27.273 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

             ▁▁▃▃▃▄▆█▇▆▆▄▆▂▂▃▃▁▂
  ▂▁▁▁▃▃▃▃▄▆▆██████████████████████▇▅▆▅▅▅▄▄▄▄▄▄▃▄▃▃▃▂▃▂▃▂▂▂▂ ▅
  1.93 ms        Histogram: frequency by time         2.1 ms <

 Memory estimate: 1.14 KiB, allocs estimate: 1.

i = 144
RecursiveFactorization:
BenchmarkTools.Trial: 1328 samples with 1 evaluation.
 Range (min … max):  3.726 ms …  4.059 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     3.758 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.762 ms ± 20.361 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

           ▁ ▄▂▆▅▄▄▇█▃▅▄▃ ▁
  ▃▂▃▄▃▃▄▇▇█▇████████████████▆▇▇▇▇▅▇▆▄▅▅▄▄▃▃▄▃▃▃▂▄▃▂▃▂▃▂▂▃▁▃ ▄
  3.73 ms        Histogram: frequency by time        3.82 ms <

 Memory estimate: 18.42 KiB, allocs estimate: 14.
Generic:
BenchmarkTools.Trial: 1899 samples with 1 evaluation.
 Range (min … max):  2.595 ms …  2.773 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.627 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.629 ms ± 16.579 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

             ▁ ▁▅▄▃▃▅▄▄▆█▅▅▃▅▆▃▁▁▁  ▁
  ▂▂▂▂▂▃▅▇▅█▇██████████████████████▇█▆▅▅▅▇▄▃▄▄▅▃▄▄▃▃▃▂▂▂▂▂▂▂ ▄
  2.6 ms         Histogram: frequency by time        2.67 ms <

 Memory estimate: 1.22 KiB, allocs estimate: 1.

It's interesting that the distribution is often bimodal. I suspect that is the result of GC?

@chriselrod
Contributor

RecursiveFactorization continues to do worse at larger sizes:

julia> for i in 168:24:504
           @show i
           A = ForwardDiff.Dual.([randn(i,i) for _ in 1:9]...); B = similar(A);
           println("RecursiveFactorization:")
           display(@benchmark RecursiveFactorization.lu!(copyto!($B,$A)))
           println("Generic:")
           display(@benchmark LinearAlgebra.lu!(copyto!($B,$A)))
           println()
       end
i = 168
RecursiveFactorization:
BenchmarkTools.Trial: 866 samples with 1 evaluation.
 Range (min … max):  5.731 ms …  6.186 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     5.767 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   5.772 ms ± 26.164 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

          ▁▁ ▂▃▄▃▆█▆▆▄▃▂  ▂
  ▃▁▃▃▃▅█▇██▇██████████████▇▇▇▆▆▆▅▆▅▅▅▅▄▄▄▂▃▃▃▃▄▃▃▁▃▃▂▃▁▁▂▂▃ ▄
  5.73 ms        Histogram: frequency by time        5.84 ms <

 Memory estimate: 23.80 KiB, allocs estimate: 18.
Generic:
BenchmarkTools.Trial: 1097 samples with 1 evaluation.
 Range (min … max):  4.484 ms …  4.760 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     4.553 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   4.555 ms ± 28.297 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

               ▁   ▂▂▃▂▂▄▇▄▃█▂▃▄▃
  ▂▂▂▁▃▂▃▃▄▄▄▆▅██▅███████████████████▇█▇▆▅▆▄▅▄▅▃▃▃▃▂▃▃▃▂▂▂▂▂ ▄
  4.48 ms        Histogram: frequency by time        4.64 ms <

 Memory estimate: 1.45 KiB, allocs estimate: 1.

i = 192
RecursiveFactorization:
BenchmarkTools.Trial: 541 samples with 1 evaluation.
 Range (min … max):  9.156 ms …  9.727 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     9.241 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   9.251 ms ± 55.196 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

          ▁▁▃▄▄▇▄▄█▆▆█▃▁▄
  ▂▂▂▃▃▃▇▇████████████████▆▇▆▆▇▆▆▃▅▂▆▆▅▄▃▃▂▃▃▃▂▅▂▂▁▂▁▁▁▂▁▂▁▃ ▄
  9.16 ms        Histogram: frequency by time        9.43 ms <

 Memory estimate: 26.91 KiB, allocs estimate: 18.
Generic:
BenchmarkTools.Trial: 709 samples with 1 evaluation.
 Range (min … max):  6.955 ms …  7.168 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     7.049 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   7.050 ms ± 32.179 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                     ▂▁▃ ▂ ▂▆▅▅▄▇▅▅▃█▆▃▅▂▁▅ ▃▂
  ▂▁▁▁▁▃▂▂▄▃▃▇▅▃▅▇▆▅▇███▆█▇████████████████▆██▄█▄▆▅▅▅▄▄▃▄▃▂▄ ▅
  6.96 ms        Histogram: frequency by time        7.13 ms <

 Memory estimate: 1.59 KiB, allocs estimate: 1.

i = 216
RecursiveFactorization:
BenchmarkTools.Trial: 394 samples with 1 evaluation.
 Range (min … max):  12.598 ms … 13.249 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     12.700 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   12.714 ms ± 71.379 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

         ▂▂▂▃█▇▆█▅█▆▃▁▁
  ▂▁▂▂▄▃▇██████████████▅▇▅▅▃▂▆▁▂▃▃▁▁▂▁▂▃▃▃▁▃▃▂▂▁▂▁▁▁▂▂▁▂▂▁▂▁▂ ▃
  12.6 ms         Histogram: frequency by time          13 ms <

 Memory estimate: 33.48 KiB, allocs estimate: 23.
Generic:
BenchmarkTools.Trial: 485 samples with 1 evaluation.
 Range (min … max):  10.209 ms … 10.920 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     10.303 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   10.309 ms ± 47.010 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                  ▂▁▂▁    █▇▅▅▇▃▄ ▂ ▅▁ ▁
  ▃▁▁▁▁▃▃▃▄▄▄▅▅▆▅▇████▆██▆███████▆████▇█▆▇▆▇▄▆▅▆▆▄▄▃▃▄▄▃▃▄▅▁▄ ▄
  10.2 ms         Histogram: frequency by time        10.4 ms <

 Memory estimate: 1.77 KiB, allocs estimate: 1.

i = 240
RecursiveFactorization:
BenchmarkTools.Trial: 282 samples with 1 evaluation.
 Range (min … max):  17.638 ms … 18.148 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     17.735 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   17.751 ms ± 71.965 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▃  ▃▆▃▆▄▁▄▅█▂▂▃
  ▄▃▄▅▃█▅▇████████████████▄▄▇▅▆▅▄▅▄▃▅▄▄▄▆▄▄▃▃▃▄▁▁▁▁▃▁▃▁▁▁▃▁▁▄ ▄
  17.6 ms         Histogram: frequency by time          18 ms <

 Memory estimate: 36.94 KiB, allocs estimate: 23.
Generic:
BenchmarkTools.Trial: 346 samples with 1 evaluation.
 Range (min … max):  14.290 ms … 14.684 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     14.443 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   14.444 ms ± 64.187 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

               ▁       ▂▄▄▁▂▃▃▄▂▅▂▄▁ ██ ▃  ▄▁
  ▄▁▄▃▁▄▃▆▄▆▄▄▄██▆▇▇▇███████████████▆████▆▅██▆▇▆▅▄█▅▇▄▆▃▃▆▅▃▅ ▅
  14.3 ms         Histogram: frequency by time        14.6 ms <

 Memory estimate: 1.98 KiB, allocs estimate: 1.

i = 264
RecursiveFactorization:
BenchmarkTools.Trial: 206 samples with 1 evaluation.
 Range (min … max):  24.069 ms …  25.076 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     24.259 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   24.299 ms ± 153.273 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

      ▂   ▁ ▄▇██▁▁▁  ▂▂     ▁ ▁▁▂
  ▆▄▄▄█▆▆███████████▃██▆█▄▇▆█▃████▄▆▃▁▆▃▄▁▆▇▁▇▃▃▄▁▁▆▁▄▁▁▃▃▁▁▃▃ ▄
  24.1 ms         Histogram: frequency by time         24.7 ms <

 Memory estimate: 41.09 KiB, allocs estimate: 25.
Generic:
BenchmarkTools.Trial: 256 samples with 1 evaluation.
 Range (min … max):  19.268 ms …  19.905 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     19.601 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   19.598 ms ± 113.983 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                       ▂▅  ▃  ▄  ▂▃▅ ▂▃ ▄▅█      ▅ ▄
  ▃▁▁▁▁▁▁▁▁▃▃▃▅▃▆█▄▄▄▅▆██▆██▇▇█▇▆██████▇███▇▇▅▆█▇███▄▅▇▆▁▃▃▄▁▇ ▄
  19.3 ms         Histogram: frequency by time         19.8 ms <

 Memory estimate: 2.19 KiB, allocs estimate: 1.

i = 288
RecursiveFactorization:
BenchmarkTools.Trial: 149 samples with 1 evaluation.
 Range (min … max):  33.142 ms …  34.622 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     33.630 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   33.635 ms ± 215.801 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

             ▂ ▂  ▂     ▂▆ ▂ ▆█▂▆ ▄  ▂ ▄ ▆  ▂
  ▄▁▁▁▁▄▁▁▄▄██▆████▄███▄██▄█████████████▆█▆██▆▆██▆▄██▄▆▁▄▁▁▁▄▆ ▄
  33.1 ms         Histogram: frequency by time         34.1 ms <

 Memory estimate: 48.25 KiB, allocs estimate: 30.
Generic:
BenchmarkTools.Trial: 194 samples with 1 evaluation.
 Range (min … max):  25.469 ms …  26.329 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     25.818 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   25.824 ms ± 172.614 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

           ▃ ▄▁▃▃▁▁▄ ▁▁ █▁▁▁▃▁  ▃▁▄  ▆ ▁▁▁▄▄   ▁▁▄▁▁
  ▆▁▆▁▆▁▆▄▁█▇███████▇██▆██████▇▇███▇▄█▆█████▆▇▄█████▇▇▆▄▁▄▄▁▁▆ ▄
  25.5 ms         Histogram: frequency by time         26.2 ms <

 Memory estimate: 2.38 KiB, allocs estimate: 1.

i = 312
RecursiveFactorization:
BenchmarkTools.Trial: 122 samples with 1 evaluation.
 Range (min … max):  40.617 ms …  41.665 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     41.004 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   41.009 ms ± 200.696 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

           ▆ ▁ ▁ ▃▃▁██ █ ▃▆▆▁▆▃▁▃▃▁          ▁
  ▄▇▇▇▁▄▁▇▁█▇█▇█▇█████▄█▄██████████▇▇▄▇▇▇▁▁▇▄█▄▄▁▁▄▄▁▁▄▁▁▁▁▁▁▄ ▄
  40.6 ms         Histogram: frequency by time         41.6 ms <

 Memory estimate: 51.84 KiB, allocs estimate: 30.
Generic:
BenchmarkTools.Trial: 150 samples with 1 evaluation.
 Range (min … max):  32.912 ms …  34.709 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     33.542 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   33.525 ms ± 283.494 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

               ▂▂▄   ▂▂▄   █ ▄ ▂ ▄▂ ▂▄▄▄▄▄▄▂ ▂  ▄
  ▄▄▆▄▄▄▄▁▄▁▄▄▄███▄▄▆███▆▆▄█▆█▆█▆██████████████▆█▆▄▄▁▆▄▁▄▄▄▁▁▆ ▄
  32.9 ms         Histogram: frequency by time         34.1 ms <

 Memory estimate: 2.56 KiB, allocs estimate: 1.

i = 336
RecursiveFactorization:
BenchmarkTools.Trial: 97 samples with 1 evaluation.
 Range (min … max):  51.195 ms …  54.155 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     51.500 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   51.643 ms ± 504.153 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

     ▄▃█▆
  ▆▆█████▇▇▆██▃▅▁▄▄▁▃▃▁▁▁▁▃▁▁▁▁▃▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▃ ▁
  51.2 ms         Histogram: frequency by time         54.1 ms <

 Memory estimate: 55.41 KiB, allocs estimate: 30.
Generic:
BenchmarkTools.Trial: 118 samples with 1 evaluation.
 Range (min … max):  41.714 ms …  44.782 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     42.414 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   42.440 ms ± 416.749 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

       ▄   ▆█▃▃▁ ▁▄ ▃ ▁▄▆  ▃▄
  ▄▁▁▆▁█▁▆▆█████▄██▄█▇███▇▆██▇▆▆▇▁▄▆▆▆▁▄▄▁▄▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄ ▄
  41.7 ms         Histogram: frequency by time         43.9 ms <

 Memory estimate: 2.75 KiB, allocs estimate: 1.

i = 360
RecursiveFactorization:
BenchmarkTools.Trial: 78 samples with 1 evaluation.
 Range (min … max):  64.140 ms …  69.442 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     64.542 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   64.803 ms ± 853.327 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

     ▅▅▃█ ▃   ▂ ▅
  ▇▅█████▅█▇▇▁█▄█▅▄▁▄▁▁▁▄▁▁▁▁▄▁▁▁▁▁▁▁▁▄▁▁▁▁▁▄▁▁▁▁▁▁▁▁▄▁▁▁▄▁▁▄▄ ▁
  64.1 ms         Histogram: frequency by time         67.3 ms <

 Memory estimate: 63.12 KiB, allocs estimate: 36.
Generic:
BenchmarkTools.Trial: 95 samples with 1 evaluation.
 Range (min … max):  51.808 ms …  56.556 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     53.020 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   53.114 ms ± 899.077 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁    ▄    ▁   █  ▁█▁ ▁▁   ▁
  █▄▄▄▆█▇▄▆▇█▆▄▄█▇▇███▄██▆▄▄█▁▁▄▁▄▄▄▄▄▄▄▄▄▁▁▁▁▄▄▄▁▁▁▁▄▁▁▁▁▁▁▁▄ ▁
  51.8 ms         Histogram: frequency by time           56 ms <

 Memory estimate: 2.94 KiB, allocs estimate: 1.

i = 384
RecursiveFactorization:
BenchmarkTools.Trial: 59 samples with 1 evaluation.
 Range (min … max):  83.964 ms … 92.575 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     84.602 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   85.407 ms ±  2.127 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▂ █ ▇
  █▆█▇█▇▆▇▃▃▅▁▁▁▁▁▁▁▁▁▁▃▁▁▁▃▁▁▃▁▃▁▁▁▃▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▃▁▃▁▁▁▃ ▁
  84 ms           Histogram: frequency by time        92.1 ms <

 Memory estimate: 68.58 KiB, allocs estimate: 39.
Generic:
BenchmarkTools.Trial: 82 samples with 1 evaluation.
 Range (min … max):  60.014 ms …  62.082 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     61.102 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   61.091 ms ± 414.833 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                              ▂▂   █ ▂▅   ▅▅      █   ▂
  ▅▁▁▁▅▁▁▅▅▁▁▅█▅▁▁▁▅▅▅▅▅▅▁▁█▅▅██▅███▅████▅███▁▅██▅█▁▅▁██▅█▁▅█▅ ▁
  60 ms           Histogram: frequency by time         61.8 ms <

 Memory estimate: 3.12 KiB, allocs estimate: 1.

i = 408
RecursiveFactorization:
BenchmarkTools.Trial: 54 samples with 1 evaluation.
 Range (min … max):  93.414 ms …  95.268 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     93.858 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   94.007 ms ± 451.799 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

      ▂  █   █         ▂                 ▂
  ▅▅▅▁██▁██▅██▅▁▅█▁▅▁▅▁█▅▅█▅▅▅▁▁▁▁▁▁▁▁▅▅▅█▁▅▁▅▁█▅▁▅▁▁▁▁▁▅▁▁▁▁▅ ▁
  93.4 ms         Histogram: frequency by time           95 ms <

 Memory estimate: 72.38 KiB, allocs estimate: 39.
Generic:
BenchmarkTools.Trial: 68 samples with 1 evaluation.
 Range (min … max):  72.788 ms …  75.219 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     74.003 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   74.030 ms ± 537.652 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                        ▂     ▅▅  ▂         █  ▂
  █▁▁▁▁▁▅▁▁▅▅▅▅▅▅▁▅▁▅▅█▅█▅▁▅▅▅██▅▅█▅▁▅▅█▅█▅▁█████▅▁▁██▁█▅▁▁▁▁▅ ▁
  72.8 ms         Histogram: frequency by time           75 ms <

 Memory estimate: 3.31 KiB, allocs estimate: 1.

i = 432
RecursiveFactorization:
BenchmarkTools.Trial: 45 samples with 1 evaluation.
 Range (min … max):  112.522 ms … 113.863 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     112.999 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   113.038 ms ± 251.410 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

           ▁   ▁▁▄ ▄  ▁█▄█      ▄   ▁
  ▆▁▁▁▁▁▁▆▁█▆▁▁███▆█▆▁████▆▁▁▆▆▁█▆▆▁█▁▆▁▁▁▆▁▁▁▆▁▁▁▁▁▆▁▁▁▁▁▁▁▁▁▆ ▁
  113 ms           Histogram: frequency by time          114 ms <

 Memory estimate: 77.50 KiB, allocs estimate: 39.
Generic:
BenchmarkTools.Trial: 57 samples with 1 evaluation.
 Range (min … max):  86.836 ms …  89.692 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     88.482 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   88.438 ms ± 592.800 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                              ▅     ▂▅ █ ▂    ▂    ▂
  ▅▁▁▁▁▁▁▁▅▅▁▁▁▅▅▁▁▁▁▅▁█▅▅▅▁█▁█▅█▅▁▁██▅███▁▅▁▁█▅█▁██▁▅▁▁▁▅▅▅▁▅ ▁
  86.8 ms         Histogram: frequency by time         89.5 ms <

 Memory estimate: 3.50 KiB, allocs estimate: 1.

i = 456
RecursiveFactorization:
BenchmarkTools.Trial: 37 samples with 1 evaluation.
 Range (min … max):  135.696 ms … 137.304 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     136.045 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   136.114 ms ± 324.938 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃   ▃ ▃▃█ ▃ ▃█  ▃ ▃ ▃                ▃
  █▇▁▁█▇███▇█▁██▁▁█▇█▇█▇▁▁▁▇▇▁▁▁▇▁▇▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
  136 ms           Histogram: frequency by time          137 ms <

 Memory estimate: 80.72 KiB, allocs estimate: 40.
Generic:
BenchmarkTools.Trial: 48 samples with 1 evaluation.
 Range (min … max):  103.839 ms … 106.750 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     105.004 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   105.037 ms ± 640.019 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▃           █  ▃█  ▃▃▃  █    ▃▃    ▃▃ ▃ █
  █▁▇▁▇▁▇▁▁▁▇▁█▇▁██▁▇███▇▁█▇▇▁▇██▁▇▁▁██▇█▁█▁▁▁▇▇▁▁▇▁▁▁▁▁▁▁▁▁▁▁▇ ▁
  104 ms           Histogram: frequency by time          107 ms <

 Memory estimate: 3.69 KiB, allocs estimate: 1.

i = 480
RecursiveFactorization:
BenchmarkTools.Trial: 32 samples with 1 evaluation.
 Range (min … max):  158.583 ms … 161.430 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     158.980 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   159.118 ms ± 582.721 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▃█ ▃▃  █▃    ▃   ▃
  ███▇██▇▇██▁▇▇▁█▇▁▇█▁▁▁▁▇▁▁▇▁▁▁▁▁▁▁▁▁▁▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
  159 ms           Histogram: frequency by time          161 ms <

 Memory estimate: 91.03 KiB, allocs estimate: 48.
Generic:
BenchmarkTools.Trial: 41 samples with 1 evaluation.
 Range (min … max):  121.945 ms … 124.500 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     123.559 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   123.520 ms ± 508.894 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                             █    ▂          ▅
  ▅▁▁▁▁▁▁▁▁▁▁▅▁▁▁▁▁▁▁▁▁▁▁▅▅▁▁█▅▅▅▁██▁▁█▁███▅▅█▅▁▁▅█▅▁▅▅▁▁▁▁▁▅▅▅ ▁
  122 ms           Histogram: frequency by time          125 ms <

 Memory estimate: 3.88 KiB, allocs estimate: 1.

i = 504
RecursiveFactorization:
BenchmarkTools.Trial: 28 samples with 1 evaluation.
 Range (min … max):  180.187 ms … 182.071 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     180.897 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   180.904 ms ± 426.190 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

  ▁  ▁  ▁█    ▁▁█ ▁ ▁▁ █  █ █ █▁▁▁ ▁  ▁        ▁ ▁            ▁
  █▁▁█▁▁██▁▁▁▁███▁█▁██▁█▁▁█▁█▁████▁█▁▁█▁▁▁▁▁▁▁▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  180 ms           Histogram: frequency by time          182 ms <

 Memory estimate: 96.86 KiB, allocs estimate: 51.
Generic:
BenchmarkTools.Trial: 35 samples with 1 evaluation.
 Range (min … max):  142.253 ms … 145.542 ms  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     143.214 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   143.344 ms ± 711.690 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

      ▃▃  ▃    ▃█  ▃      ▃ ▃     ▃▃
  ▇▁▇▁██▁▁█▁▇▁▇██▇▇█▁▁▁▇▁▇█▇█▁▇▁▇▁██▁▁▁▁▁▁▇▁▁▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▇ ▁
  142 ms           Histogram: frequency by time          146 ms <

 Memory estimate: 4.06 KiB, allocs estimate: 1.
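
To make the trend across both runs easier to see, the sketch below computes the ratio of median times for a handful of sizes. The medians are copied from the output above; the selection of sizes and the variable names are just for illustration.

```julia
# Median times in milliseconds copied from the benchmark output above:
# size => (RecursiveFactorization.lu!, generic LinearAlgebra.lu!)
medians_ms = [
     96 => (0.650949, 0.655038),
    108 => (0.924505, 0.930210),
    120 => (2.163,    1.453),
    144 => (3.758,    2.627),
    240 => (17.735,   14.443),
    504 => (180.897,  143.214),
]

# A ratio above 1 means RecursiveFactorization was slower than the generic method.
ratios = [n => rf / generic for (n, (rf, generic)) in medians_ms]

for (n, r) in ratios
    println("n = ", n, ": RecursiveFactorization / generic = ", round(r; digits = 2))
end
```

On these numbers the two are within about 1% of each other up to n = 108, and the generic method is faster from n = 120 on, which is also where RecursiveFactorization's allocation estimate goes from 1 to 14.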

@ChrisRackauckas
Member Author

Screenshot 2021-12-13 110317

@ChrisRackauckas
Member Author

Black is ForwardDiff.jl. ForwardDiff.jl is unable to cache compiles because of tags, so if you precompiled for a given tag that cost would be gone; otherwise we can't do anything else about it.
Teal is all mapreduce stacks. I don't know why mapreduce is so bad.
Green is logging. I don't know why logging is so specialized to types; that seems fixable.
Brown can be fixed on the DiffEq side if we aren't lazy.

@ChrisRackauckas
Member Author

The three browns:

@inline @muladd function calculate_residuals(ũ::Number, u₀::Number, u₁::Number,
                                             α, ρ, internalnorm, t)
    @fastmath ũ / (α + max(internalnorm(u₀,t), internalnorm(u₁,t)) * ρ)
end

@fastmath div is not precompiled in the Base system image.
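
For context, `@fastmath` rewrites `/` into a call to `Base.FastMath.div_fast`, which is presumably the "div" meant here. A minimal sketch comparing the scalar formula with and without the macro follows. `residual_plain`, `residual_fast`, and `absnorm` are illustrative names, not DiffEq functions.

```julia
# Show what @fastmath turns a division into (the expansion refers to div_fast).
println(@macroexpand @fastmath a / b)

# Same scalar formula as calculate_residuals, with and without @fastmath.
# These names are hypothetical and exist only for this illustration.
residual_plain(ũ, u₀, u₁, α, ρ, internalnorm, t) =
    ũ / (α + max(internalnorm(u₀, t), internalnorm(u₁, t)) * ρ)

residual_fast(ũ, u₀, u₁, α, ρ, internalnorm, t) =
    @fastmath ũ / (α + max(internalnorm(u₀, t), internalnorm(u₁, t)) * ρ)

absnorm(u, t) = abs(u)   # stand-in for the solver's internalnorm

r_plain = residual_plain(1.0, 2.0, -3.0, 1e-6, 1e-3, absnorm, 0.0)
r_fast  = residual_fast(1.0, 2.0, -3.0, 1e-6, 1e-3, absnorm, 0.0)
println((r_plain, r_fast))
```

The sketch only shows that the two forms agree for ordinary inputs; whether dropping `@fastmath` is acceptable here is a separate question.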
Right next to it:
@inline function mul!(C::AbstractArray, s::Number, X::AbstractArray, alpha::Number, beta::Number)
    if axes(C) == axes(X)
        C .= (s .* X) .*ₛ alpha .+ C .*ₛ beta
    else
        generic_mul!(C, s, X, MulAddMul(alpha, beta))
    end
    return C
end
@inline function mul!(C::AbstractArray, X::AbstractArray, s::Number, alpha::Number, beta::Number)
    if axes(C) == axes(X)
        C .= (X .* s) .*ₛ alpha .+ C .*ₛ beta
    else
        generic_mul!(C, X, s, MulAddMul(alpha, beta))
    end
    return C
end
Those take a substantial amount of time (that's the big stack on the rightmost brown one). If we add a dispatch for Array that does no broadcasting, that would chop out the whole stack.
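
A rough sketch of what such a broadcast-free method could look like for plain `Array`s. `scale_add_loop!` is a made-up name, it uses ordinary `*` rather than LinearAlgebra's `*ₛ`, and it is not a method that exists in DiffEq or LinearAlgebra.

```julia
# Hypothetical helper: C = (s * X) * alpha + C * beta, written as a plain loop
# so that no broadcast machinery has to be compiled.
function scale_add_loop!(C::Array, s::Number, X::Array, alpha::Number, beta::Number)
    axes(C) == axes(X) || throw(DimensionMismatch("C and X must have the same axes"))
    @inbounds for i in eachindex(C, X)
        C[i] = (s * X[i]) * alpha + C[i] * beta
    end
    return C
end

X = [1.0 2.0; 3.0 4.0]
C = ones(2, 2)

# Reference result from the broadcast form, computed before C is overwritten.
expected = (2.0 .* X) .* 0.5 .+ C .* 3.0

scale_add_loop!(C, 2.0, X, 0.5, 3.0)
println(C == expected)
```

Restricting the signature to `Array` keeps the existing `AbstractArray` methods as the fallback for everything else.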
The brown on the left is:
function DiffEqBase.remake(thing::Union{OrdinaryDiffEqAdaptiveImplicitAlgorithm{CS,AD,FDT},
                                        OrdinaryDiffEqImplicitAlgorithm{CS,AD,FDT},
                                        DAEAlgorithm{CS,AD,FDT}}; kwargs...) where {CS, AD, FDT}
    T = SciMLBase.remaker_of(thing)
    T(; chunk_size=Val{CS}(),autodiff=Val{AD}(),SciMLBase.struct_as_namedtuple(thing)...,kwargs...)
end
That doesn't precompile in DiffEq. 🤷 I might be too lazy for that.
