Skip to content

Allow disabling lookup of missing prop names #4100

Open
@bmaranville

Description

@bmaranville

When running a large update on a plot, sometimes the time is dominated by the lookup for missing names, even when this lookup serves no purpose, such as in plotly.subplots.make_subplots where new axes are being added by the plotly library itself, and finding "nearby" names is not helpful when all that is needed is a check if the name already exists.

In plotly.basedatatypes.BaseDatatype._perform_update, for each key to be updated it calls _check_path_in_prop_tree, but then ignores the error that is generated if the plotly_obj is a BaseLayoutType and the key matches one of the _subplot_re_match keywords, and just adds the key to the plotly_obj instead.

This call to _check_path_in_prop_tree is expensive when

  • there are a large number of defined keys in the layout (such as when there are a large number of subplots being initialized) and
  • the path is not yet in the prop tree

When those conditions are met, the prop lookup check_path_in_prop_tree calls BasePlotlyType.__getitem__, which then calls BasePlotlyType._raise_on_invalid_property_error, which then calls _plotly_utils.utils.find_closest_string for each new property being added. find_closest_string uses a levenshtein lookup and this takes geometrically more time the more keys there are to compare, which is the root of the problem.

If the lookup could be disabled in _raise_on_invalid_property_error, that would help, but there is a problem in that _check_path_in_prop_tree is using an implicit __getitem__ on the plotly_obj, here

The issue can be resolved by disabling the lookup completely, as shown in the second example notebook below, where _plotly_utils.utils is monkey-patched so that find_closest_string always raises an error, but this doesn't seem like the best way to handle it.

Example one: without disabling lookup

import plotly.subplots
import cProfile
import pstats
from pstats import SortKey

cProfile.run('fig = plotly.subplots.make_subplots(rows=20, cols=20)', 'restats')
p = pstats.Stats('restats')
p.sort_stats(SortKey.CUMULATIVE).print_stats(20)
Mon Mar 13 15:24:56 2023    restats

         55179619 function calls (54968062 primitive calls) in 18.368 seconds

   Ordered by: cumulative time
   List reduced from 868 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    150/1    0.000    0.000   18.368   18.368 {built-in method builtins.exec}
        1    0.000    0.000   18.368   18.368 <string>:1(<module>)
        1    0.000    0.000   18.368   18.368 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/subplots.py:7(make_subplots)
        1    0.002    0.002   18.368   18.368 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/_subplots.py:45(make_subplots)
        1    0.000    0.000   18.043   18.043 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/graph_objs/_figure.py:736(update_layout)
        1    0.000    0.000   18.043   18.043 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:1378(update_layout)
        1    0.000    0.000   18.043   18.043 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5096(update)
    802/2    0.010    0.000   17.536    8.768 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:3841(_perform_update)
19302/14494    0.065    0.000   17.219    0.001 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:4659(__getitem__)
4893/4891    0.018    0.000   17.209    0.004 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:159(_check_path_in_prop_tree)
14498/14494    0.015    0.000   17.096    0.001 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5828(__getitem__)
      798    0.005    0.000   16.890    0.021 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5047(_ret)
      798    0.001    0.000   16.879    0.021 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/_plotly_utils/utils.py:449(find_closest_string)
     1600    0.176    0.000   16.878    0.011 {built-in method builtins.sorted}
   390621    0.099    0.000   16.702    0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/_plotly_utils/utils.py:450(_key)
470776/390621   11.539    0.000   16.603    0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/_plotly_utils/utils.py:430(levenshtein)
 24961225    3.540    0.000    3.540    0.000 {built-in method builtins.min}
 24971843    1.419    0.000    1.419    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.507    0.507 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/contextlib.py:139(__exit__)
        2    0.000    0.000    0.507    0.253 {built-in method builtins.next}


<pstats.Stats at 0x7faae67616f0>

Example two: disabling lookup

## Patch the library to disable levenshtein lookup for missing strings.
def disable_find_closest_string(string, strings):
    raise ValueError()

import _plotly_utils.utils
_plotly_utils.utils.find_closest_string = disable_find_closest_string
import plotly.subplots
import cProfile
import pstats
from pstats import SortKey

cProfile.run('fig = plotly.subplots.make_subplots(rows=20, cols=20)', 'restats')
p = pstats.Stats('restats')
p.sort_stats(SortKey.CUMULATIVE).print_stats(20)
Mon Mar 13 15:25:05 2023    restats

         2674042 function calls (2542640 primitive calls) in 1.498 seconds

   Ordered by: cumulative time
   List reduced from 866 to 20 due to restriction <20>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    150/1    0.000    0.000    1.498    1.498 {built-in method builtins.exec}
        1    0.000    0.000    1.498    1.498 <string>:1(<module>)
        1    0.000    0.000    1.498    1.498 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/subplots.py:7(make_subplots)
        1    0.002    0.002    1.497    1.497 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/_subplots.py:45(make_subplots)
        1    0.000    0.000    1.170    1.170 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/graph_objs/_figure.py:736(update_layout)
        1    0.000    0.000    1.170    1.170 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:1378(update_layout)
        1    0.000    0.000    1.170    1.170 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5096(update)
    802/2    0.009    0.000    0.626    0.313 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:3841(_perform_update)
        1    0.000    0.000    0.544    0.544 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/contextlib.py:139(__exit__)
        2    0.000    0.000    0.544    0.272 {built-in method builtins.next}
        2    0.000    0.000    0.544    0.272 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:2995(batch_update)
        1    0.000    0.000    0.544    0.544 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:2860(plotly_update)
     9697    0.072    0.000    0.443    0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:51(_str_to_dict_path_full)
        1    0.000    0.000    0.417    0.417 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:2934(_perform_plotly_update)
        1    0.004    0.004    0.417    0.417 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:2611(_perform_plotly_relayout)
     1600    0.003    0.000    0.371    0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5842(__setitem__)
     1596    0.004    0.000    0.355    0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5726(_set_subplotid_prop)
     1598    0.010    0.000    0.353    0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:5237(_set_compound_prop)
   103626    0.042    0.000    0.347    0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:1811(_str_to_dict_path)
19302/14494    0.066    0.000    0.343    0.000 /home/bbm/miniforge3/envs/refl1d_py310/lib/python3.10/site-packages/plotly/basedatatypes.py:4659(__getitem__)


<pstats.Stats at 0x7ff048631750>

Metadata

Metadata

Assignees

No one assigned

    Labels

    P3backlogfeaturesomething new

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions