speedup for / while / if with better bytecode #46711
Comments
This is a preliminary patch to speed up for and while loops (it will also
For this, two new opcodes are introduced:
Some micro-benchmarks:
./python -m timeit "for x in xrange(10000): pass"
./python -m timeit "x=100" "while x: x -= 1"
./python Tools/pybench/pybench.py -t ForLoops
Also, pystone gets 5% faster (from 43300 to 45800).
Now for the less shiny things:
Is there some interest in this patch? If yes, I'd like to have your |
This new patch includes surgery to the compiler package (especially (test_dis will need to be updated for the new opcodes, not a big deal) |
Removed latest patch, it was half-baked. |
This new patch should be ok. The block ordering algorithm in Attaching loops3.py. |
loops4.patch adds a mechanism to avoid blocking signal catching in empty
./python -m timeit "for x in xrange(10000): pass"
./python -m timeit "x=100" "while x: x -= 1"
./python Tools/pybench/pybench.py -t ForLoops |
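For readers who want to repeat measurements like these on a current interpreter, here is a minimal sketch that drives the standard `timeit` module from Python code rather than from the command line. It assumes Python 3, so `range` stands in for `xrange`; the `BENCHMARKS` table and the `run_benchmarks` helper are illustrative names, not part of any patch in this thread, and no timings from the thread are reproduced.

```python
import timeit

# Statements modelled on the micro-benchmarks quoted in this thread,
# adapted to Python 3 (assumption: range replaces xrange).
# Each entry is (statement, setup).
BENCHMARKS = {
    "for loop": ("for x in range(10000): pass", "pass"),
    "while loop": ("x = 100\nwhile x: x -= 1", "pass"),
    "list comprehension": ("[x for x in l if x]", "l = range(100)"),
}


def run_benchmarks(number=1000, repeat=5):
    """Return {name: best seconds per execution} for each benchmark."""
    results = {}
    for name, (stmt, setup) in BENCHMARKS.items():
        timer = timeit.Timer(stmt, setup=setup)
        # Keep the minimum of several repeats to reduce scheduling noise.
        best = min(timer.repeat(repeat=repeat, number=number))
        results[name] = best / number
    return results


if __name__ == "__main__":
    for name, seconds in run_benchmarks().items():
        print(f"{name}: {seconds * 1e6:.2f} usec per loop")
```

Taking the minimum over repeats mirrors what the `timeit` command line reports as "best of N", which makes before/after comparisons of a patched and unpatched build less sensitive to background load.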
By the way, the compiler package fix has been isolated and cleaned up as |
This new patch also updates the code generation for list comprehensions.
./python -m timeit -s "l = range(100)" "[x for x in l]"
./python -m timeit -s "l = range(100)" "[x for x in l if x]"
./python -m timeit -s "l = range(100)" "[x for x in l if not x]"
Please note that this patch is orthogonal to Neal's patch in bpo-2183, so |
This new patch completes the bytecode modifications. For/while loops as
Some micro-benchmarks (completing the ones already given above):
./python Tools/pybench/pybench.py -t IfThenElse
./python -m timeit -s "y=range(100)" "sum(x for x in y)"
./python -m timeit -s "y=range(100)" "sum(x for x in y if x)"
./python -m timeit -s "y=range(100)" "sum(x for x in y if not x)"
./python -m timeit -s "x,y,z=1,2,3" "x if y else z"
A couple of tests seem to be failing in obscure ways in the test suite, |
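To see which jump instructions a given interpreter emits for the constructs discussed here, the standard `dis` module can be used. The sketch below is only an inspection aid: the opcode names it prints depend on the Python version it runs on and are not the opcodes proposed in this patch. The `SNIPPETS` table and the `jump_opnames` helper are hypothetical names introduced for illustration.

```python
import dis

# Small source fragments covering the constructs named in the issue title.
SNIPPETS = {
    "for": "for x in y:\n    pass\n",
    "while": "while x:\n    x -= 1\n",
    "if": "if x:\n    y = 1\nelse:\n    y = 2\n",
    "conditional expression": "z = x if y else w\n",
}


def jump_opnames(source):
    """Return the names of all jump instructions compiled from `source`."""
    code = compile(source, "<snippet>", "exec")
    # dis.hasjrel / dis.hasjabs list the opcodes with relative / absolute
    # jump targets on the running interpreter.
    jump_opcodes = set(dis.hasjrel) | set(dis.hasjabs)
    return [
        ins.opname
        for ins in dis.get_instructions(code)
        if ins.opcode in jump_opcodes
    ]


if __name__ == "__main__":
    for name, source in SNIPPETS.items():
        print(f"{name}: {jump_opnames(source)}")
```

Comparing this output between two builds is one way to check that a compiler change affects the loops and conditionals it is meant to affect.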
Ok, the fix for the bizarre failures was really simple. Now the only |
Antoine, I hope to look at this patch eventually. Unfortunately, there |
Can you see if this simpler patch also gives speed-ups? |
Armin, your patch gives a speed-up for "for" loops and comprehensions,
./python -m timeit "for x in xrange(10000): pass"
./python -m timeit "x=100" "while x: x -= 1"
./python -m timeit -s "l = range(100)" "[x for x in l]"
./python -m timeit -s "l = range(100)" "[x for x in l if x]"
./python -m timeit -s "l = range(100)" "[x for x in l if not x]"
./python Tools/pybench/pybench.py -t IfThenElse |
Finally I had to slightly change the lnotab format to have the right Still, there is a small change in tracing behaviour (see test_trace.py): All in all, the whole test suite now passes fine. The performance |
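For context, the lnotab is the table that maps bytecode offsets to source line numbers, which tracing uses to decide when a new line starts. As a rough illustration only, the sketch below prints that mapping for a small loop with `dis.findlinestarts`. It reflects whatever the running interpreter produces, not the format changed by this patch; in current CPython `co_lnotab` has been superseded by `co_lines()` (Python 3.10). `SOURCE` and `line_table` are hypothetical names.

```python
import dis

# A tiny while loop; each statement sits on its own source line.
SOURCE = "x = 3\nwhile x:\n    x -= 1\n"


def line_table(source):
    """Return (bytecode offset, line number) pairs where a line starts."""
    code = compile(source, "<snippet>", "exec")
    # Recent Python versions may report None for instructions that do
    # not map to a source line, so those entries are skipped.
    return [
        (offset, line)
        for offset, line in dis.findlinestarts(code)
        if line is not None
    ]


if __name__ == "__main__":
    for offset, line in line_table(SOURCE):
        print(f"offset {offset:3d} -> line {line}")
```

Reordering basic blocks, as a loop-layout change does, can change which offsets start a line, which is why such a change can alter the line events a tracer sees.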
A pointer to previous (minor) research: http://groups.google.com/group/comp.lang.python/browse_frm/thread/72505e3cb6d9cb1a/e486759f06ec4ee5 (especially the discussion after Terry Reedy's post) |
Reminder, make sure we can still break out of a "while 1: pass". |
Yes, the patch takes care of that. |
The patches don't apply cleanly anymore, I'll regenerate a new one. |
Here is an updated patch against trunk.
|
I would like to see this go forward. It looks promising. |
I don't see the changes to the lnotab format being a roadblock; just I'm seeing encouraging speed-ups out of this (with gcc 4.3.1 x86_64, Spitfire templates (render a 1000x1000 table 100 times): None of the apps I've benchmarked are negatively impacted. I only have Review comments:
|
Hello Collin, Thanks for taking a look.
Well, I have good news: the fixes to the pure Python compiler have been
Not a tremendous speedup but not totally insignificant either.
Before committing I want to know what to do with the new jump opcodes, |
On Fri, Feb 13, 2009 at 10:37 AM, Antoine Pitrou wrote:
Yeah, I saw that. Fantastic.
Well, Spitfire and Django represent very different ways of
That sounds good, especially since Jeffrey and I have already reviewed bpo-4715. |
On Fri, Feb 13, 2009 at 3:23 PM, Collin Winter wrote:
If you don't have the bandwidth to integrate 4715 into this patch, I Collin |
Collin, that would be very nice of you. You could also apply Jeffrey's Thanks! Antoine. |
I've updated for_iter.patch to the latest trunk, merging in bpo-4715. Review at http://codereview.appspot.com/20103 if you like.
Performance:
32-bit gcc-4.3 Intel Core2: Django; Pickle (cPickle); PyBench; SlowPickle (pickle); Spitfire; SlowUnpickle (pickle); Unpickle (cPickle)
64-bit gcc-4.3 Intel Core2: Django; Pickle; PyBench; SlowPickle; Spitfire; SlowUnpickle; Unpickle |
Thanks a lot! By the way, why do you bench cPickle? Does your test call Python code Overall, the results look positive although not overwhelming. |
No particular reason for cPickle. It sometimes shows when we've caused |
Hold off on reviewing this. There's one bug around the peepholer not |
Is this still relevant / will it get some love in the future? |
Is this enhancement still relevant? |
As a lot of work has gone into this, it saddens me to see it languishing. Surely, if Python performance is to be improved, the bytecode for conditionals and loops is one of the places, if not the place, to do it? Are there any names missing from the nosy list that ought to be there? |
This experiment was abandoned over a decade ago and things have moved on since then. Unless someone objects I will close the issue. |
Definitely outdated :-) |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.