Skip to content

gh-134873: Avoid quadratic complexities in idlelib #134874

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 23 commits into
base: main
Choose a base branch
from

Conversation

johnzhou721
Copy link
Contributor

@johnzhou721 johnzhou721 commented May 29, 2025

A DOS by Quadratic complexity issue is fixed in idlelib. Part of (but does not fix) #134873.

@terryjreedy terryjreedy moved this to In Progress in IDLE Issues May 29, 2025
@terryjreedy terryjreedy self-assigned this May 29, 2025
@terryjreedy
Copy link
Member

I believe that the 6 lines from 1205 to 1210 can be replaced by 2 lines -- an re.match and an f-string. I will submit an alternate proposal later. I believe that the input vevent name should have either no <>s or 2 of each, with maybe the latter for back compatibility (I will test). But I will may stick with the more general code to not break buggy extensions.

Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming this is the fix that we go with, let's add a test case.

@ZeroIntensity ZeroIntensity added type-security A security issue needs backport to 3.9 only security fixes needs backport to 3.10 only security fixes needs backport to 3.11 only security fixes needs backport to 3.12 only security fixes needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes labels May 29, 2025
@johnzhou721
Copy link
Contributor Author

johnzhou721 commented May 29, 2025 via email

@johnzhou721
Copy link
Contributor Author

johnzhou721 commented May 29, 2025 via email

@kexinoh
Copy link

kexinoh commented May 30, 2025

@johnzhou721
I would greatly appreciate it if you could kindly address the issue located at

cpython/Lib/idlelib/editor.py

Lines 1373 to 1378 in 5ab66a8

while True:
chars = chars[:-1]
ncharsdeleted = ncharsdeleted + 1
have = len(chars.expandtabs(tabwidth))
if have <= want or chars[-1] not in " \t":
break
. I sincerely apologize for overlooking this in my previous message.

As an example, I successfully utilized Gemini 2.5 Pro to generate a reasonable fix for this problem. Could you give it a try?

@johnzhou721
Copy link
Contributor Author

@kexinoh Yes, I would give it a try once I have time; however, I am working on something else right now -- is it acceptable if I delay this by about a day or so?

(if anyone else has a fix ready before I get to this, feel free to make a pr onto the branch of my pr and I'll merge it into my PR)

@johnzhou721
Copy link
Contributor Author

@kexinoh I have a small amount of time not enough to work on anything else before I end my day so I attempted the issue you pointed out -- but can't test though.

@johnzhou721
Copy link
Contributor Author

Assuming this is the fix that we go with, let's add a test case.

Where? How? For what? Thanks! @ZeroIntensity

@ZeroIntensity
Copy link
Member

Where? How? For what?

We need a test case in test_idlelib that results in DOS/extreme slowness off main. Basically, just do something to prove that this PR fixes it (probably just testing with large amounts of data).

@picnixz picnixz removed their request for review June 2, 2025 09:47
@picnixz
Copy link
Member

picnixz commented Jun 2, 2025

(I'm removing the request until I have time)

@zware
Copy link
Member

zware commented Jun 2, 2025

Sidenote: I have a very hard time believing that there is any reasonable DOS in this code at all. Can someone please explain how it can be exploited?

@johnzhou721
Copy link
Contributor Author

@zware Maybe have a huge indentwidth and carefully configured string of whitespaces so that want gets very small?

@serhiy-storchaka
Copy link
Member

Did you test it with real file in IDLE, not just by calling delete_trail_char_and_space()? I afraid that you will get hard time just displaying a long line containing 10000 tabulations and 10000 spaces.

@johnzhou721
Copy link
Contributor Author

@serhiy will do. Yes the dos is very non exploitable and not sure if idle is used in production anyways but since it’s reported let sfix

@zware
Copy link
Member

zware commented Jun 4, 2025

I disagree. If it is non-exploitable, it's not a DOS. Fixing it only encourages further reports of invulnerable vulnerabilities and a waste of volunteer time.

That said, there does appear to be some opportunity for cleanup here. My objection is to classifying the change as "fixing a DOS" rather than to using cleaner code, at the maintainer (@terryjreedy and/or @serhiy-storchaka)'s discretion.

@ZeroIntensity ZeroIntensity removed type-security A security issue needs backport to 3.9 only security fixes needs backport to 3.10 only security fixes needs backport to 3.11 only security fixes needs backport to 3.12 only security fixes labels Jun 4, 2025
@johnzhou721
Copy link
Contributor Author

@zware Good point! If so, we'll remove the fix for the have and want thing and only focus on the .lstrip and .rstrip part for <<<<...>>>>> since that makes the code cleaner. The have and want trimming space thing only reduces the time by a factor of the indent width at best which wouldn't be very high

@johnzhou721
Copy link
Contributor Author

That said, there does appear to be some opportunity for cleanup here. My objection is to classifying the change as "fixing a DOS" rather than to using cleaner code, at the maintainer (@terryjreedy and/or @serhiy-storchaka)'s discretion.

So turning the second quadratic into linear... not sure if it produces cleaner code though. Scanning backwards might be better instead of scanning forwards, because it saves some time... that said it's still a speedup by anywhere from 1-8x in normal usage.

@zware
Copy link
Member

zware commented Jun 15, 2025

Please don't keep merging main like that, it consumes CI resources unnecessarily. A merge is only really necessary to resolve conflicts or as a final check before merge, which is not the stage we're at here yet.

@johnzhou721
Copy link
Contributor Author

Please don't keep merging main like that, it consumes CI resources unnecessarily. A merge is only really necessary to resolve conflicts or as a final check before merge, which is not the stage we're at here yet.

Noted.

@zware zware changed the title gh-134873: Fix a DOS issue in idlelib gh-134873: Avoid quadratic complexity in idlelib Jun 15, 2025
@zware zware removed their request for review June 15, 2025 22:53
@johnzhou721 johnzhou721 changed the title gh-134873: Avoid quadratic complexity in idlelib gh-134873: Avoid quadratic complexities in idlelib Jun 16, 2025
@johnzhou721
Copy link
Contributor Author

@picnixz Have you gotten time yet?

@picnixz
Copy link
Member

picnixz commented Jun 22, 2025

I don't use a lot idlelib so I don't know if there is a way to inject very long strings into it to trigger the bug. The rstrip/lstrip improvement can be isolated in a separate PR though, but for the bug in itself, we can just come up with a comment saying that it's unrealistic to trigger it.

However, I either want a proof that we can't trigger the DOS (or it's very hard to do so) or a PoC demonstrating that we can trigger it (not just some "we could do ...")

@johnzhou721
Copy link
Contributor Author

johnzhou721 commented Jun 22, 2025

@picnixz In fact, it might be possible to prove that this is untriggerable -- does IDLE allow you to load any kind of settings file that allows specifying huge indent widths? If not -- this is probably unrealistic to do it -- because the want values I tested in this PR are extremely unrealistic unless you have huge indentwidths, and some of them are flat-out impossible edge cases I made up just to make sure the new implementation is equivalent to the old one.

It's trivial to prove that the old loop is of complexity (EDIT: of AT MOST) O(self.indentwidth*length) and the new one is O(length)... I suspect I might be able to just reduce this to O(self.indentwidth) if I work backwards in delete_trail_char_and_space and be careful but chars.expandtabs is already >= O(n) so this is a moot point.

In other words: This is exploitable only if (but NOT if) you can set huge indent widths, AND the line width has to be reasonably large in order to have any possibility to do this.

However, I still can't figure out how to get to this function at all -- I don't really understand the surrounding code well.

I will be happy if I understand this code enough, to write a "virus" that changes config-main.cfg and immediately open a file with a bunch of spaces and tabs and press backspace. However we need to show that it's non-exploitable if we don't press backspace (EDIT -- I mean that it's going to be more exploitable / freeze longer using this than simply, say, loading a file with a bunch of spaces.).

I really apologize for the weird discussion.

@johnzhou721
Copy link
Contributor Author

That said: someone could put an even huger indentwidth into the config file, and when the user presses tab, freezes the IDLE (will this happen?) If this will happen, my suggestion is to ensure that when you load the cfg file that you validate the values are below 10 or something, which would make this pretty much not more exploitable over just having a huge length

@picnixz Are you okay with this?

@johnzhou721
Copy link
Contributor Author

@kexinoh Since you reported this bug, I would appreciate if you could read this patch and see if my analysis above is correct. Thank you for making CPython a safer product.

@kexinoh
Copy link

kexinoh commented Jun 22, 2025

Whether there is a concrete issue depends on CPython's threat model. Regarding the specific problem in the idle library, I agree that its harm is not as severe as vulnerabilities in email or path parsing, since it is difficult for remote users to directly control or exploit it. However, considering that a DoS vulnerability, as a time-complexity issue, may not cause visibly noticeable downtime, fixing it could still improve CPython's runtime performance. Such a fix would be entirely harmless (as it wouldn’t alter any existing functionality) and beneficial to CPython. In that case, there seems to be no reason not to accept it.

I believe this fix for the idle flaw could be treated as part of an improvement rather than as a security patch.

@johnzhou721
Copy link
Contributor Author

But

  • It makes the code a bit less cleaner
  • IDLE is not used in production, so it's sort of isolated and does nothing to overall performance
  • It's only a 4x speedup at most (indentwidth) with usual settings -- and that's excluding all the other UI handling going on.

I'll leave it to @terryjreedy to decide on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting review needs backport to 3.13 bugs and security fixes needs backport to 3.14 bugs and security fixes
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

7 participants