Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dry-run builds are ~15x slower on Windows due to lack of stdout buffering #2084

Closed
randomascii opened this issue Feb 16, 2022 · 4 comments
Closed
Milestone

Comments

@randomascii
Copy link
Contributor

While doing some testing of -n behavior with the latest ninja (it seems to be better, yay) I noticed that it is quite slow. A NOP build of Chrome's base takes about 20 seconds, whereas with earlier versions of ninja it takes about 6 seconds.

The actual slowdown is greater than that because the new ninja counts its startup time in its reporting and the old one doesn't. On longer builds where the startup time doesn't dominate the ratio is much greater. A dry-run build of the chrome target shows the new version taking 447 s while the old version took 30.2, for a 14.8x slowdown.

The problem is the same as for #2018 - disabling of buffering. I guess the fix to that issue was incorrect. I will investigate.

@jhasse jhasse added this to the 1.11.0 milestone Feb 16, 2022
@randomascii
Copy link
Contributor Author

My original analysis of this slowdown was incorrect, and maybe my analysis of #2018 was wrong as well. The culprit is this bit of code:

ec8de9c#diff-4d62e29bd645aeebe251d8cb2f74e37ff5caf565915066bb6d85ff6a93d19788R90

That is, the supports_color_ support is quite slow. Skipping that branch:

< if (supports_color_) { // this means ENABLE_VIRTUAL_TERMINAL_PROCESSING

if (false && supports_color_) { // this means ENABLE_VIRTUAL_TERMINAL_PROCESSING

restores performance for the -n builds of base.

My original bisect may have been wrong - I definitely had some confusion around there. Here is what I think is/has been happening:

  1. buffering of stdout was routinely disabled in ninja since the dawn of time. However this was not a problem because most output went through WriteConsoleOutput
  2. Pull request Use UTF-8 on Windows 10 Version 1903, fix #1195 #1915 switched output to using printf. Since stdout printing was unbuffered this led to performance problems. These problems were usually not noticed but they became visible with -d stats and -n where printing was the majority of the work.

These slowdowns can be fixed on Windows either by re-enabling buffering, or by skipping the supports_color_ path. Since "WriteConsoleOutput doesn't support UTF-8" (from issue #1915) we should stop disabling buffering. The fflush in the supports_color_ path should be sufficient anyway.

My fix (#2031) didn't cause this problem, but it also failed to properly find and fix the culprit, which is why it only fixed some of the symptoms.

TL;DR - deleting the lines that I added to ninja.cc should fix this issue once and for all without harmful consequences.
@jhasse

randomascii added a commit to randomascii/ninja that referenced this issue Feb 17, 2022
Long ago ninja disabled stdout buffering. Since then much has changed
and this appears to no longer be needed, while also being actively
harmful. Disabling of stdout buffering is not needed because printing
is done (in LinePrinter::Print) either with WriteConsoleOutput (which
is unaffected by stdout buffering) or by printf followed by fflush.

Meanwhile, the unbuffered printing in the printf case causes dramatic
slowdowns which are most visible in dry-run builds, as documented in
issue ninja-build#2084.

This fixes issue ninja-build#2084, finishing off the incomplete buffering fix done
in pull request ninja-build#2031.
randomascii added a commit to randomascii/ninja that referenced this issue Feb 17, 2022
Long ago ninja disabled stdout buffering. Since then much has changed
and this appears to no longer be needed, while also being actively
harmful. Disabling of stdout buffering is not needed because printing
is done (in LinePrinter::Print) either with WriteConsoleOutput (which
is unaffected by stdout buffering) or by printf followed by fflush.

Meanwhile, the unbuffered printing in the printf case causes dramatic
slowdowns which are most visible in dry-run builds, as documented in
issue ninja-build#2084.

This fixes issue ninja-build#2084, finishing off the incomplete buffering fix done
in pull request ninja-build#2031.
@randomascii randomascii changed the title dry-run builds are ~15x slower on Windows due to stdout buffering dry-run builds are ~15x slower on Windows due to lack of stdout buffering Feb 17, 2022
@randomascii
Copy link
Contributor Author

#2085 should fix this issue. Let me know if there is any additional testing needed to understand the issue, but I think it is pretty well understood now (better than before at least).

@randomascii
Copy link
Contributor Author

Thinking about this some more:

  1. Unbuffered stdout is too slow. Buffering cannot be disabled. Removing the code that disables buffering is the right thing to do.
  2. Issue 2018 (-t deps being slow) was triggered when stdout was disabled unilaterally, by change e77c561
  3. This issue, 2084, was triggered a couple of changes later. by change ec8de9c. However, ec8de9c (which switched to using printf for regular status output) was only a problem because e77c561 disabled stdout buffering.

Which is to say, it all makes sense now, and removing the code that disables buffering is definitely the right fix. If we find a place where full buffering is problematic then fflush is the correct fix for Windows, not disabling of buffering

@jhasse
Copy link
Collaborator

jhasse commented Feb 17, 2022

Thanks for investigating and the fix #2085 :)

@jhasse jhasse closed this as completed Feb 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants