fix: initial app status and zero-based id #18622
base: main
Conversation
We are discarding all "working" updates from the screen watcher that do not include a new user message, but it makes sense to accept the very first update, because the agent could be working but have neglected to report that fact.
(force-pushed from 62b94b7 to 7454e49)
Two things:
1. Message IDs can be zero-based.
2. We were not actually updating the last user message ID.

I refactored a little to keep track of the last message in lastReport.
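To make those two points concrete, here is a minimal sketch (hypothetical names, not the PR's actual code) of how both bugs interact: tracking the last seen user message as a plain int defaults to 0, so a zero-based first message ID looks already-seen, and forgetting to update that value after reporting makes the check meaningless. Keeping the whole last report behind a pointer sidesteps both.

```go
package main

import "fmt"

type report struct {
	AgentMessageID int
	Status         string
}

type reporter struct {
	lastReport *report // nil until the first report, so a message ID of 0 still counts as new
}

func (r *reporter) isNewUserMessageID(id int) bool {
	return r.lastReport == nil || id > r.lastReport.AgentMessageID
}

func (r *reporter) report(id int, status string) {
	// The step that was missing: remember what we just reported, otherwise
	// every later check compares against stale (or zero-valued) state.
	r.lastReport = &report{AgentMessageID: id, Status: status}
}

func main() {
	r := &reporter{}
	fmt.Println(r.isNewUserMessageID(0)) // true: a zero-based first ID is treated as new
	r.report(0, "working")
	fmt.Println(r.isNewUserMessageID(0)) // false: we already reported on message 0
}
```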
(force-pushed from 7454e49 to 9cef0df)
On second glance, there seems to be a problem when an AI agent reports a completed status but continues working on the task without providing a new working status. Our web UI would render an idle status until the next user message, even though it's apparent the agent is still working. I think we should treat the agent as completely unreliable and only trust it for summaries. Whenever we get an MCP update, we should report a "working" status. We'd consider the agent idle only once AgentAPI reports that the screen is stable.
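A rough sketch of that policy, with hypothetical names (this is an illustration of the suggestion, not the repository's actual handlers): MCP updates are trusted only for their summaries and always mapped to "working", and only the screen-stable signal from AgentAPI produces "idle".

```go
package main

import "fmt"

type status struct {
	State   string // "working" or "idle"
	Summary string
}

var current status

func queueStatus(s status) {
	current = s
	fmt.Printf("reporting %s: %q\n", s.State, s.Summary)
}

// onMCPUpdate fires on every MCP status update from the agent. The agent's
// own "complete"/"working" claim is ignored; only the summary is trusted.
func onMCPUpdate(summary string) {
	queueStatus(status{State: "working", Summary: summary})
}

// onScreenStable fires when AgentAPI reports the terminal screen has stopped
// changing; that is the only signal treated as "idle".
func onScreenStable() {
	queueStatus(status{State: "idle", Summary: current.Summary})
}

func main() {
	onMCPUpdate("wrote the failing test")
	onMCPUpdate("fixed the bug") // even if the agent claimed "complete", still report working
	onScreenStable()             // only now does the status become idle
}
```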
Oooo yeah that makes sense. I will make that change.
Simple change, but re-requested review just in case I did something brain-dead 😄
Ah wait, need to update tests 🤦
(force-pushed from 24bd249 to d1a09cb)
lol with this change the PR now fixes coder/internal#713 (in the sense that it no longer triggers the race), but I still plan to fix the race and add a new test that can reproduce it.

Before, I was relying on the agent updating to "failure", using it as a sentinel value to make sure that the previous update was skipped, but that is no longer possible (and it was unreliable anyway due to the race). I think I might have to mock the queue itself to test this properly, but I am open to ideas if you have any. Another idea: if AgentAPI could report a failure state, I could use that as a sentinel value in the test. Or if it could also send a summary, I could use that. Any kind of third thing I could send to prove the update was skipped.
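One possible shape for the mock-the-queue idea, sketched with made-up names (fakeReporter, recordingQueue are not real types in the repository): substitute a recording queue for the real one and assert on what was, and was not, enqueued, instead of relying on a sentinel status from the agent.

```go
package reporter

import "testing"

// recordingQueue stands in for the real status queue so the test can assert
// on exactly what was enqueued.
type recordingQueue struct {
	pushed []string
}

func (q *recordingQueue) push(status string) { q.pushed = append(q.pushed, status) }

// fakeReporter is a stripped-down stand-in: it accepts the first watcher
// "working" update and skips later ones that bring no new user message.
type fakeReporter struct {
	queue    *recordingQueue
	reported bool
}

func (r *fakeReporter) handleWatcherUpdate(status string) {
	if r.reported {
		return // skipped: no new user message since the last report
	}
	r.reported = true
	r.queue.push(status)
}

func TestWatcherWorkingUpdateSkipped(t *testing.T) {
	q := &recordingQueue{}
	r := &fakeReporter{queue: q}

	r.handleWatcherUpdate("working") // first update: accepted
	r.handleWatcherUpdate("working") // should be skipped

	if got := len(q.pushed); got != 1 {
		t.Fatalf("expected exactly one queued update, got %d", got)
	}
}
```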
We are discarding all "working" updates from the screen watcher because we cannot tell the difference between the agent and the user changing the screen, but it makes sense to accept the very first update, because the agent could be working but have neglected to report that fact; otherwise you would never get an initial "working" update (the status would just eventually go straight to "idle").
Also, message IDs can start at zero, so I made a fix for that as well. I think the first message will be from the LLM and we ignore those anyway, so this probably has no practical effect, but it seems more technically correct.
And it seems I forgot to actually update the last message ID. I think this also does not actually matter for user messages (since I think the SSE endpoint will not re-emit a user message it has already emitted), but it seems more technically correct to check.
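Condensed into one guard, the watcher-side rule described above might look like the following (a sketch under my own naming, not the PR's code): keep a "working" update from the screen watcher only if it is the very first report or it arrives with a user message ID we have not reported on yet, with IDs tracked via pointers so a zero-based first ID still counts as new.

```go
package agentstatus

// acceptWatcherUpdate reports whether a "working" update from the screen
// watcher should be kept: yes if nothing has been reported yet, or if it
// comes with a user message ID newer than the last one reported on.
func acceptWatcherUpdate(hasReported bool, lastUserMessageID, newUserMessageID *int) bool {
	if !hasReported {
		return true // very first update: accept even without a user message
	}
	if newUserMessageID == nil {
		return false // no new user message: the screen change may have been the user typing
	}
	return lastUserMessageID == nil || *newUserMessageID > *lastUserMessageID
}
```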
I still need to fix coder/internal#713, but I will do that in a separate PR. Going to do the one-second wait thing we talked about for updates from the watcher.
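For what "the one-second wait thing" could look like, here is a speculative sketch (the actual fix may differ; debouncer and its methods are invented names): hold a watcher status for one second with time.AfterFunc and drop it if a more authoritative MCP update arrives first, so the race no longer matters.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// debouncer holds back a watcher status for one second so that a racing,
// more authoritative MCP update can replace it.
type debouncer struct {
	mu      sync.Mutex
	pending *time.Timer
}

func (d *debouncer) queueWatcherUpdate(status string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.pending != nil {
		d.pending.Stop()
	}
	d.pending = time.AfterFunc(time.Second, func() {
		fmt.Println("reporting watcher status:", status)
	})
}

func (d *debouncer) cancelForMCPUpdate() {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.pending != nil {
		d.pending.Stop()
		d.pending = nil
	}
}

func main() {
	d := &debouncer{}
	d.queueWatcherUpdate("working")
	d.cancelForMCPUpdate() // the MCP update won the race; the watcher update is dropped
	time.Sleep(2 * time.Second)
}
```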