-
-
Notifications
You must be signed in to change notification settings - Fork 728
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Capture line number for code frames #7786
Capture line number for code frames #7786
Conversation
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 19 files + 15 19 suites +15 9h 58m 43s ⏱️ + 8h 45m 44s For more details on these failures, see this check. Results for commit b913b59. ± Comparison against base commit ff6327f. ♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @milesgranger
@ntabris IIRC, you had mentioned you wanted to check out how this would affect the platform? I'm reasonably sure, as it is now, it shouldn't affect much of anything. All callers immediately drop the use of the line number and only use the src code lines as before so I think it shouldn't have any side effects. I was thinking this would be strictly implementing capturing the line numbers, then in a later PR actually making use of them in. |
Okay, I didn't realize that. Doing this in multiple steps makes it unclear to me when review of design should happen. Would code from this PR be in scope when reviewing the PR which makes use of this code? (What's the reason for doing this in multiple PR? It seems like a single unit from design and UX perspective.) |
Because the original issue seemed to be, in my interpretation, a feature for the platform indicating the line number; and I'm not sure where that would be. If I'm mistaken, I'd be happy to do this in one; just need a bit of guidance on where the indicator would go, in which caller. |
@milesgranger i'm a bit clueless to the internals of dask but here is where where we are capturing code frames and sending them to the platform We run this as a scheduler plugin on clusters. We just need to know when a PR goes in that would break that code (or when it might need an update) |
Thanks @shughes-uk, seems like we'd be in the first situation, this PR would break that code. As Alternatively, I suppose we could update the sending to scheduler to have another |
e7912ab
to
ec7a971
Compare
7b6a7dd
to
0d6dac1
Compare
0d6dac1
to
8bc2357
Compare
Okay @shughes-uk, I've updated it to send the line number, so as it stands now, this patch would break the code you've linked to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooh would this be useful for adding line numbers to performance reports?
I am wondering whether we should capture the line number within the file - line 7174 in def test_computation_object_code_dask_compute(client):
... From what I understand, in this PR we capture the former, which might not be very useful without capturing the source file (name). It might also not suffice if one wanted to add indicators to the calling line. @shughes-uk: What were you thinking about when creating #7779? Similarly, @jacobtomlinson: What exactly would you be looking for to add line numbers to performance reports? |
@@ -2992,7 +2994,7 @@ def run( | |||
@staticmethod | |||
def _get_computation_code( | |||
stacklevel: int | None = None, nframes: int = 1 | |||
) -> tuple[str, ...]: | |||
) -> tuple[SourceCode, ...]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming we want to use this data, whatever type we return here is going to be what gets passed to update_graph
distributed/distributed/scheduler.py
Line 4269 in fbab38a
code: tuple[str] | None = None, |
and used as Computation.code
, which then needs to be read/parsed/used by the analytics plugin for this to be useful on the Coiled platform side.
Given that it's a pain to coordinate schema changes and that tuples don't give a lot of flexibility, what do you think of switching here to a dict
? We're making a breaking change (from str
to something else) anyway, so why not make it to something more flexible so that in the future we can avoid breaking changes. From my perspective this would make things easier to maintain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it not viable to import the SourceCode
obj on the platform side? AFAIK, a schema change in the dict would be just as inconvenient as a change in a namedtuple.
Aside from really not liking to pass around generic dicts, it would also require a change in Computation.code
from being a SotrtedSet
to something else since dict
isn't hashable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I guess the plugin could import SourceCode
to see what's available. Things are more complicated because plugin has to be fairly agnostic about dask/distributed version but this should work (with a fallback to current way of interpreting code
when SourceCode
isn't importable).
@hendrikmakait taking this performance report as an example there is a code snippet at the bottom. It would be nice if it included the file name and line numbers for where that snippet came from. I was curious if this change gets us part of the way to implementing that. |
Thanks for the review @hendrikmakait I've updated to make line number reference the current source code/frame 75cfdcb, give a holler if this isn't what you had in mind. |
To summarize an offline discussion with @ntabris and @milesgranger: We care about knowing what line number triggered a computation/submission within a frame. It would also be nice to know the file number the frame's code belongs to and its absolute position within the file (as per @jacobtomlinson's comment). Whatever changes we make in this PR should allow us to add those attributes. If it's easy to do it would be nice if this PR also added those attributes. We do not care about deduplicating equal code frames with different line numbers (but we do care about deduplicating equal code frames with equal line numbers which might be caused by a for-loop). We assume that the remaining duplicates will not cause issues on the scheduler in terms of size and could easily be deduplicated by the consumer. Given that we feel a need to generally iterate on computations, we do not want to sink resources into building a nice tree-like deduplication. We should run an A/B test before merging this PR to avoid an unexpected performance impact. cc @fjetter in case you see any red flags with this approach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's fire off an A/B test to see if this has a performance impact!
@@ -130,6 +130,13 @@ | |||
TOPIC_PREFIX_FORWARDED_LOG_RECORD = "forwarded-log-record" | |||
|
|||
|
|||
class SourceCode(NamedTuple): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Maybe Frame
might be a better name? If so, I'd also suggest renaming lineno_frame
to firstlineno
.
❤️ for the typed NamedTuple
!
@@ -3035,8 +3044,15 @@ def _get_computation_code( | |||
or fr.f_code.co_name in ("<listcomp>", "<dictcomp>") | |||
): | |||
continue | |||
|
|||
kwargs = dict( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This kwargs
dict feels awkward, but I don't see an obvious way of restructuring the code to remove the need for it. Do you? Please ignore this if you don't see an obvious change jumping out at you, this is not blocking this PR.
Haven't looked at the code but what you are describing sounds good. |
As assumed, there's no visible effect in the A/B test: https://github.com/coiled/benchmarks/actions/runs/5081854642 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made the relevant change in Coiled (and manually confirmed that Coiled will continue to work with existing distributed and with change made in the PR), so LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @milesgranger!
Closes #7779
pre-commit run --all-files