
Define how much data should be buffered (also: does 'buffered' impose a substantial cost on the whole web here?) #52

Open
dholbert opened this issue Jun 28, 2020 · 17 comments


@dholbert
Contributor

dholbert commented Jun 28, 2020

Right now, the Layout Instability spec only has one mention of buffered, in the usage example.
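
For reference, the usage in question looks roughly like this (paraphrasing the spec's example, not quoting it verbatim):

```js
// Paraphrase of the spec's usage example: observe layout-shift entries,
// including any that were buffered before this observer was registered.
const observer = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log("layout shift of", entry.value, "at", entry.startTime);
  }
});
observer.observe({ type: "layout-shift", buffered: true });
```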

However, it's unclear what this means & what the requirements are for it. The buffered flag is defined (loosely) in https://w3c.github.io/performance-timeline/, which has an important note that needs to be addressed in the Layout Instability spec:

NOTE
The number of buffered events is determined by the specification that defines the metric

Also:

and buffering is intended to be used for first-N events only; buffering is not unbounded or continuous.

(emphasis added)

Accordingly: please clarify how many reports (if any) should be buffered here.

Also: I'm not sure low numbers are usefully "cheaper" here... Independent of the specific number, I'm a little concerned in general about supporting buffered at all for this metric, because I think that implies that the browser needs to perform layout-instability-calculations everywhere -- for all web pages, for every animation frame -- in order to have however many buffered historical values available to report just in case the web page suddenly registers a layout-instability observer with the buffered flag.

Even in the extreme case where buffered only requires you to report 1 or 2 old observations, it would still mean you have to compute layout instability data for every frame for all time, because you don't know when the page is going to suddenly register and ask for your buffered observations. They can't be computed lazily/on-demand, because they're about past relayout behavior.

@npm1
Collaborator

npm1 commented Jun 29, 2020

I think the Note in PerformanceTimeline should instead point to https://w3c.github.io/timing-entrytypes-registry/#registry. You can see there the size is 150. I do understand the concern about computing the data, so we should only do buffering for APIs that do not affect performance. We believe this is one of them, but we would be curious to hear about any performance problems other implementors run into! The importance of buffering is that we want performance monitoring to be registered lazily, but for that we need to be able to provide early page load data when they do register. Otherwise the developer is incentivized to register early, and running this performance JS early on will impact the actual performance of the page.
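
Concretely, buffering is what makes the second of these two patterns workable; both snippets are illustrative sketches (reportShifts is a made-up name), not code from any spec:

```js
// A made-up analytics callback, for illustration.
function reportShifts(list) {
  for (const entry of list.getEntries()) {
    // ship entry.value, entry.startTime, etc. off to analytics
  }
}

// Without buffering, this must run early (e.g. inline in <head>) or it
// misses early-page-load shifts -- and running it early has its own cost:
new PerformanceObserver(reportShifts).observe({ type: "layout-shift" });

// With buffering, the same observer can be registered lazily (e.g. from a
// deferred analytics script) and still receive the early entries:
new PerformanceObserver(reportShifts).observe({ type: "layout-shift", buffered: true });
```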

@dholbert
Contributor Author

I think the Note in PerformanceTimeline should instead point to https://w3c.github.io/timing-entrytypes-registry/#registry. You can see there the size is 150.

Ah, thanks. Yeah, I guess https://w3c.github.io/performance-timeline/ needs a correction on this.

I do understand the concern about computing the data, so we should only do buffering for APIs that do not affect performance. We believe this is one of them, but we would be curious to hear about any performance problems other implementors run into!

I'm not sure what the perf impact would be at this point. I could imagine this adding a small-but-measurable increase to layout durations (e.g. maybe something on the order of a 0.1% to 1% increase for some workloads, though maybe it's substantially less).

Any small-but-measurable duration would end up feeling like a kinda silly price to be paying, for no benefit, for sites that don't use the feature. Especially if we come up with a better way of doing this in a few years, and yet we still have to compute & buffer these values "just in case" even after sites switch to the hypothetical new/better thing.

The importance of buffering is that we want performance monitoring to be registered lazily, but for that we need to be able to provide early page load data when they do register. Otherwise the developer is incentivized to register early, and running this performance JS early on will impact the actual performance of the page.

Right, yeah; I get the need to capture data from early pageload.

If that's really the main use case that we're trying to address with buffered, my instinct is that we should put a time-box on the buffering, and say e.g. "If you haven't registered a layout-shift observer by time $FOO, then we'll stop buffering at that point; you can still register after that point, but buffered won't give you any history." (where $FOO = the document's load event firing, for example, or something "early on" that's likely to be after analytics scripts have been given a chance to start)
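
In rough pseudo-code, the UA-side behavior I'm imagining would be something like this (entirely hypothetical; every name here is made up):

```js
// Hypothetical sketch of the time-box idea; none of this is specced.
let cutoffReached = false;       // set once $FOO (e.g. the load event) fires
let observerRegistered = false;  // set once a layout-shift observer registers
const buffer = [];
const BUFFER_LIMIT = 150;        // the registry's current buffer size

function maybeBufferShift(entry) {
  // Buffer only while a late registrant could still legitimately want
  // history (before the cutoff) or while someone is actually listening;
  // past that, the UA could stop computing layout-shift data entirely.
  if (buffer.length < BUFFER_LIMIT && (!cutoffReached || observerRegistered)) {
    buffer.push(entry);
  }
}
```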

@dholbert
Contributor Author

dholbert commented Jun 29, 2020

(My hypothetical "time-box" musing is sort of a thought about buffered in general; it's not specific to layout-shift, aside from the fact that layout-shift seems like it might require a bit more computation for each of its reports as compared to metrics where we're just recording a timestamp.)

@npm1
Collaborator

npm1 commented Jun 29, 2020

Any small-but-measurable duration would end up feeling like a kinda silly price to be paying, for no benefit, for sites that don't use the feature. Especially if we come up with a better way of doing this in a few years, and yet we still have to compute & buffer these values "just in case" even after sites switch to the hypothetical new/better thing.

In the hypothetical case of finding a magical solution to this, I think we'd be willing to help outline a deprecation and removal path to get rid of buffering in our metrics. In the current state of things though, I think it has more benefits than the problems it may cause.

If that's really the main use case that we're trying to address with buffered, my instinct is that we should put a time-box on the buffering, and say e.g. "If you haven't registered a layout-shift observer by time $FOO, then we'll stop buffering at that point; you can still register after that point, but buffered won't give you any history." (where $FOO = the document's load event firing, for example, or something "early on" that's likely to be after analytics scripts have been given a chance to start)

Yes, this is another alternative. However, it seems to me that this is less preferable than the current approach, as there does not seem to be a one-size-fits-all cutoff time for websites. For instance, we know from @nicjansma that analytics providers may not be ready by onload, which is the most obvious candidate. In addition, buffering by size instead of time also bounds the costs that may be incurred by the API when it's not being used at all, as you do not need to compute anything once the buffer is full and there are no observers.
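
In other words, the check a UA needs per animation frame is roughly this (illustrative sketch; the names are made up):

```js
const MAX_BUFFER_SIZE = 150; // per the timing-entrytypes registry

// Once the first-N buffer is full and nothing is observing, the UA can
// skip the layout-shift computation for the rest of the page's lifetime.
function shouldComputeLayoutShift(bufferedCount, activeObserverCount) {
  return activeObserverCount > 0 || bufferedCount < MAX_BUFFER_SIZE;
}
```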

@dholbert
Contributor Author

you do not need to compute anything once the buffer is full and there are no observers.

Aha, I totally missed this aspect -- I'd misunderstood the buffer as being a "circular buffer" sort of structure, and I'd overlooked the word "first" in "first-N events", in the flag's description.

This addresses my concern about imposing a cost on sites that don't use this feature, I think. Sorry for not noticing that sooner. :)

@npm1
Collaborator

npm1 commented Jun 29, 2020

No worries, let us know if you have wording suggestions to make this clearer (or we're happy to review a PR too).

@dholbert
Contributor Author

dholbert commented Jun 29, 2020

I think the main thing that started me wondering about this was this snippet in the explainer:

By passing buffered: true to observe, the observer is immediately notified of any layout shifts that occurred before it was registered.

Right now, this ^ sentence (particularly "any layout shifts that occurred") sounds like it's saying there's no limit on what is buffered. That's what got me worrying in the first place and prompted me to dig in a bit more and file this issue.

It would be helpful to reword this to clarify that (a) there is some limit, and (b) shifts that occur after that limit is reached won't be buffered.

Maybe we could add a parenthetical at the end of the above-quoted sentence? E.g.

...before it was registered (within reason; if too many shifts have already occurred, buffered:true will only preserve the earliest ones, up to a fixed limit).
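
To make that concrete, a late registrant would then see something like the following (assuming the registry's buffer size of 150; purely illustrative):

```js
// If, say, 200 shifts have already happened by the time this runs, the
// callback receives only the first 150 of them (plus any later ones that
// arrive while the observer stays registered).
const po = new PerformanceObserver((list) => {
  console.log(`got ${list.getEntries().length} layout-shift entries`);
});
po.observe({ type: "layout-shift", buffered: true });
```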

@npm1
Collaborator

npm1 commented Jun 29, 2020

Yea, that makes sense to me. Let's keep this issue open till such clarification is added.

@dholbert
Contributor Author

dholbert commented Aug 7, 2020

you do not need to compute anything once the buffer is full and there are no observers.

This addresses my concern about imposing a cost on sites that don't use this feature, I think. Sorry for not noticing that sooner.

I should walk this ^^ back a bit; I'm still somewhat concerned about the burden that this spec imposes on every pageload (given that the browser now has to do bookkeeping to accumulate 150 observations for every page ever loaded, even when no observer is registered, devtools isn't open, etc.). If there's any perf overhead at all to this bookkeeping, then it will obviously have an impact on pageload time, which is an especially important performance metric.

Hypothetically, if this feature's newly-required bookkeeping were to match up in a straightforward way with the sorts of things that the browser already has to keep track of for paint invalidation (e.g. if we were just making a note of the invalidated/repainted area), then perhaps this would be believably zero-cost.

But right now, the spec doesn't match up with the sorts of things that the browser would normally have to keep track of for paint invalidation (though perhaps Chrome's implementation has blurred it in that direction, based on observations/testcases in e.g. #61 and #65?) So, to conform with the spec, an implementation necessarily has to maintain additional state ("previous painted position & size" for each text node & each box in the box tree) which it otherwise does not have any need to maintain; and it'll have to do geometric comparisons between those states, for the first 150 paints of each site.

It's possible this is negligible, but without having implemented it yet, I'm not sure.

@npm1
Collaborator

npm1 commented Aug 10, 2020

I see, that's a valid concern. I can think of several mitigations to this if it turns out Mozilla cannot implement this in a performant way:

  1. Reduce the buffer size to a number we're more comfortable with
  2. Allow the user agent to choose a buffer size that is lower than the default value.
  3. Allow the user agent to not implement the buffered flag at all. Of course Mozilla could choose to do this anyways, but I do have some hesitation about enabling this in the spec because this would mean that developers would lose data occurring before they have a chance to register their PerformanceObservers.
  4. Change the definitions to better match what browsers would typically compute during paint invalidation, if this is feasible and you have suggestions in this regard. This would basically be an effort to make the API as easy to implement as possible, though I suppose it's hard to know right now what that would look like.

@dholbert
Contributor Author

Thanks. Yeah, I was musing over those same options (reducing/disregarding buffered and/or tracking a somewhat different set of geometric data about repaints that's more readily available).

  4. Change the definitions to better match what browsers would typically compute during paint invalidation, if this is feasible

(For the record, Chrome's implementation seems to already be doing this internally, to some extent, per #65 and related issues. :) )

@mmocny

mmocny commented Sep 22, 2020

Trying to summarize this thread:

  • Calculating certain metrics may impose a non-zero cost (relative to work the browser is already doing).
  • This is fine when the developer actually asks for these metrics, but the act of buffering metrics by default implies this cost burden for every pageload, even when unused.
  • Buffers do already have size limits (and will stop calculating once full, not overwrite), and the intention is for buffering to be enabled only for metrics that can be computed without affecting performance.
  • This still leaves room for disagreement about which metrics are necessarily trivial to compute (or, that the cost is already paid anyway), and which MUST be buffered.

More generally, any metric that is deemed non-trivial and not buffered by default puts the developer in the precarious position of having to register observers very early (the very problem buffering aimed to solve).

@dholbert @npm1 -- Could perhaps a Document Policy be used to opt-in to buffering of default-off metrics?

@npm1
Collaborator

npm1 commented Sep 22, 2020

It could be used that way, though we'd need to investigate whether this is something analytics providers can set up; otherwise we risk losing too much data. I would like to know if we'll need mitigations here, so we should await the Firefox feedback from their implementation.

@dholbert
Contributor Author

dholbert commented Sep 22, 2020

@mmocny:

Trying to summarize this thread: [...]
Could perhaps a Document Policy be used to opt-in to buffering of default-off metrics?

Thank you for that summary - that's a good encapsulation of the points/concerns here.

I'm not familiar with Document Policy itself, but from a glance at the proposal, it seems like a suitable opt-in mechanism for these sorts of metrics, assuming Document Policy is eventually implemented outside of Blink.

@npm1:

we should await the Firefox feedback from their implementation.

You might be waiting a little while for this. Nobody is currently working on a Firefox implementation, and it's not a priority for the immediate future, as far as I know. My github-issues & feedback have mostly been to inform our standards-positions assessment and to be sure we can eventually implement it, when we've got resources to put into an implementation.

But at this point, I would not bet on this falling cleanly into "trivial-to-compute/cost-already-paid" category, for Gecko. Note that your comment in #53 (comment) suggests that it's also not in that trivial-cost bucket for Blink, either, without a kinda-magic threshold being baked into the feature.

It's possible we could make it trivial-cost with a sufficient magic-threshold & perhaps with other shortcuts[1], but I'd much prefer a wholly opt-in mechanism for enabling this, so we don't need to worry as much about shoehorning it into being trivial-cost.

[1] (e.g. if we allow ourselves to ignore certain classes of unpainted content, it's possible we could reuse our existing repaint/invalidation "retained display-list" mechanism, which already tracks old/new positions for pieces of boxes that move around.)

@npm1
Collaborator

npm1 commented Sep 23, 2020

I'm not familiar with Document Policy itself, but from a glance at the proposal, it seems like a suitable opt-in mechanism for these sorts of metrics, assuming Document Policy is eventually implemented outside of Blink.

Some objections I would have to gating this on Document Policy from the start:

  • I am not aware of the positions from other browser vendors, and as you mention this is also an early proposal currently only implemented in Blink.
  • If this will require headers, analytics providers would be unable to gather data for their sites without those sites performing work that they deem non-trivial (and often don't perform). This is already the case, for instance, with some performance features such as Timing-Allow-Origin, although in that case there are good reasons for it. In this case, we only have some potential performance concerns. I'd be open to considering this when there are very concrete reasons for doing it, as it will most likely have some impact on sites' ability to gain insight into their layout shifts. Granted, it would only affect the period before the observer is registered, but that then incentivizes early script, which defeats the purpose of buffering in the first place.

@npm1:

we should await the Firefox feedback from their implementation.

You might be waiting a little while for this. Nobody is currently working on a Firefox implementation, and it's not a priority for the immediate future, as far as I know. My github-issues & feedback have mostly been to inform our standards-positions assessment and to be sure we can eventually implement it, when we've got resources to put into an implementation.

I see, well that is unfortunate :( I appreciate the feedback though, and hope Firefox can implement sometime in the future!

But at this point, I would not bet on this falling cleanly into "trivial-to-compute/cost-already-paid" category, for Gecko. Note that your comment in #53 (comment) suggests that it's also not in that trivial-cost bucket for Blink, either, without a kinda-magic threshold being baked into the feature.

It's possible we could make it trivial-cost with a sufficient magic-threshold & perhaps with other shortcuts[1], but I'd much prefer a wholly opt-in mechanism for enabling this, so we don't need to worry as much about shoehorning it into being trivial-cost.

I understand the concern, but I think I'd rather find ways to make the API performant when Firefox finds blockers, instead of gating it. The reasoning for this is that developers do not expect (and should not, for that matter) that gathering data via a performance API on every page load will have a nontrivial effect on the performance of the site. There is some precedent on what to do for performance-impacting performance APIs: the JS Sampling Profiler API (not yet shipped in any browser). In that case, what we plan to do is enable measuring for only a subset of page loads, since clearly profiling every site all the time is not needed or desired, and the profiling can have some cost. I would be open to exploring this as an alternative solution if we see that CLS is indeed not possible in Firefox without costs.
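
As a sketch of that subsampling idea (hypothetical; nothing like this is currently specified for layout-shift, and the rate is made up):

```js
// A UA could decide once per page load whether to compute & buffer at all,
// trading per-load coverage for lower aggregate cost across the web.
const SAMPLING_RATE = 0.01; // e.g. measure 1% of page loads
if (Math.random() < SAMPLING_RATE) {
  // ...run the usual layout-shift bookkeeping for this load...
}
```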

For additional data, this API is quickly growing in popularity as measured by Chrome counters: https://chromestatus.com/metrics/feature/timeline/popularity/2782. So if it keeps growing in popularity like this, it will make more and more sense to compute the data from the start, especially given that it prevents incentivizing atrocious behavior (registering perf scripts early during the page load).

[1] (e.g. if we allow ourselves to ignore certain classes of unpainted content, it's possible we could reuse our existing repaint/invalidation "retained display-list" mechanism, which already tracks old/new positions for pieces of boxes that move around.)

I think ignoring unpainted content is something desirable for this API, so if that is something that is easier to implement in Firefox, all the better!

@mmocny

mmocny commented Sep 25, 2020

Some objections I would have to gating this on Document Policy from the start:

I used the phrase "Document Policy" but I really meant to imply any document hint, like headers or <meta>, regardless of whether the format conforms to any specific Document Policy spec. I haven't really been following the status there, nor what's common.

If this will require headers, analytics providers would be unable to gather data

The suggestion of using a header is to address cases where a UA has already decided it is impossible to buffer by default. This gives you at least some way to get back what you cannot have otherwise.

I think your point is important: we cannot just "off-by-default" everything as a crutch alternative to just making things available efficiently, whenever possible. That will add complexity for implementers and probably lead to measurement gaps in practice.

===

Of all the suggestions, if indeed some mitigation is needed, I think I like @dholbert's the most:

  • Default option: Process and buffer metrics until buffer is full.
  • @dholbert suggestion: Also stop buffering once it seems the page won't be asking for the data (i.e. after n-seconds [after nav/load], possibly also using prior page loads as hints).
  • @npm1 suggestion: Subsample, by sometimes enabling/disabling processing entirely.
  • @mmocny suggestion: Disable buffering entirely, but allow devs to force re-enable with hint.

I think @npm1's suggestion is viable when processing is so expensive that we cannot always give metrics to developers even when they ask for them, but it's probably too heavy-handed otherwise (unless I misunderstand).

My suggestion is flexible, but least ergonomic.

(I do still hope we don't need to reach for any of these alternatives for Layout Instability)

@mmocny

mmocny commented Sep 20, 2021

Trying to review and summarize this thread so we can have actionable next steps, and I see:

  • Ensure that the spec clarifies that buffered has limits
  • Ensure that the spec clarifies that buffer limits are configurable and may differ across user agents
    • see also the recent droppedEntriesCount work, which helps here (sketched below)
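
For example, a consumer can detect that it missed entries via droppedEntriesCount, which the Performance Timeline spec passes to the observer callback on its first invocation; roughly:

```js
const po = new PerformanceObserver((list, observer, options) => {
  // options.droppedEntriesCount (first invocation only) reports how many
  // entries did not fit in the size-limited buffer.
  if (options && options.droppedEntriesCount) {
    console.warn(`${options.droppedEntriesCount} layout-shift entries were dropped`);
  }
  for (const entry of list.getEntries()) {
    // handle buffered + newly dispatched entries
  }
});
po.observe({ type: "layout-shift", buffered: true });
```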

I no longer get the sense that there is an appetite for configuration options controlling whether buffering is enabled in the first place. While this offers some theoretical advantages, it seems difficult when considering the complexities of Document Policy and the fact that PerformanceObserver consumers are often third parties. By the time a 3p can register an intent to consume buffered PO data via other document hints, it could probably just as well register an observer.
