This repository was archived by the owner on Jun 19, 2022. It is now read-only.

Replace ReadonlyTargets interface with a TargetsConfig channel #815

Closed
wants to merge 9 commits

Conversation

@ian-mi (Contributor) commented Apr 10, 2020

A channel provides a simple and safe interface for config updates. The in-memory
config is no longer needed, as tests can simply write config values to the config
channel. For the volume config, the config channel replaces the notify channel,
and the config watcher writes updated config values to the channel.

Proposed Changes

  • Replace the ReadonlyTargets interface with a TargetsConfig channel. This simplifies configuration updates by preferring channel-based communication over shared memory (see the sketch below).
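To make the idea concrete, here is a minimal sketch of channel-based config delivery, with `TargetsConfig` simplified to a plain struct; the names are illustrative, not this PR's actual API:

```go
package main

import "fmt"

// TargetsConfig stands in for the broker targets proto; illustrative only.
type TargetsConfig struct {
	Generation int
}

// watchConfig plays the role of the config watcher: it writes each
// updated config value to the channel instead of mutating shared memory.
func watchConfig(updates chan<- *TargetsConfig) {
	for i := 1; i <= 3; i++ {
		updates <- &TargetsConfig{Generation: i}
	}
	close(updates)
}

func main() {
	updates := make(chan *TargetsConfig)
	go watchConfig(updates)

	// The consumer owns its current config; no mutex is needed because
	// new values only ever arrive over the channel.
	for cfg := range updates {
		fmt.Println("applying config generation", cfg.Generation)
	}
}
```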

@knative-prow-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ian-mi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@googlebot added the cla: yes (override cla status due to multiple authors bug) label on Apr 10, 2020
@ian-mi force-pushed the target-config-channel branch from 0a5f861 to 8da553f on April 10, 2020 at 23:34
@yolocs (Member) commented Apr 10, 2020

I think this is probably not the best time to do this refactoring unless it provides significant value. Grant is working on the controller, which also uses this lib for writing data. @grantr

@ian-mi force-pushed the target-config-channel branch from 8da553f to 07aeb1a on April 10, 2020 at 23:45
@ian-mi changed the title from "[WIP] Replace ReadonlyTargets interface with a TargetsConfig channel" to "Replace ReadonlyTargets interface with a TargetsConfig channel" on Apr 10, 2020
@ian-mi (Contributor, Author) commented Apr 10, 2020

/hold wait for broker reconcile to be merged

@yolocs (Member) commented Apr 11, 2020

Got time to take a more detailed look. Thanks for sharing ways to improve this lib and adopt best practices!

We should discuss what benefit this change could bring once the reconciler PR is merged. Here are a couple of things that might require some consideration:

  1. In the broker reconciler, we need an interface to safely modify TargetsConfig. The in-memory one isn't just for unit tests.
  2. At the moment, TargetsConfig has a single consumer in fanout/ingress, which makes passing the updated TargetsConfig over a channel look preferable. But that could change. Ideally TargetsConfig should be a singleton that can be read by multiple consumers for the latest value.
  3. Fanout and retry need to reconcile handlers 1) when TargetsConfig is updated and 2) when a resync period elapses (in case reconciliation fails in case 1). So the update signal shouldn't be the only signal (a sketch follows this list).
  4. I see that in handler.go you still need a mutex to guard the broker config, which is still sharing state via memory (rather than communication). I doubt that in this case we can totally avoid sharing state via memory.
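A minimal sketch of the dual-trigger loop from point 3, assuming a simplified targetsConfig type with the reconcile logic elided (names are illustrative, not from this PR):

```go
package main

import (
	"fmt"
	"time"
)

type targetsConfig struct{ generation int }

// syncLoop reconciles both when the config is updated and when the
// resync period elapses, so a failed reconcile is retried even if no
// new update signal ever arrives.
func syncLoop(updates <-chan targetsConfig, resync time.Duration, done <-chan struct{}) {
	ticker := time.NewTicker(resync)
	defer ticker.Stop()
	var current targetsConfig
	for {
		select {
		case current = <-updates:
			fmt.Println("reconcile on update, generation", current.generation)
		case <-ticker.C:
			fmt.Println("reconcile on resync, generation", current.generation)
		case <-done:
			return
		}
	}
}

func main() {
	updates := make(chan targetsConfig)
	done := make(chan struct{})
	go syncLoop(updates, 50*time.Millisecond, done)
	updates <- targetsConfig{generation: 1}
	time.Sleep(120 * time.Millisecond) // let a couple of resync ticks fire
	close(done)
}
```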

@ian-mi (Contributor, Author) commented Apr 11, 2020

Thanks for taking a look! The benefit of this change is mainly the simplification of concurrent communication and adherence to idiomatic Go best practices.

  1. I see; I still need to review the controller PR in more detail to see what functionality it needs. Perhaps synchronized access to the config proto would be sufficient? The controller's throughput requirements should be less stringent than the data plane's.
  2. If the need arises, it should be straightforward to add broadcast functionality.
  3. Could you elaborate on this requirement? It seems this could be accomplished by adding a resync period to the volume watcher.
  4. I agree that this refactor could go further. I can continue to iterate on this PR if you are on board with the approach I've taken.

ian-mi added 4 commits April 10, 2020 23:29
Only create a goroutine when there is concurrent work to do. Use a buffered
channel to bound concurrency instead of creating a fixed number of workers. Pass
the targets config to workers when they are created rather than synchronizing
shared access to config.
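For reference, a minimal sketch of the buffered-channel bound described in the commit above; the channel acts as a counting semaphore, and the names are illustrative:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

func main() {
	const maxConcurrency = 2
	sem := make(chan struct{}, maxConcurrency) // buffered channel used as a semaphore
	var wg sync.WaitGroup

	for i := 1; i <= 5; i++ {
		sem <- struct{}{} // acquire; blocks once maxConcurrency goroutines are running
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			fmt.Println("processing message", n)
			time.Sleep(10 * time.Millisecond)
		}(i)
	}
	wg.Wait()
}
```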
TargetsWatcher uses a channel to indicate the availability of an updated
TargetsConfig, which is returned alongside each config read.
The handler will not attempt to pull a message until it has acquired a
concurrency token. Messages are pulled in a separate goroutine in order to avoid
blocking handle. New messages are passed back to the goroutine so that they can
be handled using the latest broker config.
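A sketch of the pull loop this commit describes, with illustrative types; the real handler pulls from Pub/Sub, which is stubbed out here:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type message struct{ id int }
type brokerConfig struct{ generation int }

// loop acquires a token before pulling, pulls in its own goroutine so
// the select never blocks, and hands pulled messages back to the loop
// so they are handled with the latest broker config.
func loop(ctx context.Context, configCh <-chan brokerConfig, pull func() message) {
	tokens := make(chan struct{}, 2) // concurrency bound
	msgCh := make(chan message)
	var config brokerConfig
	for {
		select {
		case tokens <- struct{}{}: // token acquired; safe to pull one message
			go func() { msgCh <- pull() }() // may leak at shutdown in this sketch
		case m := <-msgCh:
			go func(m message, cfg brokerConfig) {
				defer func() { <-tokens }() // release the token when done
				fmt.Printf("message %d handled with config generation %d\n", m.id, cfg.generation)
			}(m, config)
		case config = <-configCh:
		case <-ctx.Done():
			return
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()
	configCh := make(chan brokerConfig, 1)
	configCh <- brokerConfig{generation: 1}
	src := make(chan message, 3)
	for i := 1; i <= 3; i++ {
		src <- message{id: i}
	}
	loop(ctx, configCh, func() message { return <-src }) // pull blocks once src drains
}
```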
ian-mi added 2 commits April 12, 2020 22:00
This allows SyncPool to use a plain map since all access is synchronized through
its goroutine.
@ian-mi (Contributor, Author) commented Apr 13, 2020

I believe this PR now addresses points 2-4. For point 1, I can simply re-add the in-memory config code; I think it makes sense for the in-memory config store used by the controller to be completely separate from the config broadcast functionality used by the data plane.

This is necessary since the test introspects internal state.
@knative-metrics-robot commented:

The following is the coverage report on the affected files.
Say /test pull-google-knative-gcp-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
|------|--------------|--------------|-------|
| pkg/broker/config/config.go | 100.0% | 0.0% | -100.0 |
| pkg/broker/config/volume/volume.go | 73.8% | 71.4% | -2.4 |
| pkg/broker/config/watcher.go | Do not exist | 100.0% | |
| pkg/broker/handler/handler.go | 82.8% | 78.3% | -4.5 |
| pkg/broker/handler/pool/fanout/pool.go | 81.4% | 76.3% | -5.1 |
| pkg/broker/handler/pool/options.go | 92.6% | 92.0% | -0.6 |
| pkg/broker/handler/processors/fanout/processor.go | 95.6% | 97.6% | 2.0 |
| pkg/broker/ingress/handler.go | 71.4% | 71.2% | -0.2 |
| pkg/broker/ingress/multi_topic_decouple_sink.go | 55.9% | 60.0% | 4.1 |

@knative-prow-robot (Contributor) commented:

@ian-mi: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
|-----------|--------|---------|---------------|
| pull-google-knative-gcp-go-coverage | fd2f6b7 | link | /test pull-google-knative-gcp-go-coverage |
| pull-google-knative-gcp-integration-tests | fd2f6b7 | link | /test pull-google-knative-gcp-integration-tests |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@yolocs (Member) left a comment

Thanks for the updates! My thoughts:

It's debatable whether this change is a pattern overuse (sharing state via communication). There is code that only reads the config data (config-readers, e.g. handler, processors) and code that needs to know when the config is updated (config-notifiees, e.g. syncpool). This change inevitably blurs the boundary between config-reader and config-notifiee: a config-reader now has to actively manage the update signal, which makes its code more complicated and harder to follow. I believe a config-reader should see config as "static" data and shouldn't worry about signals at all. FWIW, for the existing config libs I know (viper, go-micro/config), I don't see them forcing config-readers to accept config data from a channel (as a signal).

My thoughts aside, I'm very likely biased and can't see the benefits of this change (e.g. whether it could provide a significant performance improvement). So it's better to have more eyes on this (@grantr @Harwayne @grac3gao @ericlem, feel free to share your thoughts). I will put a temporary hold on this since the broker controller code hasn't been merged yet.

/hold

```diff
 }

 // Start starts the handler.
 // done func will be called if the pubsub inbound is closed.
-func (h *Handler) Start(ctx context.Context, done func(error)) {
+func (h *Handler) Start(ctx context.Context, config *config.Broker) {
```
@yolocs (Member) commented:

I intentionally avoided specific parameters (e.g. config.Broker). Handler <-> broker is not necessarily a 1-1 mapping. For retry, it would be one handler per target (trigger). In my thinking, the handler's only responsibility is to receive messages and invoke processors. It shouldn't make any assumptions about what data is needed by processors. The handler creator needs to put the data in the context.

@ian-mi (Contributor, Author) commented:

Currently, there is a 1-1 correspondence between brokers and handlers, and context is used as a form of implicit argument passing from handlers to processors as well as between processors. The context docs warn against this:

> Use context Values only for request-scoped data that transits processes and APIs, not for passing optional parameters to functions.

I think in this case, if we need a handler for a different purpose, we should simply write a new handler rather than trying to write an all-purpose handler, as this level of abstraction is not well suited to Go.

@yolocs (Member) commented:

I see it as a suggestion rather than a requirement. Knative code in a lot of places doesn't follow this suggestion: client/informer injection, the logger in the context, etc.

@ian-mi (Contributor, Author) commented:

Knative is definitely guilty of abusing context as well, but I don't necessarily think we should copy that practice if we can avoid it.

```go
		} else {
			go h.handleMessage(handlerctx.WithBroker(ctx, config), r.msg, tokens)
		}
	case config = <-h.ConfigCh:
```
@yolocs (Member) commented:

My $0.02 here: I feel this actually makes the code harder to read, because the handler now has to support config updates.

@ian-mi (Contributor, Author) commented:

I think it is a matter of getting used to the message-passing style of communication. Maybe it would be useful to have a wider discussion on whether we should prefer this style of concurrency in general.

@yolocs (Member) commented:

My point was not about whether people can read this style; it was about whether it's necessary. As I mentioned in the other comment, I see the handler as a config-reader, and it doesn't have to act on config updates, which also happens to make the code simpler.

I have @'d the folks who are actively working on the broker to share their opinions.

Comment on lines +67 to +68
```go
	case <-targets.Updated:
		targets = p.targetsWatcher.Targets()
```
@yolocs (Member) commented:

This part confuses me. This looks like the same "signal" idea as we have now, except wrapped into a struct?

@ian-mi (Contributor, Author) commented:

It's slightly different: this signal only indicates that the current config is out of date, and it can broadcast to any number of listeners.

The old signal channel required every signal to be pulled downstream or the watcher would block, and it did not support broadcast. It also did not tie the signal to the config, so if the config was updated multiple times before a listener had a chance to sync, the listener would still receive multiple signals and sync more often than necessary.

With TargetsWatcher, a listener receives the latest config along with an update signal tied to that specific config value. The listener can use this signal to determine when that config value should be refreshed, but is not required to receive from that channel, and the watcher will never block.
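One common way to get these semantics (broadcast to any number of listeners, a non-blocking watcher, and a signal tied to a specific config value) is to pair each snapshot with a channel that is closed when the snapshot goes stale. A sketch under that assumption; illustrative types, not necessarily this PR's exact code:

```go
package main

import "fmt"

// targetsSnapshot pairs a config value with an Updated channel that is
// closed when the value goes stale. Closing a channel broadcasts to any
// number of listeners and never blocks the watcher.
type targetsSnapshot struct {
	Value   string
	Updated chan struct{}
}

type targetsWatcher struct {
	current *targetsSnapshot // a real implementation would guard this with a mutex or atomic pointer
}

// Targets returns the latest snapshot; callers may ignore Updated entirely.
func (w *targetsWatcher) Targets() *targetsSnapshot { return w.current }

// update installs a new snapshot and closes the previous Updated
// channel, signalling every listener still holding the old snapshot.
func (w *targetsWatcher) update(value string) {
	old := w.current
	w.current = &targetsSnapshot{Value: value, Updated: make(chan struct{})}
	if old != nil {
		close(old.Updated)
	}
}

func main() {
	w := &targetsWatcher{}
	w.update("v1")
	snap := w.Targets()
	w.update("v2") // closes snap.Updated
	<-snap.Updated // receives immediately: the held snapshot is stale
	fmt.Println("latest config:", w.Targets().Value)
}
```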

@yolocs (Member) commented:

I guess I'm fine with TargetsWatcher if it provides the improvements you mention here. But I'd like the syncpool to be the only one that handles the update signal. Not the handler, not the processors.

```diff
-	if _, ok := p.targets.GetBrokerByKey(key.(string)); !ok {
-		value.(*handler.Handler).Stop()
-		p.pool.Delete(key)
+	for key, handler := range p.pool {
```
@yolocs (Member) commented:

The idea of readonly targets is (although not 100% safe) to make it more obvious that the targets are readonly and to make it more difficult to modify them.

@ian-mi (Contributor, Author) commented:

I see; maybe I should add the interface back to improve safety? I'm typically a fan of using protos directly, but I'm also used to a language where protos are immutable. Perhaps immutable-by-convention is good enough? My concern with using an interface is that it has to be kept up to date with the proto.

@grantr (Contributor) commented:

@ian-mi what's your specific concern with the interface needing to be kept up-to-date? Are you concerned about runtime errors? Engineering time used keeping the interface current?

@ian-mi (Contributor, Author) commented Apr 13, 2020

I took a look at the other config systems you mentioned. viper is not thread-safe and would require a process to serialize access to config: spf13/viper#268. On the other hand, go-micro/config does include a config Watcher type that can be used to watch for configuration changes, so I think its design is more comparable to TargetsWatcher.

One aspect of TargetsWatcher that needs clarification is that it does not require listening on the update channel. Targets() can always be called to get the latest config value. The update channel is provided as a means of avoiding spurious updates when state needs to be synchronized based on config updates, as is the case with SyncPool.
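Concretely, the two usage styles this implies might look like the following, reusing the illustrative targetsWatcher type from the earlier sketch (hypothetical helpers, not this PR's API):

```go
// A config-reader just asks for the latest value and never touches Updated.
func readerUse(w *targetsWatcher) string {
	return w.Targets().Value
}

// A config-notifiee (e.g. SyncPool) holds a snapshot and resyncs only
// when that exact snapshot is invalidated, avoiding spurious syncs when
// the config changed several times before it got a chance to run.
func notifieeUse(w *targetsWatcher, done <-chan struct{}) {
	snap := w.Targets()
	for {
		select {
		case <-snap.Updated: // this snapshot went stale
			snap = w.Targets() // fetch the latest and resync once
		case <-done:
			return
		}
	}
}
```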

@yolocs (Member) commented Apr 13, 2020

> One aspect of TargetsWatcher that needs clarification is that it does not require listening on the update channel. Targets() can always be called to get the latest config value. The update channel is provided as a means of avoiding spurious updates when state needs to be synchronized based on config updates, as is the case with SyncPool.

I was wondering how TargetsWatcher is different from the existing approach. The existing approach provides a signal (as a channel) and a ReadonlyTargets which always contains the latest value. TargetsWatcher seems to me to just wrap them into a struct.

By mentioning those config libs, my point was that we need to treat the config-reader (who just wants to read the latest data) and the config-notifiee (who needs to act on config updates) differently. We shouldn't force the config-reader to handle update signals. But again, that's just my opinion.

@knative-prow-robot (Contributor) commented:

@ian-mi: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@grantr (Contributor) commented Apr 15, 2020

I haven't had a chance to look at this yet but hope to soon. Let's try to resolve it during the v0.15.0-M1 sprint.

@grantr self-assigned this on Apr 15, 2020
@grantr (Contributor) left a comment

@ian-mi I feel like I don't have enough context to understand what the practical improvement is here. Can you provide some examples of how one would use this new interface to implement a handler or processor, and why that makes reading or maintaining the code easier?

From the broker controller perspective, there is one goroutine that reads the current state, modifies it, and writes it back. The broker controller must manage this concurrency itself because it needs to serialize writes to the configmap, and the memory targets impl doesn't do this (though #830 would help).

I expect this to be a temporary condition, because once we adopt #852 the read-modify-write code won't need to deal with concurrency: only one thread will ever be operating on the targets config and configmap.
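A minimal sketch of that single-writer pattern, with illustrative types: every read-modify-write is funneled through one goroutine, which also performs the serialized write-back, so no locking is needed.

```go
package main

import "fmt"

type targetsConfig struct{ brokers map[string]string }

// configOwner funnels every read-modify-write through one goroutine,
// so neither the config nor the configmap write needs a lock.
func configOwner(ops <-chan func(*targetsConfig), persist func(*targetsConfig)) {
	cfg := &targetsConfig{brokers: map[string]string{}}
	for op := range ops {
		op(cfg)      // modify
		persist(cfg) // serialized write-back (e.g. to the configmap)
	}
}

func main() {
	ops := make(chan func(*targetsConfig))
	done := make(chan struct{})
	go func() {
		configOwner(ops, func(c *targetsConfig) { fmt.Println("persisted", len(c.brokers), "brokers") })
		close(done)
	}()
	ops <- func(c *targetsConfig) { c.brokers["ns/broker-a"] = "topic-a" }
	ops <- func(c *targetsConfig) { c.brokers["ns/broker-b"] = "topic-b" }
	close(ops)
	<-done
}
```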


@grantr modified the milestones: v0.15.0-M1, v0.15.0-M2 on Apr 29, 2020
@grantr removed their assignment on May 11, 2020
@ian-mi (Contributor, Author) commented Jul 1, 2020

Closing since I haven't had time to update this PR.

@ian-mi closed this on Jul 1, 2020
Labels: approved, cla: yes (override cla status due to multiple authors bug), do-not-merge/hold, needs-rebase, size/XXL
Projects: None yet
6 participants