
feat: add backoff and jitter to retry #643

Merged · 11 commits · Oct 13, 2024

Conversation

@proost (Contributor) commented Oct 3, 2024:

Closes #633.

This adds a retry policy. The default is exponential backoff with jitter.

Keeping the retry-handler logic clean is not easy; if you have an idea, please point it out in review.
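For context, a minimal sketch of exponential backoff with full jitter (the RetryDelay helper and the constants here are hypothetical, not necessarily the defaults this PR implements):

	package main

	import (
		"fmt"
		"math/rand"
		"time"
	)

	const (
		baseDelay = 100 * time.Millisecond
		maxDelay  = 10 * time.Second
	)

	// RetryDelay returns a random delay drawn uniformly from
	// [0, min(baseDelay*2^attempt, maxDelay)].
	func RetryDelay(attempt int) time.Duration {
		backoff := baseDelay << attempt // baseDelay * 2^attempt
		if backoff <= 0 || backoff > maxDelay {
			backoff = maxDelay // cap the window; also guards against shift overflow
		}
		return time.Duration(rand.Int63n(int64(backoff) + 1))
	}

	func main() {
		for attempt := 0; attempt < 5; attempt++ {
			fmt.Printf("attempt %d: sleep %v\n", attempt, RetryDelay(attempt))
		}
	}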

@proost requested a review from rueian on October 5, 2024 08:49
@proost requested a review from rueian on October 6, 2024 14:22
cluster.go (outdated diff) — review comment on lines 619 to 630:

if !isWaitingForRetry {
	shouldRetry := c.retryHandler.WaitUntilNextRetry(
		ctx, attempts, resp.Error(),
	)

	if !shouldRetry {
		continue
	}

	isWaitingForRetry = true
}
rueian (Collaborator) commented:

Why do you put the waiting here instead of at rueidis/cluster.go, lines 679 to 681 (commit fc348f5):

if len(retries.m) != 0 {
goto retry
}

proost (Contributor, Author) replied:

If we don't sleep here, the code gets quite complicated, because at line 679 we would have to distinguish plain retries from ASK/MOVED redirections: a retry should wait, while a redirection should be retried immediately.

proost (Contributor, Author) added:

Even with the randomness, I think the waiting times of the goroutines converge to roughly the same value. If you want the wait to happen exactly once, I will change it.

rueian (Collaborator) replied:

Thanks for catching this detail I overlooked.

I think a better way to handle this is: if any MOVED/ASK redirection happens, we don't back off before retrying.

So how about adding a new counter field to the retries struct? Then we could do this:

	if len(retries.m) != 0 {
		if retries.redirects != 0 {
			retries.redirects = 0
			goto retry
		}
		if c.retryHandler.WaitUntilNextRetry(...) {
			goto retry
		}
	}

rueian (Collaborator) added:

We can increment the counter after the lock:
[screenshot of the suggested change]
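The screenshot is not preserved here; a self-contained sketch of the pattern being suggested, with illustrative names rather than the actual rueidis internals, bumping the counter while the same lock that guards the retry map is held:

	package locksketch

	import "sync"

	type retries struct {
		mu        sync.Mutex
		m         map[string][]string // connection address -> commands queued for retry
		redirects int                 // MOVED/ASK redirections seen in this round
	}

	// requeue records a command for the next retry round; the redirect counter
	// is incremented under the same lock that already protects the map, so a
	// plain int is enough here.
	func (r *retries) requeue(addr, cmd string) {
		r.mu.Lock()
		defer r.mu.Unlock()
		r.m[addr] = append(r.m[addr], cmd)
		r.redirects++
	}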

proost (Contributor, Author) replied:

2df785e

We support making retries configurable, so the error needs to be passed to RetryDelay; because of that I use a channel to gather the errors.

We can't use a plain int because it is not thread-safe. I prefer an atomic here, since the type itself guarantees thread safety. Even though doretry and doretrycache use a mutex internally, it is too implicit for me that loading the int value is thread-safe.

rueian (Collaborator) replied:

Hi @proost,

I think a retryErrCh := make(chan error, len(multi)) is overkill. Let me find a better way for you later.

> it is too implicit for me that loading the int value is thread-safe

It is actually thread-safe, because we read the field after a wg.Wait().
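A minimal, self-contained illustration of that point with made-up names: wg.Wait() returns only after every wg.Done(), so a plain field written by the workers can be read safely once Wait returns.

	package main

	import (
		"fmt"
		"sync"
	)

	func main() {
		var (
			wg        sync.WaitGroup
			mu        sync.Mutex
			redirects int
		)
		for i := 0; i < 4; i++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				mu.Lock()
				redirects++ // writers still need the lock among themselves
				mu.Unlock()
			}()
		}
		wg.Wait()              // every Done happens before Wait returns
		fmt.Println(redirects) // safe to read without the lock; always prints 4
	}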

rueian (Collaborator) added:

Atomics are fine too, but please just use uint32, which is smaller and has no alignment issues.
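A sketch of the atomic variant, matching the Redirects.Load()/Store(0) calls that appear in the later diff; the surrounding type and method names are assumptions:

	package atomicsketch

	import "sync/atomic"

	type retries struct {
		Redirects atomic.Uint32 // incremented whenever a MOVED/ASK redirection is requeued
	}

	// recordRedirect is safe to call concurrently from the result goroutines.
	func (r *retries) recordRedirect() {
		r.Redirects.Add(1)
	}

	// takeRedirects reports whether any redirection happened and resets the
	// counter for the next retry round.
	func (r *retries) takeRedirects() bool {
		if r.Redirects.Load() > 0 {
			r.Redirects.Store(0)
			return true
		}
		return false
	}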

rueian (Collaborator) added:

Hi @proost,

I think a better alternative to retryErrCh is calling RetryDelay directly in the doresultfn function and storing its maximum in the retries struct. It could look something like this:

[screenshot of the suggested change]
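The screenshot is not preserved; a self-contained sketch of the alternative described, with illustrative names: each result callback computes its own delay and only the maximum is kept, so the retry loop sleeps at most once instead of draining a channel of errors.

	package delaysketch

	import (
		"sync"
		"time"
	)

	type retries struct {
		mu         sync.Mutex
		RetryDelay time.Duration // largest delay requested by any failed command
	}

	// storeDelay keeps the maximum of all requested delays; intended to be
	// called from doresultfn-style callbacks that run concurrently.
	func (r *retries) storeDelay(d time.Duration) {
		r.mu.Lock()
		if d > r.RetryDelay {
			r.RetryDelay = d
		}
		r.mu.Unlock()
	}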

proost (Contributor, Author) replied:

> It is actually thread-safe, because we read the field after a wg.Wait().

Oh, I missed the wait on the sync.WaitGroup.

Changed to use retries.RetryDelay: f04b433

@proost requested a review from rueian on October 7, 2024 14:00
cluster.go (outdated diff) — review comment on lines 703 to 714:
for retryableErr := range retryErrCh {
	shouldRetry := c.retryHandler.WaitUntilNextRetry(ctx, attempts, retryableErr)
	if shouldRetry {
		attempts++
		goto retry
	}
}

if retries.Redirects.Load() > 0 {
	retries.Redirects.Store(0) // reset
	goto retry
}
rueian (Collaborator) commented:

I think the if retries.Redirects.Load() > 0 check should have higher priority than the for retryableErr := range retryErrCh loop.

If there is a redirect response, we should not delay the retry.
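A sketch of the reordering being asked for, rearranging the outdated snippet quoted above (not the final merged code):

	// A MOVED/ASK redirection means the retry should happen immediately,
	// so check the redirect counter before applying any backoff delay.
	if retries.Redirects.Load() > 0 {
		retries.Redirects.Store(0) // reset for the next round
		goto retry
	}

	// Only non-redirect retryable errors go through the backoff handler.
	for retryableErr := range retryErrCh {
		if c.retryHandler.WaitUntilNextRetry(ctx, attempts, retryableErr) {
			attempts++
			goto retry
		}
	}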


@proost requested a review from rueian on October 12, 2024 09:57
@rueian (Collaborator) left a review comment:

I think this PR looks very good now. We can merge it. Thank you @proost for this great work! Really impressive.

One more thing I would like to optimize is the position of the newly added retryHandler field:
[screenshot of the struct fields]

Putting it right after a bool can waste some space to padding; could you rearrange the fields in a follow-up PR? Also, can retryHandler be a plain struct instead of a pointer to a struct?
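A self-contained illustration of the padding concern with made-up field names (not the actual client struct): interleaving a bool with pointer-sized fields forces padding, while grouping the larger fields first lets the bools share one word.

	package main

	import (
		"fmt"
		"unsafe"
	)

	type handler struct{ attempts int } // stand-in for a retryHandler value

	type wasteful struct {
		closing      bool     // 1 byte + 7 bytes of padding to align the pointer
		retryHandler *handler // 8 bytes
		noCache      bool     // 1 byte + 7 bytes of trailing padding
	}

	type compact struct {
		retryHandler *handler // 8 bytes
		closing      bool     // the two bools now share one 8-byte word
		noCache      bool
	}

	func main() {
		fmt.Println(unsafe.Sizeof(wasteful{}), unsafe.Sizeof(compact{})) // 24 16 on 64-bit
	}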

@proost (Contributor, Author) commented Oct 13, 2024:

@rueian Sure, I will change it in a follow-up PR.

@rueian merged commit 113c567 into redis:main on Oct 13, 2024
27 checks passed
@proost deleted the feat-adding-backoff-and-jitter-to-retry branch on October 14, 2024 02:53
Merging this pull request may close: Feature: retry backoffs (#633)