Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add synchronization to createPlan #25272

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

SemionPar
Copy link
Contributor

Description

createPlan incremented concurrentQueries without acquiring the lock

Additional context and related issues

Fixes #23582

Before the change:
image

After the change:
image

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Mar 10, 2025
@SemionPar SemionPar requested a review from ebyhr March 10, 2025 18:04
@@ -583,18 +583,24 @@ public Plan createPlan(Session session, String sql)
// session must be in a transaction registered with the transaction manager in this query runner
getTransactionManager().getTransactionInfo(session.getRequiredTransactionId());

spansValid = concurrentQueries.incrementAndGet() == 1;
lock.readLock().lock();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this lock?

concurrentQueries is an AtomicInteger, so it's safe to update outside of a lock (plus, this is a read lock, so it wouldn't protect from concurrent modifications anyway)

Copy link
Contributor Author

@SemionPar SemionPar Mar 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concurrentQueries is an AtomicInteger, so it's safe to update outside of a lock

True to a T! Please note, however, that

spansValid = concurrentQueries.incrementAndGet() == 1;

is a compound operation which is not atomic, but even this isn't the root cause.

What I believe was happening here (and I am talking about flakiness of BaseBigQueryConnectorTest.testBigQueryMaterializedView which we see when the whole class is ran concurrently) is this:

testBigQueryMaterializedView(){
  executeExclusively(...) // <----- gets write lock and executes two queries sequentially, each time calling getSpans
}

// in some other thread, concurrently
testSomeOtherThing(){
  QueryAssert.someAssertCallingCreatePlan(...)  // <---- without locking, it goes ahead and sets spansValid == false
}

Adding the lock here means someAssertCallingCreatePlan won't start until executeExclusively releases the write lock - I assume this is the intended behavior. Why a read lock and not a write lock? TBH I wasn't sure myself, but not looking far, there is executeInternal() which does use read lock in the same way.

Copy link
Member

@martint martint Mar 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so the lock is to prevent running concurrently when executeExclusively, not for protecting access to the concurrentQueries counter. Please change the description in the PR and add an explanation in the commit message explaining this.

@SemionPar SemionPar requested a review from martint March 11, 2025 12:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Flaky TestBigQueryAvroConnectorTest.testBigQueryMaterializedView
2 participants