[Bug]: Severe performance degradation in Nessie v.0.102.3 #10536

bigluck · 2025-03-13T16:54:37Z

What happened

We are experiencing critical performance issues with Nessie v.0.102.3, specifically with the /v2/trees endpoint when retrieving branch listings. In severe cases, these performance problems are causing server failures.

Nessie Version: v.0.102.3
DB Repo Id: 3
Data Volume:
- 7 branches
- 38k tags
- 38,312 elements in the refs table for this repo_id
- 120,000 total refs on the development RDS instance

Following the support team's recommendation, we disabled the option NESSIE.VERSION.STORE.PERSIST.CACHE-ENABLE-SOFT-REFERENCES on Wednesday, March 12 (#10526). While this change has improved API stability, it has also significantly increased response times.

Current Behavior

- Response times for the /trees endpoint range from 4-5 seconds when retrieving just 7 branches
- Before that option was disabled, response times were already high at 2-3 seconds
- The server occasionally crashes under this load

Expected Behavior

The /trees endpoint should return results in milliseconds for such a small result set (7 branches), regardless of the total number of refs/tags in the system, especially for a simple HTTP request like this one:

/api/v2/trees?fetch=MINIMAL&filter=refType+%3D%3D+%22BRANCH%22&max-records=500
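For anyone who wants to time the same request, here is a minimal sketch that rebuilds the query string above and measures one round trip. The base URL (`http://localhost:19120`) and the helper names are assumptions for illustration, not taken from our deployment; only the path and query parameters come from the request shown above.

```python
import time
import urllib.parse
import urllib.request


def build_trees_url(base_url: str) -> str:
    """Rebuild the request from this report: branches only, minimal fetch."""
    query = urllib.parse.urlencode(
        {
            "fetch": "MINIMAL",
            "filter": 'refType == "BRANCH"',  # encoded as refType+%3D%3D+%22BRANCH%22
            "max-records": 500,
        }
    )
    return f"{base_url}/api/v2/trees?{query}"


def time_request(url: str) -> float:
    """Return the wall-clock seconds for one GET, including reading the body."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()
    return time.perf_counter() - start


# Hypothetical base URL; replace it with the address of your Nessie server.
url = build_trees_url("http://localhost:19120")
print(url)
# To measure against a running server: print(time_request(url))
```

Running `time_request` a few times in a row should show whether the 4-5 second latency is constant or only affects the first (cold) request.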

We suspect that Nessie is not applying the appropriate database-level filters. It appears to be iterating through the entire collection of 40,000+ objects and filtering them in-memory rather than using efficient database queries.
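To make the suspicion concrete, the sketch below contrasts the two strategies on an in-memory SQLite database. Everything in it is an assumption for illustration: the two-column `refs` schema, the `refs/heads/` and `refs/tags/` name prefixes, and the helper names are hypothetical and do not reflect Nessie's actual schema or code. It only shows the difference between scanning every ref and letting the database narrow the result.

```python
import sqlite3


def build_demo_db(n_tags: int = 38305, n_branches: int = 7) -> sqlite3.Connection:
    """Create a hypothetical refs table sized like our repo (38,312 rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE refs (repo_id INTEGER, ref_name TEXT, "
        "PRIMARY KEY (repo_id, ref_name))"
    )
    rows = [(3, f"refs/tags/tag-{i}") for i in range(n_tags)]
    rows += [(3, f"refs/heads/branch-{i}") for i in range(n_branches)]
    conn.executemany("INSERT INTO refs VALUES (?, ?)", rows)
    return conn


def branches_filtered_in_memory(conn):
    """Load every ref for the repo, then filter in application code."""
    all_refs = conn.execute(
        "SELECT ref_name FROM refs WHERE repo_id = 3"
    ).fetchall()
    branches = [name for (name,) in all_refs if name.startswith("refs/heads/")]
    return branches, len(all_refs)


def branches_filtered_in_db(conn):
    """Let the database return only names in the refs/heads/ prefix range."""
    rows = conn.execute(
        "SELECT ref_name FROM refs WHERE repo_id = 3 "
        "AND ref_name >= 'refs/heads/' AND ref_name < 'refs/heads0'"
    ).fetchall()
    return [name for (name,) in rows], len(rows)


conn = build_demo_db()
mem_branches, rows_loaded_mem = branches_filtered_in_memory(conn)
db_branches, rows_loaded_db = branches_filtered_in_db(conn)
print(rows_loaded_mem, rows_loaded_db)
```

Both functions return the same 7 branches, but the first one pulls all 38,312 rows into the application to find them, while the second transfers only the matching rows.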

How to reproduce it

IDK

Nessie server type (docker/uber-jar/built from source) and version

docker

Client type (Ex: UI/Spark/pynessie ...) and version

No response

Additional information

This is the trace from a request that took 4 seconds to generate a response; it contains:

632 ObservingPersist.fetchReferences
31354 ObservingPersist.fetchTypedObj
638 ObservingPersist.fetchReference
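As a rough sanity check on those counts, the short calculation below totals the store calls in the trace and relates them to the 7 branches returned and the 4-second response. The 4-second figure and the call counts come from the trace above; treating the calls as the only cost and dividing evenly is a simplifying assumption.

```python
# Call counts copied from the trace of the 4-second request.
span_counts = {
    "ObservingPersist.fetchReferences": 632,
    "ObservingPersist.fetchTypedObj": 31354,
    "ObservingPersist.fetchReference": 638,
}

branches_returned = 7
response_seconds = 4.0

total_calls = sum(span_counts.values())
typed_obj_per_branch = span_counts["ObservingPersist.fetchTypedObj"] / branches_returned
avg_ms_per_call = response_seconds * 1000 / total_calls

print(f"total store calls: {total_calls}")
print(f"fetchTypedObj calls per returned branch: {typed_obj_per_branch:.0f}")
print(f"average time budget per call: {avg_ms_per_call:.3f} ms")
```

The total is 32,624 store calls for a response containing 7 branches, which is the same order of magnitude as the number of refs in the repo rather than the number of branches requested.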

Why does a query to retrieve only 7 branches take 4-5 seconds to execute?
Is there a configuration setting or optimization that would allow Nessie to properly filter at the database level?
Are there any known issues with the /trees endpoint when dealing with repositories that have a large number of refs in the db?

This performance issue is significantly affecting our dev and prod systems. The excessive response times and occasional server failures are unfortunately blocking critical operations.
