Catalog: Return consistent metadata-location for Iceberg REST APIs #10508

Open
jshmchenxi wants to merge 1 commit into main

@jshmchenxi jshmchenxi commented Mar 5, 2025

Problem

Different Nessie Iceberg REST APIs return metadata locations inconsistently:

  • loadTable → a metadata location with a filesystem URI scheme, e.g., s3://bucket/table/metadata/xxx.metadata.json.
  • updateTable → a metadata location pointing to a Nessie API endpoint, e.g., http://nessie-host:port/catalog/v1/trees/main%some-hash-value/snapshot/my.table?format=iceberg.

This causes issues for clients like iceberg-rust, which doesn’t support HTTP metadata locations.
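
To make the mismatch concrete, here is a minimal sketch (not code from this PR) that classifies a metadata location by the text before `://`. The class name, the method names, and the set of schemes treated as "filesystem" are assumptions made for illustration only.

```java
import java.util.Locale;
import java.util.Set;

public class MetadataLocationScheme {
  // Assumption: schemes treated as "filesystem" locations; extend as needed.
  static final Set<String> FILESYSTEM_SCHEMES =
      Set.of("s3", "s3a", "gs", "abfs", "abfss", "hdfs", "file");

  // Returns the lower-cased text before "://", or "" if there is none.
  // Plain string handling is used so that unusual characters in the
  // location do not cause a parsing failure.
  static String schemeOf(String location) {
    int idx = location.indexOf("://");
    return idx <= 0 ? "" : location.substring(0, idx).toLowerCase(Locale.ROOT);
  }

  // True if the location uses one of the assumed filesystem schemes.
  static boolean isFilesystemLocation(String location) {
    return FILESYSTEM_SCHEMES.contains(schemeOf(location));
  }

  public static void main(String[] args) {
    // The two example locations quoted in the problem description.
    String fromLoadTable = "s3://bucket/table/metadata/xxx.metadata.json";
    String fromUpdateTable =
        "http://nessie-host:port/catalog/v1/trees/main%some-hash-value/snapshot/my.table?format=iceberg";

    System.out.println(schemeOf(fromLoadTable) + " -> " + isFilesystemLocation(fromLoadTable));
    System.out.println(schemeOf(fromUpdateTable) + " -> " + isFilesystemLocation(fromUpdateTable));
  }
}
```

A client that only understands object-store locations would accept the first value and reject the second, which matches the behavior reported for iceberg-rust.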

Fix

Standardize metadata location responses so that they always use the filesystem URI scheme.

CLAassistant commented Mar 5, 2025

CLA assistant check
All committers have signed the CLA.

dimas-b (Member) left a comment

Thanks for your contribution @jshmchenxi !

I believe the /catalog/v1 URIs are a remnant of the early days... @snazy: Do you think they are still relevant?

dimas-b requested a review from snazy on March 6, 2025 01:53

dimas-b (Member) commented Mar 6, 2025

@jshmchenxi : Could you make sure the updateTable flow is covered by existing tests?... or add a new test if it was not covered, please.

jshmchenxi force-pushed the feat/iceberg-rest-metadata-location-consistency branch from 6af6bd6 to 5658881 on March 6, 2025 03:30

dimas-b (Member) left a comment

LGTM 👍

jshmchenxi (Author) commented

> @jshmchenxi : Could you make sure the updateTable flow is covered by existing tests?... or add a new test if it was not covered, please.

@dimas-b Since we're modifying the logic in the Resource class, we should add tests for it. However, I couldn't find any existing tests for the Iceberg Resource classes. Do we still need to add an end-to-end test for this PR? It looks complex to implement, and I might not have the bandwidth to add one right now.

dimas-b (Member) commented Mar 6, 2025

We run the usual Iceberg Catalog Test suite, but I guess it does not assert that metadata locations in the loadTable response are the same as in the updateTable response.

Could you add a specific test for that?... Maybe to AbstractIcebergCatalogUnitTests?

jshmchenxi (Author) commented

> We run the usual Iceberg Catalog Test suite, but I guess it does not assert that metadata locations in the loadTable response are the same as in the updateTable response.
>
> Could you add a specific test for that?... Maybe to AbstractIcebergCatalogUnitTests?

IIUC, the test suite uses a RESTCatalog from org.projectnessie.server.catalog.Catalogs#getCatalog for testing. However, the metadata location is only processed in RESTSessionCatalog or RESTTableOperations, and is not exposed by RESTCatalog methods.

dimas-b (Member) commented Mar 7, 2025

Right, it looks like the metadataLocation property in the REST response is not exposed outside the REST Catalog Java code.

However, given that the discrepancy before this fix affected iceberg-rust, I think it would be nice to add a test for this property.

WDYT about adding a new method to AbstractIcebergCatalogUnitTests that would use the usual Catalog Java object for setting up the table, but use RestAssured (example) for issuing an /updateTable (and view) request to verify the metadataLocation property in the response?
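
The assertion such a test would make can be sketched without any test framework. The snippet below is only an illustration under stated assumptions: the class and method names are hypothetical, the response bodies are hand-written samples rather than real Nessie output, and a regular expression stands in for a proper JSON parser. A real test would obtain the bodies over HTTP (for example with RestAssured, as suggested above) against a running server. The `metadata-location` field name follows the Iceberg REST specification.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetadataLocationConsistency {
  // Matches the "metadata-location" string property of a JSON response body.
  // A regex keeps the sketch dependency-free; a real test should parse the JSON.
  private static final Pattern METADATA_LOCATION =
      Pattern.compile("\"metadata-location\"\\s*:\\s*\"([^\"]*)\"");

  // Extracts the metadata location from a response body, if present.
  static Optional<String> metadataLocation(String json) {
    Matcher m = METADATA_LOCATION.matcher(json);
    return m.find() ? Optional.of(m.group(1)) : Optional.empty();
  }

  // True if the update response reports a metadata location and a subsequent
  // load response reports exactly the same one.
  static boolean consistent(String updateJson, String loadAfterUpdateJson) {
    Optional<String> updated = metadataLocation(updateJson);
    return updated.isPresent() && updated.equals(metadataLocation(loadAfterUpdateJson));
  }

  public static void main(String[] args) {
    // Hypothetical sample bodies, trimmed to the one property of interest.
    String update = "{\"metadata-location\": \"s3://bucket/table/metadata/xxx.metadata.json\"}";
    String load = "{\"metadata-location\": \"s3://bucket/table/metadata/xxx.metadata.json\"}";
    System.out.println("consistent: " + consistent(update, load));
  }
}
```

Comparing the update response against a load issued right after it checks the property described in the thread: both endpoints should report the same location for the same table state.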
