[PROTOCOL RFC] Enforce Vacuum Writer Protocol Check #2630

Closed
1 of 3 tasks
prakharjain09 opened this issue Feb 13, 2024 · 0 comments · Fixed by #2808
@prakharjain09
Collaborator

Protocol Change Request

Description of the protocol change

This RFC proposes a new ReaderWriter feature called vacuumProtocolCheck, which ensures that Vacuum performs both the Reader and the Writer protocol checks.

Motivation

Today, Vacuum does not perform the Writer protocol check in all cases. It performs the check unintentionally on some clouds (Azure/GCP), where we make a commit for the Vacuum start and end operations. On AWS, however, Vacuum skips the check because vacuum logging is disabled. This problem does not exist for the Reader protocol check, because Vacuum always performs it at the beginning as part of snapshot creation.

The missing Writer protocol check causes backward compatibility issues for various Writer-only features, e.g. Managed Commit, where commit discovery has changed. As a result, an older Delta client could run a Vacuum command, wrongly delete in-use files, and corrupt the table.

Design

Please find the detailed design here: https://docs.google.com/document/d/15o8WO2T0vN21S5JG-FT_ZNhXFCWyh0i9tqhr9kBmZpE/edit

At a high level, the new ReaderWriter feature vacuumProtocolCheck requires the following:

  1. A Delta reader doesn’t need to understand or change anything new.
  2. A Delta writer can support this feature by adopting one of the following options:
    • Option-1: Affirm that it doesn’t support the Vacuum operation. This option can be used by external Delta connectors that already do not support Vacuum, e.g. Flink.
    • Option-2: Make sure the Vacuum implementation performs a Writer protocol check before deleting any files.
    • Option-3: Refuse to run the VACUUM operation on tables that have the feature enabled (other reads and writes can still proceed normally). This option is discouraged because it would likely take more work to conditionally block VACUUM than to just perform the Writer protocol check.
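
To illustrate Option-2, here is a minimal Python sketch of a Vacuum routine that runs both protocol checks before it deletes anything. It is not the Delta Lake implementation: the Protocol and Client classes, the check_read/check_write helpers, the delete_file callback, and the feature name someWriterOnlyFeature are all hypothetical, and the compatibility rule is simplified to "the client's maximum versions are high enough and every listed table feature is supported".

```python
from dataclasses import dataclass

FEATURE = "vacuumProtocolCheck"


@dataclass(frozen=True)
class Protocol:
    """Hypothetical, simplified view of a table's protocol action."""
    min_reader_version: int
    min_writer_version: int
    reader_features: frozenset = frozenset()
    writer_features: frozenset = frozenset()


@dataclass(frozen=True)
class Client:
    """Hypothetical description of what a Delta client supports."""
    max_reader_version: int
    max_writer_version: int
    supported_reader_features: frozenset = frozenset()
    supported_writer_features: frozenset = frozenset()


class UnsupportedProtocolError(Exception):
    pass


def check_read(client: Client, protocol: Protocol) -> None:
    # Simplified rule (assumption): version must be supported and
    # every reader feature on the table must be known to the client.
    missing = protocol.reader_features - client.supported_reader_features
    if protocol.min_reader_version > client.max_reader_version or missing:
        raise UnsupportedProtocolError(f"cannot read table; missing {sorted(missing)}")


def check_write(client: Client, protocol: Protocol) -> None:
    # Same simplified rule, applied to the writer side.
    missing = protocol.writer_features - client.supported_writer_features
    if protocol.min_writer_version > client.max_writer_version or missing:
        raise UnsupportedProtocolError(f"cannot write table; missing {sorted(missing)}")


def vacuum(client: Client, protocol: Protocol, candidates, delete_file) -> int:
    """Delete candidate files only after BOTH protocol checks pass."""
    check_read(client, protocol)   # already happens today during snapshot creation
    check_write(client, protocol)  # the check this RFC makes mandatory
    deleted = 0
    for path in candidates:
        delete_file(path)
        deleted += 1
    return deleted
```

With this shape, an older client that does not recognize a Writer-only feature on the table fails in check_write before the deletion loop starts, so no in-use files are removed.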

Willingness to contribute

The Delta Lake Community encourages protocol innovations. Would you or another member of your organization be willing to contribute this feature to the Delta Lake code base?

  • Yes. I can contribute.
  • Yes. I would be willing to contribute with guidance from the Delta Lake community.
  • No. I cannot contribute at this time.
@prakharjain09 prakharjain09 added this to the 3.2.0 milestone Feb 29, 2024
scottsand-db pushed a commit that referenced this issue Mar 4, 2024
…uest (#2693)

## Protocol Change Request

### Description of the protocol change
Adds the VacuumProtocolCheck PROTOCOL change proposal. Design Doc:
https://docs.google.com/document/d/15o8WO2T0vN21S5JG-FT_ZNhXFCWyh0i9tqhr9kBmZpE/edit#heading=h.4cz970y1mk93

Protocol RFC issue: #2630

### Willingness to contribute

The Delta Lake Community encourages protocol innovations. Would you or
another member of your organization be willing to contribute this
feature to the Delta Lake code base?

- [x] Yes. I can contribute.
- [ ] Yes. I would be willing to contribute with guidance from the Delta
Lake community.
- [ ] No. I cannot contribute at this time.

---------

Co-authored-by: Prakhar Jain <[email protected]>
@tdas tdas moved this from Todo to In Progress in Linux Foundation Delta Lake Roadmap Mar 25, 2024
@tdas tdas closed this as completed in #2808 Apr 3, 2024
tdas pushed a commit that referenced this issue Apr 3, 2024
## Protocol Change Request

### Description

Adds the VacuumProtocolCheck PROTOCOL change. Design Doc:

https://docs.google.com/document/d/15o8WO2T0vN21S5JG-FT_ZNhXFCWyh0i9tqhr9kBmZpE/edit#heading=h.4cz970y1mk93

closes #2630

### Willingness to contribute

The Delta Lake Community encourages protocol innovations. Would you or
another member of your organization be willing to contribute this
feature to the Delta Lake code base?

- [x] Yes. I can contribute.
- [ ] Yes. I would be willing to contribute with guidance from the Delta
Lake community.
- [ ] No. I cannot contribute at this time.
andreaschat-db pushed a commit to andreaschat-db/delta that referenced this issue Apr 16, 2024