Add support for S3 EncryptionMaterialsProvider to PrestoS3FileSystem #3802
Add generic support for S3 encryption materials providers to the Hive connector so that files in S3 encrypted with client-supplied keys can be read.
Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla - and if you have received this in error or have any questions, please drop us a line at [email protected]. Thanks!
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!
@@ -444,6 +455,9 @@ protected ObjectListing computeNext(ObjectListing previous)

    private Iterator<LocatedFileStatus> statusFromObjects(List<S3ObjectSummary> objects)
    {
        // NOTE: for encrypted objects, S3ObjectSummary.size() used below is NOT correct,
        // however, to get the correct size we'd need to make an additional request to get
        // user metadata, and in this case it doesn't matter.
Is this "doesn't matter" because the encrypted size is always larger? The Hive connector computes splits based on this size, so if the encrypted size was smaller, it would be possible to not generate a split for part of the file and thus miss some data.
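To make the concern above concrete, here is a minimal sketch of size-based split generation. It is a hypothetical illustration, not the Hive connector's actual split logic: the class name, method names, and byte counts are all assumptions chosen for the example. It shows that a listed size larger than the real content still covers every byte, while a smaller one leaves the tail of the file unassigned.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Hive connector's real split code.
final class SplitCoverage
{
    private SplitCoverage() {}

    // Returns {start, length} pairs covering [0, listedSize) in chunks of at most maxSplitSize.
    static List<long[]> computeSplits(long listedSize, long maxSplitSize)
    {
        List<long[]> splits = new ArrayList<>();
        for (long start = 0; start < listedSize; start += maxSplitSize) {
            splits.add(new long[] {start, Math.min(maxSplitSize, listedSize - start)});
        }
        return splits;
    }

    // Counts how many bytes of the real content fall inside some split.
    static long coveredBytes(List<long[]> splits, long actualSize)
    {
        long covered = 0;
        for (long[] split : splits) {
            long end = Math.min(split[0] + split[1], actualSize);
            covered += Math.max(0, end - split[0]);
        }
        return covered;
    }

    public static void main(String[] args)
    {
        long actualSize = 1000; // assumed number of bytes the reader actually sees

        // Listed size larger than the real size: all 1000 bytes are covered.
        System.out.println(coveredBytes(computeSplits(1016, 256), actualSize));

        // Listed size smaller than the real size: only 900 bytes are covered.
        System.out.println(coveredBytes(computeSplits(900, 256), actualSize));
    }
}
```

In this sketch an overestimate only makes the last split extend past the end of the data, which a reader can treat as end of file; an underestimate silently drops rows.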
Correct. It doesn't matter here because the encrypted size is always larger, and in this case the size is only used for the split calculation. It matters most when seeking within a file, since seeking in S3 must be done relative to the unencrypted size (for instance, to read the last X bytes of a file to pull out metadata, the reported size must be the unencrypted size).
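The seek case can be sketched as follows. This is an illustration under stated assumptions, not the code in this pull request: the class and method names are hypothetical, the user metadata is modeled as a plain map, and the key `x-amz-unencrypted-content-length` is assumed to be where the encryption client records the plaintext length.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical illustration only; names and the metadata key are assumptions.
final class UnencryptedSize
{
    // Assumed user-metadata key holding the plaintext length of an encrypted object.
    static final String UNENCRYPTED_LENGTH_KEY = "x-amz-unencrypted-content-length";

    private UnencryptedSize() {}

    // Prefers the plaintext length from user metadata; falls back to the listed size.
    static long effectiveSize(long listedSize, Map<String, String> userMetadata)
    {
        String value = userMetadata.get(UNENCRYPTED_LENGTH_KEY);
        return (value != null) ? Long.parseLong(value) : listedSize;
    }

    // Offset at which the last footerLength bytes of a file begin.
    static long footerStart(long fileSize, long footerLength)
    {
        return Math.max(0, fileSize - footerLength);
    }

    public static void main(String[] args)
    {
        long listedSize = 1016; // assumed encrypted size from the object listing
        Map<String, String> metadata = Collections.singletonMap(UNENCRYPTED_LENGTH_KEY, "1000");

        // With the plaintext length, the footer offset is 1000 - 8 = 992.
        System.out.println(footerStart(effectiveSize(listedSize, metadata), 8));

        // With only the listed size, the offset is 1016 - 8 = 1008, past the real data.
        System.out.println(footerStart(effectiveSize(listedSize, Collections.<String, String>emptyMap()), 8));
    }
}
```

The fallback to the listed size keeps unencrypted objects working without an extra metadata request, which matches the trade-off described in the code comment under review.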
Is there an ETA on when this might be integrated?
Merged, thanks!
Sorry for the delay. This will be in the 0.129 release.
Wooo! Nice, thanks. No worries on delays, this is awesome. -nate