-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for delta timestamp_ntz datatype #24418
base: master
Are you sure you want to change the base?
Conversation
presto-delta/src/test/java/com/facebook/presto/delta/TestTimestampColumns.java
Outdated
Show resolved
Hide resolved
Thanks for the release note! Nits suggested: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please refactor the commit message, PR title/description and Release note to timestampNTZ rather than timezoneNTZ.
c6371a6
to
fb5fef3
Compare
presto-delta/src/test/java/com/facebook/presto/delta/TestDeltaIntegration.java
Outdated
Show resolved
Hide resolved
fb5fef3
to
bd1117e
Compare
bd1117e
to
2d81166
Compare
Thanks for the release note! Some changes to follow the Release Notes Guidelines:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good, please update the test
presto-delta/src/test/java/com/facebook/presto/delta/TestDeltaIntegration.java
Show resolved
Hide resolved
presto-delta/src/test/java/com/facebook/presto/delta/TestDeltaIntegration.java
Outdated
Show resolved
Hide resolved
5da9113
to
fe7d66e
Compare
7859b12
to
3ece8ef
Compare
3ece8ef
to
15f8db5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@infvg thank you for the fix. I am still in the process of understanding and verifying its correctness. I have a couple of comments, and I will add more if necessary.
Have you verified compatibility with Spark? I think it would be beneficial to ensure that Presto's handling of timestamps for delta tables aligns with Spark's behavior.
Additionally, have we included tests for both partitioned and non-partitioned tables?
presto-delta/src/main/java/com/facebook/presto/delta/DeltaTypeUtils.java
Outdated
Show resolved
Hide resolved
targetType.writeLong(newBlockBuilder, longArrayBlock.getLong(position, 0)); | ||
// If the first 12 bits of a timestamp with timezone are all 0, then only the unix timestamp is encoded | ||
// in the long & we can set the last 12 bits as 0 to default to UTC. | ||
if ((longArrayBlock.getLong(position) & 0xFFFF000000000000L) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As Presto only uses the first 12 bits for the time zone key, shouldn't we use a 12-bit mask here? 0xFFFF000000000000L
is a 16-bit mask.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unix timestamps are max 32 bits, which gets pushed 12 bits. If no timezone is encoded, then the first 32 bits will contain the timestamp & the last 32 bits will be empty. So if the first 32 bits are empty then no timezone was encoded & I push it it 12 bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we separate the PR into two commits?
- Add support for the timestamp_ntz datatype in the Delta Lake Connector
- Fix the handling of timestamps with time zones in the Delta Lake Connector
Please feel free to modify the commit messages as you see fit.
Previously, TimestampNTZ was not a supported column type. This PR will map the delta TimestampNTZ type to Presto's timestamp type
15f8db5
to
879df1c
Compare
@imjalpreet The |
b99313b
to
97cbacc
Compare
Delta table column type Timestamp type should be mapped to TIMESTAMP_WITH_TIMEZONE as there is a no timezone column type counterpart.
97cbacc
to
207211b
Compare
Description
Add support for TIMESTAMP_NTZ data type in v3 delta tables.
Currently, only TIMESTAMP column types are supported in presto.
Applied current presto logic for timestamp types. Although this issue still persists:
trinodb/trino#37
Timestamp type will be modified by timezone despite this functionality differing in Spark if legacy timestamp is set to true.
Impact
Will allow presto to read tables with TIMESTAMP_NTZ columns
Will change presto delta table columns of TIMESTAMP type to TIMESTAMP_WITH_TIME_ZONE
Test Plan
Added UT
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.