[Feature] Support default values for specified fields #5023
base: master
Conversation
Thanks for the implementation. We hope its final form follows https://issues.apache.org/jira/browse/SPARK-38334:
|
.../paimon-spark-common/src/main/scala/org/apache/paimon/spark/commands/PaimonSparkWriter.scala
LGTM, I will try to implement this |
@Zouxxyy hello, I want to ask a question: if we load a CREATE TABLE with a default value's logical plan, where should the default field be stored? In fields? {
"version" : 3,
"id" : 0,
"fields" : [ {
"id" : 0,
"name" : "id",
"type" : "INT"
}, {
"id" : 1,
"name" : "t1",
"type" : "INT"
}, {
"id" : 2,
"name" : "t2",
"type" : "INT"
} ],
"highestFieldId" : 2,
"partitionKeys" : [ ],
"primaryKeys" : [ ],
"options" : {
"owner" : "xxx"
},
"timeMillis" : 1739187234928
} like this? Or, because the Spark field has a metadata property, I think we should follow it, WDYT? {
"version" : 3,
"id" : 0,
"fields" : [ {
"id" : 0,
"name" : "id",
"type" : "INT",
"metadata": {}
}, {
"id" : 1,
"name" : "t1",
"type" : "INT",
"metadata": {}
}, {
"id" : 2,
"name" : "t2",
"type" : "INT",
"metadata": {}
} ],
"highestFieldId" : 2,
"partitionKeys" : [ ],
"primaryKeys" : [ ],
"options" : {
"owner" : "xxx"
},
"timeMillis" : 1739187234928
} |
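To make the two layouts above concrete, here is a minimal Java sketch (not Paimon's actual schema classes, just plain maps) of a field entry carrying an optional `metadata` map, with a hypothetical `default` key holding the default value, mirroring how Spark stores defaults in `StructField` metadata:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class DefaultValueSketch {
    // Hypothetical representation of one entry in the "fields" array,
    // with a "metadata" map as in the second JSON layout above.
    static Map<String, Object> field(int id, String name, String type,
                                     Map<String, String> metadata) {
        Map<String, Object> f = new LinkedHashMap<>();
        f.put("id", id);
        f.put("name", name);
        f.put("type", type);
        f.put("metadata", metadata);
        return f;
    }

    // Look up a default value stored under a hypothetical "default" key
    // inside the field's metadata map; empty if the field has no default.
    static Optional<String> defaultValue(Map<String, Object> field) {
        @SuppressWarnings("unchecked")
        Map<String, String> meta = (Map<String, String>) field.get("metadata");
        return Optional.ofNullable(meta == null ? null : meta.get("default"));
    }

    public static void main(String[] args) {
        Map<String, Object> id = field(0, "id", "INT", Map.of());
        Map<String, Object> t1 = field(1, "t1", "INT", Map.of("default", "0"));
        System.out.println(defaultValue(id).orElse("<none>")); // <none>
        System.out.println(defaultValue(t1).orElse("<none>")); // 0
    }
}
```

The advantage of a generic `metadata` map over a dedicated top-level key is that other per-field attributes (comments, constraints) could later live alongside the default without further schema-version changes.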
…t column value (because only Spark 3.5+ supports the default-value lexer)
Based on this, I implemented a simple PR; we could discuss how to improve it |
@davidyuan1223 Since Paimon already has a property for default values, we can just use that. What do you think @JingsongLi? I think we can implement it in the following steps:
CREATE TABLE t (id INT, name STRING) TBLPROPERTIES ('fields.name.default-value' = 'default v');
-- check
INSERT INTO t (id) values (1);
SELECT * FROM t;
CREATE TABLE t (id INT, name STRING DEFAULT 'default v'); |
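Step 1 above only needs the existing table options. A minimal Java sketch (the helper name and key pattern are illustrative, following the `fields.<name>.default-value` convention shown in the DDL) of extracting per-field defaults from the options map:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OptionDefaults {
    // Hypothetical helper: collect per-field defaults from table options
    // that follow the 'fields.<name>.default-value' key pattern.
    static Map<String, String> fieldDefaults(Map<String, String> options) {
        Map<String, String> defaults = new LinkedHashMap<>();
        String prefix = "fields.";
        String suffix = ".default-value";
        for (Map.Entry<String, String> e : options.entrySet()) {
            String k = e.getKey();
            if (k.startsWith(prefix) && k.endsWith(suffix)
                    && k.length() > prefix.length() + suffix.length()) {
                String name = k.substring(prefix.length(), k.length() - suffix.length());
                defaults.put(name, e.getValue());
            }
        }
        return defaults;
    }

    public static void main(String[] args) {
        Map<String, String> options = Map.of(
                "fields.name.default-value", "default v",
                "owner", "xxx");
        System.out.println(fieldDefaults(options)); // {name=default v}
    }
}
```

Step 2 (the `DEFAULT` column syntax) would then be syntactic sugar that writes the same option, keeping both entry points consistent.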
Because I saw the PR for https://issues.apache.org/jira/browse/SPARK-38334, which implements the metadata field, I implemented a PR with the metadata field in the Paimon metadata file. Shall we use this, or keep the properties? |
Our previous default design was flawed, and we may need to consider refactoring, but this refactoring requires consideration of:
This may require a larger design, preferably with a PIP to discuss it in detail. |
Agree, we can also consider supporting schema evolution, like changing the default value. These need to be carefully designed. |
LGTM. I think adding metadata or other info to the field detail may help us stay compatible with more components; we could discuss. |
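Whichever place the default ends up stored, the write path (e.g. in `PaimonSparkWriter`) would fill in missing or null column values before writing. A minimal sketch under the PR's current scope (string defaults only; the class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ApplyDefaults {
    // Hypothetical write-side step: replace nulls in an incoming row with
    // the corresponding column's default value. 'defaults' is positional,
    // with null meaning the column has no default.
    static List<String> applyDefaults(List<String> row, List<String> defaults) {
        List<String> out = new ArrayList<>(row.size());
        for (int i = 0; i < row.size(); i++) {
            String v = row.get(i);
            out.add(v != null ? v : defaults.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        // Mirrors INSERT INTO t (id) VALUES (1) with
        // 'fields.name.default-value' = 'default v'.
        List<String> row = Arrays.asList("1", null);
        List<String> defaults = Arrays.asList(null, "default v");
        System.out.println(applyDefaults(row, defaults)); // [1, default v]
    }
}
```

Note this conflates "column omitted from the INSERT list" with "explicit NULL inserted"; whether those should behave the same is one of the semantics a PIP would need to pin down.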
… string type)
Purpose
Linked issue: close #5015
Paimon Spark supports default values for specified fields (currently only string type).
(Testing; no need to check)
Tests
API and Format
Documentation