Description
I've been relying on the newest implementation of executemany() to perform bulk upserts, but it has the shortcoming that it will not allow to easily determine the number of affected rows by parsing the statusmsg.
The number of effectively upserted rows can easily be less than the number of rows I attempt to upsert, since I qualify my ON CONFLICT clause with a further WHERE clause specifying that the update should only happen if the new and excluded tuples are distinct.
INSERT INTO "table_name" AS __destination_row (
id,
other_column
) VALUES ($1, $2)
ON CONFLICT (id)
DO UPDATE SET
id = excluded.id,
other_column = excluded.other_column
WHERE
(__destination_row.id IS DISTINCT FROM excluded.id)
OR
(__destination_row.other_column IS DISTINCT FROM excluded.other_column)
;
(regular Postgres would allow for a much terser syntax, but this is the only syntax that is accepted by CockroachDB)
Suppose that at times knowing the exact number of effectively upserted rows is more crucial than the bulk performance, and yet I would prefer not to go to the extreme of upserting one row at a time, what would be the best compromise?
Should I rely on a temporary table and then upserting into the physical tables from that temporary table?
INSERT INTO "table_name" AS __destination_row (
id,
other_column
) SELECT (
id,
other_column
) FROM "__temp_table_name"
ON CONFLICT (id)
DO UPDATE SET
id = excluded.id,
other_column = excluded.other_column
WHERE
(__destination_row.id IS DISTINCT FROM excluded.id)
OR
(__destination_row.other_column IS DISTINCT FROM excluded.other_column)
;
Should I instead use a transaction with several individual upserts of values once again provided by the client?
Are there other approaches I should explore?