Ensure we are resilient across PG versions #695

sgrif · 2017-02-13T14:38:05Z

We currently always use the binary format for PG. However, the only type which explicitly documents its binary format is integer types, which is simply documented as using network byte order. For all other types we literally have to read the PG source code to find the binary representations. This appears to have been reasonably stable for at least the last couple versions, but we shouldn't assume that will always be the case. In fact the documentation explicitly states:

Keep in mind that binary representations for complex data types might change across server versions; the text format is usually the more portable choice.

I do want to continue using binary whenever possible. For some types, the difference in performance is several orders of magnitude in terms of the work being done to handle the binary form vs what we'd have to do to parse the text form. However, we should at minimum do the following:

Test against as many PG versions as we can and is reasonable
Check the server's version in the connection, error if it's too old, and at least warn if it's newer than the most recent version we've tested against.
Potentially fall back to the text form if the version isn't one that we know uses the expected binary format

Deserialization already worked, so there's no reason we shouldn't just allow this. It's worth noting that there is no way to represent whether an array is allowed to contain nulls or not within postgres's type system. As such, the type `Array<Nullable<ST>>` will never be generated by `infer_schema!` (I do want to make `FromSql<Array<ST>> for `Vec<Option<T>>` just work though). Since we never provide a `ToSql<Array<ST>> for Vec<Option<T>>`, Diesel will prevent you from accidentally inserting nulls at compile time. Ultimately the user does just have to make sure they represent whether the array could contain null in the output type, however. One alternative would be to represent the fact that all arrays can contain null within the type system, and only provide `impl FromSql<Array<ST>> for Vec<Option<T>>`. I suspect this would hinder ergonomics pretty heavily, but we should consider it before 1.0. One note about the implementation -- Technically we should be setting `flags` to `1` if the array contains any nulls, as that's what happens on the data sent to us. However, setting it would require us to perform an additional level of intermediate buffering on all the values to determine whether the flag should be set, and that flag is currently ignored by PG. "But Sean, that's just relying on an implementation detail!" this is true, but our entire usage of the binary format is relying on an implementation detail (See #695).

wt · 2025-01-19T04:51:07Z

This seems to have been reported some time ago. Is this still an active concern?

weiznich · 2025-01-21T18:33:06Z

@wt This is still not addressed as we didn't implement any of this. We still look for someone to work at least on some of the points, possibly not the last one.

sgrif mentioned this issue Feb 13, 2017

Allow arrays containing null to be serialized #698

Merged

killercup added the postgres label Feb 17, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ensure we are resilient across PG versions #695

Ensure we are resilient across PG versions #695

sgrif commented Feb 13, 2017

wt commented Jan 19, 2025

weiznich commented Jan 21, 2025

Ensure we are resilient across PG versions #695

Ensure we are resilient across PG versions #695

Comments

sgrif commented Feb 13, 2017

wt commented Jan 19, 2025

weiznich commented Jan 21, 2025