Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure we are resilient across PG versions #695

Open
sgrif opened this issue Feb 13, 2017 · 2 comments
Open

Ensure we are resilient across PG versions #695

sgrif opened this issue Feb 13, 2017 · 2 comments
Labels

Comments

@sgrif
Copy link
Member

sgrif commented Feb 13, 2017

We currently always use the binary format for PG. However, the only type which explicitly documents its binary format is integer types, which is simply documented as using network byte order. For all other types we literally have to read the PG source code to find the binary representations. This appears to have been reasonably stable for at least the last couple versions, but we shouldn't assume that will always be the case. In fact the documentation explicitly states:

Keep in mind that binary representations for complex data types might change across server versions; the text format is usually the more portable choice.

I do want to continue using binary whenever possible. For some types, the difference in performance is several orders of magnitude in terms of the work being done to handle the binary form vs what we'd have to do to parse the text form. However, we should at minimum do the following:

  • Test against as many PG versions as we can and is reasonable
  • Check the server's version in the connection, error if it's too old, and at least warn if it's newer than the most recent version we've tested against.
  • Potentially fall back to the text form if the version isn't one that we know uses the expected binary format
sgrif added a commit that referenced this issue Feb 13, 2017
Deserialization already worked, so there's no reason we shouldn't just
allow this. It's worth noting that there is no way to represent whether
an array is allowed to contain nulls or not within postgres's type
system. As such, the type `Array<Nullable<ST>>` will never be generated
by `infer_schema!` (I do want to make `FromSql<Array<ST>> for
`Vec<Option<T>>` just work though). Since we never provide a
`ToSql<Array<ST>> for Vec<Option<T>>`, Diesel will prevent you from
accidentally inserting nulls at compile time. Ultimately the user does
just have to make sure they represent whether the array could contain
null in the output type, however.

One alternative would be to represent the fact that all arrays can
contain null within the type system, and only provide `impl
FromSql<Array<ST>> for Vec<Option<T>>`. I suspect this would hinder
ergonomics pretty heavily, but we should consider it before 1.0.

One note about the implementation -- Technically we should be setting
`flags` to `1` if the array contains any nulls, as that's what happens
on the data sent to us. However, setting it would require us to perform
an additional level of intermediate buffering on all the values to
determine whether the flag should be set, and that flag is currently
ignored by PG. "But Sean, that's just relying on an implementation
detail!" this is true, but our entire usage of the binary format is
relying on an implementation detail (See #695).
sgrif added a commit that referenced this issue Feb 14, 2017
Deserialization already worked, so there's no reason we shouldn't just
allow this. It's worth noting that there is no way to represent whether
an array is allowed to contain nulls or not within postgres's type
system. As such, the type `Array<Nullable<ST>>` will never be generated
by `infer_schema!` (I do want to make `FromSql<Array<ST>> for
`Vec<Option<T>>` just work though). Since we never provide a
`ToSql<Array<ST>> for Vec<Option<T>>`, Diesel will prevent you from
accidentally inserting nulls at compile time. Ultimately the user does
just have to make sure they represent whether the array could contain
null in the output type, however.

One alternative would be to represent the fact that all arrays can
contain null within the type system, and only provide `impl
FromSql<Array<ST>> for Vec<Option<T>>`. I suspect this would hinder
ergonomics pretty heavily, but we should consider it before 1.0.

One note about the implementation -- Technically we should be setting
`flags` to `1` if the array contains any nulls, as that's what happens
on the data sent to us. However, setting it would require us to perform
an additional level of intermediate buffering on all the values to
determine whether the flag should be set, and that flag is currently
ignored by PG. "But Sean, that's just relying on an implementation
detail!" this is true, but our entire usage of the binary format is
relying on an implementation detail (See #695).
@wt
Copy link

wt commented Jan 19, 2025

This seems to have been reported some time ago. Is this still an active concern?

@weiznich
Copy link
Member

@wt This is still not addressed as we didn't implement any of this. We still look for someone to work at least on some of the points, possibly not the last one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants