-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
polyval: Add runtime PCLMULQDQ detection #11
Conversation
target_feature = "sse2", | ||
target_feature = "sse4.1", | ||
any(target_arch = "x86", target_arch = "x86_64") | ||
))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conditional gating this implementation is using is gross. Open to alternatives... perhaps the cfg-if
crate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried cfg-if
. Unfortunately it doesn't seem to support nesting, and the resulting code ended up looking just as bad
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made things a little better by using cfg!
. Still a lot of annoying, redundant gating though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My kingdom for cfg_alias
: rust-lang/cargo#7260 (comment)
44ce8e5
to
231a23b
Compare
Everything looks good to me. |
65f6879
to
86e59eb
Compare
@newpavlov WDYT? I'm calling this "done" but I'd be interested to know any ideas for potential simplifications/cleanups |
86e59eb
to
87cc950
Compare
When the `std` feature is enabled (which it is now by default), this adds runtime detection for PCLMULQDQ support on x86/x86_64 architectures. The detection happens once at the time `Polyval` is instantated. The `polyval::field::Element` type has been changed into an enum which remembers the detection result, and its API changed to operate on bytestring representations of POLYVAL field elements. This appears to have a negligable performance impact.
87cc950
to
f0db4e4
Compare
pub fn from_bytes(bytes: Block) -> Self { | ||
Element(bytes.into()) | ||
if cfg!(feature = "std") { | ||
if is_x86_feature_detected!("pclmulqdq") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite like that it does not work on no_std
targets. I think we better to use RDRAND directly and cache results in atomic.
IIUC this implementation branches between Also I think you are using One approach which we can take looks roughly like this: trait Element { .. }
impl Element for Clmul { .. }
impl Element for Soft { .. }
struct GhashImpl<E: Element> { .. }
enum GhashPrivate {
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
Clmul(GhashImpl<Clmul>),
// disable soft fallback if CLMUL is enabled by user
#[cfg(not(all(
any(target_arch = "x86", target_arch = "x86_64"),
all(target_feature = "pclmulqdq", ..)
)))]
Soft(GhashImpl<Soft>),
}
struct Ghash(GhashPrivate); But I am not sure how good inlining will work with |
@newpavlov upon further reflection, I agree runtime detection isn't helpful. It would be extremely helpful for, say, AES-NI, where it isn't available inside KVM and therefore relocatable binaries produced for architectures that support it fail in these environments, but from what I can tell CLMUL isn't similarly impacted. That said, your suggested approach is pretty much what the existing implementation already does. I think I can eliminate the need for an enum and a trait entirely though via a simple newtype which only ensures the relevant APIs are equivalent though, which I'll take a stab at in a separate PR. It seems we can also fake Closing this PR out. |
I wouldn't say it isn't helpful. But if we want to squeeze a maximum performance, I think we should push switch between target dependent implementations as high as possible (i.e. we should not do runtime detections for |
@newpavlov I'm down to actually implement AES-GCM-SIV first, then circle back on this |
When the
std
feature is enabled (which it is now by default), this adds runtime detection for PCLMULQDQ support on x86/x86_64 architectures.The detection happens once at the time
Polyval
is instantiated. Thepolyval::field::Element
type has been changed into an enum which remembers the detection result, and its API changed to operate on bytestring representations of POLYVAL field elements.This appears to have a negligible performance impact.