Hashes of numbers with identical final digits match #143

ilyvion · 2025-03-08T12:46:03Z

This feels very problematic to me and is causing me to have to hash BigDecimal values using a different technique than their real Hash impl.

```rust
#[test]
fn big_decimal_hash_issue() {
    use bigdecimal::BigDecimal;
    use std::hash::{DefaultHasher, Hash, Hasher};

    let d1: BigDecimal = "1011".parse().unwrap();
    let d2: BigDecimal = "0.01011".parse().unwrap();

    let mut d1_hasher = DefaultHasher::new();
    let mut d2_hasher = DefaultHasher::new();

    d1.hash(&mut d1_hasher);
    d2.hash(&mut d2_hasher);

    let d1_hash = d1_hasher.finish();
    let d2_hash = d2_hasher.finish();

    assert_ne!(d1_hash, d2_hash, "{d1:?}, {d2:?}");
}
```

This test fails with

```
assertion `left != right` failed: BigDecimal(sign=Plus, scale=0, digits=[1011]), BigDecimal(sign=Plus, scale=5, digits=[1011])
  left: 1407011009942640606
 right: 1407011009942640606
```

You can change d1 and d2 to any two numbers that "end in" (or I guess, more accurately, have the significant digits) 1011, and you still get the panic, e.g. 10.11 or 0.00000001011 -- basically as long as digits=[1011], you'll get the panic.

It does not happen when you go in the other direction, like with 10110, because then you get scale=0, digits=[10110], which hashes differently. ~~Unless you use .normalized() on them first; since you then get something like scale=-1, digits=[1011] instead, which once more gives an identical hash,~~ (I was mistaken about that; while normalizing does give them identical digits, they still hash differently)


akubera · 2025-03-08T14:36:02Z

Yeah, it looks like a pretty naive algorithm, written some years ago. I think it’s just a hash of the integer value?

Definitely needs to be fixed.
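To make that guess concrete, here is a minimal sketch of how a digits-only hash would reproduce the report. `ToyDecimal` and `hash_of` are hypothetical names invented for illustration, modelled on the `scale`/`digits` fields in the debug output above; this is an assumption about the failure mode, not the crate's actual `Hash` implementation.

```rust
use std::hash::{DefaultHasher, Hash, Hasher};

/// Hypothetical stand-in for a decimal: value = digits * 10^(-scale).
/// This is NOT the bigdecimal crate's type, only a model of its debug output.
#[allow(dead_code)] // `scale` is deliberately ignored by the Hash impl below
#[derive(Debug)]
struct ToyDecimal {
    digits: i128,
    scale: i64,
}

impl Hash for ToyDecimal {
    // Assumed naive strategy: feed only the digits to the hasher.
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.digits.hash(state);
    }
}

/// Helper: run a value through DefaultHasher and return the 64-bit result.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

#[test]
fn digits_only_hash_collides() {
    let d1 = ToyDecimal { digits: 1011, scale: 0 }; // 1011
    let d2 = ToyDecimal { digits: 1011, scale: 5 }; // 0.01011
    let d3 = ToyDecimal { digits: 10110, scale: 0 }; // 10110

    // Same digits, different scale: the hashes are identical.
    assert_eq!(hash_of(&d1), hash_of(&d2));
    // Different digits: the hashes differ, as in the report.
    assert_ne!(hash_of(&d1), hash_of(&d3));
}
```

Under this model, any two values sharing `digits` collide regardless of where the decimal point sits, which matches the 1011 / 0.01011 / 10.11 observations. It does not explain every detail in the report (the normalized `scale=-1` case hashing differently), so the real implementation presumably does something more than this.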

akubera · 2025-03-08T15:37:24Z

Easy solution would be to format to scientific notation and hash that… but it’d be nicer of course to not have to allocate a string (but Hash has been doing that for 7 years and nobody has complained).

Hash should include precision, right? So hash(1.2) != hash(1.200), despite 1.2 == 1.200?
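One constraint worth keeping in mind for the precision question: the standard library documentation for `Hash` requires that `k1 == k2` implies `hash(k1) == hash(k2)`, and `HashMap`/`HashSet` rely on it. So as long as 1.2 == 1.200 holds, the two would need the same hash. The sketch below shows one way to get that without allocating a string: hash a canonical (mantissa, exponent) pair, which carries the same information as scientific notation. `ToyDecimal`, `canonical`, and `hash_of` are hypothetical names for illustration, not the bigdecimal API, and fixed-width integers stand in for the crate's arbitrary-precision digits.

```rust
use std::collections::HashSet;
use std::hash::{DefaultHasher, Hash, Hasher};

/// Hypothetical stand-in for a decimal: value = digits * 10^(-scale).
/// Not the bigdecimal crate's type; names are illustrative only.
#[derive(Debug, Clone, Copy)]
struct ToyDecimal {
    digits: i128,
    scale: i64,
}

impl ToyDecimal {
    /// Canonical (mantissa, exponent) pair with trailing zeros stripped from
    /// the mantissa, so 12 * 10^-1 and 1200 * 10^-3 both become (12, -1).
    fn canonical(&self) -> (i128, i64) {
        if self.digits == 0 {
            return (0, 0); // every representation of zero maps to one form
        }
        let (mut mantissa, mut exponent) = (self.digits, -self.scale);
        while mantissa % 10 == 0 {
            mantissa /= 10;
            exponent += 1;
        }
        (mantissa, exponent)
    }
}

// Equality compares numeric value, so 1.2 == 1.200.
impl PartialEq for ToyDecimal {
    fn eq(&self, other: &Self) -> bool {
        self.canonical() == other.canonical()
    }
}
impl Eq for ToyDecimal {}

// Hash the same canonical pair that equality uses; no string is allocated.
impl Hash for ToyDecimal {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.canonical().hash(state);
    }
}

/// Helper: run a value through DefaultHasher and return the 64-bit result.
fn hash_of<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

#[test]
fn canonical_hash_follows_equality() {
    let a = ToyDecimal { digits: 12, scale: 1 }; // 1.2
    let b = ToyDecimal { digits: 1200, scale: 3 }; // 1.200
    let c = ToyDecimal { digits: 1011, scale: 0 }; // 1011
    let d = ToyDecimal { digits: 1011, scale: 5 }; // 0.01011

    assert_eq!(a, b);
    assert_eq!(hash_of(&a), hash_of(&b)); // equal values, equal hashes
    assert_ne!(hash_of(&c), hash_of(&d)); // the exponent now takes part

    // a and b count as one key; c and d stay distinct.
    let set: HashSet<ToyDecimal> = vec![a, b, c, d].into_iter().collect();
    assert_eq!(set.len(), 3);
}
```

If precision were meant to be part of the hash, equality would have to distinguish 1.2 from 1.200 as well; otherwise lookups in hash-based collections could miss keys that compare equal.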
