Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple Unsoundness issue #25

Open
lwz23 opened this issue Mar 3, 2025 · 4 comments
Open

Multiple Unsoundness issue #25

lwz23 opened this issue Mar 3, 2025 · 4 comments

Comments

@lwz23
Copy link

lwz23 commented Mar 3, 2025

Hello, thank you for your contribution in this project, I an testing our static analysis tool in github's Rust project and I notice the following code:

pub fn murmurhash3_x86_32(bytes: &[u8], seed: u32) -> u32 {

fn get_32_block(bytes: &[u8], index: usize) -> u32 {
    let b32: &[u32] = unsafe { core::mem::transmute(bytes) };

    return b32[index];
}

The issue is in get_32_block where it unsafely transmutes a byte slice to a u32 slice:
rustCopylet b32: &[u32] = unsafe { core::mem::transmute(bytes) };
This transmute assumes that bytes is properly aligned for u32 (4-byte alignment), but the public function murmurhash3_x86_32 does not validate this alignment requirement. When calling the function with an unaligned byte slice, it causes undefined behavior when accessing the transmuted u32 values. Although it is a private function, I notice a possible way to call this function from a pub function murmurhash3_x86_32.

pub fn murmurhash3_x86_32 -> fn get_32_block
// 函数: murmurhash3_x86_32
#[inline(always)]
pub fn murmurhash3_x86_32(bytes: &[u8], seed: u32) -> u32 {
    let c1 = 0xcc9e2d51u32;
    let c2 = 0x1b873593u32;
    let read_size = 4;
    let len = bytes.len() as u32;
    let block_count = len / read_size;
    let mut h1 = seed;
    for i in 0..block_count as usize {
        let mut k1 = get_32_block(bytes, i);
        k1 = k1.wrapping_mul(c1);
        k1 = k1.rotate_left(15);
        k1 = k1.wrapping_mul(c2);
        h1 ^= k1;
        h1 = h1.rotate_left(13);
        h1 = h1.wrapping_mul(5);
        h1 = h1.wrapping_add(0xe6546b64);
    }
    let mut k1 = 0u32;
    if len & 3 == 3 {
        k1 ^= (bytes[(block_count * read_size) as usize + 2] as u32) << 16;
    }
    if len & 3 >= 2 {
        k1 ^= (bytes[(block_count * read_size) as usize + 1] as u32) << 8;
    }
    if len & 3 >= 1 {
        k1 ^= bytes[(block_count * read_size) as usize + 0] as u32;
        k1 = k1.wrapping_mul(c1);
        k1 = k1.rotate_left(15);
        k1 = k1.wrapping_mul(c2);
        h1 ^= k1;
    }
    h1 ^= bytes.len() as u32;
    h1 = fmix32(h1);
    return h1;
}


// 函数: get_32_block
#[inline(always)]
fn get_32_block(bytes: &[u8], index: usize) -> u32 {
    let b32: &[u32] = unsafe { core::mem::transmute(bytes) };
    return b32[index];
}

POC

fn main() {
    // Create a byte array that is not correctly aligned for u32
    // By taking a slice starting at index 1, we ensure misalignment
    let all_bytes = [1u8, 2, 3, 4, 5, 6, 7, 8];
    let unaligned_bytes = &all_bytes[1..];
    
    // Call the public function with the unaligned bytes
    // This will cause undefined behavior in get_32_block when it
    // transmutes the unaligned &[u8] to &[u32]
    let hash = murmurhash3_x86_32(unaligned_bytes, 0);
    
    // The function will read unaligned u32 values, which is UB
    println!("Hash: {}", hash);
}
@lwz23
Copy link
Author

lwz23 commented Mar 3, 2025

same for *(&bytes[offset..(offset + 2)] as *const [u8] as *const [u8; 2])
valid path is: pub fn new_compiled -> fn read_u16

The issue is in the read_u16 function where it unsafely accesses memory:

fn read_u16(bytes: &[u8], offset: usize) -> usize {
    u16::from_be_bytes(unsafe {
        *(&bytes[offset..(offset + 2)] as *const [u8] as *const [u8; 2])
    }) as usize
}

There's no bounds checking to ensure that offset + 2 is within the bounds of the bytes slice before performing the unsafe operation. In new_compiled, it makes multiple calls to read_u16 with various offsets without validating the input slice has sufficient length.

Poc

fn main() {
    // Create a byte array that's too small to satisfy the required read operations
    // The code expects at least 21 bytes initially, but we'll provide fewer
    let too_small_bytes: &[u8] = &[0x00, 0x05]; // Only 2 bytes
    
    // Call new_compiled with the small byte array
    // This will cause undefined behavior in read_u16 when it tries to read
    // bytes at offset 21, which is far beyond our array's length
    let factory_result = RpcFactory::new_compiled(too_small_bytes);
    
    // The function will fail with undefined behavior because read_u16
    // tries to access out-of-bounds memory
}

@lwz23
Copy link
Author

lwz23 commented Mar 3, 2025

same for

unsafe { from_utf8_unchecked(&self.bytes.read()[addr.idx..end]) }

there is no check for valid utf8 bytes.
valid path: pub fn new_compiled -> fn read_str

@lwz23 lwz23 changed the title Unsoundness in get_32_block Multiple Unsoundness issue Mar 3, 2025
@lwz23
Copy link
Author

lwz23 commented Mar 3, 2025

also path from pub fn open_request -> fn read_u16
When open_request calls read_u16(&bytes, 4), it attempts to read two bytes starting at index 4, but there's no validation that the input vector has sufficient length before making this unsafe access.

@lwz23
Copy link
Author

lwz23 commented Mar 3, 2025

same for pub fn murmurhash3_x64_128 -> fn get_128_block which is also unsound

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant