Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ACP: ref access to core::net::Ip*Addr octet bytes, impl AsRef<[u8]> for core::net::Ip*Addr #535

Closed
Tracked by #137259
mammothbane opened this issue Feb 8, 2025 · 2 comments
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@mammothbane
Copy link

mammothbane commented Feb 8, 2025

Proposal

Problem statement

IpAddr, Ipv4Addr, and Ipv6Addr don't provide reference access to their contents, while they do provide an array copy via .octets(). This is a problem in situations where we want the Ip*Addr to own and support a lifetime for the address octets as a slice — &[u8] is very common in function arguments, as is AsRef<[u8]>, for instance.

Current Ip*Addr types can only support a signature like fn(&[u8]) by first stack-allocating a local via .octets():

fn f(_: &[u8]);

let addr = Ipv4Addr::LOCALHOST;

// This works
let ary = addr.octets();
f(&ary); 

// Desugars to the above -- implicit stack local
f(&addr.octets());

// Does not exist
f(addr.as_ref()); 

This is workable, if awkward, in situations where the lifetime of the array is functionally equivalent to the lifetime of the Ip*Addr. However, in other cases, the lack of ref access is a problem:

static ADDR: Ipv4Addr::new(1, 2, 3, 4);

let bytes: &[u8] = {
    // This array's lifetime is limited to this scope.
    let ary = ADDR.octets();

    // We "should" be able to provide &'static access to the contents of ADDR
    // due to its lifetime, but we can't because we need to copy first -- this doesn't work.
    &ary
};

We could always carry around a copy of the contents if we know we need a reference:

static ADDR: Ipv4Addr::new(1, 2, 3, 4);
static ADDR_BYTES: [u8; 4] = ADDR.octets();

let bytes = &ADDR_BYTES;

// But that implies that we probably want something like this:
struct Ipv4AddrWithAccessibleStorage(pub Ipv4Addr, pub [u8; 4]);

// ... But this type rather absurdly duplicates the IpAddr's storage, and we would naturally implement AsRef on it
// Which begs the question why this functionality isn't available on the Ip*Addrs themselves

Motivating examples or use cases

This code is messy/WIP, but see DynAddress here. This is part of a userspace packet switch experiment that can operate at L2 or L3, and I define and provide trait DynAddress for IPv4, IPv6, Ethernet, and IEEE802.15.4 addresses. The addresses are treated as opaque bytestrings to make the switch implementation agnostic to the address representation.

Relevant code subset pulled out for readability:

pub trait DynAddress: Debug {
    fn as_bytes(&self) -> &[u8];
    // ...
}

impl DynAddress for core::net::Ipv4Addr {
    fn as_bytes(&self) -> &[u8] {
        &self.octets() // doesn't work
    }
    // ...
}

impl DynAddress for smoltcp::wire::EthernetAddress {
    fn as_bytes(&self) -> &[u8] {
        self.as_bytes() // this exists
    }
    // ...
}

Notably, I can't associate a const LEN: usize such that fn as_bytes(&self) -> [u8; LEN] because this would make DynAddress not object-safe. I also can't bound DynAddress by Hash e.g. (which I need, and which &[u8] provides) because it's not object-safe.

My code (this line specifically) used to work with smoltcp's independent Ipv4Addr and Ipv6Addr definitions, but when smoltcp switched to the core::net types, my code stopped working, because this removed the as_bytes method. I can't restore the functionality now because Ip*Addr doesn't provide ref access to the contents.

My patch solution is to add an associated type to DynAddress: type BytesRepr<'a>: AsRef<[u8]> = &[u8] where Self: 'a;, which is overridden with [u8; 4] and [u8; 16] for the Ip*Addr types.

Solution sketch

Implement (using the same byte order as .octets()):

  • pub const fn Ipv4Addr::as_array(&self) -> &[u8; 4]
  • pub const fn Ipv6Addr::as_array(&self) -> &[u8; 16]
  • pub const fn IpAddr::as_slice(&self) -> &[u8]
  • impl AsRef<[u8]> for Ipv4Addr
  • impl AsRef<[u8]> for Ipv6Addr
  • impl AsRef<[u8]> for IpAddr

Example from PR (linked below) for visibility:

impl Ipv4Addr {
    pub const fn as_array(&self) -> &[u8; 4] {
        &self.octets
    }
}

I've opened a PR with an initial implementation.

Alternatives

Deal with it

This clearly isn't such a huge problem that people are coming out of the woodwork to demand a change, so this ACP could be rejected. As my example shows, it's possible to work around this with traits, e.g. AsRef<[u8]> or Borrow<Borrowed=[u8]>.

One possible reason for this rejection would be to leave the internals of the Ip*Addr types flexible. Currently, I believe the internal representation of these types is totally undetermined by the API — the address could be stored in a string and parsed at access time repeatedly, for all the user knows. Stabilizing this feature would mean that Ipv4Addr must contain (or eventually point to, I suppose) a contiguous sequence of four bytes representing the address (ditto with Ipv6Addr / 16 bytes), such that we can construct the returned refs for the new API with a valid lifetime. I think the current internal representations are canonical, so I think it is worth doing this, but the argument bears mentioning.

If this result ends up being chosen, I'd at least like to leave a comment explaining the reasoning and linking back to this ACP in the source code.

Endianness

There's an endianness problem here — .octets() is very clear that it represents the octet (big-endian) order of the address, so a user calling this function shouldn't be confused about this. But the names as_slice, as_array, and as_ref don't make the same specification, which could possibly lead users to expect different behavior. Without bikeshedding a specific name, an alternative like .octets_ref() that represents this more clearly could be appropriate.

AsRef

AsRef also has a canonicity problem -- there can only be one AsRef<[u8]> for each type, and so using octet order is an opinionated choice. For Ipv4Addr, I think providing AsRef<[u8]> is correct, as octet order is pragmatically speaking canonical (u32 interpretations of IPv4s are little more than a historical curiosity now), but for Ipv6Addr this is less clear. Indeed, segment order is the most common interpretation, so providing AsRef<[u8]> could be ambiguous.

On the other hand, the Ipv6Addr type as currently implemented cannot provide an AsRef<[u8]> in any other order than big-endian, and if we provided as_array, that would be fixed by the API (or require internal duplicate storage in the other endianness), so this is some argument for providing AsRef.

as_slice() vs. as_array()

My initial implementation in the PR below provides as_slice() -> &[u8] instead of as_array() -> [u8; N]@scottmcm mentioned that this throws away information, but still provides the ref access, so I'm planning on changing it.

Links and related work

I've opened a PR implementing the functionality here. It currently provides as_slice rather than as_array, but I'm planning on updating that.

AsRef<[u8]> impls are pulled out here to avoid needing an FCP to land an initial implementation.

@mammothbane mammothbane added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Feb 8, 2025
@joshtriplett
Copy link
Member

We discussed this in today's @rust-lang/libs-api meeting.

We had a variety of opinions. The common consensus was:

  • We'd prefer to not have the AsRef impls.
  • We'd like to have the three const functions.
  • We'd like to name all three functions as_octets, for consistency with octets, so that people don't wonder if they're different.

@joshtriplett joshtriplett added the ACP-accepted API Change Proposal is accepted (seconded with no objections) label Feb 11, 2025
@tgross35
Copy link

As a note, this could also be solved with bytemuck::bytes_of / zerocopy::IntoBytes::as_bytes / eventually project safe transmute. Neither of the crates provide an implementation on IpAddr, likely because we don't guarantee a layout. So maybe an alternative here is to mark the types #[repr(transparent)] and document that the layout is guaranteed, then request those crates add implementations?

mammothbane added a commit to mammothbane/rust that referenced this issue Feb 18, 2025
Adds `const` `Ip*Addr::as_octets` methods providing reference access to
`Ip*Addr` octets contents.

See rust-lang/libs-team#535 for accepted ACP
with a more detailed justification.
mammothbane added a commit to mammothbane/rust that referenced this issue Feb 19, 2025
Adds `const` `Ip*Addr::as_octets` methods providing reference access to
`Ip*Addr` octets contents.

See rust-lang/libs-team#535 for accepted ACP
with a more detailed justification.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 21, 2025
libcore/net: `IpAddr::as_octets()`

[ACP](rust-lang/libs-team#535)
[Tracking issue](rust-lang#137259)

Adds `const` `core::net::IpAddr{,v4,v6}::as_octets()` methods to provide reference access to IP address contents.

The concrete usecase for me is allowing the `IpAddr` to provide an extended lifetime in contexts that want a `&[u8]`:

```rust
trait AddrSlice {
    fn addr_slice(&self) -> &[u8];
}

impl AddrSlice for IpAddrV4 {
    fn addr_slice(&self) -> &[u8] {
        // self.octets() doesn't help us here, because we can't return a reference to the owned array.
        // Instead we want the IpAddrV4 to continue owning the memory:
        self.as_octets()
    }
}
```

(Notably, in this case we can't parameterize `AddrSlice` by a `const N: usize` (such that `fn addr_slice(&self) -> [u8; N]`) and maintain object-safety.)
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Feb 21, 2025
Rollup merge of rust-lang#136609 - mammothbane:master, r=scottmcm

libcore/net: `IpAddr::as_octets()`

[ACP](rust-lang/libs-team#535)
[Tracking issue](rust-lang#137259)

Adds `const` `core::net::IpAddr{,v4,v6}::as_octets()` methods to provide reference access to IP address contents.

The concrete usecase for me is allowing the `IpAddr` to provide an extended lifetime in contexts that want a `&[u8]`:

```rust
trait AddrSlice {
    fn addr_slice(&self) -> &[u8];
}

impl AddrSlice for IpAddrV4 {
    fn addr_slice(&self) -> &[u8] {
        // self.octets() doesn't help us here, because we can't return a reference to the owned array.
        // Instead we want the IpAddrV4 to continue owning the memory:
        self.as_octets()
    }
}
```

(Notably, in this case we can't parameterize `AddrSlice` by a `const N: usize` (such that `fn addr_slice(&self) -> [u8; N]`) and maintain object-safety.)
github-actions bot pushed a commit to tautschnig/verify-rust-std that referenced this issue Mar 6, 2025
Adds `const` `Ip*Addr::as_octets` methods providing reference access to
`Ip*Addr` octets contents.

See rust-lang/libs-team#535 for accepted ACP
with a more detailed justification.
github-actions bot pushed a commit to thanhnguyen-aws/verify-rust-std that referenced this issue Mar 6, 2025
Adds `const` `Ip*Addr::as_octets` methods providing reference access to
`Ip*Addr` octets contents.

See rust-lang/libs-team#535 for accepted ACP
with a more detailed justification.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ACP-accepted API Change Proposal is accepted (seconded with no objections) api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api
Projects
None yet
Development

No branches or pull requests

3 participants