Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net/netip: ParseAddr should reject invalid zone #71362

Open
qinlonglong123 opened this issue Jan 21, 2025 · 9 comments
Open

net/netip: ParseAddr should reject invalid zone #71362

qinlonglong123 opened this issue Jan 21, 2025 · 9 comments
Labels
BugReport Issues describing a possible bug in the Go implementation. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@qinlonglong123
Copy link

Proposal Details

I have the following use case:

package main

import (
    "fmt"
    "net/netip"
)

func main() {
    addr, err := netip.ParseAddr("2006:abcd::1%%")
    fmt.Printf("addr = %s,zone = %s,err = %v\n", addr.String(), addr.Zone(), err)
}

https://go.dev/play/p/soSl7hzPhJx
output:

addr = 2006:abcd::1%%,zone = %,err = <nil>

When running with Go 1.23, it considers the last % as the IPv6 zone ID. However, when parsing the same address in other languages, it considers the parameter as an invalid IP, for example:

1.C Language

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <netdb.h>
#include <arpa/inet.h>

const char* parse_ipv6_with_zone_id(const char *input) {
    struct addrinfo hints, *res;
    int status;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET6;
    hints.ai_socktype = SOCK_STREAM;

    status = getaddrinfo(input, NULL, &hints, &res);

    if (status != 0) {
        return "Invalid";
    }

    freeaddrinfo(res);
    return "Valid";
}

int main() {
    const char *input1 = "2006:abcd::1%%";

    printf("Address 1: %s - %s\n", input1, parse_ipv6_with_zone_id(input1));

    return 0;
}

gcc version

gcc version 10.3.1 (GCC)

and output:

Address 1: 2006:abcd::1%% - Invalid

2. java

import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.URI;
import java.net.URISyntaxException;

public class IPv6ZoneIDParser {
 public static void main(String[] args) {
        final String ipv6WithZone = "2006:abcd::1%%";
        try {
            InetAddress inetAddress = InetAddress.getByName(ipv6WithZone);
            if (inetAddress instanceof Inet6Address) {
                System.out.println(ipv6WithZone + " It is an IPv6 address.");
            }

        } catch (java.net.UnknownHostException e) {
            System.out.println(ipv6WithZone + " It is a invalid ip address.");
        }
    }
}

java -version
openjdk version "1.8.0_392"

output:
2006:abcd::1%% It is a invalid ip address.

3. python

import ipaddress

def is_valid_ip(ip):
    try:
        ipaddress.ip_address(ip)
        return True
    except ValueError:
        return False

print(is_valid_ip("2006:abcd::1%%"))

python version
Python 3.9.11 (main, Jul 18 2024, 15:21:06)

output:
False

why Go language behaves differently when parsing IPv6 zone IDs compared to other languages as shown above? Thank you.

@gopherbot gopherbot added this to the Proposal milestone Jan 21, 2025
@gabyhelp gabyhelp added the LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool label Jan 21, 2025
@seankhliao seankhliao changed the title proposal: net/netip: netip.ParseAddr is inconsistent with certain parameters compared to other languages net/netip: ParseAddr should reject invalid zone Jan 21, 2025
@seankhliao seankhliao added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. BugReport Issues describing a possible bug in the Go implementation. and removed Proposal LibraryProposal Issues describing a requested change to the Go standard library or x/ libraries, but not to a tool labels Jan 21, 2025
@seankhliao
Copy link
Member

I think this should be a bug not a proposal.
RFC6874 https://datatracker.ietf.org/doc/html/rfc6874 has

ZoneID = 1*( unreserved / pct-encoded )
IPv6addrz = IPv6address "%25" ZoneID

referring to RFC3986 https://datatracker.ietf.org/doc/html/rfc3986#section-2.3

unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"

a lone % should be considered invalid.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/642896 mentions this issue: net/netip: reject invalid zone identifiers in ParseAddr

@ianlancetaylor
Copy link
Member

Note that as far as I can see the C program does not accept any zone ID at all.

@qinlonglong123
Copy link
Author

Note that as far as I can see the C program does not accept any zone ID at all.

Sorry, the above C program cannot be used as one of the validation programs, because even a normal IPv6 address with a zone ID cannot be resolved.Thank you for your correction.

@seankhliao
Copy link
Member

From CL 642896, windows contains a builtin "Loopback Pseudo-Interface 1" (contains spaces)
and linux accepts interface names like # or &.
I'm not sure what sort of validation (if any) we can do on the zone.

@seankhliao
Copy link
Member

The python docs point to RFC 4007.
Section 11.2 includes the following: https://datatracker.ietf.org/doc/html/rfc4007.html#section-11.2

  An implementation MAY support other kinds of non-null strings as
   <zone_id>.  However, the strings must not conflict with the delimiter
   character.  The precise format and semantics of additional strings is
   implementation dependent.

That points to % being the only disallowed character, which somewhat conflicts with RFC6874's pct-encoded.
Given that we don't know if ParseAddr is going to operate on an encoded or decoded form, perhaps we just shouldn't try to validate it (keep things as is).

@qinlonglong123
Copy link
Author

The python docs point to RFC 4007. Section 11.2 includes the following: https://datatracker.ietf.org/doc/html/rfc4007.html#section-11.2

  An implementation MAY support other kinds of non-null strings as
   <zone_id>.  However, the strings must not conflict with the delimiter
   character.  The precise format and semantics of additional strings is
   implementation dependent.

That points to % being the only disallowed character, which somewhat conflicts with RFC6874's pct-encoded. Given that we don't know if ParseAddr is going to operate on an encoded or decoded form, perhaps we just shouldn't try to validate it (keep things as is).

I personally believe that the pct-encoding in RFC6874 refers to the special characters in the URI's host, which should be represented in pct-encoded form. If we only consider the ParseAddr function, it should not take encoding into account. Because url.Parse already decodes pct-encoded characters.

@seankhliao seankhliao modified the milestones: Proposal, Unplanned Feb 1, 2025
@seankhliao seankhliao added NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Feb 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BugReport Issues describing a possible bug in the Go implementation. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

5 participants