The subpath normalization algorithm should follow RFC 3986 #405
Comments
The rule is simple: for all path segments: path segment MUST NOT be I personally don't think this is confusing at all. It is a question of the robustness principle to still accept these invalid PURLs and make the best of them.
To me, "dropped" is not the same as "forbidden". "Dropped" means it's allowed but ignored, whereas "forbidden" means that an implementation should throw an error. Currently, it's not invalid as per the test suite because the test suite has In I will lay out my arguments again as to why the current behavior is suboptimal:
Probably there should be a section for a lenient and a strict parser. A strict parser would not discard
@dwalluck, please help improve the current spec by opening pull requests that improve the wording and phrasing.
It's hard to suggest an exact change when I am not even certain what is intended. I can give the same example that we already used. The statement about dots in the path here, particularly using strong language like MUST NOT (purl-spec/PURL-SPECIFICATION.rst, lines 208 to 211 in 7f7e82f), when compared to what it says here (purl-spec/PURL-SPECIFICATION.rst, line 319 in 7f7e82f)
If "it" (it is what form? I am not sure it is specified) must not contain them, then we must have thrown an error, hence there's nothing to drop. Otherwise it MAY contain them, but we MUST drop them from the canonical form. I think I have stated elsewhere that the spec should be clear what is allowed as input vs. what is allowed in canonical form (e.g., what a "relaxed" parser is expected to fix when converted to canonical form and not error out). I think I know that you intend the dots to be allowed in the input, but not the canonical form, but I do not find that clearly stated. There are conflicts, mostly involving
Since the current spec is not finalized, I did not understand that breaking changes are impossible at this point.
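To make the "dropped" versus "forbidden" distinction concrete, here is a minimal sketch of how a lenient and a strict parser might treat dot segments in a subpath. The function name `normalize_subpath` and the `strict` flag are hypothetical and not taken from the spec or from any purl library. The sketch also assumes that empty segments are dropped in both modes, and it ignores percent-encoding entirely.

```python
def normalize_subpath(subpath: str, strict: bool = False) -> str:
    """Return a canonical subpath (hypothetical helper, not a purl library API).

    lenient (default): "." and ".." are accepted as input but dropped
                       from the canonical form.
    strict:            "." and ".." are treated as forbidden and raise.
    """
    segments = []
    # Leading and trailing slashes are treated as insignificant here.
    for segment in subpath.strip("/").split("/"):
        if segment == "":
            # Assumption: empty segments are silently dropped in both modes.
            continue
        if segment in (".", ".."):
            if strict:
                raise ValueError(f"dot segment {segment!r} is not allowed in a subpath")
            # Lenient: allowed in the input, ignored in the output.
            continue
        segments.append(segment)
    return "/".join(segments)
```

With the example from this thread, the lenient mode turns `googleapis/../api/annotations` into `googleapis/api/annotations`, while the strict mode rejects the same input with a `ValueError`.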
Simply dropping `'..'` from the output leads to the case where the normal (human) interpretation of a path containing two dots differs from what the spec says.

For example, the normal interpretation of the fragment of a URI like `pkg:golang/google.golang.org/genproto#googleapis/../api/annotations` is `api/annotations`, not `googleapis/api/annotations` as in the spec.

If someone provides that URI as input, they most likely intend the former to be the output.
I am suggesting just following https://datatracker.ietf.org/doc/html/rfc3986#section-5.2.4 instead so that the usual meaning is preserved.
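For reference, a minimal sketch of the `remove_dot_segments` routine from RFC 3986 section 5.2.4, written as a direct transcription of steps A to E. The wrapper `resolve_subpath` is hypothetical: it assumes a purl implementation would strip the leading slash that the RFC routine can leave behind, which the RFC itself does not say.

```python
def remove_dot_segments(path: str) -> str:
    """Transcription of RFC 3986 section 5.2.4, steps A to E."""
    output = []  # output buffer, one entry per kept segment
    while path:
        # A: drop a leading "../" or "./"
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        # B: replace a leading "/./" (or a lone "/.") with "/"
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        # C: replace a leading "/../" (or a lone "/..") with "/"
        #    and remove the last segment from the output buffer
        elif path.startswith("/../"):
            path = path[3:]
            if output:
                output.pop()
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        # D: the input consists only of "." or ".."
        elif path in (".", ".."):
            path = ""
        # E: move the first segment (with its leading "/", if any) to the output
        else:
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)


def resolve_subpath(subpath: str) -> str:
    """Hypothetical purl-side wrapper: resolve dots, then strip outer slashes."""
    return remove_dot_segments(subpath).strip("/")
```

Applied to the example above, `resolve_subpath("googleapis/../api/annotations")` gives `api/annotations`, whereas dropping the `..` segment gives `googleapis/api/annotations`.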